This unit is called “Seeing Race in Statistics,” and it will be used as the first unit in an introductory statistics course. I currently teach two introductory high school statistics courses, Advanced Placement Statistics and Statistics for Health and Business, at an interdistrict magnet school in New Haven, Connecticut. Our students come from over ten towns and reflect many educational, social, and cultural backgrounds. The unit is designed with my students in mind, who are diverse in both social background and background knowledge. Within this unit, I use a three-level design intended to make these courses more relevant to students and to promote questions that interrogate the authority of the statistics that students will encounter throughout the course and in their lives.
The skill of interrogating statistics is crucial for all adults in our society if they are to become thinking consumers and users of data. In addition, it is important to deconstruct data to see the implicit ideas of domination and subjugation that travel through numbers that can appear neutral. Statistics shares a creation story with the field of eugenics. Francis Galton, a mathematician who contributed many of the major ideas to statistics, was also one of the originators of eugenics. The influence of eugenic thinking in statistics drives a notion of superiority, fitness and ranking alongside measurements. Milton Reynolds describes this in Shifting Frames: “The term “eugenics” refers to a scientifically based, ideological movement dedicated to the reification of race. It is the wellspring of scientific theories used to construct taxonomies of difference within the human family and to legitimize the subjugation of different groups.”1 Statistics often does the work of justifying this subjugation through its “innocent” and authoritative work as a logical system. These embedded assumptions of superiority are validated by the seeming neutrality of mathematical calculations. The “taxonomies of difference” he describes are invalid and biased assumptions about difference that dominate our interpretations of data; however, they appear as factual products legitimized by math.
Statistics is a unique mathematical discipline, an adjunct to and tool of both the natural and social sciences. At root, it is a way to characterize, analyze and quantify the natural world and the conceptualized world, both through direct measurement and through predictions based on those measurements. Statistics uses these empirical results to construct models of behavior that help us to plan, predict and solve problems. As a mathematical practice it is very modern, unlike its partners Geometry, Algebra and even Calculus. Statistics has always combined a conceptualized worldview with specific calculations to understand probability and manage error. Statistics arose in the seventeenth and eighteenth centuries from three disparate fields of study: the ancient art of record keeping, the study of astronomy and, finally, the study of games and chance. However, its development also coincided with ascendant notions of eugenics, imperialism, colonialism and white dominance.
Statistical inference is based on probability theory. It is a tool for creating predictions. Its models are not as pure as the many mathematical models that live only within the closed intellectual system of mathematics. Statistics likes to get its feet messy by drawing on assumptions and then alchemizing those conditions, through beautiful calculations, into a plausible and predictable future. However, inferential statistics are models, and models can be useful but should not be mistaken for reality. Conclusions drawn from assumptions can only be as accurate as our human conception of the world. The weightiness that mathematical inference carries can disguise the assumptions that are threads of the final cloth. As sociologist Tukufu Zuberi says in Thicker than Blood: “Mathematical statements follow one another in a definite order according to certain principles and are accompanied by proofs. The numbers from mathematics are the result of logical calculations. In mathematics the numbers are either exact or have a known or estimable error. Statistics is a system of estimation based on uncertainty. Statistics is a form of applied mathematics. Often in statistics, the numbers are no more than the axioms applied and may have little to do with the conditions of the correct applicability in the real world.”2
Conflating modeling and truth is an issue of concern in our statistically driven society. Within social settings (our communities, education systems, governments and belief systems), the product of our statistical measurement drives political thought and can also become the belief. Models and their assumptions, when repeated often and widely, can become our reality. The notion of “alternative facts” that arose during the Trump era is tied to this idea. If we are to believe statistics and science and use them to guide our thinking, then what do we do with conflicting models, each asserting its own truth with 99% confidence?
How we create our models is therefore not an arbitrary question. For the AP Statistics students, this “checking of conditions and assumptions” has become a joke-like, memorized gauntlet that they must run in order to produce the delicious mathematical product of a confidence interval or a p-value. But at the start, how we collect and identify information is one of the most critical pieces of the puzzle. That is the most dangerous moment, when our blind spots and biases endanger our ability to see and seek truth.
How can we expect that statistics derived from measurement tools that are embedded in our racialized society are not also racialized? Again, Zuberi: “Race is a socially constructed process that produces subordinate and superordinate groups. Racial stratification is the key social process behind racial classifications.”3 This point is crucial. It refutes the notion that racial classification is a neutral variable. The codes of racial dominance are so pronounced in the United States that racial classification can’t be separated from stratification. Any statistical measurement that uses race and racial proxies as variables imports the cultural meanings of race and embeds those meanings into the inferences that are intended to be drawn from these models.