Appendix E: Promoting equity in and with social science research
You have come to understand that social research is not a value-free enterprise. Our values shape our choice of research topics, our methodological choices, and the meaning we construct from the results of our data analysis. Our research can also be used to pursue values, such as by conducting applied research to optimize values like effectiveness or efficiency. Equity, or fairness, is a value that deserves careful attention. Our methodological choices and what we learn through research can promote equity or inadvertently perpetuate inequity. The most obvious way research can focus on equity is in the selection of our research questions. Social research is commonly used to explore questions about disparities among different racial and ethnic groups, geographic regions, genders, and socioeconomic groups and to identify ways to improve equity. I just entered “racial disparities” in Google Scholar and found examples of social science research seeking to describe and explain racial disparities in education, health care, and criminal justice just on the first page of results. There are other, perhaps less obvious, ways that our research choices can affect equity, whether our research question is directly about equity or not. Below, I offer some principles and practices to consider as we plan and conduct our own research with the value of equity in mind.
(1) Pursue research questions with the goal of describing and explaining inequities and identifying possible remedies. I’m including this first just in case you skipped the paragraph above and jumped straight to the numbered list. (I do that a lot.) Go back and read that paragraph. This is the most important strategy for pursuing the goal of equity with research.
(2) Disaggregate data to identify inequities. Even if our research project isn’t about inequity per se, we can still take the opportunity to look for evidence of equity and inequity. This is very common in program evaluations. The primary goal of an evaluation of an afterschool tutoring program may be to determine if students’ academic performance improves due to participation in the program. We may take the opportunity, though, to disaggregate our data to ask comparative questions from an equity perspective: Does the program work equally well for students of different genders? Different races? Different ages? For native and non-native English speakers? Follow-up research questions could explore why we do or do not see disparities, which could help people leading this tutoring program and similar programs to improve or sustain equitable outcomes.
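As a rough illustration of disaggregation, here is a minimal Python sketch. The records, group labels, and the `mean_by_group` helper are all hypothetical and invented for this example; they are not drawn from any real evaluation. The sketch compares the overall average change in test scores with the averages for native and non-native English speakers.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (language group, change in test score after tutoring).
# These numbers are invented for illustration only.
records = [
    ("native", 6), ("native", 8), ("native", 7),
    ("non-native", 1), ("non-native", 5), ("non-native", 0),
]


def mean_by_group(rows):
    """Return the mean outcome for each group label."""
    groups = defaultdict(list)
    for group, outcome in rows:
        groups[group].append(outcome)
    return {group: mean(values) for group, values in groups.items()}


# The overall average can hide differences that appear once we disaggregate.
overall = mean(outcome for _, outcome in records)
by_group = mean_by_group(records)

print(f"Overall mean change: {overall:.1f}")
for group, value in by_group.items():
    print(f"{group}: {value:.1f}")
```

In this made-up data, the overall average suggests the program works reasonably well, while the disaggregated averages show that most of the improvement is concentrated in one group. A real analysis would also need to consider sample sizes and statistical significance.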
(3) Conduct within-group analysis. I think the most overlooked opportunity for conducting research from an equity perspective is to examine variation in outcomes within groups. Imagine that we conduct our evaluation of the afterschool tutoring program, disaggregate our data, and discover that native English speakers see improved academic performance as a result of participating in the program, but non-native speakers do not. A next step could then be to look at variation in outcomes within the group of non-native speakers. Most likely, we will learn that, while they benefit less than the native speakers on average, there is still variation in learning outcomes among the non-native speakers. Some of these students probably benefit from the tutoring program more than others. We may be able to identify factors that help explain that variation. Did the students for whom the program was helpful have tutors who also spoke their native language? Did these students seek help with one subject more often than another? Do these students have different levels of parental support? By exploring within-group variation, we are able to go beyond simply identifying disparities to identifying possible strategies for reducing disparities.
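A within-group comparison can be sketched the same way. The example below assumes we have already narrowed our data to non-native speakers and recorded, for each student, whether the tutor spoke the student's native language. The data and the `compare_within_group` helper are hypothetical.

```python
from statistics import mean

# Hypothetical data for non-native speakers only:
# (tutor speaks the student's native language?, change in test score).
non_native = [
    (True, 5), (True, 4), (True, 6),
    (False, 0), (False, 1), (False, -1),
]


def compare_within_group(rows):
    """Return mean outcomes for students with and without the factor."""
    with_factor = [outcome for flag, outcome in rows if flag]
    without_factor = [outcome for flag, outcome in rows if not flag]
    return mean(with_factor), mean(without_factor)


shared, not_shared = compare_within_group(non_native)
print(f"Tutor shares the student's language: {shared}")
print(f"Tutor does not share the student's language: {not_shared}")
```

A difference like this one is only descriptive. It points to a factor worth investigating further; it does not show that the factor caused the difference.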
(4) Be thoughtful about demographic control variables. This appendix follows the appendix on elaboration modeling in hopes that you already have a good grasp of the role of control variables in our research. (If you are unsure about why we use control variables, reading about elaboration modeling first is a good idea.) Demographic factors are often included in research designs as control variables. This is, in itself, fine and often a good idea. We must, though, take care in how we interpret our findings. Imagine reading this interpretation of multiple regression results in a journal article reporting the outcomes of a job training program evaluation:
For every additional month of job training, the model predicts participants’ starting wages will increase by $2 per hour, holding race constant.
In this example, our independent variable is months of job training, our dependent variable is starting wage, and race is a control variable. If this is the extent of the interpretation of the results, we cannot know if the authors are overlooking an inequitable outcome, but we would be rightly suspicious that this is the case. If participants’ race was used as a control variable in the model they have presented, it was likely a statistically significant control variable (or why else include it in the final model?). It’s possible that the difference between racial groups was negligible or quite substantive—we don’t know. When using race or other characteristics as control variables, then, it is essential to explicitly describe the relationship between race and the dependent variable. We should never mindlessly include demographic characteristics as control variables just because that’s what everyone does without bothering to interpret the impact of those control variables on our findings.
(5) Do not assume white men as “normal” when using dummy variables. If this is the first time you’ve encountered the term dummy variable, you may think I am about to caution against using the term dummy. Nope. That is just the jargon used to describe a certain type of dichotomous variable. If we had a regular, non-dummy variable for race, using the U.S. Census categories, our data for three survey respondents might look like this:
Name | Race |
---|---|
Ed | White |
Margaret | Black or African American |
Alleen | Asian American |
There, we have a variable titled Race with the values White, Black or African American, and Asian American, plus two others that are not represented in our data: American Indian/Alaska Native and Native Hawaiian/Pacific Islander.
If we use dummy coding for our race variable, those same three survey respondents’ data would look like this:
Name | White | Black or African American | Asian American | American Indian/Alaska Native | Native Hawaiian/Pacific Islander |
---|---|---|---|---|---|
Ed | 1 | 0 | 0 | 0 | 0 |
Margaret | 0 | 1 | 0 | 0 | 0 |
Alleen | 0 | 0 | 1 | 0 | 0 |
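The dummy coding in the table above can be reproduced with a few lines of Python. This is only a sketch; the `dummy_code` helper is a hypothetical name, and statistical software packages provide their own tools for this step.

```python
# The five race categories used in the tables above.
CATEGORIES = [
    "White",
    "Black or African American",
    "Asian American",
    "American Indian/Alaska Native",
    "Native Hawaiian/Pacific Islander",
]

# The three survey respondents from the first table.
respondents = {
    "Ed": "White",
    "Margaret": "Black or African American",
    "Alleen": "Asian American",
}


def dummy_code(value, categories):
    """Return one 0/1 indicator per category for a single respondent."""
    if value not in categories:
        raise ValueError(f"Unknown category: {value}")
    return {category: int(value == category) for category in categories}


coded = {name: dummy_code(race, CATEGORIES) for name, race in respondents.items()}
for name, row in coded.items():
    print(name, list(row.values()))
```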
Now, we have five dummy variables, one for each race category. Each of the variables can take on the values of zero (meaning, basically, no) or one (meaning yes). This approach to organizing our data has the benefit of transforming the nominal-level data to ratio-level, which gives us many more options for quantitative analysis. Dummy variables are commonly used in regression analysis. When dummy variables are used as independent variables in regression analysis, one of the dummy variables is omitted from the analysis and becomes the reference category. Here is an example of such a regression model using abbreviated names for the race dummy variables above and a hypothetical index of attitude toward entrepreneurship:
Predicted Entrepreneurship Attitude = β0 + β1*Black + β2*Asian + β3*AIAN + β4*NHPI
Note that the White dummy variable is not included in the model; it serves as the reference category. This is fine; there is no one right way to select the reference category, and mathematically, it doesn’t matter: the model’s predictions for each group are the same whichever category is omitted. Statistical software packages might select the reference group alphabetically, or we might select the category with the most cases. Sometimes, though, we select the reference category because it is considered normal or typical. If we dummy coded a COVID status variable, for example, we could have dummy variables for people who have never had COVID, people who have COVID, and people who have recovered from COVID. In this example, it would be reasonable to use never had COVID as our reference category because that is “normal.” Here is where we must be careful with dummy variables (and if you are new to dummy variables or regression analysis, this is the important point): In presenting our findings, we must be careful not to treat any one demographic group as “normal.” In the model above, we should reconsider this type of presentation of results:
Race | Predicted entrepreneurship index score |
---|---|
Black or African American | -2 |
Asian American | -1 |
American Indian/Alaska Native | +4 |
Native Hawaiian/Pacific Islander | +3 |
Reference group: White respondents |
That presentation implies that white respondents should be considered the norm—the standard to which other groups are compared. Instead, we could present, say, mean values for each group and highlight more meaningful comparisons among them.
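To make this concrete, here is a sketch using invented index scores, chosen only so that the differences from the White group match the table above. It relies on a property of regression models that contain nothing but the dummy variables for a single categorical variable: the intercept equals the reference group's mean, and each coefficient equals that group's mean minus the reference group's mean. The `coefficients` and `predict` helpers are hypothetical names.

```python
from statistics import mean

# Hypothetical index scores for each group (invented for illustration).
scores = {
    "White": [10, 12],
    "Black": [8, 10],
    "Asian": [9, 11],
    "AIAN": [14, 16],
    "NHPI": [13, 15],
}

group_means = {group: mean(values) for group, values in scores.items()}


def coefficients(means, reference):
    """Intercept and dummy coefficients for a model with only group dummies.

    The intercept is the reference group's mean; each coefficient is the
    difference between a group's mean and the reference group's mean.
    """
    intercept = means[reference]
    betas = {g: m - intercept for g, m in means.items() if g != reference}
    return intercept, betas


def predict(group, intercept, betas):
    """Predicted score: the intercept plus the group's coefficient, if any."""
    return intercept + betas.get(group, 0)


# Changing the reference category changes the coefficients we would report,
# but the predicted score for every group stays the same.
for reference in ("White", "AIAN"):
    intercept, betas = coefficients(group_means, reference)
    print(f"Reference: {reference}, coefficients: {betas}")
    print({g: predict(g, intercept, betas) for g in group_means})

# Presenting the group means directly avoids implying that any group is the norm.
for group, value in group_means.items():
    print(f"{group}: {value}")
```

Reporting the group means side by side, as in the final loop, gives readers the same information as the coefficients without positioning any one group as the standard for the others.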
(6) Involve stakeholders in planning and conducting research and (7) examine your own biases. I am offering the sixth and seventh recommendations together because they are closely related. In research about any group of people, it is a good idea to consult with members of that group in planning or, even better, conducting the research. I, a white, middle-aged man, have found this to be essential to learning about the attitudes of young, mostly African-American and Hispanic, people toward sex education programs in middle and high schools. By asking representatives of this group for feedback on survey items and plans for administering surveys, I was able to dodge potential misunderstandings and resistance to their peers’ participation that I otherwise would not have anticipated. This is due in no small part to my own biases. I think about the world in a certain way that is shaped by my own experiences, and this will affect concrete research methods choices, like how I word questions that I ask research participants, how I invite people to participate in research, and how I go about collecting the data. It is important for me to reflect on how my own biases may influence such choices and perhaps to read about others’ perspectives, but self-reflection and reading can only go so far. Inviting others to provide their ideas about research plans and engaging with diverse research collaborators are invaluable when I am conducting research about—or, put better, hoping to learn from—people who have had different life experiences than me.
(8) Honor the humanity of research participants. We could surely extend this list much further, but instead of a long list of tips, I will conclude with this guiding principle that should be foundational to all research about people, repeated from what I’ve written elsewhere about research ethics: Our research participants are not merely “subjects,” they are neither data points nor ID numbers, they cannot be fully known by the values we assign to variables for them, and they are not individual representatives of the generalizations we hope to derive from our research (see Appendix F on this last point). The people who participate in research are individuals of inestimable worth and dignity, and they should be respected accordingly.