Appendix G: Research methods glossary
This glossary provides definitions for the research methods jargon found in this book and for some other terms you might encounter as you learn more about research methods.
Accuracy, level of (in sampling): The breadth of the interval in which parameters can be estimated using statistics with a given level of confidence
Administrative data: Data collected in the course of implementing a policy or program or operating an organization
Alternative hypothesis: See hypothesis testing
Analytic generalizability: The extent to which a theory applies (“generalizes”) to a given case; demonstrating analytic generalizability is held by some researchers as a goal for qualitative research
Antecedent variable: An independent variable that causes changes in the key independent variable, which, in turn, causes change in the dependent variable
Association: A probabilistic relationship between two or more variables
Axial coding: Organizing the themes that emerge from open coding, frequently by combining them into general themes subdivided into more specific themes and identifying additional relationships among codes, resulting in an organized set of codes that can be used in subsequent analysis of qualitative data
Bias: The systematic distortion of findings due to a shortcoming of the research design
Case study comparison research design: Research design in which multiple case studies are conducted and compared
Case study research design: Systematic study of a complex case (such as an event, a program, or a policy) that is in-depth and holistic and that uses multiple data sources/methods/collection techniques
Case: An object of systematic observation; an entity to which we assign values for variables
Census: (1) A sample composed of the entire population; (2) a study in which the sample is composed of the entire population
Chunking: Identifying short segments of meaningful qualitative data to be coded and analyzed
Closed-ended question: A survey or interview question that requires respondents to select from a set of predetermined responses
Cluster sampling: A probability sampling design in which successively narrower aggregates of cases are selected before ultimately selecting cases for inclusion in the sample
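As an illustration of the cluster sampling entry above, here is a minimal Python sketch of a two-stage version in which clusters (hypothetical schools) are selected first and cases (students) are then selected within them; the data, the number of stages, and the use of the random module are assumptions for illustration, not part of the glossary definition.

```python
# Illustrative two-stage cluster sampling with hypothetical data.
import random

# 20 schools (clusters), each with 30 students (cases)
clusters = {f"school_{i}": [f"s{i}_{j}" for j in range(30)] for i in range(20)}

chosen_clusters = random.sample(list(clusters), 4)          # stage 1: pick schools
sample = [case
          for school in chosen_clusters
          for case in random.sample(clusters[school], 5)]   # stage 2: pick students
print(len(sample), sample[:5])
```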
Coding: See axial coding, open coding, selective coding
Concept: An abstraction derived from what many instances of it have in common
Concurrent validity: A type of criterion validity describing the extent to which a variable (or set of variables intended to operationalize a single concept) relates to another variable measured at the same time as would be expected if the variable accurately measures what it is intended to measure
Confidence, level of (in sampling): The certainty, expressed as a percentage, with which parameters can be estimated using statistics with a given level of accuracy; the percentage of times an estimated parameter would be expected to be within a given range (the level of accuracy) if calculated using data collected from a large number of hypothetical samples
Confidence interval: The range of values we estimate a population parameter to fall in at a given level of confidence
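As a concrete illustration of the two entries above, the sketch below computes a point estimate and an approximate 95% confidence interval for a mean. The data are hypothetical, and using the normal critical value of 1.96 (rather than a t value for this small sample) is a simplifying assumption.

```python
# Illustrative sketch: approximate 95% confidence interval for a sample mean.
import statistics

sample = [4.1, 3.8, 5.0, 4.6, 4.3, 3.9, 4.7, 4.4]   # hypothetical data
n = len(sample)
mean = statistics.mean(sample)
se = statistics.stdev(sample) / n ** 0.5             # standard error of the mean
z = 1.96                                             # critical value for 95% confidence

lower, upper = mean - z * se, mean + z * se
print(f"point estimate: {mean:.2f}, 95% CI: ({lower:.2f}, {upper:.2f})")
```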
Content validity: An aspect of operational validity describing the extent to which the operationalization of an abstract concept measures the full breadth of meaning connoted by the concept
Control variable: A variable that might threaten nonspuriousness when examining the causal relationship between an independent variable and dependent variable; control variables are plausibly related to both the independent and dependent variables and could thus explain an observed association between them; in an experiment or quasi-experiment, control variables are those variables held constant so that they cannot affect the dependent variable while the independent variable is manipulated
Convenience sampling: A nonprobability sampling design in which cases are selected because they are convenient for the researcher
Conversational interviews: Interviews conducted following a very flexible protocol outlining general themes but permitting the interview to evolve like a natural conversation between the researcher and respondent
Criterion validity: An aspect of operational validity describing the extent to which a variable (or set of variables intended to operationalize a single concept) is associated with another variable as would be expected if the variable accurately measures what it is intended to measure
Cross-sectional research design: A formal research design in which data are collected in one “wave” of data collection, with data analysis making no distinction among data collected at different times
Data analysis: Systematically finding patterns in data
Dependent variable: A variable with values that are dependent on the values of another variable; in a cause-and-effect relationship, the variable representing the effect
Descriptive data analysis: Quantitative data analysis that summarizes characteristics of the sample
Discriminate validity: An aspect of operational validity describing the extent to which the operationalization of an abstract concept discriminates between the target concept and other concepts
Disproportionate stratified sampling: A probability sampling design in which the proportions of cases in the population demonstrating known characteristics are intentionally and strategically different for the cases in the sample, usually to permit comparisons among subsets of the sample that may otherwise have had too few cases
Dissemination: Sharing the results of a study, and how it was conducted, with a wide audience, usually through publication
Double-barreled question: A question, such as in an interview or survey, that is actually asking two questions at once
Dummy variables and dummy coding: A dummy variable is a variable that takes on two values: one (meaning, basically, yes) and zero (meaning no). Dummy coding is the process of transforming a single categorical variable into a series of dummy variables, with each value of the original categorical variable transformed into its own dummy variable. For example, the variable student classification, with the values freshman, sophomore, junior, and senior, can be transformed into four dummy variables, freshman, sophomore, junior, and senior, each taking on the values of one or zero. Dummy coding a categorical variable thus yields a series of ratio-level variables, enabling a much wider range of quantitative analysis options.
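A minimal sketch of the dummy coding described above, using pandas (a library choice of ours, not prescribed by the text) and the hypothetical student classification example:

```python
# Illustrative dummy coding of a categorical variable with pandas.
import pandas as pd

df = pd.DataFrame({"classification": ["freshman", "sophomore", "junior", "senior", "junior"]})

# Each value of the categorical variable becomes its own 0/1 dummy variable.
dummies = pd.get_dummies(df["classification"], dtype=int)
print(dummies)
```

In regression analysis, one of the resulting dummies is often omitted as a reference category to avoid perfect collinearity.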
Ecological fallacy: A research finding made in error by mistakenly applying what has been learned about groups of cases to individual cases
Effect size: A quantitative measure of the magnitude of a statistical relationship
Empirical research: Generating knowledge based on systematic observations
Empirical: Based on systematic observation
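To make the effect size entry above concrete, here is a minimal sketch of one common effect size measure, Cohen's d, for the difference between two group means; the data and the choice of Cohen's d are illustrative assumptions, not something the glossary prescribes.

```python
# Illustrative sketch of Cohen's d for two hypothetical groups.
import statistics

group_a = [12, 15, 14, 10, 13, 16]
group_b = [18, 17, 19, 16, 20, 18]

# Pooled standard deviation of the two groups
pooled_sd = (((len(group_a) - 1) * statistics.stdev(group_a) ** 2 +
              (len(group_b) - 1) * statistics.stdev(group_b) ** 2) /
             (len(group_a) + len(group_b) - 2)) ** 0.5

d = (statistics.mean(group_b) - statistics.mean(group_a)) / pooled_sd
print(f"Cohen's d = {d:.2f}")
```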
Empiricism: The stance that the only things that are “real” and therefore matter are those things that can be directly observed; not to be confused with empirical
Experimental research design: A formal research design in which cases are randomly assigned to at least one experimental group and one control group, with the researcher determining the values of the independent variables that will be assigned to each group and the dependent variable measured after (and usually before as well) manipulation of the independent variable
External validity: The generalizability of claims generated by empirical research beyond cases directly observed
Face validity: An aspect of operational validity describing the extent to which a variable (or set of variables intended to operationalize a single concept) appears to measure what it is intended to measure
Fact-value dichotomy: The naïve view that fact and value are always wholly distinct categories
Focus group: A group of individuals who share something in common of relevance to the research project and who are interviewed together and encouraged to interact to allow themes to emerge from the group discourse
Generalize: To make claims beyond what can be claimed based on direct observation, such as making claims about an entire population based on observations of a sample of the population
Hawthorne effect: Bias resulting from changes in research participants’ behavior effected by their awareness of being observed
Hypothesis: A statement describing the expected relationship between two or more variables
Hypothesis testing: A method used in inferential statistics wherein the statistical relationships observed in sample data are compared to a hypothetical distribution of data in which there is no analogous relationship to generate an estimate of how likely or unlikely the observed relationship is; the observed relationship being tested is stated as the alternative hypothesis, which is compared to the statement of no relationship, the null hypothesis
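As an illustration of hypothesis testing, the sketch below runs a two-sample t test with scipy (our library choice) on hypothetical groups; the null hypothesis is that the group means do not differ, and a small p-value indicates that the observed difference would be unlikely if the null hypothesis were true.

```python
# Illustrative two-sample t test on hypothetical data.
from scipy import stats

group_a = [12, 15, 14, 10, 13, 16]
group_b = [18, 17, 19, 16, 20, 18]

t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```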
Independent variable: A variable with values that, at least in part, determine values of another variable; in a cause-and-effect relationship, the variable representing the cause
Inferential data analysis: Quantitative data analysis that uses statistics to estimate parameters
Informed consent: An individual’s formal agreement to participate in a study after receiving information about the study’s risks and benefits, assurances that participation is voluntary, what participation will entail, confidentiality safeguards, and whom to contact if they have questions or concerns about the study
Institutional Review Board: A committee responsible for ensuring compliance with ethical standards for conducting research at an institution, such as a university
Internal validity: The truth of causal claims inferred from empirical research
Interval scale of measurement: Describes a variable with numeric values but no natural zero
Intervening variable: An independent variable that itself is affected by the key independent variable and then, in turn, causes change in the dependent variable
Interview protocol: The set of instructions and questions used to guide interviews
Latent variable: A variable that cannot be directly observed, such as an abstract concept, attitude, or private behavior
Literature review: (1) The process of finding and learning from previous research as one of the early steps in the research process; (2) a paper that summarizes, structures, and evaluates the existing body of knowledge addressing a research question; (3) a section of a larger research report that summarizes, structures, and evaluates the existing body of knowledge being addressed by the research and locates the research being reported in that larger body of knowledge
Logic model: A diagram depicting the way a program is intended to work, including its inputs, activities, outputs, and outcomes
Manifest variable: A variable that can be observed and is thought to indicate the values of a latent variable
Memoing: Writing notes to document the qualitative researchers’ thought processes associated with every step of qualitative research and their evolving ideas about what is being learned during the course of data analysis
Meta-analysis: A method of synthesizing previous research using statistical techniques that combine the results from multiple separate studies; the results of research using this method
Mixed methods research: Research using both qualitative and quantitative data
Natural experiment: A quasi-experimental design that capitalizes on “naturally” occurring variation in the independent variable
Nominal scale of measurement: Describes a variable with categorical values that have no inherent order
Nonparametric data analysis: Analysis of quantitative data using statistical techniques that do not require the data to have an underlying normal distribution, homogeneous variance, and independent error terms, and that are therefore suitable when the data do not meet these assumptions
Nonprobability sampling design: A strategy for selecting a sample in which the probability of cases being selected is either unknown or not considered when selecting cases for inclusion in the sample, with sample selection made for some other reason (see convenience sampling, purposive sampling, quota sampling, and snowball sampling)
Nonspurious: Not attributable to any other factor
Null hypothesis: See hypothesis testing
Open coding: Assigning labels/descriptors/tags to “chunks” of qualitative data that note the data’s significance for addressing the research question; a first step in identifying important themes that emerge from qualitative data
Open-ended question: A survey or interview question without any predetermined responses
Operational validity: The extent to which a variable (or set of variables intended to operationalize a single concept) accurately and thoroughly measures what it is intended to measure
Operationalize: To describe how observations will be made so that values can be assigned to variables for cases
Ordinal scale of measurement: Describes a variable with categorical values that have an inherent order
Panel research design: A formal research design in which data are collected at different points across time from the same sample
Parameter: A quantified summary characteristic of a population
Parametric data analysis: Analysis of quantitative data using statistical techniques that are suitable only when the data have an underlying normal distribution, homogeneous variance, and independent error terms
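To illustrate the contrast between the nonparametric and parametric entries above, the sketch below applies a parametric t test and a nonparametric Mann-Whitney U test (our choice of tests, via scipy) to hypothetical data containing an outlier; the nonparametric test does not rely on the normality assumption that the outlier strains.

```python
# Illustrative parametric vs. nonparametric comparison on hypothetical data.
from scipy import stats

group_a = [2.1, 2.4, 2.2, 9.8, 2.3, 2.5]   # note the outlier
group_b = [3.0, 3.2, 2.9, 3.1, 3.3, 3.4]

print(stats.ttest_ind(group_a, group_b))      # parametric t test
print(stats.mannwhitneyu(group_a, group_b))   # nonparametric Mann-Whitney U test
```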
Peer review: The process of having a research report (or other form of scholarship) reviewed by scholars in the field, usually as a prerequisite for publication
Plagiarism: The written misrepresentation of someone else’s words or ideas as one’s own
Point estimate: A statistic calculated from sample data used to estimate the population parameter; usually referred to in distinction to the confidence interval
Policy model: An explanation of how a policy is supposed to work, including its inputs, how it is intended to be implemented, its intended outcomes, and the assumptions that undergird the intended change process
Population: Total set of cases of interest; all cases to which the research is intended to apply
Predictive validity: A type of criterion validity describing the extent to which a variable (or set of variables intended to operationalize a single concept) predicts future change in another variable as would be expected if the variable accurately measures what it is intended to measure
Probability sampling design: A strategy for selecting a sample in which every case in the population has a known (or knowable) nonzero probability of being included in the sample
Proportionate stratified sampling: A probability sampling design in which the proportions of cases in the population demonstrating known characteristics are replicated in the sample
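A minimal sketch of proportionate stratified sampling, assuming a hypothetical population of 100 cases split 70/30 across two strata and using pandas (our library choice): sampling the same fraction from each stratum reproduces the population's proportions in the sample.

```python
# Illustrative proportionate stratified sampling with hypothetical data.
import pandas as pd

population = pd.DataFrame({
    "id": range(1, 101),
    "sector": ["public"] * 70 + ["nonprofit"] * 30,   # 70% / 30% split
})

# Draw the same fraction (10%) from each stratum.
sample = population.groupby("sector", group_keys=False).sample(frac=0.1, random_state=1)
print(sample["sector"].value_counts())   # roughly 7 public, 3 nonprofit
```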
Purposive sampling: A nonprobability sampling design in which cases are selected because they are of interest, typical, or atypical as suits the purposes of the research
Qualitative data: Textual data
Quantitative data: Numeric data
Quasi-experimental research design: A formal research design similar to experimental research design but with assignment to experimental and comparison groups made in a nonrandom fashion
Quota sampling: A nonprobability sampling design in which cases are selected as in convenience sampling but such that the sample demonstrates desired proportions of characteristics, either to replicate known population characteristics or permit comparisons of subsets of the sample
Ratio scale of measurement: Describes a variable with numeric values and a natural zero
Reliability: The extent to which hypothetical repeated measures of variables would generate the same values for the same cases
Research design: (1) Generally, a description of the entire research process; (2) more narrowly, the formal research design used to structure the research, including cross-sectional, time series, panel, experimental, quasi-experimental, and case study research designs
Response set bias: Bias resulting from a response set that leads respondents to select less accurate responses than they otherwise would
Response set: The set of responses that respondents may select from when answering a closed-ended question
Sample: Subset of population used to learn about the population; the cases which are observed
Sampling error: The difference between a statistic and its corresponding parameter
Sampling frame: List of cases from which a sample is selected
Secondary data: Data collected by someone other than the researcher, usually without having anticipated how the data would ultimately be used by the researcher
Selective coding: Assigning a set of codes (such as a system of codes developed through axial coding) to “chunks” of qualitative data
Semi-structured interviews: Interviews conducted following an interview protocol that specifies questions and potential follow-up questions but permitting flexibility in the order and specific wording of questions
Simple random sampling: A probability sampling design in which every case in the population has an equal probability of being selected for inclusion in the sample
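A minimal sketch of simple random sampling from a hypothetical sampling frame of 100 case IDs, using Python's random module:

```python
# Illustrative simple random sampling: every case has an equal selection probability.
import random

frame = list(range(1, 101))          # hypothetical sampling frame of 100 case IDs
sample = random.sample(frame, 10)    # draw 10 cases without replacement
print(sorted(sample))
```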
Snowball sampling: A nonprobability sampling design in which one case is selected for the sample, which then leads the researcher to another case for inclusion in the sample, then another case, and so on (also called network sampling when cases are people)
Social desirability bias: The tendency of interviewees to provide responses they think are more socially acceptable than accurate responses
Standardized interviews: Interviews conducted following an interview protocol requiring identical wording and question order for all respondents
Statistic: A quantified summary characteristic of a sample
Systematic sampling: A probability sampling design in which every kth case in the sampling frame is selected for inclusion in the sample; if there is a discrete (as opposed to hypothetically infinite) sampling frame, k equals the number of cases in the population divided by the number of cases desired to be in the sample
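A minimal sketch of systematic sampling from a hypothetical sampling frame of 100 case IDs: compute k from the frame and desired sample sizes, pick a random starting point within the first interval, and then take every kth case.

```python
# Illustrative systematic sampling: select every kth case from the frame.
import random

frame = list(range(1, 101))        # hypothetical sampling frame of 100 case IDs
desired_n = 10
k = len(frame) // desired_n        # k = population size / desired sample size = 10
start = random.randrange(k)        # random start within the first interval
sample = frame[start::k]
print(sample)
```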
Theory: A set of concepts and relationships among those concepts posited in a formal statement to describe or explain the phenomenon of interest
Time series research design: A formal research design in which data are collected at different points across time from independent samples
Unit of analysis: The entity—the whom or what—that is being studied; the entity for which observations are being recorded in a study
Validity: Truthfulness of claims made based on research; see operational validity, face validity, content validity, discriminate validity, criterion validity, concurrent validity, predictive validity, internal validity, external validity
Variable: A logical grouping of attributes; the category to which these attributes belong; a factor/quality/condition that can take on more than one value/state