Root Out Respondent Bias in Your Survey
Online surveys, like any other research method, are susceptible to various kinds of bias, which you need to recognize and deal with. The term has a technical meaning here – an artefact that skews your data in some way – not the colloquial meaning of a person showing improper prejudice or pre-judgement. Here are some potential sources of bias in online surveys, at the level of the survey overall rather than of specific questions and analytic steps:
- Sampling bias: This occurs if the collection of respondents you obtain (the sample) is not representative of everyone you want to study (the population). For example, if the survey is distributed mostly to general counsel of publicly traded companies, it is biased against general counsel of privately held companies, not-for-profits, and government agencies, and its findings therefore cannot be generalized to all law departments. Or, if a software vendor primarily surveys its own user base and contacts, the results will be biased by that filter.
One form of sampling bias is called “imbalance.” Imagine a survey that seeks attitudes toward artificial intelligence in law firms of 100+ lawyers. The sponsor wants to report its findings by level within the firms (e.g., partners, associates, contract lawyers, paralegals, and staff) and has reason to believe that balanced participation would run 30%, 25%, 5%, 10%, and 30%, respectively, across those levels. If the actual survey data set shows widely different percentages, sampling bias has reared its head.
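One way to spot imbalance is to compare each level’s share of the actual respondents against the expected share. The sketch below is illustrative only: the respondent counts and the 5% tolerance are assumptions, not figures from any real survey.

```python
# A minimal imbalance check: compare actual respondent shares to expected shares.
# The expected shares come from the example above; the counts are hypothetical.
expected_share = {
    "partners": 0.30,
    "associates": 0.25,
    "contract lawyers": 0.05,
    "paralegals": 0.10,
    "staff": 0.30,
}

actual_counts = {          # hypothetical tally from the survey data set
    "partners": 180,
    "associates": 90,
    "contract lawyers": 15,
    "paralegals": 40,
    "staff": 75,
}

total = sum(actual_counts.values())
for level, expected in expected_share.items():
    actual = actual_counts[level] / total
    gap = actual - expected
    flag = "  <-- possible imbalance" if abs(gap) > 0.05 else ""
    print(f"{level:17s} expected {expected:5.1%}  actual {actual:5.1%}  gap {gap:+.1%}{flag}")
```

In this made-up example, partners are overrepresented and staff underrepresented, which is exactly the kind of skew the sponsor would want to catch before reporting findings by level.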
- Self-selection bias: This occurs when respondents decide on their own whether to take the survey, rather than being randomly selected or required to participate. When the sponsor invites a large group but only a portion elects to take part, the resulting sample may not represent the population. The general counsel of inefficient law departments, for example, don’t want to provide their data to a benchmark survey lest it confirm they are operating inefficiently, so they are probably less likely to participate than their fitter counterparts. With those less flattering numbers missing from the data, the benchmark results conceal mediocrity: published benchmarks may look better than what actually and typically prevails.
This is the mischief of self-selection bias. Some metrics, like total legal spending as a proportion of corporate revenue, may actually run higher in the full population than the published benchmark suggests. If only the lean ones weigh themselves, the world is fatter than the resulting medians and averages.
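A quick simulation makes the point concrete. Everything below is assumed for illustration – the spending distribution and the response probabilities are invented – but it shows how the respondents’ median can understate the true population median when heavy spenders opt out.

```python
# Illustrative simulation of self-selection bias in a spending benchmark.
import random
import statistics

random.seed(1)

# Hypothetical population: legal spend as a percent of corporate revenue.
population = [random.lognormvariate(mu=-0.7, sigma=0.5) for _ in range(1000)]

# Assume the chance of responding falls as spend rises (purely an assumption).
respondents = [spend for spend in population
               if random.random() < max(0.1, 1.0 - spend)]

print(f"True population median:         {statistics.median(population):.2f}% of revenue")
print(f"Self-selected benchmark median: {statistics.median(respondents):.2f}% of revenue")
```

The benchmark median comes out leaner than the population it claims to describe – the statistical version of only the lean ones weighing themselves.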
- Response bias: This bias worms its way into a survey if respondents do not answer questions honestly or accurately. It can stem from a variety of factors, such as the wording of the questions or the personal characteristics of the respondent. Some people dislike surveys, so those who do respond may be alike in some way (remember, survey bias is an artefact that disturbs the expected distribution of data). For example, the general counsel who respond to a survey collecting benchmark data might be more attuned to managing their departments effectively. Or, without a full range of salary data, it’s impossible to tell whether the participants in a salary survey resemble the real world and submitted accurate figures.
Even if you were to ask for things like socioeconomic status, race, and age, there could always be one more variable you forgot to account for, one that threatens the validity of the survey’s findings because of response bias. That said, if you’re willing to make assumptions about which variables affect responses, you can attempt an adjustment through “post-stratification.”
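Here is a minimal sketch of post-stratification under assumed numbers: hypothetical strata by department size, known population shares, and a handful of survey answers. The overrepresented stratum is down-weighted so the weighted sample matches the population mix.

```python
# Illustrative post-stratification: weight each response by
# (population share of its stratum) / (sample share of its stratum).
population_share = {"small": 0.50, "medium": 0.30, "large": 0.20}

# Hypothetical responses: (stratum, answer on a 1-5 satisfaction scale).
responses = [
    ("small", 4), ("small", 3),
    ("medium", 2), ("medium", 3), ("medium", 4),
    ("large", 5), ("large", 4), ("large", 5), ("large", 4), ("large", 3),
]

n = len(responses)
sample_share = {s: sum(1 for stratum, _ in responses if stratum == s) / n
                for s in population_share}
weights = {s: population_share[s] / sample_share[s] for s in population_share}

raw_mean = sum(answer for _, answer in responses) / n
adjusted_mean = (sum(weights[stratum] * answer for stratum, answer in responses)
                 / sum(weights[stratum] for stratum, _ in responses))

print(f"Raw mean:             {raw_mean:.2f}")
print(f"Post-stratified mean: {adjusted_mean:.2f}")
```

Because the large departments answered in disproportionate numbers and gave rosier ratings, the post-stratified mean dips below the raw mean. The adjustment, of course, is only as good as the assumption that the chosen variable drives the difference in responses.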
- Extremism bias: In Proofiness: How You’re Being Fooled by the Numbers (Penguin 2010) at 108, author Charles Seife makes a point about bias: “When surveys and polls depend on voluntary response, it’s almost always the case that people with strong opinions tend to respond much more often than those who don’t have strong opinions. This introduces a bias: the poll disproportionately reflects extreme opinions at the expense of moderate ones.”
Worse, “People are relatively silent when they’re reasonably content, but if they’re angry they tend to shout it from the mountaintop.” What might this phenomenon say about data on dissatisfaction with law firms, law departments, or software vendors? We hear from the few loudmouth critics more than from the many silent supporters.
- Social desirability bias: Many people succumb to giving socially acceptable answers rather than honest ones. Sensitive or controversial topics especially elicit this bias. The tendency can be minimized if a rating or ranking question has an even-numbered scale, which prevents people from sitting on the fence. Fears about anonymity may also bias some answers toward more positive ratings.
- Central tendency bias: People avoid choosing the most extreme responses on a scale, such as “Very Helpful” or “Strongly Disagree.” You can reduce the effect of this bias with clear definitions, such as “‘Very Helpful’ means you got what you needed from the Legal Department.”
- Self-report bias: Consider what may happen if a survey asks for typical hours worked during a week. People may inflate reality to feel better about themselves or to compare favorably with a figure they imagine to be typical. For this reason, compensation surveys are more trustworthy if the Human Resources department fills them out rather than the employees themselves. HR won’t exaggerate; a person who wants to use the results to argue for more pay might be tempted to boost the real figure.
To reduce the potential for these manifestations of bias in online surveys, be alert to them, design the survey and sampling procedures carefully, and take what steps you can to administer the survey in a way that maximizes the response rate.