Pulling Back the Curtain: Using Mechanical Turk for Research

Jun 13, 2017

Although random samples of a population are the gold standard in social science research, convenience samples—or groups of people who are chosen because they are easy to reach—can be a valuable tool for researchers in the initial phases of a research program. The Internet has opened up new avenues for easily accessing large groups of potential study participants through social networking sites like Facebook, discussion forums like Reddit, or commercial market research panels. One increasingly popular source of convenience samples is Amazon’s Mechanical Turk (MTurk).

MTurk is an online labor market that lets people and companies crowdsource tasks to workers, who select the ones they want to complete. It is named after the Mechanical Turk, an eighteenth-century chess-playing machine that “defeated” opponents including Benjamin Franklin and Napoleon Bonaparte but was secretly controlled by a human chess master hidden inside. Similarly, MTurk was designed to provide workers who can “hide” inside technology companies, supplying the “artificial artificial intelligence” necessary to do tasks that cannot yet be done by machine. For example, MTurk workers help identify duplicate products on Amazon.com, convert scanned business cards to professional connections on LinkedIn, and categorize Twitter posts—deciding, say, whether a post like “Ryan plays his Trump card” is about politics or last night’s card game.

MTurk is also an exciting tool for researchers. Some use it to crowdsource large tasks of their own, like identifying topics of media stories or transcribing interviews, but it has also become a popular source of participants for research studies. MTurk is fast (hundreds of responses per day) and inexpensive ($2 to $7 per participant hour), allowing researchers to rapidly pilot-test surveys for length and comprehension, or even to test research hypotheses like how people respond to different messages about a policy, how they decide between different products or services, or how different demographic or personality variables are correlated with each other. In this role, the results provided by MTurk samples fill an important gap between a researcher’s own intuition about how people think and behave (instant, free, and based on little evidence), and a full-scale, probability-based sample of a population (expensive, slow, and very accurate). The appeal of MTurk to researchers is clear in the hundreds of peer-reviewed social science papers published every year that incorporate data from MTurk samples. I summarize what is known about the pros and cons of conducting research using MTurk in an article in the Annual Review of Clinical Psychology.
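To make that workflow concrete, here is a minimal sketch of how a researcher might post a pilot survey as a task (a “HIT”) through the MTurk Requester API, assuming Python with the boto3 library. The survey URL, reward, and other parameters are illustrative placeholders, and the sketch targets the sandbox endpoint so nothing is posted to the live marketplace.

```python
# A minimal sketch of posting a pilot survey as a HIT, assuming the
# boto3 library and AWS credentials configured for an MTurk requester
# account. All parameter values below are illustrative placeholders.
import boto3

# The sandbox endpoint lets you preview and test a HIT without paying
# workers; drop endpoint_url to post to the live marketplace.
mturk = boto3.client(
    "mturk",
    region_name="us-east-1",
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

# An ExternalQuestion embeds a survey hosted elsewhere (e.g. a survey
# platform or your own server) in a frame inside the HIT.
external_question = """\
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://example.com/my-pilot-survey</ExternalURL>
  <FrameHeight>600</FrameHeight>
</ExternalQuestion>"""

response = mturk.create_hit(
    Title="10-minute opinion survey (pilot)",
    Description="Answer a short questionnaire about everyday decisions.",
    Keywords="survey, research, questionnaire",
    Reward="1.00",                        # USD per completed assignment
    MaxAssignments=100,                   # number of distinct workers
    LifetimeInSeconds=3 * 24 * 60 * 60,   # how long the HIT stays listed
    AssignmentDurationInSeconds=30 * 60,  # time allowed per worker
    Question=external_question,
)
print("HIT created:", response["HIT"]["HITId"])
```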

Who are these people?

Initially, some researchers were skeptical about data that came from MTurk workers. Who are these people, and why would they work for so little money? Fortunately, we can study these questions in detail because it is so easy to find MTurk workers. In fact, dozens of studies about these topics have been published, making this population one of the best-understood convenience samples available today. At any given time, the number of active MTurk users is estimated to be somewhere between 10,000 and 25,000 people, most of whom live in the United States or India. The U.S. MTurk worker population is diverse, but by no means representative of Americans as a whole. For example, as might be expected from a new Internet community, MTurk workers tend to be young and well-educated. They also tend to be students or people without full-time jobs.

Workers are mostly motivated by the money they earn for completing tasks, but money isn’t the only factor. Because the tasks take only a short time to complete and can be done anywhere, MTurk offers a flexibility that many jobs do not. Many workers also say they select tasks they find to be intrinsically enjoyable or that allow them to improve skills such as translating. For many workers, MTurk falls somewhere between a way to pass idle time and a traditional job.

Can we trust what they say?

A major concern about seemingly anonymous survey respondents is that they may lie or not take research studies seriously. Many studies have shown that, by this standard, the quality of data MTurk workers produce can be as high as—or higher than—the quality produced by people recruited for other non-probability samples. The demographic information that workers report is consistent over time, outcome measures that are supposed to be correlated with each other are in fact correlated, and there is relatively little evidence of careless behavior such as skipping questions or selecting responses at random. At least part of the reason for this diligence is that each worker has a unique identification number linked to his or her “real world” identity. MTurk includes an online reputation system through which requesters can reject poor-quality work, and a high rejection rate restricts a worker’s access to future work opportunities.
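For illustration, the sketch below shows what this reputation system looks like from the requester’s side, again assuming Python with the boto3 library: a qualification requirement that screens workers by their lifetime approval rate, and a review loop that approves or rejects submitted work. The HIT ID and the passes_quality_checks helper are hypothetical placeholders.

```python
# A sketch of the requester's side of the reputation system, assuming
# the boto3 library. The HIT ID is a placeholder, and
# passes_quality_checks is a hypothetical stand-in for study-specific
# validation such as attention checks.
import boto3

mturk = boto3.client("mturk", region_name="us-east-1")

# MTurk's built-in qualification for a worker's lifetime approval
# rate; pass this in QualificationRequirements when calling create_hit
# to screen out workers whose past work was frequently rejected.
approval_rate_requirement = {
    "QualificationTypeId": "000000000000000000L0",  # PercentAssignmentsApproved
    "Comparator": "GreaterThanOrEqualTo",
    "IntegerValues": [95],
}

def passes_quality_checks(answer_xml: str) -> bool:
    # Placeholder for your own validation logic (attention checks,
    # free-text screening, completion time, and so on).
    return True

# Review work submitted for an existing HIT.
submitted = mturk.list_assignments_for_hit(
    HITId="EXAMPLE_HIT_ID",
    AssignmentStatuses=["Submitted"],
)["Assignments"]

for assignment in submitted:
    if passes_quality_checks(assignment["Answer"]):
        mturk.approve_assignment(AssignmentId=assignment["AssignmentId"])
    else:
        # Rejection lowers the worker's approval rate, which in turn
        # restricts access to HITs that use requirements like the one
        # above; this is the incentive for diligence described above.
        mturk.reject_assignment(
            AssignmentId=assignment["AssignmentId"],
            RequesterFeedback="Responses failed the attention checks.",
        )
```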

None of this is to say that research conducted on MTurk is without its limitations, or that researchers using this platform will never confront data quality issues. All research methods involve challenges, and our understanding of best practices for comparatively new methods like online surveys will keep evolving. However, many of the known issues with using the platform to recruit participants can be avoided by applying appropriate research methods. Most exciting of all, the large community of researchers using this platform means that our understanding of these best practices is growing rapidly.
