Tutorial on How to Use Mechanical Turk for Behavioral Research

Tutorial Description

In this tutorial, we will demonstrate how to conduct behavioral research on Amazon's Mechanical Turk. We will begin by discussing the four main advantages of using Mechanical Turk as a platform for running online studies: access to a large pool of participants, diversity of the participants, low cost of running studies, and a faster research cycle. We will outline the fundamental components of a job on Mechanical Turk and discuss the features of the marketplace, including who is doing the work. We will describe how to run three kinds of studies on Mechanical Turk: surveys, experiments with random assignment, and synchronous experiments. We will demonstrate the mechanics of putting a task on Mechanical Turk by creating a survey, posting the job to Mechanical Turk, reviewing the responses, and paying the workers. Finally, we will discuss methods for quality assurance and ethical issues surrounding Mechanical Turk.

We anticipate the tutorial will last approximately four hours.

Tutorial Objectives

In this tutorial we will describe a new tool that has emerged in the last 5 years for conducting online behavioral research: crowdsourcing platforms. One of the main benefits of these platforms to behavioral researchers is that they provide access to a large set of people who are willing to do tasks---including participating in research studies---for relatively low pay. The crowdsourcing site with one of the largest subject pools is Amazon's Mechanical Turk (AMT), so it is the focus of this tutorial.

In this tutorial, we will begin by discussing some of the advantages of doing experiments on Mechanical Turk. Specifically, there are four main advantages to using Mechanical Turk as a platform for running online experiments:

  1. While researchers at large universities typically have access to large numbers of undergraduates who participate in experiments in exchange for academic credit, these subject pools may be much smaller or even non-existent at smaller colleges and universities, or may not be available to every researcher. The options for non-academic researchers are even fewer, with recruitment generally limited to ads posted online and flyers posted in public areas. Mechanical Turk offers a very large pool of online participants for these researchers.
  2. Workers on Mechanical Turk come from very diverse backgrounds, spanning a wide range of ages, ethnicities, socio-economic statuses (SES), languages, and countries of origin. Unfortunately, the population of workers on AMT is not representative of any one country or region, but it does open the door to cross-cultural and international research at a very low cost and can broaden the validity of studies beyond the undergraduate population.
  3. Studies on Mechanical Turk can be conducted at a very low cost, which compares favorably with the cost of paying laboratory participants. For example, Paolacci and colleagues (2010) replicated classic studies from the judgment and decision-making literature at a cost of approximately $1.71 per hour, and obtained results that neatly paralleled the same studies conducted with undergraduates in a laboratory setting.
  4. All too often, research is delayed because of the time it takes to recruit participants and recover from errors in the methodology. For instance, many academic researchers experience the drought / flood cycle of undergraduate subject pools, with supply of participants exceeding demand at the beginning and end of a semester, and then dropping to almost nothing at all other times. The participant availability on Mechanical Turk is relatively stable, with fluctuations in supply largely due to variability in the number of jobs available in the market. Moreover, experiments can be built and put on Mechanical Turk easily and rapidly, which further reduces the time to iterate the cycle of theory development and experimental execution.
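
To make the cost advantage concrete, the underlying arithmetic can be sketched as follows. This is only an illustration: the function name, the sample size, and the task length are assumptions made for the example, and the only figure taken from the text is the approximate $1.71 per hour reported by Paolacci and colleagues (2010). Any platform fee is left as a parameter that defaults to zero, because fees are not covered here.

```python
def estimate_study_cost(n_participants, minutes_per_task,
                        hourly_rate=1.71, fee_rate=0.0):
    """Estimate the total cost of a study in dollars.

    hourly_rate defaults to the approximate figure cited in the text.
    fee_rate is a hypothetical fractional surcharge (0.0 means no fee).
    """
    if n_participants < 0 or minutes_per_task < 0:
        raise ValueError("participants and minutes must be non-negative")
    # Pay for one participant, prorated from the hourly rate.
    pay_per_participant = hourly_rate * minutes_per_task / 60.0
    subtotal = pay_per_participant * n_participants
    # Apply the optional surcharge and round to whole cents.
    return round(subtotal * (1.0 + fee_rate), 2)


if __name__ == "__main__":
    # Hypothetical study: 100 participants, 10 minutes each.
    print(estimate_study_cost(100, 10))
```

Under these assumed numbers, a ten-minute study with 100 participants comes to under thirty dollars before any fees, which is the kind of comparison with paid laboratory sessions that this advantage refers to.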

We will then discuss how the behavior of workers compares to laboratory subjects, citing work by researchers from computer science and psychology. Then, we will walk through the mechanics of putting a task on Mechanical Turk including recruiting subjects, executing the task, and reviewing the work that was submitted. We will also provide solutions to common problems that a researcher might face when executing their research on this platform such as techniques for conducting synchronous experiments, methods to ensure high quality work, how to keep data private, and how to maintain code security.

Tutorial Take-aways

At the end of this tutorial, we expect the audience members to:

  1. Understand the Mechanical Turk marketplace, including who the workers are, what kinds of jobs are typically available, and what the basic components of a job on Mechanical Turk look like.
  2. Understand the basics of creating a job on Mechanical Turk, including how to create a Human Intelligence Task (HIT), how much to pay, how long to expect work to take, and how to approve work and pay workers.
  3. Be able to create a basic HIT using one of Amazon's templates, for the creation of surveys and similar study materials.
  4. Have a basic understanding of methods for randomizing workers into experimental conditions.
  5. Understand the basic components necessary for running a synchronous experiment on Mechanical Turk.
  6. Know how to utilize methods for ensuring quality data in the Mechanical Turk workspace.
  7. Be aware of the ethical issues surrounding Mechanical Turk and how to address them.

Tutorial Content

Materials

The majority of the tutorial will be in a presentation format, utilizing slides to explain the process of building and running an experiment on Mechanical Turk. Ideally, if there is access to the internet, we will actually build and run a study on Mechanical Turk. This will allow us not only to demonstrate how to do research on Mechanical Turk, but also to show how quickly and efficiently such research can be conducted. We will send slides used in a previous version of this tutorial and a manuscript on which the tutorial is based to the tutorial chair.

Outline

We will begin by motivating the use of Mechanical Turk for behavioral research, focusing on the four advantages outlined in the introduction to this proposal. We will also discuss prior research that demonstrates the validity of using workers as participants.

We will then discuss the basic concepts associated with using Mechanical Turk, including who the workers are, who the other requesters (i.e., employers) are, and what constitutes a Human Intelligence Task (HIT). This will include the information workers use to find HITs, where a HIT is stored, and the lifecycle of a HIT, from creation, to execution of the task, to approval of the work and payment of the worker. We will also discuss the typical cost of a HIT and how to assess the value of the work being requested.
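
The lifecycle described above can be pictured as a small state machine. The sketch below is a simplified model for illustration only: the stage names and the `advance` helper are hypothetical and do not correspond to Amazon's official status values or API.

```python
from enum import Enum


class Stage(Enum):
    """Simplified, hypothetical stages of one worker's pass through a HIT."""
    CREATED = "created"      # requester has posted the HIT
    ACCEPTED = "accepted"    # a worker has picked it up
    SUBMITTED = "submitted"  # the worker has finished the task
    APPROVED = "approved"    # requester accepts the work
    REJECTED = "rejected"    # requester declines the work
    PAID = "paid"            # payment has gone to the worker


# Which stages may follow each stage in this simplified model.
ALLOWED = {
    Stage.CREATED: {Stage.ACCEPTED},
    Stage.ACCEPTED: {Stage.SUBMITTED},
    Stage.SUBMITTED: {Stage.APPROVED, Stage.REJECTED},
    Stage.APPROVED: {Stage.PAID},
    Stage.REJECTED: set(),
    Stage.PAID: set(),
}


def advance(current, target):
    """Return target if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


if __name__ == "__main__":
    stage = Stage.CREATED
    for nxt in (Stage.ACCEPTED, Stage.SUBMITTED, Stage.APPROVED, Stage.PAID):
        stage = advance(stage, nxt)
        print(stage.value)
```

The point of the model is that payment follows approval, and approval follows submission, so a requester reviews work before any money changes hands.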

We will then talk about three types of studies that can be conducted on Mechanical Turk: surveys, experiments with random assignment, and synchronous experiments. When introducing surveys, we will actually create a survey with the help of the audience and post the job to Mechanical Turk. This will demonstrate both how to do surveys specifically as well as how to generally create HITs on Mechanical Turk. When discussing experiments with random assignment and synchronous experiments, we will review work we have conducted in both of these categories.
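
As one possible illustration of random assignment, the sketch below maps each worker to an experimental condition by hashing the worker's identifier. This is an assumed technique chosen for the example, not necessarily the method presented in the tutorial; the function name, the `salt` parameter, and the worker IDs are all hypothetical. Hashing gives a pseudo-random but repeatable assignment, so a worker who reloads the task lands in the same condition.

```python
import hashlib


def assign_condition(worker_id, conditions, salt="study-1"):
    """Deterministically map a worker ID to one of the given conditions.

    salt is a hypothetical per-study string; changing it reshuffles
    the assignment for a new study.
    """
    if not conditions:
        raise ValueError("at least one condition is required")
    # Hash the salted ID and read the hex digest as a large integer.
    digest = hashlib.sha256(f"{salt}:{worker_id}".encode("utf-8")).hexdigest()
    return conditions[int(digest, 16) % len(conditions)]


if __name__ == "__main__":
    conditions = ["control", "treatment"]
    for worker in ("WORKER_A", "WORKER_B", "WORKER_C"):  # made-up IDs
        print(worker, assign_condition(worker, conditions))
```

A design note: this approach does not guarantee equal group sizes, only approximately equal ones over many workers. If exact balance matters, a server-side counter or block randomization would be a more suitable choice.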

We will also briefly discuss other tools for creating HITs, including the command-line tools (CLTs) and PHP scripts that we will make available to the audience. After discussing these tools, we will retrieve and present the results of the survey initiated earlier in the tutorial, which will demonstrate the methods for retrieving results and paying workers, as well as the speed and low cost of conducting research on Mechanical Turk.

Finally, we will discuss some of the issues specific to conducting research on Mechanical Turk, including quality assurance, data security, and ethical issues (many of which apply to online research generally).
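
To give a flavor of what a quality assurance step might look like, the sketch below screens survey responses using two common heuristics: an attention-check question and a minimum completion time. These heuristics, the field names (`attention_check`, `seconds_taken`), and the thresholds are assumptions made for this example rather than the specific methods covered in the tutorial.

```python
def passes_quality_checks(response, expected_answer="blue", min_seconds=30.0):
    """Return True if a response passes both illustrative checks.

    response is a dict with hypothetical keys:
      'attention_check' - the answer given to the attention-check item
      'seconds_taken'   - time spent on the task, in seconds
    """
    answer = str(response.get("attention_check", "")).strip().lower()
    if answer != expected_answer:
        return False  # failed the attention check
    if float(response.get("seconds_taken", 0.0)) < min_seconds:
        return False  # finished implausibly fast
    return True


def filter_responses(responses, **kwargs):
    """Split responses into (kept, flagged) lists, preserving order."""
    kept, flagged = [], []
    for response in responses:
        target = kept if passes_quality_checks(response, **kwargs) else flagged
        target.append(response)
    return kept, flagged


if __name__ == "__main__":
    sample = [  # made-up data for illustration
        {"worker": "W1", "attention_check": "Blue", "seconds_taken": 95},
        {"worker": "W2", "attention_check": "red", "seconds_taken": 120},
        {"worker": "W3", "attention_check": "blue", "seconds_taken": 8},
    ]
    kept, flagged = filter_responses(sample)
    print([r["worker"] for r in kept], [r["worker"] for r in flagged])
```

Flagged responses are kept in a separate list rather than discarded, so that the decision to exclude data or reject work remains with the researcher.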

Potential Audience

Because Mechanical Turk is a tool for any researcher who does studies online, we expect this tutorial to have very broad appeal. The tutorial will be especially relevant to those who are unfamiliar with Mechanical Turk or who want to know details about good practices when doing research on the site and ways to ensure reliable data collection. This tutorial will not be particularly useful to individuals who are already using Mechanical Turk for sophisticated research, who can only conduct their research in the laboratory, or who are uninterested in conducting behavioral research.

Prerequisite Knowledge

We expect audience members to have only very basic familiarity with conducting behavioral research and with using the internet. No other knowledge will be required.

Resources

Attendees are invited to read the manuscript on which this tutorial is based, which can be found here. You may also download slides (.pptx) from an earlier (and shorter) version of this tutorial.