Jan. 4, 2019
They’re classic, well-known psychological studies that have informed policy, popular culture and a great deal of scholarship. One suggests you can “prime” people to think or act a certain way – for instance, expose people to words suggesting old age and they’ll walk out of the lab more slowly. Another asserts that people make different decisions depending on whether they’re asked to choose something or reject something.
But these studies and a significant percentage of others may not hold up, according to a growing body of work to which Wilfrid Laurier University researchers are contributing.
“In a lot of sciences, especially psychological science, there has been somewhat of a crisis in that people are unable to replicate a lot of findings,” says Associate Professor Tripat Gill, who is Canada Research Chair in Market Insight and Innovation. “There has been a reduction in trust of the findings in psychology, so that has led to efforts to try to address the issue.”
Gill and his former colleague, Associate Professor Grant Packard, participated in the Many Labs project of the U.S.-based Center for Open Science. The first phase of the project, Many Labs 1, attempted to replicate 13 classic psychological studies. The next phase of the project, Many Labs 2, attempted to replicate 28.
Many Labs 1 found that only 10 of the 13 effects the original authors reported could be consistently reproduced. Many Labs 2 found that fully half of its 28 effects could not be replicated.
Gill is a faculty member at Laurier’s Lazaridis School of Business and Economics, as was Packard until recently. Both specialize in marketing but use psychology extensively in their research. Gill is director of Laurier’s Consumer Research Laboratory and uses psychological research techniques to perform consumer behaviour experiments. Packard is trained as a social psychologist and focuses his work on areas connecting psychology and marketing.
Packard was the first to get involved in Many Labs. For Many Labs 1, which ran from 2012 to 2014, he helped with some of the planning and editing in addition to running experiments. Gill joined Packard in running experiments for Many Labs 2, though both emphasize they were contributors, not project leaders.
The keys to the Many Labs project were size and diversity. When individual researchers had previously tried and failed to replicate the results of other scholars’ studies, the original researchers often responded that the discrepancy could be explained by differences in the size or composition of the participant groups.
To counter these objections, the Many Labs experiments were conducted by hundreds of researchers around the world. More than 6,000 subjects participated in Many Labs 1 and more than 15,000 in Many Labs 2. In general, effects that held up in one lab held up around the world, and effects that failed in one lab failed everywhere.
Separately from the psychology-focused Many Labs project, Gill has tried to replicate marketing and consumer behaviour studies in his lab. He has published two papers on his results – and neither was encouraging for his discipline.
One of the studies Gill tried to replicate was based on the idea of priming, which the Many Labs project had found problematic. Previous researchers had found exposing people to images of dogs made them feel more positive about product logos and names related to both dogs and cats.
“For instance, they found that preference for the brand Puma goes up after priming with dogs,” says Gill. “I tried to replicate that through multiple studies, using Puma and also other cat-related brands like Jaguar. And I could not.”
Gill also examined research that had found when people are faced with a large number of choices, it helps to put them in groups, even if the groupings are meaningless. For example, the researchers found people were more satisfied choosing from five lists labelled A-E, consisting of 10 items each, than choosing from one list of 50 items.
With former doctoral student David Lewis (PhD ’17), Gill tested whether people would be more satisfied choosing financial products from grouped lists. The result? Putting items in groups had no effect on choice satisfaction.
“We use the effects previous studies have found to make recommendations to business managers and marketers. They’re considered like truth but people haven’t questioned or checked them enough,” says Gill. “This is an area where I feel the marketing discipline is lagging. I would love to take part in a Many Labs for marketing. Maybe in the future I might initiate this myself.”
Gill and Packard see three causes behind the replication crisis. One is human error, most commonly “a misunderstanding about how to properly perform and interpret statistical results from human participant experiments,” says Packard.
Another is what’s known as the file drawer problem. “Journals have a bias towards publishing cool stuff which they think will get a lot of attention,” says Gill. “But the truth of a finding should be more important than how exciting it is.”
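The file drawer problem can be illustrated with a quick simulation (a hypothetical sketch, not drawn from any of the studies discussed here): if journals publish only statistically significant results, the published record will systematically overstate a small true effect, because only the studies that happened to observe an unusually large effect make it into print.

```python
import random
import statistics

def run_study(true_effect, n, rng):
    """Simulate one study: draw n observations around true_effect
    and return the observed mean, plus whether a crude z-test
    (variance of 1 assumed known) calls it significant at p < .05."""
    data = [rng.gauss(true_effect, 1) for _ in range(n)]
    mean = statistics.mean(data)
    se = 1 / n ** 0.5
    return mean, abs(mean / se) > 1.96

rng = random.Random(0)
true_effect, n, num_studies = 0.1, 30, 5000  # illustrative values
results = [run_study(true_effect, n, rng) for _ in range(num_studies)]

all_mean = statistics.mean(m for m, _ in results)
# The "file drawer": only significant results get published.
published = [m for m, sig in results if sig]
print(f"true effect: {true_effect:.2f}")
print(f"mean across all studies: {all_mean:.2f}")
print(f"mean across published studies only: {statistics.mean(published):.2f}")
```

The average effect across all studies sits near the true value, while the average across the "published" subset is several times larger, which is why a literature filtered this way looks stronger than the underlying reality.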
The most disturbing possibility is intellectual dishonesty, says Gill. For instance, some researchers are thought to “hack” a study’s p-value, a number that indicates how likely a result at least that large would be if there were no real effect, by running studies only until they achieve a significant p-value, or by reporting only the subset of their results that happens to be statistically significant.
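The "running studies only until they achieve a significant p-value" strategy, known as optional stopping, can also be simulated. The sketch below (illustrative only; it approximates the t-test with a normal z-test for simplicity) shows how checking the p-value after every added participant inflates the false-positive rate well above the nominal 5 percent, even when the true effect is zero.

```python
import math
import random
import statistics

def p_value(sample):
    """Two-sided p-value for a one-sample test of mean 0,
    using a normal approximation rather than a proper t-test."""
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    z = abs(statistics.mean(sample) / se)
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

def optional_stopping(rng, start_n=10, max_n=100, alpha=0.05):
    """Add one subject at a time, checking significance after each,
    and stop as soon as p < alpha. The true effect is zero, so
    every 'significant' stop is a false positive."""
    sample = [rng.gauss(0, 1) for _ in range(start_n)]
    while True:
        if p_value(sample) < alpha:
            return True
        if len(sample) >= max_n:
            return False
        sample.append(rng.gauss(0, 1))

rng = random.Random(42)
trials = 1000
rate = sum(optional_stopping(rng) for _ in range(trials)) / trials
print(f"False-positive rate with optional stopping: {rate:.1%}")
# A fixed-sample test at the same alpha would be near 5%.
```

Because the researcher gets many chances for the p-value to dip below the threshold by luck, the effective error rate is several times the advertised one, which is exactly why pre-registering the sample size (as discussed below) matters.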
To improve people’s confidence in research, the culture of academia needs to change, says Gill. For starters, he’d like to see more researchers across disciplines engaging in replication efforts.
Gill has had other researchers check his work and would welcome more replication attempts. For instance, a decade-old study of his on convergent products, which combine two or more categories – such as a phone and a camera – found that functional products do well when they add a just-for-fun use. With entertainment-oriented products, however, it’s best to add a function that serves the same entertainment goal. Two subsequent research studies independently replicated his findings.
Some journals are beginning to take replication seriously, such as by creating sections devoted to replication research. Another measure is accepting papers based on pre-registered proposals: researchers submit their hypotheses and the methods they will use, including the number of study participants, and the journal guarantees publication regardless of the results.
“I think this is the best step so far because researchers are not scared about whether they’ll get an effect or not,” says Gill.
To help improve the credibility of future studies, Gill and his lab have joined the Psychological Science Accelerator, a network of more than 350 labs on every populated continent that work together to collect data for democratically selected studies, whether these are replication studies or novel ones.
Gill says universities also need to change. To encourage up-and-coming scholars to take replication seriously, he offers his students the chance to perform replication studies for term papers and other assignments rather than always designing original experiments.
The way faculty members are rewarded should also be re-examined, says Gill.
“You only get tenure if you have publications but you only get publications if you have positive and exciting findings,” says Gill. “I think this push to publish only positive findings could be a big factor behind intellectual dishonesty.”