3730 Walnut Street
500 Jon M. Huntsman Hall
Philadelphia, PA 19104
Research Interests: behavioral economics, consumer behavior, experimental methodology, judgment and decision making
Links: Personal Website
Professor Simonsohn studies judgment, decision making, and methodological topics.
He is a reviewing editor for the journal Science, an associate editor of Management Science, and a consulting editor for the journal Perspectives on Psychological Science.
He teaches decision-making courses to undergraduate, MBA, and PhD students (OID290, OID690, OID900, and OID937).
He has published in psychology, management, marketing, and economics journals.
Abstract: In 2010-2012, a few largely coincidental events led experimental psychologists to realize that their approach to collecting, analyzing, and reporting data made it too easy to publish false-positive findings. This sparked a period of methodological reflection that we review here and call “psychology’s renaissance.” We begin by describing how psychology’s concerns with publication bias shifted from worrying about file-drawered studies to worrying about p-hacked analyses. We then review the methodological changes that psychologists have proposed and, in some cases, embraced. In describing how the renaissance has unfolded, we attempt to describe different points of view fairly but not neutrally, so as to identify the most promising paths forward. In so doing, we champion disclosure and pre-registration, express skepticism about most statistical solutions to publication bias, take positions on the analysis and interpretation of replication failures, and contend that “meta-analytical thinking” increases the prevalence of false-positives. Our general thesis is that the scientific practices of experimental psychologists have improved dramatically.
Abstract: In a well-known article, Carney, Cuddy, and Yap (2010) documented the benefits of “power posing.” In their study, participants (N=42) who were randomly assigned to briefly adopt expansive, powerful postures sought more risk, had higher testosterone levels, and had lower cortisol levels than those assigned to adopt contractive, powerless postures. In their response to a failed replication by Ranehill et al. (2015), Carney, Cuddy, and Yap (2015) reviewed 33 successful studies investigating the effects of expansive vs. contractive posing, focusing on differences between these studies and the failed replication, to identify possible moderators that future studies could explore. But before spending valuable resources on that, it is useful to establish whether the literature that Carney et al. (2015) cited actually suggests that power posing is effective. In this paper we rely on p-curve analysis to answer the following question: Does the literature reviewed by Carney et al. (2015) suggest the existence of an effect once we account for selective reporting? We conclude not. The distribution of p-values from those 33 studies is indistinguishable from what is expected if (1) the average effect size were zero, and (2) selective reporting (of studies and/or analyses) were solely responsible for the significant effects that are published. Although more highly powered future research may find replicable evidence for the purported benefits of power posing (or unexpected detriments), the existing evidence is too weak to justify a search for moderators or to advocate for people to engage in power posing to better their lives.
Abstract: This invited paper describes how we came to write an article called "False-Positive Psychology."
Abstract: We define transactions as weird when they include unexplained features, that is, features not implicitly, explicitly, or self-evidently justified, and propose that people are averse to weird transactions. In six experiments, we show that risky options used in previous research paradigms often attained uncertainty via adding an unexplained transaction feature (e.g., purchasing a coin flip or lottery), and behavior that appears to reflect risk aversion could instead reflect an aversion to weird transactions. Specifically, willingness to pay drops just as much when adding risk to a transaction as when adding unexplained features. Holding transaction features constant, adding additional risk does not further reduce willingness to pay. We interpret our work as generalizing ambiguity aversion to riskless choice.
Abstract: Why have companies faced a backlash for running experiments? Academics and pundits have argued that it is because the public finds corporate experimentation objectionable. In this paper we investigate “experiment aversion,” finding evidence that, if anything, experiments are rated more highly than the least acceptable policies that they contain. In six studies participants evaluated the acceptability of either corporate policy changes or of experiments testing those policy changes. When all policy changes were deemed acceptable, so was the experiment, even when it involved deception, unequal outcomes, and lack of consent. When a policy change was unacceptable, the experiment that included it was deemed less unacceptable. Experiments are not unpopular, unpopular policies are unpopular.
Uri Simonsohn, Joseph Simmons, Leif D. Nelson (Working), Specification Curve: Descriptive and Inferential Statistics on All Reasonable Specifications.
Abstract: Empirical results often hinge on data analytic decisions that are simultaneously defensible, arbitrary, and motivated. To mitigate this problem we introduce Specification-Curve Analysis. This approach consists of three steps: (i) estimating the full set of theoretically justified, statistically valid, and non-redundant analytic specifications, (ii) displaying the results graphically in a manner that allows identifying which analytic decisions produce different results, and (iii) conducting statistical tests to determine whether the full set of results is inconsistent with the null hypothesis of no effect. We illustrate its use by applying it to three published findings. One proves robust, one weak, one not robust at all. Although it is impossible to eliminate subjectivity in data analysis, Specification-Curve Analysis minimizes the impact of subjectivity on the reporting of results, resulting in a more systematic, thorough, and objective presentation of the data.
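The three steps in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the synthetic data, the two hypothetical analytic choices (outlier rule and outcome transformation), the use of a standardized mean difference as the estimate, and a simple label-shuffling test on the median estimate standing in for step (iii).

```python
import itertools
import math
import random
from statistics import mean, median, stdev

# Hypothetical analytic choices; a real analysis would list its own defensible options.
OUTLIER_RULES = {
    "keep_all": lambda y, m, s: True,
    "within_3sd": lambda y, m, s: abs(y - m) <= 3 * s,
    "within_2sd": lambda y, m, s: abs(y - m) <= 2 * s,
}
TRANSFORMS = {"raw": lambda y: y, "log": math.log, "sqrt": math.sqrt}


def make_data(n=200, effect=1.0, seed=1):
    """Synthetic two-condition experiment as (treated, outcome) pairs.

    Outcomes are centered near 10 with unit SD, so log and sqrt are
    defined for all practical purposes (an assumption of this toy data).
    """
    rng = random.Random(seed)
    return [(i % 2, rng.gauss(10 + effect * (i % 2), 1.0)) for i in range(n)]


def estimate(rows, rule, transform):
    """Standardized mean difference under one specification."""
    ys = [transform(y) for _, y in rows]
    m, s = mean(ys), stdev(ys)
    treated = [y for (t, _), y in zip(rows, ys) if t == 1 and rule(y, m, s)]
    control = [y for (t, _), y in zip(rows, ys) if t == 0 and rule(y, m, s)]
    pooled_sd = math.sqrt((stdev(treated) ** 2 + stdev(control) ** 2) / 2)
    return (mean(treated) - mean(control)) / pooled_sd


def specification_curve(rows):
    """Steps (i) and (ii): estimate every specification, then sort for display."""
    results = []
    for (r_name, rule), (t_name, transform) in itertools.product(
        OUTLIER_RULES.items(), TRANSFORMS.items()
    ):
        results.append((estimate(rows, rule, transform), r_name, t_name))
    return sorted(results)


def permutation_p_value(rows, n_perm=200, seed=2):
    """Step (iii), simplified: compare the observed median estimate with
    medians obtained after shuffling the condition labels."""
    rng = random.Random(seed)
    observed = median(e for e, _, _ in specification_curve(rows))
    labels = [t for t, _ in rows]
    outcomes = [y for _, y in rows]
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        shuffled = list(zip(labels, outcomes))
        perm_median = median(e for e, _, _ in specification_curve(shuffled))
        if abs(perm_median) >= abs(observed):
            extreme += 1
    return (extreme + 1) / (n_perm + 1)
```

Calling `specification_curve(make_data())` returns one sorted row per specification, which is the raw material for the graphical display in step (ii). Standardizing each estimate keeps specifications on different outcome scales roughly comparable.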
Uri Simonsohn, Joseph Simmons, Leif D. Nelson (2015), Better P-curves: Making P-curve Analysis More Robust To Errors, Fraud, and Ambitious P-hacking, A Reply to Ulrich and Miller (2015), Journal of Experimental Psychology: General, 144 (December), pp. 1146-1152.
Abstract: When studies examine true effects, they generate right-skewed p-curves, distributions of statistically significant results with more low (.01s) than high (.04s) p-values. What else can cause a right-skewed p-curve? First, we consider the possibility that researchers report only the smallest significant p-value (as conjectured by Ulrich & Miller, 2015), concluding that it is a very uncommon problem. We then consider more common problems, including (1) p-curvers selecting the wrong p-values, (2) fake data, (3) honest errors, and (4) ambitiously p-hacked (beyond p<.05) results. We evaluate the impact of these common problems on the validity of p-curve analysis, and provide practical solutions that substantially increase its robustness.
Uri Simonsohn, Leif D. Nelson, Joseph Simmons (2014), P-Curve and Effect Size: Correcting for Publication Bias Using Only Significant Results, Perspectives on Psychological Science, 9 (December), pp. 666-681.
Abstract: Journals tend to publish only statistically significant evidence, creating a scientific record that markedly overstates the size of effects. We provide a new tool that corrects for this bias without requiring access to nonsignificant results. It capitalizes on the fact that the distribution of significant p-values, p-curve, is a function of the true underlying effect. Researchers armed only with sample sizes and test results of the published findings can correct for publication bias. We validate the technique with simulations and by re-analyzing data from the Many-Labs Replication project. We demonstrate that p-curve can arrive at inferences opposite that of existing tools by re-analyzing the meta-analysis of the “choice overload” literature.
Abstract: Because scientists tend to report only studies (publication bias) or analyses (p-hacking) that “work”, readers must ask, “Are these effects true, or do they merely reflect selective reporting?” We introduce p-curve as a way to answer this question. P-curve is the distribution of statistically significant p-values for a set of studies (ps < .05). Because only true effects are expected to generate right-skewed p-curves – containing more low (.01s) than high (.04s) significant p-values – only right-skewed p-curves are diagnostic of evidential value. By telling us whether we can rule out selective reporting as the sole explanation for a set of findings, p-curve offers a solution to the age-old inferential problems caused by file-drawers of failed studies and analyses.
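A small simulation can illustrate why only true effects are expected to produce right-skewed p-curves. The sketch below is not the authors' p-curve software. It assumes two-group studies analyzed with a two-sided z-test under known unit variance, and the function name and parameters are hypothetical.

```python
import random
from statistics import NormalDist


def simulate_p_curve(effect_d, n_per_group, n_studies, seed=0):
    """Share of significant p-values (p < .05) in each .01-wide bin.

    Index 0 covers p < .01 and index 4 covers .04 <= p < .05.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Expected z-statistic for a two-group comparison with known unit variance.
    noncentrality = effect_d * (n_per_group / 2) ** 0.5
    counts = [0] * 5
    for _ in range(n_studies):
        z = rng.gauss(noncentrality, 1.0)
        p = 2 * (1 - std_normal.cdf(abs(z)))
        if p < 0.05:  # only significant results enter the p-curve
            counts[min(int(p / 0.01), 4)] += 1
    total = sum(counts)
    return [c / total for c in counts]


null_curve = simulate_p_curve(effect_d=0.0, n_per_group=50, n_studies=200_000, seed=1)
true_curve = simulate_p_curve(effect_d=0.5, n_per_group=50, n_studies=20_000, seed=1)
```

With no true effect, p-values are uniformly distributed, so each bin of `null_curve` holds roughly a fifth of the significant results. With a true effect, most of the mass in `true_curve` moves to the lowest bin, which is the right skew the abstract describes.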
There has been increasing interest in recent years as to how managers make decisions when there is uncertainty regarding the value or likelihood of final outcomes. What type of information do they collect? How do they process the data? What factors influence the decisions? This course will address these issues. By understanding managerial decision processes we may be better able to prescribe ways of improving managerial behavior. Building on recent work in cognitive psychology, students will gain an understanding of the simplified rules of thumb and apparent systematic biases that individuals utilize in making judgments and choices under uncertainty. At the end of the course, students should understand the decision-making process more thoroughly and be in a position to become better managers.
This course is an intensive introduction to various scientific perspectives on the processes through which people make decisions. Perspectives covered include cognitive psychology of human problem-solving, judgment and choice, theories of rational judgment and decision, and the mathematical theory of games. Much of the material is technically rigorous. Prior or current enrollment in STAT 101 or the equivalent, although not required, is strongly recommended.
The course is an introduction to research on normative, descriptive, and prescriptive models of judgment and choice under uncertainty. We will be studying the underlying theory of decision processes as well as applications in individual, group, and organizational choice. Guest speakers will relate the concepts of decision processes and behavioral economics to applied problems in their area of expertise. As part of the course there will be a theoretical or empirical term paper on the application of decision processes to each student's particular area of interest.
This PhD-level course is for students who have already completed at least a year of basic stats/methods training. It assumes students have already received a solid theoretical foundation and seeks to pragmatically bridge the gap between standard textbook coverage of methodological and statistical issues and the complexities of everyday behavioral science research. This course focuses on issues that (i) behavioral researchers are likely to encounter as they conduct research, but (ii) may struggle to figure out independently by consulting a textbook or published article.
When trying to get into graduate school or land a new job, applicants expect to be evaluated against the relative strength or weakness of the entire pool of candidates. But a recent paper co-authored by Wharton professor Uri Simonsohn suggests that perhaps they should also be worried about the timing of their interviews.
Knowledge @ Wharton - 2013/02/13