Hamsa Bastani

  • Assistant Professor of Operations, Information and Decisions

Contact Information

  • Office Address:

    3730 Walnut Street
    557 Jon M. Huntsman Hall
    Philadelphia, PA 19104

Research Interests: machine learning algorithms & applications to healthcare, revenue management, social good

Links: Personal Website


Hamsa Bastani is an assistant professor in Operations, Information and Decisions at the Wharton School, University of Pennsylvania. Her research focuses on developing novel machine learning algorithms for data-driven decision-making, with applications to healthcare operations, pricing, recommendation systems, and social good. Her work has been recognized by the George Nicholson, MSOM, Service Science, and Health Applications Society best student paper awards, the Pierskalla best paper award in healthcare operations, and the early-career People’s Choice award in sustainable operations. She previously completed her PhD at Stanford University and was a Herman Goldstine postdoctoral fellow at IBM Research.



  • Hamsa Bastani and Mohsen Bayati (2019), Online Decision-Making with High-Dimensional Covariates, Operations Research.

    Abstract: Big data have enabled decision makers to tailor decisions at the individual level in a variety of domains, such as personalized medicine and online advertising. Doing so involves learning a model of decision rewards conditional on individual-specific covariates. In many practical settings, these covariates are high dimensional; however, typically only a small subset of the observed features are predictive of a decision’s success. We formulate this problem as a K-armed contextual bandit with high-dimensional covariates and present a new efficient bandit algorithm based on the LASSO estimator. We prove that our algorithm’s cumulative expected regret scales at most polylogarithmically in the covariate dimension d; to the best of our knowledge, this is the first such bound for a contextual bandit. The key step in our analysis is proving a new tail inequality that guarantees the convergence of the LASSO estimator despite the non-i.i.d. data induced by the bandit policy. Furthermore, we illustrate the practical relevance of our algorithm by evaluating it on a simplified version of a medication dosing problem. A patient’s optimal medication dosage depends on the patient’s genetic profile and medical records; incorrect initial dosage may result in adverse consequences, such as stroke or bleeding. We show that our algorithm outperforms existing bandit methods and physicians in correctly dosing a majority of patients.

  • Arielle Anderer, Hamsa Bastani, John Silberholz (Under Revision), Adaptive Clinical Trial Designs with Surrogates: When Should We Bother?

    Abstract: The success of a new drug is assessed within a clinical trial using a primary endpoint, which is typically the true outcome of interest, e.g., overall survival. However, regulators sometimes allow drugs to be approved using a surrogate outcome — an intermediate indicator that is faster or easier to measure than the true outcome of interest, e.g., progression-free survival — as the primary endpoint when there is demonstrable medical need. While using a surrogate outcome (instead of the true outcome) as the primary endpoint can substantially speed up clinical trials and lower costs, it can also result in poor drug approval decisions since the surrogate is not a perfect predictor of the true outcome. In this paper, we propose combining data from both surrogate and true outcomes to improve decision-making within a clinical trial. In contrast to broadly used clinical trial designs that rely on a single primary endpoint, we propose a Bayesian adaptive clinical trial design that simultaneously leverages both observed outcomes to inform trial decisions. We perform comparative statics on the relative benefit of our approach, illustrating the types of diseases and surrogates for which our proposed design is particularly advantageous. Finally, we illustrate our proposed design on metastatic breast cancer. We use a large-scale clinical trial database to construct a Bayesian prior, and simulate our design on a subset of clinical trials. We estimate that our proposed design would yield a 5% increase in trial benefits relative to existing clinical trial designs.

  • Hamsa Bastani, David Simchi-Levi, Ruihao Zhu (Under Revision), Meta Dynamic Pricing: Learning Across Experiments.

    Abstract: We study the problem of learning shared structure across a sequence of dynamic pricing experiments for related products. We consider a practical formulation where the unknown demand parameters for each product come from an unknown distribution (prior) that is shared across products. We then propose a meta dynamic pricing algorithm that learns this prior online while solving a sequence of Thompson sampling pricing experiments (each with horizon T) for N different products. Our algorithm addresses two challenges: (i) balancing the need to learn the prior (meta-exploration) with the need to leverage the estimated prior to achieve good performance (meta-exploitation), and (ii) accounting for uncertainty in the estimated prior by appropriately “widening” the prior as a function of its estimation error, thereby ensuring convergence of each price experiment. Unlike prior-independent approaches, our algorithm’s meta regret grows sublinearly in N; an immediate consequence of our analysis is that the price of an unknown prior in Thompson sampling is negligible in experiment-rich environments with shared structure (large N). Numerical experiments on synthetic and real auto loan data demonstrate that our algorithm significantly speeds up learning compared to prior-independent algorithms or a naive approach of greedily using the updated prior across products.

  • Hamsa Bastani and Joann F. de Zegher (Under Revision), Do Policies with Limited Enforcement Reduce Harm? Evidence from Transshipment Bans.

    Abstract: To mitigate environmental and social harm in supply chains, buyers often provide incentives or impose sanctions to discourage harmful behavior by suppliers. However, such policies are often implemented with limited monitoring and enforcement; theory suggests that such conditions may cause strategic behavior by suppliers, leading to unintended consequences. We empirically study whether a policy with limited enforcement (1) can reduce harm, (2) leads to evasion and strategic behavior, and (3) increases raw material costs. We study these questions in the context of a ban on seafood transshipments. Seafood transshipments have been associated with illegal fishing and widespread forced labor in seafood supply chains, leading to pressure on seafood buyers to ban transshipments in their supply chain. Buyers have argued against such a ban, indicating that it would simply lead to evasion (because transshipments are difficult to monitor), while increasing costs (because transshipments allow for more efficient logistics). Directly studying the effect of a supply chain ban by buyers is difficult; instead, we study the effect of geographic bans implemented by international management organizations, and show that the resulting findings provide a conservative estimate of the effect of a supply chain ban. Using remote sensing data and exploiting variation over time and across regions, we find that a geographic ban reduces transshipments by 57% despite significant enforcement challenges. A difference-in-differences analysis of landing prices suggests that this reduction comes at a cost of 3.2% higher prices. In contrast to theoretical predictions, the ban does not appear to cause significant strategic evasion.

  • Hamsa Bastani (Under Revision), Predicting with Proxies.

    Abstract: Predictive analytics is increasingly used to guide decision-making in many applications. However, in practice, we often have limited data on the true predictive task of interest, and must instead rely on more abundant data on a closely-related proxy predictive task. For example, e-commerce platforms use abundant customer click data (proxy) to make product recommendations rather than the relatively sparse customer purchase data (true outcome of interest); alternatively, hospitals often rely on medical risk scores trained on a different patient population (proxy) rather than their own patient population (true cohort of interest) to assign interventions. However, not accounting for the bias in the proxy can lead to sub-optimal decisions. Using real datasets, we find that this bias can often be captured by a sparse function of the features. Thus, we propose a novel two-step estimator that uses techniques from high-dimensional statistics to efficiently combine a large amount of proxy data and a small amount of true data. We prove upper bounds on the error of our proposed estimator and lower bounds on several heuristics commonly used by data scientists; in particular, our proposed estimator can achieve the same accuracy with exponentially less true data (in the number of features). Our proof relies on a new tail inequality on the convergence of LASSO for approximately sparse vectors. Finally, we demonstrate the effectiveness of our approach on e-commerce and healthcare datasets; in both cases, we achieve significantly better predictive accuracy as well as managerial insights into the nature of the bias in the proxy data.

  • Hamsa Bastani, Osbert Bastani, Carolyn Kim (Draft), Interpreting Predictive Models for Human-in-the-Loop Analytics.

    Abstract: Machine learning is increasingly used to inform consequential decisions. Yet, these predictive models have been found to exhibit unexpected defects when trained on real-world observational data, which are plagued with confounders and biases. Thus, it is critical to involve domain experts in an interactive process of developing predictive models; interpretability offers a promising way to facilitate this interaction. We propose a novel approach to interpreting complex, blackbox machine learning models by constructing simple decision trees that summarize their reasoning process. Our algorithm leverages active learning to extract richer and more accurate interpretations than several baselines. Furthermore, we prove that by generating a sufficient amount of data through our active learning strategy, the extracted decision tree converges to the exact decision tree, implying that we provably avoid overfitting. We evaluate our algorithm on a random forest to predict diabetes risk on a real electronic medical record dataset, and show that it produces significantly more accurate interpretations than several baselines. We also conduct a user study demonstrating that humans are able to better reason about our interpretations than state-of-the-art rule lists. We then perform a case study with domain experts (physicians) regarding our diabetes risk prediction model, and describe several insights they derived using our interpretation. Of particular note, the physicians discovered an unexpected causal issue by investigating a subtree in our interpretation; we were able to then verify that this endogeneity indeed existed in our data, underscoring the value of interpretability.

  • Hamsa Bastani, Pavithra Harsha, Georgia Perakis, Divya Singhvi (Under Revision), Learning Personalized Product Recommendations with Customer Disengagement.

    Abstract: We consider the problem of sequential product recommendation when customer preferences are unknown. First, we present empirical evidence of customer disengagement using a sequence of ad campaigns from a major airline carrier. In particular, customers decide to stay on the platform based on the relevance of recommendations. We then formulate this problem as a linear bandit, with the notable difference that the customer's horizon length is a function of past recommendations. We prove that any algorithm in this setting achieves linear regret. Thus, no algorithm can keep all customers engaged; however, we can hope to keep a subset of customers engaged. Unfortunately, we find that classical bandit learning as well as greedy algorithms provably over-explore, thereby incurring linear regret for every customer. We propose modifying bandit learning strategies by constraining the action space upfront using an integer program. We prove that this simple modification allows our algorithm to achieve sublinear regret for a significant fraction of customers. Furthermore, numerical experiments on real movie recommendations data demonstrate that our algorithm can improve customer engagement with the platform by up to 80%.

  • Hamsa Bastani, Joel Goh, Mohsen Bayati (2018), Evidence of Upcoding in Pay-for-Performance Programs, Management Science.

    Abstract: Recent Medicare legislation seeks to improve patient care quality by financially penalizing providers for hospital-acquired infections (HAIs). However, Medicare cannot directly monitor HAI rates and instead relies on providers accurately self-reporting HAIs in claims to correctly assess penalties. Consequently, the incentives for providers to improve service quality may disappear if providers upcode, i.e., misreport HAIs (possibly unintentionally) in a manner that increases reimbursement or avoids financial penalties. Identifying upcoding in claims data is challenging because of unobservable confounders (e.g., patient risk). We leverage state-level variations in adverse event reporting regulations and instrumental variables to discover contradictions in HAI and present-on-admission (POA) infection reporting rates that are strongly suggestive of upcoding. We conservatively estimate that 10,000 out of 60,000 annual reimbursed claims for POA infections (18.5%) were upcoded HAIs, costing Medicare $200 million. Our findings suggest that self-reported quality metrics are unreliable and, thus, that recent legislation may result in unintended consequences.

  • Hamsa Bastani, Mohsen Bayati, Khashayar Khosravi (Under Revision), Mostly Exploration-Free Algorithms for Contextual Bandits.

    Abstract: The contextual bandit literature has traditionally focused on algorithms that address the exploration-exploitation tradeoff. In particular, greedy algorithms that exploit current estimates without any exploration may be sub-optimal in general. However, exploration-free greedy algorithms are desirable in practical settings where exploration may be costly or unethical (e.g., clinical trials). Surprisingly, we find that a simple greedy algorithm can be rate-optimal (achieves asymptotically optimal regret) if there is sufficient randomness in the observed contexts (covariates). We prove that this is always the case for a two-armed bandit under a general class of context distributions that satisfy a condition we term covariate diversity. Furthermore, even absent this condition, we show that a greedy algorithm can be rate optimal with positive probability. Thus, standard bandit algorithms may unnecessarily explore. Motivated by these results, we introduce Greedy-First, a new algorithm that uses only observed contexts and rewards to determine whether to follow a greedy algorithm or to explore. We prove that this algorithm is rate-optimal without any additional assumptions on the context distribution or the number of arms. Extensive simulations demonstrate that Greedy-First successfully reduces exploration and outperforms existing (exploration-based) contextual bandit algorithms such as Thompson sampling or upper confidence bound (UCB).

  • Hamsa Bastani, Mohsen Bayati, Mark Braverman, Ramki Gummadi, Ramesh Johari (Draft), Analysis of Medicare Pay-for-Performance Contracts.

    Abstract: Medicare has sought to improve patient care through pay-for-performance (P4P) programs that better align hospitals' financial incentives with quality of service. However, the design of these policies is subject to a variety of practical and institutional constraints, such as the use of "small" performance-based incentives. We develop a framework based on a stylized principal-agent model to characterize the optimal P4P mechanism within any set of feasible mechanisms in the regime of small incentives. Importantly, our feasible set can be flexibly modified to include institutional constraints. We apply our results to examine debated design choices in existing Medicare P4P programs, and offer several insights and policy recommendations. In particular, we find that these mechanisms may benefit by incorporating bonuses for top-performers, and using a single performance cutoff to uniformly assess performance-based payments. We also examine a number of comparative statics that shed light on when P4P mechanisms are effective.
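
The "Mostly Exploration-Free Algorithms for Contextual Bandits" abstract above describes a greedy policy that always plays the arm with the highest estimated reward and relies on randomness in the contexts instead of explicit exploration. The following is a minimal illustrative sketch of that plain greedy idea only. It is not the paper's Greedy-First algorithm or its analysis. The scalar context, the linear reward model, the small ridge term, and all names and parameters (run_greedy, true_betas, rounds, noise_sd) are assumptions made for this example.

```python
import random


def run_greedy(true_betas=(1.0, -1.0), rounds=2000, noise_sd=0.1, seed=0):
    """Simulate a purely greedy linear contextual bandit (hypothetical setup).

    The reward of arm a for scalar context x is true_betas[a] * x + noise.
    Returns (per-arm slope estimates, average per-round regret).
    """
    rng = random.Random(seed)
    k = len(true_betas)
    sxx = [1.0] * k  # assumed ridge term; keeps early estimates well defined
    sxy = [0.0] * k
    regret = 0.0
    for _ in range(rounds):
        # Contexts take both signs, so each arm is the best one for some contexts.
        x = rng.uniform(-1.0, 1.0)
        est = [sxy[a] / sxx[a] for a in range(k)]
        # Greedy step: pick the highest predicted reward, with no exploration bonus.
        arm = max(range(k), key=lambda a: est[a] * x)
        reward = true_betas[arm] * x + rng.gauss(0.0, noise_sd)
        # Update the ridge-regularized least-squares statistics of the played arm.
        sxx[arm] += x * x
        sxy[arm] += x * reward
        best = max(b * x for b in true_betas)
        regret += best - true_betas[arm] * x
    estimates = [sxy[a] / sxx[a] for a in range(k)]
    return estimates, regret / rounds


if __name__ == "__main__":
    estimates, avg_regret = run_greedy()
    print("slope estimates:", estimates)
    print("average regret per round:", avg_regret)
```

In this toy setting the varied contexts cause both arms to be played without any forced exploration, which is the intuition the abstract attributes to covariate diversity. The sketch makes no claim about settings where contexts are less varied.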


Past Courses


    This class provides a high-level introduction to the field of judgment and decision making (JDM) and in-depth exposure to the process of doing research in this area. Throughout the semester you will gain hands-on experience with several different JDM research projects. You will be paired with a PhD student or faculty mentor who is working on a variety of different research studies. Each week you will be given assignments that are central to one or more of these studies, and you will be given detailed descriptions of the research projects you are contributing to and how your assignments relate to the successful completion of these projects. To complement your hands-on research experience, throughout the semester you will be assigned readings from the book Nudge by Thaler and Sunstein, which summarizes key recent ideas in the JDM literature. You will also meet as a group for an hour once every three weeks with the class's faculty supervisor and all of his or her PhD students to discuss the projects you are working on, to discuss the class readings, and to discuss your own research ideas stimulated by getting involved in various projects. Date and time to be mutually agreed upon by supervising faculty and students. The 1 CU version of this course will involve approx. 10 hours of research immersion per week and a 10-page paper. The 0.5 CU version of this course will involve approx. 5 hours of research immersion per week and a 5-page final paper. Please contact Maurice Schweitzer if you are interested in enrolling in the course: schweitzer@wharton.upenn.edu


    Understanding how to use data and business analytics can be the key differential for a company's success or failure. This course is designed to introduce fundamental quantitative decision-making tools for a broad range of managerial decision problems. Topics covered include linear, nonlinear, and discrete optimization, dynamic programming, and simulation. Students will apply these quantitative models in applications of portfolio management, electricity auctions, revenue management for airlines, manufacturing, advertising budget allocation, and healthcare scheduling operations. Emphasis in this course is placed on mathematical modeling of real world problems and implementation of decision making tools.


    This course number is currently used for several course types including independent studies, experimental courses and Management & Technology Freshman Seminar. Instructor permission required to enroll in any independent study. Wharton Undergraduate students must also receive approval from the Undergraduate Division to register for independent studies. Section 002 is the Management and Technology Freshman Seminar; instructor permission is not required for this section, which is open only to M&T students. For Fall 2020, Section 004 is a new course titled AI, Business, and Society. The course provides an overview of AI and its role in business transformation. The purpose of this course is to improve understanding of AI, discuss the many ways in which AI is being used in the industry, and provide a strategic framework for how to bring AI to the center of digital transformation efforts. In terms of AI overview, we will go over a brief technical overview for students who are not actively immersed in AI (topics covered include Big Data, data warehousing, data-mining, different forms of machine learning, etc). In terms of business applications, we will consider applications of AI in media, finance, retail, and other industries. Finally, we will consider how AI can be used as a source of competitive advantage. We will conclude with a discussion of ethical challenges and a governance framework for AI. No prior technical background is assumed but some interest in (and exposure to) technology is helpful. Every effort is made to build most of the lectures from the basics.

Awards and Honors

  • 1st Place, Pierskalla Award for Best Paper in Healthcare, 2019
  • 2nd Place, Service Science Best Paper Award, 2019
  • People’s Choice Award, Early-Career Sustainable OM Workshop, 2019
  • Finalist, Pierskalla Award for Best Paper in Healthcare, 2018
  • 1st Place, Pierskalla Award for Best Paper in Healthcare, 2016
  • 1st Place, George Nicholson Student Paper Competition, 2016
  • 1st Place, MSOM Student Paper Competition, 2016
  • 1st Place, IBM Service Science Best Student Paper Award, 2016
  • 1st Place, Health Applications Society Best Student Paper Award, 2015

In the News

Knowledge @ Wharton


Beyond Clicks: Getting the Most out of Big Data

Recent Wharton research aims to help companies navigate the complicated waters of Big Data by offering a better way to use predictive analytics.

Knowledge @ Wharton - 2019/04/19