Hamsa Bastani

  • Associate Professor of Operations, Information and Decisions
  • Associate Professor of Statistics and Data Science (secondary)

Contact Information

  • Office Address:

    3730 Walnut Street
    557 Jon M. Huntsman Hall
    Philadelphia, PA 19104

Research Interests: machine learning algorithms & applications to healthcare, revenue management, social good

Links: Personal Website, CV, LinkedIn


Hamsa Bastani is an Associate Professor of Operations, Information, and Decisions at the Wharton School, University of Pennsylvania. Her research focuses on developing novel machine learning algorithms for data-driven decision-making, with applications to healthcare operations, social good, and revenue management. Her work has received several recognitions, including the Wagner Prize for Excellence in Practice (2021), the Pierskalla Award for the best paper in healthcare (2016, 2019, 2021), the Behavioral OM Best Paper Award (2021), as well as first place in the George Nicholson and MSOM student paper competitions (2016). She previously completed her PhD at Stanford University, and spent a year as a Herman Goldstine postdoctoral fellow at IBM Research.

  • Xinmeng Huang, Kan Xu, Donghwan Lee, Seyed Hamed Hassani, Hamsa Bastani, Edgar Dobriban, Optimal Heterogeneous Collaborative Linear Regression and Contextual Bandits.

  • Arielle Anderer, Hamsa Bastani, John Silberholz (2022), Adaptive Clinical Trial Designs with Surrogates: When Should We Bother?, Management Science.

    Abstract: The success of a new drug is assessed within a clinical trial using a primary endpoint, which is typically the true outcome of interest, e.g., overall survival. However, regulators sometimes approve drugs using a surrogate outcome (an intermediate indicator that is faster or easier to measure than the true outcome of interest, e.g., progression-free survival) as the primary endpoint when there is demonstrable medical need. While using a surrogate outcome (instead of the true outcome) as the primary endpoint can substantially speed up clinical trials and lower costs, it can also result in poor drug approval decisions since the surrogate is not a perfect predictor of the true outcome. In this paper, we propose combining data from both surrogate and true outcomes to improve decision-making within a late-phase clinical trial. In contrast to broadly used clinical trial designs that rely on a single primary endpoint, we propose a Bayesian adaptive clinical trial design that simultaneously leverages both observed outcomes to inform trial decisions. We perform comparative statics on the relative benefit of our approach, illustrating the types of diseases and surrogates for which our proposed design is particularly advantageous. Finally, we illustrate our proposed design on metastatic breast cancer. We use a large-scale clinical trial database to construct a Bayesian prior, and simulate our design on a subset of clinical trials. We estimate that our design would yield a 16% decrease in trial costs relative to existing clinical trial designs, while maintaining the same Type I/II error rates.

  • Kan Xu and Hamsa Bastani (Under Review), Learning Across Bandits in High Dimension via Robust Statistics.

    Abstract: Decision-makers often face the "many bandits" problem, where one must simultaneously learn across related but heterogeneous contextual bandit instances. For instance, a large retailer may wish to dynamically learn product demand across many stores to solve pricing or inventory problems, making it desirable to learn jointly for stores serving similar customers; alternatively, a hospital network may wish to dynamically learn patient risk across many providers to allocate personalized interventions, making it desirable to learn jointly for hospitals serving similar patient populations. We study the setting where the unknown parameter in each bandit instance can be decomposed into a global parameter plus a sparse instance-specific term. Then, we propose a novel two-stage estimator that exploits this structure in a sample-efficient way by using a combination of robust statistics (to learn across similar instances) and LASSO regression (to debias the results). We embed this estimator within a bandit algorithm, and prove that it improves asymptotic regret bounds in the context dimension d; this improvement is exponential for data-poor instances. We further demonstrate how our results depend on the underlying network structure of bandit instances.

  • Hamsa Bastani, Pavithra Harsha, Georgia Perakis, Divya Singhvi (2021), Learning Personalized Product Recommendations with Customer Disengagement, MSOM.

    Abstract: Problem definition: We study personalized product recommendations on platforms when customers have unknown preferences. Importantly, customers may disengage when offered poor recommendations. Academic/practical relevance: Online platforms often personalize product recommendations using bandit algorithms, which balance an exploration-exploitation trade-off. However, customer disengagement—a salient feature of platforms in practice—introduces a novel challenge because exploration may cause customers to abandon the platform. We propose a novel algorithm that constrains exploration to improve performance. Methodology: We present evidence of customer disengagement using data from a major airline’s ad campaign; this motivates our model of disengagement, where a customer may abandon the platform when offered irrelevant recommendations. We formulate the customer preference learning problem as a generalized linear bandit, with the notable difference that the customer’s horizon length is a function of past recommendations. Results: We prove that no algorithm can keep all customers engaged. Unfortunately, classical bandit algorithms provably overexplore, causing every customer to eventually disengage. Motivated by the structural properties of the optimal policy in a scalar instance of our problem, we propose modifying bandit learning strategies by constraining the action space up front using an integer program. We prove that this simple modification allows our algorithm to perform well by keeping a significant fraction of customers engaged. Managerial implications: Platforms should be careful to avoid overexploration when learning customer preferences if customers have a high propensity for disengagement. Numerical experiments on movie recommendations data demonstrate that our algorithm can significantly improve customer engagement.

  • Hamsa Bastani, Kimon Drakopoulos, Vishal Gupta (Forthcoming), Interpretable OR for High-Stakes Decisions: Designing the Greek COVID-19 Testing System.

    Abstract: In the summer of 2020, in collaboration with the Greek government, we designed and deployed Eva – the first national scale, reinforcement learning system for targeted COVID-19 testing. In this paper, we detail the rationale for three major design/algorithmic elements: Eva’s testing supply chain, estimating COVID-19 prevalence, and test allocation. Specifically, we describe the design of Eva’s supply chain to collect and process thousands of biological samples per day with special emphasis on capacity procurement. Then, we propose a novel, empirical Bayes estimation strategy to estimate COVID-19 prevalence among different passenger types with limited data and showcase how these estimates were instrumental for a variety of downstream decision-making. Finally, we propose a novel, multi-armed bandit algorithm that dynamically allocates tests to arriving passengers in a non-stationary environment with delayed feedback and batched decisions. All of our design and algorithmic choices emphasize the need for transparent reasoning to enable human-in-the-loop analytics. Such transparency was crucial to building trust and buy-in among policymakers and public health experts in a period of global crisis.

  • Kan Xu, Xuanyi Zhao, Hamsa Bastani, Osbert Bastani (2021), Group-Sparse Matrix Factorization for Transfer Learning of Word Embeddings, ICML.

    Abstract: Sparse regression has recently been applied to enable transfer learning from very limited data. We study an extension of this approach to unsupervised learning—in particular, learning word embeddings from unstructured text corpora using low-rank matrix factorization. Intuitively, when transferring word embeddings to a new domain, we expect that the embeddings change for only a small number of words—e.g., the ones with novel meanings in that domain. We propose a novel group-sparse penalty that exploits this sparsity to perform transfer learning when there is very little text data available in the target domain—e.g., a single article of text. We prove generalization bounds for our algorithm. Furthermore, we empirically evaluate its effectiveness, both in terms of prediction accuracy in downstream tasks as well as the interpretability of the results.

  • Pia Ramchandani, Hamsa Bastani, Emily Wyatt (Under Review), Unmasking Human Trafficking Risk in Commercial Sex Supply Chains with Machine Learning.

    Abstract: The covert nature of sex trafficking provides a significant barrier to generating large-scale, data-driven insights to inform law enforcement, policy and social work. We leverage massive deep web data (collected globally from leading commercial sex websites) in tandem with a novel machine learning framework to unmask suspicious recruitment-to-sales pathways, thereby providing the first global network view of trafficking risk in commercial sex supply chains. This allows us to infer likely recruitment-to-sales trafficking routes of criminal entities, deceptive approaches used to recruit victims, and regional variations in recruitment vs. sales pressure. These insights can help law enforcement agencies along trafficking routes better coordinate efforts, as well as target local counter-trafficking policies and interventions towards exploitative behavior frequently exhibited in that region.

  • Wanqiao Xu, Kan Xu, Hamsa Bastani, Osbert Bastani (Under Review), Mind the Gap: Safely Bridging Offline and Online Reinforcement Learning.

    Abstract: A key challenge to deploying reinforcement learning in practice is exploring safely. We propose a natural safety property—uniformly outperforming a conservative policy (adaptively estimated from all data observed thus far), up to a per-episode exploration budget. This property formalizes the idea that we should spread out exploration to avoid taking actions significantly worse than the ones that are currently known to be good. We then design an algorithm that uses a UCB reinforcement learning policy for exploration, but overrides it as needed to ensure safety with high probability. To ensure exploration across the entire state space, it adaptively determines when to explore (at different points of time across different episodes) in a way that allows “stitching” sub-episodes together to obtain a meta-episode that is equivalent to using UCB for the entire episode. Then, we establish reasonable assumptions about the underlying MDP under which our algorithm is guaranteed to achieve sublinear regret while ensuring safety; under these assumptions, the cost of imposing safety is only a constant factor.

  • Kan Xu, Hamsa Bastani, Osbert Bastani (Under Review), Robust Generalization of Quadratic Neural Networks via Function Identification.

    Abstract: A key challenge facing deep learning is that neural networks are often not robust to shifts in the underlying data distribution. We study this problem from the perspective of the statistical concept of parameter identification. Generalization bounds from learning theory often assume that the test distribution is close to the training distribution. In contrast, if we can identify the “true” parameters, then the model generalizes to arbitrary distribution shifts. However, neural networks are typically overparameterized, making parameter identification impossible. We show that for quadratic neural networks, we can identify the function represented by the model even though we cannot identify its parameters. Thus, we can obtain robust generalization bounds even in the overparameterized setting. We leverage this result to obtain new bounds for contextual bandits and transfer learning with quadratic neural networks. Overall, our results suggest that we can improve robustness of neural networks by designing models that can represent the true data generating process. In practice, the true data generating process is often very complex; thus, we study how our framework might connect to neural module networks, which are designed to break down complex tasks into compositions of simpler ones. We prove robust generalization bounds when individual neural modules are identifiable.

  • Hamsa Bastani, Kimon Drakopoulos, Vishal Gupta, Jon Vlachogiannis, Christos Hadjicristodoulou, Pagona Lagiou, Gkikas Magiorkinis, Dimitrios Paraskevis, Sotirios Tsiodras (2021), Efficient and Targeted COVID-19 Border Testing via Reinforcement Learning, Nature, 599 (7883), pp. 108-113.

    Abstract: Throughout the coronavirus disease 2019 (COVID-19) pandemic, countries have relied on a variety of ad hoc border control protocols to allow for non-essential travel while safeguarding public health, from quarantining all travellers to restricting entry from select nations on the basis of population-level epidemiological metrics such as cases, deaths or testing positivity rates. Here we report the design and performance of a reinforcement learning system, nicknamed Eva. In the summer of 2020, Eva was deployed across all Greek borders to limit the influx of asymptomatic travellers infected with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), and to inform border policies through real-time estimates of COVID-19 prevalence. In contrast to country-wide protocols, Eva allocated Greece’s limited testing resources on the basis of incoming travellers’ demographic information and testing results from previous travellers. By comparing Eva’s performance against modelled counterfactual scenarios, we show that Eva identified 1.85 times as many asymptomatic, infected travellers as random surveillance testing, with up to 2–4 times as many during peak travel, and 1.25–1.45 times as many asymptomatic, infected travellers as testing policies that utilize only epidemiological metrics. We demonstrate that this latter benefit arises, at least partially, because population-level epidemiological metrics had limited predictive value for the actual prevalence of SARS-CoV-2 among asymptomatic travellers and exhibited strong country-specific idiosyncrasies in the summer of 2020. Our results raise serious concerns on the effectiveness of country-agnostic internationally proposed border control policies that are based on population-level epidemiological metrics. Instead, our work represents a successful example of the potential of reinforcement learning and real-time data for safeguarding public health.


All Courses

  • OIDD2990 - Judg & Dec Making Res Im

    This class provides a high-level introduction to the field of judgment and decision making (JDM) and in-depth exposure to the process of doing research in this area. Throughout the semester you will gain hands-on experience with several different JDM research projects. You will be paired with a PhD student or faculty mentor who is working on a variety of different research studies. Each week you will be given assignments that are central to one or more of these studies, and you will be given detailed descriptions of the research projects you are contributing to and how your assignments relate to the successful completion of these projects. To complement your hands-on research experience, throughout the semester you will be assigned readings from the book Nudge by Thaler and Sunstein, which summarizes key recent ideas in the JDM literature. You will also meet as a group for an hour once every three weeks with the class's faculty supervisor and all of his or her PhD students to discuss the projects you are working on, to discuss the class readings, and to discuss your own research ideas stimulated by getting involved in various projects. Date and time to be mutually agreed upon by supervising faculty and students. The 1 CU version of this course will involve approx. 10 hours of research immersion per week and a 10-page paper. The 0.5 CU version of this course will involve approx. 5 hours of research immersion per week and a 5-page final paper. Please contact Professor Joseph Simmons if you are interested in enrolling in the course: jsimmo@wharton.upenn.edu

  • OIDD3210 - Intro To Mgmt Science

    Understanding how to use data and business analytics can be the key differential for a company's success or failure. This course is designed to introduce fundamental quantitative decision-making tools for a broad range of managerial decision problems. Topics covered include linear, nonlinear, and discrete optimization, dynamic programming, and simulation. Students will apply these quantitative models in applications of portfolio management, electricity auctions, revenue management for airlines, manufacturing, advertising budget allocation, and healthcare scheduling operations. Emphasis in this course is placed on mathematical modeling of real world problems and implementation of decision making tools.

  • OIDD3990 - Supervised Study

    This course number is currently used for several course types including independent studies, experimental courses and Management & Technology Freshman Seminar. Instructor permission required to enroll in any independent study. Wharton Undergraduate students must also receive approval from the Undergraduate Division to register for independent studies. Section 002 is the Management and Technology Freshman Seminar; instructor permission is not required for this section, which is open only to M&T students. For Fall 2020, Section 004 is a new course titled AI, Business, and Society. The course provides an overview of AI and its role in business transformation. The purpose of this course is to improve understanding of AI, discuss the many ways in which AI is being used in the industry, and provide a strategic framework for how to bring AI to the center of digital transformation efforts. In terms of AI overview, we will go over a brief technical overview for students who are not actively immersed in AI (topics covered include Big Data, data warehousing, data-mining, different forms of machine learning, etc). In terms of business applications, we will consider applications of AI in media, finance, retail, and other industries. Finally, we will consider how AI can be used as a source of competitive advantage. We will conclude with a discussion of ethical challenges and a governance framework for AI. No prior technical background is assumed but some interest in (and exposure to) technology is helpful. Every effort is made to build most of the lectures from the basics.

  • OIDD9410 - Dist System Sem

    Seminar on distribution systems models and theory. Reviews current research in the development and solution of models of distribution systems. Emphasizes multi-echelon inventory control, logistics management, network design, and competitive models.

Awards and Honors

  • 1st Place, Wagner Prize for Excellence in Operations Research Practice, 2021
  • 1st Place, Pierskalla Award for Best Paper in Healthcare, 2021
  • 2nd Place, Public Sector in Operations Best Paper Award, 2021
  • 1st Place, Behavioral OM Best Working Paper Award, 2021
  • 2nd Place, TIMES Working Paper Award, 2021
  • Finalist, Public Sector in Operations Best Paper Award, 2020
  • People’s Choice Award, Early-Career Sustainable OM Workshop, 2020
  • 2nd Place, Service Science Best Paper Award, 2019
  • 1st Place, Pierskalla Award for Best Paper in Healthcare, 2019
  • People’s Choice Award, Early-Career Sustainable OM Workshop, 2019
  • Finalist, Pierskalla Award for Best Paper in Healthcare, 2018
  • 1st Place, IBM Service Science Best Student Paper Award, 2016
  • 1st Place, MSOM Student Paper Competition, 2016
  • 1st Place, George Nicholson Student Paper Competition, 2016
  • 1st Place, Pierskalla Award for Best Paper in Healthcare, 2016
  • 1st Place, Health Applications Society Best Student Paper Award, 2015

In the News


How Can AI Improve Health Care?

Wharton professors explain how AI elevates health care practices, from prescription reminders to emergency triage.

Knowledge at Wharton - 11/9/2023

Wharton Magazine

Keys to Getting the Most From Machine Learning
Wharton Magazine - 10/21/2022

Wharton Stories

Research Spotlight: Prof. Hamsa Bastani on Using Machine Learning To Combat Human Trafficking

In Wharton Social Impact’s “Research Spotlight” series, we highlight recent research by Wharton professors and doctoral students whose research focuses on the intersection of business and impact. This month, we spoke with Hamsa Bastani, assistant professor of operations, information, and decisions at Wharton. Your study explores how deep web data…

Wharton Stories - 02/18/2022