3730 Walnut Street
557 Jon M. Huntsman Hall
Philadelphia, PA 19104
Hamsa Bastani is an assistant professor in Operations, Information and Decisions at the Wharton School, University of Pennsylvania. She is interested in machine learning and high-dimensional statistics, with applications to healthcare operations, sustainable supply chains, and revenue management. Her work has been recognized by the INFORMS Pierskalla best paper award as well as the George Nicholson, MSOM, Service Science, and Health Applications Society best student paper awards. She previously completed her PhD at Stanford University, and was a Herman Goldstine postdoctoral fellow at IBM Research.
Abstract: We study the problem of learning across a sequence of price experiments for related products, focusing on implementing the Thompson sampling algorithm for dynamic pricing. We consider a practical formulation of this problem where the unknown parameters of the demand function for each product come from a prior that is shared across products, but is unknown a priori. Our main contribution is a meta dynamic pricing algorithm that learns this prior online while solving a sequence of non-overlapping pricing experiments (each with horizon T) for N different products. Our algorithm addresses two challenges: (i) balancing the need to learn the prior (meta-exploration) with the need to leverage the current estimate of the prior to achieve good performance (meta-exploitation), and (ii) accounting for uncertainty in the estimated prior by appropriately “widening” the prior as a function of its estimation error, thereby ensuring convergence of each price experiment. We prove that the price of an unknown prior for Thompson sampling is negligible in experiment-rich environments (large N). In particular, our algorithm’s meta regret can be upper bounded by O(√(NT)) when the covariance of the prior is known, and O(N^(3/4) √T) otherwise. Numerical experiments on synthetic and real auto loan data demonstrate that our algorithm significantly speeds up learning compared to prior-independent algorithms or a naive approach of greedily using the updated prior across products.
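To make the Thompson sampling ingredient concrete, here is a minimal sketch of a single price experiment. It is a deliberately simplified stand-in, not the meta algorithm from the abstract: it assumes a finite grid of prices, Bernoulli purchase decisions, and a Beta prior on each purchase probability. The function name `thompson_pricing`, the price grid, and the purchase probabilities are hypothetical; the `prior` argument plays the role of the prior that would be shared across products, and learning that prior across experiments is not implemented here.

```python
import random


def thompson_pricing(prices, purchase_probs, horizon, prior=(1.0, 1.0), seed=0):
    """Beta-Bernoulli Thompson sampling over a finite price grid (toy sketch).

    purchase_probs is hidden from the learner; it is only used to simulate
    whether a customer buys at the offered price. All inputs are hypothetical.
    """
    rng = random.Random(seed)
    alpha = [prior[0]] * len(prices)  # prior/posterior "purchase" counts
    beta = [prior[1]] * len(prices)   # prior/posterior "no purchase" counts
    counts = [0] * len(prices)
    revenue = 0.0
    for _ in range(horizon):
        # Sample a plausible purchase probability for each price.
        sampled = [rng.betavariate(a, b) for a, b in zip(alpha, beta)]
        # Offer the price with the highest sampled expected revenue.
        k = max(range(len(prices)), key=lambda i: prices[i] * sampled[i])
        sold = rng.random() < purchase_probs[k]
        # Conjugate update of the posterior for the offered price only.
        if sold:
            alpha[k] += 1.0
            revenue += prices[k]
        else:
            beta[k] += 1.0
        counts[k] += 1
    return counts, revenue


if __name__ == "__main__":
    counts, revenue = thompson_pricing(
        prices=[1.0, 2.0, 3.0, 4.0],
        purchase_probs=[0.9, 0.8, 0.4, 0.1],
        horizon=3000,
    )
    print(counts, revenue)
```

In this toy setup a more informative `prior` (for example, one estimated from earlier products) would shorten the exploration phase, which is the intuition behind learning the prior across experiments.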
Hamsa Bastani and Joann F. de Zegher (Draft), Do Policies with Limited Enforcement Reduce Harm? Evidence from Transshipment Bans.
Abstract: To mitigate environmental and social harm in supply chains, buyers often provide incentives or impose sanctions to discourage harmful behavior by suppliers. However, such policies are often implemented with limited monitoring and enforcement; theory suggests that such conditions may cause strategic behavior by suppliers, leading to unintended consequences. We study empirically if a policy with limited enforcement (1) can reduce harm, (2) leads to evasion and strategic behavior, and (3) increases raw material costs. We study these questions in the context of a ban on seafood transshipments. Seafood transshipments have been associated with illegal fishing and widespread forced labor in seafood supply chains, leading to pressure on seafood buyers to ban transshipments in their supply chain. Buyers have argued against such a ban, indicating that it would simply lead to evasion (because transshipments are difficult to monitor), while increasing costs (because transshipments allow for more efficient logistics). Directly studying the effect of a supply chain ban by buyers is difficult; instead, we study the effect of geographic bans implemented by international management organizations, and show that the resulting findings provide a conservative estimate of the effect of a supply chain ban. Using remote sensing data and exploiting variation over time and across regions, we find that a geographic ban reduces transshipments by 57% despite significant enforcement challenges. A difference-in-difference analysis of landing prices suggests that this reduction comes at a cost of 3.2% higher prices. In contrast to theoretical predictions, the ban does not appear to cause significant strategic evasion.
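The difference-in-differences logic behind the price estimate can be illustrated with a two-group, two-period calculation. This is only a sketch of the basic estimator under the usual parallel-trends assumption; the function name `diff_in_diff` and all price values are hypothetical and are not taken from the paper's data or its regression specification.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Basic 2x2 difference-in-differences estimate.

    Subtracts the change in the control group (regions without a ban) from
    the change in the treated group (regions with a ban), so that shocks
    common to both groups cancel out.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


if __name__ == "__main__":
    # Hypothetical landing prices before and after a ban takes effect.
    effect = diff_in_diff(
        treated_pre=[100.0, 102.0],
        treated_post=[106.0, 108.0],
        control_pre=[98.0, 100.0],
        control_post=[100.0, 102.0],
    )
    print(effect)
```

With these made-up numbers, prices in the treated group rise by 6 and in the control group by 2, so the estimated effect attributed to the ban is 4.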
Abstract: Predictive analytics is increasingly used to guide decision-making in many applications. However, in practice, we often have limited data on the true predictive task of interest, and must instead rely on more abundant data on a closely-related proxy predictive task. For example, e-commerce platforms use abundant customer click data (proxy) to make product recommendations rather than the relatively sparse customer purchase data (true outcome of interest); alternatively, hospitals often rely on medical risk scores trained on a different patient population (proxy) rather than their own patient population (true cohort of interest) to assign interventions. However, not accounting for the bias in the proxy can lead to sub-optimal decisions. Using real datasets, we find that this bias can often be captured by a sparse function of the features. Thus, we propose a novel two-step estimator that uses techniques from high-dimensional statistics to efficiently combine a large amount of proxy data and a small amount of true data. We prove upper bounds on the error of our proposed estimator and lower bounds on several heuristics commonly used by data scientists; in particular, our proposed estimator can achieve the same accuracy with exponentially less true data (in the number of features). Our proof relies on a new tail inequality on the convergence of LASSO for approximately sparse vectors. Finally, we demonstrate the effectiveness of our approach on e-commerce and healthcare datasets; in both cases, we achieve significantly better predictive accuracy as well as managerial insights into the nature of the bias in the proxy data.
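One way such a two-step estimator could look is sketched below: fit a model on the abundant proxy data, then use the small true dataset only to learn a sparse correction (the bias) via an L1 penalty. This is a toy, low-dimensional illustration under assumed linear models, not the paper's exact estimator or tuning; the helper names (`lasso_cd`, `two_step_estimate`, `simulate`), the penalty `lam`, and the simulated data are all hypothetical.

```python
import random


def soft_threshold(z, lam):
    """Shrink z toward zero by lam (the proximal step for an L1 penalty)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=50):
    """Coordinate descent for (1/(2n))*||y - Xb||^2 + lam*||b||_1.

    With lam=0 this reduces to ordinary least squares.
    """
    n, d = len(X), len(X[0])
    cols = [[row[j] for row in X] for j in range(d)]
    scale = [sum(c * c for c in col) / n for col in cols]
    b = [0.0] * d
    resid = list(y)  # equals y - Xb, since b starts at zero
    for _ in range(n_iter):
        for j in range(d):
            if scale[j] == 0.0:
                continue
            rho = sum(c * (r + c * b[j]) for c, r in zip(cols[j], resid)) / n
            new_bj = soft_threshold(rho, lam) / scale[j]
            step = new_bj - b[j]
            if step != 0.0:
                resid = [r - c * step for c, r in zip(cols[j], resid)]
                b[j] = new_bj
    return b


def predict(X, beta):
    return [sum(x * w for x, w in zip(row, beta)) for row in X]


def two_step_estimate(X_proxy, y_proxy, X_true, y_true, lam):
    # Step 1: estimate coefficients from the large proxy dataset.
    beta_proxy = lasso_cd(X_proxy, y_proxy, lam=0.0)
    # Step 2: fit a sparse bias correction on the small true dataset.
    residual = [y - p for y, p in zip(y_true, predict(X_true, beta_proxy))]
    delta = lasso_cd(X_true, residual, lam)
    beta_joint = [bp + dl for bp, dl in zip(beta_proxy, delta)]
    return beta_joint, beta_proxy, delta


def simulate(beta, n, rng, noise=0.1):
    """Draw n Gaussian feature rows and noisy linear outcomes (toy data)."""
    X = [[rng.gauss(0.0, 1.0) for _ in beta] for _ in range(n)]
    y = [sum(x * w for x, w in zip(row, beta)) + rng.gauss(0.0, noise) for row in X]
    return X, y


if __name__ == "__main__":
    rng = random.Random(0)
    beta_true = [1.0, -2.0, 0.5, 0.0, 3.0]
    beta_biased = [2.5, -2.0, 0.5, 0.0, 3.0]  # proxy differs in one feature
    X_p, y_p = simulate(beta_biased, 500, rng)
    X_t, y_t = simulate(beta_true, 25, rng)
    print(two_step_estimate(X_p, y_p, X_t, y_t, lam=0.05))
```

Because the simulated bias touches a single coefficient, the L1-penalized second step only has to identify one nonzero entry, which is why a small true dataset can suffice in this toy setting.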
Hamsa Bastani, Osbert Bastani, Carolyn Kim (Under Review), Interpreting Predictive Models for Human-in-the-Loop Analytics.
Abstract: Machine learning is increasingly used to inform consequential decisions. Yet, these predictive models have been found to exhibit unexpected defects when trained on real-world observational data, which are plagued with confounders and biases. Thus, it is critical to involve domain experts in an interactive process of developing predictive models; interpretability offers a promising way to facilitate this interaction. We propose a novel approach to interpreting complex, blackbox machine learning models by constructing simple decision trees that summarize their reasoning process. Our algorithm leverages active learning to extract richer and more accurate interpretations than several baselines. Furthermore, we prove that by generating a sufficient amount of data through our active learning strategy, the extracted decision tree converges to the exact decision tree, implying that we provably avoid overfitting. We evaluate our algorithm on a random forest to predict diabetes risk on a real electronic medical record dataset, and show that it produces significantly more accurate interpretations than several baselines. We also conduct a user study demonstrating that humans are able to better reason about our interpretations than state-of-the-art rule lists. We then perform a case study with domain experts (physicians) regarding our diabetes risk prediction model, and describe several insights they derived using our interpretation. Of particular note, the physicians discovered an unexpected causal issue by investigating a subtree in our interpretation; we were able to then verify that this endogeneity indeed existed in our data, underscoring the value of interpretability.
Hamsa Bastani, Pavithra Harsha, Georgia Perakis, Divya Singhvi (Under Review), Sequential Learning of Product Recommendations with Customer Disengagement.
Abstract: We consider the problem of sequential product recommendation when customer preferences are unknown. First, we present empirical evidence of customer disengagement using a sequence of ad campaigns from a major airline carrier. In particular, customers decide to stay on the platform based on the relevance of recommendations. We then formulate this problem as a linear bandit, with the notable difference that the customer's horizon length is a function of past recommendations. We prove that any algorithm in this setting achieves linear regret. Thus, no algorithm can keep all customers engaged; however, we can hope to keep a subset of customers engaged. Unfortunately, we find that classical bandit learning as well as greedy algorithms provably over-explore, thereby incurring linear regret for every customer. We propose modifying bandit learning strategies by constraining the action space upfront using an integer program. We prove that this simple modification allows our algorithm to achieve sublinear regret for a significant fraction of customers. Furthermore, numerical experiments on real movie recommendations data demonstrate that our algorithm can improve customer engagement with the platform by up to 80%.
Hamsa Bastani, Joel Goh, Mohsen Bayati (2018), Evidence of Upcoding in Pay-for-Performance Programs, Management Science.
Abstract: Recent Medicare legislation seeks to improve patient care quality by financially penalizing providers for hospital-acquired infections (HAIs). However, Medicare cannot directly monitor HAI rates and instead relies on providers accurately self-reporting HAIs in claims to correctly assess penalties. Consequently, the incentives for providers to improve service quality may disappear if providers upcode, i.e., misreport HAIs (possibly unintentionally) in a manner that increases reimbursement or avoids financial penalties. Identifying upcoding in claims data is challenging because of unobservable confounders (e.g., patient risk). We leverage state-level variations in adverse event reporting regulations and instrumental variables to discover contradictions in HAI and present-on-admission (POA) infection reporting rates that are strongly suggestive of upcoding. We conservatively estimate that over 10,000 out of nearly 60,000 annual reimbursed claims for POA infections (18.5%) were upcoded HAIs, costing Medicare $200 million. Our findings suggest that self-reported quality metrics are unreliable and, thus, that recent legislation may result in unintended consequences.
Hamsa Bastani, Mohsen Bayati, Khashayar Khosravi (Under Review), Mostly Exploration-Free Algorithms for Contextual Bandits.
Abstract: The contextual bandit literature has traditionally focused on algorithms that address the exploration-exploitation tradeoff. In particular, greedy algorithms that exploit current estimates without any exploration may be sub-optimal in general. However, exploration-free greedy algorithms are desirable in practical settings where exploration may be costly or unethical (e.g., clinical trials). Surprisingly, we find that a simple greedy algorithm can be rate-optimal (achieves asymptotically optimal regret) if there is sufficient randomness in the observed contexts (covariates). We prove that this is always the case for a two-armed bandit under a general class of context distributions that satisfy a condition we term covariate diversity. Furthermore, even absent this condition, we show that a greedy algorithm can be rate-optimal with positive probability. Thus, standard bandit algorithms may unnecessarily explore. Motivated by these results, we introduce Greedy-First, a new algorithm that uses only observed contexts and rewards to determine whether to follow a greedy algorithm or to explore. We prove that this algorithm is rate-optimal without any additional assumptions on the context distribution or the number of arms. Extensive simulations demonstrate that Greedy-First successfully reduces exploration and outperforms existing (exploration-based) contextual bandit algorithms such as Thompson sampling or upper confidence bound (UCB).
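A small simulation can illustrate the exploration-free greedy policy described above. The sketch assumes two arms with linear rewards, two-dimensional standard Gaussian contexts (a simple case where the contexts themselves supply randomness), and per-arm least-squares estimates with a tiny ridge term for invertibility. The function name `greedy_bandit`, the arm parameters, and the noise level are hypothetical; the Greedy-First switching rule is not implemented.

```python
import random


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def solve_2x2(A, b):
    """Solve the 2x2 linear system A x = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [
        (A[1][1] * b[0] - A[0][1] * b[1]) / det,
        (A[0][0] * b[1] - A[1][0] * b[0]) / det,
    ]


def greedy_bandit(thetas, horizon, noise=0.5, ridge=1e-3, seed=0):
    """Greedy linear contextual bandit with no forced exploration (toy sketch).

    thetas holds the true arm parameters, used only to simulate rewards
    and to measure regret; the policy sees contexts and rewards only.
    """
    rng = random.Random(seed)
    # Per-arm least-squares statistics: A = ridge*I + sum(x x^T), b = sum(x * reward).
    A = [[[ridge, 0.0], [0.0, ridge]] for _ in thetas]
    b = [[0.0, 0.0] for _ in thetas]
    estimates = [[0.0, 0.0] for _ in thetas]
    regret = 0.0
    for _ in range(horizon):
        x = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
        # Greedy step: pull the arm with the highest estimated reward.
        arm = max(range(len(thetas)), key=lambda a: dot(estimates[a], x))
        reward = dot(thetas[arm], x) + rng.gauss(0.0, noise)
        # Update the pulled arm's statistics and refit its estimate.
        for i in range(2):
            b[arm][i] += x[i] * reward
            for j in range(2):
                A[arm][i][j] += x[i] * x[j]
        estimates[arm] = solve_2x2(A[arm], b[arm])
        regret += max(dot(th, x) for th in thetas) - dot(thetas[arm], x)
    return estimates, regret


if __name__ == "__main__":
    estimates, regret = greedy_bandit([[1.0, 0.5], [-0.5, 1.0]], horizon=2000)
    print(estimates, regret)
```

Here each arm is optimal on roughly half of the context space, so the greedy policy keeps collecting data on both arms without ever exploring deliberately.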
Hamsa Bastani, Mohsen Bayati, Mark Braverman, Ramki Gummadi, Ramesh Johari (Draft), Analysis of Medicare Pay-for-Performance Contracts.
Abstract: Medicare has sought to improve patient care through pay-for-performance (P4P) programs that better align hospitals' financial incentives with quality of service. However, the design of these policies is subject to a variety of practical and institutional constraints, such as the use of "small" performance-based incentives. We develop a framework based on a stylized principal-agent model to characterize the optimal P4P mechanism within any set of feasible mechanisms in the regime of small incentives. Importantly, our feasible set can be flexibly modified to include institutional constraints. We apply our results to examine debated design choices in existing Medicare P4P programs, and offer several insights and policy recommendations. In particular, we find that these mechanisms may benefit by incorporating bonuses for top-performers, and using a single performance cutoff to uniformly assess performance-based payments. We also examine a number of comparative statics that shed light on when P4P mechanisms are effective.
Hamsa Bastani and Mohsen Bayati (Under Revision), Online Decision-Making with High-Dimensional Covariates.
Abstract: Big data has enabled decision-makers to tailor decisions at the individual level in a variety of domains such as personalized medicine and online advertising. This involves learning a model of decision rewards conditional on individual-specific covariates. In many practical settings, these covariates are high-dimensional; however, typically only a small subset of the observed features are predictive of a decision's success. We formulate this problem as a multi-armed bandit with high-dimensional covariates, and present a new efficient bandit algorithm based on the LASSO estimator. The key step in our analysis is proving a new oracle inequality that guarantees the convergence of the LASSO estimator despite the non-i.i.d. data induced by the bandit policy. Furthermore, we illustrate the practical relevance of our algorithm by evaluating it on a simplified version of a medication dosing problem. A patient's optimal medication dosage depends on the patient's genetic profile and medical records; incorrect initial dosage may result in adverse consequences such as stroke or bleeding. We show that our algorithm outperforms existing bandit methods, as well as physicians, in correctly dosing a majority of patients.
Understanding how to use data and business analytics can be the key differentiator between a company's success and failure. This course is designed to introduce fundamental quantitative decision-making tools for a broad range of managerial decision problems. Topics covered include linear, nonlinear, and discrete optimization, dynamic programming, and simulation. Students will apply these quantitative models in applications of portfolio management, electricity auctions, revenue management for airlines, manufacturing, advertising budget allocation, and healthcare scheduling operations. Emphasis in this course is placed on mathematical modeling of real-world problems and implementation of decision-making tools.
Recent Wharton research aims to help companies navigate the complicated waters of Big Data by offering a better way to use predictive analytics.
Knowledge @ Wharton - 2019/04/19