-
The GPT Surprise: Offering Large Language Model Chat in a Massive Coding Class Reduced Engagement but Increased Adopters Exam Performances
Authors:
Allen Nie,
Yash Chandak,
Miroslav Suzara,
Malika Ali,
Juliette Woodrow,
Matt Peng,
Mehran Sahami,
Emma Brunskill,
Chris Piech
Abstract:
Large language models (LLMs) are quickly being adopted in a wide range of learning experiences, especially via ubiquitous and broadly accessible chat interfaces like ChatGPT and Copilot. This type of interface is readily available to students and teachers around the world, yet relatively little research has been done to assess the impact of such generic tools on student learning. Coding education is an interesting test case, both because LLMs have strong performance on coding tasks, and because LLM-powered support tools are rapidly becoming part of the workflow of professional software engineers. To help understand the impact of generic LLM use on coding education, we conducted a large-scale randomized control trial with 5,831 students from 146 countries in an online coding class in which we provided some students with access to a chat interface with GPT-4. We estimate positive benefits on exam performance for adopters, the students who used the tool, but over all students, the advertisement of GPT-4 led to a significant average decrease in exam participation. We observe similar decreases in other forms of course engagement. However, this decrease is modulated by the student's country of origin. Offering access to LLMs to students from low human development index countries increased their exam participation rate on average. Our results suggest there may be promising benefits to using LLMs in an introductory coding class, but also potential harms for engagement, which makes their longer term impact on student success unclear. Our work highlights the need for additional investigations to help understand the potential impact of future adoption and integration of LLMs into classrooms.
Submitted 25 April, 2024;
originally announced July 2024.
-
Exploring causal effects of hormone- and radio-treatments in an observational study of breast cancer using copula-based semi-competing risks models
Authors:
Tonghui Yu,
Mengjiao Peng,
Yifan Cui,
Elynn Chen,
Chixiang Chen
Abstract:
Breast cancer patients may experience relapse or death after surgery during the follow-up period, leading to dependent censoring of relapse. This phenomenon, known as semi-competing risk, imposes challenges in analyzing treatment effects on breast cancer and necessitates advanced statistical tools for unbiased analysis. Despite progress in estimation and inference within semi-competing risks regression, its application to causal inference is still in its early stages. This article aims to propose a frequentist and semi-parametric framework based on copula models that can facilitate valid causal inference, net quantity estimation and interpretation, and sensitivity analysis for unmeasured factors under right-censored semi-competing risks data. We also propose novel procedures to enhance parameter estimation and its applicability in real practice. After that, we apply the proposed framework to a breast cancer study and detect the time-varying causal effects of hormone- and radio-treatments on patients' relapse-free survival and overall survival. Moreover, extensive numerical evaluations demonstrate the method's feasibility, highlighting minimal estimation bias and reliable statistical inference.
Submitted 1 July, 2024;
originally announced July 2024.
-
A Deep Learning Approach to Nonparametric Propensity Score Estimation with Optimized Covariate Balance
Authors:
Maosen Peng,
Yan Li,
Chong Wu,
Liang Li
Abstract:
This paper proposes a novel propensity score weighting analysis. We define two sufficient and necessary conditions for a function of the covariates to be the propensity score. The first is "local balance", which ensures the conditional independence of covariates and treatment assignment across a dense grid of propensity score values. The second condition, "local calibration", guarantees that a balancing score is a propensity score. Using three-layer feed-forward neural networks, we develop a nonparametric propensity score model that satisfies these conditions, effectively circumventing the issue of model misspecification and optimizing covariate balance to minimize bias and stabilize the inverse probability weights. Our proposed method performed substantially better than existing methods in extensive numerical studies of both real and simulated benchmark datasets.
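To make the terms "inverse probability weights" and "covariate balance" concrete, here is a minimal sketch in plain Python. It is not the authors' neural-network method: it assumes the propensity scores are already known, and the function names and toy data are hypothetical, chosen only for illustration.

```python
def ipw_weights(treatment, propensity):
    """Inverse probability weights: 1/e for treated units, 1/(1-e) for controls."""
    return [1.0 / e if t == 1 else 1.0 / (1.0 - e)
            for t, e in zip(treatment, propensity)]


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def mean_difference(x, treatment, weights):
    """Weighted covariate mean among treated units minus that among controls."""
    xt = [xi for xi, ti in zip(x, treatment) if ti == 1]
    wt = [wi for wi, ti in zip(weights, treatment) if ti == 1]
    xc = [xi for xi, ti in zip(x, treatment) if ti == 0]
    wc = [wi for wi, ti in zip(weights, treatment) if ti == 0]
    return weighted_mean(xt, wt) - weighted_mean(xc, wc)


# Hypothetical toy data: one binary covariate x, with an assumed true
# propensity of 0.25 when x == 0 and 0.75 when x == 1.
x = [0, 0, 0, 0, 1, 1, 1, 1]
treatment = [1, 0, 0, 0, 1, 1, 1, 0]
propensity = [0.25 if xi == 0 else 0.75 for xi in x]

weights = ipw_weights(treatment, propensity)
print("unweighted imbalance:", mean_difference(x, treatment, [1.0] * len(x)))
print("weighted imbalance:  ", mean_difference(x, treatment, weights))
```

In this toy sample the unweighted covariate means differ by 0.5 between arms, while weighting by the assumed propensity brings the difference to approximately zero, which is the kind of balance the abstract refers to.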
Submitted 6 April, 2024;
originally announced April 2024.
-
Continuous-Time Modeling and Analysis of Particle Beam Metrology
Authors:
Akshay Agarwal,
Minxu Peng,
Vivek K. Goyal
Abstract:
Particle beam microscopy (PBM) performs nanoscale imaging by pixelwise capture of scalar values representing noisy measurements of the response from secondary electrons (SEs) integrated over a dwell time. Extended to metrology, goals include estimating SE yield at each pixel and detecting differences in SE yield across pixels; obstacles include shot noise in the particle source as well as lack of knowledge of and variability in the instrument response to single SEs. A recently introduced time-resolved measurement paradigm promises mitigation of source shot noise, but its analysis and development have been largely limited to estimation problems under an idealization in which SE bursts are directly and perfectly counted. Here, analyses are extended to error exponents in feature detection problems and to degraded measurements that are representative of actual instrument behavior for estimation problems. For estimation from idealized SE counts, insights on existing estimators and a superior estimator are also provided. For estimation in a realistic PBM imaging scenario, extensions to the idealized model are introduced, methods for model parameter extraction are discussed, and large improvements from time-resolved data are presented.
Submitted 7 March, 2023;
originally announced March 2023.
-
Minimax Optimal Online Imitation Learning via Replay Estimation
Authors:
Gokul Swamy,
Nived Rajaraman,
Matthew Peng,
Sanjiban Choudhury,
J. Andrew Bagnell,
Zhiwei Steven Wu,
Jiantao Jiao,
Kannan Ramchandran
Abstract:
Online imitation learning is the problem of how best to mimic expert demonstrations, given access to the environment or an accurate simulator. Prior work has shown that in the infinite sample regime, exact moment matching achieves value equivalence to the expert policy. However, in the finite sample regime, even if one has no optimization error, empirical variance can lead to a performance gap that scales with $H^2 / N$ for behavioral cloning and $H / \sqrt{N}$ for online moment matching, where $H$ is the horizon and $N$ is the size of the expert dataset. We introduce the technique of replay estimation to reduce this empirical variance: by repeatedly executing cached expert actions in a stochastic simulator, we compute a smoother expert visitation distribution estimate to match. In the presence of general function approximation, we prove a meta theorem reducing the performance gap of our approach to the parameter estimation error for offline classification (i.e. learning the expert policy). In the tabular setting or with linear function approximation, our meta theorem shows that the performance gap incurred by our approach achieves the optimal $\widetilde{O} \left( \min \left( H^{3/2} / N, H / \sqrt{N} \right) \right)$ dependency, under significantly weaker assumptions compared to prior work. We implement multiple instantiations of our approach on several continuous control tasks and find that we are able to significantly improve policy performance across a variety of dataset sizes.
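The three scaling rates quoted in this abstract can be compared numerically. The sketch below is only arithmetic on the stated expressions: it drops the constants and logarithmic factors hidden by the $\widetilde{O}$ notation, and the function names and the sample values of $H$ and $N$ are hypothetical.

```python
import math


def bc_rate(H, N):
    """Behavioral cloning rate from the abstract: H^2 / N."""
    return H ** 2 / N


def moment_matching_rate(H, N):
    """Online moment matching rate from the abstract: H / sqrt(N)."""
    return H / math.sqrt(N)


def replay_rate(H, N):
    """Replay estimation rate from the abstract: min(H^{3/2} / N, H / sqrt(N))."""
    return min(H ** 1.5 / N, H / math.sqrt(N))


# Hypothetical horizons and expert dataset sizes, chosen for illustration.
for H, N in [(10, 100), (100, 100), (100, 10000)]:
    print(f"H={H:>3} N={N:>5}  "
          f"BC={bc_rate(H, N):.3f}  "
          f"MM={moment_matching_rate(H, N):.3f}  "
          f"replay={replay_rate(H, N):.3f}")
```

For any $H \ge 1$ the replay expression is no larger than either of the other two, since $H^{3/2}/N \le H^2/N$ and the minimum is bounded by $H/\sqrt{N}$ by construction.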
Submitted 14 January, 2023; v1 submitted 30 May, 2022;
originally announced May 2022.
-
Linear Representation Meta-Reinforcement Learning for Instant Adaptation
Authors:
Matt Peng,
Banghua Zhu,
Jiantao Jiao
Abstract:
This paper introduces Fast Linearized Adaptive Policy (FLAP), a new meta-reinforcement learning (meta-RL) method that is able to extrapolate well to out-of-distribution tasks without the need to reuse data from training, and to adapt almost instantaneously using only a few samples during testing. FLAP builds upon the idea of learning a shared linear representation of the policy so that when adapting to a new task, it suffices to predict a set of linear weights. A separate adapter network is trained simultaneously with the policy such that during adaptation, we can directly use the adapter network to predict these linear weights instead of updating a meta-policy via gradient descent, as in prior meta-RL methods like MAML, to obtain the new policy. The use of the separate feed-forward network not only speeds up the adaptation run-time significantly, but also generalizes extremely well to very different tasks that prior meta-RL methods fail to generalize to. Experiments on standard continuous-control meta-RL benchmarks show that FLAP achieves significantly stronger performance on out-of-distribution tasks, with up to double the average return and up to 8X faster adaptation run-time when compared to prior methods.
Submitted 12 January, 2021;
originally announced January 2021.
-
Weighed Domain-Invariant Representation Learning for Cross-domain Sentiment Analysis
Authors:
Minlong Peng,
Qi Zhang,
Xuanjing Huang
Abstract:
Cross-domain sentiment analysis is currently a hot topic in both research and engineering. One of the most popular frameworks in this field is the domain-invariant representation learning (DIRL) paradigm, which aims to learn a distribution-invariant feature representation across domains. However, in this work, we find that applying DIRL may harm domain adaptation when the label distribution $\mathrm{P}(\mathrm{Y})$ changes across domains. To address this problem, we propose a modification to DIRL, obtaining a novel weighted domain-invariant representation learning (WDIRL) framework. We show that it is easy to transfer existing SOTA DIRL models to WDIRL. Empirical studies on extensive cross-domain sentiment analysis tasks verified our statements and showed the effectiveness of our proposed solution.
Submitted 17 September, 2019;
originally announced September 2019.
-
Address Instance-level Label Prediction in Multiple Instance Learning
Authors:
Minlong Peng,
Qi Zhang
Abstract:
Multiple Instance Learning (MIL) is concerned with learning from bags of instances, where only bag labels are given and instance labels are unknown. Existing approaches in this field were mainly designed for bag-level label prediction (predicting labels for bags) but not instance-level prediction (predicting labels for instances), with the task loss defined only at the bag level. This restricts their application in many tasks where the instance-level labels are of greater interest. In this paper, we propose a novel algorithm, whose loss is specifically defined at the instance level, to address instance-level label prediction in MIL. We prove that the loss of this algorithm can be unbiasedly and consistently estimated without using instance labels, under the i.i.d. assumption. An empirical study validates the above statements and shows that the proposed algorithm can achieve superior instance-level and comparable bag-level performance, compared to state-of-the-art MIL methods. In addition, it shows that the proposed method can achieve results similar to those of the fully supervised model (trained with instance labels) for label prediction at the instance level.
Submitted 29 May, 2019;
originally announced May 2019.
-
Learning Bodily and Temporal Attention in Protective Movement Behavior Detection
Authors:
Chongyang Wang,
Min Peng,
Temitayo A. Olugbade,
Nicholas D. Lane,
Amanda C. De C. Williams,
Nadia Bianchi-Berthouze
Abstract:
For people with chronic pain, the assessment of protective behavior during physical functioning is essential to understand their subjective pain-related experiences (e.g., fear and anxiety toward pain and injury) and how they deal with such experiences (avoidance or reliance on specific body joints), with the ultimate goal of guiding intervention. Advances in deep learning (DL) can enable the development of such intervention. Using the EmoPain MoCap dataset, we investigate how attention-based DL architectures can be used to improve the detection of protective behavior by capturing the most informative temporal and body configurational cues characterizing specific movements and the strategies used to perform them. We propose an end-to-end deep learning architecture named BodyAttentionNet (BANet). BANet is designed to learn the temporal segments and body parts that are most informative for the detection of protective behavior. The approach addresses the variety of ways people execute a movement (including healthy people) independently of the type of movement analyzed. Through extensive comparison experiments with other state-of-the-art machine learning techniques used with motion capture data, we show statistically significant improvements achieved by using these attention mechanisms. In addition, the BANet architecture requires far fewer parameters than the state of the art for comparable, if not higher, performance.
Submitted 17 July, 2019; v1 submitted 24 April, 2019;
originally announced April 2019.