Audrey Huang

I am a Computer Science PhD student at UIUC, where I am fortunate to be advised by Nan Jiang.

Previously, I obtained my MS from CMU's Machine Learning Department, where I worked with Zack Lipton and Kamyar Azizzadenesheli, after completing a BS in Computer Science at Caltech.

google scholar     email


My research focuses on developing and analyzing reinforcement learning algorithms. I care about sample and computational efficiency, and alignment with human desiderata. I commonly use tools from statistical learning theory, optimization, and behavioral economics.


Non-adaptive Online Finetuning for Offline Reinforcement Learning
(under review, 2023) Audrey Huang, Mohammad Ghavamzadeh, Nan Jiang, Marek Petrik.
Given an offline dataset, how should we collect a non-adaptive online dataset in order to maximize improvement over the purely offline policy?

Reinforcement Learning in Low-Rank MDPs with Density Features
(ICML 2023) Audrey Huang*, Jinglin Chen*, Nan Jiang.
We model occupancies for reward-free RL in low-rank MDPs, with a novel inductive error analysis to tame error exponentiation.

Beyond the Return: Off-policy Function Estimation under User-specified Error-measuring Distributions
(NeurIPS 2022) Audrey Huang, Nan Jiang.
Regularization is key for accurate off-policy value and density-ratio estimation under general function approximation.

Supervised Learning with General Risk Functionals
(ICML 2022) Liu Leqi, Audrey Huang, Zachary Lipton, Kamyar Azizzadenesheli.
Uniform convergence and gradient-based optimization of general risk functionals using distribution-centric methods.

Offline Reinforcement Learning with Realizability and Single-policy Concentrability
(COLT 2022) Wenhao Zhan, Baihe Huang, Audrey Huang, Nan Jiang, Jason Lee.
Sample-efficient offline learning under only realizability and single-policy concentrability, leveraging a primal-dual MDP formulation and regularization.

Off Policy Risk Assessment in Markov Decision Processes
(AISTATS 2022) Audrey Huang, Liu Leqi, Zachary Lipton, Kamyar Azizzadenesheli.
Extends the contextual bandits paper below to the sequential MDP setting using a novel CDF Bellman operator. Certain CDF estimators are better for certain plug-in risk estimates.

Off Policy Risk Assessment in Contextual Bandits
(NeurIPS 2021) Audrey Huang, Liu Leqi, Zachary Lipton, Kamyar Azizzadenesheli.
Estimation of general risk measures from logged data using a CDF plus plug-in approach, with a doubly robust CDF estimator for variance reduction.

Graph-Structured Visual Imitation
(CoRL 2019, spotlight) Maximilian Sieb, Zhou Xian, Audrey Huang, Oliver Kroemer, Katerina Fragkiadaki.
Visual entity correspondence-based reward drives successful robotic imitation of manipulation tasks from videos.

Workshop Papers

RiskyZoo: A Library for Risk-Sensitive Supervised Learning
(ICML 2022, Workshop on Responsible Decision Making in Dynamic Environments)
William Wong, Audrey Huang, Liu Leqi, Kamyar Azizzadenesheli, Zachary Lipton.

On the Convergence and Optimality of Policy Gradient for Markov Coherent Risk
(NeurIPS 2020, Real World Reinforcement Learning Workshop)
Audrey Huang, Liu Leqi, Zachary Lipton, Kamyar Azizzadenesheli.