I'm a fourth-year undergraduate student pursuing degrees in Computer Science and Mathematics at Brown University, where I do research advised by Professor Amy Greenwald and Professor George Konidaris. My main research interest lies in automated decision-making systems, with a focus on reinforcement learning and multi-agent interactions.

Outside of research, I enjoy competitive sports, especially tennis and soccer. I also enjoy strategy games, including Go and the Civilization series.

Research

Spectral Collapse Drives Loss of Plasticity in Deep Continual Learning

To be presented at the NeurIPS 2025 ARLET Workshop

Under review - [paper] [code]

Naicheng He*, Kaicheng Guo*, Arjun Prakash*, Saket Tiwari, Tyrone Serapio, Ruo Yu Tao, Amy Greenwald, George Konidaris

We investigate why deep neural networks suffer from loss of plasticity in deep continual learning, failing to learn new tasks without reinitializing their parameters. We show that this failure is preceded by Hessian spectral collapse at new-task initialization, where meaningful curvature directions vanish and gradient descent becomes ineffective. To characterize the necessary condition for successful training, we introduce the notion of τ-trainability and show that current plasticity-preserving algorithms can be unified under this framework. Targeting spectral collapse directly, we then discuss the Kronecker-factored approximation of the Hessian, which motivates two regularization enhancements: maintaining high effective feature rank and applying L2 penalties. Experiments on continual supervised and reinforcement learning tasks confirm that combining these two regularizers effectively preserves plasticity.
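
To make the two regularizers concrete, here is a minimal PyTorch-style sketch, assuming an entropy-based definition of effective feature rank; the names and coefficients (effective_rank, lambda_rank, lambda_l2) are illustrative assumptions, not the paper's exact implementation.

```python
# Illustrative sketch (assumptions throughout): combine an effective-feature-rank
# regularizer with an L2 penalty on top of the usual task loss.
import torch

def effective_rank(features: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """exp(entropy) of the normalized singular values of a (batch, dim) feature matrix."""
    s = torch.linalg.svdvals(features)   # singular values (differentiable)
    p = s / (s.sum() + eps)              # normalize into a distribution
    entropy = -(p * (p + eps).log()).sum()
    return entropy.exp()

def regularized_loss(task_loss, features, params, lambda_rank=1e-3, lambda_l2=1e-4):
    rank_term = -lambda_rank * effective_rank(features)        # reward high feature rank
    l2_term = lambda_l2 * sum((p ** 2).sum() for p in params)  # standard L2 penalty
    return task_loss + rank_term + l2_term
```

Used in place of the plain task loss at every step, a loss of this shape would maintain feature rank across task boundaries rather than try to repair it after collapse.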

Inverse Reinforcement Learning on GPUDrive

Presented at the NYRL Workshop 2025

In progress - [write-up] [code] [poster]

Naicheng He, Arjun Prakash, Gokul Swamy, Amy Greenwald, Eugene Vinitsky

We explore the use of inverse reinforcement learning (IRL) to develop robust driving policies in GPUDrive. Using demonstrations from either human experts or PPO-trained agents, we investigate GAIL-style approaches with PPO as the inner-loop optimizer, and discriminators trained on egocentric observations or observation-action pairs. Our experiments span 75 worlds with varying numbers of controlled agents, and we investigate the difficulty of scaling IRL from single-agent to multi-agent environments. We evaluate policies using task-relevant metrics such as off-road counts, collisions, goal-reaching rates, and hand-crafted episodic returns. These early results raise intriguing questions about reward generalization, scalability, and the design of efficient algorithms for multi-agent autonomy.
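
For context, a GAIL-style setup trains a discriminator to separate expert state-action pairs from policy rollouts and feeds its output to the PPO inner loop as a surrogate reward. The sketch below is a generic, hedged illustration; the architecture, names, and exact reward form used on GPUDrive are assumptions.

```python
# Hedged GAIL-style sketch (illustrative assumptions, not the exact GPUDrive setup).
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 256):
        super().__init__()
        # Trained on observation-action pairs; an egocentric-observation-only
        # variant would simply drop act_dim from the input.
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1))  # logits

def discriminator_loss(disc, expert_obs, expert_act, policy_obs, policy_act):
    bce = nn.BCEWithLogitsLoss()
    expert_logits = disc(expert_obs, expert_act)
    policy_logits = disc(policy_obs, policy_act)
    # Expert pairs labeled 1, policy rollouts labeled 0.
    return (bce(expert_logits, torch.ones_like(expert_logits))
            + bce(policy_logits, torch.zeros_like(policy_logits)))

def gail_reward(disc, obs, act):
    # Surrogate reward for the PPO inner loop: high when a pair looks expert-like.
    return -torch.log(1 - torch.sigmoid(disc(obs, act)) + 1e-8)
```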

Bi-Level Policy Optimization with Nyström Hypergradients

Under review - [paper] [code] [poster]

Arjun Prakash*, Naicheng He*, Denizalp Goktas, Amy Greenwald

The dependency of the actor on the critic in actor-critic (AC) reinforcement learning means that AC can be characterized as a bilevel optimization (BLO) problem, also called a Stackelberg game. This characterization motivates two modifications to vanilla AC algorithms. First, the critic's update should be nested to learn a best response to the actor's policy. Second, the actor should update according to a hypergradient that takes changes in the critic's behavior into account. Computing this hypergradient involves finding an inverse Hessian-vector product, a process that can be numerically unstable. We thus propose a new algorithm, Bilevel Policy Optimization with Nyström Hypergradients (BLPO), which uses nesting to account for the nested structure of BLO, and leverages the Nyström method to compute the hypergradient. Theoretically, we prove BLPO converges to (a point that satisfies the necessary conditions for) a local strong Stackelberg equilibrium in polynomial time with high probability, assuming a linear parametrization of the critic's objective. Empirically, we demonstrate that BLPO performs on par with or better than PPO on a variety of discrete and continuous control tasks.
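
For intuition, the hypergradient requires applying the inverse of the (regularized) Hessian of the critic's objective to a vector, and the Nyström method approximates this using only a handful of Hessian-vector products. The sketch below is a hedged, generic illustration (the names, ridge term rho, and uniform column sampling are assumptions, not BLPO's exact implementation).

```python
# Hedged Nystrom sketch (assumptions throughout): approximate (H + rho*I)^{-1} v
# for a symmetric Hessian H, given only a Hessian-vector-product oracle hvp(v) = H @ v.
import numpy as np

def nystrom_ihvp(hvp, v, dim, m=20, rho=1e-2, seed=None):
    rng = np.random.default_rng(seed)
    idx = rng.choice(dim, size=m, replace=False)

    def unit(j):
        e = np.zeros(dim)
        e[j] = 1.0
        return e

    # Recover the sampled columns of H with m Hessian-vector products.
    C = np.stack([hvp(unit(j)) for j in idx], axis=1)  # (dim, m)
    W = C[idx, :]                                      # (m, m) core block
    # Nystrom approximation H ~= C W^{-1} C^T; by the Woodbury identity,
    # (rho*I + C W^{-1} C^T)^{-1} v = (v - C (rho*W + C^T C)^{-1} C^T v) / rho,
    # assuming the core block W is invertible (rho also guards conditioning).
    inner = rho * W + C.T @ C
    return (v - C @ np.linalg.solve(inner, C.T @ v)) / rho
```

In the actor's update, an approximation of this shape would stand in for the exact inverse Hessian-vector product inside the hypergradient, trading a small approximation bias for numerical stability at the cost of only m Hessian-vector products.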

→ View More Projects I've Worked On

TA Experience

  • CSCI 1470: Deep Learning (Fall 2023)
  • CSCI 1440: Algorithmic Game Theory (Spring 2024)