About
I’m a researcher at OpenAI with contributions to GPT-OSS and GPT-5.x, models that marked a significant leap in artificial intelligence. I work on reinforcement learning methods that help agents learn from a mix of signals (experience, demonstrations, and feedback) so they can generalize better and make reliable decisions in complex environments. During my Ph.D., I developed new approaches for off-policy RL (Dual RL, CPL), inverse RL (f-IRL, SMORe), and unsupervised RL (RLZero, PSM, RLDP).

Previously, I completed my Ph.D. in the Computer Science Department at UT Austin, co-advised by Prof. Scott Niekum and Prof. Amy Zhang. Before that, I was a Master’s student in Computer Science (2019–20) at Carnegie Mellon University, where I worked in the Robot Perceiving and Doing Lab with Prof. David Held. I’ve also worked on imitative motion planning at Uber ATG, reinforcement learning for large action spaces during an internship at NVIDIA, and semantic segmentation during my time at ETH Zurich.
I received my bachelor’s degree in Computer Science from IIT Kharagpur, supported by the Aditya Birla Scholarship (2015–19). At IIT Kharagpur, I spent most of my time building autonomous driving systems in the Autonomous Ground Vehicle Lab with Prof. Debashis Chakravarty, leading perception and planning efforts (lane detection, Frenet planning, Hybrid A* planning, and segmentation). I also completed my bachelor’s thesis on safe reinforcement learning with Prof. Pabitra Mitra. Outside of research, I enjoy tennis, badminton, skiing, running, hiking, and traveling.
News
- 01/27/2026: RLDP, a new approach to unsupervised RL, was accepted at ICLR 2026.
- 09/18/2025: RLZero, a zero-shot prompt-to-policy approach, was accepted at NeurIPS 2025.
- 05/09/2025: Fast Adaptation with Behavioral Foundation Models, work from my FAIR internship, was accepted at RLC 2025.
- 05/02/2025: Proto Successor Measure (unsupervised RL) was accepted at ICML 2025.
- 04/11/2025: CRESTE: Scalable Mapless Navigation with Internet Scale Priors and Counterfactual Guidance was accepted at RSS 2025.
- 01/22/2025: Iterative Dual RL was accepted at ICLR 2025.
- 09/25/2024: Our scaling-laws study of Direct Alignment Algorithms for RLHF was accepted at NeurIPS 2024.
- 09/04/2024: DILO was accepted at CoRL 2024.
- 08/01/2024: Research intern at Meta FAIR, Paris, working on unsupervised RL.
- 01/16/2024: Dual RL, CPL, and SMORe were accepted at ICLR 2024.
- 05/01/2023: I'll be starting as a research intern at Meta AI Research, working on RL.
- 02/16/2023: Our recent work on Dual RL is now public. Check it out for SOTA algorithms in RL and IL.
- 01/09/2023: rank-game (a unified approach to learning from preferences and demonstrations) was featured in the Microsoft Research Blog.
- 10/23/2022: Our work FlowPlan was awarded Best Paper at IROS BADUE 2022.
- 03/10/2022: I'll be starting as a research intern at NVIDIA research working on reinforcement learning.
- 11/07/2021: Our work on model-based RL, LOOP, was nominated for Best Paper at CoRL 2021.