Hi! I am a Ph.D. student in Computer Science at Texas A&M University, advised by Dr. Shuiwang Ji, and I actively collaborate with Dr. Dileep Kalathil and Dr. James Caverlee. My research focuses on post-training of Large Language Models (LLMs) and Diffusion Language Models (DLMs).

More specifically, I develop curriculum-based reinforcement learning approaches that train models at the frontier of their learnability, continuously expanding what they can reason about and yielding progressively stronger models. I have published multiple papers as (co-)first author in top-tier venues including ICLR, CVPR, and EMNLP.

During my M.S. at Texas A&M, advised by Dr. Shu Kong and Dr. James Caverlee, I focused on identifying and mitigating biases in Vision-Language Models (VLMs) to improve their robustness and fairness, especially for use in multimodal chatbots and diffusion models. Prior to that, I completed my undergraduate degree in Computer Science and Engineering at the Vellore Institute of Technology, Chennai.

🔥 News

  • 2026.03: I have been awarded the ICLR 2026 Financial Assistance grant to attend the conference in Rio!
  • 2026.01: Our paper, “Curriculum Reinforcement Learning from Easy to Hard Tasks Improves LLM Reasoning” was accepted at ICLR 2026!
  • 2025.05: I will be interning as an Applied Scientist at Amazon (Santa Clara, CA) this summer.
  • 2025.02: Our paper, “Few-Shot Recognition via Stage-Wise Retrieval-Augmented Finetuning” was accepted at CVPR 2025!
  • 2024.06: Our paper, “The Neglected Tails in Vision-Language Models” was accepted to DMLR as an Oral.
  • 2024.06: I have begun my Ph.D. at Texas A&M under the supervision of Dr. Shuiwang Ji.
  • 2024.02: Our paper, “The Neglected Tails in Vision-Language Models” was accepted at CVPR 2024!
  • 2023.12: I will be presenting our paper, “Prompting Scientific Names for Zero-Shot Species Recognition” at EMNLP 2023.
  • 2022.02: I was awarded the Texas A&M Computer Science departmental scholarship.

📝 Publications

ICLR 2026

Curriculum Reinforcement Learning from Easy to Hard Tasks Improves LLM Reasoning

Shubham Parashar, Shurui Gui, Xiner Li, Hongyi Ling, Sushil Vemuri, Blake Olson, Eric Li, Yu Zhang, James Caverlee, Dileep Kalathil, Shuiwang Ji

Paper Code

  • We propose E2H Reasoner, a curriculum-based RL approach that schedules tasks from easy to hard with a probabilistic scheduler (sketched below), enabling small LLMs (1.5B–3B) to solve tasks they initially fail zero-shot.
  • We provide theoretical convergence guarantees showing that curriculum stages require fewer samples than learning directly on hard tasks.
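
A minimal sketch of the idea (the stage names and interpolation rule below are illustrative assumptions, not the paper's actual scheduler):

```python
import random

# Illustrative easy-to-hard probabilistic curriculum: sampling weights
# shift from easy stages toward hard ones as training progresses.

STAGES = ["easy", "medium", "hard"]

def stage_weights(progress: float) -> list[float]:
    """Interpolate from an easy-heavy to a hard-heavy distribution.

    progress: fraction of training completed, in [0, 1].
    """
    start = [0.7, 0.2, 0.1]  # early training favors easy tasks
    end = [0.1, 0.2, 0.7]    # late training favors hard tasks
    return [s + progress * (e - s) for s, e in zip(start, end)]

def sample_stage(progress: float) -> str:
    """Draw the difficulty stage for the next training task."""
    return random.choices(STAGES, weights=stage_weights(progress), k=1)[0]

print(sample_stage(0.1))  # most likely "easy"
print(sample_stage(0.9))  # most likely "hard"
```
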
CVPR 2025

Few-Shot Recognition via Stage-Wise Retrieval-Augmented Finetuning

Tian Liu, Huixin Zhang, Shubham Parashar, Shu Kong

Paper Website Poster Code

  • We retrieve open-world data to tackle few-shot recognition and propose stage-wise training to mitigate imbalanced distributions and domain gaps.
CVPR 2024 (DMLR 2024 ORAL)

The Neglected Tails in Vision-Language Models

Shubham Parashar, Zhiqiu Lin, Tian Liu, Xianjue Dong, Yanan Li, Deva Ramanan, James Caverlee, Shu Kong

Project Paper

  • Our study is the first to reveal bias in popular VLMs such as CLIP caused by long-tailed training data.
  • To mitigate this bias, we propose a novel prompting method and a retrieval-augmented strategy.
  • Both methods achieve a new SOTA, and our retrieval-augmented method uses 100x less compute than previous retrieval-augmented strategies.
EMNLP 2023

Prompting Scientific Names for Zero-Shot Species Recognition

Shubham Parashar, Zhiqiu Lin, Yanan Li, Shu Kong

Paper

  • We propose an embarrassingly simple prompting method (sketched below) that boosts the zero-shot accuracy of VLMs on four fine-grained species recognition benchmarks by 2-5x.
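
A minimal sketch of the intuition with Hugging Face CLIP (the species, prompt wording, and image path below are illustrative assumptions):

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Compare zero-shot prompts built from scientific names against prompts
# built from common English names; the latter tend to score much higher.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

scientific = ["a photo of Cardinalis cardinalis", "a photo of Cyanocitta cristata"]
common = ["a photo of a northern cardinal", "a photo of a blue jay"]

image = Image.open("bird.jpg")  # hypothetical input image
for prompts in (scientific, common):
    inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        probs = model(**inputs).logits_per_image.softmax(dim=-1)
    print(dict(zip(prompts, probs[0].tolist())))
```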

📝 Preprints

Preprint 2025

Inference-Time Computations for LLM Reasoning and Planning: A Benchmark and Insights

Shubham Parashar, Blake Olson, Sambhav Khurana, Eric Li, Hongyi Ling, James Caverlee, Shuiwang Ji

Paper Code

  • We construct Sys2Bench, a comprehensive benchmark evaluating inference-time techniques across 11 diverse tasks spanning arithmetic, logical, commonsense, and algorithmic reasoning, as well as planning.
  • We reveal that simply scaling inference-time compute does not consistently improve performance, highlighting fundamental limitations of existing methods.
Preprint 2025

Complex LLM Planning via Automated Heuristics Discovery

Shubham Parashar, Hongyi Ling, Sambhav Khurana, Blake Olson, Anwesha Basu, Gaurangi Sinha, Zhengzhong Tu, James Caverlee, Shuiwang Ji

Paper Code OpenReview

  • We propose AutoHD, which prompts LLMs to generate explicit heuristic functions as Python code to guide inference-time search (see the sketch below), removing reliance on unreliable self-verification.
  • AutoHD refines these heuristic functions through an evolutionary process before applying them in inference-time search.
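
A minimal sketch of heuristic-guided search (the toy task and heuristic below are illustrative assumptions, not heuristics discovered by AutoHD):

```python
import heapq

def heuristic(state: int, target: int) -> float:
    """The kind of explicit heuristic AutoHD asks an LLM to write:
    lower is better; here, distance to the target."""
    return abs(target - state)

def best_first_search(target: int, max_steps: int = 10_000) -> list[int]:
    """Toy planning task: reach `target` from 0 using +1 and *2 moves."""
    frontier = [(heuristic(0, target), 0, [0])]  # (priority, state, path)
    seen = {0}
    for _ in range(max_steps):
        if not frontier:
            break
        _, state, path = heapq.heappop(frontier)
        if state == target:
            return path
        for nxt in (state + 1, state * 2):
            if nxt not in seen and nxt <= 2 * target:
                seen.add(nxt)
                heapq.heappush(frontier, (heuristic(nxt, target), nxt, path + [nxt]))
    return []

print(best_first_search(37))  # [0, 1, 2, 4, 8, 16, 32, 33, 34, 35, 36, 37]
```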

🎖 Honors and Awards

  • 2026: Financial Assistance Award — International Conference on Learning Representations (ICLR) (awarded to ~325 of ~15,000 authors, ~2.2%)
  • 2024: Best Reviewer Award — Annual Conference on Neural Information Processing Systems (NeurIPS)
  • 2024: Travel Award — Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2023: Travel Grant — CSE@TAMU
  • 2023: Department Scholarship — CSE@TAMU

📚 Teaching & Service

Teaching

  • 2024: Teaching Assistant — CSCE 421: Machine Learning (Texas A&M)
  • Spring 2024: Grader — CSCE 636: Deep Learning (Texas A&M)
  • Spring 2023: Grader — CSCE 642: Deep Reinforcement Learning (Texas A&M)

Professional Service

  • 2026: Reviewer — ACM Computing Surveys (CSUR)
  • 2026: Reviewer — IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
  • 2025: Reviewer — Transactions on Machine Learning Research (TMLR)
  • 2025, 2026: Reviewer — International Conference on Machine Learning (ICML)
  • 2025, 2026: Reviewer — International Conference on Learning Representations (ICLR)
  • 2024, 2025: Reviewer — Annual Conference on Neural Information Processing Systems (NeurIPS)
  • 2024: Reviewer — Data-centric Machine Learning Research Workshop @ ICML
  • 2024: Organizer — Visual Perception via Learning in an Open World Workshop @ CVPR
  • 2024: Reviewer — What is Next in Multimodal Foundation Models Workshop @ CVPR

Mentoring

  • 2025–Present: Lakshmi Jotsna Madhavarapu — Incoming Data Scientist Intern @ Capital One
  • 2025–Present: Atharv Chagi — Incoming Software Engineer Intern @ Texas Instruments
  • 2024–2025: Blake Olson — MS CS @ UT Austin
  • 2024–2025: Eric Li — Junior @ Texas A&M University

📖 Education

  • 2024.06 - Present, Ph.D. in Computer Science, Texas A&M University, College Station, Texas
  • 2022.08 - 2024.05, M.S. in Computer Science, Texas A&M University, College Station, Texas
  • 2015.07 - 2019.05, Undergraduate degree in Computer Science and Engineering, Vellore Institute of Technology, Chennai, India

💻 Experience

  • 2025.05 - 2025.08, Applied Scientist Intern, AWS, Santa Clara, CA
  • 2023.05 - 2023.08, HPE, Houston, TX
  • 2019.07 - 2022.09, PayPal, Bangalore, India