Graduate Student · Stanford · SAIL

Muhammad Ahmed Mohsin

Co-advised by Dr. Emily Fox and Dr. John M. Cioffi

Research on preference optimization and alignment for LLMs, adaptive test-time scaling and discovery, and self-evolving multi-agent systems.

Research Interests

My research spans preference optimization and alignment for LLMs, adaptive test-time scaling and discovery, and interactive multi-agent systems — with collaborations across Google DeepMind, Meta, Amazon AGI, and Microsoft Core AI.

Preference Optimization and Alignment for LLMs

ICML'26 · CoLM'26 · NeurIPS'26 · EMNLP'26

Designing sample-efficient preference learning and reinforcement-learning methods for aligning language-model reasoning: continuous-utility and general-preference formulations (CU-DPO, General Preference RL), active alignment under Bayesian general preference models with calibrated uncertainty, and sycophancy reduction.

Test-Time Scaling and Test-Time Discovery

ICML'26 · CoLM'26 · NeurIPS'26 · EMNLP'26

Developing adaptive test-time compute methods for reasoning and scientific discovery under uncertainty: dynamic control of inference depth, tool invocation, and verification under strict budgets; stratified scaling search for diffusion language models; and epistemic-uncertainty-driven test-time training for discovery.

Interactive and Multi-Agent Systems

NeurIPS'26 · Ongoing

Building self-evolving multi-agent ecosystems in which agents accumulate scoped memory, earn reputation through Bayesian posteriors, and coordinate over an evolving social graph. Related work on learning from code-agent trajectories via causal redundancy analysis, and on privacy, security, and shared context in collaborative agentic reasoning.

Experience

Stanford Artificial Intelligence Laboratory (SAIL)

Sep 2025 – Present

Prof. Emily Fox

Internet of Evolving Agents

  • Co-developed a modular multi-agent ecosystem where autonomous agents evolve their capabilities, reputation, and social connections over time through Bayesian reputation updates, dynamic team formation, and social graph evolution. The framework enables emergent specialization and self-organizing collaboration for complex task execution (NeurIPS 2026, in progress).

Test-Time Compute and Reasoning in Large Language Models

  • Submitted work on adaptive test-time compute allocation for LLM reasoning, focusing on dynamic control of inference depth, tool invocation, and verification under strict compute budgets, and analyzing principled trade-offs among accuracy, latency, and reliability.

Bayesian Preference Alignment for Mathematical Reasoning

  • Developed active learning frameworks for Bayesian General Preference Models and CU-DPO to align small language models for mathematical reasoning, enabling sample-efficient preference learning with calibrated uncertainty (ICML 2026, CoLM 2026).

Intel Corporation

Sep 2024 – Dec 2024

Dr. John M. Cioffi

Neural Gaussian Radio Fields for Environment Perception

  • Developed a 3D computer vision-based channel estimation framework for next-generation wireless networks, implementing a CUDA-based differentiable real-time pipeline achieving 1 ms inference latency (KDD 2026 submission).

Samsung Semiconductors

Jun 2024 – Sep 2025

Dr. John M. Cioffi

Graph Neural Networks for Accelerating Low-Rank SDP Solvers

  • Developed a constraint-graph representation of SDPs with a Graph Attention Network encoder to predict rank trajectories, integrating learned rank schedules into low-rank solvers to eliminate manual heuristics and achieve up to 3× speedups on large-scale benchmarks (JMLR 2026).

Selected Publications

01
EMNLP 2026 · in progress

Distilling Disagreement at Test Time

Muhammad Ahmed Mohsin, Muhammad Umer, Ahsan Bilal, John M. Cioffi, Emily Fox

02
NeurIPS 2026 · in progress

General Preference Reinforcement Learning

Muhammad Umer*, M. A. Mohsin*, A. Bilal*, E. Fox

03
NeurIPS 2026 · in progress

Epistemic Uncertainty for Test-Time Discovery

Muhammad Ahmed Mohsin*, Kainat Riaz*, Muhammad Umer, Ahsan Bilal, John M. Cioffi, Emily Fox

04
NeurIPS 2026 · in progress

Internet of Evolving Agents

Z. Ali*, M. A. Mohsin*, M. Umer, A. Bilal, E. Fox

05
CoLM 2026 · in progress

Sycophancy Disentanglement in LLMs via Reward Decomposition

M. A. Mohsin*, A. Bilal*, M. Umer, E. Fox

06
CoLM 2026 · in progress

S³: Stratified Scaling Search for Test-Time in Diffusion Language Models

A. Bilal*, M. A. Mohsin*, M. Umer, D. F. Hougen

Conference Travels

IEEE ICASSP 2026

Barcelona, Spain

Presenting work on diffusion models and non-stationary channel estimation

News

2026

Knight-Hennessy Fellowship finalist.

2026

Served as an Area Chair for ICASSP 2026 and NeurIPS 2025.

2026

Selected as a Qualcomm Fellowship finalist.

2026

Serving as Workshop Co-Chair for VTC Fall 2026 in Boston.

Sep 2025

Served on the Technical Program Committee for NeurIPS 2025 and as a NeurIPS reviewer.

Aug 2025

Received the Exemplary Reviewer recognition for IEEE Wireless Communications Letters 2025.

Jul 2025

Added as a founding member of the IEEE Special Interest Group on AI-driven TN-NTN Networks.

Jul 2025

Paper on "Continual Learning for Wireless Channel Estimation" accepted at ICML 2025, with a student travel grant to attend.

May 2025

Received an ICC student travel grant for Montreal and a best workshop paper award for RAG-optimized wireless environment perception.

Jan 2025

Two papers on Hierarchical Deep RL and Joint Source Compression accepted at AAAI 2025 in Philadelphia.

Dec 2024

Two papers on diffusion-based Langevin dynamics and minPMAC optimization accepted at IEEE ICASSP 2025 in India.

Dec 2024

Awarded a Globecom 2024 travel grant for travel to Cape Town.

Sep 2024

Received a Best Poster Award nomination at the 6G Summit in Abu Dhabi.

Apr 2024

Awarded the Rector's Gold Medal for best undergraduate thesis.

Jan 2024

Accepted to Stanford with the Stanford Graduate Fellowship.

Aug 2022

First paper accepted at ICDAR 2023, outperforming Microsoft's DiT on table recognition tasks.

Oct 2021

Received the ECAT Scholarship for ranking among the top 10 in Pakistan in the engineering category test.

Jun 2021

Received the President's Medal for ranking among the top 3 students across Pakistan at the HSSC level.

2021 – 2023

Awarded the NUST scholarship for maintaining a 4.0 GPA.

Service & Awards

Conference Reviewer

AAAI · ICDAR · NeurIPS · ICML · ICLR · KDD · CoLM

Journal Reviewer

IEEE TAI (Transactions on AI) · TMLR (Transactions on Machine Learning Research) · AAI (Applied Artificial Intelligence)

Technical Program Committee

NeurIPS 2025 · ICASSP 2026

Leadership

Workshop Co-Chair: IEEE ICC 2026

Area Chair: ICASSP 2026, NeurIPS 2025

Awards & Recognitions

Stanford Graduate Fellowship

Knight-Hennessy Fellowship — Finalist

President's Medal — Third position nationwide in pre-engineering

6G Summit Abu Dhabi — Best Poster Nomination

IEEE ICC Canada — Best Workshop Paper Award

IEEE Communications Society Competition — Honorary Mention

IEEE FIT 2026 — Best Main Conference Paper Award

Rector's Gold Medal — Best final year project cohort (NUST, 2024)

Travel Grants

IEEE GLOBECOM 2024 (Cape Town, South Africa)

IEEE ICC 2025 (Canada)

Stanford Conference Travel Grant 2025

Get in Touch

I am always open to discussing new research collaborations and opportunities.

muahmed@stanford.edu
Stanford, CA, USA