I'm a graduate student at the Stanford AI Lab (SAIL), co-advised by Dr. Emily Fox and Dr. John M. Cioffi.

Muhammad Ahmed Mohsin

My research focuses on LLM post-training and inference, including preference optimization, active learning, and alignment for reasoning models, alongside reinforcement learning for high-diversity generation and adaptive agentic test-time compute.

I also develop Internet of Evolving Agents frameworks for self-evolving multi-agent systems with dynamic reputation modeling and social graph-based coordination, and work on applied reinforcement learning for complex, dynamic, and non-stationary decision-making, including RL methods tailored for LLM reasoning.

Building Conoid — visit conoid.ai

Research Interests

Main Area

My primary research spans LLM inference and test-time scaling, including test-time training for scientific discovery under uncertainty, adaptive compute allocation, agentic planning, and stratified scaling search for reasoning in large language models and diffusion language models. I also work on evolving agentic systems through Internet of Evolving Agents frameworks with dynamic reputation modeling and social graph-based coordination. In addition, I focus on reinforcement learning for LLMs, including preference optimization, active learning, alignment, and reward decomposition methods to reduce sycophancy and improve reasoning reliability.

LLM Inference and Test-Time Scaling

Working on test-time training methods for scientific discovery under uncertainty, with a focus on adaptive compute allocation, agentic planning, and stratified scaling search for test-time reasoning in large language models and diffusion language models.

CoLM'26, NeurIPS'26, Ongoing

Evolving Agentic Systems

Developing Internet of Evolving Agents frameworks for self-evolving multi-agent systems with dynamic reputation modeling and social graph-based coordination mechanisms.

NeurIPS'26, Ongoing

Reinforcement Learning for LLMs

Research on preference optimization, active learning, and alignment methods for large language model reasoning systems. Current work also explores reinforcement learning approaches for reward decomposition to mitigate sycophancy and improve alignment.

ICML'26, NeurIPS'26, Ongoing

Experience

Stanford Artificial Intelligence Laboratory (SAIL)

December 2025 – Present · Advisor: Prof. Emily Fox

Project: Internet of Evolving Agents

Co-developed a modular multi-agent ecosystem where autonomous agents evolve their capabilities, reputation, and social connections over time through Bayesian reputation updates, dynamic team formation, and social graph evolution. The framework enables emergent specialization and self-organizing collaboration for complex task execution (NeurIPS 2026, in progress).

Project: Test-Time Compute and Reasoning in Large Language Models

Currently working on adaptive test-time compute strategies for improving reasoning accuracy in LLMs, focusing on dynamic control of inference depth, tool usage, and verification under strict compute constraints. The work studies principled trade-offs between accuracy, latency, and reliability via adaptive compute allocation.

Project: Bayesian Preference Alignment for Mathematical Reasoning

Developed active learning frameworks for Bayesian General Preference Models and Continuous-Utility Direct Preference Optimization (CU-DPO) to align small language models for mathematical reasoning tasks, enabling sample-efficient preference learning with calibrated uncertainty (ICML 2026 and CoLM 2026).

Intel Corporation

September 2024 – December 2024 · Advisor: Dr. John M. Cioffi

Project: Neural Gaussian Radio Fields for Environment Perception

Worked on 3D computer vision-based channel estimation for next-generation wireless networks.
Implemented a CUDA-based, differentiable, real-time pipeline with 1 ms inference time, leading to a KDD 2026 submission.

Samsung Semiconductors

June 2024 – September 2025 · Advisor: Dr. John M. Cioffi

Project: Deep Reinforcement Learning Accelerated Optimization: Graph Neural Networks for Accelerating Low-Rank SDP Solvers (expected NeurIPS 2026)

Developed a constraint-graph representation of SDPs and a GNN encoder (Graph Attention) with sequence prediction to learn rank trajectories directly from problem structure.
Integrated learned rank schedules into low-rank solvers to remove hand-tuned rank heuristics and reduce trial-and-error, yielding up to 3× speedups on large-scale benchmarks.

Selected Publications

NeurIPS 2026 · In progress

Internet of Evolving Agents

Z. Ali*, M. T. Shah*, M. A. Mohsin*, M. Umer, A. Bilal, E. Fox

NeurIPS 2026 · In progress

Sycophancy Disentanglement in LLMs via Reward Decomposition

M. A. Mohsin, A. Bilal, M. Umer, E. Fox

CoLM 2026 · In progress

S³: Stratified Scaling Search for Test-Time in Diffusion Language Models

A. Bilal*, M. A. Mohsin*, M. Umer, D. F. Hougen

CoLM 2026 · In progress

Active Alignment with Bayesian General Preference Models

M. Umer*, M. A. Mohsin*, A. Bilal, E. Vitercik, J. M. Cioffi

ICML 2026 · In progress

Continuous-Utility Direct Preference Optimization

M. A. Mohsin, M. Umer, A. Bilal, E. Vitercik, J. M. Cioffi

ICML 2026 · In progress

What If We Allocate Test-Time Compute Adaptively?

A. Bilal*, M. A. Mohsin*, M. Umer, D. F. Hougen, J. M. Cioffi

JMLR 2026 · In progress

Graph Neural Network for Accelerating Low-Rank SDP Solvers

M. A. Mohsin, M. Umer, A. Bilal, J. M. Cioffi, E. Vitercik

KDD 2026 · In progress

Neural Gaussian Radio Fields for Channel Estimation

M. A. Mohsin*, M. Umer*, A. Bilal, J. M. Cioffi

Conference Travels

IEEE Globecom 2024

Cape Town, South Africa

Presented research on AI-driven wireless networks.

News

2026

Served as an Area Chair for ICASSP.

2026

Selected as a Qualcomm Fellowship finalist.

2026

Serving as Workshop Co-Chair for VTC Fall 2026 in Boston.

September 2025

Served on the Technical Program Committee for NeurIPS 2025 and as a NeurIPS reviewer.

August 2025

Received the Exemplary Reviewer recognition for IEEE Wireless Communications Letters 2025.

July 2025

Joined the IEEE Special Interest Group on AI-driven TN-NTN Networks as a founding member.

July 2025

Paper accepted at ICML 2025 on "Continual Learning for Wireless Channel Estimation," along with a student travel grant to ICML.

May 2025

Received an ICC student travel grant for Montreal and a best workshop paper award for RAG-optimized wireless environment perception.

Jan 2025

Two papers on Hierarchical Deep RL and Joint Source Compression accepted at AAAI 2025 in Philadelphia.

Dec 2024

Two papers on diffusion-based Langevin dynamics and minPMAC optimization accepted at IEEE ICASSP 2025 in India.

Dec 2024

Awarded a Globecom 2024 travel grant for travel to Cape Town.

Sept 2024

Received a Best Poster Award nomination at the 6G Summit in Abu Dhabi.

Apr 2024

Awarded the Rector’s Gold Medal for best undergraduate thesis.

Jan 2024

Accepted to Stanford with the Stanford Graduate Fellowship.

Aug 2022

First paper accepted at ICDAR 2023, outperforming Microsoft’s DiT on table recognition tasks.

Oct 2021

Received the ECAT Scholarship for ranking among the top 10 in Pakistan in the engineering category test.

Jun 2021

Received the President’s Medal for ranking among the top 3 students across Pakistan at the HSSC level.

Jan 2021 – Jun 2023

Awarded the NUST scholarship for maintaining a 4.0 GPA.

Service & Awards

Conference Reviewer

AAAI, ICDAR, GLOBECOM, WCNC, ICASSP, ICC, ICML, NeurIPS, TMLR, KDD

Journal Reviewer

IEEE TVT (Vehicular Technology)
IEEE GCN (Green Comm. & Networks)
IEEE CL (Communication Letters)
IEEE WC (Wireless Communications)
IEEE WCM (Wireless Comm. Magazine)
IEEE TNSE (Network Science & Engineering)

Technical Program Committee & Leadership

WCNC 2026, ICC 2026, VTC 2026, NeurIPS 2025, ICASSP 2026

Workshop Co-Chair: IEEE ICC 2026
Area Chair: ICASSP 2026, NeurIPS 2025

Awards & Travel Grants

★6G Summit Abu Dhabi — Best Poster Nomination

★IEEE ICC Canada — Best Workshop Paper Award

★IEEE Communications Society Competition — Honorary Mention

★IEEE FIT 2026 — Best Main Conference Paper Award

IEEE GLOBECOM 2024 (Cape Town, South Africa)
IEEE ICC 2025 (Canada)
Stanford Conference Travel Grant 2025

Get in Touch

I am always open to discussing new research collaborations and opportunities.

muahmed@stanford.edu

Stanford, CA, USA