I'm a student at Gati Shakti Vishwavidyalaya working on language models, agentic systems, and mechanistic interpretability. I build to understand.
I'm an AI & Data Science student at Gati Shakti Vishwavidyalaya, Vadodara, specializing in Transportation & Logistics. I build things from scratch, not to collect credentials but to understand how they work.
My current obsession is efficient language models: making small models punch above their weight through architectural choices rather than more compute. I'm also interested in agentic systems and in what actually happens inside transformer networks as they compute their outputs.
A 100–300M parameter language model trained from scratch in PyTorch on consumer hardware (M4 MacBook Air, 16 GB RAM). Incorporates Multi-head Latent Attention (MLA) and DeepSeekMoE from DeepSeek, along with rotary position embeddings (RoPE) and GaLore for memory-efficient training. Fine-tuned via SFT on a human-refined dataset and deployed as a Discord bot with RAG and agentic capabilities.
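As a rough illustration of one technique named above, here is a minimal sketch of RoPE in plain Python. It is not the project's code: the function names `rope` and `dot` are hypothetical, the base of 10000 is the commonly used default, and it rotates adjacent pairs of dimensions (some implementations pair the two halves of the vector instead).

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate each adjacent pair of dimensions by a position-dependent angle."""
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("RoPE needs an even number of dimensions")
    out = []
    for i in range(0, dim, 2):
        # Pair i/2 rotates at frequency base^(-i/dim): early pairs spin fast,
        # later pairs spin slowly.
        theta = position * base ** (-i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Illustrative query/key vectors (made-up values).
q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.25]

# Both pairs of positions are 2 apart, so the attention scores should match.
score_near = dot(rope(q, 5), rope(k, 3))
score_far = dot(rope(q, 12), rope(k, 10))
```

The point of the sketch is the property that makes RoPE useful in attention: the score between a rotated query and a rotated key depends only on how far apart their positions are, not on the absolute positions.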
An agentic Discord moderation system built on a ReAct-style loop with a tool registry, structured JSON tool calling via Gemma, observation injection, retry logic, and async Discord architecture. Built deliberately from scratch to understand every layer of how agents actually work.
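The loop described above can be sketched in a few lines of synchronous Python. This is an illustration under stated assumptions, not the bot's code: `run_agent`, `count_warnings`, `make_scripted_model`, and the `{"tool": ..., "args": ...}` / `{"final": ...}` JSON shapes are all hypothetical, the model is a scripted stand-in rather than Gemma, and the async Discord layer is left out.

```python
import json


def count_warnings(user_id):
    """Hypothetical moderation tool backed by made-up data."""
    return {"u42": 3}.get(user_id, 0)


TOOLS = {"count_warnings": count_warnings}  # tool registry: name -> callable


def run_agent(model, task, tools, max_steps=5, max_retries=2):
    transcript = [f"Task: {task}"]
    for _ in range(max_steps):
        for _attempt in range(max_retries + 1):
            raw = model("\n".join(transcript))
            try:
                call = json.loads(raw)
                if "final" in call:
                    return call["final"]
                result = tools[call["tool"]](**call.get("args", {}))
                break
            except (json.JSONDecodeError, KeyError, TypeError) as err:
                # Retry logic: show the model its mistake and ask again.
                transcript.append(f"Error: {err!r}. Reply with valid JSON.")
        else:
            return "gave up: no valid tool call"
        # Observation injection: the tool result becomes part of the next prompt.
        transcript.append(f"Observation: {json.dumps(result)}")
    return "gave up: step limit reached"


def make_scripted_model(replies):
    """Stand-in for a real model call: returns canned replies in order."""
    it = iter(replies)
    return lambda prompt: next(it)


model = make_scripted_model([
    "not json",  # malformed reply, triggers one retry
    '{"tool": "count_warnings", "args": {"user_id": "u42"}}',
    '{"final": "u42 has 3 warnings"}',
])
answer = run_agent(model, "Should u42 be muted?", TOOLS)
```

Keeping the registry as a plain dictionary means adding a tool is a one-line change, and bounding both steps and retries keeps a confused model from looping forever.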
Preprint · 2024
A single-author study on the mechanistic interpretability of a self-trained small language model. Includes ablations of architectural components to measure what each contributes to model behavior.
Exposure to operational systems and logistics workflows at a divisional railway management office.
I'm always open to interesting conversations, collaborations, or a good technical discussion.