
Soumyadeep Roy

MTech (Research), Department of Computer Science & Automation (CSA), Indian Institute of Science (Bengaluru)

About Me

I am an MTech (Research) student in the Department of Computer Science & Automation (CSA), IISc Bangalore, working on theoretical reinforcement learning, including offline RL, regret analysis, and sample complexity. I am also interested in LLM alignment and RLHF and, more recently, GenAI.

Education

M.Tech (Research), Computer Science & Engineering

2023 – 2026

Dept. of CSA, Indian Institute of Science (IISc), Bangalore

B.Tech, Computer Science & Engineering

2015 – 2019

Govt. College of Engineering & Ceramic Technology, Kolkata

Technical Skills

Professional Experience

Junior Research Fellow (CSA, IISc)

Dec 2022 – Present · Bengaluru, India

Working on RL, in particular offline RL.

Teaching Assistant

Jan 2025 – Apr 2026 · 1 yr 4 mos · IISc Bangalore

  • E0 270 Machine Learning (Jan–Apr 2025)
  • E0 232 Probability and Statistics (Aug–Dec 2025)
  • E0 331 Optimization for Machine Learning (Jan–Apr 2026)
  • E1 213 Pattern Recognition & Neural Networks (Jan–Apr 2026)

Conducted tutorials, invigilated examinations, and prepared problem sets, assignments, and exam papers.

Junior Research Fellow (CNS, IISc)

Aug 2022 – Nov 2022 · Bengaluru, India

Worked at the Centre for Neuroscience (CNS), IISc.

Graduate Courses at IISc

Machine Learning
Computational Methods of Optimization
Stochastic Models & Applications
Reinforcement Learning
Measure Theoretic Probability
Topics in Stochastic Approximation Algorithms
Statistical Learning Theory
Stochastic Processes & Queueing Theory
Online Optimization & Control

Projects

🤖 DSA Revision Tracker Bot

Spaced repetition Telegram bot integrated with Google Calendar to automate daily problem revisions.

Python · Telegram API · PostgreSQL · OAuth2 GitHub →

🚀 AlignLab

Built a reproducible framework for efficient LLM fine-tuning and alignment using LoRA, QLoRA, Reward Modeling, and Direct Preference Optimization (DPO).

LLMs · PyTorch · Hugging Face · RLHF · DPO · PEFT GitHub →

🏒 AttriSight

Machine learning platform for predicting employee attrition using ensemble models, explainability, and real-time inference.

ML · XGBoost · SHAP · FastAPI GitHub →

🧠 SARC: Semi-Autonomous Research Collaborator

AI-powered research operating system with LLM councils, experiment orchestration, and automated infrastructure management.

AI Agents · Systems Design · FastAPI · Next.js · Redis · PostgreSQL GitHub →

🎓 DSA Patterns

Interactive guide mapping 19 core DSA patterns and dependencies across 7 learning tiers.

Education · DSA GitHub →

Publications

Sample Efficient Active Algorithms for Offline Reinforcement Learning
[arxiv.org]

Blog & Notes

🧠 Personal Notes

My understanding of RL, LLMs, ML, GenAI, and related theory.

View Notes →

📚 Learning Resources

Curated resources for ML, RL, optimization, LLMs, etc.

View Resources →

📖 Books

Books I find interesting!

View Books →

Contact

📍 Mailing Address:
Department of Computer Science & Automation (CSA)
Indian Institute of Science
Bangalore – 560012
Karnataka, India

🌍 View CSA on Google Maps

Get In Touch