Soumyadeep Roy

About Me

I am an M.Tech (Research) student at the Department of Computer Science & Automation (CSA), IISc Bangalore, working on theoretical reinforcement learning, including offline RL, regret analysis, and sample complexity. I am also interested in LLM alignment and RLHF and, more recently, GenAI.

Education

M.Tech (Research), Computer Science & Engineering

2023 – 2026

Department of CSA, Indian Institute of Science (IISc), Bangalore

B.Tech, Computer Science & Engineering

2015 – 2019

Govt. College of Engineering & Ceramic Technology, Kolkata

Technical Skills

Programming: Python, C, C++
Machine Learning: PyTorch, NumPy, Scikit-learn, Model Evaluation, Optimization
Reinforcement Learning: MDPs, Q-Learning, Policy Gradient Methods, Offline RL, POMDPs
Large Language Models (LLMs): Transformers, Fine-tuning (basics), Embeddings, Retrieval-Augmented Generation (RAG)
Backend Development: FastAPI, Flask, REST APIs, JSON Handling
Databases: SQL, Query Optimization, Basic Schema Design
Tools & Technologies: Git, Linux, Docker (basic), Jupyter Notebook
Core CS: Operating Systems, DBMS, Data Structures & Algorithms, Computer Networks (basics)

Professional Experience

Junior Research Fellow (CSA, IISc)

Dec 2022 – Present · Bengaluru, India

Working on reinforcement learning, in particular offline RL.

Teaching Assistant

Jan 2025 – Apr 2026 · 1 yr 4 mos · IISc Bangalore

E0 270 Machine Learning (Jan–Apr 2025)
E0 232 Probability and Statistics (Aug–Dec 2025)
E0 331 Optimization for Machine Learning (Jan–Apr 2026)
E1 213 Pattern Recognition & Neural Networks (Jan–Apr 2026)

Conducted tutorials, invigilated examinations, and prepared problem sets, assignments, and exam papers.

Junior Research Fellow (CNS, IISc)

Aug 2022 – Nov 2022 · Bengaluru, India

Worked at the Centre for Neuroscience (CNS), IISc.

Graduate Courses at IISc

Machine Learning

Computational Methods of Optimization

Stochastic Models & Applications

Reinforcement Learning

Measure Theoretic Probability

Topics in Stochastic Approximation Algorithms

Statistical Learning Theory

Stochastic Processes & Queueing Theory

Online Optimization & Control

Projects

🤖 DSA Revision Tracker Bot

Spaced-repetition Telegram bot integrated with Google Calendar to automate daily problem revisions.

Python · Telegram API · PostgreSQL · OAuth2

GitHub →

🚀 AlignLab

Built a reproducible framework for efficient LLM fine-tuning and alignment using LoRA, QLoRA, Reward Modeling, and Direct Preference Optimization (DPO).

LLMs · PyTorch · Hugging Face · RLHF · DPO · PEFT

GitHub →

🏢 AttriSight

Machine learning platform for predicting employee attrition using ensemble models, explainability, and real-time inference.

ML · XGBoost · SHAP · FastAPI

GitHub →

🧠 SARC: Semi-Autonomous Research Collaborator

AI-powered research operating system with LLM councils, experiment orchestration, and automated infrastructure management.

AI Agents · Systems Design · FastAPI · Next.js · Redis · PostgreSQL

GitHub →

🎓 DSA Patterns

Interactive guide mapping 19 core DSA patterns and dependencies across 7 learning tiers.

Education · DSA

GitHub →

Publications

Sample Efficient Active Algorithms for Offline Reinforcement Learning
[arxiv.org]

Blog & Notes

🧠 Personal Notes

My understanding of RL, LLMs, ML, GenAI, and related theory.

View Notes →

📚 Learning Resources

Curated resources for ML, RL, optimization, LLMs, etc.

View Resources →

📖 Books

Books I find interesting!

View Books →

Contact

📍 Mailing Address:
Department of Computer Science & Automation (CSA)
Indian Institute of Science
Bangalore – 560012
Karnataka, India

🌍 View CSA on Google Maps