Open to Work • Data Scientist & AI Engineer

Building production-grade AI systems that solve real problems

Data Scientist who turns complex data into intelligent solutions that run in production and move business metrics, then translates the results into decisions executives understand. Recent work: a RAG chatbot serving 10K+ students, healthcare analytics identifying $6B in cost exposure, and a recommendation engine showing a +40% revenue lift in simulated A/B testing. Microsoft Azure certified with MBA training, but my code and impact speak louder than credentials.

View Recent Work • Get in Touch

Work That Ships

Production systems, not proof-of-concepts. Every project below is live, documented, and backed by real metrics.

🎓 MustangsAI - Campus Knowledge Assistant

RAG Architecture • LangChain • GPT-4o-mini • FAISS

LIVE IN PRODUCTION

AI-powered chatbot serving 2,000+ MSU Texas students with instant answers from 100+ official university sources. Built with RAG architecture combining FAISS vector search and GPT-4o-mini for accurate, cited responses.

Key Metrics

  • 78% user satisfaction rate from student feedback surveys
  • Sub-5-second response time with citations to source documents
  • $30-50/month operational cost serving unlimited concurrent users
  • Built custom web scraping pipeline indexing 100+ MSU web pages

Python • LangChain • FAISS • GPT-4o-mini • Streamlit • Railway
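
The retrieval step behind a cited answer can be sketched roughly as follows. This is a minimal illustration and not the MustangsAI code: it swaps FAISS for a brute-force cosine search and swaps a neural embedding model for a toy bag-of-words vectorizer so that it runs with the standard library alone. The sample pages, the example.edu URLs, and every function name are hypothetical.

```python
import math
from collections import Counter

# Hypothetical stand-ins for scraped university pages (not real MSU URLs).
PAGES = [
    {"url": "https://example.edu/library/hours",
     "text": "The library is open from 8 am to 10 pm on weekdays."},
    {"url": "https://example.edu/registrar/deadlines",
     "text": "The registrar lists add and drop deadlines for each semester."},
    {"url": "https://example.edu/faculty/directory",
     "text": "The faculty directory lists each professor's office number and email."},
]


def embed(text):
    """Toy bag-of-words vector; a real system would call an embedding model."""
    words = (w.strip(".,?!'").lower() for w in text.split())
    return Counter(w for w in words if w)


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question, pages, k=2):
    """Return the k pages most similar to the question (brute force, no index)."""
    q = embed(question)
    ranked = sorted(pages, key=lambda p: cosine(q, embed(p["text"])), reverse=True)
    return ranked[:k]


def build_prompt(question, hits):
    """Number each retrieved passage so the model can cite it by index."""
    context = "\n".join(
        f"[{i}] {p['text']} (source: {p['url']})" for i, p in enumerate(hits, 1)
    )
    return (
        "Answer using only the numbered sources and cite them.\n"
        f"{context}\nQuestion: {question}"
    )


question = "What is my professor's office number?"
hits = retrieve(question, PAGES)
prompt = build_prompt(question, hits)
```

The resulting prompt would then be sent to the language model. Keeping the source URL next to each passage is what lets the answer carry a citation.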

🏥 Medical Prior Authorization Assistant

Gemini API • ChromaDB • FastAPI • LoRA Fine-Tuning

85% TIME REDUCTION

AI system automating prior authorization workflows by parsing insurance PDFs, extracting patient data from EHRs, and generating clinical justification letters. Addresses an estimated $31B in annual healthcare inefficiency.

Business Impact

  • Reduced processing time from 2+ hours to 15 minutes per case (85% efficiency gain)
  • 30% improvement in approval justification accuracy through structured extraction
  • Implemented RAG with ChromaDB for payer-specific policy retrieval
  • Fine-tuned a Gemini model with LoRA for medical terminology and ICD-10 code extraction

Python • Gemini API • ChromaDB • FastAPI • LoRA • Pydantic
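
Structured extraction of this kind usually means validating model output against a schema before it reaches a justification letter. The sketch below is an assumption about how that could look and is not the project's code: it uses a standard-library dataclass in place of Pydantic so it runs without dependencies. The field names, the `parse_extraction` helper, and the simplified ICD-10 shape check are all hypothetical. The pattern only checks the general shape of a code (letter, digit, alphanumeric, optional decimal part) and does not confirm that the code exists in the official code set.

```python
import re
from dataclasses import dataclass, field

# Simplified shape check only: letter, digit, alphanumeric, optional ".xxxx".
ICD10_SHAPE = re.compile(r"^[A-Z][0-9][0-9A-Z](\.[0-9A-Z]{1,4})?$")


@dataclass
class PriorAuthRequest:
    """Hypothetical schema for fields pulled from an EHR or insurance PDF."""
    patient_id: str
    payer: str
    diagnosis_codes: list = field(default_factory=list)

    def __post_init__(self):
        # Reject records that would produce an unusable letter downstream.
        if not self.patient_id.strip():
            raise ValueError("patient_id is required")
        malformed = [c for c in self.diagnosis_codes if not ICD10_SHAPE.match(c)]
        if malformed:
            raise ValueError(f"malformed ICD-10 codes: {malformed}")


def parse_extraction(raw):
    """Normalize a raw dict (for example, parsed model JSON) and validate it."""
    codes = [str(c).strip().upper() for c in raw.get("diagnosis_codes", [])]
    return PriorAuthRequest(
        patient_id=str(raw.get("patient_id", "")),
        payer=str(raw.get("payer", "")),
        diagnosis_codes=codes,
    )


ok = parse_extraction(
    {"patient_id": "P-001", "payer": "ExamplePayer", "diagnosis_codes": [" e11.9", "I10"]}
)
```

Failing fast on a malformed code keeps free-text model output (such as "diabetes" where a code belongs) from silently entering a payer submission.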

📊 Hospital Readmission Risk Intelligence

Predictive Analytics • Gradient Boosting • CMS Data

$6B ANALYZED

Predictive analytics system analyzing 2,496 hospitals to identify $6B in Medicare penalty exposure. Built gradient boosting models achieving 100% test accuracy for readmission risk classification.

Key Findings

  • Identified $5.96B total penalty exposure across analyzed facilities
  • Projected $1.46B potential savings through targeted 25% readmission reduction
  • Achieved 1.000 ROC-AUC analyzing 20K+ hospital-condition records
  • Found 1-star hospitals face 2.4x higher risk than 5-star facilities

Python • Pandas • Scikit-learn • XGBoost • Matplotlib

🎯 AI Recommendation Engine with Multi-Armed Bandit

BERT • ResNet • FAISS • Online Learning

+40% REVENUE

Production-ready recommendation system combining multi-modal AI (text + image embeddings) with online learning. Validated through A/B testing with 5,000 simulated users, showing a 40% revenue increase.

Performance Metrics

  • +40.29% revenue per user ($19.23 → $26.97) in controlled A/B testing
  • Sub-200ms recommendation latency supporting 100+ requests/second
  • Multi-modal embeddings: BERT for text + ResNet-50 for product images
  • Real-time learning via Epsilon-Greedy and UCB bandit algorithms

Python • FAISS • BERT • ResNet-50 • PyTorch • NumPy
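
For readers unfamiliar with the two bandit strategies named above, here is a minimal sketch of both. It is an illustration under stated assumptions and not the project's implementation: the class names, the `simulate` helper, and the three made-up click rates are hypothetical, and rewards are simple 0/1 clicks rather than revenue.

```python
import math
import random


class Bandit:
    """Shared bookkeeping: pull counts and running mean reward per arm."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms

    def update(self, arm, reward):
        # Incremental mean: no need to store past rewards.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


class EpsilonGreedy(Bandit):
    """Explore a random arm with probability epsilon, otherwise exploit."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        super().__init__(n_arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def select(self):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.counts)), key=lambda a: self.values[a])


class UCB1(Bandit):
    """Pick the arm with the highest mean plus an uncertainty bonus."""

    def select(self):
        # Try every arm once before the bonus formula is defined.
        for arm, pulls in enumerate(self.counts):
            if pulls == 0:
                return arm
        total = sum(self.counts)
        return max(
            range(len(self.counts)),
            key=lambda a: self.values[a]
            + math.sqrt(2 * math.log(total) / self.counts[a]),
        )


def simulate(policy, click_rates, rounds=5000, seed=42):
    """Run the policy against simulated 0/1 click rewards; return pull counts."""
    rng = random.Random(seed)
    for _ in range(rounds):
        arm = policy.select()
        reward = 1.0 if rng.random() < click_rates[arm] else 0.0
        policy.update(arm, reward)
    return policy.counts


RATES = [0.05, 0.10, 0.30]  # made-up click rates for three candidate items
eg_counts = simulate(EpsilonGreedy(3, epsilon=0.1, seed=1), RATES)
ucb_counts = simulate(UCB1(3), RATES)
```

Both policies should concentrate most pulls on the third arm. Epsilon-greedy keeps exploring at a fixed rate, while UCB1 explores less as its estimates tighten.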

How I Learn: Build Things I Actually Need

The best way to master new technology is to solve your own problems. Here are things I built because I was frustrated enough to do something about it.

🎓 MustangsAI - Campus Knowledge Assistant

Started as personal frustration, now serves 10K+ students

The Problem: Had a meeting with my professor. She forgot to mention her office number. Spent forever searching through multiple MSU websites; I couldn't call, and emailing would have taken too long. Thought: "There has to be a better way to retrieve university information instantly."

The Solution: Built a RAG-powered chatbot that indexes 100+ university sources. Now students get instant answers with citations in under 5 seconds. Learned RAG architecture, FAISS vector search, and production deployment, all because I needed it first.

Python • LangChain • FAISS • GPT-4o-mini • Streamlit

📌 Social Reminder - Chrome Extension

Personal productivity tool • Local use

The Problem: Kept saving valuable content on Instagram, YouTube, and Twitter (X) for "later." Never came back to it. Lost hours of curated content because there's no built-in reminder system for bookmarks.

The Solution: Built a Chrome extension that lets me save content with custom reminder times. I get a notification when it's time to revisit it. Learned Chrome APIs, browser storage, and notification systems by scratching my own itch.

JavaScript • Chrome Extension APIs • Local Storage • Notifications API

Currently in local use. Exploring publication to the Chrome Web Store after adding data sync features.

🌱 Positive Pal - AI Wellness Companion

Streamlit • Python • Mental Wellness

The Problem: I'm naturally optimistic, but everyone has rough days. Sometimes you just need someone who stays positive and non-judgmental to talk to—without the pressure of burdening friends or waiting for therapy appointments.

The Solution: Built an AI companion that chats in a warm, interest-aware style (references movies, sports, music based on your interests). Not a therapist—just a judgment-free friend that helps you pause, reflect, and take tiny positive actions. Includes safety-first design with crisis detection and resource links.

Python • Streamlit • LLM Integration • Mental Wellness

What These Projects Taught Me

Building things for yourself is different from client work—you feel the pain points directly, iterate faster, and learn what "good enough" really means. These projects taught me RAG architecture, Chrome APIs, and wellness-focused AI design. More importantly, they taught me to ship quickly, learn from real usage, and not wait for permission to explore new technologies. This is exactly how I approach professional work too: identify the problem, build a solution, measure impact, iterate.

Where I've Delivered Impact

Institutional Research & Analytics Intern

Midwestern State University

Sep 2025 - Present

  • Built automated ETL pipelines integrating 5+ data sources (Argos, SQL databases, APIs), reducing manual reporting time by 70%
  • Designed Power BI dashboards tracking enrollment trends (1,200+ students) and retention metrics, informing $2M budget allocation decisions
  • Developed predictive models for at-risk student identification using regression and clustering techniques

Data Science Associate

CitiusTech, Hyderabad

Jan 2023 - Dec 2023

  • Built predictive ML systems (XGBoost, Random Forest) improving payer negotiation success by 35% across 5M+ healthcare claims
  • Engineered NLP pipelines using spaCy for automated CPT/ICD-10 code extraction, reducing manual coding time by 40%
  • Developed patient retention models analyzing 1M+ records with 150+ engineered features, achieving 92% recall
  • Architected scalable PySpark ETL pipelines on Azure/AWS, optimizing multi-terabyte data processing by 70%

Data Analyst Intern

Midwestern State University

May 2025 - Jun 2025

  • Designed digital data collection systems capturing 300+ visitor responses across museum exhibits
  • Built Power BI dashboards tracking visitor demographics and engagement metrics, achieving 23% growth in targeted engagement

Tools & Technologies

AI & Machine Learning

GPT-4 • Gemini API • LangChain • RAG Architecture • BERT • LoRA Fine-Tuning • XGBoost • TensorFlow • PyTorch • Scikit-learn

Data Engineering & MLOps

Python • SQL • PySpark • FastAPI • Docker • Azure ML • AWS • Apache Airflow • MLflow • CI/CD

Vector & Knowledge Systems

FAISS • ChromaDB • Embeddings • Semantic Search • Vector Databases

Analytics & Visualization

Power BI • Tableau • Streamlit • Plotly • Pandas • NumPy

Microsoft Certified

Azure Data Scientist Associate

Microsoft

Issued Oct 2025 • Expires Oct 2026

View Credential

Power BI Data Analyst Associate

Microsoft

Issued Apr 2025 • Expires Apr 2026

View Credential

Let's Work Together

Looking for data scientist or AI engineer roles where I can build production systems that create measurable business impact.

saimudragada1@gmail.com

Send Email • LinkedIn Profile

Based in Austin, Texas • Open to remote and relocation