Hi, I'm Anurag Mishra

Data Scientist & Data Analyst | LLM Specialist

Data Scientist and Data Analyst with hands-on experience developing and deploying scalable LLM-based solutions & ML models in production environments. Currently working at Kounsel on medical AI applications.

About Me

Passionate Data Scientist & Analyst Ready to Make an Impact

Hello! I am a recent graduate from Vellore Institute of Technology (VIT) with a strong foundation in data science, machine learning, and statistical analysis.

My academic journey has equipped me with both theoretical knowledge and practical experience in extracting meaningful insights from complex datasets. Throughout my studies, I have developed proficiency in Python, R, SQL, and various machine learning frameworks. I am particularly interested in predictive modeling, natural language processing, and computer vision applications. My goal is to leverage data-driven approaches to solve challenging business problems and contribute to organizational success.

I am actively seeking opportunities in data science roles where I can apply my analytical skills, continue learning cutting-edge technologies, and work with diverse teams to deliver impactful solutions.

Key Highlights

  • Currently working as a Graduate Data Scientist at Kounsel
  • Specialized in LLM development for medical applications
  • Expert in GCP, MLOps, and production ML deployment
  • Achieved a 25% reduction in false positives and a 20% accuracy improvement on the AudioShield deepfake audio detection project
  • 12+ deployed projects including AudioShield on Hugging Face
  • Certified in Data Science (IBM), Python (Google), ML (UMich), SQL (Kaggle)

Education

Vellore Institute of Technology (VIT)

2021 - 2025

Bachelor of Technology in Computer Science & Engineering
Specializing in Artificial Intelligence and Machine Learning

UG Degree: 83.5%

Specialized in Data Science and Machine Learning, with a focus on statistical analysis and LLMs. Relevant coursework included Statistics, Data Structures, Algorithms, Database Management, Machine Learning, Deep Learning, and Big Data Analytics.

Technical Skills

Programming Languages

Python R

Data Science & ML

Pandas Scikit-Learn Transformers LLM Flask PyTorch TensorFlow MLOps

Tools & Platforms

JIRA Notion Jupyter Power BI Tableau Docker Kubernetes NLTK ZenML Git Bash DVC

Cloud Platforms

Google Cloud Platform (GCP) Amazon Web Services (AWS)

Databases

SQL Milvus Vector DB SQLite

Data Analytics

Seaborn Dask Matplotlib Plotly Bokeh SciPy spaCy

Featured Projects

Here are some of my key projects that demonstrate my data science, data analytics, and machine learning capabilities. Each project includes a live demo and source code for your review.

Completed

Customer Churn Analytics

ML model predicting customer churn with 92% accuracy using ensemble methods, feature engineering, and hyperparameter optimization.

Python Scikit-learn XGBoost Pandas

Completed

Document Based QnA Chatbot

Real-time question-answering chatbot using a GPT model, with web scraping, preprocessing, and Streamlit deployment.

Python GPT Web Scraping Streamlit

Completed

Vehicle Lane Detection System using Computer Vision

Deep learning-based lane detection for autonomous vehicles using computer vision and CNN architectures for real-time video processing.

Python PyTorch OpenCV ResNet

Completed

Movie Recommendation Engine

Hybrid recommendation system combining collaborative filtering, content-based approaches, and matrix factorization, deployed with Gradio and Streamlit.

Python Collaborative Filtering Gradio Streamlit
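
To give a feel for what "hybrid" means here, the sketch below blends a collaborative score with a content-based score. It is a minimal illustration and not the project's actual code: the toy ratings, the genre tags, the function names, and the blending weight alpha are all hypothetical, and it uses simple neighborhood-based collaborative filtering in place of matrix factorization to stay dependency-free.

```python
from math import sqrt

# Hypothetical toy data: user -> {movie: rating on a 1-5 scale}
ratings = {
    "ana": {"Alien": 5, "Heat": 4},
    "ben": {"Alien": 4, "Heat": 5, "Up": 2},
    "cy": {"Up": 5, "Coco": 4},
}

# Hypothetical genre tags used for the content-based side
genres = {
    "Alien": {"scifi", "horror"},
    "Heat": {"crime", "thriller"},
    "Up": {"animation", "family"},
    "Coco": {"animation", "family", "music"},
}


def cosine(a, b):
    """Cosine similarity between two users' rating dictionaries."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def collaborative_score(user, movie):
    """Similarity-weighted average of other users' ratings, scaled to 0..1."""
    num = den = 0.0
    for other, their_ratings in ratings.items():
        if other == user or movie not in their_ratings:
            continue
        sim = cosine(ratings[user], their_ratings)
        num += sim * their_ratings[movie]
        den += sim
    return (num / den) / 5.0 if den else 0.0


def jaccard(a, b):
    """Overlap between two genre sets, in 0..1."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def content_score(user, movie):
    """Rating-weighted genre overlap with the movies the user already rated."""
    seen = ratings[user]
    total = sum(seen.values())
    return sum(r * jaccard(genres[m], genres[movie]) for m, r in seen.items()) / total


def recommend(user, alpha=0.5):
    """Rank unseen movies by alpha * collaborative + (1 - alpha) * content."""
    unseen = [m for m in genres if m not in ratings[user]]
    scored = {
        m: alpha * collaborative_score(user, m) + (1 - alpha) * content_score(user, m)
        for m in unseen
    }
    return sorted(scored, key=scored.get, reverse=True)


print(recommend("cy"))
```

Raising alpha leans on what similar users liked, while lowering it leans on genre similarity, which tends to help for users with few ratings.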

Experience & Achievements

My professional journey and key accomplishments in data science and machine learning

Professional Experience

Graduate Data Scientist

Kounsel
California, US
Oct 2024 – Present

Developing Large Language Models focused on medical benefits and recipe generation. Utilizing GCP and specialized NLP libraries to scale LLM development processes.

  • Built large-scale ingredient nutrition datasets from diverse sources
  • Collaborated with medical professionals for AI-driven dietary recommendations
  • Optimized LLM development process using GCP and NLP frameworks
Python GCP NLP LLM

ML Engineer

Omdena
California, US
Jul 2024 – Sep 2024

Developed the AudioShield project for deepfake audio detection using advanced machine learning techniques.

  • Reduced false positives by 25% using XGBoost optimization
  • Improved model accuracy by 20% through feature engineering
  • Deployed AudioShield demo on Hugging Face platform
Python XGBoost Audio Processing Hugging Face
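
For readers curious how a figure like a 25% reduction in false positives is usually computed, here is a minimal sketch. The confusion-matrix counts are hypothetical and are not AudioShield results, the function names are illustrative, and it assumes the reduction is measured relative to a baseline false-positive rate.

```python
def false_positive_rate(false_positives: int, true_negatives: int) -> float:
    """Share of genuine (negative) samples wrongly flagged as deepfakes."""
    return false_positives / (false_positives + true_negatives)


def relative_reduction(before: float, after: float) -> float:
    """Fractional drop from the baseline value to the improved value."""
    return (before - after) / before


# Hypothetical counts over 200 genuine audio clips, before and after tuning
baseline_fpr = false_positive_rate(false_positives=40, true_negatives=160)
tuned_fpr = false_positive_rate(false_positives=30, true_negatives=170)
reduction = relative_reduction(baseline_fpr, tuned_fpr)

print(f"FPR {baseline_fpr:.2f} -> {tuned_fpr:.2f}, relative reduction {reduction:.0%}")
```

With these made-up counts, the rate drops by 5 percentage points, which is a 25% reduction relative to the baseline.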

Achievements & Recognition

Professional Certifications

Multiple Platforms
Online Learning
2022 – 2024
4+ Certs

Completed industry-recognized certifications to enhance technical skills and stay up to date with the latest trends.

  • Data Science – IBM Professional Certificate
  • Python Programming – Google Professional Certificate
  • Applied Machine Learning – University of Michigan
  • SQL – Kaggle Learn Certification
Data Science Python Machine Learning SQL

Bachelor of Technology

Vellore Institute of Technology
Bhopal, India
2021 – 2025
8.35/10

Computer Science & Engineering with a specialization in Artificial Intelligence and Machine Learning.

  • Maintained 83.5% overall academic performance
  • Specialized in AI & ML with advanced coursework
  • Participated in technical workshops and competitions
  • Completed capstone projects in data science
Data Structures Algorithms Machine Learning Deep Learning

Hackathons & Competitions

Various Platforms
National & International
2023 – 2024
Multiple

Participated in various data science hackathons and coding competitions to enhance practical skills.

  • Participated in 5+ national-level data science hackathons
  • Developed end-to-end ML solutions under time constraints
  • Collaborated with diverse teams on complex problems
  • Gained experience in rapid prototyping and deployment
Rapid Prototyping Team Collaboration Problem Solving ML Deployment

Professional Certifications

Industry-recognized certifications that validate my expertise in data science, machine learning, and programming

Verified

IBM Data Science Professional Certificate

IBM

Comprehensive program covering Python, SQL, data visualization, machine learning, and data analysis using real-world datasets and industry tools.

Python SQL Data Analysis Machine Learning Data Visualization
Completed: 2024

Verified

Google IT Automation with Python

Google

Advanced Python programming for automation, including Git, debugging, configuration management, and cloud deployment.

Python Programming Git Automation Cloud Computing Debugging
Completed: 2023

Verified

Applied Machine Learning in Python

University of Michigan

Hands-on machine learning course covering supervised and unsupervised learning, model evaluation, and practical ML implementation.

Machine Learning Scikit-Learn Model Evaluation Feature Engineering Python
Completed: 2023

Verified

SQL Micro-Course

Kaggle Learn

Comprehensive SQL training covering database queries, joins, aggregations, and advanced SQL techniques for data analysis.

SQL Database Queries Data Analysis Joins Aggregations
Completed: 2022

In Progress

AWS Certified Cloud Practitioner

Amazon Web Services

Currently pursuing AWS cloud fundamentals certification to enhance cloud computing and deployment skills.

Cloud Computing AWS Services Cloud Architecture Security
Expected: 2025

Planned

Continuous Learning

Various Platforms

Committed to continuous professional development through advanced courses in AI/ML, cloud technologies, and emerging data science trends.

Deep Learning MLOps AI Ethics Big Data
Ongoing

Let's Connect & Collaborate

Ready to bring data-driven insights to your team? I'm actively seeking opportunities to apply my expertise in LLM development, machine learning, data science, and data analytics. Let's discuss how we can create impactful solutions together.