
Arvinder Singh Dhoul
Data Scientist & Backend Developer
Transforming data into insights and building scalable backend solutions with expertise in DevOps and cloud infrastructure.
About Me
Passionate about leveraging data and technology to solve complex problems. With 3+ years of experience in data science and backend development, I bring a unique blend of analytical thinking and technical expertise.
My Journey
Started as a curious computer science student fascinated by the intersection of mathematics and programming. Over the years, I've evolved into a full-stack data professional who thrives on transforming raw data into actionable insights and building robust systems that scale.
My expertise spans the entire data pipeline - from data collection and processing to machine learning model deployment and infrastructure management. I believe in writing clean, maintainable code and following best practices in software development.
Data Science
Machine learning, statistical analysis, and predictive modeling
Backend Development
Scalable APIs, microservices, and database architecture
DevOps
Cloud infrastructure, CI/CD pipelines, and containerization
Analytics
Business intelligence, data visualization, and insights
Skills & Expertise
A comprehensive skill set spanning data science, backend development, and DevOps
Data Science & ML
Backend Development
DevOps & Cloud
Databases & Tools
Areas of Expertise
Programming Languages
Frameworks & Libraries
Machine Learning
Development Tools
Analytics & BI
Security & Best Practices
Technologies I Work With
Featured Projects
A selection of projects showcasing my expertise in data science, backend development, and DevOps
Customer Retention & Churn Risk Dashboard
Engineered and deployed a full-stack data science application that turns XGBoost churn predictions into actionable business insights. Designed a decision-support interface for simulating retention strategies and protecting Customer Lifetime Value (CLV) by flagging high-risk clients. Improved model generalization and stability through systematic hyperparameter tuning.
Key Features:
- Full-Stack Data Science application deployment
- XGBoost predictions for churn risk analysis
- Decision-support interface to maximize Customer Lifetime Value (CLV)
- Rigorous Hyperparameter Tuning for model stability
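For illustration, here is a minimal sketch of XGBoost training with hyperparameter tuning in the spirit of the project above; the dataset path, feature and target columns, and parameter grid are assumptions, not the project's actual configuration.

```python
# Hedged sketch: XGBoost churn model with hyperparameter tuning.
# Dataset, column names, and grid values are illustrative assumptions.
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from xgboost import XGBClassifier

df = pd.read_csv("customers.csv")        # hypothetical dataset
X = df.drop(columns=["churned"])         # hypothetical feature columns
y = df["churned"]                        # hypothetical binary target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

param_grid = {
    "max_depth": [3, 5, 7],
    "learning_rate": [0.05, 0.1],
    "n_estimators": [200, 400],
    "subsample": [0.8, 1.0],
}

search = GridSearchCV(
    XGBClassifier(eval_metric="logloss"),
    param_grid,
    scoring="roc_auc",   # ranking quality matters when flagging high-risk clients
    cv=5,
    n_jobs=-1,
)
search.fit(X_train, y_train)

# Churn probabilities like these would feed the retention-strategy simulations.
churn_risk = search.best_estimator_.predict_proba(X_test)[:, 1]
print("Best params:", search.best_params_)
print("Held-out AUC:", search.score(X_test, y_test))
```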
Student Exam Performance Predictor
Structured the entire ML workflow (EDA through training) as modular code, ensuring model reproducibility and pipeline consistency. Deployed a lightweight Flask API and UI for real-time inference, enabling quick model validation and prediction serving. Managed and version-controlled all experiment artifacts and training runs, following structured MLOps practices.
Key Features:
- Modular ML workflow architecture for reproducibility
- Deployed a lightweight Flask API/UI for real-time inference
- Managed and version-controlled all experiment artifacts
- Showcases structured MLOps development proficiency
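A minimal sketch of a lightweight Flask inference endpoint like the one described; the route name, payload fields, and model path are assumptions rather than the project's exact setup.

```python
# Hedged sketch of a Flask endpoint serving real-time predictions.
# Route, payload fields, and artifact path are hypothetical.
import joblib
import pandas as pd
from flask import Flask, request, jsonify

app = Flask(__name__)
model = joblib.load("artifacts/model.pkl")   # hypothetical serialized pipeline

@app.route("/predict", methods=["POST"])
def predict():
    # Expects a JSON object of feature name -> value for a single student.
    features = pd.DataFrame([request.get_json()])
    score = float(model.predict(features)[0])
    return jsonify({"predicted_score": score})

if __name__ == "__main__":
    app.run(debug=True)
```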
Telecom Customer Churn Prediction Model
Built a baseline machine learning model to identify customer churn signals from real-world telecom KPIs (data usage, billing, etc.). Compared and evaluated Random Forest and Logistic Regression to determine the stronger predictive algorithm.
Key Features:
- Machine Learning model for customer churn prediction
- Utilized real-world Telecom KPIs (data usage, billing)
- Performance comparison of Random Forest and Logistic Regression
- Deployed with Streamlit
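The model comparison described above could look roughly like the following sketch; the dataset, column names, and cross-validation setup are illustrative assumptions.

```python
# Hedged sketch: comparing Random Forest and Logistic Regression on churn data.
# File name, columns, and encoding choices are assumptions.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("telecom_churn.csv")            # hypothetical dataset
X = pd.get_dummies(df.drop(columns=["churn"]))   # one-hot encode categorical KPIs
y = df["churn"]                                  # assumed binary 0/1 target

models = {
    "logistic_regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "random_forest": RandomForestClassifier(n_estimators=300, random_state=42),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: mean AUC = {scores.mean():.3f}")
```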
Simple Task Management System
Developed a secure, full-stack MERN application that streamlines task tracking. Implemented robust role-based access control and secure authentication with JSON Web Tokens (JWT) for the Admin and User dashboards.
Key Features:
- Full-Stack MERN application for task tracking
- Secure authentication using JSON Web Tokens (JWT)
- Robust role-based access control
- Demonstrates strong backend engineering discipline
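The project itself runs on the MERN stack (Node/Express on the backend); purely to illustrate the JWT role-check pattern described above, and to keep all sketches in one language, here is a minimal Python version using PyJWT. The secret, claim names, and roles are hypothetical.

```python
# Illustrative sketch of JWT-based role checks (the real project uses Node/Express).
# Secret, claims, and roles are hypothetical.
import jwt

SECRET = "change-me"  # hypothetical signing secret; keep real secrets out of code

def issue_token(user_id: str, role: str) -> str:
    # The role is embedded as a claim so each route can authorize per user type.
    return jwt.encode({"sub": user_id, "role": role}, SECRET, algorithm="HS256")

def require_role(token: str, allowed: set[str]) -> dict:
    # Raises jwt.InvalidTokenError if the token is tampered with or malformed.
    claims = jwt.decode(token, SECRET, algorithms=["HS256"])
    if claims.get("role") not in allowed:
        raise PermissionError("insufficient role")
    return claims

admin_token = issue_token("u123", "admin")
print(require_role(admin_token, {"admin"}))   # grants access to the Admin dashboard
```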
Sentiment Analysis
This project is a deep learning-based sentiment analysis tool for classifying tweets as either positive or negative. It utilizes a Bidirectional LSTM (Long Short-Term Memory) model built with TensorFlow and Keras.
Key Features:
- Bidirectional LSTM model for high accuracy
- Advanced text preprocessing and cleaning using NLTK/Scikit-Learn
- Demonstrates expertise in DL model building and training
- Built with TensorFlow and Keras
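A minimal sketch of a Bidirectional LSTM classifier of the kind described; vocabulary size, sequence length, and layer sizes are illustrative assumptions, not the project's exact architecture.

```python
# Hedged sketch: Bidirectional LSTM tweet sentiment classifier in TensorFlow/Keras.
# Vocabulary size, sequence length, and layer widths are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE = 20_000   # assumed tokenizer vocabulary
MAX_LEN = 60          # assumed padded tweet length

model = models.Sequential([
    layers.Input(shape=(MAX_LEN,), dtype="int32"),
    layers.Embedding(VOCAB_SIZE, 128),
    layers.Bidirectional(layers.LSTM(64)),
    layers.Dropout(0.3),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),   # positive vs. negative
])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
# Training would run on padded, NLTK-cleaned token sequences, e.g.:
# model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=5)
```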
Smart Resume Analyzer
Built a Python-based resume analyzer using NLTK, spaCy, and Sentence-BERT, achieving 95% accuracy in matching resumes to job descriptions via NLP. Strengthened text preprocessing with TF-IDF keyword extraction to quantify semantic alignment, improving candidate-job fit evaluation by 40%. Engineered a dynamic suggestion engine that improved resume-job alignment by 25% for better hiring outcomes.
Key Features:
- NLP-powered resume-job matching with 95% accuracy
- Semantic similarity analysis using Sentence-BERT
- Keyword extraction with TF-IDF for key insights
- Dynamic suggestion engine to improve resume alignment
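The semantic matching step described above can be sketched roughly as follows; the Sentence-BERT checkpoint and example texts are assumptions, not the project's exact configuration.

```python
# Hedged sketch: resume-to-job semantic similarity with Sentence-BERT.
# Model name and example texts are hypothetical.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")   # assumed SBERT checkpoint

resume_text = "Python developer with Flask, XGBoost and AWS experience..."
job_text = "Seeking a backend engineer skilled in Python, REST APIs and cloud."

embeddings = model.encode([resume_text, job_text], convert_to_tensor=True)
similarity = util.cos_sim(embeddings[0], embeddings[1]).item()

print(f"Semantic match score: {similarity:.2f}")
# In the full analyzer, a score like this would be combined with TF-IDF keyword
# overlap to drive the suggestion engine described above.
```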
Let's Work Together
Have a project in mind or want to discuss opportunities? I'd love to hear from you and explore how we can collaborate.