Open to Full-Time Roles & Internships

Shruti Borkar.

Data Engineer  ·  AI Builder  ·  Analytics Storyteller

I build data pipelines and AI systems that ship to production. Previously at Five EDC Architects and Tata Consumer Products — cutting reporting time by 65%, hitting 99.9% forecast accuracy, and improving recommendation accuracy by 75%.

LangChain · Agentic AI · RAG Pipelines · OpenAI API · Databricks · Delta Lake · Apache Spark · MLflow
65% Reporting Reduction
99.9% Forecast Accuracy
75% ML Uplift
4.0 MS GPA
00 — About

Hello, I'm Shruti.

I'm a Data Engineer with 2+ years of experience building ETL pipelines and AI systems in production. Currently completing my MS in Business Analytics at UMass Amherst (GPA: 4.0).

My engineering stack is Databricks, Apache Spark, Delta Lake, and Airflow. On the AI side I work with LangChain, RAG pipelines, OpenAI API, and MLflow. I write clean SQL, Python, and PySpark and care about the full pipeline — from raw ingestion to the dashboard a stakeholder actually uses.

At Five EDC Architects I designed SQL/Databricks ETL workflows and built an AI knowledge graph on supply chain management (SCM) data. At Tata Consumer Products I processed 1M+ sales records and deployed ThoughtSpot dashboards that cut reporting time by 45%.

Data Engineering · Agentic AI · RAG Pipelines · LLM Integration · ETL / Delta Lake · BI & Analytics
🎓 Degree: MS Business Analytics, GPA 4.0
🏫 University: UMass Amherst (Expected May 2026)
🎓 Undergraduate: CS & Business Systems, NMIMS Mumbai
📍 Location: Amherst, MA (Open to Relocate)
📞 Phone: +1 (413) 510-7393
✉️ Email: shrutiborkar13@gmail.com
💼 Status: Open to Opportunities ✓
01 — Experience

Work Experience

Jan 2021 – May 2023
Data Engineer
Five EDC Architects
Full-Time · On-site
  • Reduced manual reporting by 65% by designing scalable SQL/Databricks ETL workflows and structured data models, improving data warehousing reliability and operational efficiency.
  • Reduced RFP delays by 28% by delivering Power BI dashboards with DAX KPIs and presenting data storytelling insights to leadership, accelerating strategic decisions.
  • Engineered an AI-powered knowledge graph using SCM data to forecast vendor efficacy, reducing delays by 28%.
  • Improved forecast accuracy to 99.9% by automating KPI pipelines in partnership with finance and operations, strengthening enterprise BI capabilities.
SQL · Databricks · Power BI · DAX · ETL · Data Modeling · AI Knowledge Graph
May 2023 – Jul 2023
Data Science & AI Intern
Tata Consumer Products
Internship · On-site
  • Built ETL pipelines with SQL, Python, and Alteryx to process 1M+ sales records with integrated dashboard components.
  • Performed data profiling, normalization, and encoding via rule-based validation and statistical techniques to identify and correct data anomalies, improving downstream model reliability.
  • Developed collaborative filtering and regression models, boosting product recommendation accuracy by 75%.
  • Applied feature engineering and preprocessing to improve model performance and interpretability.
  • Created SQL-based data pipelines and deployed AI-driven ThoughtSpot dashboards, reducing reporting time by 45%.
Python · SQL · Alteryx · ThoughtSpot · ML · Feature Engineering
02 — Projects

Featured Work

🎬
Netflix Genre Intelligence & Agentic AI
Agentic AI

End-to-end pipeline on Databricks — K-Means gap analysis across 140+ countries, GPT-4 LLM recommendations via LangChain, Streamlit Q&A chatbot, and Power BI DAX dashboard for content investment strategy.

Databricks · LangChain · OpenAI API · Python · SQL · Power BI · K-Means
🧠
Human Behaviour Classification via ML

Multi-label activity classification on the ExtraSensory dataset (60 users, 300K+ samples). 94% accuracy with Random Forest, outperforming Logistic Regression by 17% F1. Statistical QA via t-test, Chi-square, ANOVA.

Python · scikit-learn · Random Forest · Statistical Analysis · ANOVA
👟
Adidas Sales Analytics

Advanced statistical analysis of ~9.6K records. Log-log regression for price elasticity (β₁ ≈ –1.8), K-Means segmentation, COVID-19 time-series, geo-spatial COGS heatmaps revealing $180M cost reduction opportunity.

Python · SQL · Tableau · Regression · K-Means
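
For readers unfamiliar with the log-log elasticity estimate mentioned in the Adidas project, here is a minimal sketch of the idea. It fits ln(quantity) = b0 + b1 · ln(price) by ordinary least squares, and the slope b1 is read as the price elasticity. The function name `log_log_elasticity` and the price and quantity values are hypothetical. The data is synthetic, generated from a curve with elasticity –1.8 purely for illustration, and is not the project's dataset or code.

```python
import math


def log_log_elasticity(prices, quantities):
    """Fit ln(q) = b0 + b1 * ln(p) by ordinary least squares.

    Returns (b0, b1); b1 is the estimated price elasticity.
    Hypothetical helper for illustration only.
    """
    x = [math.log(p) for p in prices]
    y = [math.log(q) for q in quantities]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Closed-form OLS slope: covariance(x, y) / variance(x)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic, noise-free data drawn from q = 5000 * p^(-1.8) (assumed values)
prices = [20.0, 35.0, 50.0, 80.0, 120.0]
quantities = [5000.0 * p ** -1.8 for p in prices]

b0, b1 = log_log_elasticity(prices, quantities)
print(f"estimated elasticity: {b1:.2f}")
```

In a log-log model, a slope near –1.8 means a 1% price increase is associated with roughly a 1.8% drop in units sold, which makes demand elastic.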
🚁
Amazon Air Drone Deployment

Data-driven phased deployment model for Amazon Prime Air across Massachusetts. Mapped 56 hubs across 14 counties covering ~5.1M residents using GIS, FAA airspace analysis, and Python geospatial modeling.

SQL · Power BI · GIS · Python (Folium) · MS Excel
🔬
Skin Lesion Classification
Published

IJCA-published research classifying 7 skin cancer types from 10K+ dermoscopic images. ResNet18 at 85% accuracy outperforms MobileNet (83.1%). CLAHE preprocessing, GLCM features, Grad-CAM interpretability.

Python · ResNet18 · CLAHE · GLCM · Grad-CAM
🫀
Brain Alzheimer Detection (CNN)

Custom CNN classifying 4 Alzheimer stages from 5,121 MRI images. 93.7% accuracy — outperforming DBN (91%) and Multi-Kernel Learning (93.5%). SMOTE class balancing, SeparableConv2D for efficiency.

Python · CNN · SMOTE · Keras · TensorFlow
📊
Employee Analysis — Power BI

HR analytics dashboard with advanced DAX — SWITCH() age banding, SUMX() bonus expense, pay equity area charts. KPI cards for headcount, salary, compensation. Enables DEI monitoring and strategic workforce planning.

DAX · SQL · Power BI · HR Analytics
📦
Warehouse Object Detection

Automated inventory system — YOLOv8 at 97.26% mAP for product detection, Siamese Network at 95% accuracy for SKU classification, 93% shelf emptiness detection on SKU110K (1.7M+ annotations).

Python · YOLOv8 · Siamese Network · SQL · Power BI
03 — Toolkit

Technical Skills

AI / LLMs
LangChain · Agentic AI · RAG Pipelines · OpenAI API · Prompt Engineering · MLflow · NLP
Data Platform
Databricks · Apache Spark · Apache Airflow · Delta Lake · ETL Pipelines · Snowflake · Data Warehousing
Languages
Python (Pandas, PySpark, NumPy) · SQL (MySQL, PostgreSQL, BigQuery) · R · DAX · MS Excel (Pivot, Macros)
Cloud & BI
AWS (S3, Glue) · GCP (BigQuery) · Azure · Power BI · Tableau · ThoughtSpot · Amazon QuickSight · MATLAB
ML / Analytics
scikit-learn · TensorFlow · Statistical Analysis · Feature Engineering · A/B Testing · Regression · Clustering
Workflow & Tools
Agile / Scrum · Jira · Notion · SharePoint · Figma · DMAIC · SAP
04 — Education

Academic Background

2024 – May 2026 (Expected)
M.S. in Business Analytics
University of Massachusetts Amherst
Amherst, MA
GPA: 4.0 / 4.0
Data Engineering · Predictive Modeling · Business Intelligence · Statistical Analysis · Machine Learning · Data Visualization · AI & Analytics
2020 – May 2024
B.Tech — Computer Science & Business Systems
NMIMS University
Mumbai, India
CGPA: 3.57 / 4.0
Machine Learning · Computer Vision · NLP · Database Systems · Data Analytics · Business Management
05 — Certifications & Publications

Credentials

📄
Published Research — IJCA 2024
International Journal of Computer Applications
Borkar et al. "Efficacy check of Haralick and Symmetry features for Skin Lesions Classification" · Vol. 185, No. 3
AI Agent Fundamentals
Databricks
Agentic AI · LangChain · RAG · Delta Lake
SQL Analytics & BI on Databricks
Databricks
SQL Analytics · BI Workflows · Data Lakehouse
Power BI Data Analyst
Microsoft · Coursera
Professional Certificate · DAX · Power Query · BI Reporting
🤖
Gen AI for Data Analysts
IBM · Coursera
Generative AI · LLM Applications · Prompt Engineering
📊
Gen AI for BI Analysts
IBM · Credly Badge
AI-driven Business Intelligence · Generative Analytics
06 — Recommendations

What People Say


What sets Shruti apart is not just technical expertise, but a genuine curiosity and a problem-solving mindset. She is always ready to dive deep into a challenge, bring fresh ideas to the table, and follow through with consistent execution. Any team would be lucky to have her on board.

Adwait Laud
Associate Application Developer @ OFSS

Shruti has been a brilliant performer in the field of Content Marketing at Step Up Student. Her genuine interest and involvement in every task has helped us grow socially! I believe this recommendation would help her stay creative and inspired throughout the whole career.

Mohit Verma
Founder, Project Banao
07 — Contact

Get in touch.

If you have a role, project, or research opportunity you think I'd be a good fit for, feel free to reach out. I check email daily.

Looking For
Actively looking — Summer / Fall 2025
Data Engineer
AI / ML Engineer
Analytics Engineer