Jayshil Jain

👋  Hi, I'm Jayshil
Available for new opportunities

Data Engineer & Solutions Architect

Data Engineer & Solutions Architect with 5+ years at Dell Technologies, building enterprise-grade pipelines, cloud architectures, and AI-powered analytics that drive measurable business outcomes.

5+ Years Exp. · $63M Orders Impacted · 99.9% SLA Uptime · 86% ML Accuracy
Python · Apache Spark · Databricks · Snowflake · Apache Kafka · AWS · GCP · Azure · Airflow · Tableau · Power BI · LangChain · Delta Lake · BigQuery · MLflow · Talend · PySpark · RAG Architectures

Featured Projects

Things I've Built

🧠

Corporate GraphRAG Pipeline

Enterprise GraphRAG pipeline that ingests corporate documents into a dynamic knowledge graph for multi-hop reasoning with LangChain, Neo4j & OpenAI, going beyond traditional vector RAG.

View on GitHub →
📈

Financial Analytics & Sentiment Analysis

End-to-end AWS pipeline extracting real-time news & tweets with time-series forecasting at 92% accuracy. Built with AWS Lambda, Step Functions, Redshift, Airflow & Streamlit.

View project →
🏦

Lending Club Interest Rate Prediction

Interest rate prediction reaching 97% accuracy, benchmarking 3 supervised ML models (Linear Regression, Random Forest, Neural Net) against 3 AutoML frameworks including H2O.ai & AutoSklearn.

View on GitHub →
🎬

IMDB Data Warehouse & BI

ETL pipeline loading 60M+ rows across Postgres, Snowflake, MySQL & Oracle in 70 minutes using Talend. Dashboards in Tableau, Power BI, Qlik Sense & Looker.

View on GitHub →
🦠

COVID-19 Web Scraping & Analysis

Scraped worldometer.info for all countries & US states, stored in SQL Server, built interactive Plotly dashboards & Tableau Public visualizations. ⭐ 3 GitHub Stars.

View on GitHub →

About Me

Turning raw data into real decisions

Jayshil Jain
Data Engineer · Solutions Architect

I'm a Data Engineering Advisor at Dell Technologies where I architect enterprise-scale data pipelines, lead technical proof-of-concepts for Fortune 500 customers, and present executive-level readouts to senior leadership.

In 5+ years I've deployed automated solutions that impacted $63M in orders, built ML models hitting 86.32% accuracy in production failure forecasting, established 99.9% SLA uptime for executive dashboards, and drove a 65% increase in adoption for the Order Life Cycle process mining initiative at Dell.

Prior to my current role, I worked as a BI Engineer Co-op at Dell (Franklin, MA) where I optimized SQL pipelines saving $40K YoY, and as a Research Assistant & TA at Northeastern University where I mentored 150+ graduate students in Spark, SQL, Python & Power BI.

I hold a Master of Science in Information Systems from Northeastern University (GPA 3.84) and won the Global COVID-19 Hackathon in 2020. I'm skilled at bridging the gap between complex data systems and clear business value for stakeholders at every level.

Technical Skills
Programming Languages
Python · SQL · Java · PySpark
Database & Data Warehouse
Snowflake · SQL Server · PostgreSQL · MySQL · MongoDB · Cassandra · Oracle · Vector Databases
Big Data Frameworks
Apache Spark · Databricks · Delta Lake · Apache Kafka · Hadoop · Hive · BigQuery
ETL & BI
Talend · Alteryx · Tableau · Power BI (DAX) · Looker · Qlik Sense · Data Modelling
Cloud & Schedulers
AWS · GCP · Azure · Apache Airflow · Celonis · ER/Studio · SSMS
Generative AI
LangChain · LlamaIndex · RAG Architectures · Prompt Engineering · MLflow
Governance & DevOps
Git · Bitbucket · Jira · Confluence · RBAC · Data Cataloging · Data Lineage · Agile / Scrum
Education
Master of Science in Information Systems
Northeastern University · Boston, MA
Graduated December 2021  ·  GPA: 3.84
Courses: Data Warehousing & BI, Database Design, Big Data, Data Science, Analytics
🏆 Global COVID-19 Hackathon Winner 2020 · ⭐ Featured Top Student
Bachelor of Engineering in Information Technology
University of Mumbai · Mumbai, India
Graduated May 2019
Courses: Data Structures & Algorithms, Database Management Systems

Career

Professional Experience

5+ years designing data solutions across enterprise environments, university research, and strategic business development.

Data Engineering Advisor: Strategic Business Development
Jan 2022 – Present
Dell Technologies  ·  Austin, TX
🏆 Recipient of the Game Changer Award
  • Architected complex data management solutions and delivered technical POCs for large-scale enterprise environments, evaluating cost/performance tradeoffs to drive operational efficiency and influence product development for 500+ users
  • Conducted customer needs analysis and presented executive-level readouts to leadership, successfully positioning strategic initiatives that drove a 65% increase in adoption for the Order Life Cycle (OLC) process mining initiative
  • Delivered compelling technical demonstrations and workshops to 100+ stakeholders, translating complex data infrastructure into clear, quantifiable business value
  • Managed data transformations deploying automated solutions impacting $63M in orders and improving overall data quality at scale
  • Directed ML model development, achieving 86.32% accuracy in failure forecasting and directly improving user retention and product experience
  • Established and managed 99.9% data availability SLAs for executive dashboards; automated Airflow alerts that reduced downtime and ensured proactive incident response
$63M Impacted · 99.9% SLA · 86.32% ML Accuracy · 65% Adoption ↑ · 500+ Users
Python · Airflow · Snowflake · Celonis · Power BI · MLflow · Azure · Databricks
Data Analyst, TA & Research Assistant: Data Engineering
Jan 2021 – Dec 2021
Northeastern University  ·  Boston, MA
🎓 Awarded a Research Assistantship among 1,500+ students
  • Automated AWS data pipelines using Glue, Lambda, and REST APIs feeding into Redshift, cutting processing time by 60% and significantly improving throughput
  • Collaborated with Data Scientists to clean and transform data using Spark DataFrames, ensuring high-quality data ingestion into downstream analytical systems
  • Mentored 150+ graduate students in SQL, Apache Spark, Python, and Power BI through structured workshops and 1-on-1 sessions, building a data-driven academic culture
60% Time Saved · 150+ Students Mentored
AWS Glue · Lambda · Redshift · PySpark · SQL · Power BI
Business Intelligence Engineer Co-op
Aug 2020 – Dec 2020
Dell Technologies  ·  Franklin, MA
⭐ Earned Director's Recommendation
  • Optimized complex SQL queries and resolved critical data quality issues, reducing project lead time by 60% and saving $40,000 YoY
  • Developed Tableau dashboards using REST APIs for multiple product lines, automating reporting pipelines and saving 20% of SQE team time
  • Coordinated with global teams to build PySpark & Airflow pipelines processing 1 TB of daily logs and petabytes of telemetry data
  • Designed enterprise RBAC policies for data pipelines enforcing GDPR & CCPA compliance, mitigating data security risks across the organization
$40K Saved YoY · 60% Faster Queries · 1TB/day Pipelines · GDPR/CCPA Compliant
SQL · Tableau · PySpark · Airflow · RBAC · REST APIs

Portfolio

Projects & Open Source

Real-world data engineering, ML, and BI projects (including all public repositories) from github.com/jayshilj.

🆕 LATEST
🧠

Corporate GraphRAG Pipeline

Machine Learning

Enterprise-grade Graph Retrieval-Augmented Generation (GraphRAG) pipeline for corporate knowledge bases. Ingests unstructured documents, constructs a dynamic knowledge graph, and enables multi-hop reasoning over complex entity relationships, surfacing contextual insights that traditional vector RAG cannot reach.

GraphRAG · LangChain · Neo4j · Python · OpenAI · Knowledge Graph · Vector DB · NLP · LLM
FEATURED
📈

Financial Analytics & Sentiment Analysis

Data Engineering

End-to-end AWS app that extracts real-time news, tweets & trend data and forecasts future market trends with 92% accuracy. Uses Airflow for orchestration and Streamlit for the dashboard.

AWS Lambda · Step Functions · EC2 · S3 · Redshift · Glue · PySpark · Airflow · Streamlit · Locust · PyTest
⭐ TOP PROJECT
🏦

Lending Club Interest Rate Prediction

⭐ 1
Machine Learning

Interest rate prediction algorithm with 97% accuracy. Evaluated 3 supervised ML models (Linear Regression, Random Forest, Neural Networks) and 3 AutoML frameworks (AutoSklearn, H2O.ai, TPOT). Applied MICE imputation and LassoCV feature selection on a dataset of 2.2M+ loan records.

Python · H2O.ai · AutoSklearn · TPOT · LassoCV · MICE Imputation · Scikit-learn
🎬

ETL Pipelines & BI on IMDB Dataset

Data Engineering

Orchestrated a scheduled ETL workflow in Talend that loads tables across 4 databases (60M+ rows) within 70 minutes. Built BI dashboards in Tableau, Power BI (DAX), Qlik Sense, and Looker. Modelled with ER/Studio and profiled with Alteryx.

Talend · Snowflake · PostgreSQL · Alteryx · Tableau · Power BI · Qlik Sense · Looker · ER/Studio
⭐ 3 Stars
🦠

COVID-19 Web Scraping & Analysis

⭐ 3
Data Analysis

Scraped worldometer.info for all countries & US states using BeautifulSoup. Stored cleaned data in SQL Server, with runs scheduled through Windows Task Scheduler. Built interactive Plotly dashboards and published Tableau Public visualizations tracking daily COVID-19 trends.

Python · BeautifulSoup · SQL Server · Plotly · Tableau Public · Pandas
🏪

AdventureWorks Retail Store DW & BI

BI & Analytics

Data warehousing and BI solution for a retail store dataset. Used Alteryx for data profiling and transformation and Talend for the ETL workflow. Built Power BI reports with custom DAX measures for executive dashboards.

Alteryx · Talend · Power BI · DAX · SQL
🏫

School & Staff Data Analysis: Power BI

BI & Analytics

In-depth Power BI dashboard with drill-through and historical trend analysis covering school and staff data from 2009 to 2018. Features executive-level KPIs, cross-filtering, and custom DAX measures.

Power BI · DAX · SQL · Data Modelling

Press & Recognition

In the News

Featured by Northeastern University for academic excellence, global hackathon wins, and pioneering industry experience. Here's where the story was told.

↗ View Article
Jayshil Jain — Northeastern COE Spotlight
🎓 Northeastern COE
March 2021
Student Spotlight · Top Co-op

MS in Information Systems Co-op as Business Intelligence Engineer at Dell Technologies

"Jayshil's work at Dell Technologies exemplifies what the Northeastern co-op experience is all about: applying classroom knowledge to real-world impact from day one."
Northeastern College of Engineering

Featured by Northeastern's College of Engineering for completing a prestigious co-op as a Business Intelligence Engineer at Dell Technologies, building enterprise-scale data solutions and dashboards that impacted business decisions globally.

↗ View Article
SkillUp — Global Hackathon Winner
🏆 Northeastern News
July 2020
🥇 Hackathon Winner · Global Impact

They Want to Make It Easier for Students in Developing Countries to Go Online

"The team's solution stood out for its technical ingenuity and real-world applicability, bridging the digital divide for millions of students worldwide."
Northeastern Global News

Featured in Northeastern's global news for winning an international hackathon with SkillUp 💪, a platform designed to improve internet accessibility and free learning for students in developing countries during COVID-19.

By the Numbers

Featured by Northeastern University
🥇 Global Hackathon Winner
Top MS Information Systems Co-op at Dell
Global Impact through tech & education equity

Contact

Let's work together

Open to full-time Data Engineering, Solutions Architecture, and ML Engineering roles. Also available for consulting and technical advisory engagements.

I'm Jayshil Jain, a Data Engineer and Solutions Architect with 5+ years at Dell Technologies. Whether you have a full-time role, a consulting project, or just want to connect, I'd love to hear from you. I typically respond within 24 hours.

💼 LinkedIn: linkedin.com/in/jayshiljain
✉️ Email: jayshilj@gmail.com
🐙 GitHub: github.com/jayshilj
📍 Location: Austin, TX · Open to relocation
