// Open to Data + AI Opportunities

Turning raw data into actionable insight.

Hi, I'm Sumant Jadiyappagoudar, a data analyst specialising in machine learning, SQL analytics, Tableau dashboards, and bioinformatics pipelines.

Tools & Frameworks

Python SQL Tableau Power BI scikit-learn

// Technical Stack

Skills & Proficiency

Analytics & BI

Python (pandas, NumPy): 92%

SQL / PostgreSQL: 88%

Tableau: 85%

Power BI / DAX: 80%

Machine Learning

scikit-learn / XGBoost: 88%

NLP (TF-IDF, BERT): 82%

Statistical Analysis: 85%

Bioinformatics (R, Biopython): 75%

// Featured Builds

Projects I'm proud of

NLP + Machine Learning

Amazon Review Sentiment Classifier

Python · scikit-learn · TF-IDF · DistilBERT · FastAPI · Streamlit · NLTK · pandas

Built and deployed an NLP classifier on 20,000 Amazon Fine Food Reviews, achieving 91% accuracy and 0.954 ROC-AUC with TF-IDF, Logistic Regression, class-balanced training, and GridSearchCV optimisation.

Benchmarked against a DistilBERT baseline, built FastAPI and Streamlit inference flows, and documented error-analysis findings around sarcasm, short reviews, and class-level F1 performance.

⎇ Repository ⚡ Live App
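
To illustrate the TF-IDF weighting that feeds the classifier described above, here is a minimal hand-rolled sketch in plain Python. It is not the project's code, which lists scikit-learn in its stack. The whitespace tokeniser, the smoothed IDF formula, and the three sample reviews are assumptions made for this example.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one L2-normalised {term: weight} dict per document.

    Illustrative only: lowercase whitespace tokenisation and a smoothed
    IDF, idf = ln((1 + N) / (1 + df)) + 1, are assumptions of this sketch.
    """
    tokenised = [doc.lower().split() for doc in documents]
    n_docs = len(tokenised)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    idf = {
        term: math.log((1 + n_docs) / (1 + df)) + 1.0
        for term, df in doc_freq.items()
    }

    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        weights = {term: count * idf[term] for term, count in counts.items()}
        # Scale each document vector to unit length.
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({term: w / norm for term, w in weights.items()})
    return vectors


# Hypothetical sample reviews, not taken from the dataset.
reviews = ["great taste great price", "terrible taste", "great coffee"]
vectors = tfidf(reviews)
```

In the first review, "price" ends up with a higher weight than "taste" because it appears in fewer documents, which is the behaviour that makes rare, informative words stand out to a linear model.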

Pharmacovigilance Analytics

FDA FAERS Signal Detection

Python · pandas · Plotly · Streamlit · SQL · SciPy · PRR

Analysed 528,000 FDA FAERS adverse-event records from 2015–2025 and applied Proportional Reporting Ratio methodology to identify 8,920 disproportionate drug-reaction signals, including 1,403 high-priority pairs.

Built a reproducible pipeline from raw CSVs to a 218,977-report serious-event subset, then developed a Streamlit dashboard with dynamic Plotly views.

⎇ Repository
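
For readers unfamiliar with the Proportional Reporting Ratio, the sketch below shows how it can be computed from a 2x2 table of report counts. The function names, the example counts, and the screening thresholds (at least 3 cases, PRR of at least 2, lower confidence bound above 1) are illustrative assumptions, not the project's confirmed criteria.

```python
import math


def prr(a, b, c, d):
    """Proportional Reporting Ratio with an approximate 95% CI.

    a: reports with the drug and the reaction
    b: reports with the drug and any other reaction
    c: reports with other drugs and the reaction
    d: reports with other drugs and any other reaction
    """
    if a <= 0 or c <= 0:
        raise ValueError("a and c must be positive to compute PRR")
    ratio = (a / (a + b)) / (c / (c + d))
    # Standard error of ln(PRR) for a ratio of two proportions.
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(ratio) - 1.96 * se)
    upper = math.exp(math.log(ratio) + 1.96 * se)
    return ratio, lower, upper


def is_signal(a, b, c, d, min_cases=3, min_prr=2.0):
    """Flag a drug-reaction pair using assumed screening thresholds."""
    if a < min_cases:
        return False
    ratio, lower, _ = prr(a, b, c, d)
    return ratio >= min_prr and lower > 1.0


# Hypothetical counts: the reaction appears in 20% of reports for the drug
# versus 1% of reports for all other drugs.
ratio, lower, upper = prr(20, 80, 100, 9900)
```

With these made-up counts the ratio is 20, meaning the reaction is reported twenty times as often for this drug as for the comparison set. A disproportionate ratio points to a reporting pattern worth reviewing, not to proof of causation.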

E-commerce Analytics

E-commerce Sales Analysis - Olist

Python · DuckDB · pandas · XGBoost · SHAP · Streamlit · Plotly

Built an end-to-end analytics project on 99,441 Brazilian e-commerce orders, joining 9 Olist datasets into a DuckDB star schema and engineering RFM, delivery, payment, review, and purchase-behaviour features.

Developed customer segmentation, churn prediction, and 12-month CLV modelling, then shipped a Streamlit dashboard with segment explorer, churn-risk, CLV, and business-summary views.

⎇ Repository ⚡ Live App
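
The RFM features mentioned above (recency, frequency, monetary value) can be sketched in a few lines of plain Python. This is an illustration under assumed inputs, not the project's DuckDB implementation: the tuple layout, the `rfm_features` name, the snapshot date, and the sample orders are all hypothetical.

```python
from datetime import date


def rfm_features(orders, snapshot):
    """Aggregate (customer_id, order_date, amount) tuples into RFM features.

    recency_days: days between the customer's last order and the snapshot
    frequency:    number of orders
    monetary:     total amount spent
    """
    totals = {}
    for customer_id, order_date, amount in orders:
        row = totals.setdefault(
            customer_id,
            {"last_order": order_date, "frequency": 0, "monetary": 0.0},
        )
        row["last_order"] = max(row["last_order"], order_date)
        row["frequency"] += 1
        row["monetary"] += amount

    return {
        customer_id: {
            "recency_days": (snapshot - row["last_order"]).days,
            "frequency": row["frequency"],
            "monetary": round(row["monetary"], 2),
        }
        for customer_id, row in totals.items()
    }


# Hypothetical orders, not taken from the Olist data.
sample_orders = [
    ("c1", date(2018, 1, 10), 50.0),
    ("c1", date(2018, 3, 1), 30.0),
    ("c2", date(2018, 2, 15), 120.0),
]
features = rfm_features(sample_orders, snapshot=date(2018, 3, 31))
```

Features like these are typical inputs for segmentation and churn models: a customer with high recency and low frequency looks very different from one who ordered last week.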

Bioinformatics Automation

AI-Driven Gene Expression Analysis Pipeline

Python · R · FastAPI · PostgreSQL · n8n · Plotly · JavaScript

Engineered an end-to-end bioinformatics pipeline processing 10,000–20,000 genes per dataset using differential gene expression analysis, empirical Bayes methods, and a 0.05 p-value cutoff.

⎇ Repository

People Analytics

Employee Attrition Prediction

Python · scikit-learn · XGBoost · pandas · SMOTE · seaborn

Built an ML pipeline on the IBM HR Analytics dataset to predict employee attrition with 86.7% accuracy, using XGBoost and applying SMOTE to correct the class imbalance (attrition cases make up only 16.1% of records).

⎇ Repository
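
The core idea behind SMOTE, as used above for class imbalance, is to create synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The sketch below is a simplified plain-Python illustration, not the project's code and not a library implementation. The `smote_sketch` name, the default `k`, and the sample points are assumptions.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Generate synthetic minority samples by neighbour interpolation.

    minority: list of equal-length numeric tuples (minority-class rows)
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # The k nearest minority neighbours of the base point (excluding itself).
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical two-feature minority rows.
minority_rows = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5), (1.2, 2.2)]
new_rows = smote_sketch(minority_rows, n_synthetic=6, k=2)
```

Oversampling of this kind is normally applied to the training split only, so that synthetic rows never leak into the data used for evaluation.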

Business Intelligence

Superstore Sales Analytics

SQL · PostgreSQL · Power BI · Data Modeling · DAX

Designed a complete analytics pipeline from raw CSV data to interactive Power BI dashboards. Built DAX measures and data models for revenue, profit margin, and monthly trend drill-downs.

⎇ Repository

YouTube Analytics

Python · YouTube Data API v3 · PostgreSQL · pandas · SQLAlchemy · seaborn

Built an automated ETL pipeline that extracts India's trending YouTube videos across 5 categories, enriches them with engagement metrics, and loads clean records into PostgreSQL.

Collected 383 unique videos across 35 successful pipeline runs, then analysed category performance, channel-size effects, title patterns, video duration, and engagement behaviour.

⎇ Repository

Bioinformatics

Biopython Genome Analysis

Python · Biopython · BioPandas

COVID-19 sequence analysis pipeline built with Biopython for biological pattern exploration and comparative genomics.

⎇ Repository

Bioinformatics

Biopython Sequence Alignment

Python · Biopython · Needleman–Wunsch

Sequence alignment experiments comparing biological sequences with reproducible, peer-reviewable methods.

⎇ Repository
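
As a compact illustration of the Needleman–Wunsch algorithm listed above, the following plain-Python sketch fills the dynamic-programming score matrix and traces back one optimal global alignment. It does not use Biopython and is not the project's code. The scoring values (match +1, mismatch -1, gap -1) and the example sequences are assumptions chosen for simplicity.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for a global alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # First row and column: aligning a prefix against nothing costs gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def substitution(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the matrix: best of diagonal (match/mismatch), up, or left (gap).
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + substitution(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one alignment.
    aligned_a, aligned_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + substitution(i, j):
            aligned_a.append(a[i - 1])
            aligned_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(a[i - 1])
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")
            aligned_b.append(b[j - 1])
            j -= 1

    return score[-1][-1], "".join(reversed(aligned_a)), "".join(reversed(aligned_b))


result = needleman_wunsch("ACGT", "AGT")
```

For the two short sequences above, the alignment pairs `ACGT` with `A-GT`: three matches and one gap, for a score of 2 under the assumed scoring scheme.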

Tableau

COVID-19 Tableau Dashboard

Tableau Public · KPI Design · Filters · Storytelling

Built an interactive Tableau dashboard with clear KPI views, filters, and narrative visuals for stakeholder-friendly reporting.

⎇ Repository

// Live Dashboard

Tableau Public Dashboard

covid19_dashboard.twbx

// About Me

Technical clarity with business impact

I specialise in creating data products that are not only accurate, but useful for real decisions. My work combines machine learning, robust analysis, and communication that stakeholders can act on.

Machine learning workflows and model evaluation
Analytics with SQL, Python, and dashboard storytelling
Bioinformatics exploration with reproducible pipelines

sumant.py

class DataAnalyst:
  name = "Sumant J"
  focus = [
    "Analytics",
    "Machine Learning",
    "Bioinformatics",
  ]
  tools = {
    "query": "SQL",
    "visualise": "Tableau",
    "model": "scikit-learn",
  }
  def value(self):
    return "insight → action"

Explore All Repositories

// Let's Build

Get in Touch

If you're hiring for data, analytics, or AI-focused roles, I'd love to connect and discuss how I can contribute to your team.

📊 Data Analyst 🤖 ML Engineer 🧬 Bioinformatics 📈 BI & Dashboards
