SQL & Data Analysis
Healthcare SQL Projects
Healthcare Claims Analysis
Analyzed large-scale claims data using complex SQL joins and window functions to identify cost trends and reduce variance across providers.
Tech stack: SQL, PostgreSQL, BigQuery, DBeaver
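The window-function approach described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual queries: it runs on an in-memory SQLite database through Python's standard library as a stand-in for PostgreSQL or BigQuery, and the `claims` table, its columns, and the sample rows are all hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL/BigQuery here (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claims (provider_id TEXT, claim_month TEXT, paid_amount REAL)"
)
# Hypothetical sample rows, for illustration only.
conn.executemany(
    "INSERT INTO claims VALUES (?, ?, ?)",
    [
        ("A", "2024-01", 100.0),
        ("A", "2024-02", 150.0),
        ("B", "2024-01", 200.0),
        ("B", "2024-02", 180.0),
    ],
)

# LAG gives the month-over-month change per provider; AVG over the
# provider partition gives a per-provider baseline to compare against.
query = """
SELECT
    provider_id,
    claim_month,
    paid_amount,
    paid_amount - LAG(paid_amount) OVER (
        PARTITION BY provider_id ORDER BY claim_month
    ) AS change_vs_prior_month,
    AVG(paid_amount) OVER (PARTITION BY provider_id) AS provider_avg
FROM claims
ORDER BY provider_id, claim_month
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The first month for each provider has no prior row, so its change column is NULL (`None` in Python).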
View on GitHub
Miscellaneous SQL Projects
Life Expectancy Analysis
Built SQL queries to clean and analyze life expectancy data from around the world.
Tech stack: MySQL, Power BI
Data Engineering
FHIR to BigQuery Pipeline
Built a GCP-based ETL pipeline using Python to transform and load FHIR JSON data into BigQuery for analysis.
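The transform step of a pipeline like this might look roughly like the sketch below. It assumes the input is a FHIR Observation resource and that rows are staged as newline-delimited JSON, a format BigQuery load jobs accept. The function names `flatten_observation` and `to_ndjson`, the output column names, and the sample resource are hypothetical, and the load into BigQuery itself is omitted.

```python
import json


def flatten_observation(resource):
    """Flatten one FHIR Observation resource into a flat, table-friendly row."""
    # Take the first coding entry, if any; FHIR allows several per code.
    coding = (resource.get("code", {}).get("coding") or [{}])[0]
    quantity = resource.get("valueQuantity", {})
    subject_ref = resource.get("subject", {}).get("reference", "")
    return {
        "observation_id": resource.get("id"),
        # "Patient/123" -> "123"; a missing reference becomes None.
        "patient_id": subject_ref.split("/")[-1] or None,
        "code": coding.get("code"),
        "display": coding.get("display"),
        "value": quantity.get("value"),
        "unit": quantity.get("unit"),
        "effective_datetime": resource.get("effectiveDateTime"),
    }


def to_ndjson(rows):
    """Serialize rows as newline-delimited JSON, one object per line."""
    return "\n".join(json.dumps(row) for row in rows)


# Hypothetical sample resource, for illustration only.
sample_observation = {
    "resourceType": "Observation",
    "id": "obs-1",
    "subject": {"reference": "Patient/123"},
    "code": {
        "coding": [
            {
                "system": "http://loinc.org",
                "code": "2160-0",
                "display": "Creatinine [Mass/volume] in Serum or Plasma",
            }
        ]
    },
    "valueQuantity": {"value": 1.4, "unit": "mg/dL"},
    "effectiveDateTime": "2024-01-15T08:30:00Z",
}

row = flatten_observation(sample_observation)
ndjson = to_ndjson([row])
print(ndjson)
```

Flattening nested fields up front keeps the BigQuery schema simple to query, at the cost of dropping any additional codings or components the resource may carry.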
View on GitHub
Data Science
CKD Prediction Pipeline — Clinical Text + Labs (NLP, TF-IDF)
Machine learning pipeline that combines TF-IDF features from clinical notes with numeric lab values (eGFR, creatinine) to predict CKD.
- Text preprocessing + TF-IDF vectorization for clinical notes.
- Numeric feature scaling for lab values using StandardScaler.
- ColumnTransformer + Pipeline architecture for preprocessing + modeling.
- RandomForestClassifier with class_weight='balanced' and model persistence via joblib.
Tech stack: Python, Pandas, Scikit-learn, NLP, TF-IDF, Random Forest
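The architecture listed above can be sketched along these lines. This is a minimal illustration under stated assumptions rather than the project's actual code: the column names (`note_text`, `egfr`, `creatinine`, `ckd`), the `build_ckd_pipeline` helper, the hyperparameters, and the toy rows are all hypothetical.

```python
import os
import tempfile

import joblib
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical column names; the real dataset's schema is not shown here.
TEXT_COL = "note_text"
LAB_COLS = ["egfr", "creatinine"]


def build_ckd_pipeline(random_state=0):
    """Combine TF-IDF text features and scaled lab values in one Pipeline."""
    preprocess = ColumnTransformer(
        transformers=[
            # TfidfVectorizer needs a 1-D column of strings, so the column
            # is selected by name (a string), not by a list of names.
            ("notes", TfidfVectorizer(stop_words="english"), TEXT_COL),
            ("labs", StandardScaler(), LAB_COLS),
        ]
    )
    model = RandomForestClassifier(
        n_estimators=100, class_weight="balanced", random_state=random_state
    )
    return Pipeline(steps=[("preprocess", preprocess), ("model", model)])


# Toy rows invented for illustration; not real patient data.
toy = pd.DataFrame(
    {
        "note_text": [
            "patient reports fatigue and ankle swelling",
            "routine visit, no complaints",
            "reduced kidney function noted, proteinuria present",
            "annual physical, normal exam",
            "persistent edema and elevated blood pressure",
            "healthy adult, follow up in one year",
        ],
        "egfr": [42.0, 95.0, 35.0, 102.0, 48.0, 98.0],
        "creatinine": [1.8, 0.9, 2.1, 0.8, 1.6, 0.9],
        "ckd": [1, 0, 1, 0, 1, 0],
    }
)

features = toy[[TEXT_COL] + LAB_COLS]
pipeline = build_ckd_pipeline()
pipeline.fit(features, toy["ckd"])
predictions = pipeline.predict(features)

# Persist the whole fitted pipeline so preprocessing travels with the model.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "ckd_pipeline.joblib")
    joblib.dump(pipeline, model_path)
    reloaded = joblib.load(model_path)

reloaded_predictions = reloaded.predict(features)
```

Saving the full Pipeline, rather than only the classifier, means the fitted TF-IDF vocabulary and scaler statistics are reapplied identically at prediction time.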
Power BI & Visualization
Healthcare Claims Analysis Dashboard
Power BI Dashboard
- An interactive Power BI dashboard connected to Google BigQuery that analyzes healthcare claim values and encounter types across multiple facilities. Developed using DAX measures for dynamic filtering and real-time insights.
World Life Expectancy Dashboard
Power BI Dashboard
- An interactive Power BI dashboard connected to MySQL that analyzes life expectancy from 2007 to 2022. Developed using DAX measures for dynamic filtering and real-time insights.
Tech stack: MySQL, Power BI