🎓

PhD Research

Thesis

Context-Aware Stochastic Representation Learning (RASTER)

Pioneered segmentation-aware time-series representations and random representation learning methods that capture where discriminative information occurs over time. Introduced temporal localization to random kernel networks, outperforming global pooling in non-stationary, weakly supervised regimes.

This framework enables scalable foundation-style representations for sparse, high-frequency clinical signals through segment-wise random convolution networks that capture granular temporal dynamics.
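As a rough illustration of segment-wise pooling over random convolution responses (a toy sketch only, not the RASTER implementation; the kernel sampling, equal-width segments, and function names below are assumptions made for the example):

```python
import random

def random_kernel(length, rng):
    """Draw a zero-mean random kernel; the weights are never trained."""
    w = [rng.gauss(0.0, 1.0) for _ in range(length)]
    mean = sum(w) / length
    return [x - mean for x in w]

def convolve(series, kernel):
    """Valid-mode 1D cross-correlation of a series with one kernel."""
    k = len(kernel)
    return [sum(series[i + j] * kernel[j] for j in range(k))
            for i in range(len(series) - k + 1)]

def segment_features(series, kernels, n_segments):
    """Max-pool each kernel response inside equal-width segments.

    With n_segments=1 this reduces to ordinary global max pooling.
    """
    feats = []
    for kernel in kernels:
        resp = convolve(series, kernel)
        bounds = [round(s * len(resp) / n_segments) for s in range(n_segments + 1)]
        feats.extend(max(resp[a:b]) for a, b in zip(bounds, bounds[1:]))
    return feats

rng = random.Random(0)
kernels = [random_kernel(5, rng) for _ in range(4)]
# Toy non-stationary signal: an oscillation appears only in the last quarter.
series = [0.0] * 48 + [1.0, -1.0] * 8
global_feats = segment_features(series, kernels, n_segments=1)  # 4 features
local_feats = segment_features(series, kernels, n_segments=4)   # 16 features
```

Global pooling reports only that a kernel fired somewhere in the signal; the segment-wise features also show that the early segments are silent and the activity sits at the end.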

Python, PyTorch, NumPy, scikit-learn

Research

Self-Supervised Feature Learning (CURVE)

Created a contrastive learning framework to optimize high-dimensional random embeddings without labeled data. Used data augmentation strategies and information-theoretic proofs to guarantee feature utility in an unsupervised setting.

The framework enables effective feature selection for time series classification when labeled samples are scarce or expensive to obtain.

Contrastive Learning, PyTorch, Information Theory

Research

Attention-Based Fusion Models (OscilloFusion)

Built a multi-channel fusion architecture that uses modern attention mechanisms to integrate complex respiratory oscillometry data. Achieved state-of-the-art discriminability with ~80% accuracy under limited-data constraints, a 5-percentage-point improvement over conventional deep-learning baselines.

Attention Mechanisms, Multi-Channel Fusion, Healthcare ML

Research

Trustworthy AI for Healthcare

Deployed conformal prediction pipelines to quantify uncertainty and model label noise in clinical applications. Implemented calibrated uncertainty estimates to support risk-aware decision-making, ensuring robustness against distribution shifts in safety-critical medical datasets.
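For readers unfamiliar with conformal prediction, the core calibration step can be sketched as below. This is the textbook split-conformal recipe with a `1 - p(true class)` nonconformity score, assuming exchangeable calibration and test data; it is not necessarily the pipeline used in this work, and the function names and toy numbers are illustrative.

```python
import math

def conformal_threshold(cal_probs, cal_labels, alpha):
    """Finite-sample-corrected quantile of the calibration scores."""
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-indexed rank into sorted scores
    if rank > n:
        return float("inf")  # too few calibration points: keep every class
    return scores[rank - 1]

def prediction_set(probs, threshold):
    """All classes whose score 1 - p does not exceed the threshold."""
    return [k for k, p in enumerate(probs) if 1.0 - p <= threshold]

# Toy held-out calibration set: predicted class probabilities and true labels.
cal_probs = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.4, 0.6],
             [0.5, 0.5], [0.4, 0.6], [0.7, 0.3]]
cal_labels = [0, 1, 0, 1, 0, 0, 1]
threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.25)

confident = prediction_set([0.95, 0.05], threshold)  # single-class set
ambiguous = prediction_set([0.5, 0.5], threshold)    # both classes retained
```

The size of the returned set is the uncertainty signal: a single class for a confident prediction, several classes when the model cannot separate them at the requested coverage level.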

Uncertainty Quantification, Conformal Prediction, Clinical ML

💼

Industry Projects

Healthcare

Clinical ML Pipeline at UHN

Led a 5-year oscillometry–PFT (pulmonary function test) ML pipeline project for pulmonary patients at University Health Network (UHN). Unified SQL, Excel, and file-based hospital data with patient matching and reproducible quality control. Built an interactive clinician tool to explore signals, verify predictions, inspect explanations, and compare ML backbones.

  • Applied robustness strategies: data augmentation and synthetic data generation (GAN/cGAN)
  • Evaluated model performance under noisy labels and data drift
  • Implemented calibrated uncertainty estimates for clinical workflows
Python, PyTorch, SQL, Streamlit

CAD Automation

GNN-Based Annotation System at DraftAid

Built an ML pipeline that converts CAD drawings into structured geometric graphs for model processing. Designed and trained a graph neural network to recommend annotation links, achieving 94% recall and significantly reducing reliance on expert manual annotation.

  • Formulated drafting as conditional autoregressive graph inference
  • Applied state-dependent sequential annotation selection
  • Integrated human-in-the-loop workflow for efficient model verification
PyTorch Geometric, GNN, Python, CAD

Data Science

Recommendation Engine at SnappFood

Engineered a hybrid recommendation engine fusing latent matrix factorization with real-time behavioral features to personalize user feeds. Devised spatio-temporal predictive models for delivery time estimation (ETA), integrating geographic and traffic signals for logistics optimization.

  • Designed and analyzed A/B tests for CTR, conversion, and retention
  • Developed scalable Hadoop/Spark analytics and dashboards
  • Delivered city-, vendor-, and item-level insights using Tableau and Power BI
PySpark, ML, Hadoop, Tableau, A/B Testing

Backend Engineering

IoT Platform at Atrovan

Architected a multi-tenant IoT backend for high-throughput ingestion, telemetry analytics, and access control. Built event-driven services in Golang with Kafka for async processing, fault tolerance, and back-pressure handling.

  • Implemented REST APIs with MySQL/MongoDB and Redis caching
  • Built end-to-end fleet management: GPS telemetry, ETA computation, and geospatial indexing
  • Led backend and React teams; defined technical roadmap
Golang, Kafka, MySQL, Redis, React

🚀

Side Projects

LLM Application

Job Application Tracker & Resume Tailoring System

Designed and deployed a multi-user job tracker with Streamlit, SQLite, and AWS EC2, featuring secure authentication and per-user application management. Built a LangGraph LLM pipeline to extract structured job data from URLs or pasted descriptions with validated JSON outputs.

  • Guardrailed LaTeX resume tailoring engine enforcing 2-page limits
  • RAG-based customization for role-specific resumes and cover letters
  • Interview preparation material generation
LangGraph, OpenAI SDK, Streamlit, AWS EC2, RAG

📚

Academic Projects

M.Sc. Thesis

Human Activity Recognition on Distributed Infrastructure

Built deep learning pipelines for smartphone sensor time-series classification. Scaled training and inference using distributed processing on Apache Spark, emphasizing efficient data handling and reproducible experimentation.

Deep Learning, Apache Spark, HAR

B.Sc. Capstone

Real-time Logo Detection & Inpainting

Designed GPU-accelerated algorithms in C/C++ using CUDA to detect static logos in broadcast sports video and perform real-time inpainting to remove visual overlays. Achieved real-time performance on HD video streams.

C/C++, CUDA, Computer Vision

B.Sc. Capstone

IoT Gateway for Zigbee Devices

Designed and implemented an edge IoT gateway in Python on Raspberry Pi to collect data from Zigbee-based devices, perform local aggregation and preprocessing, and forward structured telemetry to a backend platform.

Python, Raspberry Pi, Zigbee, IoT