Projects

Machine Learning Projects

Selected machine learning and systems projects focused on model training, performance optimization, data pipelines, and experimental evaluation.

AlexNet Reproduction on ImageNet with Performance Optimization

Reduced full ImageNet training time from ~1 month to <3 days by redesigning the GPU data pipeline.

Problem

Reproduce the original AlexNet architecture on the ImageNet dataset and make full training computationally feasible on modern hardware, addressing data-loading, preprocessing, and runtime performance bottlenecks.

What I Did

  • Implemented the AlexNet architecture from the original 2012 paper
  • Trained the model on the full ImageNet dataset
  • Built an efficient ETL and preprocessing pipeline to reduce data-loading overhead
  • Profiled the training pipeline line-by-line to identify performance bottlenecks in data access and computation
  • Optimized data pipelines and training loops to drastically reduce runtime (see the pipeline sketch after this list)
  • Ran training inside a GPU-enabled container (Podman), resolving driver and CUDA compatibility issues
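
Much of the speedup came from keeping the GPU fed rather than from the model itself. As a rough illustration, here is a minimal PyTorch sketch of the kind of loader settings that hide data-loading latency; the dataset is a stand-in and the numbers are hypothetical, not the exact configuration from the repository:

    import torch
    from torch.utils.data import DataLoader
    from torchvision import datasets, transforms

    # Stand-in dataset so the sketch runs anywhere; the real runs used ImageNet.
    train_set = datasets.FakeData(size=1024, image_size=(3, 224, 224),
                                  transform=transforms.ToTensor())

    loader = DataLoader(
        train_set,
        batch_size=256,
        shuffle=True,
        num_workers=8,            # decode and augment on CPU in parallel
        pin_memory=True,          # page-locked buffers speed up host-to-device copies
        persistent_workers=True,  # avoid respawning workers every epoch
        prefetch_factor=4,        # batches queued ahead per worker
    )

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    for images, labels in loader:
        # non_blocking=True overlaps the copy with compute when memory is pinned
        images = images.to(device, non_blocking=True)
        labels = labels.to(device, non_blocking=True)
        break  # the actual training step goes here

With settings like these, CPU workers decode and augment upcoming batches while the GPU trains on the current one, which is the general mechanism behind the speedups described below.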

Key Engineering Challenges

  • Initial end-to-end training time was estimated at ~1 month
  • Through profiling and optimization, reduced training time to <3 days (closer to ~12-24 hours on final runs)
  • Demonstrated that data handling and system configuration, not model complexity, were the primary bottlenecks

What I Learned

  • End-to-end performance in deep learning systems is often dominated by data pipelines
  • Profiling is essential before scaling training workloads
  • Reproducing classic papers on modern hardware requires careful system-level tuning
  • Containerized GPU workflows introduce non-trivial driver and versioning challenges

Source code

Multi-Agent Frozen Lake with Partial Observability and Communication

Focused on environment design, observability, and coordination failure modes in cooperative RL systems.

Problem

Investigate coordination challenges in cooperative multi-agent reinforcement learning by extending the classical Frozen Lake environment with partial observability and discrete inter-agent communication.

What We Did

  • Designed a multi-agent Frozen Lake environment with:
    • Partial observability (local 5×5 views; see the sketch after this list)
    • Discrete communication channels
    • Optional stochastic (“slippery”) dynamics
  • Implemented training using PPO with parameter sharing
  • Benchmarked performance across 7 environment configurations over 10,000 training iterations
  • Evaluated the impact of:
    • Full vs partial observability
    • LSTM-based memory
    • Discrete communication
    • Deterministic vs stochastic transitions
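
To make the observability setting concrete, below is a minimal NumPy sketch of cutting an egocentric 5×5 window out of the grid. The local_view helper and cell encoding are hypothetical; the actual observation construction lives in the repo:

    import numpy as np

    def local_view(grid: np.ndarray, pos: tuple[int, int], k: int = 2) -> np.ndarray:
        """Return the (2k+1) x (2k+1) egocentric window around an agent.

        Cells outside the map are padded with -1 so agents can sense the border.
        """
        padded = np.pad(grid, k, constant_values=-1)
        r, c = pos[0] + k, pos[1] + k
        return padded[r - k:r + k + 1, c - k:c + k + 1]

    # 8x8 Frozen Lake layout: 0 = frozen, 1 = hole, 2 = goal
    grid = np.zeros((8, 8), dtype=np.int8)
    grid[3, 5] = 1
    grid[7, 7] = 2
    print(local_view(grid, (0, 0)))  # a corner agent sees padded out-of-map cells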

Key Findings

  • Agents with restricted local observability achieved mean rewards of ~60
  • Agents with full state observability failed to learn (mean reward ≈ -0.6)
  • LSTM memory provided minimal benefit in deterministic settings
  • Discrete communication degraded performance under partial observability
  • Stochastic (“slippery”) dynamics prevented learning in all configurations

What I Learned

  • Full observability does not necessarily facilitate coordination in MARL
  • Partial observability can implicitly structure agent behavior and reduce coordination complexity
  • PPO with parameter sharing struggles in environments with communication and stochasticity
  • Environment design choices can dominate algorithmic improvements in cooperative RL

Impact

Established a challenging MARL benchmark exposing limitations of standard cooperative algorithms.

Source code

Bionomicon: Enzyme Classification from Amino Acid Sequences

Processed 211M amino acids and trained a Transformer-based classifier achieving F1 0.66, comparable to classical baselines.

Problem

Manual genomic annotation is labor-intensive and difficult to scale due to growing sequencing data, species diversity, and inconsistent annotation practices. This project investigates whether enzyme vs non-enzyme classification can be automated using only amino acid sequences.

What We Did

  • Compiled a large-scale dataset of enzyme and non-enzyme amino acid sequences from UniProt (211 million amino acids in total)
  • Processed and transformed raw biological data (FASTA/XML) into model-ready formats
  • Designed a Transformer encoder–based classifier, treating each amino acid as a token (see the tokenization sketch after this list)
  • Adapted model size and sequence handling based on compute budget and token length constraints
  • Trained a binary classifier without relying on external biological annotations or handcrafted features
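
As a rough sketch of the per-residue tokenization (the vocabulary indices and maximum length below are assumptions, not the exact values we used):

    # The 20 standard amino acids, plus padding and unknown tokens.
    AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
    PAD, UNK = 0, 1
    VOCAB = {aa: i + 2 for i, aa in enumerate(AMINO_ACIDS)}

    def encode(seq: str, max_len: int = 512) -> list[int]:
        """Map an amino acid sequence to token ids, truncating and padding to max_len."""
        ids = [VOCAB.get(aa, UNK) for aa in seq[:max_len]]
        return ids + [PAD] * (max_len - len(ids))

    print(encode("MKTAYIAKQR")[:12])  # first 12 token ids of a short sequence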

Tools & Technologies

  • Transformer encoder architecture
  • Large-scale data preprocessing pipelines (Rust for efficient data handling)
  • Deep learning frameworks for sequence modeling
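
For illustration, a minimal PyTorch sketch of an encoder classifier of the kind described above; the hyperparameters are placeholders, not the trained configuration:

    import torch
    import torch.nn as nn

    class EnzymeClassifier(nn.Module):
        """Token-per-residue Transformer encoder with a binary head (sketch)."""

        def __init__(self, vocab_size=22, d_model=128, nhead=4,
                     num_layers=2, max_len=512):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, d_model, padding_idx=0)
            self.pos = nn.Embedding(max_len, d_model)
            layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers)
            self.head = nn.Linear(d_model, 1)  # one logit: enzyme vs non-enzyme

        def forward(self, tokens):
            mask = tokens.ne(0)  # True where a real residue sits
            pos = torch.arange(tokens.size(1), device=tokens.device)
            x = self.embed(tokens) + self.pos(pos)
            x = self.encoder(x, src_key_padding_mask=~mask)
            # Masked mean-pool over residues, then classify the whole sequence
            x = (x * mask.unsqueeze(-1)).sum(1) / mask.sum(1, keepdim=True).clamp(min=1)
            return self.head(x).squeeze(-1)

    logits = EnzymeClassifier()(torch.randint(2, 22, (4, 512)))
    print(logits.shape)  # torch.Size([4])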

Results

The Transformer-based model achieved an F1 score of 0.6611, comparable to classical baselines such as a linear SGD classifier (0.6855), Gaussian Naive Bayes (0.6920), and Multinomial Naive Bayes (0.6809).

These results indicate that, for this dataset and problem formulation, classical methods remain strong baselines, while Transformers provide competitive performance with greater modeling flexibility at higher computational cost.

What I Learned

  • Transformers can effectively model long biological sequences as tokenized data
  • Dataset construction and preprocessing dominate project complexity in applied bioinformatics
  • Compute and token-length constraints strongly influence architectural choices in sequence models
  • End-to-end learning from raw sequences can simplify parts of the genomic annotation pipeline

Context: Academic course project in Machine Learning / Bioinformatics

More information and source code

LoRA Fine-Tuning of Diffusion Models for Kurzgesagt-Style Illustration

Focused on adapting large-scale generative models under resource constraints.

Problem

Can parameter-efficient fine-tuning (LoRA) adapt large text-to-image diffusion models to reproduce a highly constrained, vector-like illustration style with strong geometric consistency?

What We Did

  • Curated a dataset of Kurzgesagt video frames
  • Generated captions using LLaVA-OneVision
  • Fine-tuned LoRA adapters on FLUX.1-dev (see the minimal LoRA sketch after this list)
  • Evaluated results qualitatively and via style consistency indicators
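
For context, the core LoRA idea is to freeze the pretrained weight matrix and learn only a low-rank additive update, so a tiny fraction of parameters trains and adapter checkpoints stay small. A hand-rolled PyTorch sketch of that idea (illustrative only; in practice we used standard LoRA training tooling):

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        """Wrap a frozen linear layer with a trainable low-rank update B @ A."""

        def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False  # pretrained weights stay frozen
            self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, rank))
            self.scale = alpha / rank

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # y = Wx + scale * B(Ax); only A and B receive gradients
            return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

    layer = LoRALinear(nn.Linear(64, 64))
    print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # 1024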

Tools & Models

  • Stable Diffusion 1.5 (baseline only), FLUX.1-dev
  • LoRA (parameter-efficient fine-tuning)
  • LLaVA-OneVision for captioning
  • Custom dataset of video frame illustrations

What I Learned

  • LoRA improves color consistency and high-level style cues
  • LoRA alone struggles with tightly constrained illustration elements such as geometric precision and flat shading
  • Dataset quality and caption specificity are critical for stylistic control

Results

  • Improved stylistic resemblance compared to base models
  • Persistent artifacts in geometry and shading
  • Demonstrated tradeoffs between efficiency and fidelity

Source code

Foundations (Search, Optimization, Experimentation)

Focused on classical algorithms, experimentation, and evaluation of optimization trade-offs.

  • 8-Puzzle / 15-Puzzle Solver (C++)
    Implemented and evaluated classical search algorithms (BFS, A*, IDA*, GBFS) using the Manhattan heuristic, analyzing trade-offs between optimality, memory usage, and computational cost (a Python sketch of the heuristic appears at the end of this section).

    Source code

  • MNIST - Experimented with MLflow, Ray, and Lightning for training and experiment tracking (Source code)

  • CIFAR-10 - Hyperparameter optimization with Optuna, building on the MNIST pipelines (Source code)
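
For reference, here is a Python sketch of the Manhattan heuristic used by the puzzle solver (the repository's implementation is in C++):

    def manhattan(state: tuple[int, ...], n: int = 4) -> int:
        """Sum of row and column offsets of each tile from its goal position.

        state is a flattened n*n board in row-major order; 0 is the blank
        and is not counted. Goal: tiles 1..n*n-1 in order, blank last.
        """
        total = 0
        for idx, tile in enumerate(state):
            if tile == 0:
                continue
            goal = tile - 1
            total += abs(idx // n - goal // n) + abs(idx % n - goal % n)
        return total

    solved = tuple(list(range(1, 16)) + [0])
    print(manhattan(solved))  # 0 for the solved 15-puzzle

The heuristic never overestimates the true solution length, which is the admissibility property that lets A* and IDA* return optimal solutions.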