Software Engineer specializing in systems and performance tuning. Analytical, proactive, and collaborative, committed to continuous learning and growth.
Datacenter-scale LLM inference framework that offloads KV-cache to a hyperconverged KV-store, increasing capacity and robustness while keeping high hit-rates and good user experience.

Mentored by: Pliops
Data Science Bootcamp 2025 (Data)
Responsibilities:
vLLM Research & System Architecture Analysis - Conducted in-depth research into vLLM internals and execution flow, as well as a detailed study of Dynamo’s architecture to fully understand system behavior, request flow, and end-to-end pipeline dynamics.
Hyper-Converged KV-Cache Offloading - Specialized in vLLM infrastructure and memory optimization within a hyper-converged design, with hands-on experience using Dynamo and LMCache for efficient KV-Cache management and KVRocks for key-value storage. Deployed and ran Dynamo in containerized environments using Docker, applying system-level optimization techniques to improve cache locality and inference performance.
High-Performance Profiling & Benchmarking - Performed deep profiling and performance analysis on high-end GPU servers, benchmarking three inference pipelines, identifying critical bottlenecks, and achieving up to a 2× improvement in TPS and RPS per user when comparing LMCache-accelerated runs (storing KV-Cache in DRAM) against vanilla vLLM runs.
Distributed Deployment & Scalability Validation - Deployed Dynamo + LMCache across a distributed multi-node environment to validate scalability, maintain throughput, and ensure consistent performance under distributed load.
Hyper-Converged Architecture Optimization - Removed and optimized aware-routing mechanisms to enable a fully hyper-converged architecture, significantly reducing system network load and simplifying traffic patterns.

Flight Booking & Travel Management Platform
Developed a full-stack, enterprise-grade travel management platform integrating a React/Redux/Material-UI frontend with a .NET MVC Web API backend.
Engineered robust backend services supporting authentication, authorization, and intuitive role-based access management for a seamless user experience.
Implemented features for booking management, payment processing, and calendar scheduling, streamlining end-to-end travel operations.
Implemented a MySQL database architecture ensuring data integrity, scalability, and efficient query performance.
Delivered a maintainable, modular, and scalable codebase aligning with enterprise software standards.
Working Proficiency