
© 2025 ExtraTech Bootcamps. All rights reserved.

Yocheved K.

GitHub

Bio

Software Engineer specializing in systems and performance tuning. Analytical, proactive, and collaborative, committed to continuous learning and growth.

Skills

Python
Docker
Redis
CUDA
vLLM
NVIDIA Dynamo
LMCache
KVRocks

Bootcamp Project

Hyperconverged KV-Cache Offloading for Cost-Efficient LLM Inference

Datacenter-scale LLM inference framework that offloads KV-cache to a hyperconverged KV-store, increasing capacity and robustness while keeping high hit-rates and good user experience.

Mentored by: Pliops

Data Science Bootcamp 2025 (Data)

Responsibilities:

  • vLLM Research & System Architecture Analysis - Conducted in-depth research into vLLM internals and execution flow, as well as a detailed study of Dynamo’s architecture to fully understand system behavior, request flow, and end-to-end pipeline dynamics.

  • Hyper-Converged KV-Cache Offloading - Specialized in vLLM infrastructure and memory optimization within a hyper-converged design, with hands-on experience using Dynamo and LMCache for efficient KV-Cache management and KVRocks for key-value storage. Deployed and ran Dynamo in containerized environments using Docker, applying system-level optimization techniques to improve cache locality and inference performance.

  • High-Performance Profiling & Benchmarking - Performed deep profiling and performance analysis on high-end GPU servers, benchmarking three inference pipelines, identifying critical bottlenecks, and achieving up to a 2× improvement in TPS and RPS per user when comparing LMCache-accelerated runs (storing KV-Cache in DRAM) against vanilla vLLM runs.

  • Distributed Deployment & Scalability Validation - Deployed Dynamo + LMCache across a distributed multi-node environment to validate scalability, maintain throughput, and ensure consistent performance under distributed load.

  • Hyper-Converged Architecture Optimization - Removed and optimized aware-routing mechanisms to enable a fully hyper-converged architecture, significantly reducing system network load and simplifying traffic patterns.

Additional Projects

Flight Booking & Travel Management Platform

  • Developed a full-stack enterprise-grade travel management platform integrating React/Redux/Material-UI frontend with a .NET MVC Web API backend.

  • Engineered robust backend services supporting authentication, authorization, and intuitive role-based access management for seamless user experience.

  • Implemented features for booking management, payment processing, and calendar scheduling, streamlining end-to-end travel operations.

  • Implemented MySQL database architecture ensuring data integrity, scalability, and efficient query performance.

  • Delivered a maintainable, modular, and scalable codebase aligning with enterprise software standards.

English Level

Working Proficiency