
© 2025 ExtraTech Bootcamps. All rights reserved.


Nechama K.

GitHub

Bio

Analytical Software Developer delivering efficient, high-quality solutions. Skilled in quick learning, independent problem-solving, and optimizing high-performance systems. A reliable team player with strong communication skills and the ability to adapt quickly to dynamic development environments.

Skills

Python
C++
CUDA
PyTorch
vLLM
LMCache
KVRocks

Bootcamp Project

Hyperconverged KV-Cache Offloading for Cost-Efficient LLM Inference

Datacenter-scale LLM inference framework that offloads KV-cache to a hyperconverged KV-store, increasing capacity and robustness while keeping high hit-rates and good user experience.

Pliops

Mentored by: Pliops

Data Science Bootcamp 2025 (Data)

Responsibilities:

  • Research: conducted a comparative study of KV-cache offloading, benchmarking vLLM with KVRocks (SSD) against DRAM to quantify cost-performance trade-offs

  • Provisioned a high-throughput RAID0 array across dual 15TB NVMe SSDs to accelerate KVRocks I/O operations

  • Deployed the open-source vLLM framework on NVIDIA L4 GPUs to establish a baseline environment for Llama-3-8B inference

  • Engineered the integration of vLLM with LMCache and KVRocks, enabling chunk-based redirection of KV-cache blocks from GPU memory to local storage

  • Calculated precise concurrency limits by analyzing HBM saturation points relative to token size and KV-cache footprint

  • Optimized RocksDB parameters (compression, write buffers) to minimize storage latency under high concurrent user loads

  • Enhanced the Python benchmarking script by implementing timeout exception handling to filter data skew and ensure accurate TPS metrics

  • Synthesized benchmark data into comparative DRAM vs. SSD graphs plotting RPS/GPU vs. TPS/USER, illustrating the cost-performance trade-offs
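The concurrency-limit calculation mentioned above can be illustrated with a rough sketch. It is not the project's actual method: the helper names and the 6 GiB KV-cache budget are hypothetical, and the Llama-3-8B shape (32 layers, 8 KV heads, head dimension 128, FP16) is taken from the model's published configuration rather than from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelShape:
    """Attention dimensions that determine KV-cache size (illustrative helper)."""
    num_layers: int
    num_kv_heads: int
    head_dim: int
    bytes_per_element: int  # 2 for FP16/BF16


def kv_bytes_per_token(shape: ModelShape) -> int:
    # One K and one V vector per layer and per KV head, hence the factor of 2.
    return (2 * shape.num_layers * shape.num_kv_heads
            * shape.head_dim * shape.bytes_per_element)


def max_concurrent_sequences(kv_budget_bytes: int,
                             tokens_per_sequence: int,
                             shape: ModelShape) -> int:
    # How many full-length sequences fit in the memory left over for KV-cache.
    per_sequence = tokens_per_sequence * kv_bytes_per_token(shape)
    return kv_budget_bytes // per_sequence


# Assumed shape for Llama-3-8B with grouped-query attention, FP16 cache.
LLAMA3_8B = ModelShape(num_layers=32, num_kv_heads=8,
                       head_dim=128, bytes_per_element=2)

if __name__ == "__main__":
    budget = 6 * 1024**3  # hypothetical: 6 GiB of GPU memory left for KV-cache
    print(kv_bytes_per_token(LLAMA3_8B))                      # bytes per token
    print(max_concurrent_sequences(budget, 4096, LLAMA3_8B))  # sequences that fit
```

Under these assumptions each token costs 128 KiB of cache, so a 4096-token context takes 512 MiB per user; once the budget is exhausted, further sequences must be evicted, recomputed, or offloaded.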
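The timeout handling in the benchmarking script might look roughly like the following. This is a minimal sketch, assuming an asyncio-based client; `timed_request`, `tokens_per_second`, and the `send` callable are hypothetical names, not the project's code.

```python
import asyncio
import time


async def timed_request(send, timeout_s: float):
    """Run one request; return (tokens, seconds), or None if it timed out.

    `send` is a hypothetical zero-argument coroutine function that returns
    the number of tokens generated for one request.
    """
    start = time.perf_counter()
    try:
        tokens = await asyncio.wait_for(send(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return None  # mark as dropped instead of recording a skewed latency
    return tokens, time.perf_counter() - start


def tokens_per_second(samples):
    """Aggregate TPS over completed requests; also report how many were dropped."""
    completed = [s for s in samples if s is not None]
    dropped = len(samples) - len(completed)
    total_tokens = sum(tokens for tokens, _ in completed)
    total_time = sum(seconds for _, seconds in completed)
    if total_time <= 0:
        return 0.0, dropped
    return total_tokens / total_time, dropped
```

Reporting the dropped count alongside TPS keeps timed-out requests visible rather than silently discarding them.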


Additional Projects

2025 | Consultancy Workflow & Scheduling Platform (Node.js & React) | GitHub


Engineered a full-stack system managing complex, state-driven appointment workflows across multiple user roles (Client, Consultant, Admin).


• Implemented a Role-Based Access Control (RBAC) architecture for three user levels, supporting a state-driven workflow for appointment approval/rejection and advanced multi-criteria filtering.

• Built a secure REST API in Node.js (JWT, Bcrypt) with a normalized SQL architecture (Sequelize, 1:N, N:M), complemented by an SPA in React/Redux for global state management.


2024 | Wit – Version Control System | Python | GitHub


Developed a CLI tool for version control (VCS) using Click, implementing OOP architecture for file history tracking.

English Level

Working Proficiency