


Dvora G.

GitHub

Bio

Software Engineering Associate specializing in AI, with strong logical reasoning, advanced analytical skills, and fast, creative thinking. Skilled at handling complex tasks; proactive, responsible, and an excellent team player. Self-taught and quick to adapt to new technologies, consistently meeting deadlines while striving for excellence.

Skills

Python
C++
Docker
Redis
FastAPI
CUDA
vLLM
KVrocks
RocksDB
GPU Computing
HBM
KV-Cache Offloading
REST APIs
Linux
Git
Multi-Threading
Profiling
Benchmarking
DRAM/SSD Performance Analysis

Bootcamp Project

Hyperconverged KV-Cache Offloading for Cost-Efficient LLM Inference

Datacenter-scale LLM inference framework that offloads KV-cache to a hyperconverged KV-store, increasing capacity and robustness while keeping high hit-rates and good user experience.


Mentored by: Pliops

Data Science Bootcamp 2025 (Data)

Responsibilities:

  • Research and Familiarization with vLLM: Conducted an in-depth study of the vLLM architecture, execution flow, and inference pipeline. Executed multiple LLM inference runs using vLLM to gain hands-on operational understanding.

  • Research and Evaluation of KVRocks: Investigated the KVRocks storage engine, its design goals, architecture, and usage scenarios. Evaluated its suitability as an external KV-store for inference KV-cache offloading.

  • Baseline vLLM Deployment on NVIDIA L4 GPUs: Deployed the open-source vLLM framework on NVIDIA L4 GPUs to establish a baseline environment. Executed Llama-3-8B inference workloads to measure initial performance characteristics.

  • Concurrency Limit Analysis Based on GPU HBM Saturation: Calculated precise concurrency limits by analyzing HBM saturation points relative to token size and KV-cache footprint. Derived the maximum number of concurrent clients supported per server.

  • Integration Between vLLM and the Pliops Gateway: Integrated vLLM with the Pliops Gateway (connector) for accelerated KV access. Validated the full end-to-end inference flow through the gateway.

  • Implementation of a New Storage Backend for KVRocks in the Gateway: Developed and integrated a new KVRocks storage backend within the Pliops Gateway. Implemented full CRUD support for direct interaction with the KV-store.

  • Single-Threaded Multi-Get / Multi-Set Implementation: Implemented a single-threaded Multi-Get / Multi-Set access pattern to KVRocks. Evaluated correctness, stability, and latency impact of batched operations.

  • Multi-Threaded Get/Set Integration via the Gateway: Implemented a multi-threaded Get/Set access pattern with multiple concurrent clients. Managed connection handling, synchronization, and system stability under load.

  • Performance Benchmarking and Comparative Storage Analysis: Executed performance benchmarks comparing vLLM + Pliops Gateway + KVRocks against vLLM with DRAM-based LM-Cache. Synthesized benchmark data into comparative DRAM vs. SSD graphs, plotting RPS/GPU vs. TPS/user to illustrate cost–performance trade-offs.

  • Bottleneck Analysis and Performance Optimization Insights: Analyzed logs and performance metrics to identify I/O, networking, and memory bottlenecks. Derived initial optimization insights and future improvement directions.
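
The concurrency limit analysis listed above rests on simple arithmetic: how many bytes of KV-cache one token occupies, and how many full sequences fit in the GPU memory left over for the cache. The sketch below illustrates that calculation and is not the project's actual method. The layer count, KV-head count, and head dimension are the published Llama-3-8B configuration; the 4 GiB cache budget and the 2048-token sequence length are hypothetical values chosen only for the example, and the names `ModelShape`, `kv_bytes_per_token`, and `max_concurrent_sequences` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelShape:
    """Attention shape parameters that determine KV-cache size."""
    num_layers: int
    num_kv_heads: int
    head_dim: int
    dtype_bytes: int  # 2 for fp16/bf16


def kv_bytes_per_token(m: ModelShape) -> int:
    # Every layer stores one key vector and one value vector per KV head,
    # hence the leading factor of 2.
    return 2 * m.num_layers * m.num_kv_heads * m.head_dim * m.dtype_bytes


def max_concurrent_sequences(kv_budget_bytes: int,
                             tokens_per_sequence: int,
                             m: ModelShape) -> int:
    # Number of full-length sequences whose KV-cache fits in the budget.
    per_sequence = kv_bytes_per_token(m) * tokens_per_sequence
    return kv_budget_bytes // per_sequence


# Llama-3-8B: 32 layers, 8 KV heads (grouped-query attention), head dim 128.
LLAMA3_8B_FP16 = ModelShape(num_layers=32, num_kv_heads=8,
                            head_dim=128, dtype_bytes=2)

GIB = 1024 ** 3
# ASSUMPTION: 4 GiB left for KV-cache and 2048 tokens per sequence are
# illustrative numbers, not measurements from the project.
print(kv_bytes_per_token(LLAMA3_8B_FP16))                       # 131072 (128 KiB)
print(max_concurrent_sequences(4 * GIB, 2048, LLAMA3_8B_FP16))  # 16
```

In practice the cache budget is whatever remains after model weights, activations, and runtime overhead, so the resulting limit depends on the serving configuration as much as on the model shape.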


Additional Projects

Online Survey Platform | Node.js & React (Deployed on AWS) | https://github.com/d7080120/reactnodeproject
Full-stack development of a survey platform featuring an intuitive admin panel, wide functionality, and a user-friendly interface. Technologies: React (client-side), Node.js (server-side), MongoDB (database).

E-Commerce Website | C# .NET & JavaScript | https://github.com/d7080120/WebApi_BabyProductShop
Developed an e-commerce website for managing online transactions. Backend in .NET with a three-layer architecture, Swagger integration, DTOs, and a SQL database connection. Frontend implemented in JavaScript for customer interaction.

English Level

Fluent