Systems Engineer
I work on AI inference infrastructure at Google. Most recently, I led efficiency optimization for Gemini LLM inference, reclaiming over 35% of TPU fleet capacity, and shipped Context Caching on Vertex AI and Google AI Studio from preview to GA.
Previously, I co-founded ghOSt, a userspace scheduling framework deployed in Google Search (SOSP 2021). My earlier work on Linux memory management solved decade-old kernel problems, including fragmentation-resistant page pinning and reducing cache reaper latency from 10 ms to 50 µs.
Before Google: Qualcomm Innovation Center (ARM64 server enablement, upstream Linux kernel contributions). MS in Computer Science, Penn State.
ghOSt: kernel and userspace components for delegating Linux scheduling decisions to userspace, enabling rapid iteration on scheduling policies without kernel recompilation.