Hojoon Leo Kim

hojoon.kim@snu.ac.kr

37, Samseong-ro 51-gil

Gangnam-gu, Seoul 06280

Republic of Korea

Hello, my name is Hojoon Kim, and I also go by Leo.

As an undergraduate researcher passionate about advancing ML systems through hardware-software co-design, I explore how traditional computer architecture principles such as caching, branch prediction, and multiprocessing can address inefficiencies in modern ML workloads. My work spans low-bit quantization, storage-assisted inference, and cache-driven planning for embodied AI agents, with publications at OSDI'25, ICML'25 (Spotlight), and MLSys'26. I aim to develop practical system architectures that make next-generation ML applications both efficient and deployable at scale by rethinking system abstractions across the entire computing stack.

When I’m not deep in code, you can probably find me on the tennis court! 🎾

News

Jan, 2026	Our work `AgenticCache`, a cache-driven asynchronous planning system for embodied AI agents, has been accepted to `MLSys'26`! Thank you, Thierry and Yuheng!
Nov, 2025	Our team won the `Grand Prize (1st Place)` at the `2025 AI Chip Contest - NPU Optimization Track` hosted by the Ministry of Science and ICT (MSIT), Republic of Korea, receiving a prize of KRW 10,000,000.
May, 2025	Our work `DecDEC` has been accepted to `OSDI'25`!
May, 2025	Excited to share that our `ICML'25` submission, `FlashTP`, is under consideration for a Spotlight or Oral presentation. It has also already been deployed in real-world industrial applications.

Selected Publications

* indicates equal contribution

EuroSys’27 (In Submission)

HELIOS: Heterogeneous Lightweight VLA Model Serving System

Jongheon Jeong*, Hojoon Kim*, Rokhee Lee, Yeonhong Park, Young H. Oh, and Jae W. Lee

2027

In submission

ATC’26 (In Submission)

QUESO: Storage-Assisted Quantization Error Compensation for On-Device LLM Inference

Seong Hoon Seo, Donghyun Lee, Geonha Lee, Hojoon Kim, Yeonhong Park, and Jae W. Lee

2026

In submission

MLSys’26

AgenticCache: Cache-Driven Asynchronous Planning for Embodied AI Agents

Hojoon Kim, Yuheng Wu, and Thierry Tambe

2026

Acceptance Rate: 133/504 = 26.4%

PDF Code Poster Slides

ICML’25 Spotlight

FlashTP: Fused, Sparsity-Aware Tensor Product for Machine Learning Interatomic Potentials

Seung Yul Lee, Hojoon Kim, Yutack Park, Dawoon Jeong, Seungwu Han, Yeonhong Park, and Jae W. Lee

2025

Spotlight (313/12,107, 2.6%)

PDF Code

OSDI’25

DecDEC: A Systems Approach to Advancing Low-Bit LLM Quantization

Yeonhong Park*, Jake Hyun*, Hojoon Kim, and Jae W. Lee

2025

Acceptance Rate: 52/327 = 15.9%

PDF Code