Prefill on jonam'Log

Recent content in Prefill on jonam'Log
https://www.jonam.io/tags/prefill/
© 2026 Manoj. All Rights Reserved.
Mon, 18 May 2026 00:00:00 +0000

PrefillX
https://www.jonam.io/journal/inference-engineering/product-ideas/prefillx/
Mon, 18 May 2026 00:00:00 +0000
Cut TTFT for long-context document applications by precomputing and repairing reusable KV states.

Speculative Prefill
https://www.jonam.io/journal/inference-engineering/research-topics/speculative-prefill/
Mon, 18 May 2026 00:00:00 +0000
Speculative decoding is common; this asks whether speculation can reduce long-prompt prefill latency.