jonam’Log
about
posts
journal
files
resume
Inference-Engineering
Product Ideas
18 May, 2026 · 372 words · 2 mins
Inference-Engineering
Products
Startup
Inference
Yc
Inference Engineering
18 May, 2026 · 281 words · 2 mins
Inference-Engineering
Inference
Llm
Systems
Journal
Research Topics
18 May, 2026 · 618 words · 3 mins
Inference-Engineering
Research
Kv-Cache
Inference
Llm-Systems
NeuralEdge
18 May, 2026 · 114 words · 1 min
Inference-Engineering
Startup
Edge-Ai
Robotics
Thermal
Hardware-Aware AI CPU Ideas
18 May, 2026 · 327 words · 2 mins
Inference-Engineering
Hardware
Systolic-Arrays
Compiler
Hbm
Tokens-per-Dollar
Unlearning Layer In Attention
18 May, 2026 · 167 words · 1 min
Inference-Engineering
Unlearning
Attention
Safety
Adapters
SpecDraft Cloud
18 May, 2026 · 110 words · 1 min
Inference-Engineering
Startup
Speculative-Decoding
Eagle
Inference
ConvoCache
18 May, 2026 · 129 words · 1 min
Inference-Engineering
Startup
Memory
Assistant
Kv-Cache
Attention Head Similarity Pruning
18 May, 2026 · 208 words · 1 min
Inference-Engineering
Attention
Pruning
Heads
Inference
SLO-Aware KV Cache Tiering
18 May, 2026 · 218 words · 2 mins
Inference-Engineering
Kv-Cache
Slo
Scheduler
Pagedattention
Serving
DistillAudit
18 May, 2026 · 123 words · 1 min
Inference-Engineering
Startup
Distillation
Audit
Governance
Online EAGLE Draft Learning
18 May, 2026 · 224 words · 2 mins
Inference-Engineering
Speculative-Decoding
Eagle
Online-Learning
Research
HaloscoreAI
18 May, 2026 · 133 words · 1 min
Inference-Engineering
Startup
Hallucination
Compliance
Safety
SLOGuard
18 May, 2026 · 127 words · 1 min
Inference-Engineering
Startup
Slo
Scheduler
Enterprise
Quantization Divergence As Hallucination Signal
18 May, 2026 · 226 words · 2 mins
Inference-Engineering
Quantization
Hallucination
Uncertainty
Safety
Speculative Prefill
18 May, 2026 · 273 words · 2 mins
Inference-Engineering
Prefill
Speculative-Decoding
Kv-Cache
Research
DraftOS
18 May, 2026 · 154 words · 1 min
Inference-Engineering
Startup
Speculative-Decoding
Cpu
Gpu
Roofline-Adaptive Inference Scheduler
18 May, 2026 · 304 words · 2 mins
Inference-Engineering
Roofline
Scheduler
Vllm
Gpu
Research
InferGrid
18 May, 2026 · 160 words · 1 min
Inference-Engineering
Startup
Observability
Roofline
Gpu
Temporal TurboQuant KV Tiering
18 May, 2026 · 350 words · 2 mins
Inference-Engineering
Turboquant
Kv-Cache
Quantization
Long-Context
PrefillX
18 May, 2026 · 184 words · 1 min
Inference-Engineering
Startup
Prefill
Rag
Kv-Cache
Position-Invariant Document KV Cache
18 May, 2026 · 477 words · 3 mins
Inference-Engineering
Kv-Cache
Rag
Rope
Prefix-Caching
Research
DocVault
18 May, 2026 · 289 words · 2 mins
Inference-Engineering
Startup
Rag
Kv-Cache
Docvault