If you're curious about how much KV Cache quantization affects Qwen3.5 27B, take a look at the table below. The model used in all of these benchmarks is Unsloth's Q8_K_XL.

KV Cache BF16 vs F16 vs Q8_0

| KV Cache Type | Mean PPL(Q) | ΔPPL (Q - base) | PPL Ratio | ln Ratio | Mean KLD | Max KLD | RMS Δp (%) | Same Top-p (%) |
|---|---|---|---|---|---|---|---|---|
| BF16 | 6.8653 ± 0.04470 | — | — | — | — | — | — | — |
| F16 | 6.866214 … |
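If the column names are unfamiliar, here is a rough Python sketch of how metrics like these are commonly defined. It is not the benchmark code behind the table. The function names, the toy distributions, and the reading of "RMS Δp" (change in the probability assigned to the correct token) and "Same Top-p" (share of positions where both runs pick the same most likely token) are assumptions made for illustration.

```python
import math

def ppl_metrics(ppl_q, ppl_base):
    """Return (ΔPPL, PPL ratio, ln ratio) for a quantized run vs. the baseline."""
    ratio = ppl_q / ppl_base
    return ppl_q - ppl_base, ratio, math.log(ratio)

def kl_divergence(p_base, p_q):
    """KL(base || quantized) for one token position, in nats."""
    return sum(p * math.log(p / q) for p, q in zip(p_base, p_q) if p > 0)

def argmax(dist):
    """Index of the most likely token in a probability list."""
    return max(range(len(dist)), key=dist.__getitem__)

def compare_runs(base_dists, q_dists, targets):
    """Summarize how far the quantized run drifts from the baseline.

    base_dists / q_dists: one probability list per token position.
    targets: index of the correct next token at each position.
    """
    n = len(targets)
    klds = [kl_divergence(b, q) for b, q in zip(base_dists, q_dists)]
    # Assumed meaning of Δp: change in probability given to the correct token.
    dps = [q[t] - b[t] for b, q, t in zip(base_dists, q_dists, targets)]
    same_top = sum(argmax(b) == argmax(q) for b, q in zip(base_dists, q_dists))
    return {
        "mean_kld": sum(klds) / n,
        "max_kld": max(klds),
        "rms_dp_pct": 100 * math.sqrt(sum(d * d for d in dps) / n),
        "same_top_pct": 100 * same_top / n,
    }

# Toy example with a 3-token vocabulary and two positions (made-up numbers).
base = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
quant = [[0.6, 0.3, 0.1], [0.1, 0.7, 0.2]]
stats = compare_runs(base, quant, targets=[0, 1])
```

A baseline compared against itself gives a KLD of 0, an RMS Δp of 0 and a same-top share of 100%, which is why the BF16 row has nothing to report in those columns.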