Releases · ggml-org/llama.cpp
b5252
build : fix build info on windows (#13239)
* build : fix build info on windows
* fix cuda host compiler msg
b5250
llama-chat : update GLM4 chat template (#13238)
* update GLM4 chat template
* Update chat template
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
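The GLM4 entry above touches llama.cpp's chat-template machinery, which is exposed through `llama_chat_apply_template`. Below is a minimal sketch of formatting a two-message conversation with it; the exact signature has changed across releases (older builds also took a `llama_model *` as the first argument) and the `"chatglm4"` template name is an assumption, so verify both against the `llama.h` of your build:

```c
// Minimal sketch of applying a named chat template via llama.cpp's public API.
// Assumptions: signature as in recent llama.h revisions; "chatglm4" as a
// recognized template name (a full template string can also be passed).
#include <stdio.h>
#include "llama.h"

int main(void) {
    struct llama_chat_message msgs[] = {
        { "system", "You are a helpful assistant." },
        { "user",   "Hello!" },
    };
    char buf[1024];
    // Returns the total formatted length; a value larger than the buffer
    // means the output was truncated.
    int32_t n = llama_chat_apply_template("chatglm4", msgs, 2,
                                          /*add_ass=*/true, buf, sizeof(buf));
    if (n < 0 || n > (int32_t) sizeof(buf)) {
        fprintf(stderr, "template application failed or buffer too small\n");
        return 1;
    }
    printf("%.*s\n", n, buf);
    return 0;
}
```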
b5249
vulkan: Add bfloat16 support (#12554)
* vulkan: Add bfloat16 support
  This adds bfloat16 matrix multiply support based on VK_KHR_shader_bfloat16. The extension is required for coopmat multiply support, but matrix-vector multiply trivially promotes bf16 to fp32 and doesn't require the extension. The copy/get_rows shaders also don't require the extension. It's probably possible to fall back to non-coopmat and promote to fp32 when the extension isn't supported, but this change doesn't do that. The coopmat support also requires a glslc that supports the extension, which currently requires a custom build.
* vulkan: Support bf16 tensors without the bf16 extension or coopmat support
  Compile a variant of the scalar mul_mm shader that will promote the bf16 values to float, and use that when either the bf16 extension or the coopmat extensions aren't available.
* vulkan: bfloat16 fixes (really works without bfloat16 support now)
* vulkan: fix spirv-val failure and reenable -O
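Since a bfloat16 value is just the high 16 bits of an IEEE-754 fp32 value, the "promote bf16 to fp32" fallback described above is a cheap bit-level operation. Here is a minimal CPU-side C sketch of that promotion; it is illustrative only, not the shader code from this change:

```c
// Minimal sketch: promoting a bfloat16 bit pattern to fp32.
// bf16 keeps the sign, all 8 exponent bits, and the top 7 mantissa bits of
// an fp32 value, so promotion is a 16-bit left shift into a 32-bit pattern.
#include <stdint.h>
#include <string.h>
#include <stdio.h>

static float bf16_to_fp32(uint16_t h) {
    uint32_t bits = (uint32_t) h << 16; // discarded low mantissa bits return as zero
    float f;
    memcpy(&f, &bits, sizeof f);        // bit-cast via memcpy (avoids aliasing UB)
    return f;
}

int main(void) {
    printf("%f\n", bf16_to_fp32(0x3F80)); // bf16 encoding of 1.0f  -> 1.000000
    printf("%f\n", bf16_to_fp32(0xC040)); // bf16 encoding of -3.0f -> -3.000000
    return 0;
}
```

The reverse direction (fp32 to bf16) is lossy, which is why the matrix-vector path can promote "trivially" while the conversion into bf16 tensors needs rounding decisions.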
b5248
vulkan: Handle src1 batch dimension in non-contiguous mat-vec-mul sha…
b5246
sync : ggml
ggml-ci
b5243
mtmd : add **vision** support for Mistral Small 3.1 (#13231)
* convert ok
* load ok, missing patch merger
* ah sheet it works
* update llava/readme
* add test
* fix test
b5242
arg : remove CURLINFO_EFFECTIVE_METHOD (#13228)
b5241
llama-model : fix the reported size class for nomic-embed-text-v2-moe…
b5239
ggml : fix ggml_gallocr_ptr type (ggml/1205)
b5237
CUDA: batched+noncont MMQ, refactor bs>1 MoE code (#13199)