Commit Graph

5201 Commits

Author SHA1 Message Date
Ettore Di Giacinto
61afe4ca60 chore: drop drawin-x86_64 support (#7616)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-16 21:22:15 +01:00
Ettore Di Giacinto
424c95edba fix: correctly propagate error during model load (#7610)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-16 18:26:54 +01:00
Ettore Di Giacinto
b348a99b03 chore: move defaults to constants
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-16 17:40:51 +01:00
Ettore Di Giacinto
f3c70a96ba chore(memory-reclaimer): use saner defaults
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-16 16:25:09 +01:00
Ettore Di Giacinto
e3e5f59965 fix(ram): do not read from cgroup (#7606)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-16 13:28:11 +01:00
blightbow
67baf66555 feat(mlx): add thread-safe LRU prompt cache and min_p/top_k sampling (#7556)
* feat(mlx): add thread-safe LRU prompt cache

Port mlx-lm's LRUPromptCache to fix race condition where concurrent
requests corrupt shared KV cache state. The previous implementation
used a single prompt_cache instance shared across all requests.

Changes:
- Add backend/python/common/mlx_cache.py with ThreadSafeLRUPromptCache
- Modify backend.py to use per-request cache isolation via fetch/insert
- Add prefix matching for cache reuse across similar prompts
- Add LRU eviction (default 10 entries, configurable)
- Add concurrency and cache unit tests

The cache uses a trie-based structure for efficient prefix matching,
allowing prompts that share common prefixes to reuse cached KV states.
Thread safety is provided via threading.Lock.

New configuration options:
- max_cache_entries: Maximum LRU cache entries (default: 10)
- max_kv_size: Maximum KV cache size per entry (default: None)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Blightbow <blightbow@users.noreply.github.com>

* feat(mlx): add min_p and top_k sampler support

Add MinP field to proto (field 52) following the precedent set by
other non-OpenAI sampling parameters like TopK, TailFreeSamplingZ,
TypicalP, and Mirostat.

Changes:
- backend.proto: Add float MinP field for min-p sampling
- backend.py: Extract and pass min_p and top_k to mlx_lm sampler
  (top_k was in proto but not being passed)
- test.py: Fix test_sampling_params to use valid proto fields and
  switch to MLX-compatible model (mlx-community/Llama-3.2-1B-Instruct)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Blightbow <blightbow@users.noreply.github.com>

* refactor(mlx): move mlx_cache.py from common to mlx backend

The ThreadSafeLRUPromptCache is only used by the mlx backend. After
evaluating mlx-vlm, it was determined that the cache cannot be shared
because mlx-vlm's generate/stream_generate functions don't support
the prompt_cache parameter that mlx_lm provides.

- Move mlx_cache.py from backend/python/common/ to backend/python/mlx/
- Remove sys.path manipulation from backend.py and test.py
- Fix test assertion to expect "MLX model loaded successfully"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Blightbow <blightbow@users.noreply.github.com>

* test(mlx): add comprehensive cache tests and document upstream behavior

Added comprehensive unit tests (test_mlx_cache.py) covering all cache
operation modes:
- Exact match
- Shorter prefix match
- Longer prefix match with trimming
- No match scenarios
- LRU eviction and access order
- Reference counting and deep copy behavior
- Multi-model namespacing
- Thread safety with data integrity verification

Documents upstream mlx_lm/server.py behavior: single-token prefixes are
deliberately not matched (uses > 0, not >= 0) to allow longer cached
sequences to be preferred for trimming. This is acceptable because real
prompts with chat templates are always many tokens.

Removed weak unit tests from test.py that only verified "no exception
thrown" rather than correctness.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Blightbow <blightbow@users.noreply.github.com>

* chore(mlx): remove unused MinP proto field

The MinP field was added to PredictOptions but is not populated by the
Go frontend/API. The MLX backend uses getattr with a default value,
so it works without the proto field.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Blightbow <blightbow@users.noreply.github.com>

---------

Signed-off-by: Blightbow <blightbow@users.noreply.github.com>
Co-authored-by: Blightbow <blightbow@users.noreply.github.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-16 11:27:46 +01:00
Ettore Di Giacinto
878c9d46d5 fix: improve ram estimation (#7603)
* fix: default to 10seconds of watchdog if runtime setting is malformed

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: use gosigar for RAM estimation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-16 10:18:36 +01:00
Ettore Di Giacinto
b841a495da Revert "chore(deps): bump securego/gosec from 2.22.9 to 2.22.11" (#7602)
Revert "chore(deps): bump securego/gosec from 2.22.9 to 2.22.11 (#7588)"

This reverts commit 648dfc0389.
2025-12-16 09:48:46 +01:00
Ettore Di Giacinto
f75903d7f7 Update latest project news in README
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-12-16 09:16:42 +01:00
Ettore Di Giacinto
50f9c9a058 feat(watchdog): add Memory resource reclaimer (#7583)
* feat(watchdog): add GPU reclaimer

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Handle vram calculation for unified memory devices

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Support RAM eviction, set watchdog interval from runtime settings

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-16 09:15:18 +01:00
dependabot[bot]
dbd25885c3 chore(deps): bump sentence-transformers from 5.1.0 to 5.2.0 in /backend/python/transformers (#7594)
chore(deps): bump sentence-transformers in /backend/python/transformers

Bumps [sentence-transformers](https://github.com/huggingface/sentence-transformers) from 5.1.0 to 5.2.0.
- [Release notes](https://github.com/huggingface/sentence-transformers/releases)
- [Commits](https://github.com/huggingface/sentence-transformers/compare/v5.1.0...v5.2.0)

---
updated-dependencies:
- dependency-name: sentence-transformers
  dependency-version: 5.2.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-16 09:12:57 +01:00
dependabot[bot]
3d55055126 chore(deps): bump github.com/jaypipes/ghw from 0.20.0 to 0.21.1 (#7591)
Bumps [github.com/jaypipes/ghw](https://github.com/jaypipes/ghw) from 0.20.0 to 0.21.1.
- [Release notes](https://github.com/jaypipes/ghw/releases)
- [Commits](https://github.com/jaypipes/ghw/compare/v0.20.0...v0.21.1)

---
updated-dependencies:
- dependency-name: github.com/jaypipes/ghw
  dependency-version: 0.21.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-16 08:16:05 +01:00
dependabot[bot]
af7ba2e3de chore(deps): bump github.com/labstack/echo/v4 from 4.13.4 to 4.14.0 (#7589)
Bumps [github.com/labstack/echo/v4](https://github.com/labstack/echo) from 4.13.4 to 4.14.0.
- [Release notes](https://github.com/labstack/echo/releases)
- [Changelog](https://github.com/labstack/echo/blob/master/CHANGELOG.md)
- [Commits](https://github.com/labstack/echo/compare/v4.13.4...v4.14.0)

---
updated-dependencies:
- dependency-name: github.com/labstack/echo/v4
  dependency-version: 4.14.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-16 08:15:41 +01:00
LocalAI [bot]
7a3b0bbfaa chore: ⬆️ Update leejet/stable-diffusion.cpp to 200cb6f2ca07e40fa83b610a4e595f4da06ec709 (#7597)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-16 08:15:15 +01:00
dependabot[bot]
648dfc0389 chore(deps): bump securego/gosec from 2.22.9 to 2.22.11 (#7588)
Bumps [securego/gosec](https://github.com/securego/gosec) from 2.22.9 to 2.22.11.
- [Release notes](https://github.com/securego/gosec/releases)
- [Commits](https://github.com/securego/gosec/compare/v2.22.9...v2.22.11)

---
updated-dependencies:
- dependency-name: securego/gosec
  dependency-version: 2.22.11
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-16 01:49:11 +00:00
dependabot[bot]
b396413ad5 chore(deps): bump actions/download-artifact from 6 to 7 (#7587)
Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 6 to 7.
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](https://github.com/actions/download-artifact/compare/v6...v7)

---
updated-dependencies:
- dependency-name: actions/download-artifact
  dependency-version: '7'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-16 00:14:02 +01:00
dependabot[bot]
2ad928678c chore(deps): bump peter-evans/create-pull-request from 7 to 8 (#7586)
Bumps [peter-evans/create-pull-request](https://github.com/peter-evans/create-pull-request) from 7 to 8.
- [Release notes](https://github.com/peter-evans/create-pull-request/releases)
- [Commits](https://github.com/peter-evans/create-pull-request/compare/v7...v8)

---
updated-dependencies:
- dependency-name: peter-evans/create-pull-request
  dependency-version: '8'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-16 00:13:42 +01:00
dependabot[bot]
9b27b53a50 chore(deps): bump github.com/onsi/ginkgo/v2 from 2.27.2 to 2.27.3 (#7590)
Bumps [github.com/onsi/ginkgo/v2](https://github.com/onsi/ginkgo) from 2.27.2 to 2.27.3.
- [Release notes](https://github.com/onsi/ginkgo/releases)
- [Changelog](https://github.com/onsi/ginkgo/blob/master/CHANGELOG.md)
- [Commits](https://github.com/onsi/ginkgo/compare/v2.27.2...v2.27.3)

---
updated-dependencies:
- dependency-name: github.com/onsi/ginkgo/v2
  dependency-version: 2.27.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-15 22:58:45 +01:00
Ettore Di Giacinto
2387b266d8 chore(llama.cpp): Add Missing llama.cpp Options to gRPC Server (#7584)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-15 21:55:20 +01:00
dependabot[bot]
0f2df23c61 chore(deps): bump actions/upload-artifact from 5 to 6 (#7585)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 5 to 6.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/v5...v6)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-15 19:33:48 +00:00
Ettore Di Giacinto
8ac7e8c299 fix(chat-ui): model selection toggle and new chat (#7574)
Fixes a minor glitch that happens when switching model in from the chat
pane where the header was not getting updated. Besides, it allows to
create new chat directly when clicking from the management pane to the
model.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-14 22:29:11 +01:00
LocalAI [bot]
0f5cc4c07b chore: ⬆️ Update ggml-org/llama.cpp to 5c8a717128cc98aa9e5b1c44652f5cf458fd426e (#7573)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-14 22:21:54 +01:00
LocalAI [bot]
3e4e6777d8 chore: ⬆️ Update ggml-org/llama.cpp to 5266379bcae74214af397f36aa81b2a08b15d545 (#7563)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-14 11:41:10 +01:00
Simon Redman
5de539ab07 fix(7355): Update llama-cpp grpc for v3 interface (#7566)
* fix(7355): Update llama-cpp grpc for v3 interface

Signed-off-by: Simon Redman <simon@ergotech.com>

* feat(llama-gprc): Trim whitespace from servers list

Signed-off-by: Simon Redman <simon@ergotech.com>

* Trim trailing spaces in grpc-server.cpp

Signed-off-by: Simon Redman <simon@ergotech.com>

---------

Signed-off-by: Simon Redman <simon@ergotech.com>
2025-12-14 11:40:33 +01:00
LocalAI [bot]
3013d1c7b5 chore: ⬆️ Update leejet/stable-diffusion.cpp to 43a70e819b9254dee0d017305d6992f6bb27f850 (#7562)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-13 22:52:20 +01:00
LocalAI [bot]
073b3855d9 chore: ⬆️ Update ggml-org/whisper.cpp to 2551e4ce98db69027d08bd99bcc3f1a4e2ad2cef (#7561)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-13 21:22:14 +00:00
Ettore Di Giacinto
e1874cdb54 feat(ui): add mask to install custom backends (#7559)
* feat: allow to install backends from URL in the WebUI and API

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* trace backends installations

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-13 19:11:32 +01:00
Ettore Di Giacinto
7790a24682 Revert "chore(deps): bump torch from 2.5.1+cxx11.abi to 2.7.1+cpu in /backend/python/diffusers in the pip group across 1 directory" (#7558)
Revert "chore(deps): bump torch from 2.5.1+cxx11.abi to 2.7.1+cpu in /backend…"

This reverts commit 1b4aa6f1be.
2025-12-13 17:04:46 +01:00
dependabot[bot]
1b4aa6f1be chore(deps): bump torch from 2.5.1+cxx11.abi to 2.7.1+cpu in /backend/python/diffusers in the pip group across 1 directory (#7549)
chore(deps): bump torch

Bumps the pip group with 1 update in the /backend/python/diffusers directory: torch.


Updates `torch` from 2.5.1+cxx11.abi to 2.7.1+cpu

---
updated-dependencies:
- dependency-name: torch
  dependency-version: 2.7.1+cpu
  dependency-type: direct:production
  dependency-group: pip
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-13 13:12:18 +00:00
Ettore Di Giacinto
504d954aea Add chardet to requirements-l4t13.txt
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-12-13 12:59:03 +01:00
Ettore Di Giacinto
1383ad6d6d Change runner from macOS-14 to macos-latest
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-12-13 10:11:27 +01:00
Ettore Di Giacinto
5e270ba5bd Change runner from macOS-14 to macos-latest
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-12-13 10:10:47 +01:00
Ettore Di Giacinto
6d2a535813 chore(l4t13): use pytorch index (#7546)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-13 10:04:57 +01:00
Ettore Di Giacinto
abfb0ff8fe feat(stablediffusion-ggml): add lora support (#7542)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-13 08:29:06 +01:00
LocalAI [bot]
2bd6faaff5 chore: ⬆️ Update leejet/stable-diffusion.cpp to 11ab095230b2b67210f5da4d901588d56c71fe3a (#7539)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-12 21:31:13 +00:00
Ettore Di Giacinto
1a9f5da1b7 Update Discord badge with dynamic member count
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-12-12 12:50:55 +01:00
Ettore Di Giacinto
7f823fce7c Update Discord badge in README.md
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-12-12 12:34:57 +01:00
Ettore Di Giacinto
fc5b9ebfcc feat(loader): enhance single active backend to support LRU eviction (#7535)
* feat(loader): refactor single active backend support to LRU

This changeset introduces LRU management of loaded backends. Users can
set now a maximum number of models to be loaded concurrently, and, when
setting LocalAI in single active backend mode we set LRU to 1 for
backward compatibility.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: add tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Update docs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-12 12:28:38 +01:00
LocalAI [bot]
c141a40e00 chore(model-gallery): ⬆️ update checksum (#7530)
⬆️ Checksum updates in gallery/index.yaml

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-12 08:16:04 +01:00
Ettore Di Giacinto
0b130fb811 fix(llama.cpp): handle corner cases with tool array content (#7528)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-12 08:15:45 +01:00
LocalAI [bot]
0771a2d3ec chore: ⬆️ Update ggml-org/llama.cpp to a81a569577cc38b32558958b048228150be63eae (#7529)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-11 21:55:44 +00:00
Richard Palethorpe
9441eb509a chore(makefile): Add buildargs for sd and cuda when building backend (#7525)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2025-12-11 20:33:19 +01:00
Ettore Di Giacinto
8442f33712 chore(deps): bump stable-diffusion.cpp to '8823dc48bcc1598eb9671da7b69e45338d0cc5a5' (#7524)
* chore(deps): bump stable-diffusion.cpp to '8823dc48bcc1598eb9671da7b69e45338d0cc5a5'

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(Dockerfile.golang): Make curl noisy to see when download fails

Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Richard Palethorpe <io@richiejp.com>
Co-authored-by: Richard Palethorpe <io@richiejp.com>
2025-12-11 20:32:25 +01:00
Ettore Di Giacinto
5dde7e9ac6 fix: make sure to close on errors (#7521)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-11 14:03:20 +01:00
LocalAI [bot]
72621a1d1c chore: ⬆️ Update ggml-org/llama.cpp to 4dff236a522bd0ed949331d6cb1ee2a1b3615c35 (#7508)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-11 08:15:38 +01:00
Ettore Di Giacinto
3b5c2ea633 feat(ui): allow to order search results (#7507)
* feat(ui): improve table view and let items to be sorted

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactorings

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: add tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: use constants

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-11 00:11:33 +01:00
LocalAI [bot]
e1d060d147 chore: ⬆️ Update ggml-org/whisper.cpp to 9f5ed26e43c680bece09df7bdc8c1b7835f0e537 (#7509)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-10 23:09:13 +01:00
Ettore Di Giacinto
32dcb58e89 feat(vibevoice): add new backend (#7494)
* feat(vibevoice): add backend

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: add workflow and backend index

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(gallery): add vibevoice

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Use self-hosted for intel builds

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Pin python version for l4t

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-10 21:14:21 +01:00
LocalAI [bot]
ef44ace73f chore: ⬆️ Update ggml-org/llama.cpp to 086a63e3a5d2dbbb7183a74db453459e544eb55a (#7496)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-10 12:05:13 +01:00
Ettore Di Giacinto
f51d3e380b fix(config): make syncKnownUsecasesFromString idempotent (#7493)
fix(config): correctly parse usecases from strings

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-09 21:08:22 +01:00