Commit Graph

827 Commits

Author SHA1 Message Date
Ettore Di Giacinto
fc6057a952 chore(deps): bump llama.cpp to '0e1ccf15c7b6d05c720551b537857ecf6194d420' (#7684)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-22 09:50:42 +01:00
Ettore Di Giacinto
8b3e0ebf8a chore: allow to set local-ai log format, default to custom one (#7679)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-21 21:21:59 +01:00
Ettore Di Giacinto
c37785b78c chore(refactor): move logging to common package based on slog (#7668)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-21 19:33:13 +01:00
LocalAI [bot]
38cde81ff4 chore: ⬆️ Update ggml-org/llama.cpp to 52ab19df633f3de5d4db171a16f2d9edd2342fec (#7665)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-20 21:09:15 +00:00
LocalAI [bot]
626057bcca chore: ⬆️ Update ggml-org/llama.cpp to ce734a8a2f9fb6eb4f0383ab1370a1b0014ab787 (#7654)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-19 21:15:39 +00:00
LocalAI [bot]
aa0efeb0a8 chore: ⬆️ Update ggml-org/whisper.cpp to 6c22e792cb0ee155b6587ce71a8410c3aeb06949 (#7644)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-19 09:26:41 +01:00
LocalAI [bot]
f25ac00bca chore: ⬆️ Update ggml-org/llama.cpp to f9ec8858edea4a0ecfea149d6815ebfb5ecc3bcd (#7642)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-18 21:17:14 +00:00
Richard Palethorpe
c3494a0927 chore: ⬆️ Update leejet/stable-diffusion.cpp to bda7fab9f208dff4b67179a68f694b6ddec13326 (#7639)
* ⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix(stablediffusion-ggml): Don't set removed lora model dir

Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Signed-off-by: Richard Palethorpe <io@richiejp.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-18 20:52:22 +01:00
Richard Palethorpe
716dba94b4 feat(whisper): Add prompt to condition transcription output (#7624)
* chore(makefile): Add buildargs for sd and cuda when building backend

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(whisper): Add prompt to condition transcription output

Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2025-12-18 14:40:45 +01:00
LocalAI [bot]
5515119a7e chore: ⬆️ Update ggml-org/llama.cpp to d37fc935059211454e9ad2e2a44e8ed78fd6d1ce (#7629)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-18 09:07:09 +01:00
LocalAI [bot]
4535e7dfc4 chore: ⬆️ Update ggml-org/whisper.cpp to 3e79e73eee32e924fbd34587f2f2ac5a45a26b61 (#7630)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-18 09:06:48 +01:00
LocalAI [bot]
14bb65b57b chore: ⬆️ Update ggml-org/llama.cpp to ef83fb8601229ff650d952985be47e82d644bfaa (#7611)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-12-17 08:32:42 +01:00
Ettore Di Giacinto
61afe4ca60 chore: drop darwin-x86_64 support (#7616)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-16 21:22:15 +01:00
blightbow
67baf66555 feat(mlx): add thread-safe LRU prompt cache and min_p/top_k sampling (#7556)
* feat(mlx): add thread-safe LRU prompt cache

Port mlx-lm's LRUPromptCache to fix race condition where concurrent
requests corrupt shared KV cache state. The previous implementation
used a single prompt_cache instance shared across all requests.

Changes:
- Add backend/python/common/mlx_cache.py with ThreadSafeLRUPromptCache
- Modify backend.py to use per-request cache isolation via fetch/insert
- Add prefix matching for cache reuse across similar prompts
- Add LRU eviction (default 10 entries, configurable)
- Add concurrency and cache unit tests

The cache uses a trie-based structure for efficient prefix matching,
allowing prompts that share common prefixes to reuse cached KV states.
Thread safety is provided via threading.Lock.

New configuration options:
- max_cache_entries: Maximum LRU cache entries (default: 10)
- max_kv_size: Maximum KV cache size per entry (default: None)
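
A minimal sketch of the idea described above, not the code from this commit: a lock-guarded LRU cache that returns a private copy of the state stored for the longest cached prefix of a prompt. The class name `SimpleLRUPromptCache` and its `insert`/`fetch` signatures are hypothetical. A linear scan over an `OrderedDict` stands in for the trie, and `max_kv_size` and the trimming of longer cached sequences are left out.

```python
import copy
import threading
from collections import OrderedDict


class SimpleLRUPromptCache:
    """Illustrative stand-in for a thread-safe LRU prompt cache (hypothetical)."""

    def __init__(self, max_entries=10):
        self._max_entries = max_entries
        self._entries = OrderedDict()  # token tuple -> cached state
        self._lock = threading.Lock()

    def insert(self, tokens, state):
        """Store a private copy of `state` under the token sequence."""
        key = tuple(tokens)
        with self._lock:
            self._entries[key] = copy.deepcopy(state)
            self._entries.move_to_end(key)  # mark as most recently used
            while len(self._entries) > self._max_entries:
                self._entries.popitem(last=False)  # evict least recently used

    def fetch(self, tokens):
        """Return (state_copy, remaining_tokens) for the longest cached prefix.

        Returns (None, all_tokens) when no cached key is a prefix of `tokens`.
        """
        query = tuple(tokens)
        with self._lock:
            best = None
            for key in self._entries:
                is_prefix = len(key) <= len(query) and query[:len(key)] == key
                if is_prefix and (best is None or len(key) > len(best)):
                    best = key
            if best is None:
                return None, list(query)
            self._entries.move_to_end(best)  # a hit refreshes recency
            # Deep copy so concurrent requests never share mutable state.
            return copy.deepcopy(self._entries[best]), list(query[len(best):])
```

Handing each caller a deep copy is what isolates requests from one another in this sketch. The lock only protects the bookkeeping of the dictionary itself.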

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Blightbow <blightbow@users.noreply.github.com>

* feat(mlx): add min_p and top_k sampler support

Add MinP field to proto (field 52) following the precedent set by
other non-OpenAI sampling parameters like TopK, TailFreeSamplingZ,
TypicalP, and Mirostat.

Changes:
- backend.proto: Add float MinP field for min-p sampling
- backend.py: Extract and pass min_p and top_k to mlx_lm sampler
  (top_k was in proto but not being passed)
- test.py: Fix test_sampling_params to use valid proto fields and
  switch to MLX-compatible model (mlx-community/Llama-3.2-1B-Instruct)
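
For readers unfamiliar with the two sampling parameters, the following is a rough pure-Python sketch of what top-k and min-p filtering do to a token distribution. It is not the mlx_lm sampler, and the function name `filter_top_k_min_p` is hypothetical: top-k keeps the k most likely tokens, and min-p then keeps only tokens whose probability is at least `min_p` times that of the most likely token.

```python
import math


def filter_top_k_min_p(logprobs, top_k=0, min_p=0.0):
    """Return token indices that survive top-k, then min-p filtering.

    `logprobs` is a list of log-probabilities, one per token id.
    A value of 0 disables the corresponding filter (an assumption of this sketch).
    """
    # Most likely tokens first.
    order = sorted(range(len(logprobs)), key=lambda i: logprobs[i], reverse=True)
    if top_k > 0:
        order = order[:top_k]
    if min_p > 0.0 and order:
        # p >= min_p * p_max is logp >= logp_max + log(min_p) in log space.
        threshold = logprobs[order[0]] + math.log(min_p)
        order = [i for i in order if logprobs[i] >= threshold]
    return order
```

A sampler would then renormalize the surviving tokens and draw from them.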

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Blightbow <blightbow@users.noreply.github.com>

* refactor(mlx): move mlx_cache.py from common to mlx backend

The ThreadSafeLRUPromptCache is only used by the mlx backend. After
evaluating mlx-vlm, it was determined that the cache cannot be shared
because mlx-vlm's generate/stream_generate functions don't support
the prompt_cache parameter that mlx_lm provides.

- Move mlx_cache.py from backend/python/common/ to backend/python/mlx/
- Remove sys.path manipulation from backend.py and test.py
- Fix test assertion to expect "MLX model loaded successfully"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Blightbow <blightbow@users.noreply.github.com>

* test(mlx): add comprehensive cache tests and document upstream behavior

Added comprehensive unit tests (test_mlx_cache.py) covering all cache
operation modes:
- Exact match
- Shorter prefix match
- Longer prefix match with trimming
- No match scenarios
- LRU eviction and access order
- Reference counting and deep copy behavior
- Multi-model namespacing
- Thread safety with data integrity verification

Documents upstream mlx_lm/server.py behavior: single-token prefixes are
deliberately not matched (uses > 0, not >= 0) to allow longer cached
sequences to be preferred for trimming. This is acceptable because real
prompts with chat templates are always many tokens.

Removed weak unit tests from test.py that only verified "no exception
thrown" rather than correctness.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Blightbow <blightbow@users.noreply.github.com>

* chore(mlx): remove unused MinP proto field

The MinP field was added to PredictOptions but is not populated by the
Go frontend/API. The MLX backend uses getattr with a default value,
so it works without the proto field.
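
The fallback can be illustrated with a small sketch. The helper name `read_min_p`, the attribute name `MinP`, and the default of `0.0` are assumptions made for illustration, and `SimpleNamespace` stands in for the request message:

```python
from types import SimpleNamespace


def read_min_p(request, default=0.0):
    """Read an optional sampling parameter, falling back when it is absent."""
    # Three-argument getattr returns `default` instead of raising
    # AttributeError when the object has no such attribute.
    return getattr(request, "MinP", default)


# Stand-ins for a request with and without the field.
with_field = SimpleNamespace(TopK=40, MinP=0.05)
without_field = SimpleNamespace(TopK=40)
```

Calling `read_min_p(without_field)` yields the default, so the caller needs no special case for messages that lack the field.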

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Blightbow <blightbow@users.noreply.github.com>

---------

Signed-off-by: Blightbow <blightbow@users.noreply.github.com>
Co-authored-by: Blightbow <blightbow@users.noreply.github.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-16 11:27:46 +01:00
dependabot[bot]
dbd25885c3 chore(deps): bump sentence-transformers from 5.1.0 to 5.2.0 in /backend/python/transformers (#7594)
chore(deps): bump sentence-transformers in /backend/python/transformers

Bumps [sentence-transformers](https://github.com/huggingface/sentence-transformers) from 5.1.0 to 5.2.0.
- [Release notes](https://github.com/huggingface/sentence-transformers/releases)
- [Commits](https://github.com/huggingface/sentence-transformers/compare/v5.1.0...v5.2.0)

---
updated-dependencies:
- dependency-name: sentence-transformers
  dependency-version: 5.2.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-16 09:12:57 +01:00
LocalAI [bot]
7a3b0bbfaa chore: ⬆️ Update leejet/stable-diffusion.cpp to 200cb6f2ca07e40fa83b610a4e595f4da06ec709 (#7597)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-16 08:15:15 +01:00
Ettore Di Giacinto
2387b266d8 chore(llama.cpp): Add Missing llama.cpp Options to gRPC Server (#7584)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-15 21:55:20 +01:00
LocalAI [bot]
0f5cc4c07b chore: ⬆️ Update ggml-org/llama.cpp to 5c8a717128cc98aa9e5b1c44652f5cf458fd426e (#7573)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-14 22:21:54 +01:00
LocalAI [bot]
3e4e6777d8 chore: ⬆️ Update ggml-org/llama.cpp to 5266379bcae74214af397f36aa81b2a08b15d545 (#7563)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-14 11:41:10 +01:00
Simon Redman
5de539ab07 fix(7355): Update llama-cpp grpc for v3 interface (#7566)
* fix(7355): Update llama-cpp grpc for v3 interface

Signed-off-by: Simon Redman <simon@ergotech.com>

* feat(llama-grpc): Trim whitespace from servers list

Signed-off-by: Simon Redman <simon@ergotech.com>

* Trim trailing spaces in grpc-server.cpp

Signed-off-by: Simon Redman <simon@ergotech.com>

---------

Signed-off-by: Simon Redman <simon@ergotech.com>
2025-12-14 11:40:33 +01:00
LocalAI [bot]
3013d1c7b5 chore: ⬆️ Update leejet/stable-diffusion.cpp to 43a70e819b9254dee0d017305d6992f6bb27f850 (#7562)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-13 22:52:20 +01:00
LocalAI [bot]
073b3855d9 chore: ⬆️ Update ggml-org/whisper.cpp to 2551e4ce98db69027d08bd99bcc3f1a4e2ad2cef (#7561)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-13 21:22:14 +00:00
Ettore Di Giacinto
7790a24682 Revert "chore(deps): bump torch from 2.5.1+cxx11.abi to 2.7.1+cpu in /backend/python/diffusers in the pip group across 1 directory" (#7558)
Revert "chore(deps): bump torch from 2.5.1+cxx11.abi to 2.7.1+cpu in /backend…"

This reverts commit 1b4aa6f1be.
2025-12-13 17:04:46 +01:00
dependabot[bot]
1b4aa6f1be chore(deps): bump torch from 2.5.1+cxx11.abi to 2.7.1+cpu in /backend/python/diffusers in the pip group across 1 directory (#7549)
chore(deps): bump torch

Bumps the pip group with 1 update in the /backend/python/diffusers directory: torch.


Updates `torch` from 2.5.1+cxx11.abi to 2.7.1+cpu

---
updated-dependencies:
- dependency-name: torch
  dependency-version: 2.7.1+cpu
  dependency-type: direct:production
  dependency-group: pip
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-13 13:12:18 +00:00
Ettore Di Giacinto
504d954aea Add chardet to requirements-l4t13.txt
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-12-13 12:59:03 +01:00
Ettore Di Giacinto
6d2a535813 chore(l4t13): use pytorch index (#7546)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-13 10:04:57 +01:00
Ettore Di Giacinto
abfb0ff8fe feat(stablediffusion-ggml): add lora support (#7542)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-13 08:29:06 +01:00
LocalAI [bot]
2bd6faaff5 chore: ⬆️ Update leejet/stable-diffusion.cpp to 11ab095230b2b67210f5da4d901588d56c71fe3a (#7539)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-12 21:31:13 +00:00
Ettore Di Giacinto
0b130fb811 fix(llama.cpp): handle corner cases with tool array content (#7528)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-12 08:15:45 +01:00
LocalAI [bot]
0771a2d3ec chore: ⬆️ Update ggml-org/llama.cpp to a81a569577cc38b32558958b048228150be63eae (#7529)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-11 21:55:44 +00:00
Ettore Di Giacinto
8442f33712 chore(deps): bump stable-diffusion.cpp to '8823dc48bcc1598eb9671da7b69e45338d0cc5a5' (#7524)
* chore(deps): bump stable-diffusion.cpp to '8823dc48bcc1598eb9671da7b69e45338d0cc5a5'

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(Dockerfile.golang): Make curl noisy to see when download fails

Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Richard Palethorpe <io@richiejp.com>
Co-authored-by: Richard Palethorpe <io@richiejp.com>
2025-12-11 20:32:25 +01:00
LocalAI [bot]
72621a1d1c chore: ⬆️ Update ggml-org/llama.cpp to 4dff236a522bd0ed949331d6cb1ee2a1b3615c35 (#7508)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-11 08:15:38 +01:00
LocalAI [bot]
e1d060d147 chore: ⬆️ Update ggml-org/whisper.cpp to 9f5ed26e43c680bece09df7bdc8c1b7835f0e537 (#7509)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-10 23:09:13 +01:00
Ettore Di Giacinto
32dcb58e89 feat(vibevoice): add new backend (#7494)
* feat(vibevoice): add backend

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: add workflow and backend index

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(gallery): add vibevoice

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Use self-hosted for intel builds

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Pin python version for l4t

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-10 21:14:21 +01:00
LocalAI [bot]
ef44ace73f chore: ⬆️ Update ggml-org/llama.cpp to 086a63e3a5d2dbbb7183a74db453459e544eb55a (#7496)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-10 12:05:13 +01:00
Ettore Di Giacinto
74ee1463fe chore(deps/llama-cpp): bump to '2fa51c19b028180b35d316e9ed06f5f0f7ada2c1' (#7484)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-09 15:41:37 +01:00
LocalAI [bot]
6c7b215687 chore: ⬆️ Update ggml-org/whisper.cpp to a8f45ab11d6731e591ae3d0230be3fec6c2efc91 (#7483)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-09 08:33:30 +01:00
dependabot[bot]
bbce461f57 chore(deps): bump protobuf from 6.33.1 to 6.33.2 in /backend/python/transformers (#7481)
chore(deps): bump protobuf in /backend/python/transformers

Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 6.33.1 to 6.33.2.
- [Release notes](https://github.com/protocolbuffers/protobuf/releases)
- [Commits](https://github.com/protocolbuffers/protobuf/commits)

---
updated-dependencies:
- dependency-name: protobuf
  dependency-version: 6.33.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-08 22:13:18 +01:00
LocalAI [bot]
5610384d8a chore: ⬆️ Update ggml-org/llama.cpp to db97837385edfbc772230debbd49e5efae843a71 (#7447)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-07 08:32:35 +01:00
LocalAI [bot]
c3493e4917 chore: ⬆️ Update ggml-org/whisper.cpp to a88b93f85f08fc6045e5d8a8c3f94b7be0ac8bce (#7448)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-06 21:26:25 +00:00
LocalAI [bot]
edf7141b9b chore: ⬆️ Update ggml-org/llama.cpp to 8160b38a5fa8a25490ca33ffdd200cda51405688 (#7438)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-06 13:35:24 +01:00
Ettore Di Giacinto
024aa6a55b chore(deps): bump llama.cpp to 'bde188d60f58012ada0725c6dd5ba7c69fe4dd87' (#7434)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-05 00:17:35 +01:00
Copilot
1abbedd732 feat(diffusers): implement dynamic pipeline loader to remove per-pipeline conditionals (#7365)
* Initial plan

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add dynamic loader for diffusers pipelines and refactor backend.py

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fix pipeline discovery error handling and test mock issue

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Address code review feedback: direct imports, better error handling, improved tests

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Address remaining code review feedback: specific exceptions, registry access, test imports

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add defensive fallback for DiffusionPipeline registry access

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Actually use dynamic pipeline loading for all pipelines in backend

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Use dynamic loader consistently for all pipelines including AutoPipelineForText2Image

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Move dynamic loader tests into test.py for CI compatibility

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Extend dynamic loader to discover any diffusers class type, not just DiffusionPipeline

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add AutoPipeline classes to pipeline registry for default model loading

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(python): set pyvenv python home

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* do pyenv update during start

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Minor changes

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-12-04 19:02:06 +01:00
Richard Palethorpe
c2e4a1f29b feat(stablediffusion): Passthrough more parameters to support z-image and flux2 (#7419)
* feat(stablediffusion): Passthrough more parameters to support z-image and flux2

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* chore(z-image): Add Z-Image-Turbo GGML to library

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(stablediffusion-ggml): flush stderr and check errors when writing PNG

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(stablediffusion-ggml): Re-allocate Go strings in C++

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(stablediffusion-ggml): Try to avoid segfaults

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(stablediffusion-ggml): Init sample and easycache params

Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Richard Palethorpe <io@richiejp.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-12-04 17:08:21 +01:00
LocalAI [bot]
ca2e878aaf chore: ⬆️ Update ggml-org/llama.cpp to e9f9483464e6f01d843d7f0293bd9c7bc6b2221c (#7421)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-12-04 11:54:01 +01:00
LocalAI [bot]
7c5a0cde64 chore: ⬆️ Update leejet/stable-diffusion.cpp to 5865b5e7034801af1a288a9584631730b25272c6 (#7422)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-12-04 11:29:16 +01:00
Ettore Di Giacinto
edcbf82b31 chore(ci): add wget
2025-12-04 10:01:34 +01:00
Ettore Di Giacinto
6558caca85 chore(ci): adapt also golang-based backends docker images
2025-12-04 09:14:08 +01:00
Ettore Di Giacinto
b4172762d7 chore(ci): do override pip in 24.04
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-03 22:54:13 +01:00
Ettore Di Giacinto
dc6182bbb1 chore(ci): add wget to llama-cpp docker image builder
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-03 22:48:41 +01:00