Commit Graph

778 Commits

Author SHA1 Message Date
Ettore Di Giacinto
dc6182bbb1 chore(ci): add wget to llama-cpp docker image builder
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-03 22:48:41 +01:00
Ettore Di Giacinto
1d1d52da59 chore(ci): small fixups to build arm64 images
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-03 21:42:33 +01:00
Ettore Di Giacinto
46b1a1848f chore(ci): minor fixup
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-03 16:47:31 +01:00
LocalAI [bot]
957eea3da3 chore: ⬆️ Update ggml-org/llama.cpp to 61bde8e21f4a1f9a98c9205831ca3e55457b4c78 (#7415)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-12-03 16:27:12 +01:00
Ettore Di Giacinto
ab4f2742a6 chore(ci): minor fixup
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-03 16:26:33 +01:00
Ettore Di Giacinto
03f3bf2d94 chore(ci): only install runtime libs needed on arm64
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-03 15:13:21 +01:00
Ettore Di Giacinto
8dfeea2f55 fix: use ubuntu 24.04 for cuda13 l4t images (#7418)
* fix: use ubuntu 24.04 for cuda13 l4t images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop openblas from containers

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-03 09:47:03 +01:00
Ettore Di Giacinto
fea9018dc5 Revert "feat(stablediffusion): Passthrough more parameters to support z-image and flux2" (#7417)
Revert "feat(stablediffusion): Passthrough more parameters to support z-image…"

This reverts commit 4018e59b2a.
2025-12-02 22:14:28 +01:00
Richard Palethorpe
4018e59b2a feat(stablediffusion): Passthrough more parameters to support z-image and flux2 (#7414)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2025-12-02 18:28:26 +01:00
Richard Palethorpe
aaece6685f chore(deps/stable-diffusion-ggml): update stablediffusion-ggml (#7411)
* ⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix(stablediffusion-ggml): fixup schedulers and samplers arrays, use default getters

Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Signed-off-by: Richard Palethorpe <io@richiejp.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-12-02 16:35:39 +01:00
Ettore Di Giacinto
f5df806f35 Fixup tags
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-02 15:15:41 +01:00
Ettore Di Giacinto
cfd95745ed feat: add cuda13 images (#7404)
* chore(ci): add cuda13 jobs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add to pipelines and to capabilities. Start to work on the gallery

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* gallery

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* capabilities: try to detect by looking at /usr/local

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* neutts

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* backends.yaml

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* add cuda13 l4t requirements.txt

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* add cuda13 requirements.txt

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Pin vllm

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Not all backends are compatible

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* add vllm to requirements

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* vllm is not pre-compiled for cuda 13

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-02 14:24:35 +01:00
LocalAI [bot]
665441ca94 chore: ⬆️ Update ggml-org/llama.cpp to ec18edfcba94dacb166e6523612fc0129cead67a (#7406)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-02 07:59:52 +01:00
Ettore Di Giacinto
e3bcba5c45 chore: ⬆️ Update ggml-org/llama.cpp to 7f8ef50cce40e3e7e4526a3696cb45658190e69a (#7402)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-01 07:50:40 +01:00
LocalAI [bot]
0824fd8efd chore: ⬆️ Update ggml-org/llama.cpp to 8c32d9d96d9ae345a0150cae8572859e9aafea0b (#7395)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-30 09:06:18 +01:00
Ettore Di Giacinto
468ac608f3 chore(deps): bump llama.cpp to 'd82b7a7c1d73c0674698d9601b1bbb0200933f29' (#7392)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-29 08:58:07 +01:00
Ettore Di Giacinto
4b5977f535 chore: drop pinning of python 3.12 (#7389)
Update install.sh

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-11-28 11:02:56 +01:00
Ettore Di Giacinto
0d877b1e71 Revert "chore(l4t): Update extra index URL for requirements-l4t.txt" (#7388)
Revert "chore(l4t): Update extra index URL for requirements-l4t.txt (#7383)"

This reverts commit 0d781e6b7e.
2025-11-28 11:02:11 +01:00
Ettore Di Giacinto
e27f1370eb chore(diffusers): Add PY_STANDALONE_TAG for l4t Python version (#7387)
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-11-28 09:34:05 +01:00
LocalAI [bot]
1a53fd2b9b chore: ⬆️ Update ggml-org/llama.cpp to 4abef75f2cf2eee75eb5083b30a94cf981587394 (#7382)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-28 00:08:27 +01:00
Ettore Di Giacinto
e01d821314 chore: Add Python 3.12 support for l4t build profile (#7384)
Set Python version to 3.12 for l4t build profile.

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-11-27 23:00:09 +01:00
Ettore Di Giacinto
0d781e6b7e chore(l4t): Update extra index URL for requirements-l4t.txt (#7383)
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-11-27 22:02:06 +01:00
Ettore Di Giacinto
7ccc383a8b chore(l4t/diffusers): bump nvidia l4t index for pytorch 2.9 (#7379)
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-11-27 17:42:01 +01:00
Ettore Di Giacinto
2f8a2b1297 chore(deps): update diffusers dependency to use GitHub repo for l4t (#7369)
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-11-27 16:02:48 +01:00
LocalAI [bot]
b5f4f4ac6d chore: ⬆️ Update ggml-org/llama.cpp to eec1e33a9ed71b79422e39cc489719cf4f8e0777 (#7363)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-27 09:17:25 +01:00
Ettore Di Giacinto
7a94d237c4 chore(deps): bump llama.cpp to '583cb83416467e8abf9b37349dcf1f6a0083745a' (#7358)
chore(deps): bump llama.cpp to '583cb83416467e8abf9b37349dcf1f6a0083745a'

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-26 08:23:21 +01:00
dependabot[bot]
7e01aa8faa chore(deps): bump protobuf from 6.32.0 to 6.33.1 in /backend/python/transformers (#7340)
chore(deps): bump protobuf in /backend/python/transformers

Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 6.32.0 to 6.33.1.
- [Release notes](https://github.com/protocolbuffers/protobuf/releases)
- [Changelog](https://github.com/protocolbuffers/protobuf/blob/main/protobuf_release.bzl)
- [Commits](https://github.com/protocolbuffers/protobuf/commits)

---
updated-dependencies:
- dependency-name: protobuf
  dependency-version: 6.33.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-24 20:12:17 +00:00
LocalAI [bot]
f6d2a52cd5 chore: ⬆️ Update ggml-org/llama.cpp to 0c7220db56525d40177fcce3baa0d083448ec813 (#7337)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-24 09:11:38 +01:00
LocalAI [bot]
05a00b2399 chore: ⬆️ Update ggml-org/llama.cpp to 3f3a4fb9c3b907c68598363b204e6f58f4757c8c (#7336)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-22 21:53:40 +00:00
Ettore Di Giacinto
3a232446e0 Revert "chore(chatterbox): bump l4t index to support more recent pytorch" (#7333)
Revert "chore(chatterbox): bump l4t index to support more recent pytorch (#7332)"

This reverts commit 55607a5aac.
2025-11-22 10:10:27 +01:00
LocalAI [bot]
bdfe8431fa chore: ⬆️ Update ggml-org/llama.cpp to 23bc779a6e58762ea892eca1801b2ea1b9050c00 (#7331)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-22 08:44:01 +01:00
Ettore Di Giacinto
55607a5aac chore(chatterbox): bump l4t index to support more recent pytorch (#7332)
This should add support for devices like the DGX Spark

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-11-21 22:24:46 +01:00
Ettore Di Giacinto
ec492a4c56 fix(typo): environment variable name for max jobs
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-11-21 18:37:22 +01:00
Ettore Di Giacinto
2defe98df8 fix(vllm): Update flash-attn to specific wheel URL
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-11-21 18:06:46 +01:00
Ettore Di Giacinto
6261c87b1b Add NVCC_THREADS and MAX_JOB environment variables
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-11-21 16:14:13 +01:00
Ettore Di Giacinto
e88db7d142 fix(llama.cpp): handle corner cases with tool content (#7324)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-21 09:21:49 +01:00
LocalAI [bot]
b7b8a0a748 chore: ⬆️ Update ggml-org/llama.cpp to dd0f3219419b24740864b5343958a97e1b3e4b26 (#7322)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-21 08:11:47 +01:00
LocalAI [bot]
b8011f49f2 chore: ⬆️ Update ggml-org/whisper.cpp to 19ceec8eac980403b714d603e5ca31653cd42a3f (#7321)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-20 23:07:22 +01:00
Ettore Di Giacinto
daf39e1efd chore(vllm/ci): set maximum number of jobs
Also added comments to clarify CPU usage during build.

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-11-20 15:53:32 +01:00
LocalAI [bot]
bfa07df7cd chore: ⬆️ Update ggml-org/llama.cpp to 7d77f07325985c03a91fa371d0a68ef88a91ec7f (#7314)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-20 07:58:42 +01:00
Ettore Di Giacinto
3152611184 chore(deps): bump llama.cpp to '10e9780154365b191fb43ca4830659ef12def80f' (#7311)
chore(deps): bump llama.cpp to '10e9780154365b191fb43ca4830659ef12def80f'

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-19 14:42:11 +01:00
LocalAI [bot]
4278506876 chore: ⬆️ Update ggml-org/llama.cpp to cb623de3fc61011e5062522b4d05721a22f2e916 (#7301)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-18 07:43:57 +01:00
LocalAI [bot]
1dd1d12da1 chore: ⬆️ Update ggml-org/whisper.cpp to b12abefa9be2abae39a73fa903322af135024a36 (#7300)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-18 07:43:33 +01:00
LocalAI [bot]
fb834805db chore: ⬆️ Update ggml-org/llama.cpp to 80deff3648b93727422461c41c7279ef1dac7452 (#7287)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-17 07:51:08 +01:00
Ettore Di Giacinto
d7f9f3ac93 feat: add support to logitbias and logprobs (#7283)
* feat: add support to logprobs in results

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: add support to logitbias

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-16 13:27:36 +01:00
LocalAI [bot]
d1a0dd10e6 chore: ⬆️ Update ggml-org/llama.cpp to 662192e1dcd224bc25759aadd0190577524c6a66 (#7277)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-16 08:41:12 +01:00
LocalAI [bot]
a09d49da43 chore: ⬆️ Update ggml-org/llama.cpp to 9b17d74ab7d31cb7d15ee7eec1616c3d825a84c0 (#7273)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-15 00:05:39 +01:00
Ettore Di Giacinto
03e9f4b140 fix: handle tool errors (#7271)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-14 17:23:56 +01:00
Ettore Di Giacinto
7129409bf6 chore(deps): bump llama.cpp to c4abcb2457217198efdd67d02675f5fddb7071c2 (#7266)
* chore(deps): bump llama.cpp to '92bb442ad999a0d52df0af2730cd861012e8ac5c'

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* DEBUG

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Bump

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* test/debug

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Revert "DEBUG"

This reverts commit 2501ca3ff2.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-14 12:16:52 +01:00
LocalAI [bot]
d9e9ec6825 chore: ⬆️ Update ggml-org/whisper.cpp to d9b7613b34a343848af572cc14467fc5e82fc788 (#7268)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-13 23:05:06 +01:00