Commit Graph

346 Commits

Author SHA1 Message Date
Ettore Di Giacinto
089efe05fd feat(backends): add system backend, refactor (#6059)
- Add a system backend path
- Refactor and consolidate system information in system state
- Use system state in all the components to figure out the system paths
  to use whenever needed
- Refactor BackendConfig -> ModelConfig. This was otherwise misleading, as
  we now have a backend configuration which is not the model config.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-08-14 19:38:26 +02:00
Ettore Di Giacinto
19c92c70c5 fix(backend-detection): default to CPU if there is less than 4GB of GPU available (#6057)
fix(gpu-detection): default to CPU if there is less than 4GB of GPU available

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-08-14 16:57:33 +02:00
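
A minimal sketch of the fallback described in 19c92c70c5 above, assuming the detected GPU memory is available as a byte count. The names `minGPUMemory` and `selectCapability` and the capability strings are hypothetical and are not taken from the LocalAI code; the threshold is read here as 4 GiB.

```go
package main

import "fmt"

// minGPUMemory is the threshold from the commit title (4GB), assumed to mean 4 GiB.
const minGPUMemory uint64 = 4 * 1024 * 1024 * 1024

// selectCapability keeps the detected GPU capability only when enough GPU
// memory is available; otherwise it falls back to "cpu".
// Hypothetical helper: the real detection logic may differ.
func selectCapability(detected string, gpuMemoryBytes uint64) string {
	if detected == "" || gpuMemoryBytes < minGPUMemory {
		return "cpu"
	}
	return detected
}

func main() {
	fmt.Println(selectCapability("nvidia", 2*1024*1024*1024)) // cpu
	fmt.Println(selectCapability("nvidia", 8*1024*1024*1024)) // nvidia
}
```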
Ettore Di Giacinto
b52bfaf1b3 fix: do not show invalid backends (#6058)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-08-14 13:01:56 +02:00
Ettore Di Giacinto
05757e2738 feat(backends install): allow to specify name and alias during manual installation (#5971)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-08-10 10:05:53 +02:00
Ettore Di Giacinto
b9a25b16e6 feat: add reasoning effort and metadata to template (#5981)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-08-06 21:56:05 +02:00
Ettore Di Giacinto
3295a298f4 feat(webui): allow to specify image size (#5976)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-08-06 12:38:02 +02:00
Ettore Di Giacinto
90f5639639 feat(backends): allow backends to not have a metadata file (#5963)
In this case we generate one on the fly and we infer the metadata we
can.

Obviously this has the side effect of not being able to register
potential aliases.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-08-03 16:47:02 +02:00
Ettore Di Giacinto
a35a701052 feat(backends): install from local path (#5962)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-08-03 14:24:50 +02:00
Ettore Di Giacinto
9aadfd485f chore: update swagger (#5946)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-31 16:22:27 +02:00
Ettore Di Giacinto
3d22bfc27c feat(stablediffusion-ggml): add support to ref images (flux Kontext) (#5935)
* feat(stablediffusion-ggml): add support to ref images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add it to the model gallery

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-30 22:42:34 +02:00
Ettore Di Giacinto
949e5b9be8 feat(rfdetr): add object detection API (#5923)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-27 22:02:51 +02:00
Ettore Di Giacinto
73ecb7f90b chore: drop assistants endpoint (#5926)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-27 21:06:09 +02:00
Ettore Di Giacinto
053bed6e5f feat: normalize search (#5925)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-27 11:51:28 +02:00
Ettore Di Giacinto
b3600b3c50 feat(backend gallery): add mirrors (#5910)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-25 19:20:08 +02:00
Ettore Di Giacinto
f0b47cfe6a fix(backends gallery): trim string when reading cap from file (#5909)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-25 18:10:02 +02:00
Ettore Di Giacinto
ee625fc34e fix(backends gallery): pass-by backend galleries to the model service (#5906)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-25 16:38:09 +02:00
Dave
b3c2a3c257 fix: untangle pkg and core (#5896)
* migrate core/system to pkg/system - it has no dependencies FROM core, and IS USED in pkg

Signed-off-by: Dave Lee <dave@gray101.com>

* move pkg/templates up to core/templates -- nothing in pkg references it, but it does reference core.

Signed-off-by: Dave Lee <dave@gray101.com>

* remove extra check, len of nil is 0

Signed-off-by: Dave Lee <dave@gray101.com>

* move pkg/startup to core/startup -- it does have important and unfixable dependencies on core

Signed-off-by: Dave Lee <dave@gray101.com>

---------

Signed-off-by: Dave Lee <dave@gray101.com>
2025-07-24 15:03:41 +02:00
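
The "len of nil is 0" note in b3c2a3c257 above refers to a general Go property: `len` of a nil slice or nil map is 0, so a separate nil check before a length check is redundant. A small illustration, where `hasItems` is a hypothetical helper and not code from the repository:

```go
package main

import "fmt"

// hasItems reports whether the slice holds at least one element.
// No separate nil check is needed because len of a nil slice is 0.
func hasItems(items []string) bool {
	return len(items) > 0
}

func main() {
	var none []string     // nil slice
	var empty map[string]int // nil map

	fmt.Println(len(none), hasItems(none)) // 0 false
	fmt.Println(len(empty))                // 0
	fmt.Println(hasItems([]string{"a"}))   // true
}
```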
Dave
9cecf5e7ac fix: rename Dockerfile.go --> Dockerfile.golang to avoid IDE errors (#5892)
extract up and out Dockerfile.go --> Dockerfile.golang rename. Prevents syntax highlighting and IDE errors

Signed-off-by: Dave Lee <dave@gray101.com>
2025-07-23 21:33:26 +02:00
Ettore Di Giacinto
5f7ece3e94 fix(p2p): adapt to backend changes, general improvements (#5889)
The binary is now named "llama-cpp-rpc-server" for p2p workers.

We also decrease the default token rotation interval, so that
peer discovery is much more responsive.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-23 12:40:32 +02:00
Richard Palethorpe
754bedc3ea fix(realtime): Reset speech started flag on commit (#5879)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2025-07-22 16:41:12 +02:00
Ettore Di Giacinto
98e5291afc feat: refactor build process, drop embedded backends (#5875)
* feat: split remaining backends and drop embedded backends

- Drop silero-vad, huggingface, and stores backend from embedded
  binaries
- Refactor Makefile and Dockerfile to avoid building grpc backends
- Drop golang code that was used to embed backends
- Simplify building by using goreleaser

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(gallery): be specific with llama-cpp backend templates

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(docs): update

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(ci): minor fixes

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: drop all ffmpeg references

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: run protogen-go

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Always enable p2p mode

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Update gorelease file

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(stores): do not always load

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fix linting issues

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Simplify

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Mac OS fixup

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-22 16:31:04 +02:00
Max Goltzsche
eae4ca08da feat(openai): support input_audio chat api field (#5870)
Improve the chat completion endpoint's OpenAI API compatibility by supporting messages of type `input_audio`, e.g.:
```
{
  ...
  "messages": [
    {
      "role": "user",
      "content": [{
        "type": "input_audio",
        "input_audio": {
          "data": "<base64-encoded audio data>",
          "format": "wav"
        }
      }]
    }
  ]
}
```

Closes #5869

Signed-off-by: Max Goltzsche <max.goltzsche@gmail.com>
2025-07-21 09:15:55 +02:00
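
A client could build the `input_audio` message from eae4ca08da above roughly as follows. This is a sketch only: the JSON field names come from the commit message, while the Go type and function names are assumptions made for illustration. The rest of the request body, elided as `...` in the commit message, is left out.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// inputAudio mirrors the "input_audio" object from the commit message.
type inputAudio struct {
	Data   string `json:"data"`   // base64-encoded audio data
	Format string `json:"format"` // e.g. "wav"
}

// contentPart is one element of a message's "content" array.
type contentPart struct {
	Type       string      `json:"type"`
	InputAudio *inputAudio `json:"input_audio,omitempty"`
}

// chatMessage is a single entry of the "messages" array.
type chatMessage struct {
	Role    string        `json:"role"`
	Content []contentPart `json:"content"`
}

// newInputAudioMessage wraps raw audio bytes in a user message of type
// "input_audio". Hypothetical helper, not part of any client library.
func newInputAudioMessage(audio []byte, format string) chatMessage {
	return chatMessage{
		Role: "user",
		Content: []contentPart{{
			Type: "input_audio",
			InputAudio: &inputAudio{
				Data:   base64.StdEncoding.EncodeToString(audio),
				Format: format,
			},
		}},
	}
}

func main() {
	// Placeholder bytes; a real caller would read a WAV file instead.
	msg := newInputAudioMessage([]byte("placeholder, not real audio"), "wav")
	out, err := json.MarshalIndent(msg, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```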
Ettore Di Giacinto
b29544d747 feat: split piper from main binary (#5858)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-19 08:31:33 +02:00
Ettore Di Giacinto
294f7022f3 feat: do not bundle llama-cpp anymore (#5790)
* Build llama.cpp separately

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Start to try to attach some tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add git and small fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: correctly autoload external backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Try to run AIO tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Slightly update the Makefile helps

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Adapt auto-bumper

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Try to run linux test

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add llama-cpp into build pipelines

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add default capability (for cpu)

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop llama-cpp specific logic from the backend loader

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* drop grpc install in ci for tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Pass by backends path for tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Build protogen at start

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(tests): set backends path consistently

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Correctly configure the backends path

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Try to build for darwin

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Compile for metal on arm64/darwin

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Try to run build off from cross-arch

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add to the backend index nvidia-l4t and cpu's llama-cpp backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Build also darwin-x86 for llama-cpp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Disable arm64 builds temporarily

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Test backend build on PR

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixup build backend reusable workflow

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* pass by skip drivers

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Use crane

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Skip drivers

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* x86 darwin

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add packaging step for llama.cpp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fix leftover from bark-cpp extraction

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Try to fix hipblas build

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-18 13:24:12 +02:00
Richard Palethorpe
932f6b01a6 feat(realtime): Add speech started and stopped events (#5856)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2025-07-18 09:22:23 +02:00
Ettore Di Giacinto
5fc8d5bb78 fix: explorer page should not have login (#5855)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-17 10:54:03 +02:00
Ettore Di Giacinto
354c0b763e feat(cli): add command to create custom OCI images from directories (#5844)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-14 08:21:29 +02:00
Ettore Di Giacinto
ec206cc67c feat(cli): allow to install backends from OCI tar files (#5816)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-09 18:19:51 +02:00
Ettore Di Giacinto
c5b9f45166 chore(cli): add backends CLI to manipulate and install backends (#5787)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-03 19:31:27 +02:00
Ettore Di Giacinto
8276952920 feat(system): detect and allow to override capabilities (#5785)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-03 19:30:52 +02:00
Ettore Di Giacinto
b7cd5bfaec feat(backends): add metas in the gallery (#5784)
* chore(backends): add metas in the gallery

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: correctly handle aliases and metas with same names

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-03 18:01:55 +02:00
Ettore Di Giacinto
bfdc29d316 fix(gallery): correctly show status for downloading OCI images (#5774)
We can't use the bytes written by mutate.Extract as the current status, as
that will be bigger than the compressed image size. Image manifests don't
guarantee the type of artifact (it can be compressed or not) when
showing the layer size.

Split the extraction process in two parts: downloading, and extracting as
a flattened filesystem, so that we can display the status of downloading
and extracting accordingly.

This change also fixes a small nuance in detecting installed backends:
detection is now more consistent and checks whether a metadata.json and/or
a path with a `run.sh` file is present.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-02 08:25:48 +02:00
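
The installed-backend check mentioned in bfdc29d316 above could look roughly like this. It is a sketch under the assumption that each backend lives in its own directory; `fileExists` and `isInstalledBackend` are hypothetical names, and the actual implementation may apply further rules.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// fileExists reports whether path exists and is a regular file.
func fileExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && info.Mode().IsRegular()
}

// isInstalledBackend treats a directory as an installed backend when it
// contains a metadata.json and/or a run.sh file.
func isInstalledBackend(dir string) bool {
	return fileExists(filepath.Join(dir, "metadata.json")) ||
		fileExists(filepath.Join(dir, "run.sh"))
}

func main() {
	dir, err := os.MkdirTemp("", "backend-")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	fmt.Println(isInstalledBackend(dir)) // false: directory is empty

	script := filepath.Join(dir, "run.sh")
	if err := os.WriteFile(script, []byte("#!/bin/sh\n"), 0o755); err != nil {
		panic(err)
	}
	fmt.Println(isInstalledBackend(dir)) // true: run.sh is present
}
```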
Ettore Di Giacinto
d0fb23514f Revert "fix(gallery): correctly show status for downloading OCI images"
This reverts commit 780d034ac9.
2025-07-01 21:32:04 +02:00
Ettore Di Giacinto
780d034ac9 fix(gallery): correctly show status for downloading OCI images
We can't use the bytes written by mutate.Extract as the current status, as
that will be bigger than the compressed image size. Image manifests don't
guarantee the type of artifact (it can be compressed or not) when
showing the layer size.

Split the extraction process in two parts: downloading, and extracting as
a flattened filesystem, so that we can display the status of downloading
and extracting accordingly.

This change also fixes a small nuance in detecting installed backends:
detection is now more consistent and checks whether a metadata.json and/or
a path with a `run.sh` file is present.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-01 19:56:28 +02:00
Ettore Di Giacinto
33f9ee06c9 fix(gallery): automatically install model from name (#5757)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-06-29 17:42:58 +02:00
Ettore Di Giacinto
dfadc3696e feat(llama.cpp): allow to set kv-overrides (#5745)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-06-28 21:26:07 +02:00
Ettore Di Giacinto
7a78e4f482 fix(backends gallery): meta packages do not have URIs (#5740)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-06-27 22:23:14 +02:00
Ettore Di Giacinto
6f41a6f934 fix(backends gallery): correctly identify gpu vendor (#5739)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-06-27 22:22:58 +02:00
Ettore Di Giacinto
bb54f2da2b feat(gallery): automatically install missing backends along models (#5736)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-06-27 18:25:44 +02:00
Ettore Di Giacinto
bcccee3909 fix(backends gallery): delete dangling dirs if installation failed (#5729)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-06-26 17:38:03 +02:00
Ettore Di Giacinto
a6d9988e84 feat(backend gallery): add meta packages (#5696)
* feat(backend gallery): add meta packages

So we can have meta packages such as "vllm" that automatically install
the corresponding package depending on the GPU currently
detected in the system.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: use a metadata file

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-06-24 17:08:27 +02:00
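
One way to picture the meta packages from a6d9988e84 above is a lookup from detected capability to a concrete backend. The sketch below is illustrative only: the `metaBackend` type, the `"default"` fallback entry, and the concrete package names are assumptions, not the gallery's actual schema or contents.

```go
package main

import "fmt"

// metaBackend maps a detected capability to a concrete backend name.
// Hypothetical type: the real gallery entry format may differ.
type metaBackend struct {
	Name          string
	CapabilityMap map[string]string
}

// resolve returns the concrete backend for the given capability. It falls
// back to the "default" entry (an assumption of this sketch) when the
// capability has no dedicated package.
func (m metaBackend) resolve(capability string) (string, bool) {
	if name, ok := m.CapabilityMap[capability]; ok {
		return name, true
	}
	name, ok := m.CapabilityMap["default"]
	return name, ok
}

func main() {
	vllm := metaBackend{
		Name: "vllm",
		CapabilityMap: map[string]string{
			// Hypothetical concrete package names.
			"nvidia":  "cuda-vllm",
			"amd":     "rocm-vllm",
			"default": "cpu-vllm",
		},
	}

	name, ok := vllm.resolve("amd")
	fmt.Println(name, ok) // rocm-vllm true

	name, ok = vllm.resolve("unknown-gpu")
	fmt.Println(name, ok) // cpu-vllm true
}
```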
Ettore Di Giacinto
efde0eaf83 feat(backend gallery): display download progress (#5687)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-06-18 23:49:44 +02:00
Ettore Di Giacinto
9bcf4c56f1 fix(backends gallery): propagate p2p settings to correctly draw menu (#5684)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-06-18 22:06:12 +02:00
FT
1f29b5f38e Fix Typos and Improve Documentation Clarity (#5648)
* Update p2p.go

Signed-off-by: FT <140458077+zeevick10@users.noreply.github.com>

* Update GPU-acceleration.md

Signed-off-by: FT <140458077+zeevick10@users.noreply.github.com>

---------

Signed-off-by: FT <140458077+zeevick10@users.noreply.github.com>
2025-06-15 16:04:44 +02:00
Ettore Di Giacinto
2d64269763 feat: Add backend gallery (#5607)
* feat: Add backend gallery

This PR adds support for managing backends similarly to models. A
backend gallery is now available, which can be used to install and remove
extra backends.
The backend gallery can be configured similarly to a model gallery, and
API calls allow installing and removing backends at runtime, as
well as during the startup phase of LocalAI.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add backends docs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* wip: Backend Dockerfile for python backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: drop extras images, build python backends separately

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixup on all backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* test CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Tweaks

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop old backends leftovers

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixup CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Move dockerfile upper

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fix proto

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Feature dropped for consistency - we prefer model galleries

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add missing packages in the build image

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* exllama is only available on cublas

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* pin torch on chatterbox

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups to index

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Debug CI

* Install accelerators deps

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add target arch

* Add cuda minor version

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Use self-hosted runners

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: use quay for test images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups for vllm and chatterbox

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Small fixups on CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chatterbox is only available for nvidia

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Simplify CI builds

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Adapt test, use qwen3

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(model gallery): add jina-reranker-v1-tiny-en-gguf

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(gguf-parser): recover from potential panics that can happen while reading ggufs with gguf-parser

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Use reranker from llama.cpp in AIO images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Limit concurrent jobs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-06-15 14:56:52 +02:00
fuder.eth
eb8c29f90a Minor Documentation Updates: Clarified Comments in Python and Go Files (#5641)
* Update ui.go

Signed-off-by: fuder.eth <139509124+vtjl10@users.noreply.github.com>

* Update backend.py

Signed-off-by: fuder.eth <139509124+vtjl10@users.noreply.github.com>

---------

Signed-off-by: fuder.eth <139509124+vtjl10@users.noreply.github.com>
2025-06-13 19:55:25 +02:00
Richard Palethorpe
d650647db9 fix(realtime): Use updated model on session update (#5604)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2025-06-09 00:11:05 +02:00
Ettore Di Giacinto
8472321a81 feat(ui): display thinking tags appropriately (#5540)
* fix(streaming): stream complete runes

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(ui): display thinking tags separately

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-05-31 08:50:46 +02:00
Ettore Di Giacinto
3bac4724ac fix(streaming): stream complete runes (#5539)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-05-31 08:48:05 +02:00
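
Streaming complete runes, as in 3bac4724ac above, generally means holding back the trailing bytes of a partial UTF-8 sequence until the rest arrives. The following is a generic sketch of that technique using the standard `unicode/utf8` package; `runeBuffer` is a hypothetical type and this is not the code from the commit.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeBuffer accumulates raw bytes and releases only complete UTF-8 sequences.
type runeBuffer struct {
	pending []byte
}

// Write appends chunk and returns the longest prefix made of complete runes.
// Trailing bytes of an incomplete rune stay buffered for the next call.
// Invalid bytes are passed through unchanged, since utf8.FullRune treats an
// invalid encoding as a full (width-1) rune.
func (b *runeBuffer) Write(chunk []byte) string {
	b.pending = append(b.pending, chunk...)
	n := 0
	for n < len(b.pending) && utf8.FullRune(b.pending[n:]) {
		_, size := utf8.DecodeRune(b.pending[n:])
		n += size
	}
	out := string(b.pending[:n])
	b.pending = append(b.pending[:0], b.pending[n:]...)
	return out
}

func main() {
	var buf runeBuffer
	// "é" is encoded as the two bytes 0xC3 0xA9; here it is split across chunks.
	fmt.Printf("%q\n", buf.Write([]byte{'h', 0xC3})) // "h"
	fmt.Printf("%q\n", buf.Write([]byte{0xA9}))      // "é"
}
```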
Ettore Di Giacinto
59db154cbc feat(ui): allow to upload PDF and text files, also add support to multiple input files (#5538)
* Support file inputs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: support multiple files

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* show preview of files

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-05-31 08:47:48 +02:00