* feat(backend gallery): add meta packages
So we can have meta packages such as "vllm" that automatically install
the corresponding package depending on the GPU detected in the
system.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
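A minimal sketch of how a meta package could resolve to a concrete one. The `MetaBackend` type, the `CapabilityMap` field, the `"default"` fallback key and the backend names are all hypothetical and only illustrate the idea; they are not taken from the LocalAI code.

```go
package main

// MetaBackend is a hypothetical gallery entry that ships no payload itself.
// It only maps a detected capability to the name of a concrete backend.
type MetaBackend struct {
	Name          string
	CapabilityMap map[string]string // illustrative, e.g. "nvidia" -> "cuda12-vllm"
}

// Resolve returns the concrete backend for the detected capability.
// When there is no exact match it falls back to the "default" entry, if any.
func (m MetaBackend) Resolve(capability string) (string, bool) {
	if name, ok := m.CapabilityMap[capability]; ok {
		return name, true
	}
	name, ok := m.CapabilityMap["default"]
	return name, ok
}
```

An installer would call `Resolve` with whatever GPU detection reports and then install the returned package in place of the meta package.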
* feat: use a metadata file
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: Add backend gallery
This PR adds support for managing backends in the same way as models. A
backend gallery is now available and can be used to install and remove
extra backends.
The backend gallery can be configured in the same way as a model gallery,
and API calls allow installing and removing backends at runtime as well
as during the startup phase of LocalAI.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add backends docs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* wip: Backend Dockerfile for python backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: drop extras images, build python backends separately
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fixup on all backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* test CI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Tweaks
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Drop old backends leftovers
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixup CI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Move dockerfile upper
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fix proto
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Feature dropped for consistency - we prefer model galleries
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add missing packages in the build image
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* exllama is only available on cublas
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* pin torch on chatterbox
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixups to index
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* CI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Debug CI
* Install accelerator deps
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add target arch
* Add cuda minor version
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Use self-hosted runners
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* ci: use quay for test images
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fixups for vllm and chatterbox
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Small fixups on CI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chatterbox is only available for nvidia
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Simplify CI builds
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Adapt test, use qwen3
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(model gallery): add jina-reranker-v1-tiny-en-gguf
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix(gguf-parser): recover from potential panics that can happen while reading ggufs with gguf-parser
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
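The general Go pattern for this kind of fix is a deferred `recover` that turns a panic into an ordinary error. The sketch below assumes a stand-in `parse` callback rather than the real gguf-parser call, and the name `safeParse` is hypothetical.

```go
package main

import "fmt"

// safeParse runs parse and converts any panic it raises into an error.
// parse is a placeholder for the call into the GGUF parsing library.
func safeParse(parse func() (int, error)) (n int, err error) {
	defer func() {
		if r := recover(); r != nil {
			// Overwrite the named result so the caller sees an error
			// instead of the process crashing.
			err = fmt.Errorf("recovered from panic while parsing: %v", r)
		}
	}()
	return parse()
}
```

Because the results are named, the deferred function can set `err` after the panic has unwound the call to `parse`.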
* Use reranker from llama.cpp in AIO images
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Limit concurrent jobs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
* Support file inputs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: support multiple files
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* show preview of files
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(realtime): Initial Realtime API implementation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore: go mod tidy
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* feat: Implement transcription only mode for realtime API
Reduce the scope of the realtime API for the initial release and make
transcription only mode functional.
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* chore(build): Build backends on a separate layer to speed up core only changes
Signed-off-by: Richard Palethorpe <io@richiejp.com>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Richard Palethorpe <io@richiejp.com>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
* fix(embed): use go-rice for large backend assets
Golang embed FS has a hard limit that we might exceed when providing
many binary alternatives.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* simplify golang deps
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(tests): switch to testcontainers and print logs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix(tests): do not build a test binary
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* small fixup
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(ui): drop set api key button
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(ui): show in-progress installs in model view
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(ui): improve text to image view
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
This PR completely changes the UI look and feel. It updates all
sections and also makes the UI mobile-ready.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(ui): show more information in the chat view, minor adjustments to model gallery
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix(ui): UI improvements
Visual improvements and bugfixes including:
- disable pagination during search
- fix scrolling on new message
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(ui): improve index
- Redirect to the chat view when clicking on a model
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Display chat icon next to the model
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
The GGML format is now dead. Since the next version of LocalAI already
brings many breaking compatibility changes, we take the occasion to
also drop ggml (pre-gguf) support.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
When hitting /models/available we are interested in the model
description, name and a small amount of metadata. Configuration and
overrides are internals that are required only for installation.
This also fixes a current bug where hitting /models/available fails if
one of the gallery items has overrides with parameters defined.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
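One way to picture the change is a separate summary type for the listing response, so installation-only fields never reach the JSON encoder. The type and field names below (`galleryModel`, `modelSummary`, `ConfigFile`, `Overrides`) are hypothetical and are not the actual LocalAI structs.

```go
package main

import "encoding/json"

// galleryModel is a hypothetical full gallery entry.
type galleryModel struct {
	Name        string
	Description string
	Tags        []string
	ConfigFile  map[string]any // needed only at installation time
	Overrides   map[string]any // needed only at installation time
}

// modelSummary is the reduced view a listing endpoint would return.
type modelSummary struct {
	Name        string   `json:"name"`
	Description string   `json:"description"`
	Tags        []string `json:"tags,omitempty"`
}

// summarize copies only the public metadata of each model.
func summarize(models []galleryModel) []modelSummary {
	out := make([]modelSummary, 0, len(models))
	for _, m := range models {
		out = append(out, modelSummary{Name: m.Name, Description: m.Description, Tags: m.Tags})
	}
	return out
}

// marshalSummaries encodes the reduced view as JSON.
func marshalSummaries(models []galleryModel) (string, error) {
	b, err := json.Marshal(summarize(models))
	return string(b), err
}
```

Since overrides are never serialized in this sketch, their contents cannot make the listing fail.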
* chore(stablediffusion-ncn): drop in favor of ggml implementation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(ci): drop stablediffusion build
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(tests): add
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(tests): try to fixup current tests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Try to fix tests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Tests improvements
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(tests): use quality to specify step
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(tests): switch to sd-1.5
also increase prep time for downloading models
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* merge sentencetransformers
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add alias to silently redirect sentencetransformers to transformers
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add alias also for transformers-musicgen
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
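A backend alias can be as small as a lookup table consulted before a backend is loaded. This is a sketch only: `backendAliases` and `resolveBackend` are hypothetical names, and mapping `transformers-musicgen` to `transformers` is an assumption based on the surrounding commits.

```go
package main

// backendAliases maps retired backend names to their replacement.
// The targets are assumptions for illustration.
var backendAliases = map[string]string{
	"sentencetransformers":  "transformers",
	"transformers-musicgen": "transformers",
}

// resolveBackend returns the replacement for an aliased name and
// leaves every other name untouched.
func resolveBackend(name string) string {
	if target, ok := backendAliases[name]; ok {
		return target
	}
	return name
}
```

Existing model configurations that still name the old backend keep working because the redirect happens silently at lookup time.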
* Drop from makefile
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Move tests from sentencetransformers
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Remove sentencetransformers
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Remove tests from CI (part of transformers)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Do not always try to load the tokenizer
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Adapt tests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fix typo
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Tiny adjustments
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Rename LocalAI-Extra-Usage -> Extra-Usage, add MACHINE_TAG as cli flag option, add docs about extra-usage and machine-tag
Signed-off-by: mintyleaf <mintyleafdev@gmail.com>
* Add machine tag option, add extraUsage option, grpc-server -> proto -> endpoint extraUsage data is broken for now
Signed-off-by: mintyleaf <mintyleafdev@gmail.com>
* remove redundant timing fields, fix broken timings output
Signed-off-by: mintyleaf <mintyleafdev@gmail.com>
* use middleware for Machine-Tag only if tag is specified
Signed-off-by: mintyleaf <mintyleafdev@gmail.com>
---------
Signed-off-by: mintyleaf <mintyleafdev@gmail.com>
Makes the web app honour the `X-Forwarded-Prefix` HTTP request header that may be sent by a reverse-proxy in order to inform the app that its public routes contain a path prefix.
For instance, this allows serving the webapp via a reverse-proxy/ingress controller under a path prefix such as `/localai/`, while still being able to use the regular LocalAI routes without a prefix when connecting directly to the LocalAI server.
Changes:
* Add new `StripPathPrefix` middleware to strip the path prefix (provided with the `X-Forwarded-Prefix` HTTP request header) from the request path prior to matching the HTTP route.
* Add a `BaseURL` utility function to build the base URL, honouring the `X-Forwarded-Prefix` HTTP request header.
* Generate the derived base URL into the HTML (`head.html` template) as `<base/>` tag.
* Make all webapp-internal URLs (within HTML+JS) relative in order to make the browser resolve them against the `<base/>` URL specified within each HTML page's header.
* Make font URLs within the CSS files relative to the CSS file.
* Generate redirect location URLs using the new `BaseURL` function.
* Use the new `BaseURL` function to generate absolute URLs within gallery JSON responses.
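
The changes above can be sketched with the standard library alone. This is not the code from the PR: it uses `net/http` instead of the fiber framework, and the helpers `normalizePrefix`, `stripPrefix` and `baseURL` are hypothetical. Only the header name and the `StripPathPrefix` name come from the description.

```go
package main

import (
	"net/http"
	"strings"
)

const prefixHeader = "X-Forwarded-Prefix"

// normalizePrefix turns a header value such as "/localai/" into "/localai".
// It returns "" for values that are empty or do not start with "/".
func normalizePrefix(raw string) string {
	p := strings.TrimRight(raw, "/")
	if !strings.HasPrefix(p, "/") {
		return ""
	}
	return p
}

// stripPrefix removes prefix from path only when path lies under it.
func stripPrefix(path, prefix string) string {
	switch {
	case prefix == "":
		return path
	case path == prefix:
		return "/"
	case strings.HasPrefix(path, prefix+"/"):
		return strings.TrimPrefix(path, prefix)
	}
	return path
}

// baseURL builds the value for a <base/> tag; it always ends in "/".
func baseURL(scheme, host, prefix string) string {
	return scheme + "://" + host + prefix + "/"
}

// StripPathPrefix strips the forwarded prefix before route matching.
// It works on a clone so the caller's request is left unmodified.
func StripPathPrefix(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		prefix := normalizePrefix(r.Header.Get(prefixHeader))
		r2 := r.Clone(r.Context())
		r2.URL.Path = stripPrefix(r.URL.Path, prefix)
		next.ServeHTTP(w, r2)
	})
}
```

With relative URLs in the HTML, the browser resolves every link against the `<base/>` value, so the same pages work with and without the prefix. The sketch ignores `URL.RawPath`.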
Closes #3095
TL;DR:
The header-based approach moves the path prefix configuration entirely to the reverse-proxy/ingress, as opposed to having to align the path prefix configuration between LocalAI, the reverse-proxy and potentially other internal LocalAI clients.
The gofiber swagger handler already supports path prefixes this way, see e2d9e9916d/swagger.go (L79)
Signed-off-by: Max Goltzsche <max.goltzsche@gmail.com>