Deploy Open WebUI on Kubernetes with ArgoCD and Helm and connect to OpenRouter
This article will guide you through the complete process of setting up Open WebUI, one of the best open-source LLM frontends, on Kubernetes with ArgoCD and Helm, and connecting it to OpenRouter for a smart, cost-efficient setup. As a bonus, we'll deploy SearXNG for local web search as well.
This post is a part of the Cost Efficient AI Assistants series on Kubito. Make sure you check out the other posts in this series in the menu above.
If you’ve been thinking of moving away from ChatGPT but don’t know where to start, you’ve come to the right place. This article will explain how to set up Open WebUI, an amazing and feature-rich LLM frontend where you can connect any LLM you want, whether the model runs locally or is hosted externally. It also supports RAG, Web Search, Model Cloning, SSO login, Voice with STT/TTS, and much more!
OpenRouter
Before we begin, let’s create our own OpenRouter API key. This will allow you to use any model available there through a single, unified API. It’s also really great for working around rate limits on models like Anthropic’s Claude 3.5 Sonnet. OpenRouter also has providers that offer free usage of some models, and they are quite generous: you can use models like Llama 3.1 70B or Llama 3.2 90B Vision for free. The free variants are slower, but they’re really nice to have sometimes.
Account setup
Go to OpenRouter, click the Sign In button in the upper right and then the Sign up button. Create the account however you want and log in once that’s done.
Now go to your account settings and configure a few things first if you’d like. I have enabled the Low Balance Notifications email notifications, set them to trigger once my credits drop below $2, and set the default model to Claude 3.5 Sonnet. Now go to Keys and click Create Key. Give it a name, optionally set a credit limit, and once done you’ll get the API key. Save it somewhere really secure, because you won’t be able to see it again later.
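To sanity-check the new key, you can send a request to OpenRouter's OpenAI-compatible endpoint; this is just a sketch, and the model name here is only an example of one of the free variants:

```shell
# Quick sanity check of the key against OpenRouter's
# OpenAI-compatible chat completions endpoint.
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/llama-3.1-70b-instruct:free",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

If the key works, you'll get back a JSON chat completion; the request will also show up later in your usage overview.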
Credits and Cost
Now it’s time to add some credits to your account, though you can also do this later. Go to Credits and click the Add Credits button. Complete the payment process, add as much as you want, and the credits will be available on your account. Keep in mind that Stripe takes a small fee for the payment; in my opinion it’s very much worth it for what you’re getting. For example, I added $20 and $21.43 was taken from my card.
In the long run, this is a better solution for me than paying a fixed price point for a single model on ChatGPT, since I don’t use it daily and I am never getting the full value. OpenRouter allows me to be flexible and use cheaper or more expensive models depending on what I need them for.
You can see the usage in the Credits menu once you’ve interacted with the API, and it’s really nice that it shows you which client used the API, how many tokens you’ve used, and how much they cost. You can check each model’s price per million input/output tokens in the OpenRouter dashboard.
We are finished with the OpenRouter part; let’s proceed with the other components.
Open WebUI
Since we’ll be deploying on Kubernetes, make sure you already have an existing cluster. ArgoCD is optional; you can simply deploy with Helm directly, but this guide will show the process with both.
To deploy it, visit the official Helm chart’s page on ArtifactHub. If you prefer GitHub, you can find the direct link to the Helm repo here. The chart has Ollama, Open WebUI Pipelines, and Apache Tika as optional dependencies (subcharts).
In short: Ollama is an amazing way to run local models and expose them through a single API. Pipelines is a way to extend Open WebUI, and it comes into play when you’re dealing with computationally heavy tasks (e.g., running large models or complex logic) that you want to offload from your main Open WebUI instance for better performance and scalability. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries.
Create the API Key Secret
For both methods, you’ll want to create the API key secret before proceeding. It’s best if you have a secret manager like HashiCorp Vault or Bitnami Sealed Secrets. I won’t go into detail on using these, so if you have one of them or similar, follow its guidelines for adding the secret.
To create this as a simple Kubernetes Secret, prepare your OpenRouter API key and create the secret. We’ll name it open-webui-secret, and the key will be named open-router-api-key:
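A minimal sketch of that command; the open-webui namespace is an assumption, so match it to wherever you plan to install Open WebUI:

```shell
# Create the namespace and the secret holding the OpenRouter API key.
kubectl create namespace open-webui
kubectl create secret generic open-webui-secret \
  --namespace open-webui \
  --from-literal=open-router-api-key='<your-openrouter-api-key>'
```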
SearXNG Deployment (optional)
If you want web search with SearXNG, deploy it first in the same namespace. Feel free to skip this part and remove the SearXNG-related flags from the installation command later if you don’t want the web search feature in Open WebUI.
For this, we’ll use the Kubito SearXNG Helm Chart. The default values are already prepared for Open WebUI integration per their recommended configuration.
To deploy it, simply leave the configuration values as they are, unless you want to set your own secret key in the settings, generated with openssl rand -hex 32. If you’re changing the port to something else in the config part of the values, make sure to set service.port to the same one. Make sure you check the other default values to configure it to your own taste.
Choose whether you want this deployed with Helm or ArgoCD.
Helm
Simply run the following commands to install SearXNG using Helm only.
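A hedged sketch of those commands; the repository URL, chart name, and the exact values path for the secret key are assumptions here, so verify them against the chart’s ArtifactHub page:

```shell
# Assumed repo URL, chart name, and values path; verify against
# the Kubito SearXNG chart documentation before running.
helm repo add kubito https://charts.kubito.dev
helm repo update
helm install searxng kubito/searxng \
  --namespace open-webui --create-namespace \
  --set config.server.secret_key="$(openssl rand -hex 32)"
```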
ArgoCD
First, create the following ArgoCD application in YAML format, replacing the secret_key with anything you want, generated with openssl rand -hex 32. Check the application and feel free to change anything you want, like adding sync waves or disabling self-healing:
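A sketch of such an Application; the repoURL, chart name, and values layout are assumptions, so check them against the Kubito SearXNG chart docs:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: searxng
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://charts.kubito.dev   # assumed chart repo URL
    chart: searxng
    targetRevision: "*"                  # pin a chart version in practice
    helm:
      valuesObject:
        config:
          server:
            secret_key: "replace-with-openssl-rand-hex-32"
  destination:
    server: https://kubernetes.default.svc
    namespace: open-webui
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
```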
After this, depending on how your infrastructure is structured, deploy the YAML application. For simplicity, let’s just show how it’s done with pure kubectl:
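Assuming you saved the Application manifest locally (the filename is just an example):

```shell
# Apply the ArgoCD Application to the cluster.
kubectl apply -f searxng-app.yaml
```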
Once done, SearXNG will be deployed.
Open WebUI Deployment
Without further ado, let’s start deploying this. We’ll disable the Ollama, Pipelines, and Tika subcharts for simplicity, since in this series we’ll be using OpenRouter to connect to models. We will be enabling the following things, so replace what’s needed; feel free to disable some of them if you don’t need them, or enable others by looking at the available Helm values and the Open WebUI Environment Variables Configuration:
- Persistence
- OIDC Login Only
- OpenRouter with existing secret
- Web Search with SearXNG
Helm
Simply install with Helm. I’ve written comments between the values so you know which section they belong to.
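A sketch of this installation using a values file, so each section can be commented. The chart repo URL is the official one, but the value keys (ollama/pipelines/tika, persistence, extraEnvVars) and the environment variable names are taken from the chart and Open WebUI docs at the time of writing, and the OIDC values and SearXNG service URL are placeholders for your environment; verify everything against the chart’s default values before applying:

```shell
cat > open-webui-values.yaml <<'EOF'
# Disable unused subcharts
ollama:
  enabled: false
pipelines:
  enabled: false
tika:
  enabled: false
# Persistence
persistence:
  enabled: true
  size: 2Gi
extraEnvVars:
  # OpenRouter via the OpenAI-compatible API
  - name: OPENAI_API_BASE_URL
    value: https://openrouter.ai/api/v1
  - name: OPENAI_API_KEY
    valueFrom:
      secretKeyRef:
        name: open-webui-secret
        key: open-router-api-key
  # OIDC login only (placeholder provider values)
  - name: ENABLE_LOGIN_FORM
    value: "false"
  - name: ENABLE_OAUTH_SIGNUP
    value: "true"
  - name: OAUTH_CLIENT_ID
    value: your-client-id
  - name: OAUTH_CLIENT_SECRET
    value: your-client-secret
  - name: OPENID_PROVIDER_URL
    value: https://idp.example.com/.well-known/openid-configuration
  # Web search with SearXNG (assumed service name and port)
  - name: ENABLE_RAG_WEB_SEARCH
    value: "true"
  - name: RAG_WEB_SEARCH_ENGINE
    value: searxng
  - name: SEARXNG_QUERY_URL
    value: "http://searxng.open-webui.svc.cluster.local:8080/search?q=<query>"
EOF

helm repo add open-webui https://helm.openwebui.com
helm repo update
helm install open-webui open-webui/open-webui \
  --namespace open-webui --create-namespace \
  -f open-webui-values.yaml
```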
ArgoCD
Same as with SearXNG’s app, create the YAML file with the configuration. Check the application and feel free to change anything you want, like adding sync waves or disabling self-healing. I’ve written comments between the values so you know which section they belong to:
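A sketch of such an Application; the repo URL is the official Open WebUI chart repo, but the key names under valuesObject and the environment variable names are assumptions based on the chart and Open WebUI docs (OIDC values are omitted for brevity, and the SearXNG service URL is a placeholder):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: open-webui
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://helm.openwebui.com
    chart: open-webui
    targetRevision: "*"   # pin a chart version in practice
    helm:
      valuesObject:
        # Disable unused subcharts
        ollama:
          enabled: false
        pipelines:
          enabled: false
        tika:
          enabled: false
        # Persistence
        persistence:
          enabled: true
        extraEnvVars:
          # OpenRouter via the OpenAI-compatible API
          - name: OPENAI_API_BASE_URL
            value: https://openrouter.ai/api/v1
          - name: OPENAI_API_KEY
            valueFrom:
              secretKeyRef:
                name: open-webui-secret
                key: open-router-api-key
          # Web search with SearXNG (assumed service name and port)
          - name: ENABLE_RAG_WEB_SEARCH
            value: "true"
          - name: RAG_WEB_SEARCH_ENGINE
            value: searxng
          - name: SEARXNG_QUERY_URL
            value: "http://searxng.open-webui.svc.cluster.local:8080/search?q=<query>"
  destination:
    server: https://kubernetes.default.svc
    namespace: open-webui
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
```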
After this, depending on how your infrastructure is structured, deploy the YAML application. For simplicity, let’s just show how it’s done with pure kubectl:
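Assuming you saved the Application manifest locally (the filename is just an example):

```shell
# Apply the ArgoCD Application to the cluster.
kubectl apply -f open-webui-app.yaml
```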
Once done, Open WebUI will be deployed.
Open WebUI Usage
Now that everything is set up and deployed, we can proceed with the usage. Since most things were already configured earlier, there isn’t a lot left to do. You can enable Ingress or Gateway API access to the instance, or simply port-forward it to test it out; it’s up to you. This guide won’t go into detail on how to access the instance from outside the cluster.
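For a quick test, a port-forward along these lines works; the service name and port are assumptions, so check kubectl get svc in your namespace for the actual values:

```shell
# Forward the Open WebUI service to localhost for a quick test.
kubectl port-forward svc/open-webui 8080:80 --namespace open-webui
```

Then open http://localhost:8080 in your browser.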
Once you’ve opened the instance in your browser, depending on how you configured the login process (either user/password or SSO login), you need to register. The first user to log in always gets Admin access!
The UI is very simple, and yet much more configurable than the other options I’ve found. If the OpenRouter connection is successful, in the upper left corner you can choose a model, and there will be a lot of them, so simply choose one and start interacting with any LLM you want!
In the Workspace tab below New Chat, you can see the list of models and rename them, set logos for them, add functions and tools, and much more.
In the user settings, you can set defaults for anything you want, including a system prompt applied to every model, although you can also set a system prompt per model in the Workspace/Models tab. I’ve set a single global system prompt myself and found great results with it on any model.
All in all, explore the UI, the user settings, the admin dashboard, and I am sure you’ll get used to it very quickly and have a great experience like I do.
Conclusion
This whole setup helped me explore many different models in a cost-effective and simple way. I am hosting all of this on my Raspberry Pi Kubernetes cluster without any issues, so I hope it runs just as smoothly for you in any kind of environment. The open-source nature of the tools in this guide lets you see what’s happening in the code at all times, which is a big plus for me personally. Have fun using Open WebUI, and make sure to check the next guide in this series for configuring and using Aider as your personal AI assistant, all with the same OpenRouter API key!
If you find this post helpful, please consider supporting the blog. Your contributions help sustain the development and sharing of great content. Your support is greatly appreciated!
Buy Me a Coffee