Setting Up Hermes Agent on Windows with Ollama API Key
May 16, 2026 11 min read Wise Technologies Team
What is Hermes Agent?
Hermes Agent is an open-source AI agent framework that autonomously executes tasks by combining language models with tools like web search, code execution, and file system access. Unlike Hermes 3, which is only a language model, Hermes Agent is a full system that plans, reasons, and acts to accomplish goals. It can write code, browse the web, analyze data, and even debug itself, with every step orchestrated by an LLM backend.
Hermes Agent vs Hermes 3 Model
This is a common point of confusion. Hermes 3 is a language model — a neural network that processes text. Hermes Agent is a software framework that uses a language model (like Hermes 3, Llama 3.1, or GPT-4o) as its brain to make decisions and execute tasks. Think of it this way: Hermes 3 is the engine, Hermes Agent is the car. You need both, but they are completely different things.
Prerequisites
Before starting, ensure you have: Windows 10 or 11 with WSL2 installed, Ubuntu 22.04 or newer inside WSL, Ollama installed and running on Windows (host), at least 16GB RAM (32GB recommended), Python 3.10+ inside WSL, and Git for cloning repositories.
Step 1: Install WSL2 on Windows
Open PowerShell as Administrator and run: "wsl --install". This installs WSL2 with Ubuntu by default. Restart your computer when prompted. After reboot, set up your Ubuntu username and password. Update packages with "sudo apt update && sudo apt upgrade -y". WSL2 is required because Hermes Agent relies on Linux-specific tools and Python packages that do not run natively on Windows.
Step 2: Install Ollama on Windows (Host)
Download Ollama from ollama.com and install it on Windows, not inside WSL. This gives you GPU acceleration through your Windows NVIDIA drivers. Start Ollama and verify it is running by visiting http://localhost:11434 in your browser. You should see "Ollama is running". Pull a capable model: "ollama pull hermes3:latest", or "ollama pull llama3.1:70b" for complex reasoning tasks. The 70B model is a download of roughly 40GB and needs far more memory than the 16GB minimum listed above, so choose it only if your hardware can hold it.
Step 3: Configure Ollama for WSL Access
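By default, Ollama listens only on localhost (127.0.0.1), which WSL2 cannot reach in its default NAT networking mode. You need to bind it to all interfaces. Setting "$env:OLLAMA_HOST" in PowerShell affects only that PowerShell session, so the Ollama app started from the Start menu or system tray will not see it. Set the variable persistently instead: run "setx OLLAMA_HOST "0.0.0.0:11434"" in Windows PowerShell (or add it under System Properties > Environment Variables), then quit Ollama from the system tray and start it again. Inside WSL, find your Windows host IP with "ip route | grep default | awk '{print $3}'". Test the connection: "curl http://$(ip route | grep default | awk '{print $3}'):11434". You should see "Ollama is running".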
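The same check can be scripted from inside WSL. The following is a minimal sketch, not part of Hermes Agent: the function names parse_default_gateway and ollama_is_reachable are made up for illustration, and it assumes the default route's gateway is the Windows host, as it is in WSL2's default NAT mode.

```python
import subprocess
import urllib.request
from typing import Optional


def parse_default_gateway(route_output: str) -> Optional[str]:
    """Return the gateway IP from `ip route` output, or None if there is no default route."""
    for line in route_output.splitlines():
        fields = line.split()
        # A default route line looks like: "default via 172.22.0.1 dev eth0 ..."
        if len(fields) >= 3 and fields[0] == "default" and fields[1] == "via":
            return fields[2]
    return None


def ollama_is_reachable(host_ip: str, port: int = 11434, timeout: float = 3.0) -> bool:
    """Return True if the Ollama root endpoint answers with its usual banner."""
    try:
        with urllib.request.urlopen(f"http://{host_ip}:{port}", timeout=timeout) as resp:
            return b"Ollama is running" in resp.read()
    except OSError:
        # Covers refused connections, timeouts, and HTTP error responses.
        return False


if __name__ == "__main__":
    routes = subprocess.run(
        ["ip", "route"], capture_output=True, text=True, check=True
    ).stdout
    host_ip = parse_default_gateway(routes)
    if host_ip is None:
        print("No default route found; is this running inside WSL2?")
    elif ollama_is_reachable(host_ip):
        print(f"Ollama is reachable at http://{host_ip}:11434")
    else:
        print(f"Could not reach Ollama at {host_ip}:11434; check OLLAMA_HOST and the firewall")
```

Run it with "python3" inside WSL; a failure here points at the Windows side (binding or firewall) rather than at the agent.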
Step 4: Generate an API Key
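Ollama does not natively support API keys, so a production agent deployment should put authentication in front of it. The simplest approach is to install Nginx inside WSL as a reverse proxy with basic auth. Run "sudo apt install nginx apache2-utils". Create a password file: "sudo htpasswd -c /etc/nginx/.htpasswd admin". Configure Nginx to proxy requests to Ollama on the Windows host, using the host IP from Step 3 and port 11434 (localhost inside WSL does not reach the Windows host in the default networking mode), with the auth_basic and auth_basic_user_file directives pointing at that password file. Note that basic auth is a username and password, not an API key in the bearer-token sense. Alternatively, use a Python FastAPI wrapper with API key validation for more control.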
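To illustrate the wrapper idea, here is a rough sketch of an API-key-checking proxy. It deliberately uses only the Python standard library rather than FastAPI so that it runs without extra packages. Everything named here is an assumption for illustration: the environment variables OLLAMA_UPSTREAM and PROXY_API_KEY, the listening port 8080, and the Bearer header scheme. It buffers whole responses, so streamed replies arrive in one piece; treat it as a starting point, not a hardened service.

```python
import hmac
import os
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical settings: inside WSL, point OLLAMA_UPSTREAM at the Windows host IP.
UPSTREAM = os.environ.get("OLLAMA_UPSTREAM", "http://127.0.0.1:11434")
API_KEY = os.environ.get("PROXY_API_KEY", "")


def is_authorized(auth_header, expected_key):
    """Accept only 'Bearer <key>' with a matching key, compared in constant time."""
    if not expected_key or not auth_header:
        return False
    scheme, _, token = auth_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    return hmac.compare_digest(token.strip().encode(), expected_key.encode())


class ProxyHandler(BaseHTTPRequestHandler):
    def _forward(self):
        if not is_authorized(self.headers.get("Authorization"), API_KEY):
            self.send_error(401, "Missing or invalid API key")
            return
        length = int(self.headers.get("Content-Length") or 0)
        body = self.rfile.read(length) if length else None
        upstream_req = urllib.request.Request(
            UPSTREAM + self.path,
            data=body,
            method=self.command,
            headers={"Content-Type": self.headers.get("Content-Type", "application/json")},
        )
        try:
            with urllib.request.urlopen(upstream_req, timeout=300) as resp:
                payload = resp.read()  # whole reply is buffered, not streamed
                status = resp.status
                content_type = resp.headers.get("Content-Type", "application/json")
        except urllib.error.HTTPError as exc:
            # Pass upstream error responses (e.g. 404 for an unknown model) through.
            payload, status = exc.read(), exc.code
            content_type = exc.headers.get("Content-Type", "text/plain")
        except OSError:
            self.send_error(502, "Upstream Ollama is unreachable")
            return
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    do_GET = _forward
    do_POST = _forward


if __name__ == "__main__":
    # Listen on loopback only; clients must send "Authorization: Bearer <key>".
    ThreadingHTTPServer(("127.0.0.1", 8080), ProxyHandler).serve_forever()
```

An empty PROXY_API_KEY rejects every request, so a forgotten setting fails closed rather than leaving the proxy open.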
Step 5: Install Hermes Agent in WSL
Inside WSL, clone the Hermes Agent repository: "git clone https://github.com/hermes-agent/hermes-agent.git" (substitute the project's actual repository URL if it differs). Navigate to the directory and create a Python virtual environment: "python3 -m venv venv && source venv/bin/activate". Install dependencies: "pip install -r requirements.txt". The requirements typically include openai (for API compatibility), requests, beautifulsoup4, and playwright for web browsing.
Step 6: Configure Hermes Agent to Use Ollama
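Edit the agent configuration file (usually config.yaml or .env). Set the base URL to your Windows host IP. Configuration files are read as literal text, so a shell substitution such as "$(ip route ...)" is not expanded there. Run "ip route | grep default | awk '{print $3}'" first and paste the resulting address, for example "OLLAMA_BASE_URL=http://172.22.0.1:11434" (your address will differ). Set the model name: "MODEL=hermes3:latest". If requests go through an authenticating proxy, point the base URL at the proxy instead and supply its credential: "OLLAMA_API_KEY=your-generated-key". With Nginx basic auth the credential is a username and password rather than a single key, so check how the agent passes it. Set temperature to 0.3-0.5 to keep output consistent between runs while leaving some variation. Enable the tools you need: web_search, code_executor, file_manager.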
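Since a .env file is typically read as literal text, one way to avoid pasting mistakes is to generate the lines from an address you have already resolved. The sketch below is only an illustration: render_env is a made-up helper, and the variable names are the ones used in this step. It rejects anything that is not a valid IP address, which catches an unexpanded "$(...)" string before it reaches the file.

```python
import ipaddress


def render_env(host_ip, model="hermes3:latest", api_key=None, port=11434):
    """Build .env content with a literal host IP; raises ValueError on a bad address."""
    ipaddress.ip_address(host_ip)  # validates, e.g. rejects "$(ip route ...)"
    lines = [
        f"OLLAMA_BASE_URL=http://{host_ip}:{port}",
        f"MODEL={model}",
    ]
    if api_key:
        # Only written when a proxy in front of Ollama expects a credential.
        lines.append(f"OLLAMA_API_KEY={api_key}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # 172.22.0.1 is a placeholder; use the address found in Step 3.
    print(render_env("172.22.0.1"), end="")
```

Redirect the output into the agent's .env file, or copy the values into config.yaml if the agent uses that format.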
Step 7: Test Your Agent
Run the agent with a simple task: "python main.py --task 'Find the current weather in Rahim Yar Khan and summarize it'". The agent will plan the task, search the web using its tools, extract weather data, and return a summary, all using your local Ollama backend. Check the logs to verify it is calling your Ollama instance correctly. If you see connection errors, verify the Windows host IP and that Ollama is bound to 0.0.0.0.
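Before blaming the agent for a failed run, it can help to confirm that the backend is reachable and that the configured model has actually been pulled. The following sketch queries Ollama's /api/tags endpoint, which lists installed models. The helper names are invented for illustration, and it assumes OLLAMA_BASE_URL and MODEL are exported in the shell with the values from Step 6.

```python
import json
import os
import urllib.request


def installed_models(tags_payload):
    """Extract model names from the JSON returned by GET /api/tags."""
    return [m["name"] for m in tags_payload.get("models", []) if "name" in m]


def model_is_available(tags_payload, model):
    """Return True if the model is installed; a bare name means the ':latest' tag."""
    wanted = model if ":" in model else f"{model}:latest"
    return wanted in installed_models(tags_payload)


def fetch_tags(base_url, timeout=5.0):
    """Fetch the installed-model list from an Ollama server."""
    with urllib.request.urlopen(f"{base_url.rstrip('/')}/api/tags", timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    base_url = os.environ.get("OLLAMA_BASE_URL", "http://127.0.0.1:11434")
    model = os.environ.get("MODEL", "hermes3:latest")
    try:
        tags = fetch_tags(base_url)
    except OSError as exc:
        raise SystemExit(f"Cannot reach Ollama at {base_url}: {exc}")
    if model_is_available(tags, model):
        print(f"{model} is installed and the backend is reachable")
    else:
        print(f"{model} is not installed; available: {installed_models(tags)}")
```

If this check passes but the agent still fails, the problem is more likely in the agent's own configuration or tools than in the connection to Ollama.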
Security Best Practices
Never expose Ollama directly to the internet without authentication. Use a firewall to block port 11434 from external networks. Rotate API keys regularly. Run the agent inside WSL with limited user permissions — never as root. Monitor Ollama logs for unusual activity. If deploying in a team environment, use separate API keys per developer and log all requests. Consider running Ollama inside Docker with network isolation for additional security.
Troubleshooting Common Issues
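If WSL cannot reach Ollama: verify OLLAMA_HOST is set to 0.0.0.0 and Windows Firewall allows port 11434. If the agent responds slowly: use a smaller model like hermes3:8b, or confirm on Windows that Ollama is actually using the GPU by running "ollama ps" while a model is loaded and checking the processor column. GPU passthrough into WSL is not needed here because Ollama runs on the Windows host, and WSLg is the GUI application layer for WSL, not a GPU compute feature. If tools fail to execute: install missing dependencies inside WSL (playwright browsers, Node.js). If API key errors occur: check Nginx config syntax with "sudo nginx -t" and verify the Authorization header format.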
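For the last item, the header Nginx basic auth expects is the word Basic followed by the Base64 encoding of "username:password". A small sketch of building it by hand, useful for comparing against what the agent sends (basic_auth_header is an illustrative helper, and the credentials shown are placeholders):

```python
import base64


def basic_auth_header(username, password):
    """Return the Authorization header value for HTTP basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


if __name__ == "__main__":
    # Placeholder credentials; use the ones created with htpasswd.
    print(basic_auth_header("admin", "secret"))  # prints: Basic YWRtaW46c2VjcmV0
```

A client that sends the password as a bare token or with a Bearer prefix will be rejected with 401 by a basic-auth proxy, which is a common cause of "API key" errors in this setup.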
Related Reading
If you are new to Ollama, start with our Ollama for Beginners guide. To understand the difference between models and agents, read about Hermes 3 model setup. For cost comparisons between local and cloud AI, see Ollama vs OpenRouter.