Policy-Based LLM Routing with Nvidia's Open Source Blueprint

People

David

Idea

Testing Nvidia's v1 LLM Router blueprint as a third approach to intelligent query routing - this time using policy-based task classification instead of semantic similarity or neural network intent matching.

Details

  • Nvidia's v1 LLM Router blueprint (main branch) takes a three-step approach: apply a policy like task or intent classification, use a trained router for that policy, then proxy the request to the right LLM (the first sketch after this list outlines the pattern)
  • This is our third routing prototype in recent weeks - following experiments with LiteLLM's semantic-router and UIUC's LLMRouter framework
  • The Nvidia approach is more structured than the other two - it separates the "what kind of task is this" decision from the "which model handles it" decision
  • We got it working with Nvidia's toolset, routing requests to local models based on task complexity
  • Our longer-term use case is agricultural edge devices in the field deciding where to send LLM queries - locally, on-site, in-island, or to the cloud
  • Routing decisions need to account for factors like query complexity, available compute, connectivity, and latency requirements (and in the field, environmental and energy considerations as well); the second sketch after this list illustrates that kind of tier decision
  • We're prototyping on desktop and workstation hardware in the office right now
  • Build instructions are being written so deployment also works on lower-powered devices like Raspberry Pi and Nvidia Jetson boards (for edge routing)
  • Having three different routing approaches prototyped gives us a better picture of the tradeoffs between simplicity, accuracy, and configurability
  • Code for this prototype is up at github.com/pickettd/nvidia-llm-router
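
To make the three-step flow concrete, here is a minimal Python sketch of the pattern, not the blueprint's actual code: where the blueprint uses a trained router model for the active policy, this stub classifies by keyword, and the task labels, policy table, model names, and endpoint URL are all placeholders for whatever is being served locally.

```python
# Sketch of policy-based routing: (1) classify the task, (2) look up the
# model for that task under the active policy, (3) proxy the request to an
# OpenAI-compatible endpoint. The blueprint uses a trained router for step 1;
# a keyword stub stands in here. Names and URLs are illustrative only.
import requests

# A "policy" maps task labels to backend models (placeholder names).
TASK_POLICY = {
    "code generation": "local-large-model",
    "summarization": "local-small-model",
    "open qa": "local-small-model",
}
DEFAULT_MODEL = "local-small-model"


def classify_task(prompt: str) -> str:
    """Stand-in for the blueprint's trained router model."""
    lowered = prompt.lower()
    if "def " in prompt or "code" in lowered or "function" in lowered:
        return "code generation"
    if "summarize" in lowered or "tl;dr" in lowered:
        return "summarization"
    return "open qa"


def route(prompt: str, base_url: str = "http://localhost:8000/v1") -> dict:
    task = classify_task(prompt)
    model = TASK_POLICY.get(task, DEFAULT_MODEL)
    # Proxy to the chosen model through an OpenAI-compatible chat endpoint.
    resp = requests.post(
        f"{base_url}/chat/completions",
        json={"model": model, "messages": [{"role": "user", "content": prompt}]},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()


if __name__ == "__main__":
    print(route("Summarize this field report in two sentences: ..."))
```

Even in stub form, the separation shows: swapping the policy table changes which model handles each task without touching the classifier or the proxy step.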
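
For the longer-term edge use case, here is a hypothetical sketch of the tier decision itself. Every field name and threshold is invented for illustration; the real inputs (compute load, connectivity, energy state) would come from the device and the task classifier.

```python
# Hypothetical tier decision for an agricultural edge device. Tier names
# mirror the list above; all thresholds here are invented, not measured.
from dataclasses import dataclass


@dataclass
class Conditions:
    query_complexity: float  # 0.0 (trivial) to 1.0 (hard), from the classifier
    local_compute: float     # fraction of the device's compute currently free
    uplink_ok: bool          # is there connectivity beyond the device?
    latency_budget_ms: int   # how long the caller can wait
    battery_fraction: float  # 0.0 to 1.0; local inference drains it


def pick_tier(c: Conditions) -> str:
    # No uplink: the device must answer locally, whatever the query is.
    if not c.uplink_ok:
        return "local"
    # Easy queries stay local when compute and battery allow.
    if c.query_complexity < 0.3 and c.local_compute > 0.5 and c.battery_fraction > 0.2:
        return "local"
    # Tight latency budgets rule out the long hop to the cloud.
    if c.latency_budget_ms < 500:
        return "on-site"
    # Hard queries go to the biggest model reachable; otherwise stay in-island.
    return "cloud" if c.query_complexity > 0.7 else "in-island"
```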

Read more