Open source

Open source is how we build.

We don't think inference infrastructure should be a black box. We put the tools we build to run it in the open, from the gateway to our kernel and optimization work, so you can read them, run them, and make them better. No gated cores, no bait-and-switch.

Explore our GitHub · Star the gateway

Permissive licenses · No gated core · PRs welcome

Our projects

Everything we maintain in the open.

View all on GitHub

Core

Tensormux Gateway

Python

The OpenAI-compatible inference gateway: routing, health-based failover, and observability in one self-hosted binary.

routing, failover, openai-api, observability

Tooling

kernel-skills

TypeScript

A skill library that helps AI coding agents write, optimize, and debug high-performance CUDA and Triton kernels.

cuda, triton, quantization, gpu-kernels, llm-agents

Research

TensorPath

Python

Inference-optimization control plane that picks the best GPU, backend, and quantization for a model, with a Triton kernel-gen layer.

llm-inference, triton, deployment, benchmarking

Featured: Tensormux Gateway

Our flagship, the open front door to the platform.

Route, health-check, and observe your inference backends with a single config file. Run it standalone, or grow into the managed control plane when you need scale.

Quick install

$ docker pull ghcr.io/krxgu/tensormux:latest

Routing: least inflight, EWMA latency, weighted round-robin

Health checking with configurable thresholds

Automatic failover and recovery

OpenAI-compatible API (chat completions + streaming)

Prometheus metrics endpoint

Health and status endpoints

YAML configuration

JSONL audit logging
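
To make the three routing strategies in the list above concrete, here is a minimal Python sketch of how each one could choose a backend. It is an illustration under stated assumptions, not the gateway's actual code: the `Backend` class, the function names, the smoothing factor `alpha=0.3`, and the `healthy` flag are all hypothetical and do not come from the Tensormux source.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Backend:
    """Hypothetical per-backend state; field names are assumptions."""
    name: str
    weight: int = 1        # share of traffic for weighted round-robin
    inflight: int = 0      # requests currently in progress
    ewma_ms: float = 0.0   # smoothed latency estimate (0.0 = no samples yet)
    healthy: bool = True   # would be set by a health checker

    def record_latency(self, sample_ms: float, alpha: float = 0.3) -> None:
        """Fold one latency sample into the exponentially weighted average."""
        if self.ewma_ms == 0.0:
            self.ewma_ms = sample_ms  # first sample seeds the average
        else:
            self.ewma_ms = alpha * sample_ms + (1 - alpha) * self.ewma_ms


def healthy_only(backends):
    """Return the healthy backends, or fail if none are left."""
    live = [b for b in backends if b.healthy]
    if not live:
        raise RuntimeError("no healthy backends")
    return live


def pick_least_inflight(backends):
    """Least inflight: choose the backend with the fewest active requests."""
    return min(healthy_only(backends), key=lambda b: b.inflight)


def pick_lowest_ewma(backends):
    """EWMA latency: choose the backend with the lowest smoothed latency.

    A backend with no samples (0.0) sorts first, so it gets tried early.
    """
    return min(healthy_only(backends), key=lambda b: b.ewma_ms)


class WeightedRoundRobin:
    """Weighted round-robin: each backend gets `weight` slots per cycle."""

    def __init__(self, backends):
        self.backends = backends
        self._tick = count()

    def pick(self):
        # Rebuild the slot list on every call so health changes take effect.
        slots = [b for b in healthy_only(self.backends) for _ in range(b.weight)]
        return slots[next(self._tick) % len(slots)]


if __name__ == "__main__":
    a, b = Backend("a", weight=2), Backend("b", weight=1)
    a.inflight, b.inflight = 3, 1
    print(pick_least_inflight([a, b]).name)  # b has fewer active requests

    a.record_latency(100.0)
    a.record_latency(200.0)
    b.record_latency(150.0)
    print(pick_lowest_ewma([a, b]).name)     # a has the lower smoothed latency

    wrr = WeightedRoundRobin([a, b])
    print([wrr.pick().name for _ in range(4)])
```

Filtering on `healthy` before every pick is one simple way to model failover and recovery: a backend drops out of rotation when it is marked unhealthy and rejoins when the flag flips back. The real gateway may handle this differently.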

Want to build with us?

Star a repo, open an issue, or send a pull request. Everything is permissively licensed, self-hostable, and free of vendor lock-in.

github.com/tensormux