The fastest and easiest way to run AI locally

Local AI, built from the ground up. Trillim gives you total control over your inference without ever leaving your hardware.

Install

Full install guide

Performance

Fastest BitNet Inference Engine

Read on Trillim's Tokens
Prefill (tokens/sec)

Quantization  DarkNet  bitnet.cpp
q4_0          190.3    167.7
q5_0          189.0    165.5
q6_k          188.7    169.2
q8_0          190.0    166.5

Decode (tokens/sec)

Quantization  DarkNet  bitnet.cpp
q4_0          41.2     32.8
q5_0          39.5     40.6
q6_k          38.0     40.2
q8_0          34.9     31.7

*Benchmarked on a 12th Gen Intel i7-1255U with 10 threads

Docs

Install, chat, and more

Open documentation

Trillim's Tokens

Benchmarking DarkNet against bitnet.cpp

Open blog