This C++ library is open source, part of the xbid.ai stack. I needed a low-overhead, fast Byte Pair Encoding (BPE) counter accurate enough for billing estimates and strategy comparisons. By skipping OpenAI template overhead we trade exact parity for speed, at only ~1.5% deviation. The tool also supports Google's SentencePiece binary models via a thin wrapper (100% parity).
- C++ BPE counter compatible with .tiktoken (OpenAI) encodings
- Quasi-parity (no templates, <1.5% error)
- 60% faster than OpenAI’s official tiktoken (JS/WASM)
- No dependencies (standard C++20 toolchain)
Our initial code used a naive byte-length heuristic: very fast, but too inaccurate. For xbid-ai, I wanted something more reliable given the nature of our prompts: trading signals are unbounded, and strategy outputs are compared for cost before routing across the multi-LLM/model layer.
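That byte-length heuristic is the classic "one token per ~4 bytes" rule of thumb (the n/4 row in the benchmark below). A minimal sketch of it, for context; the exact constant and rounding in our original code may have differed:

```cpp
#include <cassert>
#include <string>

// Naive token estimate: assume ~4 bytes of UTF-8 per token.
// It ignores the vocabulary, merge rules, and language entirely,
// which is where the ~10% error comes from.
long estimate_tokens(const std::string& text) {
    return static_cast<long>(text.size() / 4);
}
```

It is hard to beat on speed, but the error compounds quickly when comparing strategy outputs whose lengths differ by only a few percent.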
Benchmark (2,628 GPT-4o inputs, 16 MB)
Evaluated on a corpus of 2,628 GPT-4o requests (16.08 MB) collected directly from xbid.ai live calls, with reference values from the OpenAI API usage counters.
| Method | Bias (tokens) | MAE | MAPE | R² |
|---|---|---|---|---|
| Naive heuristic (n/4) | –272 | 312 | 10.6% | 0.889 |
| Native C++ BPE Counter (xbid.ai) | –12 | 12 | 1.48% | 1.0 |
This reduces error from ~10% to ~1.5% while remaining fast and lightweight.
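For reference, the metrics in the table follow the standard definitions (bias = mean signed error, MAE = mean absolute error, MAPE = mean absolute percentage error). A minimal sketch; the toy values in the test are illustrative, not drawn from the benchmark corpus:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Metrics { double bias, mae, mape; };

// bias = mean(est - ref), MAE = mean(|est - ref|),
// MAPE = mean(|est - ref| / ref) * 100.
Metrics evaluate(const std::vector<double>& ref,
                 const std::vector<double>& est) {
    double bias = 0, mae = 0, mape = 0;
    for (std::size_t i = 0; i < ref.size(); ++i) {
        const double e = est[i] - ref[i];
        bias += e;
        mae  += std::fabs(e);
        mape += std::fabs(e) / ref[i];
    }
    const double n = static_cast<double>(ref.size());
    return {bias / n, mae / n, 100.0 * mape / n};
}
```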
IPC server mode
The tool provides a tiny IPC server so you can preload models, keep the process alive, and stream requests over stdin/stdout. This is especially useful for high-throughput or multi-LLM pipelines, as it avoids the overhead of spawning a new process per request.
xbid.ai uses this mode; you can find our client implementation in the xbid-ai repo.
```shell
# OpenAI BPE
./tokkit --provider openai --model /data/o200k_base.tiktoken --serve

# SentencePiece (built with -DSENTENCEPIECE=1)
./tokkit --provider sentencepiece --model /data/tokenizer.model --serve
```
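To illustrate the client side, here is a hedged sketch assuming a simple line-oriented framing (one request per line in, one decimal token count per line out). The actual wire protocol is defined by the client implementation in the xbid-ai repo, so treat this framing as a hypothetical stand-in:

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Send one text payload and read back a token count, given the server's
// stdin/stdout as FILE* streams. ASSUMPTION: one request per line,
// one decimal count per line -- not the actual tokkit protocol.
long count_tokens(FILE* to_server, FILE* from_server, const std::string& text) {
    std::string line = text;
    // Flatten embedded newlines so the request stays on one line.
    for (char& c : line) if (c == '\n') c = ' ';
    std::fputs(line.c_str(), to_server);
    std::fputc('\n', to_server);
    std::fflush(to_server);

    char buf[64];
    if (!std::fgets(buf, sizeof buf, from_server))
        throw std::runtime_error("server closed pipe");
    return std::strtol(buf, nullptr, 10);
}
```

In practice you would wire `to_server`/`from_server` to the child process's stdin/stdout (e.g. `pipe(2)` plus fork/exec), keeping the model resident across requests.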
Open source
This work is open source under the MIT license. Pull requests, issues, and discussions are always welcome.
If you want to explore further:
- SentencePiece: Google's tokenizer library.
- OpenAI tiktoken: the official tokenizer for GPT models.
Note: the best place to find SentencePiece models (`.model`) is the Hugging Face model hub.
xbid-ai is an ongoing experiment in onchain intelligence.