Performance & Benchmarks
Reproducible numbers: token efficiency vs React, layout throughput, and the frame profiler — all runnable with `vel bench`.
Vel makes three performance claims, and ships a benchmark for each so you can verify them yourself. Everything below is reproducible on your machine with one command.
All numbers were measured on an Apple M1 Pro. Absolute timings vary by
machine — what’s stable is the ratio and the before/after. Run the
suite locally with vel bench to get your own figures.
TL;DR
2.33× fewer tokens
Authoring a UI in .vel costs 2.33× fewer LLM tokens than the same UI
in idiomatic React + Tailwind.
~65–93× faster layout
Text-measure caching cut warm relayout of a 1,000-row tree from 19.6 ms → 0.30 ms, and 10,000 rows from 206 ms → 2.2 ms.
~1 ms frame work
Steady-state CPU work per frame (layout + reactive update) for the showcase is ~1 ms p50, which leaves ample headroom inside a vsync-bound frame.
1. Token efficiency — vel bench token
The headline reason Vel exists: an AI agent (or a human) should be able to describe a UI in as few tokens as possible. Fewer tokens = cheaper, faster, more reliable generation.
Method. Six canonical UIs are written twice — once in .vel, once in
idiomatic React + Tailwind at equal fidelity — and tokenized with o200k_base
(the closest public proxy for modern frontier tokenizers). Every .vel file in
the corpus compiles (velc --check passes), so these are real programs, not
strawmen.
| UI | Vel tokens | React tokens | React / Vel |
|---|---|---|---|
| counter | 108 | 258 | 2.39× |
| login | 159 | 500 | 3.14× |
| pricing | 409 | 801 | 1.96× |
| settings | 307 | 829 | 2.70× |
| dashboard | 275 | 594 | 2.16× |
| profile | 223 | 476 | 2.13× |
| Total | 1,481 | 3,458 | 2.33× |
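The ratio column and the headline figure can be rechecked from the counts alone. The sketch below copies the numbers from the table and recomputes each ratio; the names `COUNTS` and `ratio` are illustrative and not part of `vel bench`. Note that the 2.33× headline is the ratio of the totals, not the mean of the per-UI ratios.

```python
# Token counts copied from the table above, as (Vel, React) pairs.
COUNTS = {
    "counter": (108, 258),
    "login": (159, 500),
    "pricing": (409, 801),
    "settings": (307, 829),
    "dashboard": (275, 594),
    "profile": (223, 476),
}


def ratio(vel: int, react: int) -> float:
    """React tokens per Vel token, rounded to two decimals like the table."""
    return round(react / vel, 2)


vel_total = sum(vel for vel, _ in COUNTS.values())
react_total = sum(react for _, react in COUNTS.values())

for name, (vel, react) in COUNTS.items():
    print(f"{name:<10} {vel:>5} {react:>5} {ratio(vel, react):.2f}x")
print(f"{'total':<10} {vel_total:>5} {react_total:>5} {ratio(vel_total, react_total):.2f}x")
```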
pip3 install tiktoken
vel bench token
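To spot-check a single pair by hand, a minimal sketch along these lines should work. It assumes only the public `tiktoken` API (`get_encoding` and `encode`); the helper names and the idea of passing two file paths are hypothetical, and this is not the code that `vel bench token` actually runs.

```python
from pathlib import Path
from typing import Callable


def count_tokens(text: str, encoding_name: str = "o200k_base") -> int:
    """Count tokens with tiktoken, imported lazily so the helpers load without it."""
    import tiktoken

    enc = tiktoken.get_encoding(encoding_name)
    # disallowed_special=() makes any special-token text in the source
    # count as ordinary text instead of raising an error.
    return len(enc.encode(text, disallowed_special=()))


def token_ratio(
    vel_src: str,
    react_src: str,
    count: Callable[[str], int] = count_tokens,
) -> float:
    """React tokens divided by Vel tokens for one UI."""
    return count(react_src) / count(vel_src)


def compare_files(vel_path: str, react_path: str) -> float:
    """Hypothetical convenience wrapper: read both sources and compare them."""
    vel_src = Path(vel_path).read_text(encoding="utf-8")
    react_src = Path(react_path).read_text(encoding="utf-8")
    return token_ratio(vel_src, react_src)
```

Calling `compare_files` with the paths of one `.vel` file and its React counterpart returns that row's React / Vel ratio. The `count` parameter is injectable so the arithmetic can be exercised without downloading an encoding.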
Why this is a fair comparison
- The React side is idiomatic React + Tailwind, including import, useState, and event handlers (real authored tokens).
- Where React is genuinely terser (e.g. .map() over radio options) it gets that credit; the benchmark does not handicap React.
- The React baseline is raw Tailwind, not a component library. A shadcn/ui baseline would narrow the gap on some cases. Vel’s built-in design system is part of the advantage, and we say so.
2. Layout throughput — vel bench layout
The js-framework-benchmark protocol applied to Vel’s core: build a tree of N rows (swatch + flexible label + action button) and measure tree construction and layout, with no window/GPU/vsync — pure CPU, directly comparable to React’s render + reconcile + layout.
Result (M1 Pro, median)
| rows | build (tree) | layout (cold) | create | relayout (warm) | was (warm) |
|---|---|---|---|---|---|
| 100 | 0.03 ms | 0.04 ms | 0.07 ms | 0.04 ms | 1.9 ms |
| 1,000 | 0.26 ms | 0.30 ms | 0.55 ms | 0.30 ms | 19.6 ms |
| 10,000 | 1.99 ms | 2.22 ms | 4.21 ms | 2.23 ms | 206 ms |
vel bench layout
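The table reports medians. For readers who want to see what a median-of-runs measurement looks like, here is a generic sketch: it is not Vel's harness, `build_rows` is a stand-in that builds plain dictionaries rather than widgets, and the run and warmup counts are arbitrary assumptions.

```python
import statistics
import time
from typing import Callable


def median_ms(work: Callable[[], object], runs: int = 15, warmup: int = 3) -> float:
    """Run `work` a few times untimed, then return the median wall time in ms."""
    for _ in range(warmup):
        work()
    samples = []
    for _ in range(runs):
        start = time.perf_counter_ns()
        work()
        samples.append((time.perf_counter_ns() - start) / 1e6)
    return statistics.median(samples)


def build_rows(n: int) -> list:
    """Stand-in for tree construction: one record per row (swatch, label, action)."""
    return [{"swatch": i % 8, "label": f"Row {i}", "action": "Delete"} for i in range(n)]


for n in (100, 1_000, 10_000):
    print(f"{n:>6} rows: build {median_ms(lambda: build_rows(n)):.3f} ms")
```

The median is used rather than the mean because a single slow run (a scheduler hiccup, a GC pause) would otherwise skew a small sample.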
Tree construction was always fast — C++ allocation builds 10,000 rows of
widgets in 2 ms, where JS createElement would be far slower. The bottleneck
was layout re-measuring every glyph through FreeType on every frame.
The fix: text-measure caching
The win came from two process-lifetime caches in the text rasterizer
(engine/src/text/FreeTypeRasterizer.cpp):
Per-glyph advance cache
Keyed on (face, pixelSize, codepoint). Common characters are loaded once,
so even cold layout of varied text is fast.
Per-string width cache
Keyed on (face, pixelSize, string). Re-measuring unchanged text is now
O(1) — the steady-state case for scrolling and animation.
Glyph advances never change for a given face + size, so the caches live for the process lifetime and never need invalidation. Result: warm relayout of 1,000 rows dropped 19.6 ms → 0.30 ms (~65×); 10,000 rows 206 ms → 2.2 ms (~93×). A 10k-row list now lays out comfortably inside a 60 fps frame — the steady-state interaction budget that Figma-class apps live in.
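The two-level scheme can be illustrated with a small sketch. It is written in Python for brevity and is not the engine's C++ code: the class name, the `load_advance` callback (a stand-in for a FreeType glyph load), and the simplification that a string's width is the plain sum of its glyph advances (no kerning or shaping) are all assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple


class TextMeasureCache:
    """Two caches that are never invalidated, mirroring the description above."""

    def __init__(self, load_advance: Callable[[str, int, int], float]) -> None:
        self._load_advance = load_advance  # slow path, e.g. a rasterizer glyph load
        self._advances: Dict[Tuple[str, int, int], float] = {}
        self._widths: Dict[Tuple[str, int, str], float] = {}

    def advance(self, face: str, pixel_size: int, codepoint: int) -> float:
        """Per-glyph advance, keyed on (face, pixelSize, codepoint)."""
        key = (face, pixel_size, codepoint)
        cached = self._advances.get(key)
        if cached is None:
            cached = self._load_advance(face, pixel_size, codepoint)
            self._advances[key] = cached
        return cached

    def width(self, face: str, pixel_size: int, text: str) -> float:
        """Per-string width, keyed on (face, pixelSize, string)."""
        key = (face, pixel_size, text)
        cached = self._widths.get(key)
        if cached is None:
            # Assumption: width is the sum of advances (ignores kerning/shaping).
            cached = sum(self.advance(face, pixel_size, ord(ch)) for ch in text)
            self._widths[key] = cached
        return cached


loads = {"count": 0}


def fake_load_advance(face: str, pixel_size: int, codepoint: int) -> float:
    """Hypothetical slow measurement that counts how often it is hit."""
    loads["count"] += 1
    return pixel_size * 0.5


cache = TextMeasureCache(fake_load_advance)
first = cache.width("Inter", 16, "hello")   # cold: one load per distinct glyph
second = cache.width("Inter", 16, "hello")  # warm: string-cache hit, no loads
third = cache.width("Inter", 16, "hole")    # new string, but every glyph is cached
print(first, second, third, loads["count"])
```

The second call never reaches the glyph cache at all, which is the warm-relayout case; the third shows why cold layout of new text also benefits once common characters have been seen.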
3. The frame profiler — VEL_PERF
Every Vel app has a built-in frame profiler. Set VEL_PERF and run any app:
VEL_PERF=60 ./build/showcase # report every 60 frames
[vel-perf] n=60 build p50/p99=970/2600us gpu p50/p99=7148/8262us \
frame p50/p95/p99/max=8195/9012/9304/9304us ~122fps
- build: measure + place + tick (layout + reactive update), the clean CPU cost.
- gpu: command encoding plus the vsync wait (so it’s not pure GPU time).
- frame: the whole render pass.
On an M1 Pro the showcase runs vsync-bound at ~120 fps (ProMotion); the meaningful figure is build ≈ 1 ms p50 — comfortable headroom.
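If you want to track these numbers over time, the report line is easy to parse. The sketch below assumes the profiler prints each report on a single line in the format of the sample above (the backslash there is treated as display wrapping); the regular expression and function name are illustrative, not part of Vel.

```python
import re

# The sample report from above, joined onto one line.
SAMPLE = (
    "[vel-perf] n=60 build p50/p99=970/2600us gpu p50/p99=7148/8262us "
    "frame p50/p95/p99/max=8195/9012/9304/9304us ~122fps"
)

# Matches groups such as "build p50/p99=970/2600us".
FIELD = re.compile(r"(\w+) ((?:\w+/)*\w+)=((?:\d+/)*\d+)us")


def parse_vel_perf(line: str) -> dict:
    """Return {"build": {"p50": 970, ...}, ...} with values in microseconds."""
    stats = {}
    for name, labels, values in FIELD.findall(line):
        stats[name] = dict(zip(labels.split("/"), map(int, values.split("/"))))
    return stats


stats = parse_vel_perf(SAMPLE)
fps = 1_000_000 / stats["frame"]["p50"]  # frames per second from the median frame time
build_share = stats["build"]["p50"] / stats["frame"]["p50"]
print(stats["build"], f"{fps:.0f} fps", f"build is {build_share:.0%} of the frame")
```

For the sample line, the median frame time of 8195 µs works out to roughly 122 fps, matching the figure the profiler prints, and the build phase accounts for about an eighth of the frame.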
Reproduce everything
vel bench # token + layout
vel bench token # tokens vs React (needs: pip3 install tiktoken)
vel bench layout # headless build + layout for 100 / 1k / 10k rows
VEL_PERF=60 vel run ./build/showcase # live per-frame profile
The benchmark sources live in benchmarks/
with full methodology and honest caveats.
What’s still open
Performance work is never done. Known next steps: dirty-subtree layout (skip re-measuring unchanged subtrees), Flex intrinsic-size caching, and a startup/bundle benchmark for the web (WASM size, time-to-interactive) — Vel’s biggest remaining unknown vs React on the web.