Salt: Systems programming, mathematically verified

Source: https://salt-lang.dev/

Salt delivers exact performance parity with `clang -O3`

Verified safety without the runtime cost.

Z3-proven contracts at build time.

No GC. No borrow checker.

Mathematical certainty.

hello.salt

package main

use std.collections.HashMap

fn main() { let mut map = HashMap<StringView, i64>::new(); map.insert("hello", 1); map.insert("world", 2);

// Pipe + optional chaining map.get("hello") |?> println(f"Found: {_}");

// Iterate with f-strings for entry in map.iter() { println(f"{entry.key}: {entry.value}"); } }

Basalt

A complete Llama 2 inference engine. Weights are memory-mapped directly from disk, matrix multiplications are tiled through MLIR's affine optimizer, and every compute kernel carries a Z3 proof that its array accesses are in bounds. The whole thing is about 600 lines of Salt.

~870

tok/s

≈C

vs llama2.c

~600

lines of code

verified kernels

View source →

kernels.salt

// Z3-verified matrix multiply fn mat_mul( out: Ptr<f32>, a: Ptr<f32>, b: Ptr<f32>, m: i64, n: i64, d: i64 ) requires(m > 0, n > 0, d > 0) { for i in 0..m { for j in 0..n { let mut sum: f32 = 0.0; for k in 0..d { sum = sum + a[i*d+k] * b[k*n+j]; } out[i * n + j] = sum; } } }

kmain.salt

// KeuOS kernel entry point @no_mangle pub fn kmain(magic: u32, mb: &u8) { serial.init(); gdt.init(); idt.init(); pit.init(); run_smp_tests(); // ACPI → APIC → AP boot percpu_init_bsp(); scheduler.init();

serial.print("KEUOS BOOT [OK]\n"); scheduler.start(); // Preemptive scheduling is live bench_suite_run(); }

KeuOS

A keuos microkernel that boots on real x86 hardware, written from scratch in Salt. 16-core SMP with Chase-Lev work-stealing, ACPI/APIC, preemptive scheduler with 7-field fiber migration, per-core sharded allocators, a zero-trap socket API where `read()` and `write()` are pure shared-memory operations with no kernel traps, a Ring 3 TCP stack hardened with stateless SYN cookies (SipHash-2-4), and epoch-based memory reclamation. No C runtime, no libc, no external dependencies.

cycle alloc

487

cycle ctx switch

297

cycle IPC

KVM

bare metal

View source →

59 cycles

Arena Allocation

~15 ns @ 4.0 GHz · glibc malloc ≈ 1,000–5,000

487 cycles

Context Switch (16-core SMP)

~122 ns · full GPR + 512B FXSAVE

297 cycles

IPC Ping-Pong

~74 ns · fiber-to-fiber zero-copy yield

60M+ PPS

NetD C10M Data Plane

Ring 3 SPSC bridge · 6× C10M target

102 cycles

Null Syscall (Ring 3→0→3)

~26 ns · SWAPGS + KPTI · 7.4× Linux

136 cycles

Socket Data Plane

per 64B read+write · zero kernel traps

Intel Xeon Platinum 8151 (Skylake, 4.0 GHz), AWS z1d.metal, QEMU/KVM. `-enable-kvm -cpu host`. Full methodology →

KeuOS Train

A neural network that trains on MNIST handwritten digits — entirely in Salt. Weight matrices are arena-allocated, every array access is Z3-verified through Slice bounds, and training time is competitive with the equivalent C implementation. The entire trainer is 188 lines.

97%

accuracy

≈C

training time

188

lines of code

bounds verified

View source →

keuos_train.salt

// Arena + Slice — the safety bridge let arena = Arena::new(4194304); let w1 = Slice<f32>::new( arena.alloc_array::<f32>(HIDDEN * INPUT), HIDDEN * INPUT );

for epoch in 0..8 { for sample in 0..TRAIN_COUNT { matvec(w1, input, pre_relu, HIDDEN, INPUT); relu(hidden.as_ptr(), HIDDEN); softmax_cross_entropy_grad(...); } println(f"Epoch {epoch}: {accuracy}%"); }

Lettuce

A Redis-compatible in-memory data store. Leads Redis at every concurrency level tested (1–100 clients, 13-point sweep) using zero-copy RESP parsing, an arena-backed hash table, and a kqueue event loop. 567 total lines of Salt, zero external deps.

1–100

clients tested

all

concurrency wins

567

lines of code

Redis commands

View source →

server.salt

// The entire event loop fn run(&mut self) { let poller = Poller::new(); poller.add(self.listener.fd, 0);

loop { let events = poller.wait(64); for ev in events { if ev.fd == self.listener.fd { self.accept_client(&poller); } else { self.handle_client(ev.fd); } } } }

GPU-accelerated graphics: FACET

A full-stack 2D rendering engine: from Bézier curve flattening to Metal compute shaders. Five composable layers (raster, window, compositor, UI, GPU), and every pixel write carries a Z3 proof that it's within the framebuffer. Algorithm-identical to the C reference: Salt's MLIR codegen matches `clang -O3` at 457 fps.

457

fps

≈C

raster perf

layers

Metal

GPU backend

View source →

demo_tiger.salt

Video 3

raster.salt

// Z3-verified pixel writes pub fn set_pixel( &mut self, x: i32, y: i32, r: u8, g: u8, b: u8, a: u8 ) requires( x >= 0 && x < self.width && y >= 0 && y < self.height ) { let off = (y as i64) * self.stride + (x as i64) * 4; self.pixels.offset(off).write(r); // Z3 proves: off in [0, stride*height) }

C10M

High-connection networking benchmarks measuring how Salt handles raw I/O. Echo servers, arena-based connection pools, noisy-neighbor isolation. Salt is tested against C, Rust, and Node on identical kqueue event loops — it matches C's throughput.

kqueue

event loop

≈C

echo throughput

Arena

memory model

languages tested

View source →

echo_salt.salt

// kqueue echo server in Salt fn main() -> i32 { let listen_fd = echo_server_init(PORT); let kq = echo_kq_create(listen_fd); let buf = malloc(BUF_SIZE);

while true { let n = echo_kq_poll( kq, listen_fd, buf, BUF_SIZE ); if n > 0 { total_bytes = total_bytes + n; } } }

Performance, measured honestly.

Every benchmark uses dynamic inputs and prints results to prevent dead-code elimination. Same hardware, same optimization flags, no tricks. Salt delivers exact performance parity with `clang -O3` across functionally equivalent benchmarks (28 total).

Platform: Apple M4, macOS ARM64. Baselines: C (`clang -O3`), Rust (`--release`). Salt compiled with default flags. Full data is public. Full methodology and raw data →

How it works.

Salt looks like a normal language. Under the hood, three ideas make it unusual.

🔬

Compile-time proofs

You write `requires()` preconditions and `ensures()` postconditions on your functions. The compiler hands these to Z3, an SMT solver from Microsoft Research, and checks every call site and return site. When Z3 can prove the condition holds, the check is elided entirely, at zero runtime cost. When it finds a violation, it reports concrete counterexample values.

🏟️

Arena memory

Instead of garbage collection or borrow checking, Salt uses arenas: all allocations happen in a region, and the region is freed at once. Compiling verifies that no reference outlives its arena.

⚡

MLIR codegen

Salt emits code through MLIR, the same compiler infrastructure behind TensorFlow and PyTorch. The affine dialect enables automatic loop tiling without hand-tuned tile sizes — general matrix multiply runs at parity with `clang -O3`, and ML inference kernels benefit from L1-cache-aware tiling automatically.

Pipe operator

Data flows left to right. Chain transformations with `|>`, propagate errors with `|?>`, and use `_` placeholders to bind arguments wherever you want. It reads the way you think.

🔒

Explicit unsafe boundaries

Raw pointer arithmetic lives inside `unsafe` blocks. Foreign function calls are marked `@trusted`. Everything outside these boundaries is statically verified. You always know where the unverified code is.

🧂

Built-in package manager

`sp new`, `sp build`, `sp add`. Artifacts are content-addressed and cached globally. When you add a dependency, its Z3 contracts are verified against your call sites automatically. Inspired by Cargo.

Why it feels different.

The same task, in C, Rust, and Salt. Toggle between C and Rust on the left to see what Salt replaces.

Memory management

// C: manual, error-prone char *buf = malloc(4096); if (!buf) return -1; read_data(buf, 4096); process(buf); free(buf); // forget this → leak process(buf); // use-after-free → UB// Rust — safe, but verbose let buf: Vec<u8> = Vec::with_capacity(4096); read_data(&mut buf)?; process(&buf); // borrow — ok let owned = buf; // move — buf is gone process(&buf); // ✗ compile error: use of moved value C leaks or corrupts. Rust is safe but ownership fights are real.

// Arena: allocate freely, free once let arena = Arena::new(4096); let buf = arena.alloc(4096); read_data(buf, 4096); process(buf); process(buf); // still valid: arena owns it // arena drops here → everything freed No lifetimes. No manual free. The arena owns it all, and the compiler verifies nothing escapes.

Error handling

// C: return codes, manual checking int fd = open(path, O_RDONLY); if (fd < 0) return -1; int n = read(fd, buf, 4096); if (n < 0) { close(fd); return -1; } int rc = parse(buf, n); if (rc < 0) { close(fd); return -1; }// Rust: safe, but ceremony adds up let mut file = File::open(path)?; let mut buf = vec![0u8; 4096]; let n = file.read(&mut buf)?; let result = parse(&buf[..n]) .map_err(|e| io::Error::new( ErrorKind::InvalidData, e ))?; C forgets to check. Rust is correct but `.map_err` + trait bounds add boilerplate.

// Pipe operator: reads left to right let data = path |> File::open() |> read_all() |> parse();

// Or use Result + match: your choice match data { Result::Ok(val) => process(val), Result::Err(e) => println(f"error: {e}") } One pipeline. Each step returns a Result. No boilerplate, no forgotten checks.

Safety guarantees

// C: nothing stops you int divide(int a, int b) { return a / b; // b == 0? UB. good luck. }

int get(int *arr, int i) { return arr[i]; // out of bounds? UB. }// Rust: runtime panics fn divide(a: i32, b: i32) -> i32 { a / b // b == 0? panic at runtime. }

fn get(arr: &[i32], i: usize) -> i32 { arr[i] // out of bounds? panic at runtime. } C is silent. Rust panics. Both discover the bug only when it happens.

// Contracts: proved before your code runs fn divide(a: i32, b: i32) -> i32 requires(b != 0) { return a / b; }

fn clamp(val: i32) -> i32 ensures(result >= 0 && result <= 100) { if val < 0 { return 0; } if val > 100 { return 100; } return val; } // Z3 checks every call site and return site. // requires: caller must prove. ensures: compiler proves. No runtime cost when Z3 can prove it. Otherwise, the compiler emits a safe runtime check as fallback.

String formatting

// C: format strings, no type checking char msg[256]; sprintf(msg, "%s scored %d in %dms", name, score, elapsed); // wrong format specifier → UB // buffer overflow → UB// Rust: safe, but allocates let msg = format!( "{} scored {} in {}ms", name, score, elapsed ); // Heap allocation every time. // Display trait needed for custom types. C is fast but fragile. Rust is safe but `format!` always allocates.

// F-strings: just works let msg = f"{name} scored {score} in {elapsed}ms";

println(f"result: {x |> square() |> double()}");

// Inline expressions, type-safe, // no format specifiers, no heap allocation. Embeds any expression. Type-checked at compile time. No allocation.

Batteries included.

Over 70 modules, zero external dependencies.

Everything here is written in Salt.

std.collections Vec, HashMap, Slab std.string String, StringView, f-strings std.net TCP, kqueue poller std.http Client & server std.json JSON parse & access std.thread spawn, join std.sync Mutex, AtomicI64 std.io File, BufferedWriter std.math SIMD, transcendentals std.nn relu, softmax, cross_entropy std.crypto TLS bridge std.fs exists, remove, rename std.process Command builder std.regex Pattern matching std.encoding Base64, Hex

Zero to running code in 30 seconds.

Salt ships with `sp`, a package manager designed around the same principles as the language: fast, ergonomic, and verified by default.

terminal

Create a project — instant, no prompts

$ sp new my_app ✨ Created project 'my_app'

Build & run

$ sp run 🧂 Running my_app...

Add a dependency — contracts are shown

$ sp add crypto ✨ Added crypto v0.2.1 📋 Contracts: encrypt requires(key.len() >= 32)

Verify all contracts across package boundaries

$ sp check ✅ All contracts verified (0.2s)

Content-addressed caching

Build artifacts are keyed by the hash of their source, compiler version, and dependencies. If nothing changed, nothing rebuilds. Caches are shared across every project on your machine.

Cross-package verification

When a library declares `requires(key.len() >= 32)`, the compiler verifies that condition at every call site in every downstream package. Safety contracts compose across dependency boundaries.

Workspace support

Monorepos work out of the box. One lockfile, parallel builds, and the compiler only recompiles what actually changed.

Try it.

Three commands, thirty seconds. That's it.

terminal

$ sp new hello && cd hello && sp run ✨ Created project 'hello' 🧂 Hello from hello!