On-device · no API key
AI Lab · Semantic data quality

The errors your rules can't see.

Regex, not-null and range checks catch structural errors. They are blind to meaning — near-duplicates that aren't string-identical, values that don't belong in a column, records filed under the wrong label. Semantic Lint runs a sentence-embedding model entirely in your browser to surface exactly those. Nothing leaves your device.

Paste a column of values — one per line

Near-duplicate sensitivity 0.78

How it works. all-MiniLM-L6-v2, running on-device via transformers.js (WASM/WebGPU), turns each value into a 384-dimensional vector. Pairs whose cosine similarity clears the sensitivity threshold are linked into a near-duplicate graph, and its connected components are the clusters. A value whose best match among all other values is still weak is flagged as an outlier. Classical MDS projects the vectors to 2D for the map. No server, no API: the same engine could run inside a CI data-quality gate. Built by John Mikel Regida.
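The clustering and outlier steps can be sketched in a few lines of plain Python. This is a minimal illustration, not the tool's actual code: the function names `cosine` and `lint`, the 0.30 outlier cutoff, and the three-dimensional toy vectors are all assumptions made for the example, and treating the 0.78 sensitivity value as a direct cosine threshold is a guess. Real input would be the 384-dimensional embeddings; the MDS projection is left out.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def lint(vectors, dup_threshold=0.78, outlier_threshold=0.30):
    """Return (clusters, outliers) as lists of row indices.

    dup_threshold: assumed to correspond to the page's sensitivity value.
    outlier_threshold: hypothetical cutoff, chosen only for this sketch.
    """
    n = len(vectors)
    parent = list(range(n))  # union-find forest, one tree per component

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    best = [-1.0] * n  # strongest similarity each row has to any other row
    for i, j in combinations(range(n), 2):
        s = cosine(vectors[i], vectors[j])
        best[i] = max(best[i], s)
        best[j] = max(best[j], s)
        if s >= dup_threshold:
            parent[find(i)] = find(j)  # edge in the near-duplicate graph

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)

    # Connected components with more than one member are near-duplicate clusters.
    clusters = sorted(g for g in groups.values() if len(g) > 1)
    # A row whose best match is still weak is an outlier.
    outliers = [i for i in range(n) if best[i] < outlier_threshold]
    return clusters, outliers


# Made-up toy vectors standing in for real embeddings.
vectors = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.1, 1.0, 0.0],
    [0.2, 0.9, 0.1],
    [0.0, 0.0, 1.0],
]
clusters, outliers = lint(vectors)
print(clusters)  # [[0, 1], [2, 3]]
print(outliers)  # [4]
```

Comparing every pair is quadratic in the number of values, which is fine for a pasted column but would need an approximate nearest-neighbour index for large tables.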
