Validation model

syntaqlite's validator is a single-pass semantic analyzer. It walks the AST once, resolving names against a layered catalog and emitting diagnostics inline. This page explains the design. For practical usage, see project setup.

Why single-pass

The analyzer dispatches on semantic roles: annotations in the .synq grammar files that tell it what each AST node means. Each role triggers one kind of check or bookkeeping step:

Role	Triggers
`SourceRef`	Table/view reference in FROM, JOIN, INSERT INTO, etc.
`ColumnRef`	Column reference (qualified or unqualified)
`Call`	Function call (checks existence and arity)
`Query`	SELECT body (pushes/pops a scope frame)
`CteScope`	WITH clause (registers CTE bindings)
`DefineTable` / `DefineView`	DDL (accumulates to catalog)

Because all roles are leaf-level operations (look up a name, push a scope, register a definition), they can be handled inline as the AST walk encounters them. There's no need for a separate resolution pass, which keeps the implementation simple and means each node is visited exactly once.
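
The inline handling described above can be sketched roughly as follows. This is a toy model, not syntaqlite's code: the `Node` and `Walker` types, the three roles shown, and the one-entry function table are all assumptions made for illustration.

```rust
// Hypothetical, simplified model of role dispatch; not syntaqlite's real types.
enum Node {
    /// `Query` role: children are walked inside a fresh scope frame.
    Query(Vec<Node>),
    /// `SourceRef` role: a table or view name to look up.
    SourceRef(String),
    /// `Call` role: a function name plus the number of arguments passed.
    Call { name: String, args: usize },
}

#[derive(Default)]
struct Walker {
    known_tables: Vec<String>,
    scope_depth: usize,
    visited: usize,
    diagnostics: Vec<String>,
}

impl Walker {
    /// Visit `node` once, handling its role inline before moving on.
    fn walk(&mut self, node: &Node) {
        self.visited += 1;
        match node {
            Node::Query(children) => {
                self.scope_depth += 1; // push a scope frame
                for child in children {
                    self.walk(child);
                }
                self.scope_depth -= 1; // pop it on the way out
            }
            Node::SourceRef(name) => {
                if !self.known_tables.iter().any(|t| t.eq_ignore_ascii_case(name)) {
                    self.diagnostics.push(format!("unknown table '{}'", name));
                }
            }
            Node::Call { name, args } => {
                // Toy function table: only `abs`, which takes one argument.
                if !name.eq_ignore_ascii_case("abs") {
                    self.diagnostics.push(format!("unknown function '{}'", name));
                } else if *args != 1 {
                    self.diagnostics.push(format!("abs() takes 1 argument, got {}", args));
                }
            }
        }
    }
}
```

Walking a `Query` node with three children increments `visited` four times, and every diagnostic is pushed at the moment its node is encountered, so nothing is deferred to a later pass.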

The catalog

The catalog is where all name information lives. It is layered, and inner layers shadow outer ones: the analyzer searches from the innermost layer outward and takes the first match. The layers, innermost first:

Layer	What it holds	Lifetime
Query	CTEs, subquery aliases, FROM aliases	Per-statement (pushed/popped during walk)
Document	`CREATE TABLE` / `CREATE VIEW` from the current file	Cleared between `analyze()` calls
Connection	DDL accumulated across calls	Persists (Execute mode only)
Database	User-provided schema	Set once by caller
Dialect	Built-in SQLite functions, version/cflag-gated	Set once by caller

This layering is what makes the validator work without a database connection. You provide the schema you care about in the Database layer, the analyzer discovers DDL in the file automatically via the Document layer, and the Dialect layer knows which functions are available for the target SQLite version.

The source is in catalog.rs.
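
To make the shadowing rule concrete, here is a minimal sketch of an innermost-first lookup. It is not taken from catalog.rs: the `Catalog` type, its `define` and `resolve` methods, and the string payloads are hypothetical, and a real catalog would track tables and functions separately.

```rust
use std::collections::HashMap;

/// Layers listed innermost first, mirroring the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Layer {
    Query,
    Document,
    Connection,
    Database,
    Dialect,
}

/// Hypothetical catalog: one name map per layer, stored in search order.
struct Catalog {
    layers: Vec<(Layer, HashMap<String, String>)>,
}

impl Catalog {
    fn new() -> Self {
        let order = [
            Layer::Query,
            Layer::Document,
            Layer::Connection,
            Layer::Database,
            Layer::Dialect,
        ];
        Catalog {
            layers: order.iter().map(|l| (*l, HashMap::new())).collect(),
        }
    }

    /// Register `name` in one layer. Keys are lowercased so lookups are
    /// case-insensitive.
    fn define(&mut self, layer: Layer, name: &str, what: &str) {
        for (l, map) in self.layers.iter_mut() {
            if *l == layer {
                map.insert(name.to_lowercase(), what.to_string());
            }
        }
    }

    /// Search innermost to outermost; the first match shadows the rest.
    fn resolve(&self, name: &str) -> Option<(Layer, &str)> {
        let key = name.to_lowercase();
        self.layers
            .iter()
            .find_map(|(l, map)| map.get(&key).map(|v| (*l, v.as_str())))
    }
}
```

With `users` defined in both the Database and Query layers, `resolve("users")` returns the Query-layer entry, which is the behavior a CTE named after a real table relies on.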

Known vs. unknown columns

When a table is registered with a column list, the validator checks that every referenced column exists in that list. When it is registered with None instead (columns unknown), any column reference to it is accepted. This distinction matters because schema information is often incomplete: you might know from an ORM definition that a table exists without having its full DDL.
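
The two cases map naturally onto an `Option`. The sketch below assumes a hypothetical `TableEntry` type and `check_column` function; only the "column list or None" distinction comes from the text above.

```rust
/// Hypothetical table entry: `None` means the column list is unknown.
struct TableEntry {
    columns: Option<Vec<String>>,
}

#[derive(Debug, PartialEq)]
enum ColumnCheck {
    /// The column is in the known list.
    Found,
    /// The list is known and the column is not in it: report a diagnostic.
    Missing,
    /// No list was provided, so the reference is accepted as-is.
    Unverifiable,
}

fn check_column(table: &TableEntry, column: &str) -> ColumnCheck {
    match &table.columns {
        None => ColumnCheck::Unverifiable,
        Some(cols) if cols.iter().any(|c| c.eq_ignore_ascii_case(column)) => ColumnCheck::Found,
        Some(_) => ColumnCheck::Missing,
    }
}
```

Only the `Missing` outcome would lead to a diagnostic; `Unverifiable` is treated the same as `Found` by the caller.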

Scope resolution

Each SELECT statement gets its own scope frame that tracks which tables are visible. The ValidationPass manages these frames automatically:

Entering a SELECT pushes a new frame
FROM/JOIN clauses register tables (with aliases) into that frame
Column references resolve against the frame's tables
Leaving the SELECT pops the frame

Qualified references (t.col) resolve in the named table only. Unqualified references (col) search all tables in scope. SQLite resolves ambiguous unqualified columns at runtime, so the validator accepts them, matching SQLite's own behavior rather than over-reporting.
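
A rough sketch of the frame lifecycle and the qualified/unqualified rule follows. The `Scopes` and `Frame` types and their method names are invented for this example, and the sketch only consults the innermost frame, which is a simplification.

```rust
/// One frame per SELECT: each entry is (alias, columns of that table).
struct Frame {
    tables: Vec<(String, Vec<String>)>,
}

/// Hypothetical stack of scope frames.
struct Scopes {
    frames: Vec<Frame>,
}

impl Scopes {
    fn enter_select(&mut self) {
        self.frames.push(Frame { tables: Vec::new() });
    }

    fn leave_select(&mut self) {
        self.frames.pop();
    }

    /// Called for each FROM/JOIN item; registers it in the current frame.
    fn register(&mut self, alias: &str, columns: &[&str]) {
        if let Some(frame) = self.frames.last_mut() {
            let cols = columns.iter().map(|c| c.to_string()).collect();
            frame.tables.push((alias.to_string(), cols));
        }
    }

    /// Qualified (`Some("t")`): look only in table `t`.
    /// Unqualified (`None`): accept a match in any table in the frame,
    /// even if more than one table has the column.
    fn resolves(&self, qualifier: Option<&str>, column: &str) -> bool {
        let frame = match self.frames.last() {
            Some(f) => f,
            None => return false,
        };
        frame
            .tables
            .iter()
            .filter(|(alias, _)| qualifier.map_or(true, |q| q.eq_ignore_ascii_case(alias)))
            .any(|(_, cols)| cols.iter().any(|c| c.eq_ignore_ascii_case(column)))
    }
}
```

With tables `u(id, name)` and `o(id, total)` registered, `o.name` fails to resolve, `total` resolves, and the unqualified `id` is accepted even though both tables define it.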

CTE scoping

WITH clauses register CTE bindings before the main query. If the CTE declares a column list (WITH cte(a, b) AS (...)), the declared columns are used for validation and the count is checked against the SELECT's actual output columns. Recursive CTEs work: the CTE name is visible within its own body.
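
The column-count check can be sketched like this. The `Cte` struct and `cte_columns` function are hypothetical, and falling back to the SELECT's own output names when no column list is declared is an assumption of this sketch rather than something stated above.

```rust
/// Hypothetical CTE binding.
struct Cte {
    name: String,
    /// Columns from `WITH name(a, b) AS (...)`, if a list was declared.
    declared: Option<Vec<String>>,
    /// Output column names of the CTE's SELECT body.
    select_outputs: Vec<String>,
}

/// Returns the column names the outer query validates against, or an
/// error message when the declared count does not match the SELECT.
fn cte_columns(cte: &Cte) -> Result<Vec<String>, String> {
    match &cte.declared {
        Some(cols) if cols.len() != cte.select_outputs.len() => Err(format!(
            "CTE '{}' declares {} columns but its SELECT produces {}",
            cte.name,
            cols.len(),
            cte.select_outputs.len()
        )),
        Some(cols) => Ok(cols.clone()),
        // Assumption: with no declared list, use the SELECT's output names.
        None => Ok(cte.select_outputs.clone()),
    }
}
```

For `WITH cte(a, b) AS (SELECT x, y ...)` this yields `a` and `b`; if the body selected only `x`, it would return the mismatch error instead.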

Fuzzy matching

When a name doesn't resolve, the analyzer computes case-insensitive Levenshtein distance against all candidates in scope (fuzzy.rs). If a candidate is within the threshold (default: 2 edits), a "did you mean?" suggestion is attached to the diagnostic.

This applies uniformly to table names, column names, and function names.
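
For readers unfamiliar with the technique, here is a self-contained sketch of case-insensitive Levenshtein matching with a threshold. It is not the code in fuzzy.rs; the function names `levenshtein_ci` and `suggest` are made up for this example, and picking the closest candidate on ties by list order is an assumption.

```rust
/// Case-insensitive Levenshtein distance, computed with one rolling row.
fn levenshtein_ci(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.to_lowercase().chars().collect();
    let b: Vec<char> = b.to_lowercase().chars().collect();
    // row[j] holds the distance between a[..i] and b[..j].
    let mut row: Vec<usize> = (0..=b.len()).collect();
    for i in 1..=a.len() {
        let mut prev_diag = row[0]; // value for (i - 1, j - 1)
        row[0] = i;
        for j in 1..=b.len() {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            let next = (row[j] + 1) // deletion
                .min(row[j - 1] + 1) // insertion
                .min(prev_diag + cost); // substitution or match
            prev_diag = row[j];
            row[j] = next;
        }
    }
    row[b.len()]
}

/// Return the closest candidate within `threshold` edits, if any.
fn suggest<'a>(name: &str, candidates: &[&'a str], threshold: usize) -> Option<&'a str> {
    candidates
        .iter()
        .map(|c| (levenshtein_ci(name, c), *c))
        .filter(|(d, _)| *d <= threshold)
        .min_by_key(|(d, _)| *d)
        .map(|(_, c)| c)
}
```

Because the same function works on any list of strings, one helper can serve table, column, and function names alike: `suggest("usres", &["orders", "users"], 2)` finds `users` at distance 2.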

Diagnostics

Each diagnostic carries a severity, a byte-accurate source span, a human-readable message, and a machine-readable detail enum (UnknownTable, UnknownColumn, UnknownFunction, FunctionArity) for programmatic consumers.

By default, unresolved names produce warnings because the schema might be incomplete. Strict mode (ValidationConfig::with_strict_schema(true)) promotes them to errors. This lets you start with a permissive baseline and tighten validation as your schema coverage improves.
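
The shape of a diagnostic and the effect of strict mode can be modeled as below. The four detail variant names come from the text above; the field names, the variant payloads, and the `unknown_table` helper are assumptions, and the sketch takes a plain `bool` rather than modeling ValidationConfig.

```rust
use std::ops::Range;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    Warning,
    Error,
}

/// Variant names follow the text; the payloads are illustrative.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Detail {
    UnknownTable { name: String },
    UnknownColumn { name: String },
    UnknownFunction { name: String },
    FunctionArity { name: String, expected: usize, got: usize },
}

#[derive(Debug, Clone, PartialEq)]
struct Diagnostic {
    severity: Severity,
    /// Byte offsets into the source text.
    span: Range<usize>,
    message: String,
    detail: Detail,
}

/// Unresolved names are warnings unless strict schema checking is on.
fn unresolved_severity(strict_schema: bool) -> Severity {
    if strict_schema {
        Severity::Error
    } else {
        Severity::Warning
    }
}

/// Build a diagnostic for the first occurrence of `name` in `source`.
fn unknown_table(source: &str, name: &str, strict_schema: bool) -> Option<Diagnostic> {
    let start = source.find(name)?;
    Some(Diagnostic {
        severity: unresolved_severity(strict_schema),
        span: start..start + name.len(),
        message: format!("unknown table '{}'", name),
        detail: Detail::UnknownTable { name: name.to_string() },
    })
}
```

A consumer such as an editor plugin can match on `detail` instead of parsing `message`, and slice the source with `span` to highlight the exact bytes.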

Version and compile-flag awareness

The Dialect layer knows which functions are available in each SQLite version and which require compile-time flags. When you set a target version, functions added after that version are removed from the catalog. This means the validator catches version mismatches the same way it catches typos, as unresolved names with suggestions.

This is the same mechanism described in "Why SQLite's own grammar", extended from syntax to semantics.
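
The version and flag gating described in this section amounts to filtering the function list before lookup. The sketch below uses a hypothetical `FunctionSpec` type and a four-entry sample list; the version numbers shown are believed to match SQLite's release notes but should be treated as illustrative.

```rust
/// Hypothetical entry in the Dialect layer.
struct FunctionSpec {
    name: &'static str,
    /// First SQLite version providing the function: (major, minor, patch).
    since: (u32, u32, u32),
    /// Compile-time flag the function depends on, if any.
    requires_flag: Option<&'static str>,
}

/// Small illustrative sample, not a complete function list.
fn sample_functions() -> Vec<FunctionSpec> {
    vec![
        // (0, 0, 0) stands in for "available in every supported version".
        FunctionSpec { name: "abs", since: (0, 0, 0), requires_flag: None },
        FunctionSpec { name: "iif", since: (3, 32, 0), requires_flag: None },
        FunctionSpec {
            name: "sin",
            since: (3, 35, 0),
            requires_flag: Some("SQLITE_ENABLE_MATH_FUNCTIONS"),
        },
        FunctionSpec { name: "unixepoch", since: (3, 38, 0), requires_flag: None },
    ]
}

/// Keep only functions that exist in `target` and whose flag is enabled.
fn available(
    all: &[FunctionSpec],
    target: (u32, u32, u32),
    flags: &[&str],
) -> Vec<&'static str> {
    all.iter()
        .filter(|f| f.since <= target) // tuples compare lexicographically
        .filter(|f| f.requires_flag.map_or(true, |flag| flags.contains(&flag)))
        .map(|f| f.name)
        .collect()
}
```

Targeting 3.31.0 leaves only `abs` in this sample, so a call to `iif` would surface as an unresolved function name, exactly like a misspelling.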