Python API reference

The Python package is pure Python and ships with a bundled syntaqlite CLI binary. Requires Python 3.10+. Wheels are published for macOS (arm64, x86_64), Linux (x86_64, aarch64), and Windows (x86_64).

import syntaqlite

with syntaqlite.Syntaqlite() as sq:
    print(sq.format_sql("select 1"))
    stmts = sq.parse("select * from users")

`syntaqlite.Syntaqlite`

The entry point for all SQL operations. Create one instance and reuse it across many calls — each instance spins up a worker on construction and keeps it alive until close() is called.

syntaqlite.Syntaqlite(*, dialect_path=None, dialect_name=None, binary=None)

Parameter	Type	Default	Description
`dialect_path`	`str \| None`	`None`	Path to a dialect shared library (`.so`/`.dylib`/`.dll`). Defaults to SQLite.
`dialect_name`	`str \| None`	`None`	Dialect name. Required only when `dialect_path` exports more than one dialect.
`binary`	`str \| None`	`None`	Override the path to the `syntaqlite` CLI. Defaults to the binary shipped with the wheel (or `SYNTAQLITE_BIN`).

Supports the context-manager protocol, so `with syntaqlite.Syntaqlite() as sq:` cleans up automatically. Not intended for concurrent use; if you want parallelism, create one instance per thread.
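One way to follow the one-instance-per-thread advice is to keep instances in thread-local storage. The sketch below is a minimal illustration and not part of syntaqlite: `PerThread` is a hypothetical helper, and it takes a factory so the same pattern works with `syntaqlite.Syntaqlite` or any other constructor.

```python
import threading
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class PerThread(Generic[T]):
    """Lazily create one object per thread using the given factory."""

    def __init__(self, factory: Callable[[], T]) -> None:
        self._factory = factory
        self._local = threading.local()

    def get(self) -> T:
        # threading.local() gives every thread its own attribute namespace,
        # so an instance stored here is never shared across threads.
        instance = getattr(self._local, "instance", None)
        if instance is None:
            instance = self._factory()
            self._local.instance = instance
        return instance
```

In real code the factory would be `syntaqlite.Syntaqlite`, for example `instances = PerThread(syntaqlite.Syntaqlite)` followed by `instances.get().format_sql(...)` inside each worker thread. The helper does not close anything, so each thread would still need to call `close()` on its own instance when it finishes.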

`sq.format_sql`

Format SQL with configurable options.

sq.format_sql(sql, *, line_width=80, indent_width=2, keyword_case="upper", semicolons=True)

Parameter	Type	Default	Description
`sql`	`str`	—	SQL to format
`line_width`	`int`	`80`	Maximum line width
`indent_width`	`int`	`2`	Spaces per indent level
`keyword_case`	`str`	`"upper"`	`"upper"` or `"lower"`
`semicolons`	`bool`	`True`	Append semicolons to statements

Returns: str — the formatted SQL.

Raises: FormatError when the input cannot be parsed.

>>> sq.format_sql("select 1")
'SELECT 1;\n'
>>> sq.format_sql("select 1", keyword_case="lower", semicolons=False)
'select 1\n'

`sq.parse`

Parse SQL into a list of typed AST nodes.

sq.parse(sql)

Parameter	Type	Description
`sql`	`str`	SQL to parse (may contain multiple statements)

Returns: list — one typed AST node per statement. Each node is a __slots__ class (e.g. SelectStmt, CreateTableStmt) with typed attributes.

>>> stmt = sq.parse("SELECT 1 + 2 FROM foo")[0]
>>> type(stmt).__name__
'SelectStmt'
>>> stmt.columns[0].expr
BinaryExpr(...)
>>> stmt.from_clause
TableRef(...)
>>> stmt.where_clause is None
True

Enum and flag fields are IntEnum/IntFlag from syntaqlite.enums:

>>> from syntaqlite.enums import BinaryOp
>>> stmt.columns[0].expr.op
<BinaryOp.PLUS: 0>

Node classes support isinstance checks:

>>> from syntaqlite.nodes import SelectStmt
>>> isinstance(stmt, SelectStmt)
True

The parser recovers from errors and continues; any unparseable fragment comes through as an Error node in the list.

`sq.parse_raw`

Same as parse but returns plain JSON-shaped dicts without the typed-class wrapping. Use this for performance-sensitive code that doesn't need attribute-style access.

>>> sq.parse_raw("SELECT 1")[0]["type"]
'SelectStmt'
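Because `parse_raw` returns plain dicts, ordinary tree-walking code works on its output. The sketch below counts node types by looking for a string-valued `"type"` key, the one key shown above. The assumption that child nodes are nested as dict values or lists, and every other field name in the stand-in data, is illustrative rather than taken from syntaqlite's actual output.

```python
from collections import Counter


def count_node_types(tree):
    """Count every dict carrying a string "type" key in a nested structure."""
    counts = Counter()
    stack = [tree]
    while stack:
        item = stack.pop()
        if isinstance(item, dict):
            node_type = item.get("type")
            if isinstance(node_type, str):
                counts[node_type] += 1
            # Descend into all values; children may be dicts or lists.
            stack.extend(item.values())
        elif isinstance(item, list):
            stack.extend(item)
    return counts


# Hand-written stand-in for sq.parse_raw(...) output. Only the "type" key
# is documented; the other field and node names here are hypothetical.
sample = [
    {
        "type": "SelectStmt",
        "columns": [{"type": "ResultColumn", "expr": {"type": "Literal"}}],
        "where_clause": None,
    }
]
print(count_node_types(sample))
```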

`sq.analyze`

Analyze SQL against an optional Schema.

sq.analyze(sql, schema=None, *, output=AnalysisOutput.STRUCTURED, render_options=None)

Parameter	Type	Default	Description
`sql`	`str`	—	SQL to analyze
`schema`	`Schema \| None`	`None`	Catalog schema to analyze against
`output`	`AnalysisOutput \| str`	`STRUCTURED`	Return shape: typed result or rendered string
`render_options`	`RenderOptions \| None`	`None`	Fine-grained options for text rendering (source label, etc.). Ignored unless `output=TEXT`.

Schema can be built from explicit tables/views, from DDL text, or both:

sq.analyze(
    sql,
    syntaqlite.Schema(
        tables=[syntaqlite.Table("users", ["id", "name"])],
        views=[syntaqlite.View("active", ["id"])],
    ),
)

Returns (when output=AnalysisOutput.STRUCTURED): an Analysis.

Returns (when output=AnalysisOutput.TEXT): a str containing the diagnostics rendered with source context, matching the CLI output.

>>> schema = syntaqlite.Schema(tables=[syntaqlite.Table("users", ["id", "name"])])
>>> r = sq.analyze("SELECT id FROM users", schema)
>>> r.diagnostics
[]

>>> # Rendered text output with a custom file label
>>> text = sq.analyze(
...     "SELECT nme FROM users", schema,
...     output=syntaqlite.AnalysisOutput.TEXT,
...     render_options=syntaqlite.RenderOptions(source_name="query.sql"),
... )
>>> print(text)
error: unknown column 'nme'
 --> query.sql:1:8
...

`sq.tokenize`

Tokenize SQL into a list of token dicts.

sq.tokenize(sql)

Parameter	Type	Description
`sql`	`str`	SQL to tokenize

Returns: list[dict] — one entry per token:

Key	Type	Description
`text`	`str`	Token text
`offset`	`int`	Byte offset in the source
`length`	`int`	Length of the token in bytes
`type`	`int`	Internal token type ID
`category`	`str`	`"keyword"`, `"identifier"`, `"string"`, `"number"`, `"operator"`, `"punctuation"`, `"comment"`, `"parameter"`, `"function"`, `"type"`, or `"other"`

>>> sq.tokenize("SELECT 1")
[{'text': 'SELECT', 'offset': 0, 'length': 6, 'type': 161, 'category': 'keyword'},
 ...]
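Since `offset` and `length` are measured in bytes, they only line up with Python string indices for ASCII input. The helper below is a sketch that converts a token to a character span, assuming the offsets refer to the UTF-8 encoding of the source (the encoding is not stated above). `token_char_span` is a hypothetical name and the token dict is hand-written.

```python
def token_char_span(sql, token):
    """Return (start, end) str indices for a token with byte offset/length."""
    data = sql.encode("utf-8")  # assumption: offsets index UTF-8 bytes
    start_byte = token["offset"]
    end_byte = start_byte + token["length"]
    # Decoding the prefix tells us how many characters precede the token.
    start = len(data[:start_byte].decode("utf-8"))
    end = start + len(data[start_byte:end_byte].decode("utf-8"))
    return start, end


sql = "SELECT 'é', x"
# Hand-written token for the trailing identifier: "é" is two bytes in UTF-8,
# so the byte offset (13) is one past the character index (12).
token = {"text": "x", "offset": 13, "length": 1, "category": "identifier"}
start, end = token_char_span(sql, token)
print(sql[start:end])
```

For pure-ASCII SQL the byte and character positions coincide, so the conversion only matters when string literals, comments, or identifiers contain non-ASCII text.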

`sq.close`

Release the worker and stop accepting calls. Called automatically when the instance is used as a context manager.

sq.close()

After close(), any method call raises SyntaqliteError.

Schema types

`syntaqlite.Schema`

A catalog schema. Everything that contributes to the analyzer's catalog lives here — pick whichever combination fits your use case.

syntaqlite.Schema(*, tables=None, views=None, ddl=None, modules=None)

Parameter	Type	Description
`tables`	`list[Table] \| None`	Structured table definitions
`views`	`list[View] \| None`	Structured view definitions
`ddl`	`str \| None`	Raw DDL (`CREATE TABLE` / `CREATE VIEW`) text, parsed once
`modules`	`dict[str, str] \| None`	Dialect-specific. Map from dotted module path to SQL source, loaded lazily when the analyzer encounters an import (e.g. Perfetto's `INCLUDE PERFETTO MODULE`). Ignored by dialects without module support.

`syntaqlite.Table`

syntaqlite.Table(name, columns=None)

Parameter	Type	Description
`name`	`str`	Table name
`columns`	`list[str] \| None`	Column names. When `None`, any column reference is accepted.

`syntaqlite.View`

syntaqlite.View(name, columns=None)

Same fields as Table.

Result types

`Analysis`

Returned by analyze when output=AnalysisOutput.STRUCTURED.

Attribute	Type	Description
`diagnostics`	`list[Diagnostic]`	All diagnostics, aggregated across statements
`statements`	`list[Statement]`	Per-statement analysis
`lineage`	`Lineage \| None`	Column lineage of the final query-bearing statement; `None` when no statement had a query body

`Statement`

Per-statement analysis, available on Analysis.statements.

Attribute	Type	Description
`source`	`str`	The SQL source text for this statement
`diagnostics`	`list[Diagnostic]`	Diagnostics for this statement
`defined_relations`	`list[DefinedRelation]`	Tables/views defined by DDL statements
`lineage`	`Lineage \| None`	Column lineage; `None` if this statement has no query body

`Lineage`

Column lineage for a query-bearing statement.

Attribute	Type	Description
`complete`	`bool`	`True` if all sources were fully resolved
`columns`	`list[ColumnLineage]`	Per-column lineage
`relations`	`list[RelationAccess]`	Catalog relations referenced directly in `FROM`
`physical_tables`	`list[str]`	Physical table names accessed after CTE/view expansion
`unexpanded_views`	`list[str]`	Views whose bodies weren't available for expansion (non-empty implies `complete=False`)

`ColumnLineage`

Attribute	Type	Description
`name`	`str`	Output column name (alias or inferred)
`index`	`int`	Zero-based position in the result column list
`origin`	`ColumnOrigin \| None`	Origin table/column, or `None` for expressions/literals/aggregates

`ColumnOrigin`

Attribute	Type	Description
`table`	`str`	Physical table name
`column`	`str`	Column name in that table
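A common use of lineage is to map each output column back to its source. The sketch below does that over stand-in dataclasses that mirror the attributes documented for `ColumnLineage` and `ColumnOrigin`. They are local substitutes so the snippet runs on its own, and `origin_map` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ColumnOrigin:  # stand-in mirroring the documented attributes
    table: str
    column: str


@dataclass
class ColumnLineage:  # stand-in mirroring the documented attributes
    name: str
    index: int
    origin: Optional[ColumnOrigin]


def origin_map(columns: List[ColumnLineage]) -> Dict[str, Optional[str]]:
    """Map output column name -> "table.column", or None when there is no origin."""
    return {
        col.name: f"{col.origin.table}.{col.origin.column}" if col.origin else None
        for col in columns
    }


columns = [
    ColumnLineage("id", 0, ColumnOrigin("users", "id")),
    ColumnLineage("n", 1, None),  # e.g. an aggregate or literal
]
print(origin_map(columns))
```

With a real result the input would presumably be `analysis.lineage.columns`, after checking that `lineage` is not `None`. Keying by name means duplicate output column names overwrite each other; keying by `index` avoids that.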

`RelationAccess`

Attribute	Type	Description
`name`	`str`	Relation name as it appears in the catalog
`kind`	`str`	`"table"` or `"view"`

`Diagnostic`

Attribute	Type	Description
`severity`	`str`	`"error"`, `"warning"`, `"info"`, or `"hint"`
`message`	`str`	Diagnostic message
`start_offset`	`int`	Byte offset of the start of the span
`end_offset`	`int`	Byte offset of the end of the span
`code`	`DiagnosticCode`	Machine-readable kind
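Diagnostics carry byte offsets rather than line and column numbers. The sketch below derives a 1-based `line:column` position from `start_offset`, assuming offsets index the UTF-8 encoding of the source and counting columns in characters. Both conventions are assumptions, and `SimpleNamespace` stands in for a real `Diagnostic` with hand-written offsets.

```python
from types import SimpleNamespace


def offset_to_line_col(source, byte_offset):
    """Convert a UTF-8 byte offset into a 1-based (line, column) pair."""
    prefix = source.encode("utf-8")[:byte_offset]
    line = prefix.count(b"\n") + 1
    line_start = prefix.rfind(b"\n") + 1  # 0 when there is no newline
    column = len(prefix[line_start:].decode("utf-8", errors="replace")) + 1
    return line, column


def describe(source, diag, label="<input>"):
    """Render one diagnostic as 'label:line:col: severity: message'."""
    line, col = offset_to_line_col(source, diag.start_offset)
    return f"{label}:{line}:{col}: {diag.severity}: {diag.message}"


sql = "SELECT nme FROM users"
# Stand-in diagnostic pointing at the three bytes of "nme".
diag = SimpleNamespace(
    severity="error",
    message="unknown column 'nme'",
    start_offset=7,
    end_offset=10,
)
print(describe(sql, diag, "query.sql"))
```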

`AnalysisOutput`

StrEnum selecting the return shape of analyze:

Name	Value	Meaning
`STRUCTURED`	`"structured"`	Return an `Analysis` (default)
`TEXT`	`"text"`	Return a rendered diagnostics string

`RenderOptions`

Options that shape AnalysisOutput.TEXT output.

syntaqlite.RenderOptions(*, source_name="")

Parameter	Type	Default	Description
`source_name`	`str`	`""`	Source label shown in rendered diagnostics (analogous to a file path).

`DiagnosticCode`

IntEnum of machine-readable diagnostic kinds:

Name	Value
`PARSE_ERROR`	`0`
`UNKNOWN_TABLE`	`1`
`UNKNOWN_COLUMN`	`2`
`UNKNOWN_FUNCTION`	`3`
`UNKNOWN_MODULE`	`4`
`FUNCTION_ARITY`	`5`
`CTE_COLUMN_COUNT_MISMATCH`	`6`
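Because `DiagnosticCode` is an `IntEnum`, its members compare equal to plain integers, which makes it easy to bucket diagnostics by kind. The sketch below uses a local mirror of the enum with the values from the table above, plus stand-in diagnostic objects, so it runs without syntaqlite. `count_by_code` is a hypothetical helper.

```python
from collections import Counter
from enum import IntEnum
from types import SimpleNamespace


class DiagnosticCode(IntEnum):  # local mirror of the documented values
    PARSE_ERROR = 0
    UNKNOWN_TABLE = 1
    UNKNOWN_COLUMN = 2
    UNKNOWN_FUNCTION = 3
    UNKNOWN_MODULE = 4
    FUNCTION_ARITY = 5
    CTE_COLUMN_COUNT_MISMATCH = 6


def count_by_code(diagnostics):
    """Tally diagnostics by the name of their code."""
    return Counter(DiagnosticCode(d.code).name for d in diagnostics)


diagnostics = [
    SimpleNamespace(code=DiagnosticCode.UNKNOWN_COLUMN),
    SimpleNamespace(code=2),  # a bare int resolves to the same member
    SimpleNamespace(code=DiagnosticCode.UNKNOWN_TABLE),
]
print(count_by_code(diagnostics))
```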

`DefinedRelation`

Attribute	Type	Description
`name`	`str`	Relation name
`is_view`	`bool`	`True` for views, `False` for tables

Exceptions

`syntaqlite.FormatError`

Raised by format_sql when the input SQL cannot be parsed. Inherits from Exception.

try:
    sq.format_sql("SELECT FROM")
except syntaqlite.FormatError as e:
    print(e)
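A caller that wants to keep going on unparseable input can fall back to the original text. The sketch below shows that pattern with a stand-in formatter and exception so it runs on its own. `format_or_keep`, `fake_format`, and `FakeFormatError` are hypothetical names; in real code the callable would be `sq.format_sql` and the exception `syntaqlite.FormatError`.

```python
class FakeFormatError(Exception):
    """Stand-in for syntaqlite.FormatError."""


def fake_format(sql):
    """Stand-in formatter: rejects one known-bad input, uppercases the rest."""
    if sql.strip().upper() == "SELECT FROM":
        raise FakeFormatError("syntax error")
    return sql.upper() + ";\n"


def format_or_keep(format_fn, sql, error_type):
    """Return (text, ok): formatted SQL, or the input unchanged on failure."""
    try:
        return format_fn(sql), True
    except error_type:
        return sql, False


print(format_or_keep(fake_format, "select 1", FakeFormatError))
print(format_or_keep(fake_format, "SELECT FROM", FakeFormatError))
```

Catching only the formatter's own error type keeps unrelated failures, such as calls on a closed instance, visible to the caller.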

`syntaqlite.SyntaqliteError`

Base class for runtime errors raised by a Syntaqlite instance — for example, calls made after close. Inherits from RuntimeError.