Python API reference

The Python library is a C extension (_syntaqlite) bundled with the pip package. It requires Python 3.10+ and is available on macOS (arm64, x86_64), Linux (x86_64, aarch64), and Windows (x86_64).

import syntaqlite

If the C extension is not available on your platform (e.g. Windows arm64), the library functions cannot be imported. The CLI binary remains usable.
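Code that must also run on platforms without the C extension can guard the import. The following is a minimal sketch: load_syntaqlite and HAVE_SYNTAQLITE are hypothetical names, and the hasattr check is an assumption meant to cover both a failing import of the package and a package that imports but lacks the library functions.

```python
import importlib


def load_syntaqlite():
    """Return the syntaqlite module, or None if it cannot be imported."""
    try:
        return importlib.import_module("syntaqlite")
    except ImportError:
        return None


syntaqlite = load_syntaqlite()

# True only when the module imported and exposes the formatter.
HAVE_SYNTAQLITE = syntaqlite is not None and hasattr(syntaqlite, "format_sql")

if not HAVE_SYNTAQLITE:
    print("syntaqlite library functions unavailable; use the CLI binary instead")
```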

`syntaqlite.format_sql`

Format SQL with configurable options.

syntaqlite.format_sql(sql, *, line_width=80, indent_width=2, keyword_case="upper", semicolons=True)

Parameter	Type	Default	Description
`sql`	`str`	—	SQL to format
`line_width`	`int`	`80`	Maximum line width
`indent_width`	`int`	`2`	Spaces per indent level
`keyword_case`	`str`	`"upper"`	`"upper"` or `"lower"`
`semicolons`	`bool`	`True`	Append semicolons to statements

Returns: str. The formatted SQL.

Raises: syntaqlite.FormatError if the input SQL is syntactically invalid and cannot be parsed.

>>> syntaqlite.format_sql("select 1")
'SELECT 1;\n'
>>> syntaqlite.format_sql("select 1", keyword_case="lower", semicolons=False)
'select 1\n'

`syntaqlite.parse`

Parse SQL into a list of typed AST nodes.

syntaqlite.parse(sql)

Parameter	Type	Description
`sql`	`str`	SQL to parse (may contain multiple statements)

Returns: list. One typed AST node per statement. Each node is an instance of a __slots__ class (e.g. SelectStmt, CreateTableStmt) with typed attributes:

>>> stmt = syntaqlite.parse("SELECT 1 + 2 FROM foo")[0]
>>> type(stmt).__name__
'SelectStmt'
>>> stmt.columns[0].expr
BinaryExpr(...)
>>> stmt.from_clause
TableRef(...)
>>> stmt.where_clause is None
True

Enum and flag fields are wrapped as IntEnum/IntFlag from syntaqlite.enums:

>>> from syntaqlite.enums import BinaryOp
>>> BinaryOp(stmt.columns[0].expr.op).name
'PLUS'

Node classes support isinstance checks:

>>> from syntaqlite.nodes import SelectStmt, BinaryExpr
>>> isinstance(stmt, SelectStmt)
True

If a statement fails to parse, its entry in the returned list is a plain dict (not wrapped in a node class) with these keys:

Key	Type	Description
`type`	`str`	`"Error"`
`message`	`str`	Error message
`offset`	`int`	Byte offset of the error
`length`	`int`	Length of the error span

The parser recovers from errors and continues parsing subsequent statements.
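Because errors and nodes share one list, callers usually separate them first. The sketch below is one way to do that; split_parse_results and error_text are hypothetical helpers, and the slicing assumes that offset and length are both measured in bytes of the UTF-8 encoded input. The sample entries are hand-written stand-ins, not real library output.

```python
def split_parse_results(entries):
    """Separate parsed nodes from error dicts in a parse() result list."""
    nodes, errors = [], []
    for entry in entries:
        if isinstance(entry, dict) and entry.get("type") == "Error":
            errors.append(entry)
        else:
            nodes.append(entry)
    return nodes, errors


def error_text(sql, error):
    """Return the source text covered by an error dict.

    Assumes offsets index into the UTF-8 encoding of `sql` and that
    `length` is also a byte count.
    """
    data = sql.encode("utf-8")
    start = error["offset"]
    return data[start:start + error["length"]].decode("utf-8", errors="replace")


# Hand-written stand-in for a parse() result: one node, one error dict.
sql = "SELECT 1; SELEC 2;"
entries = [
    object(),
    {"type": "Error", "message": "example message", "offset": 10, "length": 5},
]
nodes, errors = split_parse_results(entries)
for err in errors:
    print(f"{err['message']}: {error_text(sql, err)!r}")
```

The same check works on syntaqlite.parse_raw() output, where every entry is a dict and only the type key distinguishes errors.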

Raw dict access

For performance-sensitive code, use syntaqlite.parse_raw() to skip the wrapping and get plain dicts:

>>> syntaqlite.parse_raw("SELECT 1")[0]["type"]
'SelectStmt'
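Plain dicts are convenient for generic tree walks. The sketch below counts node types in a raw tree. Only the type key is documented here, so the nested layout (child nodes stored as dicts or lists under other keys) and the sample node names other than SelectStmt are assumptions for illustration; count_node_types is a hypothetical helper.

```python
from collections import Counter


def count_node_types(node, counts=None):
    """Recursively count the string "type" values found in a raw tree."""
    counts = Counter() if counts is None else counts
    if isinstance(node, dict):
        node_type = node.get("type")
        if isinstance(node_type, str):
            counts[node_type] += 1
        for value in node.values():
            count_node_types(value, counts)
    elif isinstance(node, (list, tuple)):
        for item in node:
            count_node_types(item, counts)
    return counts


# Hand-written stand-in tree; key and node names below the root are assumed.
raw = {
    "type": "SelectStmt",
    "columns": [{"type": "ResultColumn", "expr": {"type": "Literal"}}],
}
counts = count_node_types(raw)
```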

`syntaqlite.validate`

Validate SQL against an optional schema.

syntaqlite.validate(sql, *, tables=None, views=None, schema_ddl=None, render=False)

Parameter	Type	Default	Description
`sql`	`str`	—	SQL to validate
`tables`	`list[Table] | None`	`None`	Schema tables
`views`	`list[View] | None`	`None`	Schema views
`schema_ddl`	`str | None`	`None`	DDL to parse as schema (CREATE TABLE/VIEW)
`render`	`bool`	`False`	Return a rendered diagnostics string instead of a ValidationResult

The schema can be supplied through tables, views, or schema_ddl, alone or in combination:

# Explicit tables and views
syntaqlite.validate(sql,
    tables=[syntaqlite.Table(name="users", columns=["id", "name"])],
    views=[syntaqlite.View(name="active", columns=["id"])],
)

# From DDL
syntaqlite.validate(sql,
    schema_ddl="CREATE TABLE users(id, name); CREATE VIEW active AS SELECT id FROM users;",
)

Table and View accept name (required) and columns (optional; omit to accept any column reference).
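When the schema already lives in application data, one option is to generate the DDL string for schema_ddl. This is a sketch only: quote_ident and ddl_from_mapping are hypothetical helpers, and the untyped column definitions mirror the DDL example above.

```python
def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def ddl_from_mapping(tables):
    """Build CREATE TABLE statements from {table_name: [column, ...]}."""
    statements = []
    for table, columns in tables.items():
        if not columns:
            raise ValueError(f"table {table!r} needs at least one column")
        cols = ", ".join(quote_ident(c) for c in columns)
        statements.append(f"CREATE TABLE {quote_ident(table)}({cols});")
    return " ".join(statements)


schema_ddl = ddl_from_mapping({"users": ["id", "name"], "orders": ["id", "user_id"]})
print(schema_ddl)
```

The resulting string can then be passed as syntaqlite.validate(sql, schema_ddl=schema_ddl).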

Returns (render=False): ValidationResult with attributes:

Attribute	Type	Description
`diagnostics`	`list[Diagnostic]`	Parse and semantic diagnostics
`lineage`	`Lineage | None`	Column lineage for SELECT statements, `None` for non-queries

Returns (render=True): str. Human-readable diagnostics with source context, similar to CLI output.

>>> r = syntaqlite.validate("SELECT id, name FROM users",
...     tables=[syntaqlite.Table(name="users", columns=["id", "name"])])
>>> r.diagnostics
[]
>>> r.lineage.complete
True
>>> for col in r.lineage.columns:
...     print(f"{col.name} <- {col.origin}")
id <- users.id
name <- users.name
>>> r.lineage.tables
['users']

Result types

Diagnostic — a single diagnostic:

Attribute	Type	Description
`severity`	`str`	`"error"`, `"warning"`, `"info"`, or `"hint"`
`message`	`str`	Diagnostic message
`start_offset`	`int`	Byte offset of the start of the span
`end_offset`	`int`	Byte offset of the end of the span
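Editors and log messages usually want line and column numbers rather than byte offsets. A minimal conversion sketch follows, assuming the offsets index into the UTF-8 encoding of the input string; offset_to_line_col is a hypothetical helper, not part of the library.

```python
def offset_to_line_col(sql, byte_offset):
    """Convert a UTF-8 byte offset into a 1-based (line, column) pair.

    Columns count characters, not bytes.
    """
    prefix = sql.encode("utf-8")[:byte_offset].decode("utf-8", errors="ignore")
    line = prefix.count("\n") + 1
    # rfind returns -1 when there is no newline, so the first line starts at 0.
    column = len(prefix) - (prefix.rfind("\n") + 1) + 1
    return line, column


sql = "SELECT 1;\nSELECT nme FROM users;"
print(offset_to_line_col(sql, 17))  # (2, 8): the "n" of "nme"
```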

Lineage — column lineage for a SELECT statement:

Attribute	Type	Description
`complete`	`bool`	`True` if all sources fully resolved
`columns`	`list[ColumnLineage]`	Per-column lineage
`relations`	`list[RelationAccess]`	Catalog relations referenced in FROM
`tables`	`list[str]`	Physical table names accessed

ColumnLineage — lineage for a single result column:

Attribute	Type	Description
`name`	`str`	Output column name (alias or inferred)
`index`	`int`	Zero-based position in the result column list
`origin`	`ColumnOrigin | None`	Origin, or `None` for expressions

ColumnOrigin — physical table and column:

Attribute	Type	Description
`table`	`str`	Table name
`column`	`str`	Column name
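The lineage attributes above can be combined to answer questions such as "which output columns come from which table". The sketch below groups column names by origin table; columns_by_table is a hypothetical helper, and the SimpleNamespace objects are stand-ins that only mimic the documented attributes.

```python
from collections import defaultdict
from types import SimpleNamespace


def columns_by_table(columns):
    """Group output column names by the physical table they come from.

    Columns whose origin is None (expressions) are collected under None.
    """
    grouped = defaultdict(list)
    for col in columns:
        table = col.origin.table if col.origin is not None else None
        grouped[table].append(col.name)
    return dict(grouped)


# Stand-ins for ColumnLineage / ColumnOrigin objects.
cols = [
    SimpleNamespace(name="id", index=0, origin=SimpleNamespace(table="users", column="id")),
    SimpleNamespace(name="name", index=1, origin=SimpleNamespace(table="users", column="name")),
    SimpleNamespace(name="total", index=2, origin=None),
]
grouped = columns_by_table(cols)
print(grouped)  # {'users': ['id', 'name'], None: ['total']}
```

With a real result, the call would be columns_by_table(r.lineage.columns), after checking that r.lineage is not None.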

RelationAccess — a catalog relation in FROM:

Attribute	Type	Description
`name`	`str`	Relation name
`kind`	`str`	`"table"` or `"view"`

`syntaqlite.tokenize`

Tokenize SQL into a list of token dicts.

syntaqlite.tokenize(sql)

Parameter	Type	Description
`sql`	`str`	SQL to tokenize

Returns: list[dict]. One entry per token:

Key	Type	Description
`text`	`str`	Token text
`offset`	`int`	Byte offset in the source
`length`	`int`	Length of the token in bytes
`type`	`int`	Internal token type ID

>>> syntaqlite.tokenize("SELECT 1")
[{'text': 'SELECT', 'offset': 0, 'length': 6, 'type': ...},
 {'text': '1', 'offset': 7, 'length': 1, 'type': ...}]
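Since offset and length are byte counts, they differ from Python string indices whenever the SQL contains non-ASCII characters. The sketch below converts a token dict to character indices, assuming the byte offsets refer to the UTF-8 encoding of the input; token_char_span is a hypothetical helper, and the token is hand-written with a placeholder type value.

```python
def token_char_span(sql, token):
    """Return (start, end) character indices of a token dict within `sql`.

    Assumes token offsets and lengths are UTF-8 byte counts.
    """
    data = sql.encode("utf-8")
    start_byte = token["offset"]
    end_byte = start_byte + token["length"]
    start = len(data[:start_byte].decode("utf-8"))
    end = start + len(data[start_byte:end_byte].decode("utf-8"))
    return start, end


sql = "SELECT 'é', 1"
# Hand-written token: 'é' occupies two bytes, so '1' sits at byte 13 but index 12.
token = {"text": "1", "offset": 13, "length": 1, "type": 0}
start, end = token_char_span(sql, token)
print(sql[start:end])  # 1
```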

`syntaqlite.FormatError`

Exception raised by syntaqlite.format_sql when the input SQL cannot be parsed.

try:
    syntaqlite.format_sql("SELECT FROM")
except syntaqlite.FormatError as e:
    print(e)  # syntax error near 'FROM'

Inherits from Exception.
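Tools that format many files often prefer to leave invalid SQL untouched and report the error instead of aborting. A minimal sketch of that pattern follows; format_or_original is a hypothetical helper that takes the formatter and exception class as arguments so it can be exercised without the C extension.

```python
def format_or_original(sql, format_fn, error_type):
    """Format `sql`, returning (text, error_message).

    On failure the original text is returned unchanged together with
    the error message; on success the message is None.
    """
    try:
        return format_fn(sql), None
    except error_type as exc:
        return sql, str(exc)
```

With the library installed, the call would be format_or_original(sql, syntaqlite.format_sql, syntaqlite.FormatError).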