Python API reference

The Python library is a C extension (_syntaqlite) bundled with the pip package. It requires Python 3.10+ and is available on macOS (arm64, x86_64), Linux (x86_64, aarch64), and Windows (x86_64).

import syntaqlite

If the C extension is not available (e.g. Windows arm64), the library functions are not importable. The CLI binary is still usable.
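
Code that has to run on platforms without the C extension can guard the import. The following is a minimal sketch, assuming the missing extension surfaces as an ImportError (the usual Python behaviour, but not stated above); load_syntaqlite and HAVE_SYNTAQLITE are hypothetical names used only for illustration.

```python
import importlib


def load_syntaqlite():
    """Return the syntaqlite module, or None if it cannot be imported."""
    try:
        return importlib.import_module("syntaqlite")
    except ImportError:
        # Assumption: a missing C extension shows up as ImportError.
        return None


syntaqlite = load_syntaqlite()
HAVE_SYNTAQLITE = syntaqlite is not None
```

Callers can then check HAVE_SYNTAQLITE and fall back to the CLI binary when it is False.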

syntaqlite.format_sql

Format SQL with configurable options.

syntaqlite.format_sql(sql, *, line_width=80, indent_width=2, keyword_case="upper", semicolons=True)

Parameter     Type  Default     Description
------------  ----  ----------  -------------------------------
sql           str   (required)  SQL to format
line_width    int   80          Maximum line width
indent_width  int   2           Spaces per indent level
keyword_case  str   "upper"     "upper" or "lower"
semicolons    bool  True        Append semicolons to statements

Returns: str. The formatted SQL.

Raises: syntaqlite.FormatError if the input SQL is syntactically invalid and cannot be parsed.

>>> syntaqlite.format_sql("select 1")
'SELECT 1;\n'
>>> syntaqlite.format_sql("select 1", keyword_case="lower", semicolons=False)
'select 1\n'

syntaqlite.parse

Parse SQL into a list of typed AST nodes.

syntaqlite.parse(sql)

Parameter  Type  Description
---------  ----  ----------------------------------------------
sql        str   SQL to parse (may contain multiple statements)

Returns: list. One typed AST node per statement. Each node is a __slots__ class (e.g. SelectStmt, CreateTableStmt) with typed attributes:

>>> stmt = syntaqlite.parse("SELECT 1 + 2 FROM foo")[0]
>>> type(stmt).__name__
'SelectStmt'
>>> stmt.columns[0].expr
BinaryExpr(...)
>>> stmt.from_clause
TableRef(...)
>>> stmt.where_clause is None
True

Enum and flag fields are wrapped as IntEnum/IntFlag from syntaqlite.enums:

>>> from syntaqlite.enums import BinaryOp
>>> BinaryOp(stmt.columns[0].expr.op).name
'PLUS'

Node classes support isinstance checks:

>>> from syntaqlite.nodes import SelectStmt, BinaryExpr
>>> isinstance(stmt, SelectStmt)
True

If a statement fails to parse, its entry in the returned list is a plain dict (not a typed node) with these keys:

Key      Type  Description
-------  ----  ------------------------
type     str   "Error"
message  str   Error message
offset   int   Byte offset of the error
length   int   Length of the error span

The parser recovers from errors and continues parsing subsequent statements.
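
Because error entries and typed nodes share one list, callers usually separate them before processing. The sketch below shows one way to do that and to recover the offending source text. It assumes that offset and length count bytes of the UTF-8 encoded input, which the table does not state explicitly; split_results and error_text are hypothetical helper names.

```python
def split_results(results):
    """Separate typed nodes from error dicts in a parse() result list."""
    nodes, errors = [], []
    for entry in results:
        if isinstance(entry, dict) and entry.get("type") == "Error":
            errors.append(entry)
        else:
            nodes.append(entry)
    return nodes, errors


def error_text(sql, error):
    """Return the source text covered by an error dict.

    Assumption: offset and length index the UTF-8 encoding of sql.
    """
    raw = sql.encode("utf-8")
    start = error["offset"]
    end = start + error["length"]
    return raw[start:end].decode("utf-8", errors="replace")
```

Slicing the encoded bytes rather than the str keeps the span correct when the SQL contains non-ASCII characters.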

Raw dict access

For performance-sensitive code, use syntaqlite.parse_raw() to skip the wrapping and get plain dicts:

>>> syntaqlite.parse_raw("SELECT 1")[0]["type"]
'SelectStmt'
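
Since every raw entry carries a "type" key, simple summaries need no node classes. A small sketch, assuming that parse_raw reports parse errors with the same "Error" dict shape that parse uses; count_statement_types is a hypothetical helper.

```python
from collections import Counter


def count_statement_types(entries):
    """Count raw parse entries by their "type" value."""
    return Counter(entry["type"] for entry in entries)


# Intended use (requires the C extension):
# counts = count_statement_types(syntaqlite.parse_raw(sql))
```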

syntaqlite.validate

Validate SQL against an optional schema.

syntaqlite.validate(sql, *, tables=None, views=None, schema_ddl=None, render=False)

Parameter   Type                Default     Description
----------  ------------------  ----------  ------------------------------------------
sql         str                 (required)  SQL to validate
tables      list[Table] | None  None        Schema tables
views       list[View] | None   None        Schema views
schema_ddl  str | None          None        DDL to parse as schema (CREATE TABLE/VIEW)
render      bool                False       Return a rendered diagnostics string instead of a ValidationResult

The schema can be supplied through any of three parameters (tables, views, schema_ddl), alone or combined:

# Explicit tables and views
syntaqlite.validate(sql,
    tables=[syntaqlite.Table(name="users", columns=["id", "name"])],
    views=[syntaqlite.View(name="active", columns=["id"])],
)

# From DDL
syntaqlite.validate(sql,
    schema_ddl="CREATE TABLE users(id, name); CREATE VIEW active AS SELECT id FROM users;",
)

Table and View accept name (required) and columns (optional; omit to accept any column reference).

Returns (render=False): ValidationResult with attributes:

Attribute    Type              Description
-----------  ----------------  ----------------------------------------------------------
diagnostics  list[Diagnostic]  Parse and semantic diagnostics
lineage      Lineage | None    Column lineage for SELECT statements, None for non-queries

Returns (render=True): str. Human-readable diagnostics with source context, similar to CLI output.

>>> r = syntaqlite.validate("SELECT id, name FROM users",
...     tables=[syntaqlite.Table(name="users", columns=["id", "name"])])
>>> r.diagnostics
[]
>>> r.lineage.complete
True
>>> for col in r.lineage.columns:
...     print(f"{col.name} <- {col.origin}")
id <- users.id
name <- users.name
>>> r.lineage.tables
['users']

Result types

Diagnostic — a single diagnostic:

Attribute     Type  Description
------------  ----  -------------------------------------
severity      str   "error", "warning", "info", or "hint"
message       str   Diagnostic message
start_offset  int   Byte offset of the start of the span
end_offset    int   Byte offset of the end of the span
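
Editors and log output usually want line and column numbers rather than byte offsets. The sketch below converts start_offset into a 1-based position and formats a one-line message. It assumes offsets index the UTF-8 encoding of the input; FakeDiagnostic is a stand-in with the same attributes as Diagnostic so the example runs without the library, and the helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FakeDiagnostic:
    """Stand-in with the same attributes as Diagnostic (illustration only)."""
    severity: str
    message: str
    start_offset: int
    end_offset: int


def line_and_column(sql, byte_offset):
    """Convert a byte offset into 1-based (line, column) numbers.

    Assumption: offsets index the UTF-8 encoding of sql.
    """
    prefix = sql.encode("utf-8")[:byte_offset].decode("utf-8", errors="replace")
    line = prefix.count("\n") + 1
    # Characters since the last newline, plus one for 1-based columns.
    column = len(prefix) - (prefix.rfind("\n") + 1) + 1
    return line, column


def describe(sql, diag):
    """Format a diagnostic as 'line:column: severity: message'."""
    line, column = line_and_column(sql, diag.start_offset)
    return f"{line}:{column}: {diag.severity}: {diag.message}"
```

The same describe function should work on real Diagnostic objects, since it only reads the documented attributes.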

Lineage — column lineage for a SELECT statement:

Attribute  Type                  Description
---------  --------------------  ------------------------------------
complete   bool                  True if all sources fully resolved
columns    list[ColumnLineage]   Per-column lineage
relations  list[RelationAccess]  Catalog relations referenced in FROM
tables     list[str]             Physical table names accessed

ColumnLineage — lineage for a single result column:

Attribute  Type                 Description
---------  -------------------  ---------------------------------------------
name       str                  Output column name (alias or inferred)
index      int                  Zero-based position in the result column list
origin     ColumnOrigin | None  Origin, or None for expressions

ColumnOrigin — physical table and column:

Attribute  Type  Description
---------  ----  -----------
table      str   Table name
column     str   Column name

RelationAccess — a catalog relation in FROM:

Attribute  Type  Description
---------  ----  -----------------
name       str   Relation name
kind       str   "table" or "view"
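
The lineage types above can be flattened into a plain mapping for reporting or tests. This is a sketch only: FakeOrigin and FakeColumnLineage are stand-ins that mirror the documented attributes of ColumnOrigin and ColumnLineage so the example runs without the library, and origin_map is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeOrigin:
    """Stand-in mirroring ColumnOrigin (illustration only)."""
    table: str
    column: str


@dataclass
class FakeColumnLineage:
    """Stand-in mirroring ColumnLineage (illustration only)."""
    name: str
    index: int
    origin: Optional[FakeOrigin]


def origin_map(columns):
    """Map each output column name to 'table.column', or None for expressions."""
    result = {}
    for col in columns:
        if col.origin is None:
            result[col.name] = None
        else:
            result[col.name] = f"{col.origin.table}.{col.origin.column}"
    return result
```

With a real result, the call would be origin_map(r.lineage.columns), after checking that r.lineage is not None.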

syntaqlite.tokenize

Tokenize SQL into a list of token dicts.

syntaqlite.tokenize(sql)

Parameter  Type  Description
---------  ----  ---------------
sql        str   SQL to tokenize

Returns: list[dict]. One entry per token:

Key     Type  Description
------  ----  ----------------------------
text    str   Token text
offset  int   Byte offset in the source
length  int   Length of the token in bytes
type    int   Internal token type ID

>>> syntaqlite.tokenize("SELECT 1")
[{'text': 'SELECT', 'offset': 0, 'length': 6, 'type': ...},
 {'text': '1', 'offset': 7, 'length': 1, 'type': ...}]
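
Because offset and length are expressed in bytes, token spans should be applied to the encoded source rather than to the str. The sketch below checks that each token's text matches the bytes it points at, assuming the input is measured as UTF-8; mismatched_tokens is a hypothetical helper.

```python
def mismatched_tokens(sql, tokens):
    """Return the tokens whose text differs from the bytes at offset/length.

    Assumption: offsets and lengths index the UTF-8 encoding of sql.
    """
    raw = sql.encode("utf-8")
    bad = []
    for tok in tokens:
        start = tok["offset"]
        end = start + tok["length"]
        if raw[start:end].decode("utf-8", errors="replace") != tok["text"]:
            bad.append(tok)
    return bad


# Intended use (requires the C extension):
# assert mismatched_tokens(sql, syntaqlite.tokenize(sql)) == []
```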

syntaqlite.FormatError

Exception raised by syntaqlite.format_sql when the input SQL cannot be parsed.

try:
    syntaqlite.format_sql("SELECT FROM")
except syntaqlite.FormatError as e:
    print(e)  # syntax error near 'FROM'

Inherits from Exception.
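
A common pattern is to leave unparseable SQL untouched instead of failing. The sketch below takes the formatter and the exception class as arguments so that it runs without the library installed; format_or_original is a hypothetical helper, not part of syntaqlite.

```python
def format_or_original(sql, formatter, error_type):
    """Return formatter(sql), or sql unchanged if error_type is raised."""
    try:
        return formatter(sql)
    except error_type:
        return sql


# Intended use (requires the C extension):
# text = format_or_original(sql, syntaqlite.format_sql, syntaqlite.FormatError)
```

Passing only FormatError as error_type keeps unrelated errors, such as a TypeError from a bad argument, visible to the caller.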