Python API reference

The Python package is pure Python and ships with a bundled syntaqlite CLI binary. Requires Python 3.10+. Wheels are published for macOS (arm64, x86_64), Linux (x86_64, aarch64), and Windows (x86_64).

import syntaqlite

with syntaqlite.Syntaqlite() as sq:
    print(sq.format_sql("select 1"))
    stmts = sq.parse("select * from users")

syntaqlite.Syntaqlite

The entry point for all SQL operations. Create one instance and reuse it across many calls — each instance spins up a worker on construction and keeps it alive until close() is called.

syntaqlite.Syntaqlite(*, dialect_path=None, dialect_name=None, binary=None)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| dialect_path | str \| None | None | Path to a dialect shared library (.so/.dylib/.dll). Defaults to SQLite. |
| dialect_name | str \| None | None | Dialect name. Required only when dialect_path exports more than one dialect. |
| binary | str \| None | None | Override the path to the syntaqlite CLI. Defaults to the binary shipped with the wheel (or SYNTAQLITE_BIN). |

Supports the context-manager protocol: leaving a "with syntaqlite.Syntaqlite() as sq:" block calls close() automatically. An instance is not intended for concurrent use; for parallelism, create one instance per thread.
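One way to follow the one-instance-per-thread advice is to cache an instance in thread-local storage. The sketch below is generic and makes no claim about how the library manages its worker: per_thread and its factory argument are illustrative names, not part of syntaqlite. In real use the factory would be syntaqlite.Syntaqlite.

```python
import threading

# Each thread sees its own attributes on this object.
_local = threading.local()


def per_thread(factory):
    """Return the calling thread's instance, creating it on first use.

    `factory` is any zero-argument callable. With this package it would be
    syntaqlite.Syntaqlite (hypothetical helper, not part of the library).
    """
    instance = getattr(_local, "instance", None)
    if instance is None:
        instance = factory()
        _local.instance = instance
    return instance
```

Calling per_thread(syntaqlite.Syntaqlite) from several worker threads would then give each thread its own instance. Each thread remains responsible for calling close() on its instance before it exits.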

sq.format_sql

Format SQL with configurable options.

sq.format_sql(sql, *, line_width=80, indent_width=2, keyword_case="upper", semicolons=True)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| sql | str | | SQL to format |
| line_width | int | 80 | Maximum line width |
| indent_width | int | 2 | Spaces per indent level |
| keyword_case | str | "upper" | "upper" or "lower" |
| semicolons | bool | True | Append semicolons to statements |

Returns: str — the formatted SQL.

Raises: FormatError when the input cannot be parsed.

>>> sq.format_sql("select 1")
'SELECT 1;\n'
>>> sq.format_sql("select 1", keyword_case="lower", semicolons=False)
'select 1\n'
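Because format_sql returns the formatted text, a "check only" mode can be sketched by comparing output with input. The helper name is_formatted is an illustration, not part of the package, and fake_format is a stand-in so the snippet runs without syntaqlite installed. In real use, pass the bound method sq.format_sql.

```python
def is_formatted(format_sql, sql, **options):
    """Return True when formatting `sql` would leave it unchanged.

    `format_sql` is a callable with the signature documented above, for
    example sq.format_sql. `options` are forwarded untouched (line_width,
    indent_width, keyword_case, semicolons).
    """
    return format_sql(sql, **options) == sql


def fake_format(sql, **options):
    """Stand-in formatter: upper-cases and terminates with ';' and newline."""
    return sql.strip().rstrip(";").upper() + ";\n"


print(is_formatted(fake_format, "SELECT 1;\n"))  # True
print(is_formatted(fake_format, "select 1"))     # False
```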

sq.parse

Parse SQL into a list of typed AST nodes.

sq.parse(sql)

| Parameter | Type | Description |
| --- | --- | --- |
| sql | str | SQL to parse (may contain multiple statements) |

Returns: list — one typed AST node per statement. Each node is a __slots__ class (e.g. SelectStmt, CreateTableStmt) with typed attributes.

>>> stmt = sq.parse("SELECT 1 + 2 FROM foo")[0]
>>> type(stmt).__name__
'SelectStmt'
>>> stmt.columns[0].expr
BinaryExpr(...)
>>> stmt.from_clause
TableRef(...)
>>> stmt.where_clause is None
True

Enum and flag fields are IntEnum/IntFlag from syntaqlite.enums:

>>> from syntaqlite.enums import BinaryOp
>>> stmt.columns[0].expr.op
<BinaryOp.PLUS: 0>

Node classes support isinstance checks:

>>> from syntaqlite.nodes import SelectStmt
>>> isinstance(stmt, SelectStmt)
True

The parser recovers from errors and continues; any unparseable fragment comes through as an Error node in the list.
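Since nodes are __slots__ classes, a generic tree walk can be sketched by following slot attributes. This is an assumption-laden illustration: slot_names and iter_nodes are hypothetical helpers, and the Literal, BinaryExpr and Error classes below are stand-ins defined only so the snippet runs without the package. The real node classes and their fields may differ.

```python
import enum
from collections import Counter

# Values that are never treated as nodes (IntEnum/IntFlag members are ints).
_LEAF_TYPES = (str, bytes, int, float, enum.Enum)


def slot_names(cls):
    """Collect __slots__ names declared anywhere in the class hierarchy."""
    names = []
    for base in cls.__mro__:
        slots = base.__dict__.get("__slots__", ())
        if isinstance(slots, str):  # a bare string declares a single slot
            slots = (slots,)
        names.extend(slots)
    return names


def iter_nodes(value):
    """Yield every slotted object reachable from `value`, depth first."""
    if isinstance(value, (list, tuple)):
        for item in value:
            yield from iter_nodes(item)
        return
    if value is None or isinstance(value, _LEAF_TYPES):
        return
    names = slot_names(type(value))
    if not names:
        return
    yield value
    for name in names:
        yield from iter_nodes(getattr(value, name, None))


# Stand-in node classes, for illustration only.
class Literal:
    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value


class BinaryExpr:
    __slots__ = ("left", "op", "right")

    def __init__(self, left, op, right):
        self.left, self.op, self.right = left, op, right


class Error:
    __slots__ = ("message",)

    def __init__(self, message):
        self.message = message


parsed = [BinaryExpr(Literal(1), 0, Literal(2)), Error("unparseable")]
counts = Counter(type(node).__name__ for node in iter_nodes(parsed))
print(counts["Literal"], counts["Error"])  # 2 1
```

With a real result, iter_nodes(sq.parse(sql)) would be the starting point. Counting nodes whose class name is "Error" is one way to detect recovered parse failures, assuming the error node class is named that way.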

sq.parse_raw

Same as parse but returns plain JSON-shaped dicts without the typed-class wrapping. Use this for performance-sensitive code that doesn't need attribute-style access.

>>> sq.parse_raw("SELECT 1")[0]["type"]
'SelectStmt'
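Plain dicts are easy to traverse without knowing the node classes. The sketch below counts node types in a parse_raw-style result. Only the "type" key is taken from the example above; the nested keys in the sample data ("columns", "expr") are made up for illustration, and count_node_types is a hypothetical helper.

```python
from collections import Counter


def count_node_types(value, counts=None):
    """Recursively count the "type" strings found in nested dicts and lists."""
    if counts is None:
        counts = Counter()
    if isinstance(value, dict):
        node_type = value.get("type")
        if isinstance(node_type, str):
            counts[node_type] += 1
        for child in value.values():
            count_node_types(child, counts)
    elif isinstance(value, list):
        for item in value:
            count_node_types(item, counts)
    return counts


# Hand-written sample; the real output shape may differ.
sample = [
    {
        "type": "SelectStmt",
        "columns": [{"type": "ResultColumn", "expr": {"type": "Literal"}}],
    }
]
print(count_node_types(sample)["SelectStmt"])  # 1
```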

sq.analyze

Analyze SQL against an optional Schema.

sq.analyze(sql, schema=None, *, output=AnalysisOutput.STRUCTURED, render_options=None)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| sql | str | | SQL to analyze |
| schema | Schema \| None | None | Catalog schema to analyze against |
| output | AnalysisOutput \| str | STRUCTURED | Return shape: typed result or rendered string |
| render_options | RenderOptions \| None | None | Fine-grained options for text rendering (source label, etc.). Ignored unless output=TEXT. |

Schema can be built from explicit tables/views, from DDL text, or both:

sq.analyze(
    sql,
    syntaqlite.Schema(
        tables=[syntaqlite.Table("users", ["id", "name"])],
        views=[syntaqlite.View("active", ["id"])],
    ),
)

Returns (when output=AnalysisOutput.STRUCTURED): an Analysis.

Returns (when output=AnalysisOutput.TEXT): str with the diagnostics rendered with source context, matching the CLI output.

>>> schema = syntaqlite.Schema(tables=[syntaqlite.Table("users", ["id", "name"])])
>>> r = sq.analyze("SELECT id FROM users", schema)
>>> r.diagnostics
[]

>>> # Rendered text output with a custom file label
>>> text = sq.analyze(
...     "SELECT nme FROM users", schema,
...     output=syntaqlite.AnalysisOutput.TEXT,
...     render_options=syntaqlite.RenderOptions(source_name="query.sql"),
... )
>>> print(text)
error: unknown column 'nme'
 --> query.sql:1:8
...
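A structured result lends itself to batch checks, for example in CI. The sketch below returns the labels of queries that produce at least one diagnostic with severity "error". queries_with_errors is a hypothetical helper, and fake_analyze is a stand-in that mimics only the diagnostics attribute so the snippet runs without the package.

```python
from types import SimpleNamespace


def queries_with_errors(analyze, queries):
    """Return the labels of queries whose analysis reports an error.

    `analyze` is a one-argument callable returning an object with a
    `diagnostics` list, e.g. lambda sql: sq.analyze(sql, schema).
    `queries` maps a label (such as a file name) to SQL text.
    """
    failing = []
    for name, sql in queries.items():
        result = analyze(sql)
        if any(d.severity == "error" for d in result.diagnostics):
            failing.append(name)
    return failing


def fake_analyze(sql):
    """Stand-in analyzer: flags the misspelled column from the example."""
    diagnostics = []
    if "nme" in sql:
        diagnostics.append(
            SimpleNamespace(severity="error", message="unknown column 'nme'")
        )
    return SimpleNamespace(diagnostics=diagnostics)


queries = {
    "ok.sql": "SELECT id FROM users",
    "typo.sql": "SELECT nme FROM users",
}
print(queries_with_errors(fake_analyze, queries))  # ['typo.sql']
```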

sq.tokenize

Tokenize SQL into a list of token dicts.

sq.tokenize(sql)

| Parameter | Type | Description |
| --- | --- | --- |
| sql | str | SQL to tokenize |

Returns: list[dict] — one entry per token:

| Key | Type | Description |
| --- | --- | --- |
| text | str | Token text |
| offset | int | Byte offset in the source |
| length | int | Length of the token in bytes |
| type | int | Internal token type ID |
| category | str | "keyword", "identifier", "string", "number", "operator", "punctuation", "comment", "parameter", "function", "type", or "other" |

>>> sq.tokenize("SELECT 1")
[{'text': 'SELECT', 'offset': 0, 'length': 6, 'type': 161, 'category': 'keyword'},
 ...]
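Because offset and length are in bytes, slicing the Python string directly gives wrong results once the SQL contains non-ASCII characters. The sketch below slices the encoded source instead. It assumes the offsets refer to the UTF-8 encoding of the input, which the text above does not state. The helper names are hypothetical, and the sample tokens are hand-written with placeholder type IDs.

```python
def token_source(sql, token):
    """Recover a token's text from its byte offset and length.

    Assumes offsets index into the UTF-8 encoding of `sql`.
    """
    data = sql.encode("utf-8")
    start = token["offset"]
    return data[start:start + token["length"]].decode("utf-8")


def texts_by_category(tokens, category):
    """Return the text of every token in the given category."""
    return [t["text"] for t in tokens if t["category"] == category]


sql = "SELECT 'é', x"
# Hand-written sample in the documented shape; type IDs are placeholders.
tokens = [
    {"text": "SELECT", "offset": 0, "length": 6, "type": 0, "category": "keyword"},
    {"text": "'é'", "offset": 7, "length": 4, "type": 0, "category": "string"},
    {"text": "x", "offset": 13, "length": 1, "type": 0, "category": "identifier"},
]

print(token_source(sql, tokens[2]))  # x
print(repr(sql[13:14]))              # '' (string index 13 is past the end)
print(texts_by_category(tokens, "identifier"))  # ['x']
```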

sq.close

Release the worker and stop accepting calls. Called automatically when the instance is used as a context manager.

sq.close()

After close(), any method call raises SyntaqliteError.

Schema types

syntaqlite.Schema

A catalog schema. Everything that contributes to the analyzer's catalog lives here — pick whichever combination fits your use case.

syntaqlite.Schema(*, tables=None, views=None, ddl=None, modules=None)

| Parameter | Type | Description |
| --- | --- | --- |
| tables | list[Table] \| None | Structured table definitions |
| views | list[View] \| None | Structured view definitions |
| ddl | str \| None | Raw DDL (CREATE TABLE / CREATE VIEW) text, parsed once |
| modules | dict[str, str] \| None | Dialect-specific. Map from dotted module path to SQL source, loaded lazily when the analyzer encounters an import (e.g. Perfetto's INCLUDE PERFETTO MODULE). Ignored by dialects without module support. |
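When the schema already lives in a SQLite database, the ddl argument can be fed from the database itself. The sketch below uses only the standard-library sqlite3 module to collect the stored CREATE statements. ddl_from_database is a hypothetical helper, not part of syntaqlite.

```python
import sqlite3


def ddl_from_database(conn):
    """Return the CREATE TABLE / CREATE VIEW statements stored in a database."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type IN ('table', 'view') "
        "AND sql IS NOT NULL "
        "AND name NOT LIKE 'sqlite_%' "  # skip SQLite's internal tables
        "ORDER BY rowid"
    ).fetchall()
    return "".join(f"{sql};\n" for (sql,) in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE VIEW active AS SELECT id FROM users")
ddl = ddl_from_database(conn)
print(ddl)
```

The resulting string could then be passed as syntaqlite.Schema(ddl=ddl).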

syntaqlite.Table

syntaqlite.Table(name, columns=None)

| Parameter | Type | Description |
| --- | --- | --- |
| name | str | Table name |
| columns | list[str] \| None | Column names. When None, any column reference is accepted. |

syntaqlite.View

syntaqlite.View(name, columns=None)

Same fields as Table.
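Structured Table and View definitions need only a name and a column list, so they can be derived from an existing database. The sketch below gathers that information with the standard-library sqlite3 module and the pragma_table_info table-valued function (available in SQLite 3.16 and later). relation_columns is a hypothetical helper, not part of syntaqlite.

```python
import sqlite3


def relation_columns(conn):
    """Return ({table: [columns]}, {view: [columns]}) for a SQLite database."""
    tables, views = {}, {}
    relations = conn.execute(
        "SELECT name, type FROM sqlite_master "
        "WHERE type IN ('table', 'view') AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    for name, kind in relations:
        columns = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM pragma_table_info(?) ORDER BY cid", (name,)
            )
        ]
        target = tables if kind == "table" else views
        target[name] = columns
    return tables, views


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE VIEW active AS SELECT id FROM users")
tables, views = relation_columns(conn)
print(tables)  # {'users': ['id', 'name']}
print(views)   # {'active': ['id']}
```

Each entry maps onto the constructors above, for example syntaqlite.Table(name, columns) for every item in tables and syntaqlite.View(name, columns) for every item in views.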

Result types

Analysis

Returned by analyze when output=AnalysisOutput.STRUCTURED.

| Attribute | Type | Description |
| --- | --- | --- |
| diagnostics | list[Diagnostic] | All diagnostics, aggregated across statements |
| statements | list[Statement] | Per-statement analysis |
| lineage | Lineage \| None | Column lineage of the final query-bearing statement; None when no statement had a query body |

Statement

Per-statement analysis, available on Analysis.statements.

| Attribute | Type | Description |
| --- | --- | --- |
| source | str | The SQL source text for this statement |
| diagnostics | list[Diagnostic] | Diagnostics for this statement |
| defined_relations | list[DefinedRelation] | Tables/views defined by DDL statements |
| lineage | Lineage \| None | Column lineage; None if this statement has no query body |

Lineage

Column lineage for a query-bearing statement.

| Attribute | Type | Description |
| --- | --- | --- |
| complete | bool | True if all sources were fully resolved |
| columns | list[ColumnLineage] | Per-column lineage |
| relations | list[RelationAccess] | Catalog relations referenced directly in FROM |
| physical_tables | list[str] | Physical table names accessed after CTE/view expansion |
| unexpanded_views | list[str] | Views whose bodies weren't available for expansion (non-empty implies complete=False) |

ColumnLineage

| Attribute | Type | Description |
| --- | --- | --- |
| name | str | Output column name (alias or inferred) |
| index | int | Zero-based position in the result column list |
| origin | ColumnOrigin \| None | Origin table/column, or None for expressions/literals/aggregates |

ColumnOrigin

| Attribute | Type | Description |
| --- | --- | --- |
| table | str | Physical table name |
| column | str | Column name in that table |
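The lineage attributes above can be flattened into a simple mapping from output column to source column. The sketch below relies only on the documented attribute names (columns, name, origin, table, column). origin_map is a hypothetical helper, and the sample lineage is built from SimpleNamespace stand-ins rather than real analyzer output.

```python
from types import SimpleNamespace


def origin_map(lineage):
    """Map each output column name to 'table.column', or None if computed.

    If two output columns share a name, the later one wins.
    """
    result = {}
    for col in lineage.columns:
        origin = col.origin
        if origin is None:
            result[col.name] = None
        else:
            result[col.name] = f"{origin.table}.{origin.column}"
    return result


# Stand-in for the lineage of: SELECT id AS user_id, count(*) AS n FROM users
lineage = SimpleNamespace(
    complete=True,
    columns=[
        SimpleNamespace(
            name="user_id",
            index=0,
            origin=SimpleNamespace(table="users", column="id"),
        ),
        SimpleNamespace(name="n", index=1, origin=None),
    ],
)
print(origin_map(lineage))  # {'user_id': 'users.id', 'n': None}
```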

RelationAccess

| Attribute | Type | Description |
| --- | --- | --- |
| name | str | Relation name as it appears in the catalog |
| kind | str | "table" or "view" |

Diagnostic

| Attribute | Type | Description |
| --- | --- | --- |
| severity | str | "error", "warning", "info", or "hint" |
| message | str | Diagnostic message |
| start_offset | int | Byte offset of the start of the span |
| end_offset | int | Byte offset of the end of the span |
| code | DiagnosticCode | Machine-readable kind |
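Editors and CI annotations usually want a line and column rather than a byte offset. The sketch below converts a start_offset into a 1-based (line, column) pair. It assumes offsets index into the UTF-8 encoding of the source and that columns count characters rather than bytes. Neither is stated above, and line_and_column is a hypothetical helper.

```python
def line_and_column(source, byte_offset):
    """Convert a byte offset into a 1-based (line, column) pair.

    Assumes `byte_offset` indexes the UTF-8 encoding of `source`;
    the column counts characters, not bytes.
    """
    prefix = source.encode("utf-8")[:byte_offset].decode("utf-8", errors="ignore")
    line = prefix.count("\n") + 1
    # rfind returns -1 when there is no newline, so the line starts at index 0.
    line_start = prefix.rfind("\n") + 1
    column = len(prefix) - line_start + 1
    return line, column


print(line_and_column("SELECT nme FROM users", 7))     # (1, 8)
print(line_and_column("SELECT 1;\nSELECT é, nme", 21))  # (2, 11)
```

The first call reproduces the 1:8 position shown in the rendered analyze example earlier in this reference.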

AnalysisOutput

StrEnum selecting the return shape of analyze:

| Name | Value | Meaning |
| --- | --- | --- |
| STRUCTURED | "structured" | Return an Analysis (default) |
| TEXT | "text" | Return a rendered diagnostics string |

RenderOptions

Options that shape AnalysisOutput.TEXT output.

syntaqlite.RenderOptions(*, source_name="")

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| source_name | str | "" | Source label shown in rendered diagnostics (analogous to a file path). |

DiagnosticCode

IntEnum of machine-readable diagnostic kinds:

| Name | Value |
| --- | --- |
| PARSE_ERROR | 0 |
| UNKNOWN_TABLE | 1 |
| UNKNOWN_COLUMN | 2 |
| UNKNOWN_FUNCTION | 3 |
| UNKNOWN_MODULE | 4 |
| FUNCTION_ARITY | 5 |
| CTE_COLUMN_COUNT_MISMATCH | 6 |
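Because code is an IntEnum, diagnostics can be grouped by kind without parsing message text. In the sketch below, Code is a local stand-in that mirrors three values from the table above so the snippet runs without the package. messages_by_code is a hypothetical helper, and the messages are invented for illustration.

```python
from collections import defaultdict
from enum import IntEnum
from types import SimpleNamespace


class Code(IntEnum):
    """Stand-in mirroring part of DiagnosticCode, for illustration only."""

    PARSE_ERROR = 0
    UNKNOWN_TABLE = 1
    UNKNOWN_COLUMN = 2


def messages_by_code(diagnostics):
    """Group diagnostic messages under the name of their code."""
    grouped = defaultdict(list)
    for diagnostic in diagnostics:
        grouped[diagnostic.code.name].append(diagnostic.message)
    return dict(grouped)


diagnostics = [
    SimpleNamespace(code=Code.UNKNOWN_COLUMN, message="unknown column 'nme'"),
    SimpleNamespace(code=Code.UNKNOWN_TABLE, message="unknown table 'usr'"),
    SimpleNamespace(code=Code.UNKNOWN_COLUMN, message="unknown column 'idd'"),
]
print(messages_by_code(diagnostics)["UNKNOWN_COLUMN"])
# ["unknown column 'nme'", "unknown column 'idd'"]
```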

DefinedRelation

| Attribute | Type | Description |
| --- | --- | --- |
| name | str | Relation name |
| is_view | bool | True for views, False for tables |
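When analyzing a migration script, it can be handy to list what the script defines. The sketch below walks Analysis.statements and splits defined_relations by is_view, using only the documented attribute names. defined_names is a hypothetical helper, and the sample analysis is a SimpleNamespace stand-in.

```python
from types import SimpleNamespace


def defined_names(analysis):
    """Return (table_names, view_names) defined across all statements."""
    tables, views = [], []
    for statement in analysis.statements:
        for relation in statement.defined_relations:
            if relation.is_view:
                views.append(relation.name)
            else:
                tables.append(relation.name)
    return tables, views


# Stand-in for analyzing: CREATE TABLE users (...); CREATE VIEW active AS ...
analysis = SimpleNamespace(
    statements=[
        SimpleNamespace(
            defined_relations=[SimpleNamespace(name="users", is_view=False)]
        ),
        SimpleNamespace(
            defined_relations=[SimpleNamespace(name="active", is_view=True)]
        ),
        SimpleNamespace(defined_relations=[]),
    ]
)
print(defined_names(analysis))  # (['users'], ['active'])
```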

Exceptions

syntaqlite.FormatError

Raised by format_sql when the input SQL cannot be parsed. Inherits from Exception.

try:
    sq.format_sql("SELECT FROM")
except syntaqlite.FormatError as e:
    print(e)

syntaqlite.SyntaqliteError

Base class for runtime errors raised by a Syntaqlite instance — for example, calls made after close. Inherits from RuntimeError.