Parse SQL from C
This tutorial walks you through parsing a SQL query from C using the syntaqlite source amalgamation. By the end you'll have a small program that parses a query and prints its AST with no dependencies beyond a C compiler.
1. Download the amalgamation
The source amalgamation is two files: syntaqlite_syntax.h (header) and
syntaqlite_syntax.c (implementation). Download them from the latest release:
mkdir sql-parser && cd sql-parser
curl -L https://github.com/LalitMaganti/syntaqlite/releases/latest/download/syntaqlite-syntax-amalgamation.tar.gz | tar xz
You should have:
ls
# syntaqlite_syntax.c syntaqlite_syntax.h
2. Write the program
Create parse.c:
#include "syntaqlite_syntax.h"
#include <stdint.h>  // int32_t, uint32_t
#include <stdio.h>
#include <stdlib.h>  // free
#include <string.h>
int main(int argc, char** argv) {
  const char* sql = "SELECT id, name FROM users WHERE active = 1";
  if (argc > 1) {
    sql = argv[1];
  }
  // Create a parser for the SQLite dialect.
  SyntaqliteParser* p = syntaqlite_parser_create(NULL);
  // Feed the source text.
  syntaqlite_parser_reset(p, sql, strlen(sql));
  // Parse each statement (a source string can contain multiple).
  int stmt = 0;
  for (;;) {
    int32_t rc = syntaqlite_parser_next(p);
    if (rc == SYNTAQLITE_PARSE_DONE)
      break;
    stmt++;
    if (rc == SYNTAQLITE_PARSE_ERROR) {
      fprintf(stderr, "error in statement %d: %s\n",
              stmt, syntaqlite_result_error_msg(p));
      continue;
    }
    // Print the AST.
    uint32_t root = syntaqlite_result_root(p);
    char* dump = syntaqlite_dump_node(p, root, 0);
    printf("--- statement %d ---\n%s\n", stmt, dump);
    free(dump);
  }
  syntaqlite_parser_destroy(p);
  return 0;
}
3. Compile and run
cc -O2 -o parse parse.c syntaqlite_syntax.c
Run it with the default query:
./parse
--- statement 1 ---
SelectStmt
  columns:
    ResultColumn
      expr:
        ColumnRef
          column: "id"
    ResultColumn
      expr:
        ColumnRef
          column: "name"
  from_clause:
    TableRef
      table_name: "users"
  where_clause:
    BinaryExpr
      op: EQ
      ...
Pass your own SQL:
./parse "CREATE TABLE t(x INTEGER PRIMARY KEY, y TEXT NOT NULL)"
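Queries that are too long for a command-line argument can be loaded from a file or from standard input instead. The helper below is a minimal sketch of that idea, not part of syntaqlite: `read_all` is a hypothetical name, and it uses only the C standard library. It returns the text together with its byte length, which is the pair of values `syntaqlite_parser_reset` takes in the program above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not a syntaqlite function).
 * Reads the whole stream into a malloc'd, NUL-terminated buffer and
 * stores the number of bytes read in *out_len.
 * Returns NULL on a read or allocation failure. The caller frees the result. */
char* read_all(FILE* in, size_t* out_len) {
  assert(in != NULL && out_len != NULL);
  size_t cap = 4096;
  size_t len = 0;
  char* buf = malloc(cap);
  if (buf == NULL) {
    return NULL;
  }
  for (;;) {
    /* Always keep one byte free for the terminator. */
    len += fread(buf + len, 1, cap - 1 - len, in);
    if (ferror(in)) {
      free(buf);
      return NULL;
    }
    if (feof(in)) {
      break;
    }
    if (len == cap - 1) {
      /* Buffer is full and more input remains: double the capacity. */
      cap *= 2;
      char* grown = realloc(buf, cap);
      if (grown == NULL) {
        free(buf);
        return NULL;
      }
      buf = grown;
    }
  }
  buf[len] = '\0';
  *out_len = len;
  return buf;
}
```

In parse.c you could then call `read_all(stdin, &len)` and pass the buffer and `len` to `syntaqlite_parser_reset` in place of `sql` and `strlen(sql)`. This tutorial does not say whether the parser copies the source text, so the cautious choice is to free the buffer only after `syntaqlite_parser_destroy`.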
4. Handle errors
The parser recovers from errors and keeps going. Try invalid SQL:
./parse "SELECT FROM; SELECT 1"
error in statement 1: syntax error near 'FROM'
--- statement 2 ---
SelectStmt
  columns:
    ResultColumn
      expr:
        Literal: 1
Statement 1 failed, but statement 2 still parsed successfully.
5. Access tokens
Enable the token side-table to see what the tokenizer produced:
syntaqlite_parser_set_collect_tokens(p, 1);
syntaqlite_parser_reset(p, sql, strlen(sql));
syntaqlite_parser_next(p);
uint32_t count = 0;
const SyntaqliteParserToken* tokens =
    syntaqlite_result_tokens(p, &count);
for (uint32_t i = 0; i < count; i++) {
  printf("token %u: type=%u offset=%u len=%u\n",
         i, tokens[i].type,
         tokens[i].offset, tokens[i].length);
}
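The loop above prints only positions. To see the text of each token, you can slice the original source string. The sketch below assumes that `offset` and `length` are byte counts relative to the start of the string passed to `syntaqlite_parser_reset`; check the C API reference before relying on that. `TokenSpan` and `token_text` are hypothetical names used for illustration, not syntaqlite types or functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the offset/length fields of a token.
 * This is not a syntaqlite type. */
typedef struct {
  uint32_t offset; /* assumed: byte offset into the source text */
  uint32_t length; /* assumed: token length in bytes */
} TokenSpan;

/* Returns a malloc'd, NUL-terminated copy of the token's text.
 * Returns NULL if the span does not fit inside the source or if
 * allocation fails. The caller frees the result. */
char* token_text(const char* src, size_t src_len, TokenSpan span) {
  assert(src != NULL);
  /* Reject spans that would read past the end of the source. */
  if (span.offset > src_len || span.length > src_len - span.offset) {
    return NULL;
  }
  char* out = malloc((size_t)span.length + 1);
  if (out == NULL) {
    return NULL;
  }
  memcpy(out, src + span.offset, span.length);
  out[span.length] = '\0';
  return out;
}
```

Inside the token loop, you would build a `TokenSpan` from `tokens[i].offset` and `tokens[i].length`, call `token_text(sql, strlen(sql), span)`, print the result, and free it. Copying keeps the example simple; printing in place with `printf("%.*s", (int)length, sql + offset)` avoids the allocation.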
Next steps
- The source amalgamation gives you the parser and tokenizer. For the formatter and validator, use the prebuilt shared library.
- See the C API reference for the full list of parser, tokenizer, formatter, and validator functions.