tree-sitter-postgres
v1.1.1
PostgreSQL grammar for tree-sitter, generated from Postgres source
tree-sitter-postgres
A tree-sitter grammar for PostgreSQL, generated directly from PostgreSQL's Bison grammar (gram.y) and keyword list (kwlist.h).
Features
- Current as of PostgreSQL 18 (generated from REL_18_3)
- 727 grammar rules covering the full PostgreSQL SQL syntax
- 494 case-insensitive keywords across all four PG keyword categories
- Correct operator precedence: 1 + 2 * 3 parses as 1 + (2 * 3)
- PL/pgSQL support via a separate grammar with language injection
- Generated, not hand-written — regenerate for any PostgreSQL version
Quick start
npm install
cd postgres && npx tree-sitter generate && npx tree-sitter test
Regenerating from PostgreSQL source
The grammar is generated from a local PostgreSQL checkout. Set PG_SOURCE_DIR to point at your PostgreSQL source tree:
export PG_SOURCE_DIR=/path/to/postgres
# Using just (recommended)
just generate
# Or run the script directly
node script/generate-grammar.js "$PG_SOURCE_DIR"
cd postgres && npx tree-sitter generate
Input files
| File | Source |
| ----------------------------- | -------------------------------------------- |
| src/backend/parser/gram.y | Bison grammar (733 rules, 3236 alternatives) |
| src/include/parser/kwlist.h | Keyword definitions (494 keywords) |
Generator scripts
| Script | Purpose |
| ------------------------------- | ------------------------------------------------------------------------ |
| script/generate-grammar.js | Orchestrator — reads PG source, writes postgres/grammar.js |
| script/parse-gram-y.js | Parses Bison grammar: rules, terminals, precedence, %prec annotations |
| script/parse-kwlist.js | Parses keyword list into categories |
| script/codegen.js | Generates tree-sitter grammar with precedence and optional-rule handling |
| postgres/harvest-conflicts.sh | Iteratively discovers GLR conflicts needed by tree-sitter |
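To give a feel for the keyword-parsing step, the sketch below pulls PG_KEYWORD entries out of kwlist.h-style text, groups them by category, and builds a case-insensitive pattern for a keyword. It is not the code in script/parse-kwlist.js: the names parseKwlist and caseInsensitive are hypothetical, the four-argument PG_KEYWORD form is assumed from recent PostgreSQL releases, and the sample input is a small hand-written excerpt.

```javascript
// Hypothetical sketch; not the actual script/parse-kwlist.js.
// Assumes entries of the form PG_KEYWORD("name", TOKEN, CATEGORY, LABEL).
const KEYWORD_RE = /PG_KEYWORD\("(\w+)",\s*(\w+),\s*(\w+),\s*(\w+)\)/g;

// Group keywords by category (UNRESERVED_KEYWORD, RESERVED_KEYWORD, ...).
function parseKwlist(text) {
  const categories = {};
  for (const [, name, token, category] of text.matchAll(KEYWORD_RE)) {
    if (!categories[category]) categories[category] = [];
    categories[category].push({ name, token });
  }
  return categories;
}

// Build a pattern that matches a keyword in any letter case,
// e.g. "select" becomes [sS][eE][lL][eE][cC][tT].
function caseInsensitive(word) {
  const source = [...word]
    .map((ch) =>
      /[a-z]/i.test(ch) ? `[${ch.toLowerCase()}${ch.toUpperCase()}]` : ch
    )
    .join("");
  return new RegExp(source);
}

// Illustrative excerpt in kwlist.h format (assumed, not read from a checkout).
const sample = `
PG_KEYWORD("abort", ABORT_P, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("between", BETWEEN, COL_NAME_KEYWORD, BARE_LABEL)
PG_KEYWORD("select", SELECT, RESERVED_KEYWORD, BARE_LABEL)
`;

const parsed = parseKwlist(sample);
console.log(Object.keys(parsed));
console.log(caseInsensitive("select").source);
```

The pattern is left unanchored here; a real grammar would also need to make sure a keyword does not match as a prefix of a longer identifier.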
Repository structure
postgres/                   PostgreSQL SQL grammar
  grammar.js                Generated tree-sitter grammar
  src/                      Generated parser (C)
  test/corpus/              Test cases (35 tests)
  known-conflicts.json      GLR conflict pairs
plpgsql/                    PL/pgSQL grammar
  grammar.js                Hand-written tree-sitter grammar
  src/scanner.c             External scanner for dollar-quoting and keywords
  test/corpus/              Test cases
  queries/                  Highlights and injection queries
script/                     Shared generator code
  generate-grammar.js       SQL grammar orchestrator
  parse-gram-y.js           Bison parser
  parse-kwlist.js           Keyword parser
  codegen.js                Grammar code generator
bindings/                   Language bindings (Node, Rust, Python, Go, Swift, C)
Design notes
Empty rule handling
Bison's /* EMPTY */ alternatives cannot be directly translated — tree-sitter forbids non-start rules that match the empty string. The generator propagates optionality upward via a fixpoint loop and wraps references with optional() at call sites.
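The fixpoint idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in script/codegen.js: the function names findNullableRules and renderAlternative are hypothetical, the grammar is a toy whose rule names are made up rather than taken from gram.y, and an empty array stands in for a /* EMPTY */ alternative.

```javascript
// Hypothetical sketch of nullable-rule detection; not the actual generator.
// A grammar maps each rule name to its alternatives; each alternative is a
// list of symbols, and [] represents a Bison /* EMPTY */ alternative.
function findNullableRules(grammar) {
  const nullable = new Set();
  let changed = true;
  // Fixpoint loop: repeat until a full pass adds no new nullable rule.
  while (changed) {
    changed = false;
    for (const [name, alternatives] of Object.entries(grammar)) {
      if (nullable.has(name)) continue;
      // Nullable if some alternative contains only nullable symbols
      // (vacuously true for the empty alternative).
      if (alternatives.some((alt) => alt.every((sym) => nullable.has(sym)))) {
        nullable.add(name);
        changed = true;
      }
    }
  }
  return nullable;
}

// Render one alternative, wrapping references to nullable rules in optional().
// A complete generator would also drop the empty alternatives themselves and
// handle alternatives made only of nullable symbols; both are omitted here.
function renderAlternative(alt, grammar, nullable) {
  const parts = alt.map((sym) => {
    const isRule = Object.prototype.hasOwnProperty.call(grammar, sym);
    if (!isRule) return `'${sym}'`; // terminal
    return nullable.has(sym) ? `optional($.${sym})` : `$.${sym}`;
  });
  return parts.length === 1 ? parts[0] : `seq(${parts.join(", ")})`;
}

// Toy grammar; rule names are illustrative only.
const grammar = {
  select_stmt: [["SELECT", "opt_distinct", "target_list"]],
  opt_all_or_distinct: [["ALL"], ["opt_distinct"]],
  opt_distinct: [["DISTINCT"], []],
  target_list: [["target"]],
};

const nullable = findNullableRules(grammar);
console.log([...nullable].sort());
console.log(renderAlternative(grammar.select_stmt[0], grammar, nullable));
```

In this toy input, opt_all_or_distinct only becomes nullable on the second pass, once opt_distinct is already known to be nullable, which is why a single pass over the rules is not enough.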
Operator precedence
Binary operators are split into a separate a_expr_prec rule resolved by static precedence (no GLR), while complex patterns (IS, IN, BETWEEN, LIKE, subquery operators) stay in a_expr with GLR conflict resolution. Both prec.left/prec.right (generation-time) and prec.dynamic (runtime) are emitted.
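As a rough sketch of how static precedence might be derived, the snippet below turns a list of Bison-style precedence declarations into prec.left/prec.right expressions for a_expr_prec. It is an assumption-laden illustration rather than the real codegen: the function emitBinaryRules and the shape of the declarations array are hypothetical, the declarations are a simplified arithmetic-only excerpt, and the output is plain text rather than a working grammar.js.

```javascript
// Hypothetical sketch; not the actual script/codegen.js.
// Bison lists precedence declarations from lowest to highest, so the position
// of a declaration can serve as the numeric level for prec.left/prec.right.
const declarations = [
  { assoc: "left", tokens: ["+", "-"] },
  { assoc: "left", tokens: ["*", "/", "%"] },
  { assoc: "left", tokens: ["^"] },
];

// Emit one binary-operator alternative per token as grammar.js source text.
function emitBinaryRules(decls, operand = "a_expr_prec") {
  const lines = [];
  decls.forEach(({ assoc, tokens }, index) => {
    const level = index + 1;
    const fn = assoc === "right" ? "prec.right" : "prec.left";
    for (const token of tokens) {
      lines.push(
        `${fn}(${level}, seq($.${operand}, '${token}', $.${operand}))`
      );
    }
  });
  return lines;
}

const rules = emitBinaryRules(declarations);
console.log(rules.join(",\n"));
```

Because * is assigned a higher level than +, a parser built from alternatives like these groups 1 + 2 * 3 as 1 + (2 * 3) without needing a GLR conflict entry.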
PL/pgSQL
PL/pgSQL is implemented as a separate hand-written grammar in plpgsql/ with an external scanner for dollar-quoting and context-sensitive keywords. SQL expressions and statements within PL/pgSQL blocks are delegated to the postgres grammar via tree-sitter language injection (plpgsql/queries/injections.scm).
License
BSD 3-Clause
