Appendix A: Grammar Notes

TL;DR

Use this appendix as a compact syntax map while reading the chapters. It summarizes the practical grammar shapes that appear in the book's verified examples.

Core query

WITH <cte_name> AS (<select_query>) [, ...]
SELECT <projection>
FROM <source> [AS <alias>]
[<join_clause> ...]
[WHERE <predicate>]
[ORDER BY <field> [ASC|DESC]]
[LIMIT <n>]
[TO <sink>]

WITH is optional. If omitted, the query starts at SELECT.
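
As a minimal sketch of the form without WITH, the query below starts directly at SELECT. It reuses the `doc` source and the `node_id` and `tag` fields from the row-selection example later in this appendix. The `LIMIT 5` value is an arbitrary illustration, and the query is an assumption-based instance of the template above rather than one of the book's verified examples.

```sql
-- No WITH clause: the query begins at SELECT.
-- Names (doc, node_id, tag) are taken from this appendix's own examples;
-- the LIMIT value is illustrative only.
SELECT n.node_id AS row_id
FROM doc AS n
WHERE n.tag = 'tr'
LIMIT 5;
```

Wrapping the same SELECT in `WITH rows AS (...)` gives the first CTE of the longer example, which is a convenient way to move from a one-off query to a multi-step one.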

Sources

Joins

Projections

SELECT a(href, tag) ... is the compact equivalent of selecting a.href, a.tag.

Extraction semantics (important):

Predicates

Notes on current behavior

The alias naming used in the examples below is a style recommendation, not a language rule.

Why this helps:

Rules of thumb:

Before:

WITH rows AS (
  SELECT n.node_id AS row_id
  FROM doc AS n
  WHERE n.tag = 'tr'
),
cells AS (
  SELECT
    r.row_id,
    c.sibling_pos AS pos,
    TEXT(c) AS val
  FROM rows AS r
  CROSS JOIN LATERAL (
    SELECT self
    FROM doc AS c
    WHERE c.parent_id = r.row_id
      AND c.tag = 'td'
  ) AS c
)
SELECT r.row_id, c.val
FROM rows AS r
JOIN cells AS c ON c.row_id = r.row_id;

After (recommended style):

WITH r_rows AS (
  SELECT node_row.node_id AS row_id
  FROM doc AS node_row
  WHERE node_row.tag = 'tr'
),
r_cells AS (
  SELECT
    r_row.row_id,
    node_cell.sibling_pos AS pos,
    TEXT(node_cell) AS val
  FROM r_rows AS r_row
  CROSS JOIN LATERAL (
    SELECT self
    FROM doc AS node_cell
    WHERE node_cell.parent_id = r_row.row_id
      AND node_cell.tag = 'td'
  ) AS node_cell
)
SELECT r_row.row_id, r_cell.val
FROM r_rows AS r_row
JOIN r_cells AS r_cell ON r_cell.row_id = r_row.row_id;

SELECT self for current row nodes

Canonical rule:

Why:

Compatibility:

Migration:

Diagnostics Quick Use

Validate grammar and semantic shape without executing the query:

./build/markql --lint "SELECT FROM doc"
./build/markql --lint "SELECT FROM doc" --format json

Color controls for human lint text:

Diagnostic references: