Appendix C: Function Reference
TL;DR
This appendix is a function checklist, not a tutorial. Pair it with chapter examples when you need behavioral context.
To list the functions your build provides at runtime, run:
./build/markql --mode plain --color=disabled --query "SHOW FUNCTIONS;"
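If you want the same listing from a script, the command above can be wrapped with Python's standard `subprocess` module. This is a minimal sketch, not part of MarkQL: the helper names `show_functions_argv` and `list_functions` are hypothetical, the binary path and flags are copied verbatim from the command above, and it assumes the listing is written to standard output and that a failed query returns a non-zero exit code.

```python
import subprocess
from typing import List


def show_functions_argv(binary: str = "./build/markql") -> List[str]:
    """Build the argument vector for the SHOW FUNCTIONS command shown above."""
    # Flags are taken as-is from the appendix; nothing else is assumed about the CLI.
    return [binary, "--mode", "plain", "--color=disabled", "--query", "SHOW FUNCTIONS;"]


def list_functions(binary: str = "./build/markql") -> str:
    """Run the command and return whatever it wrote to stdout (assumed to be the listing)."""
    result = subprocess.run(
        show_functions_argv(binary),
        capture_output=True,  # collect stdout/stderr instead of inheriting the terminal
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


if __name__ == "__main__":
    print(list_functions())
```

Passing the arguments as a list avoids shell quoting problems with the `SHOW FUNCTIONS;` query string.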
Extraction
- TEXT(tag|self[, where/index])
- DIRECT_TEXT(tag|self)
- ATTR(tag|self, attr[, where/index])
- FIRST_TEXT(...), LAST_TEXT(...)
- FIRST_ATTR(...), LAST_ATTR(...)
- INNER_HTML(tag|self[, depth|MAX_DEPTH])
- RAW_INNER_HTML(tag|self[, depth|MAX_DEPTH])
Schema construction
- FLATTEN_TEXT(tag[, depth])
- FLATTEN(tag[, depth]) (alias)
- PROJECT(tag) AS (alias: expr, ...)
- FLATTEN_EXTRACT(tag) (compat alias of PROJECT)
Expression helpers
- COALESCE(a, b, ...)
- CASE WHEN ... THEN ... [ELSE ...] END
- TRIM, LTRIM, RTRIM
- LOWER, UPPER
- REPLACE
- CONCAT
- SUBSTRING / SUBSTR
- LENGTH / CHAR_LENGTH
- POSITION(substr IN str) / LOCATE
Aggregation and analytics
- COUNT(tag|*)
- SUMMARIZE(*)
- TFIDF(...)
Source constructors
- RAW('<html...>')
- PARSE('<html...>')
- PARSE(SELECT inner_html(...) FROM doc ...)
- FRAGMENTS(...) (deprecated; prefer PARSE(...))
Behavior and constraints
- self refers to the current node for the current row produced by FROM.
- Inside axis scopes such as EXISTS(descendant WHERE ...), self is rebound to the node being evaluated in that scope.
- attr.foo is shorthand for attributes.foo in operand paths (including alias/axis-qualified forms).
- TEXT()/INNER_HTML()/RAW_INNER_HTML() require an outer WHERE clause.
- The outer WHERE must include a non-tag self predicate (for example attributes.* or parent.*), not only tag = ....
- INNER_HTML(tag) and RAW_INNER_HTML(tag) default to depth 1 when depth is omitted.
- INNER_HTML(tag, MAX_DEPTH) and RAW_INNER_HTML(tag, MAX_DEPTH) auto-expand to each row’s max_depth.
- In one SELECT, INNER_HTML/RAW_INNER_HTML depth mode must be consistent (do not mix default, numeric depth, and MAX_DEPTH).
- In one SELECT, do not mix INNER_HTML() and RAW_INNER_HTML() projections.
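
The two single-SELECT rules (one depth mode, and no mixing of INNER_HTML with RAW_INNER_HTML) can be illustrated with a small standalone checker. This is only a sketch of the rules as stated in this appendix, not MarkQL's actual validation logic: `depth_mode` and `check_select` are hypothetical names, the input is assumed to be the text of a SELECT projection list, and the regular expression only handles calls without nested parentheses.

```python
import re
from typing import List

# Hypothetical pattern: finds INNER_HTML(...) / RAW_INNER_HTML(...) calls
# whose argument list contains no nested parentheses.
_CALL = re.compile(r"\b(RAW_INNER_HTML|INNER_HTML)\s*\(([^()]*)\)", re.IGNORECASE)


def depth_mode(args: str) -> str:
    """Classify the optional second argument as default, numeric, or max_depth."""
    parts = [p.strip() for p in args.split(",")]
    if len(parts) < 2 or parts[1] == "":
        return "default"      # depth omitted
    if parts[1].upper() == "MAX_DEPTH":
        return "max_depth"
    return "numeric"          # anything else is treated as an explicit depth


def check_select(projection: str) -> List[str]:
    """Return the rule violations found in one SELECT projection list."""
    functions = set()
    modes = set()
    for name, args in _CALL.findall(projection):
        functions.add(name.upper())
        modes.add(depth_mode(args))
    problems = []
    if len(modes) > 1:
        problems.append("mixed depth modes: " + ", ".join(sorted(modes)))
    if len(functions) > 1:
        problems.append("INNER_HTML and RAW_INNER_HTML mixed in one SELECT")
    return problems


if __name__ == "__main__":
    print(check_select("INNER_HTML(div), INNER_HTML(span, 2)"))
    # prints ['mixed depth modes: default, numeric']
```

An empty list means the projection satisfies both rules; the other constraints above (such as the outer WHERE requirement) are not checked by this sketch.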