sqlode/query_analyzer/column_inferencer
Values
pub fn extract_cte_tables(
  query_name: String,
  tokens: List(lexer.Token),
  catalog: model.Catalog,
) -> Result(List(model.Table), context.AnalysisError)
Extract CTE definitions from a query’s tokens and return the
resulting virtual tables. Each name AS (body) (or
name(c1, c2) AS (body)) becomes a Table whose columns come from
running infer_columns_from_tokens on the body. RECURSIVE CTEs use
the anchor (first) branch via the existing tok_strip_compound, so
recursive self-references are not analysed.
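A minimal usage sketch. The sqlode/… import paths and lexer.tokenize are guesses standing in for the project’s real entry points; later sketches reuse these imports (plus sqlode/query_ir for the IR-driven variants).

import sqlode/lexer
import sqlode/model
import sqlode/query_analyzer/column_inferencer

pub fn cte_example(catalog: model.Catalog) {
  // lexer.tokenize is hypothetical; substitute the real tokenizer.
  let tokens =
    lexer.tokenize(
      "WITH recent(id, title) AS (SELECT id, title FROM posts)
       SELECT * FROM recent",
    )
  // Expect one virtual table named recent whose columns (id, title)
  // come from inferring the body against catalog.
  column_inferencer.extract_cte_tables("example", tokens, catalog)
}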
pub fn extract_cte_tables_from_stmt(
  query_name: String,
  stmt: query_ir.Stmt,
  tokens: List(lexer.Token),
  catalog: model.Catalog,
) -> Result(List(model.Table), context.AnalysisError)
IR-driven counterpart to extract_cte_tables. Walks
stmt.ctes to enumerate CTE definitions, then re-uses the
existing token-based extractor for body-column inference.
Returning Ok([]) when the IR reports no CTEs matches the
behaviour of the token-based extractor when no WITH … prefix
is present.
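A sketch of the Ok([]) contract in use, assuming the imports from the first sketch:

import gleam/int
import gleam/list
import sqlode/query_ir

pub fn describe_ctes(
  stmt: query_ir.Stmt,
  tokens: List(lexer.Token),
  catalog: model.Catalog,
) -> String {
  case
    column_inferencer.extract_cte_tables_from_stmt(
      "example",
      stmt,
      tokens,
      catalog,
    )
  {
    // No CTEs in the IR: the same shape the token path gives
    // when there is no WITH prefix.
    Ok([]) -> "no CTEs"
    Ok(ctes) -> int.to_string(list.length(ctes)) <> " CTE tables"
    Error(_) -> "analysis error"
  }
}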
pub fn extract_derived_tables(
  query_name: String,
  tokens: List(lexer.Token),
  catalog: model.Catalog,
) -> Result(List(model.Table), context.AnalysisError)
Find every derived table in the token stream (FROM (SELECT ...),
JOIN (SELECT ...), JOIN LATERAL (SELECT ...), and the comma form
, LATERAL (SELECT ...)) and build a virtual Table for each. Each
body is resolved against catalog via infer_columns_from_tokens,
so nested CTEs / VALUES / derived tables compose naturally. An
explicit AS alias(c1, c2) column list overrides the body’s
column names.
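A sketch showing the alias column-list override (same assumed imports as the first sketch):

pub fn derived_example(catalog: model.Catalog) {
  let tokens =
    lexer.tokenize(
      "SELECT d.n, l.m
       FROM (SELECT 1 AS n) AS d
       JOIN LATERAL (SELECT d.n + 1) AS l(m) ON TRUE",
    )
  // Two virtual tables: d (column n, named by the body) and l,
  // whose explicit alias column list (m) overrides the body's name.
  column_inferencer.extract_derived_tables("example", tokens, catalog)
}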
pub fn extract_derived_tables_from_stmt(
  query_name: String,
  stmt: query_ir.Stmt,
  tokens: List(lexer.Token),
  catalog: model.Catalog,
) -> Result(List(model.Table), context.AnalysisError)
IR-driven counterpart to extract_derived_tables. Walks the
statement’s FROM clauses (including nested FromJoin trees
and UPDATE/DELETE from/using) to check whether any derived
subquery exists. When one does, delegates to the existing
token-based extractor for body-column inference — see the
module-level comment for the rationale.
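One way a caller might dispatch between the two extractors, assuming it holds an optional IR statement from a best-effort parse (the Option wrapping is an assumption, not this module’s API):

import gleam/option

pub fn derived_tables(
  parsed: option.Option(query_ir.Stmt),
  tokens: List(lexer.Token),
  catalog: model.Catalog,
) {
  case parsed {
    option.Some(stmt) ->
      column_inferencer.extract_derived_tables_from_stmt(
        "example",
        stmt,
        tokens,
        catalog,
      )
    option.None ->
      column_inferencer.extract_derived_tables("example", tokens, catalog)
  }
}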
pub fn extract_table_aliases(
  tokens: List(lexer.Token),
  catalog: model.Catalog,
) -> List(model.Table)
Register table aliases as virtual tables pointing at the underlying
table’s columns. Scans the token stream for
FROM/JOIN/UPDATE table [AS] alias patterns and emits a
model.Table(name: alias, columns: …) copy for each alias whose
underlying table exists in the catalog. Aliases that shadow an
existing catalog entry are skipped so we don’t accidentally
replace a real base table. This is what lets p.user_id resolve
to posts.user_id when the query writes FROM posts AS p.
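The posts AS p case from above, as a sketch (same assumed imports):

pub fn alias_example(catalog: model.Catalog) {
  let tokens = lexer.tokenize("SELECT p.user_id FROM posts AS p")
  // Emits a copy of posts under the name p, provided posts is in
  // the catalog and no base table named p already exists.
  column_inferencer.extract_table_aliases(tokens, catalog)
}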
pub fn extract_table_aliases_from_stmt(
  stmt: query_ir.Stmt,
  catalog: model.Catalog,
) -> List(model.Table)
IR-driven counterpart to extract_table_aliases. Walks the
statement’s FROM clauses (including nested FromJoin trees
and UPDATE/DELETE target/alias) to register every
FROM table AS alias as a virtual table pointing at the
underlying table’s columns. Same skip guards as the
token-based path: aliases identical to the base name are
skipped, aliases that shadow an existing catalog entry are
skipped, and duplicate alias registrations collapse to the
first occurrence.
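The skip guards, spelled out against the SQL each statement would come from (the Stmt values are assumed to come from the project’s parser):

pub fn alias_guards(
  same: query_ir.Stmt,   // FROM posts AS posts   -> skipped (alias = base)
  shadow: query_ir.Stmt, // FROM posts AS users   -> skipped if users is in catalog
  dup: query_ir.Stmt,    // FROM posts p, posts p -> one virtual table p
  catalog: model.Catalog,
) {
  #(
    column_inferencer.extract_table_aliases_from_stmt(same, catalog),
    column_inferencer.extract_table_aliases_from_stmt(shadow, catalog),
    column_inferencer.extract_table_aliases_from_stmt(dup, catalog),
  )
}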
pub fn extract_values_tables(
  tokens: List(lexer.Token),
) -> List(model.Table)
Find every (VALUES ...) AS alias(c1, c2, ...) in the token stream
and build a virtual Table for each. Column types come from the first
row’s literals; rows with unsupported expressions are skipped silently
(the resolver will surface any downstream issue).
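A sketch (same assumed imports):

pub fn values_example() {
  let tokens =
    lexer.tokenize(
      "SELECT t.id, t.label
       FROM (VALUES (1, 'a'), (2, 'b')) AS t(id, label)",
    )
  // One virtual table t; the types of id and label come from the
  // first row's literals, 1 and 'a'.
  column_inferencer.extract_values_tables(tokens)
}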
pub fn extract_values_tables_from_stmt(
  stmt: query_ir.Stmt,
) -> List(model.Table)
IR-driven counterpart to extract_values_tables. Walks the
statement’s FROM clauses (and nested FromJoin trees, plus
UPDATE/DELETE from/using) to find every FromValues and
builds a model.Table using the first row’s literal types.
Rows whose expressions do not classify as simple literals are
skipped silently, matching the token-based extractor.
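The IR path needs neither tokens nor a catalog, since everything comes from the literals themselves:

pub fn values_from_ir(stmt: query_ir.Stmt) -> List(model.Table) {
  // Any FromValues nested under joins or UPDATE/DELETE from/using
  // is picked up too.
  column_inferencer.extract_values_tables_from_stmt(stmt)
}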
pub fn infer_columns_from_tokens(
  query_name: String,
  tokens: List(lexer.Token),
  catalog: model.Catalog,
) -> Result(List(model.ResultItem), context.AnalysisError)
Token-based column inference, exposed so callers (notably the CTE
virtual-table builder in query_analyzer) can reuse the SELECT
resolver without re-lexing or constructing a synthetic ParsedQuery.
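A sketch of the intended reuse; slicing the body tokens out of the outer query is assumed to have happened upstream:

pub fn cte_body_columns(
  body_tokens: List(lexer.Token),
  catalog: model.Catalog,
) {
  // body_tokens is the token slice between a CTE's parentheses;
  // no re-lexing and no synthetic ParsedQuery required.
  column_inferencer.infer_columns_from_tokens("my_cte", body_tokens, catalog)
}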
pub fn infer_columns_from_tokens_scoped(
  query_name: String,
  tokens: List(lexer.Token),
  catalog: model.Catalog,
  outer_tables: List(String),
) -> Result(List(model.ResultItem), context.AnalysisError)
Same as infer_columns_from_tokens, but when the inner query has no
FROM clause of its own, fall back to outer_tables so correlated
references like SELECT (SELECT books.id) can still be resolved
against the enclosing query’s FROM list.
Every entry point also runs VALUES and derived-table discovery over its own token scope and augments the catalog before resolution, so nested subqueries pick up sibling virtual tables without extra work from the caller.
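The correlated-subquery case from above, as a sketch (same assumed imports):

pub fn scoped_example(catalog: model.Catalog) {
  let inner = lexer.tokenize("SELECT books.id")
  // The inner query has no FROM clause of its own, so books.id
  // resolves against the outer FROM list passed as outer_tables.
  column_inferencer.infer_columns_from_tokens_scoped(
    "inner",
    inner,
    catalog,
    ["books"],
  )
}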
pub fn infer_result_columns(
  ctx: context.AnalyzerContext,
  engine: model.Engine,
  query: model.ParsedQuery,
  tokens: List(lexer.Token),
  statement: query_ir.SqlStatement,
  catalog: model.Catalog,
) -> Result(List(model.ResultItem), context.AnalysisError)