fix: apply recursive CTE column-list aliases to the static term#23098
fix: apply recursive CTE column-list aliases to the static term#23098tomsanbear wants to merge 3 commits into
Conversation
`WITH RECURSIVE t(n) AS (SELECT 1 UNION ALL SELECT n + 1 FROM t WHERE n < 10)` failed to plan with `Schema error: No field named n. Valid fields are t."Int64(1)".`. The CTE's declared column-list names were applied (via `apply_table_alias`) only after the whole CTE plan was built, but the recursive working relation is derived from the schema of the static term before that, so the self-reference could never resolve the declared names. Apply the column-list aliases to the static term inside `recursive_cte`, before the work table is created, so the working relation and the self-reference expose the declared names. The relation-name alias is still added by the caller; on the recursive path the column re-alias is skipped to avoid a redundant projection on top of the `RecursiveQuery` node. The non-UNION fallback applies the aliases directly. Non-recursive CTEs are unchanged. Adds sqllogictest coverage for single- and multi-column column-list recursive CTEs, UNION (DISTINCT), and the column/alias-count mismatch error.
|
Thank you for opening this pull request! Reviewer note: cargo-semver-checks reported the current version number is not SemVer-compatible with the changes in this pull request (compared against the base branch). Details |
There was a problem hiding this comment.
@tomsanbear
Thanks for the update! The changes look good to me. I just have one small suggestion that could help strengthen the test coverage.
| # recursive CTE with a column-list alias (e.g. `t(n)`): the declared names must be | ||
| # applied to the static term so the recursive self-reference can resolve them | ||
| query I rowsort | ||
| WITH RECURSIVE t(n) AS ( |
There was a problem hiding this comment.
Nice improvement! One small suggestion: could we add a regression test with a quoted recursive CTE column-list alias, for example WITH RECURSIVE t("N") AS (...) SELECT "N" FROM t? I think it would be helpful to document that quoted and case-sensitive aliases are preserved in the recursive work table as well. This is not blocking since the implementation already goes through the existing alias normalization path.
kosiew
left a comment
There was a problem hiding this comment.
Can you investigate and resolve the CI error?
For sure, thanks for the feedback! I'll have some time a bit later today and will push a fix to the branch |
|
@kosiew i've updated the snapshot along with the new test you recommended 👍 |
Which issue does this PR close?
Rationale for this change
WITH RECURSIVE t(n) AS (...)failed to plan because the CTE's declared column-list names (thet(n)part) were never applied to the recursive working relation. They were applied (viaapply_table_alias) only after the whole CTE plan was built, but the working table is derived from the static term's schema before that — so the self-reference couldn't resolve the declared names and planning failed withSchema error: No field named n. Valid fields are t."Int64(1)".. PostgreSQL and DuckDB accept the query; aliasing inside the staticSELECT(SELECT 1 AS n) was the only workaround.What changes are included in this PR?
Apply the column-list aliases to the static term inside
recursive_cte(), before the work table is created, so the working relation and the self-reference carry the declared names. The caller now applies only the relation-name alias on the recursive path (the columns are already applied), avoiding a redundant projection on top of theRecursiveQuerynode. The non-UNION fallback applies the aliases directly; non-recursive CTEs are unchanged. A column/alias-count mismatch is now reported at the static term — a clearer error than the previous "No field named …".Are these changes tested?
Yes, added
cte.sltcases for single- and multi-column column-list recursive CTEs (asserting the recursion produces the expected rows),UNION (DISTINCT), the arity-mismatch error, and anEXPLAINlocking the plan shape (no extra projection overRecursiveQuery).Are there any user-facing changes?
WITH RECURSIVE t(n) AS (...)and multi-column column lists now plan and execute correctly, matching PostgreSQL/DuckDB. No API changes.Note: this pull request was created together with AI tools (claude code), the full diff was reviewed by myself in full prior to submission