Escape newlines in sqlString so multiline rowConditions produce valid SQL #2214

Open
ihistand wants to merge 2 commits into dataform-co:main from ihistand:fix-multiline-rowconditions-sqlstring
@ihistand

Fixes #2201.

Problem

A multiline rowConditions entry compiles to invalid SQL:

config {
  type: "table",
  assertions: {
    rowConditions: [`
      invalidCheck
    `]
  }
}

rowConditionsAssertion embeds the condition twice: once raw inside WHERE NOT (...) (fine — multiline is valid there), and once as a string-literal label via sqlString(...). sqlString only escaped backslashes and single quotes, so a raw newline survived into the single-quoted failing_row_condition literal — which BigQuery rejects, since single-quoted literals can't span multiple lines.

Fix

The bug belongs in sqlString, not the assertion template — any caller that quotes a multiline string hits it. sqlString now also escapes newlines/carriage-returns, turning a raw newline into the two-char \n escape (which parses back to a newline, keeping the literal single-line):

return `'${stringContents
  .replace(/\\/g, "\\\\")
  .replace(/'/g, "\\'")
  .replace(/\n/g, "\\n")
  .replace(/\r/g, "\\r")}'`;

Ordering matters: backslash-doubling runs first, otherwise the \n/\r introduced here would themselves get re-escaped into \\n/\\r and print a literal backslash-n.
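
To make the ordering point concrete, here is a minimal sketch that runs the same four replacements in both orders. The function names `sqlStringSketch` and `sqlStringWrongOrder` are made up for illustration and are not Dataform exports; only the replacement chain in the first function mirrors the change above.

```typescript
// Illustrative only: these names are hypothetical, not part of Dataform's API.

// Same order as the fix: backslashes first, newlines last.
function sqlStringSketch(contents: string): string {
  const escaped = contents
    .replace(/\\/g, "\\\\") // 1. double backslashes already in the input
    .replace(/'/g, "\\'")   // 2. escape single quotes
    .replace(/\n/g, "\\n")  // 3. raw newline becomes backslash + n
    .replace(/\r/g, "\\r"); // 4. raw carriage return becomes backslash + r
  return `'${escaped}'`;
}

// Wrong order, for contrast: backslash doubling runs last, so it also
// doubles the backslashes that the earlier steps just introduced.
function sqlStringWrongOrder(contents: string): string {
  const escaped = contents
    .replace(/\n/g, "\\n")
    .replace(/\r/g, "\\r")
    .replace(/'/g, "\\'")
    .replace(/\\/g, "\\\\");
  return `'${escaped}'`;
}

const condition = "a > 0\n  AND b > 0";
console.log(sqlStringSketch(condition));     // 'a > 0\n  AND b > 0'
console.log(sqlStringWrongOrder(condition)); // 'a > 0\\n  AND b > 0'
```

In the second output the SQL parser would read `\\n` as an escaped backslash followed by the letter n, so the label would no longer round-trip to the authored condition.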

The raw WHERE NOT (...) clause is untouched — the multiline expression stays as-authored; only the label literal is normalized:

SELECT
  'a > 0\n  AND b > 0' AS failing_row_condition,
  *
FROM `project.dataset.test`
WHERE NOT (a > 0
  AND b > 0)

Why escape over triple-quoting (the issue's other suggestion): escaping reuses the quote-escaping path already in sqlString, so one code path handles quotes, backslashes, and newlines uniformly. Triple-quoting would need its own rules for embedded '''.

Test

Added a regression test in core/actions/assertion_test.ts that compiles a table with a multiline rowConditions entry and asserts the failing_row_condition literal stays single-line (escaped \n) while the WHERE NOT (...) clause keeps the raw, multiline expression.
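
The test itself is not reproduced in this description. As a rough, self-contained sketch of what it checks, the snippet below uses hypothetical stand-ins (`sqlStringSketch`, `rowConditionQuerySketch`) rather than the real compile pipeline or the actual code in core/actions/assertion_test.ts, and assumes the query shape shown in the SQL example above.

```typescript
// Hypothetical stand-ins for illustration; not the actual Dataform code or test.
function sqlStringSketch(contents: string): string {
  const escaped = contents
    .replace(/\\/g, "\\\\")
    .replace(/'/g, "\\'")
    .replace(/\n/g, "\\n")
    .replace(/\r/g, "\\r");
  return `'${escaped}'`;
}

// Assumed template shape: the condition appears once as a quoted label
// and once raw inside WHERE NOT (...).
function rowConditionQuerySketch(table: string, condition: string): string {
  return [
    "SELECT",
    `  ${sqlStringSketch(condition)} AS failing_row_condition,`,
    "  *",
    `FROM ${table}`,
    `WHERE NOT (${condition})`
  ].join("\n");
}

const multilineCondition = "a > 0\n  AND b > 0";
const query = rowConditionQuerySketch("`project.dataset.test`", multilineCondition);

// The label must fit on one physical line, with the newline escaped.
const labelLine = query
  .split("\n")
  .find(line => line.includes("AS failing_row_condition"));
if (labelLine !== "  'a > 0\\n  AND b > 0' AS failing_row_condition,") {
  throw new Error("label literal is not single-line");
}

// The WHERE clause must keep the condition exactly as authored.
if (!query.includes("WHERE NOT (a > 0\n  AND b > 0)")) {
  throw new Error("raw condition was altered");
}
```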

This was raised in #2201 and a maintainer invited a PR. Happy to adjust style or placement.

ihistand and others added 2 commits June 29, 2026 14:12
A multiline `rowConditions` entry compiled to an invalid `failing_row_condition`
string literal, because BigQuery single-quoted literals cannot span multiple
lines (issue dataform-co#2201). `sqlString` only escaped backslashes and single quotes, so
a raw newline survived into the quoted label.

Escape newlines/carriage-returns too, after backslash escaping so the introduced
escapes are not themselves doubled. The raw `WHERE NOT (...)` clause is untouched.

Compiles a table with a multiline rowConditions entry and asserts the
failing_row_condition literal stays single-line (escaped \n) while the
WHERE NOT (...) clause keeps the raw multiline expression. Covers dataform-co#2201.
ihistand requested a review from a team as a code owner June 29, 2026 19:14
ihistand requested review from kolina and removed request for a team June 29, 2026 19:14
Development

Successfully merging this pull request may close these issues.

Multiline rowConditions entries lead to invalid SQL
