[SPARK-21529][SQL] Improve the error message for unsupported Hive union type #56929

Open. AgenticSpark wants to merge 1 commit into apache:branch-4.x from AgenticSpark:agenticspark/SPARK-21529-branch-4.x

@AgenticSpark
Contributor

What changes were proposed in this pull request?

Backport of #56775 to branch-4.x, requested by @MaxGekk in #56775 (comment) because the original change conflicts on this branch.

Detect unsupported Hive uniontype<...> values when converting Hive FieldSchema types to Spark SQL types and raise a dedicated UNSUPPORTED_HIVE_TYPE error instead of the generic CANNOT_RECOGNIZE_HIVE_TYPE parser error.
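As a rough illustration of the idea (not the code in this PR), the check can be thought of as a guard that runs on the Hive type string before it reaches the type parser. The sketch below is Spark-free, and every name in it (`HiveTypeCheck`, `containsUnionType`, `checkSupported`, `UnsupportedHiveTypeException`) as well as the message wording is hypothetical. It also assumes that a nested `uniontype<...>` should be rejected the same way as a top-level one.

```scala
import java.util.Locale

// Hypothetical exception standing in for the UNSUPPORTED_HIVE_TYPE error condition.
// The message wording here is illustrative only, not Spark's actual template.
final case class UnsupportedHiveTypeException(hiveType: String, column: String)
  extends RuntimeException(
    s"[UNSUPPORTED_HIVE_TYPE] Hive type '$hiveType' of column '$column' is not supported.")

object HiveTypeCheck {
  // True when the Hive type string mentions a union type anywhere,
  // including nested positions such as array<uniontype<int,string>>.
  // Assumption: matching on the normalized text is good enough for a sketch.
  def containsUnionType(hiveType: String): Boolean =
    hiveType.toLowerCase(Locale.ROOT).replaceAll("\\s+", "").contains("uniontype<")

  // Fail early with the dedicated error instead of letting the parser
  // report a generic "cannot recognize" failure.
  def checkSupported(column: String, hiveType: String): Unit =
    if (containsUnionType(hiveType)) {
      throw UnsupportedHiveTypeException(hiveType, column)
    }
}
```

With this shape, `HiveTypeCheck.checkSupported("c", "uniontype<int,string>")` throws and carries both the Hive type and the column name, while `HiveTypeCheck.checkSupported("c", "array<int>")` returns normally.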

This is a cherry-pick of the merged master commit c90cad6. The only conflict was in error-conditions.json: branch-4.x does not have the UNSUPPORTED_HIVE_FUNCTION_TYPE / UNSUPPORTED_HIVE_METASTORE_VERSION_FOR_JAVA entries that exist on master, so the new UNSUPPORTED_HIVE_TYPE entry is placed directly between UNSUPPORTED_GROUPING_EXPRESSION and UNSUPPORTED_INSERT. The Scala changes apply unchanged.
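The placement can be sanity-checked with a small ordering test. This is a minimal sketch that assumes the entries are ordered by plain string comparison of their keys; the check actually performed by SparkThrowableSuite may differ, and `ErrorConditionOrder` and `outOfOrder` are hypothetical names.

```scala
object ErrorConditionOrder {
  // Returns each adjacent pair of keys that is not in strictly ascending order.
  // An empty result means the sequence is sorted and free of duplicates.
  def outOfOrder(keys: Seq[String]): Seq[(String, String)] =
    keys.zip(keys.drop(1)).filter { case (a, b) => a.compareTo(b) >= 0 }

  // Neighbouring keys on branch-4.x after the conflict resolution.
  val branch4x: Seq[String] = Seq(
    "UNSUPPORTED_GROUPING_EXPRESSION",
    "UNSUPPORTED_HIVE_TYPE",
    "UNSUPPORTED_INSERT")

  // Neighbouring keys on master, which has two extra Hive entries.
  val master: Seq[String] = Seq(
    "UNSUPPORTED_GROUPING_EXPRESSION",
    "UNSUPPORTED_HIVE_FUNCTION_TYPE",
    "UNSUPPORTED_HIVE_METASTORE_VERSION_FOR_JAVA",
    "UNSUPPORTED_HIVE_TYPE",
    "UNSUPPORTED_INSERT")
}
```

Under that assumption, `ErrorConditionOrder.outOfOrder` returns an empty sequence for both lists, so the new key sorts correctly with or without the two master-only entries.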

Why are the changes needed?

Spark SQL does not support Hive union types. Today the failure message comes from the parser path and does not clearly identify that the Hive union type is unsupported.

Does this PR introduce any user-facing change?

Yes. Reading a Hive table column that uses uniontype<...> now reports UNSUPPORTED_HIVE_TYPE with the offending Hive type and column name.

How was this patch tested?

Cherry-picked from the merged master commit c90cad6, which passed CI and review as #56775. The production Scala hunks apply unchanged on branch-4.x; only the error-conditions.json entry placement differed and was re-validated (valid JSON, alphabetical ordering, one structural token per line). CI here runs HiveClientImplSuite and the SparkThrowableSuite "Error conditions are correctly formatted" golden check.

Was this patch authored or co-authored using generative AI tooling?

Yes. GitHub Copilot assisted with preparing and validating this change.

…on type

Detect unsupported Hive `uniontype<...>` values when converting Hive `FieldSchema` types to Spark SQL types and raise a dedicated `UNSUPPORTED_HIVE_TYPE` error instead of the generic `CANNOT_RECOGNIZE_HIVE_TYPE` parser error.

Spark SQL does not support Hive union types. Today the failure message comes from the parser path and does not clearly identify that the Hive union type is unsupported.

Yes. Reading a Hive table column that uses `uniontype<...>` now reports `UNSUPPORTED_HIVE_TYPE` with the offending Hive type and column name.

- `SPARK_GENERATE_GOLDEN_FILES=1 build/sbt "core/testOnly *SparkThrowableSuite -- -t \"Error conditions are correctly formatted\""`
- `build/sbt "hive/testOnly *HiveClientImplSuite"`

Yes. GitHub Copilot assisted with preparing and validating this change.

Closes apache#56775 from AgenticSpark/agenticspark/SPARK-21529-uniontype-error.

Authored-by: AgenticSpark <jianglie2023@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
(cherry picked from commit c90cad6)