Introduce archive ability for table partitions #6915
+2,267
−4
[spark][filesystems][core] Introduce archive ability for table partitions
Purpose
Linked issue: close #5510
Implements archive functionality for Paimon table partitions to optimize storage costs by moving partition files to Archive/ColdArchive storage tiers in S3 and OSS. Supports archive, restore, and unarchive operations via Spark SQL DDL.
Tests
ArchivePartitionActionTest (9 tests)
ArchivePartitionActionITCase (3 test templates)
ArchivePartitionSQLTest (8 tests)
API and Format
New APIs:
StorageType enum (Standard, Archive, ColdArchive)
FileIO.archive(), FileIO.restoreArchive(), FileIO.unarchive()
SQL Syntax:
ALTER TABLE table PARTITION (dt='2024-01-01') ARCHIVE;
ALTER TABLE table PARTITION (dt='2024-01-01') COLD ARCHIVE;
ALTER TABLE table PARTITION (dt='2024-01-01') RESTORE ARCHIVE;
ALTER TABLE table PARTITION (dt='2024-01-01') UNARCHIVE;
Storage Format: No changes. Original paths preserved in metadata (in-place archiving).
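For reviewers, here is a minimal sketch of how the three operations could relate under in-place archiving. It is not the code in this PR. `ArchiveFileIO`, `InMemoryArchiveFileIO`, and `PartitionArchiver` are hypothetical names, the method signatures are assumed (the real `FileIO` signatures may differ), and the semantics are an assumption modeled on typical object-store behavior: `restoreArchive` makes an archived file temporarily readable without changing its tier, while `unarchive` moves it back to Standard. Only the `StorageType` values and the three method names come from the description above.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Storage tiers as listed in this PR description.
enum StorageType { Standard, Archive, ColdArchive }

// Hypothetical stand-in for the archive-related FileIO methods (signatures assumed).
interface ArchiveFileIO {
    void archive(String path, StorageType type);
    void restoreArchive(String path);
    void unarchive(String path);
}

// In-memory illustration: only the tier recorded for a path changes, never the path itself.
final class InMemoryArchiveFileIO implements ArchiveFileIO {
    private final Map<String, StorageType> tiers = new HashMap<>();
    private final Set<String> restored = new HashSet<>();

    // Registers a file in the Standard tier.
    void addFile(String path) {
        tiers.put(path, StorageType.Standard);
    }

    @Override
    public void archive(String path, StorageType type) {
        requireKnown(path);
        if (type == StorageType.Standard) {
            throw new IllegalArgumentException("archive target must be Archive or ColdArchive");
        }
        tiers.put(path, type);
        restored.remove(path); // any earlier temporary restore no longer applies
    }

    @Override
    public void restoreArchive(String path) {
        requireKnown(path);
        if (tiers.get(path) == StorageType.Standard) {
            throw new IllegalStateException("file is not archived: " + path);
        }
        restored.add(path); // assumed: readable again, tier unchanged
    }

    @Override
    public void unarchive(String path) {
        requireKnown(path);
        tiers.put(path, StorageType.Standard);
        restored.remove(path);
    }

    StorageType tierOf(String path) {
        requireKnown(path);
        return tiers.get(path);
    }

    boolean isReadable(String path) {
        return tierOf(path) == StorageType.Standard || restored.contains(path);
    }

    private void requireKnown(String path) {
        if (!tiers.containsKey(path)) {
            throw new IllegalArgumentException("unknown file: " + path);
        }
    }
}

// Hypothetical helper: archives every file of one partition at its original path.
final class PartitionArchiver {
    static void archivePartition(ArchiveFileIO io, List<String> files, StorageType type) {
        for (String file : files) {
            io.archive(file, type);
        }
    }
}
```

In this sketch, `PARTITION (...) COLD ARCHIVE` would correspond to `PartitionArchiver.archivePartition(io, files, StorageType.ColdArchive)`. Because file paths are never rewritten, the table metadata that references them stays valid, which is the point of the "no storage format changes" note.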
Documentation
docs/content/concepts/archive.md
docs/content/spark/sql-alter.md with archive syntax