40 changes: 40 additions & 0 deletions lib/shortcuts/api.md
@@ -12,6 +12,9 @@ a Lambda permission.</p>
<dt><a href="#GlueDatabase">GlueDatabase</a></dt>
<dd><p>Create a Glue Database.</p>
</dd>
<dt><a href="#GlueIcebergTable">GlueIcebergTable</a></dt>
<dd><p>Create a Glue table backed by Apache Iceberg format on S3.</p>
</dd>
<dt><a href="#GlueJsonTable">GlueJsonTable</a></dt>
<dd><p>Create a Glue Table backed by line-delimited JSON files on S3.</p>
</dd>
@@ -202,6 +205,43 @@ const db = new cf.shortcuts.GlueDatabase({

module.exports = cf.merge(myTemplate, db);
```
<a name="GlueIcebergTable"></a>

## GlueIcebergTable
Create a Glue table backed by Apache Iceberg format on S3.

**Kind**: global class
<a name="new_GlueIcebergTable_new"></a>

### new GlueIcebergTable(options)

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| options | <code>Object</code> | | Options for creating an Iceberg table. |
| options.LogicalName | <code>String</code> | | The logical name of the Glue Table within the CloudFormation template. |
| options.Name | <code>String</code> | | The name of the table. |
| options.DatabaseName | <code>String</code> | | The name of the database the table resides in. |
| options.Location | <code>String</code> | | The physical location of the table (S3 URI). Required. |
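| [options.Columns] | <code>Array.&lt;Object&gt;</code> | | Simple column definitions as an array of {Name, Type, Required} objects. Required defaults to true when omitted, and field IDs are assigned sequentially starting at 1. Columns are converted to the Iceberg Schema format automatically and are ignored when options.Schema is provided. Use options.Schema for full control. |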
| [options.Schema] | <code>Object</code> | | Full Iceberg schema definition with Type: "struct" and Fields array. Either Columns or Schema must be provided. See [AWS documentation](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-glue-table-icebergtableinput.html). |
| [options.PartitionSpec] | <code>Object</code> | | Iceberg partition specification. See [AWS documentation](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-glue-table-partitionspec.html). |
| [options.WriteOrder] | <code>Object</code> | | Iceberg write order specification. See [AWS documentation](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-glue-table-writeorder.html). |
| [options.CatalogId] | <code>String</code> | <code>AccountId</code> | The AWS account ID for the account in which to create the table. |
| [options.IcebergVersion] | <code>String</code> | <code>&#x27;2&#x27;</code> | The table version for the Iceberg table. |
| [options.EnableOptimizer] | <code>Boolean</code> | <code>false</code> | Whether to enable the snapshot retention optimizer. |
| [options.OptimizerRoleArn] | <code>String</code> | | The ARN of the IAM role for the retention optimizer. Required if EnableOptimizer is true. |
| [options.SnapshotRetentionPeriodInDays] | <code>Number</code> | <code>5</code> | The number of days to retain snapshots. |
| [options.NumberOfSnapshotsToRetain] | <code>Number</code> | <code>1</code> | The minimum number of snapshots to retain. |
| [options.CleanExpiredFiles] | <code>Boolean</code> | <code>true</code> | Whether to delete expired data files after expiring snapshots. |
| [options.EnableCompaction] | <code>Boolean</code> | <code>false</code> | Whether to enable the compaction optimizer. |
| [options.CompactionRoleArn] | <code>String</code> | | The ARN of the IAM role for the compaction optimizer. Required if EnableCompaction is true. |
| [options.EnableOrphanFileDeletion] | <code>Boolean</code> | <code>false</code> | Whether to enable the orphan file deletion optimizer. |
| [options.OrphanFileDeletionRoleArn] | <code>String</code> | | The ARN of the IAM role for the orphan file deletion optimizer. Required if EnableOrphanFileDeletion is true. |
| [options.OrphanFileRetentionPeriodInDays] | <code>Number</code> | <code>3</code> | The number of days to retain orphan files before deleting them. |
| [options.OrphanFileDeletionLocation] | <code>String</code> | | The S3 location to scan for orphan files. |
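| [options.Condition] | <code>String</code> | | The name of a CloudFormation condition to apply to the table. Also applied to any optimizer resources that are created. |
| [options.DependsOn] | <code>String</code> | | The logical name of a resource the table depends on. Applied to the table resource only. |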
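
The relationship between `options.Columns` and the generated Iceberg schema can be illustrated with a minimal sketch. The helper name `columnsToIcebergSchema` is hypothetical and not part of the library; it only mirrors the conversion the constructor performs when `options.Schema` is not supplied, and the column names and types are made up for illustration.

```javascript
'use strict';

// Hypothetical helper (not exported by the library) that mirrors the
// Columns -> Schema conversion performed inside the GlueIcebergTable constructor.
function columnsToIcebergSchema(columns) {
  return {
    Type: 'struct',
    Fields: columns.map((col, index) => ({
      Name: col.Name,
      Type: col.Type,
      // Field IDs are assigned sequentially, starting at 1.
      Id: index + 1,
      // A column is treated as required unless Required is set explicitly.
      Required: col.Required !== undefined ? col.Required : true
    }))
  };
}

// Illustrative columns; the names and types are assumptions.
const schema = columnsToIcebergSchema([
  { Name: 'id', Type: 'long' },
  { Name: 'note', Type: 'string', Required: false }
]);

console.log(JSON.stringify(schema, null, 2));
```

Because IDs follow array order, reordering `Columns` changes the field IDs that are generated. Passing `options.Schema` directly is the way to keep explicit control over them.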

<a name="GlueJsonTable"></a>

## GlueJsonTable
227 changes: 227 additions & 0 deletions lib/shortcuts/glue-iceberg-table.js
@@ -0,0 +1,227 @@
'use strict';

/**
 * Create a Glue table backed by Apache Iceberg format on S3.
 *
 * @param {Object} options - Options for creating an Iceberg table.
 * @param {String} options.LogicalName - The logical name of the Glue Table within the CloudFormation template.
 * @param {String} options.Name - The name of the table.
 * @param {String} options.DatabaseName - The name of the database the table resides in.
 * @param {String} options.Location - The physical location of the table (S3 URI). Required.
 * @param {Array<Object>} [options.Columns] - Simple column definitions as an array of {Name, Type, Required} objects.
 * Required defaults to true when omitted, and field IDs are assigned sequentially starting at 1.
 * Columns are converted to the Iceberg Schema format automatically and are ignored when
 * options.Schema is provided. Use options.Schema for full control.
 * @param {Object} [options.Schema] - Full Iceberg schema definition with Type: "struct" and Fields array.
 * Either Columns or Schema must be provided. See [AWS
 * documentation](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-glue-table-icebergtableinput.html).
 * @param {Object} [options.PartitionSpec] - Iceberg partition specification. See [AWS
 * documentation](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-glue-table-partitionspec.html).
 * @param {Object} [options.WriteOrder] - Iceberg write order specification. See [AWS
 * documentation](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-glue-table-writeorder.html).
 * @param {String} [options.CatalogId=AccountId] - The AWS account ID for the account in which to create the table.
 * @param {String} [options.IcebergVersion='2'] - The table version for the Iceberg table.
 * @param {Boolean} [options.EnableOptimizer=false] - Whether to enable the snapshot retention optimizer.
 * @param {String} [options.OptimizerRoleArn=undefined] - The ARN of the IAM role for the retention optimizer. Required if EnableOptimizer is true.
 * @param {Number} [options.SnapshotRetentionPeriodInDays=5] - The number of days to retain snapshots.
 * @param {Number} [options.NumberOfSnapshotsToRetain=1] - The minimum number of snapshots to retain.
 * @param {Boolean} [options.CleanExpiredFiles=true] - Whether to delete expired data files after expiring snapshots.
 * @param {Boolean} [options.EnableCompaction=false] - Whether to enable the compaction optimizer.
 * @param {String} [options.CompactionRoleArn=undefined] - The ARN of the IAM role for the compaction optimizer. Required if EnableCompaction is true.
 * @param {Boolean} [options.EnableOrphanFileDeletion=false] - Whether to enable the orphan file deletion optimizer.
 * @param {String} [options.OrphanFileDeletionRoleArn=undefined] - The ARN of the IAM role for the orphan file deletion optimizer. Required if EnableOrphanFileDeletion is true.
 * @param {Number} [options.OrphanFileRetentionPeriodInDays=3] - The number of days to retain orphan files before deleting them.
 * @param {String} [options.OrphanFileDeletionLocation=undefined] - The S3 location to scan for orphan files. Defaults to the table location when omitted.
 * @param {String} [options.Condition=undefined] - The name of a CloudFormation condition to apply to the table.
 * Also applied to any optimizer resources that are created.
 * @param {String} [options.DependsOn=undefined] - The logical name of a resource the table depends on.
 * Applied to the table resource only.
 */
class GlueIcebergTable {
  constructor(options) {
    if (!options) throw new Error('Options required');
    const {
      LogicalName,
      Name,
      DatabaseName,
      Location,
      Columns,
      Schema,
      PartitionSpec,
      WriteOrder,
      CatalogId = { Ref: 'AWS::AccountId' },
      IcebergVersion = '2',
      EnableOptimizer = false,
      OptimizerRoleArn,
      SnapshotRetentionPeriodInDays = 5,
      NumberOfSnapshotsToRetain = 1,
      CleanExpiredFiles = true,
      EnableCompaction = false,
      CompactionRoleArn,
      EnableOrphanFileDeletion = false,
      OrphanFileDeletionRoleArn,
      OrphanFileRetentionPeriodInDays = 3,
      OrphanFileDeletionLocation,
      Condition,
      DependsOn
    } = options;

    // Validate required fields
    const required = [LogicalName, Name, DatabaseName, Location];
    if (required.some((variable) => !variable))
      throw new Error('You must provide a LogicalName, Name, DatabaseName, and Location');

    if (!Columns && !Schema)
      throw new Error('You must provide either Columns or Schema');

    if (EnableOptimizer && !OptimizerRoleArn)
      throw new Error('You must provide an OptimizerRoleArn when EnableOptimizer is true');

    if (EnableCompaction && !CompactionRoleArn)
      throw new Error('You must provide a CompactionRoleArn when EnableCompaction is true');

    if (EnableOrphanFileDeletion && !OrphanFileDeletionRoleArn)
      throw new Error('You must provide an OrphanFileDeletionRoleArn when EnableOrphanFileDeletion is true');

    // Convert simple Columns format to Iceberg Schema format if needed
    let icebergSchema = Schema;
    if (!Schema && Columns) {
      icebergSchema = {
        Type: 'struct',
        Fields: Columns.map((col, index) => ({
          Name: col.Name,
          Type: col.Type,
          Id: index + 1,
          Required: col.Required !== undefined ? col.Required : true
        }))
      };
    }

    // Build the Iceberg table resource (no TableInput!)
    this.Resources = {
      [LogicalName]: {
        Type: 'AWS::Glue::Table',
        Condition,
        DependsOn,
        Properties: {
          CatalogId,
          DatabaseName,
          Name,
          OpenTableFormatInput: {
            IcebergInput: {
              MetadataOperation: 'CREATE',
              Version: IcebergVersion,
              IcebergTableInput: {
                Location,
                Schema: icebergSchema
              }
            }
          }
        }
      }
    };

    // Add optional PartitionSpec if provided
    if (PartitionSpec) {
      this.Resources[LogicalName].Properties.OpenTableFormatInput.IcebergInput.IcebergTableInput.PartitionSpec = PartitionSpec;
    }

    // Add optional WriteOrder if provided
    if (WriteOrder) {
      this.Resources[LogicalName].Properties.OpenTableFormatInput.IcebergInput.IcebergTableInput.WriteOrder = WriteOrder;
    }

    // Optionally add TableOptimizer for configuring snapshot retention
    if (EnableOptimizer) {
      const optimizerLogicalName = `${LogicalName}RetentionOptimizer`;
      this.Resources[optimizerLogicalName] = {
        Type: 'AWS::Glue::TableOptimizer',
        DependsOn: LogicalName,
        Properties: {
          CatalogId,
          DatabaseName,
          TableName: Name,
          Type: 'retention',
          TableOptimizerConfiguration: {
            RoleArn: OptimizerRoleArn,
            Enabled: true,
            RetentionConfiguration: {
              IcebergConfiguration: {
                SnapshotRetentionPeriodInDays,
                NumberOfSnapshotsToRetain,
                CleanExpiredFiles
              }
            }
          }
        }
      };

      // Apply Condition to optimizer if specified on the table
      if (Condition) {
        this.Resources[optimizerLogicalName].Condition = Condition;
      }
    }

    // Optionally add TableOptimizer for compaction
    // NOTE: CloudFormation does not support CompactionConfiguration properties
    // (strategy, minInputFiles, deleteFileThreshold). These must be configured
    // via AWS CLI/API after stack creation, or will use AWS defaults.
    // See: https://github.com/aws-cloudformation/cloudformation-coverage-roadmap/issues/2257
    if (EnableCompaction) {
      const compactionLogicalName = `${LogicalName}CompactionOptimizer`;
      this.Resources[compactionLogicalName] = {
        Type: 'AWS::Glue::TableOptimizer',
        DependsOn: LogicalName,
        Properties: {
          CatalogId,
          DatabaseName,
          TableName: Name,
          Type: 'compaction',
          TableOptimizerConfiguration: {
            RoleArn: CompactionRoleArn,
            Enabled: true
          }
        }
      };

      // Apply Condition to compaction optimizer if specified on the table
      if (Condition) {
        this.Resources[compactionLogicalName].Condition = Condition;
      }
    }

    // Optionally add TableOptimizer for orphan file deletion
    if (EnableOrphanFileDeletion) {
      const orphanLogicalName = `${LogicalName}OrphanFileDeletionOptimizer`;
      const icebergConfiguration = {
        OrphanFileRetentionPeriodInDays
      };

      // Only add Location if specified, otherwise it defaults to table location
      if (OrphanFileDeletionLocation) {
        icebergConfiguration.Location = OrphanFileDeletionLocation;
      }

      this.Resources[orphanLogicalName] = {
        Type: 'AWS::Glue::TableOptimizer',
        DependsOn: LogicalName,
        Properties: {
          CatalogId,
          DatabaseName,
          TableName: Name,
          Type: 'orphan_file_deletion',
          TableOptimizerConfiguration: {
            RoleArn: OrphanFileDeletionRoleArn,
            Enabled: true,
            OrphanFileDeletionConfiguration: {
              IcebergConfiguration: icebergConfiguration
            }
          }
        }
      };

      // Apply Condition to orphan file deletion optimizer if specified on the table
      if (Condition) {
        this.Resources[orphanLogicalName].Condition = Condition;
      }
    }
  }
}

module.exports = GlueIcebergTable;
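
The constructor above adds up to three `AWS::Glue::TableOptimizer` resources next to the table, each named by appending a fixed suffix to `LogicalName`. The sketch below lists the logical names a given set of options would produce. The helper `optimizerLogicalNames` is hypothetical and not part of this change, and `MyTable` is an illustrative value; it only restates the naming rules, which can be handy when writing `Ref` or `DependsOn` entries elsewhere in a template.

```javascript
'use strict';

// Hypothetical helper (not part of the library): returns the logical names of
// the optimizer resources that GlueIcebergTable would add for these options.
function optimizerLogicalNames(options) {
  const {
    LogicalName,
    EnableOptimizer = false,
    EnableCompaction = false,
    EnableOrphanFileDeletion = false
  } = options;

  const names = [];
  // Suffixes copied from the constructor's template literals.
  if (EnableOptimizer) names.push(`${LogicalName}RetentionOptimizer`);
  if (EnableCompaction) names.push(`${LogicalName}CompactionOptimizer`);
  if (EnableOrphanFileDeletion) names.push(`${LogicalName}OrphanFileDeletionOptimizer`);
  return names;
}

// Illustrative options; 'MyTable' is an assumption.
const names = optimizerLogicalNames({
  LogicalName: 'MyTable',
  EnableOptimizer: true,
  EnableCompaction: true
});

console.log(names);
```

With all three flags left at their default of `false`, the list is empty and only the table resource is emitted.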
1 change: 1 addition & 0 deletions lib/shortcuts/index.js
@@ -16,6 +16,7 @@ module.exports = {
GlueJsonTable: require('./glue-json-table'),
GlueOrcTable: require('./glue-orc-table'),
GlueParquetTable: require('./glue-parquet-table'),
GlueIcebergTable: require('./glue-iceberg-table'),
GluePrestoView: require('./glue-presto-view'),
GlueSparkView: require('./glue-spark-view'),
hookshot: require('./hookshot'),