20 changes: 20 additions & 0 deletions ticdc/ticdc-csv.md
@@ -28,6 +28,7 @@ quote = '"'
null = '\N'
include-commit-ts = true
output-old-value = false
output-field-header = false # New in v8.5.5
```

## Transactional constraints
@@ -51,6 +52,12 @@ In the CSV file, each column is defined as follows:
- Column 5: The `is-update` column only exists when the value of `output-old-value` is true, which is used to identify whether the row data change comes from the UPDATE event (the value of the column is true) or the INSERT/DELETE event (the value is false).
- Column 6 to the last column: One or more columns with data changes.

When `output-field-header = true`, the CSV file includes a header row. The column names in the header row are as follows:

| Column 1 | Column 2 | Column 3 | Column 4 (optional) | Column 5 (optional) | Column 6 | ... | Last column |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `ticdc-meta$operation` | `ticdc-meta$table` | `ticdc-meta$schema` | `ticdc-meta$commit-ts` | `ticdc-meta$is-update` | The first column with data changes | ... | The last column with data changes |
Comment on lines +55 to +59

low

To improve clarity, I suggest rephrasing this section. The current description of header columns for data fields (columns 6 onwards) is a bit confusing as it mixes literal header names with descriptions. This can mislead the user into thinking "The first column with data changes" is the literal header name. My suggestion clarifies that data columns use the actual column names from the source table as headers, using placeholders for illustration.

Suggested change
When `output-field-header = true`, the CSV file includes a header row. The header names for metadata columns are fixed, while the header names for data columns are the actual column names from your table. The header row is structured as follows:

| Column 1 | Column 2 | Column 3 | Column 4 (optional) | Column 5 (optional) | Column 6 | ... | Last column |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `ticdc-meta$operation` | `ticdc-meta$table` | `ticdc-meta$schema` | `ticdc-meta$commit-ts` | `ticdc-meta$is-update` | {first-data-column-name} | ... | {last-data-column-name} |
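
To make the header layout concrete, here is a minimal Python sketch of how the expected header row could be derived from the two optional settings. The `build_header` helper is hypothetical and not part of TiCDC. It assumes that `ticdc-meta$commit-ts` appears only when `include-commit-ts = true` and `ticdc-meta$is-update` only when `output-old-value = true`, mirroring the column descriptions in the document.

```python
# Hypothetical helper (not a TiCDC API): builds the header fields that a
# consumer might expect when output-field-header = true.
META_PREFIX = "ticdc-meta$"


def build_header(data_columns, include_commit_ts=True, output_old_value=False):
    """Return the expected header fields as a list of strings."""
    # The first three metadata columns are always present.
    header = [META_PREFIX + "operation", META_PREFIX + "table", META_PREFIX + "schema"]
    # Assumption: column 4 exists only when include-commit-ts = true.
    if include_commit_ts:
        header.append(META_PREFIX + "commit-ts")
    # Assumption: column 5 exists only when output-old-value = true.
    if output_old_value:
        header.append(META_PREFIX + "is-update")
    # Data columns use the source table's own column names.
    return header + list(data_columns)


if __name__ == "__main__":
    columns = ["Id", "LastName", "FirstName", "HireDate", "OfficeLocation"]
    print(",".join(build_header(columns, include_commit_ts=True, output_old_value=True)))
```

With both options enabled and the `hr.employee` columns as input, the joined result should match the header line shown in the CSV example of this change.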


Assume that table `hr.employee` is defined as follows:

```sql
@@ -85,6 +92,19 @@ When `include-commit-ts = true` and `output-old-value = true`, the DML events of
"I","employee","hr",433305438660591630,true,102,"Alex","Alice","2018-06-15","Beijing"
```

When `include-commit-ts = true`, `output-old-value = true`, and `output-field-header = true`, the DML events of this table are stored in the CSV format as follows:

```csv
ticdc-meta$operation,ticdc-meta$table,ticdc-meta$schema,ticdc-meta$commit-ts,ticdc-meta$is-update,Id,LastName,FirstName,HireDate,OfficeLocation
"I","employee","hr",433305438660591626,false,101,"Smith","Bob","2014-06-04","New York"
"D","employee","hr",433305438660591627,true,101,"Smith","Bob","2015-10-08","Shanghai"
"I","employee","hr",433305438660591627,true,101,"Smith","Bob","2015-10-08","Los Angeles"
"D","employee","hr",433305438660591629,false,101,"Smith","Bob","2017-03-13","Dallas"
"I","employee","hr",433305438660591630,false,102,"Alex","Alice","2017-03-14","Shanghai"
"D","employee","hr",433305438660591630,true,102,"Alex","Alice","2017-03-14","Beijing"
"I","employee","hr",433305438660591630,true,102,"Alex","Alice","2018-06-15","Beijing"
```
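
As a sanity check on the example above, the following Python sketch reads such a file by header name rather than by column position, using only the standard `csv` module. The `read_events` function and the `SAMPLE` string are illustrative, not part of TiCDC. The sketch assumes the default comma delimiter and `"` quote character from the configuration, and it assumes no table column name starts with `ticdc-meta$`.

```python
import csv
import io

# Three rows copied from the example above, including the header row.
SAMPLE = '''ticdc-meta$operation,ticdc-meta$table,ticdc-meta$schema,ticdc-meta$commit-ts,ticdc-meta$is-update,Id,LastName,FirstName,HireDate,OfficeLocation
"I","employee","hr",433305438660591626,false,101,"Smith","Bob","2014-06-04","New York"
"D","employee","hr",433305438660591627,true,101,"Smith","Bob","2015-10-08","Shanghai"
"I","employee","hr",433305438660591627,true,101,"Smith","Bob","2015-10-08","Los Angeles"
'''

META_PREFIX = "ticdc-meta$"


def read_events(text):
    """Parse CSV text with a header row into a list of event dictionaries."""
    events = []
    for row in csv.DictReader(io.StringIO(text)):
        events.append({
            "op": row[META_PREFIX + "operation"],
            "table": row[META_PREFIX + "schema"] + "." + row[META_PREFIX + "table"],
            "commit_ts": int(row[META_PREFIX + "commit-ts"]),
            # The flag is written as the literal text true or false.
            "is_update": row[META_PREFIX + "is-update"] == "true",
            # Everything without the metadata prefix is treated as table data.
            "data": {k: v for k, v in row.items() if not k.startswith(META_PREFIX)},
        })
    return events


if __name__ == "__main__":
    for event in read_events(SAMPLE):
        print(event["op"], event["table"], event["is_update"], event["data"]["OfficeLocation"])
```

Reading by name means a consumer does not need to know in advance whether the optional metadata columns are present, although the lookups for `commit-ts` and `is-update` above would need guarding if either option were disabled.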

## Data type mapping

| MySQL type | CSV type | Example | Description |