fix(cli): avoid splitting emoji when truncating display strings #28224

Open
feizhuzheng wants to merge 1 commit into google-gemini:main from feizhuzheng:fix-sanitize-surrogate

@feizhuzheng

Summary

sanitizeForDisplay truncates with str.length and str.substring, which count UTF-16 code units. When maxLength falls inside a surrogate pair (an emoji or other astral character), the pair is split and the leftover lone surrogate renders as a replacement character. This affects terminal notification titles/bodies and slash-command descriptions, which can contain emoji.
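To make the failure mode concrete, here is a minimal sketch of code-unit truncation. `truncateByCodeUnits` is a hypothetical stand-in for the pre-fix logic, not the actual body of sanitizeForDisplay, which also does sanitizing that is omitted here.

```typescript
// "🎉" is U+1F389, stored in UTF-16 as the surrogate pair \uD83C \uDF89.
const party = '🎉';
const codeUnits = party.length; // 2 UTF-16 code units
const codePoints = Array.from(party).length; // 1 code point

// Hypothetical stand-in for the pre-fix truncation: length and substring
// both count UTF-16 code units.
function truncateByCodeUnits(str: string, maxLength: number): string {
  if (maxLength && str.length > maxLength) {
    return str.substring(0, maxLength - 3) + '...';
  }
  return str;
}

// substring(0, 5) keeps two whole emoji (4 code units) plus the high
// surrogate of a third, which is left without its low surrogate.
const truncated = truncateByCodeUnits(party.repeat(10), 8);
const danglingUnit = truncated.charCodeAt(4); // 0xD83C, a lone high surrogate
```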

Details

The same file already exports cpLen and cpSlice, which operate on code points. sanitizeForDisplay now uses them, so length checks and truncation happen on code-point boundaries and a trailing emoji is either kept whole or dropped, never cut in half.

For a purely ASCII string the behavior is unchanged (cpLen/cpSlice short-circuit on the ASCII fast path).
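A rough sketch of the code-point approach follows. The `isAscii`, `cpLen`, and `cpSlice` definitions below are assumptions written for illustration; the real helpers in textUtils.ts may be implemented differently. `truncateByCodePoints` is likewise a hypothetical name covering only the truncation step.

```typescript
// Hypothetical ASCII check used for the fast path.
function isAscii(str: string): boolean {
  for (let i = 0; i < str.length; i++) {
    if (str.charCodeAt(i) > 127) return false;
  }
  return true;
}

// Assumed shape of cpLen: count code points, not UTF-16 code units.
function cpLen(str: string): number {
  return isAscii(str) ? str.length : Array.from(str).length;
}

// Assumed shape of cpSlice: slice on code-point boundaries.
function cpSlice(str: string, start: number, end?: number): string {
  if (isAscii(str)) return str.slice(start, end);
  return Array.from(str).slice(start, end).join('');
}

// Truncation step only; the sanitizing done by sanitizeForDisplay is omitted.
function truncateByCodePoints(str: string, maxLength: number): string {
  if (maxLength && cpLen(str) > maxLength) {
    return cpSlice(str, 0, maxLength - 3) + '...';
  }
  return str;
}

const emojiResult = truncateByCodePoints('🎉'.repeat(10), 8); // five whole emoji + "..."
const asciiResult = truncateByCodePoints('hello world', 8); // "hello..."
```

Array.from iterates a string by code point, so each emoji occupies one array slot and can only be kept or dropped as a unit.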

How to Validate

sanitizeForDisplay('🎉'.repeat(10), 8)
// before: "🎉🎉\uD83C..."  (lone surrogate -> renders as )
// after:  "🎉🎉🎉🎉🎉..."

Added a unit test in textUtils.test.ts covering this case (asserts the result contains no lone surrogate) alongside the existing ASCII truncation test.
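One way such a "no lone surrogate" assertion could be expressed is sketched below. `hasLoneSurrogate` is a hypothetical helper, not taken from textUtils.test.ts, and the two sample strings are the before/after values quoted above.

```typescript
// Returns true if the string contains a high surrogate that is not followed
// by a low surrogate, or a low surrogate that is not preceded by a high one.
function hasLoneSurrogate(str: string): boolean {
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      const next = str.charCodeAt(i + 1); // NaN when past the end
      if (!(next >= 0xdc00 && next <= 0xdfff)) return true;
      i++; // skip the low surrogate that completes the pair
    } else if (unit >= 0xdc00 && unit <= 0xdfff) {
      return true;
    }
  }
  return false;
}

const beforeFix = '🎉🎉\uD83C...';
const afterFix = '🎉🎉🎉🎉🎉...';
const beforeIsBroken = hasLoneSurrogate(beforeFix); // true
const afterIsBroken = hasLoneSurrogate(afterFix); // false
```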

sanitizeForDisplay measured length with String.length and cut with
substring, both of which count UTF-16 code units. When maxLength lands
inside a surrogate pair (an emoji or other astral character), the pair is
split and the leftover lone surrogate renders as a replacement character
in notification titles/bodies and command descriptions.

Use the cpLen/cpSlice helpers already defined in this file so truncation
happens on code point boundaries.
@feizhuzheng feizhuzheng requested a review from a team as a code owner July 1, 2026 00:05
@google-cla

google-cla Bot commented Jul 1, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@gemini-code-assist

Contributor

Summary of Changes

Hello, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an issue where display strings containing emojis or other astral characters were being incorrectly truncated. By switching from UTF-16 code unit-based string operations to code-point-aware utilities, the implementation now ensures that multi-unit characters remain intact during truncation, improving the visual consistency of terminal outputs and command descriptions.

Highlights

  • Emoji Truncation Fix: Updated the sanitizeForDisplay function to use code-point-aware length checking and slicing instead of standard string methods.
  • Surrogate Pair Integrity: Ensured that astral characters like emojis are not split during truncation, preventing the generation of lone surrogate characters that cause rendering issues.
  • Regression Testing: Added a new unit test in textUtils.test.ts to verify that emoji strings are truncated correctly without leaving invalid surrogate pairs.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request updates sanitizeForDisplay in textUtils.ts to use code point length (cpLen) and slicing (cpSlice) instead of UTF-16 code unit operations, preventing surrogate pairs (such as emojis) from being split during truncation. A corresponding unit test was also added. The reviewer identified a potential bug where a maxLength of less than 3 would result in a negative end index for cpSlice, which behaves differently than substring and can cause unexpected truncation behavior. A suggestion was provided to use Math.max(0, maxLength - 3) to prevent this issue.

Comment on lines +161 to 163
if (maxLength && cpLen(sanitized) > maxLength) {
  sanitized = cpSlice(sanitized, 0, maxLength - 3) + '...';
}

Contributor

Severity: high

Using cpSlice with maxLength - 3 can result in a negative end index if maxLength is less than 3. Unlike String.prototype.substring (which treats negative indices as 0), cpSlice uses slice under the hood, where a negative index is treated as an offset from the end of the string/array. This causes the function to return a much longer string than intended (e.g., slicing all but the last character) and append '...', leading to a bug.

To fix this, we should ensure the end index is at least 0 by using Math.max(0, maxLength - 3).

Suggested change
- if (maxLength && cpLen(sanitized) > maxLength) {
-   sanitized = cpSlice(sanitized, 0, maxLength - 3) + '...';
- }
+ if (maxLength && cpLen(sanitized) > maxLength) {
+   sanitized = cpSlice(sanitized, 0, Math.max(0, maxLength - 3)) + '...';
+ }
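The difference between the two index conventions can be seen with built-in string methods alone. This sketch assumes, as the comment above states, that cpSlice delegates to slice; `text` and `maxLength` are illustrative values.

```typescript
const text = 'abcdefghij';
const maxLength = 2; // maxLength - 3 === -1

// substring clamps a negative end index to 0, so nothing is kept.
const viaSubstring = text.substring(0, maxLength - 3); // ""

// slice treats a negative end index as an offset from the end,
// so everything except the last character is kept.
const viaSlice = text.slice(0, maxLength - 3); // "abcdefghi"

// Clamping the end index restores the substring behavior.
const clamped = text.slice(0, Math.max(0, maxLength - 3)); // ""
```

With the clamp in place, the truncated result for a maxLength of 1 or 2 is just '...', which is still three characters long.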

@github-actions github-actions Bot added the size/s (A small PR) label Jul 1, 2026
@github-actions

github-actions Bot commented Jul 1, 2026

📊 PR Size: size/S

  • Lines changed: 16
  • Additions: +14
  • Deletions: -2
  • Files changed: 2

Labels

size/s: A small PR
status/need-issue: Pull requests that need to have an associated issue.
