Skip to content

Conversation

@ericsciple
Copy link
Collaborator

@ericsciple ericsciple commented Dec 5, 2025

Overview

This PR optimizes the webhook payload JSON files used for GitHub Actions autocompletion and hover documentation, achieving a 55% size reduction for these files:

Metric Before After Reduction
Minified 464 KB 209 KB 55%
Gzipped 63 KB 22 KB 65%

Changes

New Optimizations

  1. Compact array format — Params converted from verbose objects to type-dispatched arrays:

    // Before
    {"name": "issue", "description": "The issue itself.", "childParamsGroups": [...]}
    
    // After  
    ["issue", "The issue itself.", [...]]
  2. Object deduplication — Shared structures extracted to objects.json and referenced by index

  3. String interning — 414 duplicate property names stored in strings.json and referenced by index

Size Reduction

File Before After Gzip
webhooks 265 KB 141 KB 13 KB
objects 199 KB 62 KB 7 KB
strings 6 KB 2 KB
Total 464 KB 209 KB 22 KB

Testing

  • All 395 existing tests pass
  • Added validation test that compares optimized output against full source (webhooks.full.json) to ensure no data loss
  • Lint and format checks pass

Documentation

Updated docs/json-data-files.md with:

  • Compact format specification
  • String interning explanation
  • Deduplication details
  • Size reduction breakdown

@ericsciple ericsciple force-pushed the users/ericsciple/25-12-webhooks branch from bfb514f to 4dacd2c Compare December 6, 2025 05:53
@ericsciple ericsciple changed the title Add JSON optimization plan for webhooks/objects data files Optimize webhooks/objects JSON with string interning Dec 6, 2025
@ericsciple ericsciple changed the title Optimize webhooks/objects JSON with string interning Optimize webhooks JSON with compact structure and string interning Dec 6, 2025
@ericsciple ericsciple force-pushed the users/ericsciple/25-12-webhooks branch from 4dacd2c to 27aec5d Compare December 6, 2025 06:23
run: cd languageservice && npm test -- --testPathPattern=eventPayloads
- name: Verify validation tests ran
run: |
if [ ! -f languageservice/src/context-providers/events/webhooks.full.validation-complete ]; then
Copy link
Collaborator Author

@ericsciple ericsciple Dec 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the test is normally skipped when the full webhooks file doesn't exist, I made the test write a marker file. With the marker file, we can be sure the test actually ran.

The full webhooks file (unoptimized) is written above by the script npm run update-webhooks

exit 1
fi
validate-webhooks:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this job to validate the optimized webhooks JSON is equivalent to the full webhooks JSON file.

Refer changes to the file eventPayloads.test.ts in this PR

@ericsciple ericsciple force-pushed the users/ericsciple/25-12-webhooks branch from 27aec5d to 8fbd093 Compare December 6, 2025 06:32
@@ -1,310 +0,0 @@
import {promises as fs} from "fs";
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed this file to update-webhooks.ts but there were also significant updates anyway

@@ -1,7 +1,8 @@
import {data, DescriptionDictionary} from "@actions/expressions";

Copy link
Collaborator Author

@ericsciple ericsciple Dec 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eventsPayloads.ts reads the compact webhook JSON files which contains the keys and descriptions for each GitHub event payload. This information is used for github.event.* syntax validation, autocompletion, and hover descriptions.

A unit test that runs during CI validates the compact webhook JSON files are parsed into a form that exactly matches the full unoptimized webhooks JSON.

The script update-webhooks.ts downloads the full webhook JSON from the GitHub REST API description repository:

import schemaImport from "rest-api-description/descriptions/api.github.com/dereferenced/api.github.com.deref.json";

@ericsciple ericsciple force-pushed the users/ericsciple/25-12-webhooks branch 2 times, most recently from b541ddf to 10dd69f Compare December 6, 2025 06:54
@ericsciple ericsciple force-pushed the users/ericsciple/25-12-webhooks branch from 10dd69f to 2b6d46c Compare December 6, 2025 06:59
@ericsciple ericsciple marked this pull request as ready for review December 6, 2025 07:04
@ericsciple ericsciple requested a review from a team as a code owner December 6, 2025 07:04
Copilot AI review requested due to automatic review settings December 6, 2025 07:04
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes webhook payload JSON files used for GitHub Actions autocompletion and hover documentation by introducing a compact array format, object deduplication, and string interning. The optimizations achieve a 55% reduction in minified file size (464 KB → 209 KB) and 65% reduction when gzipped (63 KB → 22 KB).

Key Changes:

  • Introduced compact array format for params using type-based dispatch instead of verbose objects
  • Implemented string interning to deduplicate 414 property names into a shared strings.json file
  • Extracted event filtering configuration from TypeScript to event-filters.json for use in both generation and test code

Reviewed changes

Copilot reviewed 11 out of 14 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
languageservice/src/context-providers/events/strings.json New string table containing 414 interned property names referenced by index
languageservice/src/context-providers/events/eventPayloads.ts Updated to load separate string table and decode compact format with string interning support
languageservice/src/context-providers/events/eventPayloads.test.ts Added comprehensive validation tests comparing optimized output against full source
languageservice/src/context-providers/events/event-filters.json New JSON file defining dropped/kept events, replacing hardcoded TypeScript constants
languageservice/script/webhooks/update-webhooks.ts New generation script implementing compact format conversion, deduplication, and string interning
languageservice/script/webhooks/event-filters.ts TypeScript constants for event filtering (note: appears unused, actual imports use JSON file)
languageservice/script/webhooks/deduplicate.ts Updated to handle both compact and legacy param formats with helper functions
languageservice/package.json Updated scripts to include new strings.json in minification and point to new update-webhooks script
docs/json-data-files.md Comprehensive documentation of compact format, string interning, and optimization details
.gitignore Updated patterns for new full.json and validation-complete marker files
.github/workflows/buildtest.yml Added separate validate-webhooks job to verify optimization correctness
languageservice/script/webhooks/index.ts Removed (replaced by update-webhooks.ts)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

// Write optimized output with separate string table
// Format: strings.json has string table, webhooks.json/objects.json reference by index
const finalOutput = {
o: internedObjects,
Copy link

Copilot AI Dec 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The o key (deduplicated objects) is written to webhooks.json but is redundant since objects are already written to a separate objects.json file. The reader code loads from objects.json directly (line 102 in eventPayloads.ts) and skips the o key when processing webhooks.json (line 107). Consider removing the o key from finalOutput to avoid confusion and reduce file size slightly.

Suggested change
o: internedObjects,

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +93
/**
* Event filter lists for webhook processing.
*
* These lists categorize GitHub webhook events into those that are valid
* workflow triggers (KEPT_EVENTS) and those that are not (DROPPED_EVENTS).
*
* When new events are added to the GitHub API, they must be added to one
* of these lists. The update-webhooks script will fail if it encounters
* an unknown event.
*
* See: https://docs.github.com/en/actions/writing-workflows/choosing-when-your-workflow-runs/events-that-trigger-workflows
*/

/**
* Events to drop - not valid workflow triggers (GitHub App or API-only events)
*/
export const DROPPED_EVENTS = new Set([
"branch_protection_configuration",
"code_scanning_alert",
"commit_comment",
"custom_property",
"custom_property_values",
"dependabot_alert",
"deploy_key",
"github_app_authorization",
"installation",
"installation_repositories",
"installation_target",
"marketplace_purchase",
"member",
"membership",
"merge_group",
"meta",
"org_block",
"organization",
"package",
"personal_access_token_request",
"ping",
"repository",
"repository_advisory",
"repository_ruleset",
"secret_scanning_alert",
"secret_scanning_alert_location",
"security_advisory",
"security_and_analysis",
"sponsorship",
"star",
"team",
"team_add"
]);

/**
* Events to keep - valid workflow triggers
*/
export const KEPT_EVENTS = new Set([
"branch_protection_rule",
"check_run",
"check_suite",
"create",
"delete",
"deployment",
"deployment_status",
"discussion",
"discussion_comment",
"fork",
"gollum",
"issue_comment",
"issues",
"label",
"milestone",
"page_build",
"project",
"project_card",
"project_column",
"projects_v2",
"projects_v2_item",
"public",
"pull_request",
"pull_request_review",
"pull_request_review_comment",
"pull_request_review_thread",
"push",
"registry_package",
"release",
"repository_dispatch",
"repository_import",
"repository_vulnerability_alert",
"status",
"watch",
"workflow_dispatch",
"workflow_job",
"workflow_run"
]);
Copy link

Copilot AI Dec 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file appears to be unused. The event filters are imported from event-filters.json (see update-webhooks.ts line 6 and eventPayloads.test.ts line 6), not from this TypeScript file. This creates duplicate maintenance burden since the same event lists exist in both event-filters.json and event-filters.ts. Consider removing this file or refactoring to use it as the source of truth instead of the JSON file.

Suggested change
/**
* Event filter lists for webhook processing.
*
* These lists categorize GitHub webhook events into those that are valid
* workflow triggers (KEPT_EVENTS) and those that are not (DROPPED_EVENTS).
*
* When new events are added to the GitHub API, they must be added to one
* of these lists. The update-webhooks script will fail if it encounters
* an unknown event.
*
* See: https://docs.github.com/en/actions/writing-workflows/choosing-when-your-workflow-runs/events-that-trigger-workflows
*/
/**
* Events to drop - not valid workflow triggers (GitHub App or API-only events)
*/
export const DROPPED_EVENTS = new Set([
"branch_protection_configuration",
"code_scanning_alert",
"commit_comment",
"custom_property",
"custom_property_values",
"dependabot_alert",
"deploy_key",
"github_app_authorization",
"installation",
"installation_repositories",
"installation_target",
"marketplace_purchase",
"member",
"membership",
"merge_group",
"meta",
"org_block",
"organization",
"package",
"personal_access_token_request",
"ping",
"repository",
"repository_advisory",
"repository_ruleset",
"secret_scanning_alert",
"secret_scanning_alert_location",
"security_advisory",
"security_and_analysis",
"sponsorship",
"star",
"team",
"team_add"
]);
/**
* Events to keep - valid workflow triggers
*/
export const KEPT_EVENTS = new Set([
"branch_protection_rule",
"check_run",
"check_suite",
"create",
"delete",
"deployment",
"deployment_status",
"discussion",
"discussion_comment",
"fork",
"gollum",
"issue_comment",
"issues",
"label",
"milestone",
"page_build",
"project",
"project_card",
"project_column",
"projects_v2",
"projects_v2_item",
"public",
"pull_request",
"pull_request_review",
"pull_request_review_comment",
"pull_request_review_thread",
"push",
"registry_package",
"release",
"repository_dispatch",
"repository_import",
"repository_vulnerability_alert",
"status",
"watch",
"workflow_dispatch",
"workflow_job",
"workflow_run"
]);

Copilot uses AI. Check for mistakes.
Comment on lines +202 to +203
const name = param[0] as string;
counts.set(name, (counts.get(name) || 0) + 1);
Copy link

Copilot AI Dec 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type assertion as string is unsafe if param[0] could be a number (e.g., if the param was already processed or came from a deduplicated source). While this shouldn't happen in the current flow since collectNames is called before internNames, consider adding a runtime check or using a type guard to ensure type safety.

Suggested change
const name = param[0] as string;
counts.set(name, (counts.get(name) || 0) + 1);
const name = param[0];
if (typeof name === "string") {
counts.set(name, (counts.get(name) || 0) + 1);
}

Copilot uses AI. Check for mistakes.
const idx = nameToIndex.get(param);
return idx !== undefined ? idx : param;
}
if (Array.isArray(param)) {
Copy link

Copilot AI Dec 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type assertion as string is unsafe. param[0] should be a string at this point in the flow, but the type assertion doesn't enforce this at runtime. Consider adding a type guard: if (typeof param[0] !== 'string') throw new Error(...) before line 227 to catch unexpected data structures.

Suggested change
if (Array.isArray(param)) {
if (Array.isArray(param)) {
if (typeof param[0] !== 'string') {
throw new Error(`Expected param[0] to be a string, but got ${typeof param[0]}: ${JSON.stringify(param)}`);
}

Copilot uses AI. Check for mistakes.
Comment on lines +170 to +174
["push", [
[0, "The URL..."], // 0 = "url" from string table
[1, "Unique ID"], // 1 = "id"
2 // 2 = "name" (name-only, no description)
]]
Copy link

Copilot AI Dec 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The example structure is misleading. The code shows ["push", [...]] which looks like a compact param array, but "push" is actually an event name (object key), not part of a param structure. Consider revising to show the actual structure: { "push": { "default": { "p": [0, 1, 2] } } } or clarifying that this is pseudo-code illustrating how params would look within the p array, not showing "push" as part of a param.

Suggested change
["push", [
[0, "The URL..."], // 0 = "url" from string table
[1, "Unique ID"], // 1 = "id"
2 // 2 = "name" (name-only, no description)
]]
{
"push": {
"default": {
"p": [
[0, "The URL..."], // 0 = "url" from string table
[1, "Unique ID"], // 1 = "id"
2 // 2 = "name" (name-only, no description)
]
}
}
}

Copilot uses AI. Check for mistakes.
* }
*/
type InternedWebhooksData = {
s: string[];
Copy link

Copilot AI Dec 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type InternedWebhooksData expects an s: string[] field for the string table, but the actual generated webhooks.json does not include this field (it's in a separate strings.json file). The type definition should either mark s as optional or remove it entirely since it's loaded separately on line 101.

Suggested change
s: string[];

Copilot uses AI. Check for mistakes.
Comment on lines +21 to +24
* BODY_PARAM_FIELDS: discarded from every bodyParameters object (only name, description, childParamsGroups are kept)
*/
const EVENT_ACTION_FIELDS = ["description", "summary", "availability", "category", "action"];
const BODY_PARAM_FIELDS = ["type", "in", "isRequired", "enum", "default"];
Copy link

Copilot AI Dec 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused variable BODY_PARAM_FIELDS.

Suggested change
* BODY_PARAM_FIELDS: discarded from every bodyParameters object (only name, description, childParamsGroups are kept)
*/
const EVENT_ACTION_FIELDS = ["description", "summary", "availability", "category", "action"];
const BODY_PARAM_FIELDS = ["type", "in", "isRequired", "enum", "default"];
* (bodyParameters: only name, description, childParamsGroups are kept)
*/
const EVENT_ACTION_FIELDS = ["description", "summary", "availability", "category", "action"];

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants