feat: auto-generate llms #2
Note: Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Introduces automated infrastructure for generating an LLMS documentation index for the Flutter SDK. Adds a GitHub Actions workflow to orchestrate generation, a Bash script to analyze and index repository contents with optional LLM enrichment, and an initial LLMS inventory file.
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Pull Request Overview
This PR implements automated generation of llms.txt documentation files using LLM-powered descriptions. The script provides incremental updates to minimize API costs, atomic file writes for safety, and GitHub Actions integration triggered on releases.
Key changes:
- Added `scripts/generate_llms.sh` with incremental diff-based LLM enrichment and surgical updates
- Created GitHub Actions workflow (`.github/workflows/generate-llms.yml`) to auto-generate on release events
- Generated initial `llms.txt` with file descriptions grouped by directory
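
The "atomic file writes" mentioned in the overview are commonly done by writing to a temporary file and renaming it over the target. The following is a minimal sketch of that pattern, not the code from `scripts/generate_llms.sh`; the helper name `atomic_write` and the scratch directory are assumptions made for illustration.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper (not taken from the PR): write stdin to "$1" atomically.
atomic_write() {
  local target="$1"
  local tmp
  # Create the temp file next to the target so the final mv stays on one filesystem.
  tmp="$(mktemp "${target}.XXXXXX")"
  # Write the new content; remove the temp file if writing fails.
  if ! cat >"$tmp"; then
    rm -f "$tmp"
    return 1
  fi
  # A rename within one filesystem replaces the target in a single step,
  # so readers see either the old file or the new one, never a partial write.
  mv -f "$tmp" "$target"
}

# Demo in a scratch directory (illustrative content only).
demo_dir="$(mktemp -d)"
printf '# llms.txt\n- example entry\n' | atomic_write "${demo_dir}/llms.txt"
cat "${demo_dir}/llms.txt"
```

Note that `mktemp` creates the file with restrictive permissions, so a real script may want to `chmod` the temp file before the rename.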
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| scripts/generate_llms.sh | Main bash script implementing LLM-powered file description generation with incremental mode, GitHub API integration, and atomic writes |
| .github/workflows/generate-llms.yml | GitHub Actions workflow triggering on release or manual dispatch to run generation script and create PR |
| llms.txt | Generated output file containing organized file index with LLM-enriched descriptions |
Co-authored-by: Copilot <[email protected]>
Actionable comments posted: 6
♻️ Duplicate comments (4)
llms.txt (1)
19-22: Similar duplicate entries in iOS/platform section.

Messages.g.swift (lines 19 & 21) and ClixPlugin.swift (lines 20 & 22) are duplicated with slightly different descriptions. This reinforces the systematic duplication issue noted in the Android section.
scripts/generate_llms.sh (2)
111-111: Sed command in workflow (line 111 of .github/workflows/generate-llms.yml) lacks file existence check.
120-126: Case statement falls through without skipping excluded directories.

The case statement matches excluded directories on lines 120-121 but doesn't skip them; it falls through to the `find` check on line 123, potentially adding excluded directories to the scan list anyway. This is the exact issue flagged in the past review.

Add `continue` after the empty command on line 121 to actually skip excluded directories:

      while IFS= read -r topdir; do
        case "$topdir" in
    -     .git|.github|.vscode|build|.dart_tool|ios|android|gradle|.swiftpm) continue ;; # checked below by content
    +     .git|.github|.vscode|build|.dart_tool|ios|android|gradle|.swiftpm) ;; # checked below by content
        esac
    -   if find "$topdir" -type f \( -name "*.dart" -o -name "*.kt" -o -name "*.swift" \) -print -quit >/dev/null 2>&1; then
    +   # FIXED: Only add if not in excluded list
    +   if [[ ! "$topdir" =~ ^(.git|.github|.vscode|build|.dart_tool|ios|android|gradle|.swiftpm)$ ]] && \
    +     find "$topdir" -type f \( -name "*.dart" -o -name "*.kt" -o -name "*.swift" \) -print -quit >/dev/null 2>&1; then
          dirs+=("$topdir")
        fi

Wait: upon closer inspection, the comment says "checked below by content", suggesting the excluded dirs should be added if they contain target files. If that's the intent, the logic is correct as-is and only needs clarification. However, the naming conflict between the top-level excluded directories (which have structure like `ios/Pods`) and the directories being checked (like `ios/`) suggests possible false matches. Please verify intent.

.github/workflows/generate-llms.yml (1)
107-114: Sed command on line 111 may fail silently on first run when llms.txt doesn't exist.

When `llms.txt` does not exist (first run), the sed command on line 111 will produce no output. The `|| true` suppresses the error, but `BASE` remains empty. This is then used to construct `COMPARE_RANGE="${BASE}...${HEAD_REF}"`, resulting in an invalid compare range like `...origin/main`.

Add an explicit file existence check before attempting to extract the commit:

      # Read base SHA from existing file header
    - BASE="$(sed -nE 's/^<!--[[:space:]]*commit:[[:space:]]*([0-9a-f]+).*/\1/p' llms.txt | head -1 || true)"
    - [[ -z "${BASE}" ]] && BASE="${HEAD_REF}"
    - COMPARE_RANGE="${BASE}...${HEAD_REF}"
    + if [[ -f llms.txt ]]; then
    +   BASE="$(sed -nE 's/^<!--[[:space:]]*commit:[[:space:]]*([0-9a-f]+).*/\1/p' llms.txt | head -1)"
    + fi
    + BASE="${BASE:-${HEAD_REF}}"
    + COMPARE_RANGE="${BASE}...${HEAD_REF}"

This ensures the file is only parsed if it exists, and a sensible fallback (`HEAD_REF`) is used when the file is missing or the commit header is not found.
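
The fallback behavior described in this comment can be exercised in isolation. The sketch below wraps the same `sed` expression in a function and runs it against a scratch file; the function name `read_base_commit`, the sample SHA `abc1234`, and the `<!-- commit: ... -->` header layout (inferred from the regex) are illustrative assumptions, not the workflow's actual code.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: print the commit recorded in an llms.txt header,
# or the fallback ref when the file or the header is missing.
read_base_commit() {
  local file="$1" fallback="$2" base=""
  if [[ -f "$file" ]]; then
    # Same expression as in the review; head -1 keeps only the first match.
    base="$(sed -nE 's/^<!--[[:space:]]*commit:[[:space:]]*([0-9a-f]+).*/\1/p' "$file" | head -1)"
  fi
  # ${base:-fallback} covers both "file missing" and "header not found".
  printf '%s\n' "${base:-$fallback}"
}

# Scratch data for the demo (made-up SHA).
workdir="$(mktemp -d)"
printf '<!-- commit: abc1234 -->\n# Index\n' >"${workdir}/llms.txt"

HEAD_REF="origin/main"
BASE="$(read_base_commit "${workdir}/llms.txt" "$HEAD_REF")"
echo "COMPARE_RANGE=${BASE}...${HEAD_REF}"   # header present

BASE="$(read_base_commit "${workdir}/missing.txt" "$HEAD_REF")"
echo "COMPARE_RANGE=${BASE}...${HEAD_REF}"   # file missing, falls back to HEAD_REF
```

Keeping the lookup in one function means the compare range is never built from an empty `BASE`, whichever branch is taken.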
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
- .github/workflows/generate-llms.yml (1 hunks)
- llms.txt (1 hunks)
- scripts/generate_llms.sh (1 hunks)
🧰 Additional context used
🪛 LanguageTool
llms.txt
[style] ~32-~32: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...token retrieval, and event tracking. - [Clix Config](https://raw.githubusercontent.c...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~33-~33: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...onand dependent onClixLogLevel`. - [Clix Config.g](https://raw.githubusercontent...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~34-~34: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... across the core configuration flow. - [Clix Version](https://raw.githubusercontent....
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~35-~35: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...romPlatform(), falling back to "0. - [Clix](https://raw.githubusercontent.com/clix...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~36-~36: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ..., and event tracking (trackEvent). - [Clix Config](https://raw.githubusercontent.c...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~43-~43: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ies` for push notification handling. - [Clix Device](https://raw.githubusercontent.c...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~44-~44: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ializable codegen in `clix_device.g. - [Clix User Property](https://raw.githubuserco...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~45-~45: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...hashCode for user property payloads. - [Clix Push Notification Payload](https://raw....
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~46-~46: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ise debug toString representation. - [Clix Device.g](https://raw.githubusercontent...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~47-~47: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...lds to/from the backend JSON schema. - [Clix User Property.g](https://raw.githubuser...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~48-~48: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...l’s public fromJson/toJson APIs. - [Clix Push Notification Payload.g](https://ra...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~49-~49: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...en raw JSON maps and the Dart model. - [Clix Device](https://raw.githubusercontent.c...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~50-~50: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ationand generatedclix_device.g. - [Clix User Property](https://raw.githubuserco...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~82-~82: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...nfo, debug`) for SDK-wide logging. - [Clix Log Level](https://raw.githubuserconten...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
🪛 Shellcheck (0.11.0)
scripts/generate_llms.sh
[warning] 33-33: args appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 222-222: Declare and assign separately to avoid masking return values.
(SC2155)
[info] 305-305: Expressions don't expand in single quotes, use double quotes for that.
(SC2016)
[info] 308-308: Expressions don't expand in single quotes, use double quotes for that.
(SC2016)
[warning] 417-417: This redirection doesn't have a command. Move to its command (or use 'true' as no-op).
(SC2188)
[warning] 437-437: This redirection doesn't have a command. Move to its command (or use 'true' as no-op).
(SC2188)
[warning] 464-464: This redirection doesn't have a command. Move to its command (or use 'true' as no-op).
(SC2188)
[info] 582-582: Expansions inside ${..} need to be quoted separately, otherwise they match as patterns.
(SC2295)
[info] 636-636: Expansions inside ${..} need to be quoted separately, otherwise they match as patterns.
(SC2295)
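
For readers unfamiliar with these ShellCheck codes, the sketch below shows the usual remedies for three of the warnings listed above (SC2155, SC2188, SC2295). It is a generic illustration with made-up variable names and paths, not an excerpt from `scripts/generate_llms.sh`.

```shell
#!/usr/bin/env bash
set -eu

# SC2155: declare and assign separately so a failing command is not
# masked by the exit status of `local`.
get_count() {
  local count
  count="$(printf 'a\nb\n' | wc -l | tr -d ' ')"
  printf '%s\n' "$count"
}

# SC2188: give a bare redirection an explicit no-op command.
scratch="$(mktemp)"
: >"$scratch"   # truncates (or creates) the file

# SC2295: quote the expansion inside ${..} so it is matched literally.
# Unquoted, the brackets in $root would be treated as a glob pattern.
root='/repo/[sdk]'
path='/repo/[sdk]/lib/clix.dart'
relative="${path#"$root"/}"

get_count
echo "$relative"
```

The SC2016 notes (expressions in single quotes) are often intentional, for example when passing a literal pattern to `sed` or `awk`, so they are left out of this sketch.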
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Agent
🔇 Additional comments (1)
scripts/generate_llms.sh (1)
28-28: I need to verify the exact OpenAI API model identifier to provide an accurate recommendation for the fix.

Based on the web search results, I've found critical information about the review comment. Let me verify one more detail about the specific OpenAI API naming convention used in shell scripts.

The review comment is factually incorrect and should be disregarded.

In the API, the three reasoning-focused models (GPT-5, GPT-5 mini, and GPT-5 nano) are available as `gpt-5`, `gpt-5-mini`, and `gpt-5-nano`, respectively. The model name `gpt-5-mini` specified in line 28 is a valid, documented OpenAI model that is currently available in the API and has no issues. The code default is correct.

Likely an incorrect or invalid review comment.
This pull request implements an automated workflow for generating `llms.txt`. It keeps `llms.txt` files up to date, produces detailed descriptions through LLM enrichment, and stays cost-efficient by updating only what changed.

The implementation includes:

- `llms.txt`
- `llms.txt` by running script
- `llms.txt`: Summary: total=37 ok=37 fail=0

Summary by CodeRabbit
Release Notes
New Features
Chores