docs(domain-skills): add X (Twitter) articles + tweets reading recipe #484
Open
optemism wants to merge 1 commit into
Field-tested against an X Article (long-form post). Covers: Articles share the tweet URL shape, the login modal blocks clicks but not DOM reads (observed logged-in; logged-out unverified), innerText extraction + body slicing, tweetText selector for regular tweets (empty for Articles), and the navigate-and-extract-in-one-invocation tab-drift trap.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
✅ Skill review passed. Reviewed 1 file(s): no findings.
1 issue found across 1 file
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="domain-skills/x/articles.md">
<violation number="1" location="domain-skills/x/articles.md:64">
P2: `x_article_body` can silently return a near-empty string when no newline follows the headline. When `full_text.find('\n', i)` returns `-1` (no newline after the headline), `start` becomes `-1`. Python slicing `full_text[-1:]` then evaluates to just the last character of the page, the regex fails, and the function returns a single character instead of the article body. Add a guard so `start` falls back to `i + len(headline)` when no newline is found.</violation>
</file>
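The failure mode described in the violation can be reproduced in isolation. This is a minimal sketch that mirrors only the two lines quoted in the review; `naive_start` and the sample `page` string are hypothetical names introduced for illustration, not part of the PR.

```python
def naive_start(full_text, headline):
    # Mirrors the reviewed lines: find the headline, then the next newline.
    i = full_text.find(headline)
    return full_text.find("\n", i) if i != -1 else 0


# Hypothetical page text where the headline is the last thing on the page,
# so no newline follows it.
page = "Nav\nMy Headline"

start = naive_start(page, "My Headline")  # str.find returns -1: no newline found
tail = page[start:]                       # page[-1:] keeps only the last character

print(start, repr(tail))
```

Because `str.find` signals "not found" with `-1`, and `-1` is also a valid slice index, the slice succeeds instead of raising, which is why the problem is silent.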
    def x_article_body(full_text, headline):
        # start just after the headline's first occurrence in the content area
        i = full_text.find(headline)
        start = full_text.find('\n', i) if i != -1 else 0
P2: `x_article_body` can silently return a near-empty string when no newline follows the headline. When `full_text.find('\n', i)` returns `-1` (no newline after the headline), `start` becomes `-1`. Python slicing `full_text[-1:]` then evaluates to just the last character of the page, the regex fails, and the function returns a single character instead of the article body. Add a guard so `start` falls back to `i + len(headline)` when no newline is found.
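One way the suggested guard might look is sketched below. Only the first lines of `x_article_body` are visible in this review, so the rest is an assumption: `TIMESTAMP_RE` is a hypothetical pattern written from the "H:MM AM/PM · Mon DD, YYYY" shape mentioned in the file's comment, and `x_article_body_guarded` is an illustrative name, not the PR's function.

```python
import re

# Assumption: a pattern for the "H:MM AM/PM · Mon DD, YYYY" timestamp shape.
# The regex actually used in the PR is not shown in the review.
TIMESTAMP_RE = re.compile(r"\d{1,2}:\d{2} [AP]M · [A-Z][a-z]{2} \d{1,2}, \d{4}")


def x_article_body_guarded(full_text, headline):
    i = full_text.find(headline)
    if i == -1:
        # Headline not found: fall back to the start of the page text.
        start = 0
    else:
        newline = full_text.find("\n", i)
        # Guard: with no newline after the headline, start right after it
        # instead of letting -1 leak into the slice.
        start = newline if newline != -1 else i + len(headline)
    # Search for the timestamp only after `start`, so an earlier
    # timestamp-shaped string cannot truncate the body.
    match = TIMESTAMP_RE.search(full_text, start)
    end = match.start() if match else len(full_text)
    return full_text[start:end].strip()


page = "Nav\nMy Headline\nFirst paragraph.\nSecond paragraph.\n9:41 AM · Jan 5, 2025\nFooter"
print(x_article_body_guarded(page, "My Headline"))
```

Keeping the "not found" check explicit for both `find` calls means neither `-1` can reach the slice.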
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At domain-skills/x/articles.md, line 64:
<comment>`x_article_body` can silently return a near-empty string when no newline follows the headline. When `full_text.find('\n', i)` returns `-1` (no newline after the headline), `start` becomes `-1`. Python slicing `full_text[-1:]` then evaluates to just the last character of the page, the regex fails, and the function returns a single character instead of the article body. Add a guard so `start` falls back to `i + len(headline)` when no newline is found.</comment>
<file context>
@@ -0,0 +1,131 @@
+def x_article_body(full_text, headline):
+ # start just after the headline's first occurrence in the content area
+ i = full_text.find(headline)
+ start = full_text.find('\n', i) if i != -1 else 0
+ # end at the post timestamp ("H:MM AM/PM · Mon DD, YYYY") — search AFTER
+ # start, or a timestamp-shaped string earlier in the page truncates the body
</file context>
What

Adds `domain-skills/x/articles.md`: a field-tested recipe for reading X Articles (long-form posts) and regular tweets.

Key findings captured

- Articles share the tweet URL shape (`x.com/{handle}/status/{id}`): there is no `/article/` route; `document.title` tells you which you got.
- The login modal blocks clicks but not DOM reads: `document.body.innerText` reads straight through it. (Observed with a logged-in profile; fully logged-out is flagged as unverified.)
- `[data-testid="tweetText"]` is empty for Articles: it works for regular tweets/threads only; Articles need the innerText path.
- Navigate and extract in one `browser-harness -c` invocation, or a follow-up call can attach to a stale/different tab. `ensure_real_tab()` referenced as the canonical remedy.
- `wait_for_load()` alone returns a short/empty body; needs `wait(3–4)`.

Follows the SKILL.md domain-skill conventions: no pixel coordinates, no run narration, no secrets; explicit "does not work / untested" section.
🤖 Generated with Claude Code
Summary by cubic
Adds `domain-skills/x/articles.md`, a concise recipe for reading X (Twitter) Articles and regular tweets via DOM text extraction. Covers URL shape, selectors, timing, and tab attachment to make extraction reliable even with the login modal.

- URL shape: `x.com/{handle}/status/{id}`; use `document.title` to distinguish.
- Articles: `document.body.innerText`; the login modal blocks clicks but not reads.
- Regular tweets: `[data-testid="tweetText"]` (empty for Articles).

Written for commit 49a9922. Summary will update on new commits.