fix(skills): replace binary PPTX fixture with programmatic generator#2081
fix(skills): replace binary PPTX fixture with programmatic generator#2081PratikWayase wants to merge 4 commits into
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #2081 +/- ##
==========================================
+ Coverage 81.61% 87.25% +5.64%
==========================================
Files 130 86 -44
Lines 19489 11554 -7935
Branches 12 0 -12
==========================================
- Hits 15906 10082 -5824
+ Misses 3580 1472 -2108
+ Partials 3 0 -3
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
katriendg
left a comment
There was a problem hiding this comment.
Thanks for tackling this — replacing the opaque binary .pptx with a transparent, code-reviewable generator is a real security and auditability win, and the roundtrip validate_deck test is a nice closed-loop guarantee.
Note I reviewed the current version of the branch, though due to a merge conflict some of the comments may be stale. Please rebase and re-run the CI checks before merge.
Before merge, a few items: the two modified Python files currently fail the repo's ruff lint gate (npm run lint:py) with 13 errors — 9 trailing-whitespace blank lines, 2 lines over 88 chars, and 2 unsorted import blocks. Most are auto-fixable with ruff check --fix / ruff format. Beyond lint, please add type hints to the new helper functions, mark the new roundtrip test with @pytest.mark.integration (matching its sibling tests) and give it AAA structure, and reconsider the direct lxml import (it's only a transitive dep of python-pptx) and the author = "ChatGPT" fixture value. Details are in the inline review.
katriendg
left a comment
There was a problem hiding this comment.
Inline findings from the code review (the overall verdict remains Request changes from my earlier review). Each comment maps to a finding in the review summary.
b9ab10b to
e23cca5
Compare
e23cca5 to
2fad6cf
Compare
katriendg
left a comment
There was a problem hiding this comment.
Thanks @PratikWayase for addressing the comments in this PR. Leaving one small comment which would be a nice one to add.
Also please ensure you rebase and then looking good.
- Read theme XML via etree.fromstring(theme_part.blob); a base python-pptx Part has no .element, which crashed minimal_test_fixture generation. - Restore malformed_pdf_dir, minimal_valid_pdf, oversized_pdf, and many_page_pdf fixtures dropped during the conftest rewrite; test_pdf_safety, test_export_slides, test_export_svg, and test_render_pdf_images depend on them. - Declare lxml as a direct dependency (previously transitive via python-pptx).
auyidi1
left a comment
There was a problem hiding this comment.
New findings only. Excludes items already raised by katriendg/jkim323 and already addressed:
- ruff lint / trailing whitespace / line length / import ordering
- type hints on new helper functions
@pytest.mark.integrationmarker on the roundtrip testauthor = "ChatGPT"→HVE Core Test Fixture- direct
lxmldependency thread (resolved in current head by addinglxmlto
pyproject.toml +uv.lock)
Each finding was verified by reproducing the generator and running validate_deck against the output.
Findings
1. [Medium] Roundtrip guarantee is weaker than advertised + load-bearing empty-notes line
- File:
.github/skills/experimental/powerpoint/tests/test_extract_content_integration.py(L137) - Related:
.github/skills/experimental/powerpoint/tests/conftest.py(L151)
The roundtrip test asserts severity in ("info", "none"). Reproduction shows validate_deck
returns info (not none), because slide 2 has a notes slide with empty text
("Speaker notes present but empty").
That outcome depends entirely on conftest.py L151:
_ = slide2.notes_slide # ensure notes part existsRemove or reorder that line and slide 2 has no notes slide, check_speaker_notes emits a
warning, and the test fails — but the comment gives no hint the roundtrip test depends on it.
Recommendation:
- If the intent is a genuinely clean deck, give slide 2 real notes text and assert
severity == "none". - At minimum, make the coupling explicit in the comment so a future edit does not silently break
the roundtrip test.
2. [Medium] set_color silently no-ops when a theme node is missing
- File:
.github/skills/experimental/powerpoint/tests/conftest.py(L98–L108)
The two guards immediately above raise ValueError (missing theme_part at L86–L90, missing
clrScheme at L95–L96), but set_color does the opposite:
node = clr_scheme.find(f"a:{color_name}", ns)
if node is not None:
... # no else — silently skipsdk1/accent1 are the source of truth for EXPECTED_FIXTURE["theme_colors"]
(test file L34–L37). If a future python-pptx default template renames or restructures these
nodes, generation silently produces the wrong theme and the failure surfaces as an opaque
assertion mismatch in the extraction tests rather than a clear error at the point of the bug.
Recommendation: Raise ValueError when node is None, consistent with the surrounding guards.
3. [Medium] Runtime generation couples fixtures to python-pptx's bundled default template
- Files:
.github/skills/experimental/powerpoint/tests/test_extract_content_integration.py
(L20, L28),.github/skills/experimental/powerpoint/pyproject.toml(L6)
Generating at test time makes the tests depend on python-pptx's bundled default.pptx staying
stable — layout names "Title Slide" / "Title and Content" and the clrScheme structure edited
in _set_theme_colors. The committed binary was self-contained; this trades that for a dependency
on upstream defaults. python-pptx is unpinned in pyproject.toml, so a contributor outside the
uv.lock pin could hit drift.
Recommendation: Add a minimum version constraint (or a short comment noting the tests assume the
bundled template's layout/theme names).
Pull Request
Description
This PR resolves the security and auditability concerns raised in #1135 regarding the committed binary
minimal_test_fixture.pptx.Changes Made:
tests/fixtures/minimal_test_fixture.pptxto eliminate the risk of hidden macros, OLE objects, or malicious XML embedded in an unreviewable binary blob.conftest.pyto dynamically generate the minimal PPTX fixture at test runtime usingpython-pptx. Usedlxmlto manipulate the theme XML directly to ensure deterministic theme color resolution (dk1,accent1).test_generated_fixture_passes_validate_deckintest_extract_content_integration.py. This ensures the programmatically generated PPTX passes structural validation viavalidate_deck.pybefore extraction tests run, creating a closed-loop guarantee.Benefits:
.pptxfile.Related Issue(s)
Fixes #1135
Type of Change
Select all that apply:
Code & Documentation:
Infrastructure & Configuration:
AI Artifacts:
prompt-builderagent and addressed all feedback.github/instructions/*.instructions.md).github/prompts/*.prompt.md).github/agents/*.agent.md).github/skills/*/SKILL.md)Other:
.ps1,.sh,.py)Testing
npm run test:py.test_generated_fixture_passes_validate_deckroundtrip validation test.tts-voiceovertest failures observed in the CI logs are pre-existingargparseissues in a completely separate skill and are unrelated to this PR).Checklist
Required Checks
Required Automated Checks
The following validation commands must pass before merging:
npm run lint:mdnpm run spell-checknpm run lint:frontmatternpm run validate:skillsnpm run lint:md-linksnpm run lint:psnpm run plugin:generatenpm run docs:testSecurity Considerations
python-pptxandlxml)