Add conversation_kit + typed_input skill: a product-agnostic language layer for voice agents #43
Open
jakubkarolczyk wants to merge 10 commits into
…oice agents
New leaf subpackage signalwire.conversation_kit — the deterministic pieces a
voice agent needs to understand input, compute values, and speak output
correctly, none tied to any product:
- dates: compute_date (spoken day -> ISO calendar math), WEEKDAYS, and
RESOLVE_DATE_PARAMS (a resolve_date tool's JSON-schema fragment).
- inputs: validate_input + is_valid_email/phone/number, and
input_request_payload for the typed-input (on-screen keypad) channel.
- verbalizer: TTS-ready per-language output (number/unit/date/email/spell/
measure_text) plus prompt guidance(), behind a small plugin registry.
English and Polish ship; get(lang) falls back to English.
Zero dependencies. 20 unit tests under tests/unit/conversation_kit.
…een keypad
A multi-instance skill (one instance per field) for collecting a value the
caller TYPES on an on-screen keypad (email, phone, account number) when
speech-to-text can't capture it reliably.
- request_<field>: speak a "type it on screen" line, emit an input_request user
event so a connected client reveals/focuses the field, then wait_for_user.
- confirm_<field>: read the raw typed value from global_data['typed_<field>'],
validate it, reopen on missing/invalid, else read it back to confirm. The value
is never a model argument, so a typo can't be silently altered.
Per-language prompts resolve against global_data['language'] at call time, so
one instance serves a multilingual agent. Validation, the user-event payload,
and the spoken read-back come from signalwire.conversation_kit. 12 unit tests.
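The confirm step described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: the function name confirm_field, the returned dict shape, and the looks_like_email check are all hypothetical; only the global_data['typed_<field>'] key comes from the commit message.

```python
import re

# Hypothetical stand-in for the package's email validator.
_EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def looks_like_email(value: str) -> bool:
    return bool(_EMAIL.match(value))


def confirm_field(global_data: dict, field: str, is_valid) -> dict:
    """Decide what to do with the raw typed value (assumed result shape)."""
    # The value is read from global_data, never passed in as a model argument.
    raw = global_data.get(f"typed_{field}")
    if not isinstance(raw, str) or not raw.strip():
        return {"action": "reopen", "reason": "missing"}
    if not is_valid(raw):
        return {"action": "reopen", "reason": "invalid"}
    # Hand back exactly what was typed so the read-back cannot alter it.
    return {"action": "read_back", "value": raw}
```

Because the value is looked up rather than supplied by the model, the read-back is always the caller's own keystrokes.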
…tter
Add Verbalizer.spell_acronyms(text): reads generic technical acronyms (DIN, ISO,
PPV, RMS, UTC) letter-by-letter via the per-language alphabet, so a TTS engine says
"er em es" instead of mangling "RMS" into a word.
Whole-token, case-sensitive matching (longest first) so it never touches a lowercase
word ("din"), a substring inside a longer word ("isolation"), or an unknown all-caps
name/code. The acronym set is a ClassVar, extensible per subclass. Two unit tests.
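One way to get whole-token, case-sensitive, longest-first matching is a single alternation regex with lookarounds. The sketch below is an assumption about the approach, written as a free function rather than the Verbalizer method; LETTER_NAMES is a hypothetical partial English alphabet, not the package's per-language table.

```python
import re

# Illustrative acronym set and letter names (hypothetical values).
ACRONYMS = frozenset({"DIN", "ISO", "RMS", "UTC"})
LETTER_NAMES = {
    "C": "see", "D": "dee", "I": "eye", "M": "em", "N": "en",
    "O": "oh", "R": "ar", "S": "ess", "T": "tee", "U": "you",
}


def spell_acronyms(text: str, acronyms=ACRONYMS) -> str:
    """Spell known acronyms letter by letter; leave everything else alone."""
    if not acronyms:
        return text
    # Longest first, so a longer acronym is tried before any shorter prefix.
    ordered = sorted(acronyms, key=len, reverse=True)
    alternatives = "|".join(re.escape(a) for a in ordered)
    # Lookarounds enforce whole-token matches; no IGNORECASE, so "din" is skipped.
    pattern = re.compile(rf"(?<![A-Za-z0-9])(?:{alternatives})(?![A-Za-z0-9])")

    def _spell(match: re.Match) -> str:
        # Unknown letters fall back to themselves in this sketch.
        return " ".join(LETTER_NAMES.get(ch, ch) for ch in match.group(0))

    return pattern.sub(_spell, text)
```

With these tables, "RMS" becomes "ar em ess", while "din", "isolation", and an unknown all-caps token such as "ISOX" pass through unchanged.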
The verbalizer already read ISO measured values and acronyms; add temporal
verbalization so a language that needs it (Polish) speaks dates and clock times
naturally instead of letting the model guess at the digits (on a combined
timestamp the model reads the day into the minutes).
- base.Verbalizer: extend date() with with_weekday/with_year, add time() (24h
passthrough) and a VERBALIZES_DATETIME opt-in flag; datetime_text() rewrites ISO
dates and date-times in free text (date-times first; a trailing UTC/Z is left in
place for spell_acronyms to read). Base and English stay a no-op; they read ISO
acceptably.
- pl: _PL_HOURS feminine hour names + time() (on-the-hour reads hour only; a
single-digit minute keeps its leading zero), VERBALIZES_DATETIME=True.
- tests: time() and datetime_text() PL cases + English no-op (24 total).
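The opt-in flag pattern can be sketched like this. Only the names Verbalizer, VERBALIZES_DATETIME, date() and datetime_text() come from the commit message; the signatures, the regex, and the SpokenDateVerbalizer subclass are assumptions made for illustration (the subclass is neither the en nor the pl plugin), and only plain dates are handled, not date-times.

```python
import datetime as dt
import re
from typing import ClassVar


class Verbalizer:
    # Off by default: the base reads ISO text as-is.
    VERBALIZES_DATETIME: ClassVar[bool] = False
    _ISO_DATE = re.compile(r"(?<!\d)\d{4}-\d{2}-\d{2}(?!\d)")

    def date(self, value: dt.date) -> str:
        return value.isoformat()

    def datetime_text(self, text: str) -> str:
        if not self.VERBALIZES_DATETIME:
            return text  # no-op unless a language opts in

        def _speak(match: re.Match) -> str:
            try:
                return self.date(dt.date.fromisoformat(match.group(0)))
            except ValueError:
                # Date-shaped but invalid: leave the token untouched.
                return match.group(0)

        return self._ISO_DATE.sub(_speak, text)


class SpokenDateVerbalizer(Verbalizer):
    """Hypothetical plugin that opts in and overrides date()."""

    VERBALIZES_DATETIME = True
    _MONTHS = ("January", "February", "March", "April", "May", "June", "July",
               "August", "September", "October", "November", "December")

    def date(self, value: dt.date) -> str:
        return f"{self._MONTHS[value.month - 1]} {value.day}, {value.year}"
```

Keeping the pass in the base class and gating it with a class attribute means a new language only overrides date() and flips the flag.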
The README predated the acronym-spelling and date/time verbalization work, so
spell_acronyms/datetime_text/time() and the ACRONYMS/VERBALIZES_DATETIME attrs
were undocumented. Update it and add the pieces an AI coding agent needs to
extend the package safely:
- Module map: every file, its responsibility, and its public names.
- Full verbalizer surface: methods table + the three free-text passes
(measure_text -> datetime_text -> spell_acronyms) with their gating class attrs
and required run-order.
- Adding a language: the class attributes that drive the shared methods, and the
VERBALIZES_DATETIME opt-in.
- Testing: the pytest command + the reference plugin to mirror.
- Invariants: zero deps, no SDK import, product-agnostic, base-is-a-fallback,
passes-are-no-op-by-default. This is the contract not to break.
Docs only; no runtime change.
compute_date could express today/tomorrow and weekdays but not a spoken day
COUNT ('in two days' / 'za dwa dni') — the model had to approximate it as
'tomorrow' or, worse, as day-of-month 2. Add an in_days integer param + handling
(today + N), kept distinct from the calendar day-of-month so an offset never
lands on the wrong date. Tests cover the offset and that day-of-month still wins.
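A rough sketch of keeping the day offset separate from the day-of-month follows. The function name resolve_offset_or_day and its signature are hypothetical, and it assumes an explicit day means "that day in the current month"; the real compute_date takes more parameters and handles out-of-range days differently.

```python
from datetime import date, timedelta
from typing import Optional


def _is_int(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def resolve_offset_or_day(today: date, in_days=None, day=None) -> Optional[str]:
    """Return an ISO date string, or None if nothing usable was given."""
    # An explicit calendar day-of-month wins over a day-count offset.
    if _is_int(day):
        try:
            return today.replace(day=day).isoformat()
        except ValueError:
            return None  # simplification: this sketch does not roll forward
    # "in N days" is plain date arithmetic: today + N.
    if _is_int(in_days) and in_days >= 0:
        return (today + timedelta(days=in_days)).isoformat()
    return None
```

With today = 2026-03-10, in_days=2 gives 2026-03-12, whereas day=2 gives 2026-03-02; treating the two as separate parameters is what prevents "in two days" from being read as "the 2nd".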
…gnostic)
Addresses an independent critical review. CI green: ruff format+check, mypy
(check_untyped_defs), pytest (33 passed, was 25).
Crashes:
- pl cardinal() rewritten to a general 1000-grouping algorithm with millions +
milliards tiers (KeyError'd >= 1_000_000, incl. long fractions); raises
ValueError above the milliard scale.
- base measure_text ranges used a hardcoded Polish 'do'; add RANGE_WORD ClassVar
('to' base, 'do' pl) so a non-Polish plugin's ranges aren't Polish.
- datetime_text guards its callbacks + pl.date() validates up front, so a
date-shaped-but-invalid token ('2026-13-45', '25:99') is left untouched.
Robustness / correctness:
- compute_date: 'the 31st' in a short month rolls forward; an out-of-range
explicit month/year is None (not silently today's); bool excluded from int
day/month/year/in_days; drop undocumented which-synonyms.
- datetime_text normalizes a trailing Z/UTC to a spellable ' UTC'.
- time() validates 0-23 / 0-59 (base + pl).
- inputs.is_valid_number rejects nan/inf.
- registry.get() falls back to the neutral BASE (not English) for an unknown
language, so it keeps the generic guidance(); docstrings + README updated.
- fix package docstrings (from signalwire.conversation_kit ...); export Numeric;
input_request_payload -> dict[str,str]; WEEKDAYS -> tuple.
Product-agnostic:
- drop domain-specific 'PPV' from the base ACRONYMS default (apps add domain
acronyms by subclass); scrub product-y test data.
Deferred: Polish 'od <gen> do <gen>' range grammar (needs a genitive declension
table) — kept the nominative form as a documented simplification.
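The nan/inf rejection listed above matters because Python's float() accepts strings such as "nan" and "inf" without raising. A minimal sketch of such a check, assuming the typed value arrives as a string (the package's actual is_valid_number signature may differ):

```python
import math


def is_valid_number(text: str) -> bool:
    """True only for text that parses to a finite number."""
    try:
        # float("nan"), float("inf") and float("1e999") all succeed,
        # so a finiteness check is needed on top of the parse.
        return math.isfinite(float(text.strip()))
    except ValueError:
        return False
```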
Keep the package free of any originating-product fingerprint before PR:
- replace a real personal-name email (karolczyk.jakub@…) with a fictional
jan.kowalski@example.com
- rename the typed-input example field installer_email -> contact_email
- neutralize domain-flavored sample text: 'RMS velocity'/'PPV:' -> 'reading'/
'value'; the vibration standards ISO 10816 / DIN 4150-3 -> domain-neutral
ISO 9001 / DIN 5008-1
Tests + README only; no runtime change (33 tests still pass).
…prompts
Pre-PR review of the typed_input skill (CI already green; it correctly reuses
conversation_kit.inputs rather than duplicating). Two fixes:
- scrub the domain-flavored 'installer_email' / 'Installer's email' example from
the docstring, README, and tests -> neutral 'contact_email' / 'Contact email'.
- setup() now validates the three required per-language prompt maps (open_prompt /
field_label / invalid_prompt): the schema marks them required but the loader
doesn't enforce it, so a missing one would silently speak '' at runtime. Fail
loud instead; test added.
CI green: ruff format+check, mypy, pytest (13 typed_input tests).
…t audit
The _check() helper asserted inside itself, so the no-cheat audit's static scan
saw the test bodies as assertion-free and flagged six as cheat tests. Inline each
case as a visible 'assert fn(value) == expected' loop and drop the helper: the
coverage is identical, now visible to the audit. No behaviour change; 46 tests pass.
Summary
Adds signalwire.conversation_kit, a small, zero-dependency, product-agnostic language layer for voice agents, and a typed_input skill built on it. Purely additive: no existing files are modified (+2205 / −0 across 17 new files).
It covers the two halves of a voice turn's language handling, neither tied to any product: understand what the caller said (spoken → value) and speak values back correctly (value → spoken).
What's added
conversation_kit/ (stdlib only, zero third-party deps)
- dates (compute_date): resolve a spoken date the caller named (weekday + this/next, today/tomorrow, "in N days", explicit day/month/year) to a calendar date, so the model never does calendar math. Ships RESOLVE_DATE_PARAMS (a ready-made JSON-schema fragment) + WEEKDAYS.
- inputs: validate_input (email/phone/number) and input_request_payload / INPUT_REQUEST_TYPE for a typed-input (on-screen keypad) channel.
- verbalizer: TTS-ready per-language output (numbers, units, dates, times, emails, acronym spelling) behind a plugin registry, plus a guidance() prompt helper. A concrete Verbalizer base (safe, language-neutral fallback) with en/pl plugins; add a language by subclassing and register()-ing it. get(lang) falls back to the neutral base for an unregistered language.
skills/typed_input/
A SkillBase skill that collects a value the caller types on an on-screen keypad (email/phone/number) for cases speech-to-text can't capture: it emits an input_request user-event, parks via wait_for_user, then validates and reads back the RAW typed value (never a model argument, so a typo is never silently "corrected"). One instance per field, per-language prompts. Reuses conversation_kit.inputs.
Design
- … conversation_kit (stdlib only); a self-contained leaf that never imports the rest of the SDK.
- … supplies domain wording (e.g. domain acronyms via a Verbalizer subclass).
Testing
- … tests/unit/conversation_kit/, tests/unit/skills/).
- ruff format + ruff check clean; mypy clean on the added packages.
Notes for reviewers
- … main.
- … find-packages config; the skill README ships as package data.
- conversation_kit is imported as signalwire.conversation_kit; not yet re-exported from the top-level namespace. Happy to add that if preferred.
- README.mds document usage and how to add a language.