Skip to content

Conversation

@ubik2
Copy link
Contributor

@ubik2 ubik2 commented Dec 4, 2025

Changes the way we do traversal so the same system can be used by both the cell.get() implementation and the query implementation that determines which linked docs we need.

The cell.get() system, or validateAndTransform doesn't follow asCell links, just creating the cell objects that will let us access those in the future.

The query system does follow those links to determine what we need to have available.

To unify tracking, the query system was re-written to use the transaction interface we use for reactivity instead of directly accessing storage.

There's a lot more link magic that needed to go on for the validateAndTransform, and to make this all work, I split the code into a SchemaObjectTraverser portion that knows how to walk objects with rules like the schema, and an IObjectCreator portion that knows how to create the objects.

  • For the validateAndTransform, that IObjectCreator does all the magic of creating new cells.
  • For the querySchema, that IObjectCreator does very little, mostly just passing along information about whether the data matched the schema pattern.

Things I intend to address later:

  • Our typing is unsound. The IObjectCreator returns either the complex mix of proxy and cell objects, or simple JSONValue | undefined depending on the mode.
  • We're not trying to be perfect with schemas, but many anyOf/allOf that expect to combine object properties may break.
  • SchemaObjectTraverser should probably switch to a link interface instead of IAttribution.
  • We really need some documentation on how these cell.get calls work, and what's going on behind the scenes. A lot of the rules when we combine schemas from links that we're following should have unit tests.

Summary by cubic

Unifies traversal and validation to run on storage transactions and normalizes selectors to start at "value", improving consistency, default handling, and link behavior. Addresses CT-997 by removing tight coupling to manager classes, stabilizing array/link semantics, and preventing actions from running with invalid arguments.

  • Refactors

    • Traversal/validation run on IExtendedStorageTransaction (via ManagedStorageTransaction); removed direct BaseObjectManager usage.
    • Selectors and addresses are now rooted at ["value", …]; tests and callers updated.
    • Improved schema handling: merges schemas across links, adds limited anyOf merging, applies defaults to missing/undefined values (including top-level), preserves asCell/asStream, follows first array element link when appropriate, unifies additionalProperties behavior, and treats broken links as undefined.
    • ContextualFlowControl now supports prefixItems and false schemas, reuses a single instance per traversal for better performance, and tightens legacy cell link detection.
    • validateAndTransform delegates to SchemaObjectTraverser and drops the unused sync flag; lowered /memory/transact logs to debug.
    • Actions only run when arguments validate; still run if argumentSchema is false.
    • Shared schema tracker maps space+doc to schema selectors, acting as a watch list and cache to avoid redundant re-traversals.
    • Event streams with false/undefined schemas now use a permissive schema so handlers still run.
  • Migration

    • Update SchemaPathSelector.path to start with "value".
    • Use an ExtendedStorageTransaction for traversal instead of passing a manager.
    • Note: charm list schema now defaults to [] (no action needed unless relying on undefined).

Written for commit b951078. Summary will update automatically on new commits.

ubik2 added 30 commits October 16, 2025 14:01
Added the better performance test in this branch
…ion instead of using the ObjectStorageManager classes.
…jects from validateAndTransform are not simply JSONValue objects.

- Combined the annotateObject and createObject hooks.
- Changed the cycle detector, so that we put in an object immediately, then populate it later. This makes it so we can walk into a object/array with a link to itself, and point the resulting object at itself, adding the special fields when we return.
- Fix typo in Replica log message (already exists)
- Addd TransformObjectCreator which implements the callbacks needed for SchemaObjectTraverser with the validateAndTransform functionality.
- Added new validateAndTransform implementation, temporarily switching between them.
- Convert SchemaQuery SchemaPathSelectors to have the "value" prefix before passing them to traverse code. Do the same for the selectors created for validateAndTransform.
- Changed factValue passed to getAtPath to not narrow to value
…stead of just when undefined), since that was the previous behavior

Tidy up the link value path in validateAndTransform
…ll use a cell, and not a query result proxy.
Added applyDefault helper function to SchemaObjectTraverser
Added test for top level default
Partial pass at type cleanup (still bad)
Added temporary console logs
Added cell test for defaults
Added check for non-empty fact address path in loadFactsForDoc.
…ls (or spell or TYPE links).

These should not be included for client traversal, and will break unit tests if we do, because we end up having a read assertion for a recipe that wasn't written.
…ink with the schema in our parent context. Use the id/type/path from the data.

For server traversal, I should switch over to this as well, but I'm going to wait until more things work.
Move schema combination logic into traversePointerWithSchema
Added missing boolean handling to traverseWithSchemaContext
Fix typo in cache.ts error message
Cleaned up some tests to be more correct
Code for array post-processing to follow first link (disabled)
Changed navigate-to.ts to handle getting a proxy object instead of a cell (need to track this down)
…ating a proxy object instead of a cell. This removes the need for the workaround in navigate-to.ts.
…ments in an array. This needed to be done in traverse, since it should happen before we walk into the object.
…ng optional unused seen parameter to new version for now.
*
* When properties are not specified in the schema, and additionalProperties
* is not specified, we have the standard JSONSchema behavior, where this is
* equivalent to additionalProperties of true.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we actually see this at this point or would have this been collapsed to a true schema earlier?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we still see this. I don't think @mathpirate 's transformer code ever generates this style of schema, but you can specify it manually still.

…e the object isn't a cell (this was just an error). There's some type issues here, but I'm deferring that.

When creating an event stream, if the schema is false (which it often is if the event isn't used), create a simple object schema. Otherwise, we don't run the event handlers.
…ect for JSONSchema, but seems like a good compromise to get the bits we want.
… often not present, and we do stricter validation now.

Change the counter fallback test, so it doesn't rely on having undefined in the array. Uses null instead to test the same general thing.
If the schema to be used for an event stream is undefined or false, change it to true instead. This is a temporary workaround until I track down why I'm not seeing that triggered.
JSONSchema doesn't allow NaN as a number (since JSON doesn't). I could alter traverse to allow this, but I'm just going to alter the test instead, since neither is ideal.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants