@robacourt robacourt commented Dec 3, 2025

Note: This is currently in draft because it's entirely AI generated and I need to check it :)

Summary

Migrates the Filter module and the modules in its namespace (WhereCondition, EqualityIndex,
InclusionIndex) from Elixir maps to ETS tables to reduce garbage collection pressure when there
are large numbers of shapes (200K+).

Problem

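With 200K shapes being added and removed frequently, the map-based implementation causes long GC
pauses: all filter data lives on the process heap as nested immutable maps, every add/remove
allocates updated copies of the affected map nodes, and the garbage collector has to traverse and
copy a very large live heap.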

Solution

Store filter data in ETS tables (outside the process heap) instead of nested Elixir maps:

  • 5 private ETS tables per filter: shapes_table, tables_table, where_cond_table, eq_index_table,
    incl_index_table
  • Same algorithmic complexity: O(1) for equality lookups, O(tree depth) for inclusion index
  • API unchanged: All existing tests pass without modification
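
As a rough illustration of the table layout listed above, here is a minimal sketch of a struct holding five private ETS tables. The module name `FilterTablesSketch` and the `new/0` and `delete/1` functions are hypothetical and not taken from the PR; only the table names come from the list above.

```elixir
defmodule FilterTablesSketch do
  # Hypothetical struct: one field per table named in the PR description.
  defstruct [:shapes_table, :tables_table, :where_cond_table, :eq_index_table, :incl_index_table]

  # Creates five unnamed, private ETS tables. Private tables can only be
  # read and written by the owning process, and their contents live outside
  # the process heap, so they are not scanned by the process's GC.
  def new do
    %__MODULE__{
      shapes_table: :ets.new(:shapes_table, [:set, :private]),
      tables_table: :ets.new(:tables_table, [:set, :private]),
      where_cond_table: :ets.new(:where_cond_table, [:set, :private]),
      eq_index_table: :ets.new(:eq_index_table, [:set, :private]),
      incl_index_table: :ets.new(:incl_index_table, [:set, :private])
    }
  end

  # Deletes all five tables. ETS tables are also dropped automatically
  # when the owning process exits.
  def delete(%__MODULE__{} = filter) do
    filter
    |> Map.from_struct()
    |> Map.values()
    |> Enum.each(&:ets.delete/1)
  end
end
```

Because the tables are mutated in place, a struct like this would stay the same size no matter how many shapes are stored, which is the property the PR relies on to keep the process heap small.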

Performance Results (200K shapes)

Scenario                  ETS Implementation   Original (maps)   Target
------------------------  -------------------  ----------------  ---------
Equality index lookup     5.75µs               4.41µs            <100µs ✅
Multiple matches (20)     13.82µs              9.07µs            <100µs ✅
Post-churn performance    9.49µs               4.69µs            <100µs ✅

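The ETS implementation is somewhat slower per lookup than the map-based original (for example
5.75µs vs 4.41µs for an equality index lookup), but it stays well under the <100µs target for
affected_shapes with 200K equality-indexed shapes.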
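
For readers who want to get a feel for numbers like these, the following is a minimal timing sketch using `:timer.tc/1`. It is not the benchmark used in this PR: the module name `LookupTimingSketch`, the `"id"` field, and the entry layout are all assumptions made for illustration, and results will vary by machine.

```elixir
defmodule LookupTimingSketch do
  # Builds a private ETS table holding `count` entries keyed like an
  # equality index: {where_cond_id, field, value}. The stored value
  # ({:shape, i}) is a placeholder, not the PR's actual entry format.
  def build(count) do
    table = :ets.new(:eq_index_table, [:set, :private])
    cond_id = make_ref()
    entries = for i <- 1..count, do: {{cond_id, "id", i}, {:shape, i}}
    :ets.insert(table, entries)
    {table, cond_id}
  end

  # Returns the mean time per lookup in microseconds over `iterations` lookups.
  def mean_lookup_us(table, cond_id, count, iterations) do
    {total_us, :ok} =
      :timer.tc(fn ->
        Enum.each(1..iterations, fn i ->
          :ets.lookup(table, {cond_id, "id", rem(i, count) + 1})
        end)
      end)

    total_us / iterations
  end
end
```

A call such as `LookupTimingSketch.build(200_000)` followed by `LookupTimingSketch.mean_lookup_us(table, cond_id, 200_000, 100_000)` gives a rough per-lookup figure; a dedicated benchmarking tool would give more reliable statistics.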

Key Changes

  1. filter.ex: Changed struct to hold ETS table references; operations mutate ETS in-place
  2. where_condition.ex: WhereConditions now identified by refs and stored in ETS
  3. index.ex: Changed from protocol dispatch to direct function dispatch
  4. equality_index.ex: Stores entries in ETS with keys {where_cond_id, field, value}
  5. inclusion_index.ex: Stores tree nodes in ETS with keys {where_cond_id, field, path}
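
To make the key scheme in item 4 concrete, here is a minimal sketch of an equality index keyed by `{where_cond_id, field, value}`. Everything beyond that key shape is an assumption: the module name `EqualityIndexSketch`, the use of a `:bag` table, and storing shape ids directly as values are illustrative choices, not the PR's implementation.

```elixir
defmodule EqualityIndexSketch do
  # Assumption: a :bag table, so several shapes can share one
  # {where_cond_id, field, value} key.
  def new, do: :ets.new(:eq_index_table, [:bag, :private])

  # Registers a shape under the value it filters on, e.g. `id = '1'`.
  def add_shape(table, where_cond_id, field, value, shape_id) do
    :ets.insert(table, {{where_cond_id, field, value}, shape_id})
    :ok
  end

  # Removes exactly one {key, shape_id} entry, leaving other shapes
  # registered under the same key untouched.
  def remove_shape(table, where_cond_id, field, value, shape_id) do
    :ets.delete_object(table, {{where_cond_id, field, value}, shape_id})
    :ok
  end

  # Looks up the shapes affected by a record with a single keyed ETS lookup.
  def affected_shapes(table, where_cond_id, field, record) do
    case Map.fetch(record, field) do
      {:ok, value} ->
        table
        |> :ets.lookup({where_cond_id, field, value})
        |> Enum.map(fn {_key, shape_id} -> shape_id end)

      :error ->
        []
    end
  end
end
```

Under the same assumptions, an inclusion index node would use a path (for example a list of sorted values) in the third position of the key instead of a single value.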

@codecov codecov bot commented Dec 3, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 75.20%. Comparing base (0408955) to head (7fc5ce2).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3547      +/-   ##
==========================================
- Coverage   75.24%   75.20%   -0.04%     
==========================================
  Files          51       51              
  Lines        2743     2743              
  Branches      404      405       +1     
==========================================
- Hits         2064     2063       -1     
- Misses        677      678       +1     
  Partials        2        2              
Flag                         Coverage   Δ
---------------------------  ---------  ------------
electric-telemetry           22.71%     -0.28% ⬇️
elixir                       57.38%     -0.09% ⬇️
elixir-client                73.94%     ø
packages/experimental        87.73%     ø
packages/react-hooks         86.48%     ø
packages/typescript-client   93.07%     ø
packages/y-electric          55.12%     ø
typescript                   87.45%     ø
unit-tests                   75.20%     -0.04% ⬇️


@robacourt robacourt commented

benchmark this

@magnetised magnetised left a comment

Great stuff!

I'm trusting the tests to validate the functionality -- haven't reviewed the algorithm itself. Can we cull the value-less comments and docs though pls.

@env Env.new()

@doc """
Check if the index for a field is empty.

Smells like claude. Do we need the "empty? function checks if empty" type docs?

meta_key = {where_cond_id, field, :meta}
:ets.insert(table, {meta_key, {type}})

# Sort and deduplicate the array

sigh


- defp optimise_where(%Expr{eval: eval}), do: optimise_where(eval)
+ @doc false
+ def optimise_where(%Expr{eval: eval}), do: optimise_where(eval)

why did you make this public?

@msfstef msfstef commented Dec 3, 2025

This looks pretty good but a bit hard to review - in the meantime might be worth running a benchmark on it as well?

@robacourt robacourt commented

benchmark this

@github-actions github-actions bot commented Dec 4, 2025

Benchmark results, triggered for 7fc5c

  • write fanout: completed
  • unrelated shapes one client latency: completed
  • many shapes one client latency: completed
  • concurrent shape creation: completed
  • diverse shape fanout: completed

4 participants