Skip to content

Conversation

Copy link

Copilot AI commented Dec 27, 2025

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass test.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have provided docstrings for the public items that I want to introduce.
  • I have adjusted the documentation in ./docs/source/.

Description

Optimizes TokenAwarePolicy.make_query_plan to call child.make_query_plan only once instead of up to 3 times in some scenarios, improving performance by eliminating redundant query plan generation while preserving memory efficiency through lazy evaluation.

Problem

In the original implementation, TokenAwarePolicy.make_query_plan could call child.make_query_plan up to three times:

  1. Line 509: When query/routing_key/keyspace is None (early return)
  2. Line 520: When using tablet-based routing
  3. Line 538: To get the rest of the cluster hosts

This redundant computation was unnecessary and could impact performance, especially with complex child policies.

Changes Made

cassandra/policies.py

  • Modified TokenAwarePolicy.make_query_plan to call child.make_query_plan only once
  • For early return cases (no routing key/keyspace), pass through the generator directly to preserve lazy evaluation and avoid unnecessary memory usage
  • For token-aware routing cases, call child.make_query_plan once and convert to list for reuse (needed for replica identification and yielding remaining hosts)
  • Added comprehensive comments explaining the optimization rationale and when list materialization is necessary

tests/unit/test_policies.py

  • Added a new comprehensive test test_child_make_query_plan_called_once that validates the optimization works in all scenarios:
    • With routing key and keyspace
    • Without routing key
    • Without keyspace
    • With tablets enabled
  • Updated existing test assertion in _assert_shuffle to expect one call for both tablet and vnode scenarios

Performance & Memory Impact

  • Before: Up to 3 calls to child.make_query_plan in worst-case scenarios
  • After: Always exactly 1 call to child.make_query_plan
  • Memory: Lazy evaluation preserved for early return cases; list materialization only occurs when necessary for token-aware routing (when the query plan must be iterated twice: once for replica identification, once for yielding remaining hosts)

Testing

  • ✅ All 83 tests in tests/unit/test_policies.py pass
  • ✅ New test specifically validates the optimization
  • ✅ No security vulnerabilities detected by CodeQL scanner
  • ✅ Code review verified no memory concerns with the implementation
Original prompt

This section details on the original issue you should resolve

<issue_title>Call child.make_query_plan in TokenAwarePolicy.make_query_plan only once</issue_title>
<issue_description>In some scenarios TokenAwarePolicy.make_query_plan can call child.make_query_plan three times.
Let's make sure it is called only once.

def make_query_plan(self, working_keyspace=None, query=None):
keyspace = query.keyspace if query and query.keyspace else working_keyspace
child = self._child_policy
if query is None or query.routing_key is None or keyspace is None:
for host in child.make_query_plan(keyspace, query):
yield host
return
replicas = []
if self._tablets_routing_v1:
tablet = self._cluster_metadata._tablets.get_tablet_for_key(
keyspace, query.table, self._cluster_metadata.token_map.token_class.from_key(query.routing_key))
if tablet is not None:
replicas_mapped = set(map(lambda r: r[0], tablet.replicas))
child_plan = child.make_query_plan(keyspace, query)
replicas = [host for host in child_plan if host.host_id in replicas_mapped]
if not replicas:
replicas = self._cluster_metadata.get_replicas(keyspace, query.routing_key)
if self.shuffle_replicas:
shuffle(replicas)
for replica in replicas:
if replica.is_up and child.distance(replica) in [HostDistance.LOCAL, HostDistance.LOCAL_RACK]:
yield replica
for host in child.make_query_plan(keyspace, query):
# skip if we've already listed this host
if host not in replicas or child.distance(host) == HostDistance.REMOTE:
yield host

</issue_description>

Comments on the Issue (you are @copilot in this section)


💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

@mykaul
Copy link

mykaul commented Dec 29, 2025

@copilot please complete the work on this PR.

Copy link
Author

Copilot AI commented Dec 29, 2025

@copilot please complete the work on this PR.

Completed! The optimization ensures child.make_query_plan is called only once in all scenarios. See commit 507b252.

Summary of changes:

  • Modified TokenAwarePolicy.make_query_plan to call child.make_query_plan once and reuse the result
  • Added comprehensive test coverage to verify the optimization
  • All 83 existing tests pass
  • No security issues detected

keyspace = 'keyspace'
routing_key = 'routing_key'
query = Statement(routing_key=routing_key, keyspace=keyspace)
qplan = list(policy.make_query_plan(keyspace, query))
# Test case 2: Without routing key (should call once)
child_policy.reset_mock()
query = Statement(routing_key=None, keyspace=keyspace)
qplan = list(policy.make_query_plan(keyspace, query))
# Test case 3: Without keyspace (should call once)
child_policy.reset_mock()
query = Statement(routing_key=routing_key, keyspace=None)
qplan = list(policy.make_query_plan(None, query))

child_policy.reset_mock()
query = Statement(routing_key=routing_key, keyspace=keyspace, table='test_table')
qplan = list(policy.make_query_plan(keyspace, query))
Copilot AI changed the title [WIP] Fix TokenAwarePolicy to call child.make_query_plan once Fix TokenAwarePolicy to call child.make_query_plan once Dec 29, 2025
@mykaul
Copy link

mykaul commented Jan 3, 2026

@copilot address review comments from your friendly AI codeQL above.

Copy link
Author

Copilot AI commented Jan 3, 2026

@copilot address review comments from your friendly AI codeQL above.

Addressed the memory usage concern from the code review. See commits ec8203e and 85c6ee8.

Changes made:

  • Early return case now preserves lazy evaluation by passing through the generator directly, avoiding unnecessary memory usage
  • List materialization only occurs when token-aware routing is needed (requires two iterations over the plan)
  • Added clear comments explaining the optimization rationale

All 83 tests pass and no security issues detected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Call child.make_query_plan in TokenAwarePolicy.make_query_plan only once

2 participants