Skip to content

Conversation

@tnull
Copy link
Collaborator

@tnull tnull commented Jan 5, 2026

Previously, we would read entries of our payment store sequentially.
This is more or less fine when we read from a local store, but when we
read from a remote (e.g., VSS) store, all the latency could result in
considerable slowdown during startup. Here, we opt to read store entries
in batches.

@ldk-reviews-bot
Copy link

ldk-reviews-bot commented Jan 5, 2026

👋 Thanks for assigning @TheBlueMatt as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@tnull tnull requested review from TheBlueMatt and removed request for tankyleo January 5, 2026 13:34
@tnull tnull force-pushed the 2026-01-parallelize-payment-reads branch from a378267 to 172aa9a Compare January 5, 2026 13:34
TheBlueMatt
TheBlueMatt previously approved these changes Jan 6, 2026
Copy link
Contributor

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the next step will be to parallelize every read operation that we can in the builder?

src/io/utils.rs Outdated
NETWORK_GRAPH_PERSISTENCE_SECONDARY_NAMESPACE,
NETWORK_GRAPH_PERSISTENCE_KEY,
)?);
let mut reader = Cursor::new(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: might be a good chance to drop the Cursor use - &mut &[u8] also implements a read stream so you can just use that, not that its critical.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, across the codebase.

src/io/utils.rs Outdated
)
.await?;

const BATCH_SIZE: usize = 20;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems low tbh. maybe 100?

Copy link
Collaborator Author

@tnull tnull Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems low tbh. maybe 100?

Hmm, but firing off 100 requests in parallel seems also bit much, to the point where it actually hit some rate-limit once we get around to add proper per-user rate-limits to VSS? Now bumped it to 50 in a fixup.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to the point where it actually hit some rate-limit once we get around to add proper per-user rate-limits to VSS?

Maybe? We should definitely set VSS rate limits to be appropriate for whatever the ldk-node code does :). Generally, IMO, rate-limits shouldn't be (and usually aren't) "max parallel requests" but rather a token bucket over a number of seconds/minutes, so batch size here won't actually impact our rate-limit chance at all.

firing off 100 requests in parallel seems also bit much

I was thinking of it in terms of RTTs to load the Node - if we have 1000 payments (pretty reasonable "high but common" use case?) a batch size of 100 is still 10 round-trips, which is pretty high. Batch size of 50 is 20 round-trips, and we start getting into 2-3 seconds to load, which is unacceptably bad IMO. Now, because there may be rate-limit concerns we can certainly just leave this at 50 and fix it properly later by doing actual batch requests to VSS, but I think we'll still get complaints at 50.

@tnull
Copy link
Collaborator Author

tnull commented Jan 7, 2026

I assume the next step will be to parallelize every read operation that we can in the builder?

Not quite sure if next-next, but generally yes. Generally the plan is still to move most/all setup logic from build to start, and make the latter async.

For now addressed the feedback, let me know if I can squash the fixup.

Rather than using `KVStoreSync` we now use the async `KVStore`
implementation for most `read_X` util methods used during node building.

This is a first step towards making node building/startup entirely async
eventually.
@tnull tnull force-pushed the 2026-01-parallelize-payment-reads branch from 4451974 to 7ee4bbb Compare January 7, 2026 09:41
Copy link
Contributor

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixups look fine, feel free to squash, I didn't check that no other changes slipped in.

tnull added 3 commits January 7, 2026 14:58
Previously, we would read entries of our payment store sequentially.
This is more or less fine when we read from a local store, but when we
read from a remote (e.g., VSS) store, all the latency could result in
considerable slowdown during startup. Here, we opt to read store entries
in batches.
Previously, we consistently handed around `Arc` references for most
objects to avoid unnecessary refactoring work. This approach however
introduced a bunch of unnecessary allocations through `Arc::clone`.

Here we opt to rather use plain references in a bunch of places,
reducing the usage of `Arc`s.
@tnull tnull force-pushed the 2026-01-parallelize-payment-reads branch from d7ed295 to dec6f22 Compare January 7, 2026 13:58
@tnull
Copy link
Collaborator Author

tnull commented Jan 7, 2026

Fixups look fine, feel free to squash, I didn't check that no other changes slipped in.

Squashed without further changes.

@TheBlueMatt
Copy link
Contributor

CI is quite sad

Add integration test that verifies 200 payments are correctly
persisted and retrievable via `list_payments` after restarting
a node.

Co-Authored-By: Claude AI
@tnull tnull force-pushed the 2026-01-parallelize-payment-reads branch from dec6f22 to c724a89 Compare January 8, 2026 09:40
@tnull
Copy link
Collaborator Author

tnull commented Jan 8, 2026

CI is quite sad

Ah, force-pushed a fix:

> git diff-tree -U2 dec6f220 c724a893
diff --git a/tests/integration_tests_rust.rs b/tests/integration_tests_rust.rs
index 8771c51b..4d2a1742 100644
--- a/tests/integration_tests_rust.rs
+++ b/tests/integration_tests_rust.rs
@@ -24,5 +24,6 @@ use common::{
        premine_and_distribute_funds, premine_blocks, prepare_rbf, random_config,
        random_listening_addresses, setup_bitcoind_and_electrsd, setup_builder, setup_node,
-       setup_node_for_async_payments, setup_two_nodes, wait_for_tx, TestChainSource, TestSyncStore,
+       setup_node_for_async_payments, setup_two_nodes, wait_for_tx, TestChainSource, TestStoreType,
+       TestSyncStore,
 };
 use ldk_node::config::{AsyncPaymentsRole, EsploraSyncConfig};
@@ -2326,5 +2327,6 @@ async fn payment_persistence_after_restart() {
        // Setup nodes manually so we can restart node_a with the same config
        println!("== Node A ==");
-       let config_a = random_config(true);
+       let mut config_a = random_config(true);
+       config_a.store_type = TestStoreType::Sqlite;

        let num_payments = 200;

@tnull tnull requested a review from TheBlueMatt January 8, 2026 09:41
@tnull tnull merged commit 0c81333 into lightningdevkit:main Jan 8, 2026
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants