perf(vrl): adapt Vector to VRL ObjectMap #25697

Draft
lukesteensen wants to merge 6 commits into master from experiment/vrl-objectmap-adaptation

Conversation

@lukesteensen

Member

Supersedes fork-based draft PR #25649.

Summary

Adapts Vector to the experimental VRL ObjectMap work by replacing direct
BTreeMap assumptions with ObjectMap-compatible construction and iteration
patterns.

VRL dependency used for this draft:

Commit structure

The main ObjectMap work is isolated in:

  • refactor: adapt Vector to VRL ObjectMap
  • chore: use git VRL ObjectMap dependency

This branch also includes two small test fixes that were discovered while
testing the ObjectMap changes locally, but are otherwise unrelated to ObjectMap:

  • test(codecs): avoid mutating global log schema in syslog tests

    • Avoids syslog tests mutating global log_schema state, which can pollute
      other tests depending on execution order.
  • test(vector-core): keep empty DDSketch fixtures canonical

    • Avoids generating non-canonical empty DDSketch fixtures, which can fail
      round-trip expectations.

Benchmark support is included separately in:

  • chore(bench): add ObjectMap benchmark helpers

Validation

Passed:

  • cargo fmt --all -- --check
  • cargo check --workspace --all-targets --all-features --locked
  • make check-clippy
  • cargo test -p vector-core --all-features --lib --locked
  • cargo test -p codecs --all-features --lib --locked
  • targeted Vector tests around map conversion, HTTP headers, and sink encoding
  • targeted VRL object_map tests

Benchmark sanity check:

  • Case: datadog_agent_remap_blackhole
  • Current btree -> current flat: +24.43%
  • Short local run: --replicates 1 --total-samples 60 --warmup-seconds 15

@lukesteensen requested review from a team as code owners June 29, 2026 16:16
@datadog-vectordotdev

datadog-vectordotdev Bot commented Jun 29, 2026

Pipelines  Tests

⚠️ Warnings

🚦 1 Pipeline job failed

PR Title Check | Check PR Title (GitHub Actions)

ℹ️ Info

No other issues found (see more)

🧪 All tests passed
❄️ No new flaky tests detected

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: abdc43a

@github-actions Bot added labels Jun 29, 2026:

  • domain: sources (Anything related to the Vector's sources)
  • domain: transforms (Anything related to Vector's transform components)
  • domain: sinks (Anything related to the Vector's sinks)
  • domain: core (Anything related to core crates i.e. vector-core, core-common, etc)
  • docs review on hold (The documentation team reviews PRs only after a PR is approved by the COSE team.)
@lukesteensen lukesteensen marked this pull request as draft June 29, 2026 16:17

@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: abdc43a27f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread bench-all.sh
Comment on lines +4 to +5
BENCH="/Users/luke.steensen/code/vector/bench.sh"
RESULTS_DIR="/Users/luke.steensen/code/vector/bench-results/full"

[P2] Replace hard-coded benchmark paths

On any checkout that is not exactly /Users/luke.steensen/code/vector, running the checked-in bench-all.sh invokes a non-existent benchmark script and writes results under that developer-specific home directory, so the batch benchmark workflow fails before exercising the cases. Derive these paths from the script location, as bench.sh does, or make them configurable.
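
A minimal sketch of that suggestion, assuming bench-all.sh sits next to bench.sh in the repository root. The exact derivation used in bench.sh is not shown in this thread, so the `SCRIPT_DIR` line below is a common idiom rather than a copy of it, and the environment-variable overrides for `BENCH` and `RESULTS_DIR` are hypothetical:

```shell
#!/bin/sh
set -eu

# Resolve the directory that contains this script, independent of the
# caller's working directory.
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"

# Default both paths relative to the script location. BENCH and RESULTS_DIR
# may be overridden from the environment (hypothetical knobs, not taken from
# the PR).
BENCH="${BENCH:-${SCRIPT_DIR}/bench.sh}"
RESULTS_DIR="${RESULTS_DIR:-${SCRIPT_DIR}/bench-results/full}"

echo "bench script: ${BENCH}"
echo "results dir:  ${RESULTS_DIR}"
```

With this shape the script behaves the same from any checkout location, and a caller can still redirect results with something like `RESULTS_DIR=/tmp/results ./bench-all.sh`.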

Comment thread Dockerfile.bench
protobuf-compiler \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /vector
COPY --from=vrl . /vrl

[P2] Use the copied VRL checkout for bench builds

When ./bench.sh build is used to benchmark local VRL variants, this copies ../vrl into the image, but the subsequent cargo build has no path override or [patch] pointing Cargo at /vrl (and the repo config has no such override), so the image is built from the pinned git dependency in Cargo.lock instead. This makes local changes in ../vrl invisible to the benchmark images the script is meant to compare.

Comment thread Cargo.toml
vector-vrl-category = { path = "lib/vector-vrl/category" }
vector-vrl-functions = { path = "lib/vector-vrl/functions", default-features = false }
-vrl = { git = "https://github.com/vectordotdev/vrl.git", branch = "main", default-features = false, features = ["arbitrary", "cli", "test", "test_framework", "stdlib-base"] }
+vrl = { git = "https://github.com/lukesteensen/vrl.git", branch = "experiment/objectmap-backends", default-features = false, features = ["arbitrary", "cli", "test", "test_framework", "stdlib-base"] }

[P2] Avoid depending on a personal VRL fork

For release/CI builds from this commit, the vrl dependency now resolves from a personal fork and experiment branch instead of the official vectordotdev/vrl repository. Even with Cargo.lock pinning a commit, fresh builders still have to fetch from that fork, so deleting or force-pushing the branch, changing access, or losing the fork breaks reproducible builds and puts production dependency resolution outside the project-controlled remote.

Comment thread bench.sh
Comment on lines +33 to +37
DOCKER_BUILDKIT=1 docker buildx build \
--build-context "vrl=${SCRIPT_DIR}/../vrl" \
-f "${SCRIPT_DIR}/Dockerfile.bench" \
-t "vector:${tag}" \
"${SCRIPT_DIR}"

[P2] Load buildx images before running benchmarks

When using a non-docker Buildx builder such as the docker-container driver commonly used with buildx, this tagged build is not loaded into the local Docker image store unless --load or an explicit exporter is set; Docker's docs state that otherwise Buildx defaults to cacheonly and the result remains only in the build cache. In that environment ./bench.sh build <tag> succeeds but smp local run cannot find vector:<tag>, so add --load here and to the derive build.
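
A sketch of the suggested change, based on the build command quoted above with `--load` added. The `build_image` wrapper and the `DOCKER` override (which allows previewing the command with `DOCKER=echo`) are hypothetical and not part of bench.sh:

```shell
#!/bin/sh
set -eu

# Directory containing this script (assumed to be the repository root).
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"

# Build the benchmark image for a tag. --load asks Buildx to export the
# result into the local Docker image store, so that vector:<tag> can be
# found by later steps even when a docker-container builder is in use.
build_image() {
    tag="$1"
    DOCKER_BUILDKIT=1 "${DOCKER:-docker}" buildx build \
        --load \
        --build-context "vrl=${SCRIPT_DIR}/../vrl" \
        -f "${SCRIPT_DIR}/Dockerfile.bench" \
        -t "vector:${tag}" \
        "${SCRIPT_DIR}"
}

# Example: preview the command without invoking Docker.
#   DOCKER=echo build_image my-tag
```

The same flag would need to be added to the derive build mentioned in the comment, which is not shown in this thread.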