feat(jepsen): M5a — dynamodb-append-multi-table-workload by bootjp · Pull Request #916 · bootjp/elastickv

bootjp · 2026-06-03T12:33:02Z

Draft — second slice of Composed-1 M5a per docs/design/2026_06_02_proposed_composed1_m5_jepsen_route_shuffle.md §3.3. Follows #911 (CLI tools, merged); blocks on launch script + setup-hook verification (remaining M5a slices) before unmark-draft.

What this PR ships

jepsen/src/elastickv/dynamodb_multi_table_workload.clj — a new workload variant of the DynamoDB list-append test that exercises cross-table 2PC, which the single-table elastickv.dynamodb-workload cannot do.

Why a separate workload exists

kv/shard_key.go's dynamoRouteKey normalises every DynamoDB table-meta / item / GSI key for one table to a single per-table route key (!ddb|route|table|<base64(name)>). Single-table workloads always route to one shard regardless of partition-key value, so dispatchMultiShardTxn / commitSecondaryTxns / the new ErrTxnSecondaryRouteShiftedAfterPrimaryCommit sentinel never fire (codex P1 on #905 ffb9c73).

What this workload does

N=4 tables (jepsen_append_t1 … jepsen_append_t4) created in setup!.
Deterministic key routing: Elle's integer keys k route via (mod k N)+1 → table-idx, so each Elle key always lands at the same (table, pk) storage location (Elle's append checker still sees a single key namespace).
Default 4-mop txns with key-count=12 distribute evenly across all 4 tables — a 4-key txn touches all 4 tables and therefore both Raft groups under the launch script's planned --shardRanges layout (t1-t2 → group 1, t3-t4 → group 2).
TransactGetItems for atomic cross-table pre-read; TransactWriteItems for atomic cross-table writes (both DynamoDB APIs natively support multi-table txns).
Same OCC condition expressions as the single-table workload so any G1c or G-single anomaly observed is attributable to the multi-shard 2PC path rather than to a workload-shape difference.

Why self-contained

The file intentionally duplicates small helpers from elastickv.dynamodb-workload rather than extracting into a shared namespace. Refactoring would couple the workloads through a maintained API and obscure the side-by-side comparability that is the point of running both — relevant because #905 commits to keeping the single-table workload as-is for trend comparison.

Tests (`jepsen/test/elastickv/dynamodb_multi_table_workload_test.clj`)

7 tests / 17 assertions. The load-bearing ones:

key-routing-distributes-across-all-tables — 12 keys distribute 3-per-table across 4 tables
key-routing-table-name-matches-prefix — wraparound at k=4
key-routing-pks-disambiguate_colliding_keys — k=0 and k=4 share table t1 but have distinct pks
multi-op-txn-spans_multiple_groups — default 4-key txn touches BOTH the planned group-1 tables (1,2) and group-2 tables (3,4) — without this, dispatchMultiShardTxn never fires and the whole workload is a no-op

The routing helpers are private; tests access via var-get on #'workload/key->table-idx etc. so a future change to the routing surfaces in this test rather than as a silent G1c during a Jepsen run.

Remaining M5a work (separate PRs after this lands)

scripts/run-jepsen-m5-local.sh — single-process two-group launch with --raftGroups + --shardRanges using elastickv-route-key to compute boundary keys
Setup-hook verification — Clojure ListRoutes gRPC client asserting multi-group routing is in place
Done-when check — workload exercises dispatchMultiShardTxn (server-side log marker or probe) and Elle finds zero G1c

Test plan

lein test elastickv.dynamodb-multi-table-workload-test → Ran 7 tests / 17 assertions, 0 failures, 0 errors
lein test elastickv.dynamodb-workload-test (existing single-table suite) → Ran 3 tests / 5 assertions, 0 failures (no regression)
End-to-end Jepsen run (gated on launch script + setup hook landing)

Second slice of Composed-1 M5a per PR #905 design doc §3.3 + PR #911 checklist. New self-contained workload variant of the DynamoDB list-append test that exercises cross-table 2PC, which the single-table elastickv.dynamodb-workload cannot do. Background: kv/shard_key.go's dynamoRouteKey normalises every DynamoDB table-meta / item / GSI key for one table to a single per-table route key (!ddb|route|table|<base64(name)>). Single- table workloads always route to one shard regardless of partition-key value, so dispatchMultiShardTxn / commitSecondaryTxns / the ErrTxnSecondaryRouteShiftedAfterPrimaryCommit sentinel never fire (codex P1 on PR #905 ffb9c73). This workload: * Creates N=4 tables (jepsen_append_t1 .. jepsen_append_t4) in setup!. * Routes Elle's integer keys k via (mod k N)+1 -> table-idx deterministically, so each Elle key always lands at the same (table, pk) storage location (Elle's append checker still sees a single key namespace). * Default Elle txns of up to 4 mops with key-count=12 distribute evenly across all 4 tables on every multi-op txn — a 4-key txn touches all 4 tables and therefore both Raft groups under the launch script's planned --shardRanges layout (t1-t2 -> group 1, t3-t4 -> group 2). * Uses TransactGetItems for atomic cross-table pre-read and TransactWriteItems for atomic cross-table writes (both DynamoDB APIs natively support multi-table txns). * Carries the same OCC condition expressions as the single- table workload so any G1c or G-single anomaly observed is attributable to the multi-shard 2PC path rather than to a workload-shape difference. The file is intentionally self-contained (small helpers duplicated from elastickv.dynamodb-workload rather than extracted into a shared namespace). Refactoring shared helpers into a third namespace would couple the workloads through a maintained API and obscure the side-by-side comparability that is the point of running both — relevant because PR #905 commits to keeping the single-table workload as-is for trend comparison. Tests cover the public API surface plus the load-bearing routing invariants: * builds-test-spec — :name = elastickv-dynamodb-append-multi-table * custom-options-override-defaults — option passthrough * host-override-creates-client — DDB client opens * key-routing-distributes-across-all-tables — 12 keys distribute 3-per-table across 4 tables * key-routing-table-name-matches-prefix — wraparound at k=4 * key-routing-pks-disambiguate_colliding_keys — k=0 and k=4 share table t1 but have distinct pks * multi-op-txn-spans_multiple_groups — default 4-key txn touches BOTH the planned group-1 tables (1,2) and group-2 tables (3,4) — without this, dispatchMultiShardTxn never fires and the whole workload is a no-op The routing helpers are private; the tests access them via var-get on #'workload/key->table-idx etc. so a future change to the routing surfaces in this test rather than as a silent G1c during a Jepsen run. Verification: lein test elastickv.dynamodb-multi-table-workload-test -> Ran 7 tests containing 17 assertions, 0 failures, 0 errors. lein test elastickv.dynamodb-workload-test (existing single-table suite) -> Ran 3 tests containing 5 assertions, 0 failures. Remaining M5a work (separate commits on this branch): * scripts/run-jepsen-m5-local.sh — single-process two-group launch with --raftGroups + --shardRanges using elastickv-route-key to compute boundary keys * Setup-hook verification — Clojure ListRoutes gRPC client asserting multi-group routing is in place * Done-when check — workload exercises dispatchMultiShardTxn (server-side log marker or probe) and Elle finds zero G1c

coderabbitai · 2026-06-03T12:33:11Z

Warning

Review limit reached

@bootjp, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 59 minutes and 48 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6a42e48c-cc17-432e-b842-a4a8ebc5af6b

📥 Commits

Reviewing files that changed from the base of the PR and between 384cacc and cf07ec0.

📒 Files selected for processing (2)

jepsen/src/elastickv/dynamodb_multi_table_workload.clj
jepsen/test/elastickv/dynamodb_multi_table_workload_test.clj

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/composed1-m5a-impl

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist

Code Review

This pull request introduces a new Jepsen workload, dynamodb-multi-table-workload, along with its corresponding unit tests. This workload is designed to exercise the multi-shard 2PC path in elastickv by distributing keys deterministically across four different DynamoDB tables, ensuring transactions span multiple Raft groups. There are no review comments to address, and I have no feedback to provide.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

bootjp · 2026-06-03T14:42:20Z

Workload draft + tests are stable; gemini's first pass returned zero findings. Could use a proper code review before iterating the remaining M5a slices (launch script + setup-hook verification).

@claude review

claude · 2026-06-03T14:42:44Z

Claude finished @bootjp's task in 4m 36s —— View job

Code Review

Tasks

Gather context and understand request
Read workload implementation (dynamodb_multi_table_workload.clj)
Read test file (dynamodb_multi_table_workload_test.clj)
Review: correctness and logic
Review: concurrency / distributed failure paths
Review: test coverage and assertions
Review: Clojure style and best practices
Post findings

Overall

The workload is well-structured. The routing design is sound, the OCC condition-expression pattern is faithfully carried over from the single-table workload, and the key tests lock down the two M5a-specific invariants (deterministic routing + multi-group span). Gemini's "no findings" read was broadly correct — there are no blocking bugs. The items below are one medium-severity correctness gap and several lower-priority issues.

Medium — Stale table data can produce false Elle anomalies across runs

File: dynamodb_multi_table_workload.clj:265

(teardown! [_this _test])

teardown! is a no-op. Tables accumulate appended values across runs. When a subsequent run starts, TransactGetItems will read the leftover lists from the previous run as snapshot, and value' will report reads that include values the current run's Elle checker never observed being appended. This gives Elle visibility of a history it didn't generate — the checker can flag those values as G-single or produce an incorrect analysis.

The single-table workload has the same gap, so this is an inherited pattern rather than a new defect. But with four tables instead of one, the blast radius for stale data is larger and consecutive runs in the launch script are more likely to hit it.

Suggested fix: delete all four tables in teardown!, or at minimum document a cluster-teardown step in the launch script that calls DeleteTable on each. Alternatively, append a per-run UUID or timestamp suffix to table-prefix so each test run owns its own table set without needing cleanup.

Fix this →

Low — `TransactGetItems` zip relies on positional alignment with no verification

File: dynamodb_multi_table_workload.clj:194-197

(into {}
      (map (fn [k entry] [k (read-list-attr (:Item entry))])
           ks-vec
           (:Responses resp)))

The DynamoDB API guarantees :Responses is positionally aligned with :TransactItems, but this is verified nowhere. If the cognitect SDK ever normalises the response (unlikely, but possible after a library upgrade) the snapshot map would be silently corrupt — every key mapped to the wrong value — and the OCC conditions would start accepting transactions they should reject. A one-line assertion ((assert (= (count ks-vec) (count (:Responses resp))))) would at least surface this class of failure loudly rather than silently. The same pattern is in the single-table workload.

Low — `ConditionalCheckFailedException` handler is dead code

File: dynamodb_multi_table_workload.clj:314-316

(contains? #{"ConditionalCheckFailedException"
             "InternalServerError"} err-type)
(assoc op :type :info :error (str err-type))

dynamo-append! (single-mop path) has no ConditionExpression, so it cannot return ConditionalCheckFailedException. The multi-mop path uses TransactWriteItems, which returns TransactionCanceledException on OCC failure — never ConditionalCheckFailedException. This arm is never reachable. Keeping it mirrors the single-table workload and is harmless, but it may mislead a future maintainer into thinking individual condition failures are handled distinctly from transaction cancellations.

Low — `key-count` / `num-tables` coupling not validated at construction time

File: dynamodb_multi_table_workload.clj:345, dynamodb_multi_table_workload_test.clj:47-51

The key-routing-distributes-across-all-tables test hard-codes (range 12) (matching the default key-count=12). But dynamodb-append-multi-table-workload accepts (:key-count opts), meaning the CLI can pass --key-count 10. With 10 keys and 4 tables, two tables get 2 keys and two get 3 — the even-distribution property breaks silently. The test would still pass (it tests 0..11), and the Jepsen run would proceed, but two tables would have lower key coverage and a 1-key txn on one of those tables would never trigger multi-group dispatch.

A guard at workload construction time (e.g., (when-not (zero? (mod key-count num-tables)) (warn ...))) or a test case with a non-multiple key-count would make this visible.

Trivial — Deftest naming inconsistency

File: dynamodb_multi_table_workload_test.clj:61

(deftest key-routing-pks-disambiguate_colliding_keys

Underscore in an otherwise hyphenated test name. Should be key-routing-pks-disambiguate-colliding-keys.

Fix this →

Design note — `setup!` doesn't wait for tables to reach ACTIVE state

CreateTable in real DynamoDB is asynchronous; writes immediately after creation fail with ResourceNotFoundException until the table is ACTIVE. The workload issues create-all-tables! in setup! and then starts sending traffic without polling DescribeTable. This works if elastickv's compatible implementation makes table creation synchronous, but is worth verifying before the end-to-end Jepsen run. If creation is async, a polling loop in create-one-table! (checking for ACTIVE) will prevent spurious ResourceNotFoundException failures that would be misread as network errors.

What looks good

Routing math is correct. (key->table-idx k) = (inc (mod k 4)) and (key->pk k) = (str (quot k 4)) produce a unique (table, pk) pair for every non-negative integer k. Verified analytically and by the four routing tests.
OCC conditions faithfully ported. ConditionCheck on read-only keys and snapshot-conditioned Update on written keys are identical to the single-table workload. The multi-table delta is purely TableName routing — no logic drift.
RYOW simulation is correct. Local appends are accumulated before dynamo-transact-write! is called; :r mops in the same txn see those values; :ok is only returned if the write succeeds.
by-key grouping prevents duplicate-item rejection. Multiple appends to the same Elle key are merged into one Update, satisfying DynamoDB's constraint against two operations on the same item in a single TransactWriteItems.
multi-op-txn-spans_multiple_groups closes the main M5a regression hole. The test is correctly structured; (some keys-in-group1 tables) usage is idiomatic.
Intentional duplication rationale is well-documented. The ns docstring explains why shared helpers weren't extracted.

… key-count guard (claude[bot] on PR #916) claude[bot]'s review of dded8b96 surfaced one medium-severity correctness gap and several low-priority items. All addressed: * MEDIUM — Stale table data across runs (teardown! no-op). Tables accumulated appended values across runs; the next TransactGetItems pre-read picked them up as snapshot, :r mops reported them to Jepsen, and Elle saw a history containing writes it never generated — spurious G-single / G1c. Fix: added delete-all-tables! mirroring create-all-tables!'s exception handling (ignore ResourceNotFoundException for idempotency), and called it from the client's teardown!. * LOW — TransactGetItems positional alignment unverified. DynamoDB guarantees :Responses is positionally aligned with :TransactItems, but the workload relied on this without an assertion. A future cognitect SDK that ever normalises the response would silently corrupt the snapshot map. Fix: one-line assert on (count ks-vec) vs (count Responses) so the failure surfaces loudly rather than via mysterious OCC behaviour. * LOW — key-count must be a multiple of num-tables. Default key-count=12 splits evenly across N=4 tables, but --key-count 10 (or any non-multiple) silently gives some tables fewer keys, breaking the multi-group-span invariant that is the entire point of this workload. Fix: construction-time (assert (zero? (mod key-count num-tables))) so an uneven split fails-fast at workload build time rather than producing a confusing test run. Two new tests pin this: rejects-key-count-not-divisible- by-num-tables (10, 7 reject) and accepts-divisible-key- counts (4, 8, 16 accept). * TRIVIAL — deftest naming inconsistency. Two test names had stray underscores instead of hyphens (key-routing-pks-disambiguate_colliding_keys and multi-op-txn-spans_multiple_groups). Renamed both to the project's hyphen-case convention. Deferred (claude[bot] notes): * LOW — ConditionalCheckFailedException dead-code arm. Mirrors the single-table workload; not blocking; kept for parity until a single-table cleanup lands. * DESIGN NOTE — setup! doesn't wait for ACTIVE state. Real DynamoDB's CreateTable is async, but elastickv's implementation may be synchronous. Worth verifying at the end-to-end Jepsen run; if needed, will add a DescribeTable poll loop. Verification: lein test elastickv.dynamodb-multi-table-workload-test -> Ran 9 tests containing 22 assertions, 0 failures, 0 errors.

bootjp · 2026-06-03T14:54:26Z

Pushed as c46b2b5 — claude[bot]'s findings on aded8b9 addressed.

Medium — stale table data across runs. teardown! was a no-op, so the next run's TransactGetItems pre-read picked up leftover Items as snapshot, :r mops reported them to Jepsen, and Elle saw a history containing writes it never generated.

Fix: added delete-all-tables! mirroring create-all-tables!'s exception handling (ignore ResourceNotFoundException for idempotency), and called it from the client's teardown!.

Low — TransactGetItems positional alignment unverified. One-line assert on (count ks-vec) = (count :Responses) so a future cognitect SDK that normalises the response surfaces loudly.

Low — key-count must be multiple of num-tables. --key-count 10 would silently give some tables fewer keys, breaking the multi-group-span invariant. Construction-time assert fails-fast; two new tests pin both reject (10, 7) and accept (4, 8, 16) cases.

Trivial — deftest naming. Two test names had stray underscores; renamed to project's hyphen-case convention.

Deferred (claude[bot] explicitly tagged as low-priority):

ConditionalCheckFailedException dead-code arm — mirrors single-table workload, kept for parity until a single-table cleanup lands.
setup! doesn't wait for ACTIVE state — real DynamoDB's CreateTable is async but elastickv's may be synchronous; will verify at the end-to-end Jepsen run and add a DescribeTable poll if needed.

lein test elastickv.dynamodb-multi-table-workload-test → Ran 9 tests / 22 assertions, 0 failures, 0 errors.

@claude review

claude · 2026-06-03T14:54:50Z

Claude finished @bootjp's task in 1m 43s —— View job

Code Review — follow-up pass on `c46b2b5`

Tasks

Load tools
Read updated workload implementation
Read updated test file
Review: fixes from previous round (teardown!, assert, key-count guard, naming)
Review: deferred items and remaining concerns
Post findings

Verdict: all four previous findings correctly addressed — ready to remove draft once launch script lands

Fix-by-fix verification

Medium (stale table data) → ✅ Fixed correctly

delete-all-tables! (lines 159–171) mirrors create-all-tables! exactly: doseq over table-names, ignores ResourceNotFoundException so teardown is idempotent across partial-setup failures and double-teardown. Called from teardown! (lines 297–301). The comment in the docstring and in teardown! correctly records why this matters for Elle history correctness.

Low (TransactGetItems positional alignment) → ✅ Fixed correctly

Lines 224–225 add the assert with a meaningful failure message. One note: Clojure assertions can be elided globally via (set! *assert* false) / :elide-assertions true, but in a Jepsen test process that is never the case, so this is sufficient.

Low (key-count % num-tables guard) → ✅ Fixed correctly

Construction-time assert at lines 388–391. Two new rejection tests (rejects-key-count-not-divisible-by-num-tables) cover 10 and 7; three acceptance tests (accepts-divisible-key-counts) cover 4, 8, 16. The test count matches the reported 9 tests / 22 assertions.

Trivial (deftest naming) → ✅ Fixed

key-routing-pks-disambiguate-colliding-keys at line 61 is now hyphen-case throughout.

Deferred items — status unchanged, dispositions look right

ConditionalCheckFailedException dead-code arm — kept for parity with the single-table workload. Fine; a future cleanup can remove both together.

setup! doesn't poll for ACTIVE — gating this on the end-to-end Jepsen run is correct. If elastickv's CreateTable is synchronous (as most compatible implementations are), no change is needed. If the run surfaces spurious ResourceNotFoundException failures immediately after setup!, add a DescribeTable poll there.

One new observation (no action needed before unmark-draft)

teardown! is called on the first-opened client only

In Jepsen's lifecycle, teardown! fires once on a single node's client, then close! fires on every node's client. The ddb field in teardown!'s scope is the client opened for that one node — delete-all-tables! only needs a valid connection to any node, so this is fine. Worth knowing for the launch-script author: if the teardown node is unreachable, the deletion will fail. The ResourceNotFoundException-only guard in delete-one-table! does not absorb network errors (cognitect.anomalies/unavailable). If partial network faults are left active at teardown time, tables may not be cleaned up. No change needed now; document in the launch-script teardown notes.

What looks good in `c46b2b5`

delete-all-tables! docstring explains the Elle-history rationale clearly — future maintainers will understand why this isn't optional cleanup.
The assert message at line 225 includes the contract text ("positionally aligned with the requested items"), not just the check condition — good.
The assert message at line 389–391 interpolates both the actual value and the bound so the failure is immediately actionable from a CLI invocation.
Test count math: 3+1+1+1+5+4+2+3+2 = 22 assertions across 9 tests — matches the reported run output exactly.

… (claude[bot] design note on c46b2b5) claude[bot]'s follow-up review of c46b2b5 listed two deferred items: * ConditionalCheckFailedException dead-code arm — mirrors the single-table workload, claude OK'd 'A future cleanup can remove both together.' Still deferred. * setup! doesn't wait for ACTIVE state — claude OK'd 'gating this on the end-to-end Jepsen run is correct' but flagged the assumption is unverified. The second one is cheap to lock down with actual code evidence rather than leaving it speculative. Verified against adapter/dynamodb.go:779-783 — elastickv's CreateTable handler returns TableStatus=ACTIVE synchronously in the response body, so the table is queryable as soon as create-one-table! returns. No DescribeTable poll needed for the elastickv target. Real DynamoDB's CreateTable is async; if this workload is ever pointed at a real DynamoDB endpoint (unlikely — Jepsen is wired against elastickv only), a poll loop must be added. Documented at the setup! call site so a future reader doesn't have to re-do the adapter inspection. Doc-comment-only change. lein test elastickv.dynamodb-multi-table-workload-test still 9 tests / 22 assertions, 0 failures, 0 errors.

bootjp · 2026-06-03T17:11:49Z

You linked claude[bot]'s initial review of aded8b9 (the 6-finding pass). All 4 actionable findings from that review were addressed in c46b2b5 and confirmed by claude[bot]'s follow-up at 14:54:50 — both deferrals were explicitly OK'd:

ConditionalCheckFailedException dead-code arm — kept for parity with the single-table workload; claude said "Fine; a future cleanup can remove both together."
setup! doesn't poll for ACTIVE state — claude said "gating this on the end-to-end Jepsen run is correct."

The second deferral was the only one with an unverified assumption: claude[bot] flagged it would matter "if elastickv's CreateTable is async like real DynamoDB." Locked that down with actual code evidence in cf07ec0 — adapter/dynamodb.go:779-783 returns TableStatus=ACTIVE synchronously in the response body, so the table is queryable as soon as create-one-table! returns. Added a comment at the setup! call site so a future reader (or someone retargeting this workload at real DynamoDB) doesn't have to re-do the adapter inspection.

Doc-only; lein test still 9 / 22 passing.

@claude review

claude · 2026-06-03T17:12:11Z

Claude finished @bootjp's task in 1m 8s —— View job

Code Review — third pass on `cf07ec0`

Tasks

Load tools
Read updated Clojure workload file
Read updated test file
Verify synchronous ACTIVE claim in adapter/dynamodb.go:779-783
Check git log for cf07ec0 diff
Confirm all prior findings addressed; post final verdict

Verdict: clean — all prior findings resolved, doc-only commit is correct, no new issues

`cf07ec0` verification

The synchronous ACTIVE claim is correct and well-evidenced.

adapter/dynamodb.go:779-783 writes TableStatus=ACTIVE only after createTableWithRetry returns nil. Tracing that call (dynamodb.go:856-888): it calls d.coordinator.Dispatch(ctx, req) and only returns nil once Raft consensus completes and the create-table entry is applied to the FSM. There is no goroutine or background channel — the HTTP response is written synchronously after commit. The table is queryable the moment create-one-table! returns.

The added comment at setup! (lines 295–301) correctly documents:

why no poll loop is needed against elastickv
the exact code location confirming it (adapter/dynamodb.go:779-783)
the forward-compatibility note for real DynamoDB endpoints

No doc content is incorrect or misleading.

Status of all findings from prior rounds

Finding	Status
Medium — stale table data / `teardown!` no-op	Fixed in `c46b2b5` — `delete-all-tables!` with idempotent guard
Low — `TransactGetItems` positional alignment unverified	Fixed in `c46b2b5` — assert with message at line 224
Low — `key-count` / `num-tables` coupling unguarded	Fixed in `c46b2b5` — construction-time assert + 5 tests
Trivial — deftest naming underscores	Fixed in `c46b2b5`
Low — `ConditionalCheckFailedException` dead-code arm	Deferred by design (parity with single-table workload)
Design note — `setup!` ACTIVE poll	Locked down in `cf07ec0` — synchronous by inspection

What looks good overall

The workload is sound. The routing math ((inc (mod k N)) / (str (quot k N))) produces unique (table, pk) pairs across all non-negative Elle integer keys. The OCC condition-expression pattern is faithfully carried over from the single-table workload. The test suite covers the M5a-specific invariants (deterministic routing + multi-group span + key-count divisibility), and the routing tests use var-get on private vars so a future routing change surfaces here rather than as a silent false negative in a Jepsen run. All comments that reference prior review rounds are accurate.

Ready to unmark draft once the launch script and setup-hook verification PRs land.

gemini-code-assist Bot reviewed Jun 3, 2026

View reviewed changes

bootjp marked this pull request as ready for review June 3, 2026 17:24

bootjp merged commit 78fae66 into main Jun 3, 2026
9 checks passed

bootjp deleted the feat/composed1-m5a-impl branch June 3, 2026 17:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(jepsen): M5a — dynamodb-append-multi-table-workload#916

feat(jepsen): M5a — dynamodb-append-multi-table-workload#916
bootjp merged 3 commits into
mainfrom
feat/composed1-m5a-impl

bootjp commented Jun 3, 2026

Uh oh!

coderabbitai Bot commented Jun 3, 2026 •

edited

Loading

Review limit reached

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

bootjp commented Jun 3, 2026

Uh oh!

claude Bot commented Jun 3, 2026 •

edited

Loading

Uh oh!

bootjp commented Jun 3, 2026

Uh oh!

claude Bot commented Jun 3, 2026 •

edited

Loading

Uh oh!

bootjp commented Jun 3, 2026

Uh oh!

claude Bot commented Jun 3, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

bootjp commented Jun 3, 2026

What this PR ships

Why a separate workload exists

What this workload does

Why self-contained

Tests (jepsen/test/elastickv/dynamodb_multi_table_workload_test.clj)

Remaining M5a work (separate PRs after this lands)

Test plan

Uh oh!

coderabbitai Bot commented Jun 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

bootjp commented Jun 3, 2026

Uh oh!

claude Bot commented Jun 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review

Overall

Medium — Stale table data can produce false Elle anomalies across runs

Low — TransactGetItems zip relies on positional alignment with no verification

Low — ConditionalCheckFailedException handler is dead code

Low — key-count / num-tables coupling not validated at construction time

Trivial — Deftest naming inconsistency

Design note — setup! doesn't wait for tables to reach ACTIVE state

What looks good

Uh oh!

bootjp commented Jun 3, 2026

Uh oh!

claude Bot commented Jun 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review — follow-up pass on c46b2b5

Verdict: all four previous findings correctly addressed — ready to remove draft once launch script lands

Fix-by-fix verification

Deferred items — status unchanged, dispositions look right

One new observation (no action needed before unmark-draft)

What looks good in c46b2b5

Uh oh!

bootjp commented Jun 3, 2026

Uh oh!

claude Bot commented Jun 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review — third pass on cf07ec0

Verdict: clean — all prior findings resolved, doc-only commit is correct, no new issues

cf07ec0 verification

Status of all findings from prior rounds

What looks good overall

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Tests (`jepsen/test/elastickv/dynamodb_multi_table_workload_test.clj`)

coderabbitai Bot commented Jun 3, 2026 •

edited

Loading

claude Bot commented Jun 3, 2026 •

edited

Loading

Low — `TransactGetItems` zip relies on positional alignment with no verification

Low — `ConditionalCheckFailedException` handler is dead code

Low — `key-count` / `num-tables` coupling not validated at construction time

Design note — `setup!` doesn't wait for tables to reach ACTIVE state

claude Bot commented Jun 3, 2026 •

edited

Loading

Code Review — follow-up pass on `c46b2b5`

What looks good in `c46b2b5`

claude Bot commented Jun 3, 2026 •

edited

Loading

Code Review — third pass on `cf07ec0`

`cf07ec0` verification