
Add ModelRun.total_cost and total_data_rows #2057

Open
apollonin wants to merge 3 commits into develop from dapollonin/model-run-cost-and-data-rows


apollonin (Contributor) commented Jun 2, 2026

What

Adds two lazy properties to ModelRun:

model_run = client.get_model_run(model_run_id)
model_run.total_cost       # float | None  (USD)
model_run.total_data_rows  # int | None    (data rows processed)
model_run.refresh_cost_and_usage()  # bust the cache

Why

Model Foundry users (DoubleVerify) currently log cost and dataset size by hand. This lets them pull both straight off the ModelRun they already fetch — matching the workflow in their existing scripts (client.get_model_run(...)).

How

On first access, a single modelFoundryModelRunInfo query is issued (it proxies to model_service's already-computed per-job cost and data-row count) and the result is cached on the instance. Nothing is persisted on the run; the values are rehydrated in real time. Runs not backed by a Foundry model job return None: ResourceNotFoundError and InternalServerError are caught and treated as no usage data, while transient errors propagate and are not cached.

The get_model_run query itself is unchanged, so existing fetches stay cheap; the extra round trip only happens when cost/usage is actually read. (A server-side ModelRun.totalCost field resolver was rejected because DbObject selects all fields on every fetch, which would hit model_service on every get_model_run.)
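A minimal sketch of the lazy, cached access pattern described above. It is not the SDK implementation: ModelRunSketch, the fetch callable, and the "totalCost" payload key are assumptions made for illustration; only total_cost, total_data_rows, refresh_cost_and_usage and totalDataRows come from this PR.

```python
from typing import Any, Callable, Dict, Optional

# Sentinel meaning "not fetched yet"; an empty dict is a valid cached result.
_UNSET = object()


class ModelRunSketch:
    """Hypothetical stand-in for ModelRun, not the SDK class."""

    def __init__(
        self, uid: str, fetch: Callable[[str], Optional[Dict[str, Any]]]
    ) -> None:
        self.uid = uid
        # `fetch` stands in for the modelFoundryModelRunInfo query (assumption).
        self._fetch = fetch
        self._cost_and_usage: Any = _UNSET

    def _load(self) -> Dict[str, Any]:
        # Issue the query only on first access, then reuse the cached payload.
        if self._cost_and_usage is _UNSET:
            self._cost_and_usage = self._fetch(self.uid) or {}
        return self._cost_and_usage

    @property
    def total_cost(self) -> Optional[float]:
        return self._load().get("totalCost")  # key name is an assumption

    @property
    def total_data_rows(self) -> Optional[int]:
        return self._load().get("totalDataRows")

    def refresh_cost_and_usage(self) -> None:
        # Drop the cached payload so the next read triggers a new fetch.
        self._cost_and_usage = _UNSET


calls = []


def fake_fetch(uid: str) -> Dict[str, Any]:
    # Records each round trip so the caching behaviour is observable.
    calls.append(uid)
    return {"totalCost": 1.25, "totalDataRows": 40}


run = ModelRunSketch("run-1", fake_fetch)
```

Constructing the object issues no query. Reading either property triggers one fetch that serves both values, and refresh_cost_and_usage() makes the next read fetch again.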

Part of a 3-PR stack (deploy in order)

  1. python-monorepo model_service — populates total_data_rows (Labelbox/python-monorepo#2502)
  2. intelligence — exposes totalDataRows on GraphQL (Labelbox/intelligence#29662)
  3. this PR — SDK properties

Test

tests/unit/test_unit_model_run.py — fetch + cache, refresh re-fetch, and None for non-Foundry runs. 3 tests passing (pytest tests/unit/test_unit_model_run.py).


Note

Low Risk
Read-only additive SDK properties with opt-in GraphQL cost; depends on upstream Model Foundry GraphQL being deployed in the stack.

Overview
Adds lazy, cached ModelRun.total_cost (USD) and ModelRun.total_data_rows, populated on first read via a modelFoundryModelRunInfo GraphQL call (not on get_model_run). refresh_cost_and_usage() clears the cache for a live re-fetch.

Non–Model Foundry runs and missing payloads return None; ResourceNotFoundError / InternalServerError and empty responses are treated as no usage data. Transient errors (e.g. NetworkError) are not cached so a later read can retry.

Unit tests in test_unit_model_run.py cover fetch/cache, refresh, non-Foundry/missing payload behavior, and transient error propagation.

Reviewed by Cursor Bugbot for commit 9f91a2b. Bugbot is set up for automated code reviews on this repo.

Expose a model run's total cost and processed data-row count, fetched
in real time from Model Foundry (modelFoundryModelRunInfo) on property
access and cached on the instance -- nothing is persisted on the run.
Returns None for runs not backed by a Foundry model job.

Requires the matching backend changes that surface totalDataRows on
the modelFoundryModelRunInfo GraphQL field.
Comment thread libs/labelbox/src/labelbox/schema/model_run.py Outdated
Catch ResourceNotFoundError/InternalServerError (run has no Foundry
model job) and cache the empty result, but let transient errors
(network, timeout, rate limit) propagate so they are not permanently
cached as None. Addresses Bugbot review feedback.
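
The error handling in this commit message can be sketched roughly as follows. The exception classes are local stand-ins named after the SDK's, and UsageCache, flaky_fetch and missing_fetch are hypothetical helpers; the actual code in model_run.py may be structured differently.

```python
from typing import Any, Callable, Dict, Optional


# Local stand-ins for the SDK exceptions of the same names (assumption:
# defined here only so the sketch runs without labelbox installed).
class ResourceNotFoundError(Exception):
    pass


class InternalServerError(Exception):
    pass


class NetworkError(Exception):
    pass


_UNSET = object()


class UsageCache:
    """Hypothetical helper: caches 'no data' results but not transient failures."""

    def __init__(self, fetch: Callable[[], Optional[Dict[str, Any]]]) -> None:
        self._fetch = fetch
        self._value: Any = _UNSET

    def get(self) -> Dict[str, Any]:
        if self._value is _UNSET:
            try:
                payload = self._fetch()
            except (ResourceNotFoundError, InternalServerError):
                # The run has no Foundry model job: treat as empty usage data.
                payload = None
            # Any other exception (e.g. NetworkError) propagates before this
            # assignment, so nothing is cached and a later read can retry.
            self._value = payload or {}
        return self._value


attempts = {"n": 0}


def flaky_fetch() -> Dict[str, Any]:
    # Fails once with a transient error, then succeeds.
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise NetworkError("temporary failure")
    return {"totalCost": 0.5}  # key name is an assumption


cache = UsageCache(flaky_fetch)
try:
    cache.get()
    first_read = "ok"
except NetworkError:
    first_read = "raised"
second_read = cache.get()


def missing_fetch() -> Dict[str, Any]:
    raise ResourceNotFoundError("no Foundry model job")


no_job = UsageCache(missing_fetch).get()
```

Here the first read raises and leaves the cache unset, the second read succeeds, and the run without a Foundry job resolves to an empty payload that the properties would surface as None.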

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.


Reviewed by Cursor Bugbot for commit ff2aadd.

Comment thread libs/labelbox/src/labelbox/schema/model_run.py Outdated
execute() returns None (not raises) for a RESOURCE_NOT_FOUND response
since raise_return_resource_not_found defaults to False, so index the
payload defensively to avoid a TypeError. Addresses Bugbot feedback.
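
One way to index the payload defensively, as this commit message describes, is sketched below. The response shape (a dict keyed by modelFoundryModelRunInfo), the "totalCost" key, and the helper name extract_cost_and_rows are assumptions for illustration, not the code in model_run.py.

```python
from typing import Any, Dict, Optional, Tuple


def extract_cost_and_rows(
    response: Optional[Dict[str, Any]],
) -> Tuple[Optional[float], Optional[int]]:
    """Hypothetical helper: tolerate a None response or a None/missing field."""
    # `response` may be None when the server answers RESOURCE_NOT_FOUND and the
    # client returns None instead of raising, so avoid response["..."] directly.
    info = (response or {}).get("modelFoundryModelRunInfo") or {}
    return info.get("totalCost"), info.get("totalDataRows")


none_case = extract_cost_and_rows(None)
null_field_case = extract_cost_and_rows({"modelFoundryModelRunInfo": None})
full_case = extract_cost_and_rows(
    {"modelFoundryModelRunInfo": {"totalCost": 2.0, "totalDataRows": 10}}
)
```

Chaining `or {}` at each level means a missing response, a null field, and a missing key all collapse to (None, None) without raising a TypeError.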