BlasEntry::m_linkedInstances: std::vector → std::unordered_set for O(1) unlink by BinqAdams · Pull Request #142 · NVIDIAGameWorks/dxvk-remix

BinqAdams · 2026-05-19T09:15:07Z

Summary

BlasEntry::unlinkInstance() was O(N) (std::find + swap-and-pop on a vector). With many same-geometry transient instances sharing one BlasEntry, end-of-frame GC ran N unlinks at O(N) each — O(N²) per frame.
Switching m_linkedInstances to std::unordered_set<RtInstance*> makes unlink O(1) average. linkInstance becomes .insert(); unlinkInstance becomes .erase(). No public-API behavior change.
Present since 98314db1 (Aug 2022). Doesn't bite typical NV target games (Portal RTX, HL2 RTX) because meshes there generally have unique vertex content and don't collide into one BlasEntry via exactMatch. Surfaces in games with many same-geometry single-bone draws — Painkiller's FFP indexed-blend skinning is one such pattern, where hundreds of particle billboards emitted from skinned joints share geometry+material and collapse onto one BlasEntry.

Compatibility

getLinkedInstances() previously returned const std::vector<RtInstance*>&. It now returns const std::unordered_set<RtInstance*>&.
All in-tree consumers only use .size() (rtx_accel_manager.cpp dynamic-BLAS heuristic, BlasEntry debug print) and .empty() (rtx_scene_manager.cpp BVH cache GC predicate). Both work identically on either container.
No consumer iterates m_linkedInstances or depends on insertion order.

Test plan

Builds clean on top of latest main (REMIX-5504 included).
Verified in Painkiller (custom title using dxvk-remix): pre-fix end-of-frame GC scaled quadratically with particle count and caused visible FPS drops in particle-heavy scenes; post-fix the cost is flat with N.
CI tests.

Note on attribution

This O(N²) pattern wasn't introduced by recent commits — it has been latent since the first RTX Remix commit. The first-commit author left a "Swap & pop - faster than 'erase'" comment, indicating awareness of the pop-side cost but not concern about the linear std::find. This PR addresses that.

…1) unlink The container stores every RtInstance currently bound to a BlasEntry, and BlasEntry::unlinkInstance() called std::find on it (linear scan) followed by swap-and-pop. With many same-geometry transient instances sharing one BlasEntry — for example hundreds of single-bone particle billboards emitting from a monster joint, all sharing geometry and material so they collapse onto one BlasEntry through exactMatch — the vector grew to N entries and end-of-frame GC ran N unlinks at O(N) each, for O(N²) total per frame. Switching the container to std::unordered_set makes unlinkInstance O(1) average. linkInstance is now insert() instead of push_back(); getLinkedInstances() returns a set reference. No current consumer iterates the container or depends on insertion order — they only call .size() (accel_manager dynamic-BLAS heuristic, BlasEntry debug print) and .empty() (BVH cache GC predicate in scene_manager), which work identically on either container. This was present since the first RTX Remix commit (98314db, Aug 2022) but doesn't bite in typical NV target games where most meshes have unique vertex content and end up in their own buckets. Painkiller's FFP indexed-blend skinning produces many same-geometry single-bone draws that share a BlasEntry, exposing the quadratic scaling.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BlasEntry::m_linkedInstances: std::vector → std::unordered_set for O(1) unlink#142

BlasEntry::m_linkedInstances: std::vector → std::unordered_set for O(1) unlink#142
BinqAdams wants to merge 1 commit into
NVIDIAGameWorks:mainfrom
BinqAdams:fix/m_linkedInstances-unordered-set

BinqAdams commented May 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

BinqAdams commented May 19, 2026

Summary

Compatibility

Test plan

Note on attribution

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant