Pull requests: vllm-project/vllm
[Bugfix] Fail fast or extend RoPE cache when VLLM_ALLOW_LONG_MAX_MODEL_LEN exceeds model positions
bug
Something isn't working
#44286
opened Jun 2, 2026 by
Sunt-ing
Relax CuPy constraint to only exclude 14.1.0
ci/build
#44284
opened Jun 2, 2026 by
khluu
Member
[Anthropic] Support system role messages inside messages array
frontend
#44283
opened Jun 2, 2026 by
chaunceyjiang
Collaborator
4 tasks
[Bugfix] Vendor MiniCPMV/MiniCPMO processors to unblock Transformers v5
bug
Something isn't working
multi-modality
Related to multi-modality (#4194)
#44282
opened Jun 2, 2026 by
wjinxu
Contributor
[Refactor] Remove dead code from parser infrastructure
ready
ONLY add when PR is ready to merge/full CI is needed
#44279
opened Jun 2, 2026 by
sfeng33
Collaborator
[XPU] Support dequant Block-scaled FP8 checkpoints (e.g. RedHatAI/gemma-4-31B-it-FP8-block)
intel-gpu
Related to Intel GPU
#44278
opened Jun 2, 2026 by
libinta
Contributor
4 tasks
[Breakable Graph] add eager break for unified attention
#44275
opened Jun 2, 2026 by
zhenwei-intel
Contributor
4 tasks
[Core] Move max_concurrent_batches to VllmConfig
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#44274
opened Jun 2, 2026 by
njhill
Member
[Perf] Add tuned Triton fused MoE configs for NVIDIA H20 (fp8_w8a8)
nvidia
#44273
opened Jun 2, 2026 by
CarrotSwordsman
Revert "[Frontend][Core] Add sparse NCCL weight transfer support for in-place updates" (#40096)
documentation
Improvements or additions to documentation
frontend
v1
#44272
opened Jun 2, 2026 by
vllm-agent
Contributor
•
Draft
[XPU][CI] Add intel jobs in daily CI
ci/build
intel-gpu
Related to Intel GPU
#44271
opened Jun 2, 2026 by
zxd1997066
Contributor
•
Draft
3 of 4 tasks
[Model] Initialize GDN dt_bias and A_log params as float32
#44270
opened Jun 2, 2026 by
1-Y-C
[Bugfix] Restore logging disable level on exception in suppress_logging
bug
Something isn't working
#44268
opened Jun 2, 2026 by
Kymi808
[Refactor] Unify reasoning + tool-call parsing behind Parser.parse()
frontend
ready
ONLY add when PR is ready to merge/full CI is needed
tool-calling
#44267
opened Jun 2, 2026 by
sfeng33
Collaborator
[Bugfix][Model] Qwen3-Omni: move cu_seqlens to GPU before VIT attention
bug
Something isn't working
qwen
Related to Qwen models
#44264
opened Jun 2, 2026 by
liulanze
Contributor
3 of 4 tasks
[DSV4] Consolidate attention classes into DeepseekV4Attention
deepseek
Related to DeepSeek models
needs-rebase
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#44263
opened Jun 1, 2026 by
WoosukKwon
Collaborator
•
Draft
[Bugfix] Skip MiniMax RMSNorm Lamport workspace when custom all-reduce is disabled
bug
Something isn't working
#44261
opened Jun 1, 2026 by
Jiang020609
•
Draft
Add the QuantizedActivation linear-kernel contract
ci/build
nvidia
quantization
ready
ONLY add when PR is ready to merge/full CI is needed
#44260
opened Jun 1, 2026 by
mgoin
Member
4 tasks
[XPU] Fix flash-attn k/v descale shape for compressed-tensors quantization
intel-gpu
Related to Intel GPU
#44259
opened Jun 1, 2026 by
libinta
Contributor
4 tasks
[Bugfix][CI] Avoid CUDA init during tests.utils import
bug
Something isn't working
ci/build
kv-connector
nvidia
ready
ONLY add when PR is ready to merge/full CI is needed
#44258
opened Jun 1, 2026 by
alec-flowers
Contributor
[ROCm][CI] Specifying time outs for the lm eval models
ready
ONLY add when PR is ready to merge/full CI is needed
rocm
Related to AMD ROCm
#44255
opened Jun 1, 2026 by
AndreasKaratzas
Member
[Bug Fix][Model Runner V2][Spec Decode] Warmup & capture with different attention states for speculator prefill
bug
Something isn't working
nvidia
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#44253
opened Jun 1, 2026 by
TheEpicDolphin
Collaborator