Pull requests: vllm-project/vllm

Pull requests list

Relax CuPy constraint to only exclude 14.1.0
Labels: ci/build
#44284 opened Jun 2, 2026 by khluu Member
[Anthropic] Support system role messages inside messages array
Labels: frontend
#44283 opened Jun 2, 2026 by chaunceyjiang Collaborator
4 tasks
[Bugfix] Vendor MiniCPMV/MiniCPMO processors to unblock Transformers v5
Labels: bug, multi-modality
#44282 opened Jun 2, 2026 by wjinxu Contributor
[Refactor] Remove dead code from parser infrastructure
Labels: ready
#44279 opened Jun 2, 2026 by sfeng33 Collaborator
[XPU] Support dequant Block-scaled FP8 checkpoints (e.g. RedHatAI/gemma-4-31B-it-FP8-block)
Labels: intel-gpu
#44278 opened Jun 2, 2026 by libinta Contributor
4 tasks
[Breakable Graph] add eager break for unified attention
#44275 opened Jun 2, 2026 by zhenwei-intel Contributor
4 tasks
[Core] Move max_concurrent_batches to VllmConfig
Labels: ready, v1
#44274 opened Jun 2, 2026 by njhill Member
[XPU][CI] Add intel jobs in daily CI
Labels: ci/build, intel-gpu
#44271 opened Jun 2, 2026 by zxd1997066 Contributor Draft
3 of 4 tasks
[Bugfix] Restore logging disable level on exception in suppress_logging
Labels: bug
#44268 opened Jun 2, 2026 by Kymi808
[Refactor] Unify reasoning + tool-call parsing behind Parser.parse()
Labels: frontend, ready, tool-calling
#44267 opened Jun 2, 2026 by sfeng33 Collaborator
[Bugfix][Model] Qwen3-Omni: move cu_seqlens to GPU before VIT attention
Labels: bug, qwen
#44264 opened Jun 2, 2026 by liulanze Contributor
3 of 4 tasks
[DSV4] Consolidate attention classes into DeepseekV4Attention
Labels: deepseek, needs-rebase, ready, v1
#44263 opened Jun 1, 2026 by WoosukKwon Collaborator Draft
Add the QuantizedActivation linear-kernel contract
Labels: ci/build, nvidia, quantization, ready
#44260 opened Jun 1, 2026 by mgoin Member
4 tasks
[XPU] Fix flash-attn k/v descale shape for compressed-tensors quantization
Labels: intel-gpu
#44259 opened Jun 1, 2026 by libinta Contributor
4 tasks
[Bugfix][CI] Avoid CUDA init during tests.utils import
Labels: bug, ci/build, kv-connector, nvidia, ready
#44258 opened Jun 1, 2026 by alec-flowers Contributor
[ROCm][CI] Specifying time outs for the lm eval models
Labels: ready, rocm
#44255 opened Jun 1, 2026 by AndreasKaratzas Member
[Bug Fix][Model Runner V2][Spec Decode] Warmup & capture with different attention states for speculator prefill
Labels: bug, nvidia, ready, v1
#44253 opened Jun 1, 2026 by TheEpicDolphin Collaborator
[Bugfix] Detect driver-level CUDA init before fork
Labels: bug, nvidia
#44252 opened Jun 1, 2026 by Sunt-ing