Pull requests: intel/auto-round

Pull requests list

fix: replace bare except clauses with except Exception
#1981 opened Jul 1, 2026 by ramkrishs Contributor
4 of 5 tasks
Support vLLM-based Model Quantization with llm_compressor Export
#1978 opened Jul 1, 2026 by changwangss Contributor
4 tasks
Add hpu inference in CI test
#1976 opened Jul 1, 2026 by chensuyue Contributor
4 tasks
[ARK] Support gemm using sycl-tla
#1968 opened Jun 30, 2026 by Zhenzhong1 Contributor Draft
feat: add --dry-run VRAM/size estimation mode
#1958 opened Jun 26, 2026 by mvanhorn
Fix UltraChat chat-template handling for Transformers v5
#1941 opened Jun 22, 2026 by Copilot AI Draft
2 of 4 tasks
Add quantization support for DiffusionGemma
#1935 opened Jun 17, 2026 by lvliang-intel Contributor
1 of 4 tasks
Added prefill strategy benchmarking script and results
#1923 opened Jun 15, 2026 by jijiaz
[draft]refine device
#1900 opened Jun 9, 2026 by wenhuach21 Contributor Draft
4 tasks
feat: add overlap function for multi-blocks compression
#1850 opened May 25, 2026 by ZaneMark Contributor
3 tasks
Add moe prefill/ decode with int2/int4/int8 sym /asym and fp8 e4m3 e5m2
#1813 opened May 14, 2026 by Copilot AI
4 tasks done
feat: support Nemotron-H / Nemotron-Cascade-2 (#1711)
#1712 opened Apr 20, 2026 by michael-rabe
4 of 9 tasks
Continuously optimize AutoScheme RAM consumption
#1703 opened Apr 17, 2026 by lvliang-intel Contributor
2 of 9 tasks