Pull requests: sgl-project/sglang
[MOE] try to optimize cu kernel single block execution - distribute cumsum workload from thread 0 to other threads
#2970
opened Jan 19, 2025 by
yiakwy-xpu-ml-framework-team
3 of 4 tasks
[EAGLE] Fix some boundary situation when retract reqs and req's max token = 1
#2939
opened Jan 17, 2025 by
josephydu
Support distributed tensor when updating weights
#2831
opened Jan 10, 2025 by
fzyzcjy
3 tasks done
Support custom device mesh for tensor parallel workers
#2827
opened Jan 10, 2025 by
fzyzcjy
3 tasks done
Use CUDA_VISIBLE_DEVICES instead of gpu_id variables everywhere.
#2824
opened Jan 10, 2025 by
heiner
1 task done
Improve the mixed chunk prefill by launching two kernels
#2811
opened Jan 9, 2025 by
libratiger
Draft
1 of 3 tasks
Add endpoint for file support, purely to speed up processing of input_embeds.
#2797
opened Jan 8, 2025 by
RinRin-32
2 of 3 tasks
Speculative decoding with lookahead
Labels: enhancement, high priority
#2790
opened Jan 8, 2025 by
jjjjohnson
3 tasks done