Pull requests: huggingface/trl

Pull requests list

🚀 DeepSpeed integration documentation
#2993 opened Feb 28, 2025 by qgallouedec
5 tasks
Add support for additional generation kwargs in GRPO Trainer
#2989 opened Feb 28, 2025 by nopepper
3 of 5 tasks
Feature(WIP): Add SGLang as inference backend for generation in GRPO
#2981 opened Feb 28, 2025 by jhinpan
5 tasks done
Supporting deepspeed>=0.16.4's rename
#2963 opened Feb 26, 2025 by jamesbraza
Support ReMax Algorithm
#2955 opened Feb 25, 2025 by liziniu
3 tasks done
Agents
#2936 opened Feb 23, 2025 by August-murr
Add the metrics completion_length_max and completion_length_min
#2930 opened Feb 22, 2025 by dignfei
4 tasks
Supporting multi-vLLM inference for GRPO
#2929 opened Feb 22, 2025 by ghrua
2 of 5 tasks
Liger GRPO support
#2926 opened Feb 21, 2025 by SalmanMohammadi Draft
4 tasks
Remove CUDA synchronization in mean_token_accuracy
#2902 opened Feb 19, 2025 by cyyever
1 task done
Fast dataset truncate in SFTTrainer
#2898 opened Feb 18, 2025 by mariosasko
1 of 5 tasks
BCOTrainer version upgrade fixes
#2867 opened Feb 15, 2025 by claralp
3 of 5 tasks
[Liger] Liger KTO support
#2812 opened Feb 10, 2025 by vaibhavjindal
3 of 5 tasks
[draft] Use vLLM in LogCompletionsCallback
#2797 opened Feb 7, 2025 by tchang1997 Draft
2 of 4 tasks
Remote GRPO ref model
#2763 opened Feb 4, 2025 by edbeeching Draft
WIP: RLOOV2
#2724 opened Jan 31, 2025 by mnoukhov Draft
3 tasks