Issues: huggingface/trl

Issues list

(deepspeed stage 3) precompute_ref_log_probs support
#2985 opened Feb 28, 2025 by cyr0930 · Labels: ⚡accelerate, 🚀 deepspeed, 🏋 DPO, ✨ enhancement
AttributeError: Can't pickle local object 'GRPOTrainer.__init__.<locals>.data_collator'
#2979 opened Feb 27, 2025 by williamzebrowskI · Labels: 🐛 bug, 🏋 GRPO · 5 tasks done
Add Community Tutorials related to TRL Integrations
#2978 opened Feb 27, 2025 by ParagEkbote · Labels: 🚀 deepspeed, 📚 documentation, ✨ enhancement, 🦥 unsloth
GRPO Stuck on Step 0
#2977 opened Feb 27, 2025 by zaddy6 · Labels: ⚡accelerate, 🐛 bug, 🏋 GRPO · 5 tasks done
There may be some doubts about the advantage function of GRPO
#2976 opened Feb 27, 2025 by L1n111ya · Labels: 🏋 GRPO, 🏋 Reward · 5 tasks done
How many H20 (96GB) GPUs are needed to train Qwen7B with the GRPO algorithm?
#2972 opened Feb 27, 2025 by Tuziking · Labels: 🏋 GRPO, ❓ question
Add length normalization option to DPO Trainer
#2964 opened Feb 26, 2025 by ggbetz · Labels: 🏋 DPO, ✨ enhancement
DeepSpeedZeRoOffload is incompatible with DeepSpeed>=0.16.4
#2962 opened Feb 26, 2025 by jamesbraza · Labels: 🐛 bug, ⚡ PEFT · 5 tasks done
Maybe some bug in multi gpu training
#2960 opened Feb 26, 2025 by Kfkcome · Labels: ⚡accelerate, 🐛 bug · 5 tasks done
compute_metrics in GRPOTrainer
#2959 opened Feb 26, 2025 by dipta007 · Labels: ✨ enhancement, 🏋 GRPO
Request for adding training scripts for SPIN and SPPO
#2958 opened Feb 25, 2025 by jkx19 · Labels: 🏋 DPO, ✨ enhancement
A quick question about AutoModelForCausalLMWithValueHead
#2957 opened Feb 25, 2025 by SakurajimaMaiii · Labels: ❓ question
A little question about GRPO padding way
#2944 opened Feb 24, 2025 by MlSAKA-MlKOTO · Labels: 🏋 GRPO, ❓ question
IterableDataset not supported on GRPOTrainer
#2942 opened Feb 24, 2025 by Marsella8 · Labels: 🐛 bug, 🏋 GRPO · 5 tasks done
How to dynamically adjust params during grpo training?
#2941 opened Feb 24, 2025 by Tomsawyerhu · Labels: 🏋 GRPO, ❓ question
Is there any problem with GRPOtrainer’s memory usage?
#2927 opened Feb 21, 2025 by Tuziking · Labels: 🏋 GRPO, ❓ question
NCCL timeout when GRPO training with vllm
#2923 opened Feb 21, 2025 by edwardzjl · Labels: 🐛 bug, 🏋 GRPO
How to support multi-device VLLM inference in the GRPO Trainer
#2922 opened Feb 21, 2025 by 0x404 · Labels: ✨ enhancement, 🏋 GRPO