Pull requests: NVIDIA/Megatron-LM
Pull requests list
cp: [Was PR1912][Dev] feat(moe): Fine-grained activation offloading (#1969) into core_dev_r0.15.0
#2605
opened Dec 9, 2025 by
chtruong814
6 tasks
Modify config for the PP fix.
Expert Review
Apply this label to indicate that your PR is ready for expert review.
Inference | Add request only if no paused requests.
Expert Review
Check skip_prompt_log_probs in add_request
Expert Review
Inference | Fix entangled request generations.
Expert Review
Synchronize total block count across pipeline parallel ranks
#2578
opened Dec 5, 2025 by
santhnm2
6 tasks
fix: ckpt loading failed because of padding metadata in dist optimizer
Expert Review
#2576
opened Dec 5, 2025 by
yaoyu-33
6 tasks
[Megatron-FSDP] Support both old and new DeviceMesh APIs.
Expert Review
Add initial support for Kimi Delta Attention (Feature Request #2446)
community-request
#2573
opened Dec 5, 2025 by
CodersAcademy006
partial cudagraph scopes and improvements for training
#2572
opened Dec 5, 2025 by
jiemingz
6 tasks
Fix: Ensure token IDs respect vocab_size in dataset, embeddings, and …
community-request
#2570
opened Dec 5, 2025 by
CodersAcademy006
Add offset method for slow tokenizer
community-request
#2567
opened Dec 5, 2025 by
cael-ling
6 tasks