[DSv4] Support Megatron-aligned training loop #4623
Signed-off-by: huangjiyi <947613776@qq.com>
PaddleFormers Log Analysis
Log analysis report
Failed test cases:
Root cause analysis: This PR adds
Suggested fix:
🔄 Automatically updated after each re-run
Summary
This PR adds the PaddleFormers-side DSv4 training loop support needed by the current Megatron-aligned PaddleFleet training path on develop.
Main areas covered:
Pairs with PaddlePaddle/PaddleFleet#1151.
Validation
Validated in /root/paddlejob/share-storage/gpfs/system-public/huangjiyi/dsv4_backward_align/0605_latest_align_22steps with PaddleFleet 60700ccb5edb1da29183ca27c4b34c7f06dbb9fc and PaddleFormers a13fcde3f809a011edff0a22c809bd0a1403f46f:
Result:
Also ran git diff --cached --check and py_compile for the touched PaddleFormers files before commit.
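For reviewers who want to repeat the pre-commit checks mentioned above, the following is a minimal sketch of one way to script them. It is not the script used for this PR. The helper names (staged_python_files, compile_failures, run_checks) are hypothetical, and the sketch assumes it is run from the repository root with the changes already staged.

```python
import py_compile
import subprocess


def staged_python_files():
    """Return staged (added/copied/modified) .py paths, relative to the repo root."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return [path for path in out.splitlines() if path.endswith(".py")]


def compile_failures(paths):
    """Byte-compile each path; return (path, message) pairs for files that fail."""
    failures = []
    for path in paths:
        try:
            # doraise=True turns syntax errors into PyCompileError
            # instead of printing them to stderr.
            py_compile.compile(path, doraise=True)
        except py_compile.PyCompileError as exc:
            failures.append((path, exc.msg))
    return failures


def run_checks():
    """Run both checks on the staged changes; return True if everything passes."""
    # `git diff --cached --check` exits non-zero when it finds whitespace
    # errors or leftover conflict markers in the staged changes.
    whitespace_ok = subprocess.run(["git", "diff", "--cached", "--check"]).returncode == 0
    failures = compile_failures(staged_python_files())
    for path, message in failures:
        print(f"{path}: {message}")
    return whitespace_ok and not failures
```

Calling run_checks() from the repository root reports any file that fails to byte-compile and returns False if either check fails. Note that py_compile only catches syntax-level errors; it does not import or execute the touched modules.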