v_bump
bump version
Updt triton pin (#89) * Update setup.py make torch pin more flexible * Update setup.py
Merge pull request #47 from stanford-futuredata/fix-topology-kernel Fix bug in topology kernel for ffn_hidden_size>4096.
Merge pull request #34 from stanford-futuredata/fsdp_refactor Refactoring class hierarchy for FSDP wrapping
Merge pull request #31 from vchiley/no_bias Enable running MegaBlocks MoE without bias
merge conflict
Update Megatron submodule to incorporate fixes to checkpoint saving when using expert model parallelism