[Inductor] Constant folding support #93420
Labels
enhancement
module: inductor
oncall: pt2
triaged
Motivating Example
Below is a case in MobileBertForMaskedLM that performs a matmul with the concatenation of two model parameters:

```python
hidden_states = hidden_states.matmul(torch.cat([self.decoder.weight.t(), self.dense.weight], dim=0))
```

This concat accounts for more than 20% of single-threaded inference time on CPU, yet both operands are parameters that do not change at inference time, so the cost could be eliminated by constant folding.

cc @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305 @EikanWang @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @desertfire @ezyang @msaroufim @wconstab @ngimel @bdhirsh
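For reference, here is a minimal sketch of the pattern and of what a fold would look like if done by hand. The class name, tensor dimensions, and the `fold_constants` helper are illustrative placeholders, not the actual MobileBert or Inductor code; the point is only that both `cat` operands are parameters, so the concatenated weight is effectively a constant during inference.

```python
import torch
import torch.nn as nn


class PredictionHead(nn.Module):
    """Illustrative stand-in for the MobileBert LM head; dimensions are made up."""

    def __init__(self, hidden=512, embed=128, vocab=30522):
        super().__init__()
        self.decoder = nn.Linear(embed, vocab, bias=False)         # weight: [vocab, embed]
        self.dense = nn.Linear(vocab, hidden - embed, bias=False)  # weight: [hidden - embed, vocab]

    def forward(self, hidden_states):                              # [batch, seq, hidden]
        # Both cat operands are parameters, so this tensor is rebuilt
        # identically on every inference call.
        weight = torch.cat([self.decoder.weight.t(), self.dense.weight], dim=0)  # [hidden, vocab]
        return hidden_states.matmul(weight)


@torch.no_grad()
def fold_constants(head: PredictionHead) -> None:
    # What constant folding would effectively do: precompute the concatenated
    # weight once, so each forward reduces to a single matmul.
    head.register_buffer(
        "folded_weight",
        torch.cat([head.decoder.weight.t(), head.dense.weight], dim=0).contiguous(),
    )


def folded_forward(head: PredictionHead, hidden_states):
    return hidden_states.matmul(head.folded_weight)
```

An inference-time constant-folding pass in Inductor could perform the same precomputation automatically whenever every input to an op is a parameter or buffer.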