Commits on master
- b58e7b1 zero out the weight in bert init run (#9076) 1 year ago
- 82ad0d2 keep CONST/BUFFER uops in tensor_map [pr] (#9083) 1 year ago
- 6529706 move buffer refcount increment to the toposort [pr] (#9081) 1 year ago
- 73af42a fix pow backward when base is 0 (#9075) 1 year ago
- 2d04a75 start tracking bottom_up_rewrite in viz [pr] (#9071) 1 year ago
- 5ef48bb swap order in rsqrt (#9069) 1 year ago
- e839056 Show install instructions when dawn library is missing (#9059) 1 year ago
- 9e91898 bert eval at the end of training (#9070) 1 year ago
- e02e3b9 remove SQRT hack in llvm (#9067) 1 year ago
- 947c97e add test_sqrt to test_speed_v_torch (#9066) 1 year ago
- 49abc09 remove the reshapes in test_arange_2_reduce [pr] (#9063) 1 year ago
- 2573d06 Tensor.scatter_reduce touchup [pr] (#9060) 1 year ago
- 1f9d244 Add `Tensor.scatter_reduce` (#8947) 1 year ago
- 2b9ce12 simple failing case for reorder expand + keep views in tensor_map [pr] (#9057) 1 year ago
- 765a936 getenv(CC) for clang (#9054) 1 year ago
- 33a1151 Revert "match torch rmsnorm implementation (#6799)" (#9052) 1 year ago
- a66b825 match torch rmsnorm implementation (#6799) 1 year ago
- 19ae829 test float uop in sym_infer (#7456) 1 year ago
- 095504d mulacc_unrolled should happen even with no DEVECTORIZE (#9029) 1 year ago
- 74742c0 hotfix: setup_mock_nv_osx 1 year ago
- d2ff55e OSX GPUOcelot (#8209) 1 year ago
- f4f56d7 move time_linearizer to extra.optimization.helpers [pr] (#9048) 1 year ago
- c15486c remove contiguous in test_subbuffer_used [pr] (#9046) 1 year ago
- b3eab03 Three things to get Windows CI working correctly: (#9047) 1 year ago
- f53b819 UOps. -> Ops. [pr] (#9044) 1 year ago
- 6811688 disallow VIEW(BUFFER) in tensor [pr] (#9041) 1 year ago
- 7b5ac2c free_intermediates in bert (#9040) 1 year ago
- 916d5e7 WebGPU f16 support (f16 bounty part 2) (#8653) 1 year ago
- aaed315 add AMX support to LLVM (#8957) 1 year ago
- 0c97c10 TestOps: silence pytorch std()/var() degrees of freedom warnings (#9034) 1 year ago