Commits on master
- c270d54 update test_dtype_alu for METAL (#3629) 2 years ago
- abc5f3a hip bf16 hotfix (#3630) 2 years ago
- bc2a13a test case to show clang and python doing math in double (#3628) 2 years ago
- 568353f hotfix: bump line count to 6500 2 years ago
- a1507c7 Fix Tensor.dropout() with multigpu (#3619) 2 years ago
- e5ee6bb fix outdated url in showcase doc (#3624) 2 years ago
- 8500265 this mem fault still happening (#3620) 2 years ago
- 3c3f846 tinybox benchmark with HSA (#3603) 2 years ago
- f500be1 out of bounds access caused by launch bounds (#3615) 2 years ago
- eb83e2d decouple buffer mutability from cstyle (#3617) 2 years ago
- 3275260 Revert "test: add failing bfloat16 test case for metal backend (#3481)" (#3618) 2 years ago
- 1e12a2a test: add failing bfloat16 test case for metal backend (#3481) 2 years ago
- 957e980 llama + beam to mac benchmark, full cifar to nvidia benchmark (#3612) 2 years ago
- 282bbd5 check the input length into argfix (#3610) 2 years ago
- 7db6dd7 multilazybuffer fix (#3609) 2 years ago
- c3b8d28 cleanup uops (#3605) 2 years ago
- 9467932 simpler float4 direct store and locals support (#3592) 2 years ago
- 3db826e hsa in lin opts (#3602) 2 years ago
- 7c90005 search: hotfix to make sure TC behavior is all in applied_opts (#3598) 2 years ago
- 8e5d60a add more gpt2 variant in mac/nvidia benchmark (#3599) 2 years ago
- 968d109 apply more create_lt_node (#3597) 2 years ago
- bc562c4 Python div alu behavior differs slightly from others (#3596) 2 years ago
- 56d21d7 Fix two bugs concerning Tensor.to. (#3593) 2 years ago
- 0082300 Fix symbolic negative floordiv (#3594) 2 years ago
- e09619a explicitly create_lt_node when used in shapetracker _expr_view (#3561) 2 years ago
- 640dc0f hsa flush hdp (#3591) 2 years ago
- 660df3c Add test for .softmax.argmax (#3559) 2 years ago
- ee41faf use operator instead of lambda in python_alu (#3590) 2 years ago
- a89afd4 Directly store float4 nodes (#3564) 2 years ago
- 770707b hotfix: gpuocelot no rebuild 2 years ago