Commits on master
- 3401734 infra for scheduler process replay (#4405) 1 year ago
- 473ecb9 remove SPLIT_REDUCEOP=1 from resnet scripts (#4404) 1 year ago
- b767d59 resnet trainer: keep old cookie around until next step has been queued (#4401) 1 year ago
- cf3ccb8 refactor scheduler parents search (#4402) 1 year ago
- 0627e26 Added missing unittest execution code (#4400) 1 year ago
- d4062cb NV tensor_cores in kernel.py (#4399) 1 year ago
- 0deaaf2 partial fusion spec (#4398) 1 year ago
- 2c3b7f8 pad resnet training data with training data mean (#4369) 1 year ago
- 3cf8291 mlperf/resnet: update beam params to increase time and quality (#4396) 1 year ago
- ca6c8ae factor out resource access logic in multigraph base class (#4385) 1 year ago
- ab01a94 resnet eval 4n+3 if epoch < 33 (#4391) 1 year ago
- 7c8401f search: skip timing the unoptimized kernel (#4395) 1 year ago
- 5c5b408 search: fix edge cases on screening potential ops (#4394) 1 year ago
- 89030b2 add consecutive property to shapetracker 1 year ago
- 2786dff new disk tensor tests (#4393) 1 year ago
- 7492e5d resnet correct log name for red (#4390) 1 year ago
- bf31837 resnet correct steps_in_val_epoch in logging (#4389) 1 year ago
- c8a2047 testing for all reduce (#4387) 1 year ago
- 3113785 Llama 3 Models (#4339) 1 year ago
- 0b47818 simpler reduceop children chasing (#4350) 1 year ago
- 22376e5 resnet mlperf logging (#4361) 1 year ago
- f635c4d fix define global (#4383) 1 year ago
- ad116dc fill in mlperf system description (#4381) 1 year ago
- 9358b62 rename resnet script to dev_beam.sh and dev_run.sh (#4379) 1 year ago
- 6628e13 pad resnet eval data in model_train (#4374) 1 year ago
- 105fbd7 add 3080 support to NV 1 year ago
- 826cccd fix mean underflow for half tensor (#4377) 1 year ago
- dce7ac0 NOCLANG=1 for tinybox green ci. (#4378) 1 year ago
- 272bea5 GraphRunner (#4375) 1 year ago
- 077ea69 remove downcast_half in sum (#4376) 1 year ago