{"author":"chenyu","author_email":"chenyu@fastmail.com","author_time":1706755705,"commit_time":1706755705,"committer":"GitHub","committer_email":"noreply@github.com","hash":"18e854cdbf81be236a762e04576ebb4161961cdf","message":"shrink MLB on sharded axis (#3255)\n\n* shrink MLB on sharded axis\r\n\r\nuse onehot structure to store the real partition. goal is unsynced batchnorm2d that can be run on multigpu for training.\r\n\r\ndraft version in https://github.com/chenyuxyz/tinygrad/pull/109\r\n\r\n* SYNCBN flag\r\n\r\n* test unclean shrinks\r\n\r\n* UnsyncedBatchNorm reuses BatchNorm\r\n\r\n* more robust pad arg check\r\n\r\n* better types\r\n\r\n* more tests!\r\n\r\n* 6 gpus in benchmark\r\n\r\n* disable slow GPUS=6 benchmark","parents":["a3652e6ddcc0cfba36c15ed999c0bd5e24e78f2c"],"tree_hash":"fdb01d78af28790757ae12799763649efe5cfacf"}