{"author":"chenyu","author_email":"chenyu@fastmail.com","author_time":1704405710,"commit_time":1704405710,"committer":"GitHub","committer_email":"noreply@github.com","hash":"f88506e6306ccedc07f199840e42c20f899e9f0e","message":"move gpt2/llama sampling inside the model call (#3013)\n\n* move gpt2/llama sampling inside the model call\r\n\r\n* argmax uses one more kernel","parents":["c2a044ed83ae0dd3971263a2f1bbd6e062824a77"],"tree_hash":"441ab59552e8f81cf619a16c1ac55c5a47d81677"}