Removed unnecessary batching dimension introduced by recent changes. Converted index outputs from i32 to u32 for token indices. Ensures Llama runs on CUDA and ROCm. Tested on CUDA.

| Name |
|---|
| BUILD.bazel |
| llama.zig |
| main.zig |
| test_tokenizer.zig |
| test.zig |