Foke Singh a811b2e1e3 llama: fix dimensions and data types
Removed unnecessary batching dimension introduced by recent changes. Converted index outputs from i32 to u32 for token indices. Ensures Llama runs on CUDA and ROCm. Tested on CUDA.
2024-03-20 13:37:19 +00:00
async Replace real mutex with async Mutex for logFn, add fallback logger support outside coroutines, and fix ResetCondition handling. 2024-03-14 11:43:33 +00:00
bazel Add HuggingFace tokenizer bindings and SentencePiece integration; update BUILD files, async utilities, and FFI modules to support the new tokenizers. 2024-02-28 15:47:37 +00:00
docs Update Llama example docs and Bazel build files, and add tests for the new HuggingFace tokenizer integration. 2024-03-04 12:11:13 +00:00
examples llama: fix dimensions and data types 2024-03-20 13:37:19 +00:00
ffi Add HuggingFace tokenizer bindings and SentencePiece integration; update BUILD files, async utilities, and FFI modules to support the new tokenizers. 2024-02-28 15:47:37 +00:00
mlir Rewrite simple transpose as reshape in core ZML modules and raise default profiler event limit to 1,000,000. 2023-12-18 13:56:45 +00:00
pjrt pjrt: Fix profiler by allowing i64 resource IDs and reserving memory when creating array lists. 2023-12-20 17:18:02 +00:00
platforms Add initial Bazel build configuration, async runtime implementation, and core MLIR dialect definitions for ZML. 2023-01-02 14:28:25 +00:00
runtimes Ensure all runtime plugins have correct SONAME values, fixing issues with prebuilt PJRT plugins. 2024-03-11 10:15:22 +00:00
stdx Add HuggingFace tokenizer bindings and SentencePiece integration; update BUILD files, async utilities, and FFI modules to support the new tokenizers. 2024-02-28 15:47:37 +00:00
third_party Add HuggingFace tokenizer bindings and SentencePiece integration; update BUILD files, async utilities, and FFI modules to support the new tokenizers. 2024-02-28 15:47:37 +00:00
tools Update tutorial documentation in write_first_model.md with quick fixes. 2023-11-30 12:14:33 +00:00
zml Revert CUDA PJRT plugin version to 0.4.38 to address performance regression on XLA master. 2024-03-05 17:04:42 +00:00
BUILD.bazel Update Bazel build files and helper scripts to integrate the custom build runner for ZLS code completion. 2023-11-20 15:29:01 +00:00
build.zig Add initial Bazel build configuration, async runtime implementation, and core MLIR dialect definitions for ZML. 2023-01-02 14:28:25 +00:00
MODULE.bazel Add HuggingFace tokenizer bindings and SentencePiece integration; update BUILD files, async utilities, and FFI modules to support the new tokenizers. 2024-02-28 15:47:37 +00:00
MODULE.bazel.lock Add HuggingFace tokenizer bindings and SentencePiece integration; update BUILD files, async utilities, and FFI modules to support the new tokenizers. 2024-02-28 15:47:37 +00:00
platform_mappings Add initial Bazel build configuration, async runtime implementation, and core MLIR dialect definitions for ZML. 2023-01-02 14:28:25 +00:00