Commit Graph

27 Commits

Author SHA1 Message Date
394e63e273 Fix llama example to correctly handle token output and avoid re‑feeding the last prompt token. 2024-04-24 16:44:25 +00:00
190c6978d2 llama: simplify llama3 prompt template encoding by removing redundant newline re-encoding and ensuring a trailing newline. 2024-04-10 09:36:28 +00:00
b67685b941 Add example Bazel build files and tokenizer test for tinyllama, including tigerbeetle integration and flags. 2024-04-01 17:40:18 +00:00
a811b2e1e3 llama: fix dimensions and data types. Removed the unnecessary batching dimension introduced by recent changes. Converted index outputs from i32 to u32 for token indices. Ensures Llama runs on CUDA and ROCm. Tested on CUDA. 2024-03-20 13:37:19 +00:00
602757e7a9 Update examples to use the corrected logFn API. 2024-03-18 13:11:14 +00:00
76e314db9b Update Llama example docs and Bazel build files, and add tests for the new HuggingFace tokenizer integration. 2024-03-04 12:11:13 +00:00
b643f7bc53 Add Bazel build rule and test for Llama3 tokenizer’s byte fallback and unknown token handling. 2024-02-02 10:25:48 +00:00
0ce36599da Update example build config and Llama demo to support the new async epoll backend and zigcoro scheduler. 2024-01-22 12:17:01 +00:00
8a031bd4c8 Update Llama example to use the simplified transpose implementation and increase default profiler size to 1,000,000 events. 2023-12-15 12:06:42 +00:00
22a846de72 Update llama example to use per‑target output folders and call profiler.dumpDataAsJson for testing the new compilation layout. 2023-12-01 16:05:59 +00:00
cb6fcbbb1a Update docs and Zig examples to demonstrate the new client creation flags API. 2023-11-09 12:31:11 +00:00
237a877a29 zml: Add support for Llama 3.2 text-only models. Implement transpose over embed_tokens as a replacement for missing lm_head and make lm_head optional for compatibility. Add repositories and executions to Bazel and update README. 2023-11-01 10:16:48 +00:00
37de7b9613 Add Llama example showcasing the new func.call emission and function caching behavior. 2023-10-17 11:00:37 +00:00
35395c13f8 Update example programs (benchmark, llama, mnist, simple_layer) to use the new Exe API and reflect BaseExe allocation changes. 2023-10-10 11:12:34 +00:00
474f76cd75 Enable buffer donation in the Llama example, donating all buffers except the token_index buffer. 2023-10-03 16:32:40 +00:00
06865f5876 Update Llama example to use the new direct rope IR implementation. 2023-09-25 10:22:05 +00:00
4abdd32f0d Update llama example BUILD to use jax-cuda-pjrt plugin and bump CUDA (12.6.2) / cuDNN (9.5.1) versions. 2023-09-12 15:40:21 +00:00
af0630616c Update docs (deploy_on_server, dockerize_models, getting_started) and example Bazel files to include AWS Neuron/Trainium/Inferentia deployment guidance. 2023-08-21 09:15:48 +00:00
726a2d0691 Update docs and examples to showcase the new async runtime with coroutines and cross‑thread signaling. 2023-08-03 11:35:24 +00:00
f7bac1af10 Update example programs (llama and loader) with hotfixes for issue. 2023-07-04 13:40:05 +00:00
7985716562 Add new Zig example programs (benchmark, llama, loader, mnist, simple_layer) and include a test for the llama example. 2023-06-27 14:23:22 +00:00
672df8fa2f Update tutorial and example code to use the new asyncc name and Generic slugs. 2023-05-08 16:58:45 +00:00
837f8fb111 Add support for the Llama 3.1 70B Instruct model to facilitate testing on high‑performance accelerators. 2023-04-19 10:23:44 +00:00
fdb7da5c9b Introduce sharding attributes to Llama weights to enable Tensor Parallelism. 2023-04-13 12:35:27 +00:00
aea23c720e Update Llama example to use renamed zml.aio.Metadata (formerly Value) and reflect torch loader changes. 2023-04-05 14:09:59 +00:00
16e066ec69 Add llama example demonstrating the new gatherValues functionality. 2023-01-11 09:58:09 +00:00
eded305649 Add initial documentation and example projects for ZML, covering how‑to guides, tutorials, and benchmark examples. 2023-01-03 10:21:07 +00:00