Radix

Author	SHA1	Message	Date
Tarry Singh	6e4fef8844	zml: Introduce arena allocator in CompilationContext. Expose arena allocator to replace existing allocator, enabling safe allocation for ops without misusing std.BoundedArray. Includes breaking changes to chunkAllowTrailing and split. Upgrade axis_ types to anytype for tag handling and add TODOs for upcoming Tensor API.	2023-11-16 15:11:23 +00:00
Tarry Singh	57bf667c90	Add struct‑based client creation flags to the Zig PJRT API and update `context.autoPlatform` to accept a flag struct.	2023-11-13 12:45:17 +00:00
Foke Singh	cb6fcbbb1a	Update docs and Zig examples to demonstrate the new client creation flags API.	2023-11-09 12:31:11 +00:00
Tarry Singh	9f4194ad97	Fix test layer. Add tests to detect silent breakage of testLayer and regression in mapAlloc with zero-size struct fields. Add Python venv directory to .gitignore.	2023-11-06 11:25:57 +00:00
Foke Singh	237a877a29	zml: Add support for Llama 3.2 text-only models. Implement transpose over embed_tokens as a replacement for missing lm_head and make lm_head optional for compatibility. Add repositories and executions to Bazel and update README.	2023-11-01 10:16:48 +00:00
Foke Singh	1c9749c25e	docs: move image in concepts.md	2023-10-31 10:21:14 +00:00
Foke Singh	eb20548241	Update instructions: `prepare` no longer allocates, and `ExeWithWeights` is now `ModuleExe`.	2023-10-26 13:56:56 +00:00
Tarry Singh	27c8309424	async: add intrusive queue. All code contributed by @steeve. Add intrusive queue; change the constructor of Channel to use the default AsyncThread executor. Co-authored-by: Steeve Morin <steeve@zml.ai>	2023-10-24 14:36:22 +00:00
Tarry Singh	98b512c495	Implement func.call emission and function caching across MLIR dialects and ZML module/ops, propagating tags and donations.	2023-10-19 17:01:55 +00:00
Foke Singh	37de7b9613	Add Llama example showcasing the new `func.call` emission and function caching behavior.	2023-10-17 11:00:37 +00:00
Tarry Singh	7d36913b31	Refactor ZML API: move compile, compileFn and related types to `exe.zig`, update `BaseExe` allocation and inline caching in `compileInternal`, and clean up supporting modules (`func.zig`, `meta.zig`, `signature.zig`, `cuda.zig`, `testing.zig`, `zml.zig`).	2023-10-13 16:08:08 +00:00
Foke Singh	35395c13f8	Update example programs (benchmark, llama, mnist, simple_layer) to use the new Exe API and reflect BaseExe allocation changes.	2023-10-10 11:12:34 +00:00
Tarry Singh	3bc6ad98be	Update module.zig to donate all buffers except the `token_index` buffer for the Llama+Neuron example.	2023-10-06 10:10:56 +00:00
Foke Singh	474f76cd75	Enable buffer donation in the Llama example, donating all buffers except the token_index buffer.	2023-10-03 16:32:40 +00:00
Tarry Singh	5122ca0203	Refactor rope implementation to compute only required offsets, eliminating full cos/sin matrix generation in module, nn, and tensor code.	2023-09-27 11:45:33 +00:00
Foke Singh	06865f5876	Update Llama example to use the new direct rope IR implementation.	2023-09-25 10:22:05 +00:00
Tarry Singh	b5c4fb7c58	zml: fix float8 <-> float32 conversions and add support for `Tensor.constant(.{}, .{ .f8 = 1.0})`. Misc: fix small inconsistencies between different versions of sdpa; better error message for broadcast; bazelrc: --config=debug.	2023-09-21 11:15:50 +00:00
Tarry Singh	455bb3877f	runtimes/cuda: obtain NCCL from the pip package, matching XLA behavior.	2023-09-20 17:41:44 +00:00
Tarry Singh	0d5389ceda	Update CUDA runtime sandboxing and dynamic symbol renaming, switch to pre‑built jax‑cuda‑pjrt plugin, and bump CUDA to 12.6.2 and cuDNN to 9.5.1.	2023-09-14 13:28:25 +00:00
Foke Singh	4abdd32f0d	Update llama example BUILD to use jax-cuda-pjrt plugin and bump CUDA (12.6.2) / CuDNN (9.5.1) versions.	2023-09-12 15:40:21 +00:00
Tarry Singh	c8c99d7d5a	zml/pjrtx: prefer the built‑in stablehlo version when a plugin reports a newer version, ensuring artifact serialization uses the correct stablehlo version.	2023-09-07 17:06:19 +00:00
Tarry Singh	9505992e00	workspace: log diagnostic message before returning NotFound to aid debugging.	2023-09-04 13:34:37 +00:00
Foke Singh	937cdec324	examples/loader: add missing stdx dependency.	2023-08-30 13:03:59 +00:00
Tarry Singh	aa7fae449e	zml/pjrtx: execute `bufferFromHostBuffer` on the thread pool to avoid blocking and improve weight loading performance.	2023-08-29 10:28:51 +00:00
Tarry Singh	c081cb9ad6	zml/platform: increase maximum device limit to support up to 32 devices per platform.	2023-08-24 12:23:07 +00:00
Foke Singh	af0630616c	Update docs (deploy_on_server, dockerize_models, getting_started) and example Bazel files to include AWS Neuron/Trainium/Inferentia deployment guidance.	2023-08-21 09:15:48 +00:00
Tarry Singh	7d24329d0a	Add Bazel build rules and runtime implementation for AWS Neuron/Trainium/Inferentia support.	2023-08-18 17:11:27 +00:00
Tarry Singh	0709b1b32f	zml: reduce memory usage of sdpaMemEfficient by using zml.ops.while instead of zml.ops.for, avoiding concatenation of intermediate results.	2023-08-14 14:24:11 +00:00
Foke Singh	022baf782b	Update examples/MODULE.bazel to reference the bumped LLVM, XLA, StableHLO, and PJRT plugin versions.	2023-08-11 16:57:15 +00:00
Tarry Singh	01eff33fa0	Update workspace dependencies to newer LLVM, XLA, StableHLO, and PJRT versions and expose new pjrt plugin attribute APIs and stablehlo version APIs in build and runtime configurations.	2023-08-07 12:28:36 +00:00
Foke Singh	726a2d0691	Update docs and examples to showcase the new async runtime with coroutines and cross‑thread signaling.	2023-08-03 11:35:24 +00:00
Tarry Singh	bcde3962ce	Rework async runtime with coroutine support, rename async API (async_→asyncc, await_→awaitt), improve type inference, bump libxev (default epoll) and update related stdx and zml modules.	2023-08-01 11:35:04 +00:00
Tarry Singh	b53462b515	Fix crash in for_ by ensuring values are pushed to their block before opening a new block, adding asserts for block state, and guaranteeing first_step is used. Adjust padding syntax to improve usability.	2023-07-25 14:25:47 +00:00
Foke Singh	0fa258cd88	Update examples to reflect recent async module changes, renaming asyncGeneric to asyncc.	2023-07-24 09:34:35 +00:00
Tarry Singh	f675a203c2	zml.ops.makeBlock now returns the inner tensor to propagate tags. The function returns both the created mlir.Block and tensors from the supplied function, allowing shape and tag propagation without exposing mlir.Values. Updated tests to run on non‑CPU platforms.	2023-07-21 09:01:01 +00:00
Tarry Singh	be8aa4fa8e	Fix several compileError calls introduced by recent changes; ensure Zig compiler catches errors at comptime.	2023-07-17 09:10:27 +00:00
Tarry Singh	0f9a92f27d	module-cache: raise max_pjrt_executable_size limit to 400 MB to accommodate large PJRT executables.	2023-07-14 17:58:22 +00:00
Tarry Singh	88c7a74ccf	third_party/modules/zig-protobuf: revert indentation changes to maintain compatibility with older branches.	2023-07-13 11:44:53 +00:00
Tarry Singh	4681ce2f24	PJRT: add conversion of profiling protobuf output to JSON format.	2023-07-05 13:34:05 +00:00
Foke Singh	f7bac1af10	Update example programs (llama and loader) with hotfixes for issue.	2023-07-04 13:40:05 +00:00
Tarry Singh	63aca9f9c2	Hotfixes for build rule, math utilities, module system, and NN implementation (fixes,)	2023-06-29 10:26:54 +00:00
Foke Singh	7985716562	Add new Zig example programs (benchmark, llama, loader, mnist, simple_layer) and include a test for the llama example.	2023-06-27 14:23:22 +00:00
Tarry Singh	9b7eea8ac2	Add stdx utilities and rework async signature inference; tidy executable logging.	2023-06-21 14:45:14 +00:00
Tarry Singh	c30aa018dc	zml: small cleanup. Add more scatterSlices test cases. Replace helpers.mapTensors with zml.meta.map. Fix shape handling when a for loop is fully unrolled. Allow zml.Tensor.pad to accept i64 for dimension compatibility. Enable arrays of tensors inside model structs. Split Buffer.asViewOf into asViewOfHostBuffer and asViewOfDeviceBuffer.	2023-06-19 15:29:29 +00:00
Tarry Singh	f00538667e	zml.nn: add dynamic sampling with support for top‑k, top‑p, and min‑p settings. Implements token index computation based on the selected sampling strategy, including options for top_k, max_top_k, top_p, and min_p.	2023-06-16 14:34:18 +00:00
Tarry Singh	b244a18621	zml: set iota default dtype to .i32, with fallback to .i64 for axes with many elements, simplifying usage.	2023-06-15 12:45:52 +00:00
Tarry Singh	344e07fb6e	stablehlo: extend dot_general API to include DotAlgorithm support by merging precision and algorithm attributes into a union, aligning with spec requirements. Currently not exposed to users due to limited algorithm support.	2023-06-07 11:20:25 +00:00
Tarry Singh	6d720126ac	Add PJRT custom call integration with generic zmlHostBufferCallback to copy tensors to host and invoke user callbacks. Introduce Tensor.print() method to output runtime tensor values (CUDA‑specific, uses a pre‑allocated host buffer).	2023-06-05 13:42:45 +00:00
Foke Singh	bf23eef0d9	examples: clean up inconsistencies in asynk usage across the codebase.	2023-06-01 16:11:58 +00:00
Tarry Singh	499b0d20e5	pjrtx: change behavior to return an error when OpenXLA fails to serialize the new batching_dim attribute for gather/scatter, instead of panicking.	2023-05-29 17:18:19 +00:00

263 Commits