Radix

Author	SHA1	Message	Date
Tarry Singh	72263aa9e3	workspace: fix mistakes in module bumps. Fix missing `=` in stablehlo integrity. Change `-` back to `.` in xla module name and folders. Correctly depend on `xla@20250204.1-6789523`	2024-06-06 09:56:17 +00:00
Tarry Singh	f7450a2104	stablehlo: bump to head and use new dialect capi. This drastically reduces the number of build steps (from 3589 to 2553 steps)	2024-05-31 13:02:46 +00:00
Tarry Singh	221ece647d	zml/ops.zig: Added `zml.ops.case` operation. This can be used to select which branch will be run at runtime. It wraps the `stablehlo.case` operation.	2024-05-30 14:11:08 +00:00
Foke Singh	27aabf9beb	Add Bazel build rules and a test for the benchmark, llama, mnist, and simple_layer examples.	2024-05-23 15:52:34 +00:00
Tarry Singh	3aac788544	Update Bazel build configurations (zig.bzl, BUILD files) for MLIR, PJRT, Neuron, ROCm, tokenizer, and tools, fixing broken dependencies.	2024-05-20 11:28:25 +00:00
Tarry Singh	05944b5cc9	Update FnCache to copy and reuse non‑tensor fields in fixed‑size structs, preventing undefined memory in core modules.	2024-05-15 17:54:52 +00:00
Foke Singh	dfe55b0d34	Update Bazel lock file for examples to reflect FnCache non‑tensor handling changes.	2024-05-13 16:59:37 +00:00
Tarry Singh	8d795dd676	pjrt: profiler supports std writer API. Expose a lower-level function to customize where to write profile reports	2024-05-09 11:09:29 +00:00
Foke Singh	26558d6201	Update examples MODULE.bazel and lockfile to use XLA 20250204.0-6789523 and ensure Bazel 8 compatibility.	2024-05-08 14:03:45 +00:00
Tarry Singh	f5ab6ff2c6	Update XLA to version 20250204.0-6789523 and adjust Bazel module and runtime files for Bazel 8 compatibility.	2024-05-03 15:57:56 +00:00
Tarry Singh	a34190679b	Fix llama token handling and remove redundant prompt token reuse in core Zig modules (aio, module, nn, pjrtx, tensor)	2024-05-02 17:10:11 +00:00
Foke Singh	394e63e273	Fix llama example to correctly handle token output and avoid re‑feeding the last prompt token.	2024-04-24 16:44:25 +00:00
Tarry Singh	5a2171793d	workspace: MODULE.bazel cleanup Title says it all !	2024-04-22 09:27:44 +00:00
Foke Singh	bafe13f546	Update examples/MODULE.bazel.lock to reflect libxev version bump.	2024-04-18 12:53:16 +00:00
Tarry Singh	65c28111a9	Update libxev to version 20252401.0‑31eed4e and apply patches and.	2024-04-15 13:03:25 +00:00
Tarry Singh	13eff4e661	pjrt,zml: add memory bindings This preliminary PR binds PJRT memory endpoints and adds them to `zml.Buffer`. A follow up PR will properly integrate it inside `zml.Buffer`	2024-04-11 15:43:24 +00:00
Foke Singh	190c6978d2	llama: simplify llama3 prompt template encoding by removing redundant newline re-encoding and ensuring a trailing newline.	2024-04-10 09:36:28 +00:00
Tarry Singh	d4db5ccc6b	Integrate TinyLlama support, restore the homemade tokenizer, and align Zig API naming across stdx and zml tokenizer modules.	2024-04-05 15:07:29 +00:00
Foke Singh	b67685b941	Add example Bazel build files and tokenizer test for tinyllama, including tigerbeetle integration and flags.	2024-04-01 17:40:18 +00:00
Tarry Singh	567210d1d7	bazel: depend on prebuilt protoc binaries to eliminate ~1300 build steps. Note: integration is currently blocked due to version constraints in rules_proto and toolchains_protoc.	2024-03-29 09:54:57 +00:00
Tarry Singh	e0c8eecb79	bazel: use OID as sha256 for Git LFS files to prevent unnecessary HuggingFace redownloads.	2024-03-28 17:52:52 +00:00
Foke Singh	a811b2e1e3	llama: fix dimensions and data types. Removed unnecessary batching dimension introduced by recent changes. Converted index outputs from i32 to u32 for token indices. Ensures Llama runs on CUDA and ROCm. Tested on CUDA.	2024-03-20 13:37:19 +00:00
Foke Singh	602757e7a9	Update examples to use the corrected logFn API.	2024-03-18 13:11:14 +00:00
Tarry Singh	754656f2f0	Replace real mutex with async Mutex for logFn, add fallback logger support outside coroutines, and fix ResetCondition handling.	2024-03-14 11:43:33 +00:00
Tarry Singh	980f1b17fb	Ensure all runtime plugins have correct SONAME values, fixing issues with prebuilt PJRT plugins.	2024-03-11 10:15:22 +00:00
Tarry Singh	8a25b1eb74	Revert CUDA PJRT plugin version to 0.4.38 to address performance regression on XLA master.	2024-03-05 17:04:42 +00:00
Foke Singh	76e314db9b	Update Llama example docs and Bazel build files, and add tests for the new HuggingFace tokenizer integration.	2024-03-04 12:11:13 +00:00
Tarry Singh	959bc48c42	Add HuggingFace tokenizer bindings and SentencePiece integration; update BUILD files, async utilities, and FFI modules to support the new tokenizers.	2024-02-28 15:47:37 +00:00
Foke Singh	5048e7dc89	Update example lock file for rules_distroless 0.4.2 upgrade and verify MNIST image build works.	2024-02-26 15:30:13 +00:00
Tarry Singh	b4b2490690	Upgrade rules_distroless to 0.4.2 in MODULE.bazel and refresh MODULE.bazel.lock accordingly.	2024-02-21 17:48:10 +00:00
Tarry Singh	c109b12e1b	Various minor fixes: rewrite tinyllama tokenizer newline token, prevent HostBuffer.isContiguous false trigger on 1‑dim axes, improve HostBuffer.slice1d error messages, simplify module.zig output to show .mlir file path, correct setFlags handling of comptime int/float, make tokenizer.zig return <oob> for out‑of‑range detokenization, and speed up Buffer.constant creation up to 2.5 GB/s on CUDA.	2024-02-19 12:34:18 +00:00
Foke Singh	3970df5b48	Update getting_started tutorial and example Bazel files for Bazel 8 migration.	2024-02-14 10:44:47 +00:00
Tarry Singh	169a24307c	Migrate workspace and XLA module definitions to Bazel 8, updating MODULE.bazel files, BUILD rules, and related migration patches.	2024-02-12 12:43:23 +00:00
Tarry Singh	7e6103d876	Upgrade XLA to version 20250122.0-cc075be, switch to nvptx compiler and nvlink with nvjitlink support, add warning for CUDA path in LD_LIBRARY_PATH, and revert the previous CUDA sandbox fix.	2024-02-06 09:31:48 +00:00
Tarry Singh	b8a0aaee5a	Update tokenizer to handle byte_fallback for Llama3 GPT2 vocab and add a Llama3‑specific normalizer; adjust tinyllama.zig and hostbuffer.zig to use the new tokenization logic.	2024-02-05 15:22:44 +00:00
Foke Singh	b643f7bc53	Add Bazel build rule and test for Llama3 tokenizer’s byte fallback and unknown token handling.	2024-02-02 10:25:48 +00:00
Tarry Singh	5120fe00dc	Update libxev epoll patch to resolve crashes and hangs in epoll and kqueue implementations.	2024-01-29 17:15:11 +00:00
Tarry Singh	edc2ac26f8	Adjust ROCm runtime sandboxing to hook only the PJRT plugin and make hipblastlt bytecodes optional.	2024-01-26 13:02:23 +00:00
Foke Singh	0ce36599da	Update example build config and Llama demo to support the new async epoll backend and zigcoro scheduler.	2024-01-22 12:17:01 +00:00
Tarry Singh	a7b7ae0180	Fix async hangs by reworking the libxev epoll backend and using callBlocking for PJRT plugin loading, improving performance across async and runtime modules.	2024-01-16 14:13:45 +00:00
Tarry Singh	434cee3a6c	Fix CUDA and ROCm sandbox discovery, update epoll libxev patch to prevent high CPU usage, enable XLA GPU latency‑hiding scheduler, and upgrade cuDNN to 9.6.0.	2024-01-15 09:41:42 +00:00
Tarry Singh	5b8e42f9a9	Vendor zigcoro and unify APIs; rework internals for stdx.meta compatibility, add Channel.try_send/try_recv methods, support dynamically sized channels with comptime capacity, and introduce PoolStackAllocator for coroutine stack allocation.	2024-01-11 15:40:15 +00:00
Tarry Singh	68dbc290e9	zml: revamp scatterSlices. The main issue with the current `scatter` implementation is that it uses the broadcasting dims of `stablehlo.scatter`. While nice in theory, the optimizer doesn't handle them well and they are often unrolled into a while loop. Here I convert the batching dim to extra iota indices.	2024-01-08 17:55:20 +00:00
Tarry Singh	83b5e1ec48	fix: Before, we were using `module.op().writeBytecode(writer)` to compute the hash of a model, but it crashes on some inputs, notably for unused variables. So I used the text representation of the MLIR instead.	2024-01-05 16:44:41 +00:00
Tarry Singh	acc492454f	Add operator name to source locations and introduce QoL enhancements: remove bias from sdpa, support shape literals in gatherSlices, add Shape.outer, Tensor.all, and infer argMax dtype.	2024-01-01 15:31:41 +00:00
Foke Singh	223857251d	Update MNIST example to use new operator source locations and reflect recent API changes (sdpa bias removal, gatherSlices shape literals, Shape.outer, Tensor.all, and argMax dtype inference)	2023-12-26 10:45:52 +00:00
Tarry Singh	5bd7f8aae9	zml: HostBuffer.prettyPrint(). Add pretty printing of HostBuffer. This will be leveraged by the debug helper `x.print()`. It can also be used like this: `std.log.info("my buffer: {}", .{host_buffer.pretty()})`	2023-12-25 13:01:17 +00:00
Tarry Singh	5ddd034d2c	pjrt: Fix profiler by allowing i64 resource IDs and reserving memory when creating array lists.	2023-12-20 17:18:02 +00:00
Tarry Singh	7ef87236ce	Rewrite simple transpose as reshape in core ZML modules and raise default profiler event limit to 1,000,000.	2023-12-18 13:56:45 +00:00
Foke Singh	8a031bd4c8	Update Llama example to use the simplified transpose implementation and increase default profiler size to 1,000,000 events.	2023-12-15 12:06:42 +00:00

170 Commits