Radix

Author	SHA1	Message	Date
Tarry Singh	13eff4e661	pjrt,zml: add memory bindings. This preliminary PR binds PJRT memory endpoints and adds them to `zml.Buffer`. A follow-up PR will properly integrate them inside `zml.Buffer`.	2024-04-11 15:43:24 +00:00
Foke Singh	190c6978d2	llama: simplify llama3 prompt template encoding by removing redundant newline re-encoding and ensuring a trailing newline.	2024-04-10 09:36:28 +00:00
Tarry Singh	d4db5ccc6b	Integrate TinyLlama support, restore the homemade tokenizer, and align Zig API naming across stdx and zml tokenizer modules.	2024-04-05 15:07:29 +00:00
Foke Singh	b67685b941	Add example Bazel build files and tokenizer test for tinyllama, including tigerbeetle integration and flags.	2024-04-01 17:40:18 +00:00
Tarry Singh	567210d1d7	bazel: depend on prebuilt protoc binaries to eliminate ~1300 build steps. Note: integration is currently blocked due to version constraints in rules_proto and toolchains_protoc.	2024-03-29 09:54:57 +00:00
Tarry Singh	e0c8eecb79	bazel: use OID as sha256 for Git LFS files to prevent unnecessary HuggingFace redownloads.	2024-03-28 17:52:52 +00:00
Foke Singh	a811b2e1e3	llama: fix dimensions and data types. Removed unnecessary batching dimension introduced by recent changes. Converted index outputs from i32 to u32 for token indices. Ensures Llama runs on CUDA and ROCm. Tested on CUDA.	2024-03-20 13:37:19 +00:00
Foke Singh	602757e7a9	Update examples to use the corrected logFn API.	2024-03-18 13:11:14 +00:00
Tarry Singh	754656f2f0	Replace real mutex with async Mutex for logFn, add fallback logger support outside coroutines, and fix ResetCondition handling.	2024-03-14 11:43:33 +00:00
Tarry Singh	980f1b17fb	Ensure all runtime plugins have correct SONAME values, fixing issues with prebuilt PJRT plugins.	2024-03-11 10:15:22 +00:00
Tarry Singh	8a25b1eb74	Revert CUDA PJRT plugin version to 0.4.38 to address performance regression on XLA master.	2024-03-05 17:04:42 +00:00
Foke Singh	76e314db9b	Update Llama example docs and Bazel build files, and add tests for the new HuggingFace tokenizer integration.	2024-03-04 12:11:13 +00:00
Tarry Singh	959bc48c42	Add HuggingFace tokenizer bindings and SentencePiece integration; update BUILD files, async utilities, and FFI modules to support the new tokenizers.	2024-02-28 15:47:37 +00:00
Foke Singh	5048e7dc89	Update example lock file for rules_distroless 0.4.2 upgrade and verify MNIST image build works.	2024-02-26 15:30:13 +00:00
Tarry Singh	b4b2490690	Upgrade rules_distroless to 0.4.2 in MODULE.bazel and refresh MODULE.bazel.lock accordingly.	2024-02-21 17:48:10 +00:00
Tarry Singh	c109b12e1b	Various minor fixes: rewrite tinyllama tokenizer newline token, prevent HostBuffer.isContiguous false trigger on 1‑dim axes, improve HostBuffer.slice1d error messages, simplify module.zig output to show .mlir file path, correct setFlags handling of comptime int/float, make tokenizer.zig return <oob> for out‑of‑range detokenization, and speed up Buffer.constant creation up to 2.5 GB/s on CUDA.	2024-02-19 12:34:18 +00:00
Foke Singh	3970df5b48	Update getting_started tutorial and example Bazel files for Bazel 8 migration.	2024-02-14 10:44:47 +00:00
Tarry Singh	169a24307c	Migrate workspace and XLA module definitions to Bazel 8, updating MODULE.bazel files, BUILD rules, and related migration patches.	2024-02-12 12:43:23 +00:00
Tarry Singh	7e6103d876	Upgrade XLA to version 20250122.0-cc075be, switch to nvptx compiler and nvlink with nvjitlink support, add warning for CUDA path in LD_LIBRARY_PATH, and revert the previous CUDA sandbox fix.	2024-02-06 09:31:48 +00:00
Tarry Singh	b8a0aaee5a	Update tokenizer to handle byte_fallback for Llama3 GPT2 vocab and add a Llama3‑specific normalizer; adjust tinyllama.zig and hostbuffer.zig to use the new tokenization logic.	2024-02-05 15:22:44 +00:00
Foke Singh	b643f7bc53	Add Bazel build rule and test for Llama3 tokenizer’s byte fallback and unknown token handling.	2024-02-02 10:25:48 +00:00
Tarry Singh	5120fe00dc	Update libxev epoll patch to resolve crashes and hangs in epoll and kqueue implementations.	2024-01-29 17:15:11 +00:00
Tarry Singh	edc2ac26f8	Adjust ROCm runtime sandboxing to hook only the PJRT plugin and make hipblaslt bytecodes optional.	2024-01-26 13:02:23 +00:00
Foke Singh	0ce36599da	Update example build config and Llama demo to support the new async epoll backend and zigcoro scheduler.	2024-01-22 12:17:01 +00:00
Tarry Singh	a7b7ae0180	Fix async hangs by reworking the libxev epoll backend and using callBlocking for PJRT plugin loading, improving performance across async and runtime modules.	2024-01-16 14:13:45 +00:00
Tarry Singh	434cee3a6c	Fix CUDA and ROCm sandbox discovery, update epoll libxev patch to prevent high CPU usage, enable XLA GPU latency‑hiding scheduler, and upgrade cuDNN to 9.6.0.	2024-01-15 09:41:42 +00:00
Tarry Singh	5b8e42f9a9	Vendor zigcoro and unify APIs; rework internals for stdx.meta compatibility, add Channel.try_send/try_recv methods, support dynamically sized channels with comptime capacity, and introduce PoolStackAllocator for coroutine stack allocation.	2024-01-11 15:40:15 +00:00
Tarry Singh	68dbc290e9	zml: revamp scatterSlices. The main issue with the current `scatter` implementation is that it uses the broadcasting dims of `stablehlo.scatter`. While nice in theory, the optimizer doesn't handle them well and they are often unrolled into a while loop. Here I convert the batching dim to extra iota indices.	2024-01-08 17:55:20 +00:00
Tarry Singh	83b5e1ec48	fix: Before, we were using `module.op().writeBytecode(writer)` to compute the hash of a model, but it crashes on some inputs, notably for unused variables. So I used the text representation of the MLIR instead.	2024-01-05 16:44:41 +00:00
Tarry Singh	acc492454f	Add operator name to source locations and introduce QoL enhancements: remove bias from sdpa, support shape literals in gatherSlices, add Shape.outer, Tensor.all, and infer argMax dtype.	2024-01-01 15:31:41 +00:00
Foke Singh	223857251d	Update MNIST example to use new operator source locations and reflect recent API changes (sdpa bias removal, gatherSlices shape literals, Shape.outer, Tensor.all, and argMax dtype inference)	2023-12-26 10:45:52 +00:00
Tarry Singh	5bd7f8aae9	zml: HostBuffer.prettyPrint(). Add pretty printing of HostBuffer. This will be leveraged by the debug helper `x.print()`. It can also be used like this: `std.log.info("my buffer: {}", .{host_buffer.pretty()})`	2023-12-25 13:01:17 +00:00
Tarry Singh	5ddd034d2c	pjrt: Fix profiler by allowing i64 resource IDs and reserving memory when creating array lists.	2023-12-20 17:18:02 +00:00
Tarry Singh	7ef87236ce	Rewrite simple transpose as reshape in core ZML modules and raise default profiler event limit to 1,000,000.	2023-12-18 13:56:45 +00:00
Foke Singh	8a031bd4c8	Update Llama example to use the simplified transpose implementation and increase default profiler size to 1,000,000 events.	2023-12-15 12:06:42 +00:00
Tarry Singh	145e60b4dd	workspace: Update LLVM, XLA, StableHLO, and PJRT plugins to latest versions.	2023-12-13 10:10:32 +00:00
Tarry Singh	6a4a7fb9a1	zml/module.zig: Remove unnecessary optional unwrapping.	2023-12-05 12:27:08 +00:00
Tarry Singh	37725cdaa6	Update PJRT, runtime, and ZML modules to use per‑target output folders and expose `profiler.dumpDataAsJson` for JSON profiling output.	2023-12-04 10:38:10 +00:00
Foke Singh	22a846de72	Update llama example to use per‑target output folders and call profiler.dumpDataAsJson for testing the new compilation layout.	2023-12-01 16:05:59 +00:00
Foke Singh	46fbbf43a2	Update tutorial documentation in write_first_model.md with quick fixes.	2023-11-30 12:14:33 +00:00
Foke Singh	737f7cbdee	Add example build runner scripts and config for Zig code completion.	2023-11-21 14:55:34 +00:00
Tarry Singh	ec37c8f731	Update Bazel build files and helper scripts to integrate the custom build runner for ZLS code completion.	2023-11-20 15:29:01 +00:00
Tarry Singh	6e4fef8844	zml: Introduce arena allocator in CompilationContext. Expose arena allocator to replace existing allocator, enabling safe allocation for ops without misusing std.BoundedArray. Includes breaking changes to chunkAllowTrailing and split. Upgrade axis_ types to anytype for tag handling and add TODOs for upcoming Tensor API.	2023-11-16 15:11:23 +00:00
Tarry Singh	57bf667c90	Add struct‑based client creation flags to the Zig PJRT API and update `context.autoPlatform` to accept a flag struct.	2023-11-13 12:45:17 +00:00
Foke Singh	cb6fcbbb1a	Update docs and Zig examples to demonstrate the new client creation flags API.	2023-11-09 12:31:11 +00:00
Tarry Singh	9f4194ad97	Fix test layer. Add tests to detect silent breakage of testLayer and regression in mapAlloc with zero-size struct fields. Add Python venv directory to .gitignore.	2023-11-06 11:25:57 +00:00
Foke Singh	237a877a29	zml: Add support for Llama 3.2 text-only models. Implement transpose over embed_tokens as a replacement for missing lm_head and make lm_head optional for compatibility. Add repositories and executions to Bazel and update README.	2023-11-01 10:16:48 +00:00
Foke Singh	1c9749c25e	docs: move image in concepts.md	2023-10-31 10:21:14 +00:00
Foke Singh	eb20548241	Update instructions to follow recent changes: `prepare` doesn't allocate anymore, and `ExeWithWeights` is now `ModuleExe`.	2023-10-26 13:56:56 +00:00
Tarry Singh	27c8309424	async: add intrusive queue. All code contributed by @steeve: add intrusive queue; change the constructor of Channel with default AsyncThread executor. Co-authored-by: Steeve Morin <steeve@zml.ai>	2023-10-24 14:36:22 +00:00

255 Commits