Commit Graph

122 Commits

SHA1 Message Date
f27a524f31 Update rules_zig: add zig_srcs target, fix source handling bug, clean up BUILD files, adjust async/coro.zig tests, and disable nemo and yaml model loaders. 2025-03-13 12:27:21 +00:00
ff1433d998 pjrt: bind PJRT_Client_CreateUninitializedBuffer. 2025-02-25 10:37:45 +00:00
8456a0d073 zml/pjrt: add binding for PJRT_Device_MemoryStats. 2025-02-19 12:14:05 +00:00
4d6d975dc0 Patch aio.zig: update loadBuffersWithPrefix argument type to match the conditional type of loadBuffers init_args. 2025-02-13 09:48:13 +00:00
af8844c1f1 Add model prefix support when loading a model from safetensors, enabling use of a specific model prefix (e.g., ModernBertModel) instead of the full model. Tested with the text embeddings server project. 2025-02-12 13:18:27 +00:00
0a2ab7c8cb Remove usingnamespace from MLIR. 2025-01-28 09:35:58 +00:00
f8ab0d7b2a Remove dead imports. 2025-01-22 10:45:04 +00:00
51a6cab753 Wire has_side_effect field in zml/ops. 2025-01-20 16:45:13 +00:00
99a2001e63 Rename PJRT BufferType to follow Zig and ZML naming conventions. 2025-01-16 13:00:47 +00:00
09c43b8759 Add customCall operation to zml/ops. 2025-01-09 15:01:33 +00:00
9f1cc762cd Fix map tests in zml/meta. 2025-01-06 17:49:50 +00:00
fbf1ecb8b7 Introduce Executable.getCompiledMemoryStats in PJRT. 2025-01-02 16:36:13 +00:00
4b1a3ff48a Add union support to mapping helpers in zml/meta.zig. 2025-01-01 13:35:17 +00:00
e6286b6097 Update Buffer.from to be blocking by default and add options for async loading and memory placement, adjusting aio, hostbuffer, pjrtx, and tensor implementations. 2024-12-25 17:14:44 +00:00
6aa9aa5a7b Add preliminary implementation for custom call support. 2024-12-10 09:36:37 +00:00
f5ab2c3a55 zml: eliminate compile-time fields from Bufferized, removing the need to pass undefined to exe.call for inlined arguments. Introduce BufferizedWithArgs in zml.testing for compileAndCall utility. 2024-11-28 12:24:39 +00:00
95453c7242 Update XLA dependency to version 20250527.0‑cb67f2f and refresh related Bazel BUILD, MODULE, overlay and patch files. 2024-11-22 16:50:20 +00:00
3849eb10b7 Add buffer and hostbuffer utilities with precise f32→bf16 conversion, type inference for loadBuffers, store expected input shapes, enhance meta.visit and JSON TaggedUnion support, and improve logging. 2024-10-28 11:21:46 +00:00
4ef81b89ea stdx.fmt: add slice formatting support, improving on previous prettyPrinter implementation by leveraging internal fmt mechanisms. 2024-10-18 15:05:08 +00:00
aacbf2ee04 Fix Llama3 rope scaling implementation in the neural network module (zml/nn.zig) 2024-10-07 12:53:03 +00:00
2863c1f5e0 zml/tensor: fix returned value in Tensor.toMemory – ensure _output_memory_kind is set correctly in the result. 2024-09-18 13:18:08 +00:00
aec7072837 pjrt: add FFI bindings for custom calls 2024-09-10 09:14:28 +00:00
1f5ff96c10 zml/ops: add wiring for operand output alias in zml.ops.triton 2024-09-09 15:00:28 +00:00
4b7e618b43 zml/aio: add bool handling in struct population within populateStruct 2024-09-02 14:11:47 +00:00
ac63c30e12 add mini-DSL for creating MLIR common attributes and types, leveraging Zig 0.14 to simplify mlir.Type and mlir.Attribute creation 2024-08-26 14:19:00 +00:00
63ef78efcc zml: add support for NVTX tracing 2024-08-21 14:41:40 +00:00
7df89301dc Bump XLA version and import llvm, stablehlo, triton, and zig‑protobuf modules in workspace BUILD files. 2024-08-06 10:28:43 +00:00
3f36506f1c zml: remove usingnamespace from floats.zig and related dependencies; note that incremental compilation does not improve overall build time due to linking overhead 2024-07-23 17:43:43 +00:00
42dee5d0e0 mlir: rework stablehlo custom call implementation and add a Triton example 2024-07-16 13:23:07 +00:00
aec1d96e6d mlir: rework DenseElementsAttribute to correctly slice inputs and modify .as() to return a concrete value instead of an optional 2024-07-15 12:32:24 +00:00
30f6be0e2f Update core Zig modules (async, mlir, pjrt, stdx) and third‑party Bazel definitions for the Zig 0.14.0 release. 2024-07-02 14:19:04 +00:00
18eb0e5a7b Add async I/O, SentencePiece, NN, and tensor utilities for ModernBERT support and update Bazel build configuration. 2024-06-14 15:27:06 +00:00
221ece647d zml/ops.zig: Added zml.ops.case operation
This can be used to select which branch will be run at runtime.

It wraps the `stablehlo.case` operation.
2024-05-30 14:11:08 +00:00
3aac788544 Update Bazel build configurations (zig.bzl, BUILD files) for MLIR, PJRT, Neuron, ROCm, tokenizer, and tools, fixing broken dependencies. 2024-05-20 11:28:25 +00:00
05944b5cc9 Update FnCache to copy and reuse non‑tensor fields in fixed‑size structs, preventing undefined memory in core modules. 2024-05-15 17:54:52 +00:00
a34190679b Fix llama token handling and remove redundant prompt token reuse in core Zig modules (aio, module, nn, pjrtx, tensor) 2024-05-02 17:10:11 +00:00
13eff4e661 pjrt,zml: add memory bindings
This preliminary PR binds PJRT memory endpoints and adds them to
`zml.Buffer`.

A follow-up PR will properly integrate it inside `zml.Buffer`.
2024-04-11 15:43:24 +00:00
d4db5ccc6b Integrate TinyLlama support, restore the homemade tokenizer, and align Zig API naming across stdx and zml tokenizer modules. 2024-04-05 15:07:29 +00:00
8a25b1eb74 Revert CUDA PJRT plugin version to 0.4.38 to address performance regression on XLA master. 2024-03-05 17:04:42 +00:00
959bc48c42 Add HuggingFace tokenizer bindings and SentencePiece integration; update BUILD files, async utilities, and FFI modules to support the new tokenizers. 2024-02-28 15:47:37 +00:00
c109b12e1b Various minor fixes: rewrite tinyllama tokenizer newline token, prevent HostBuffer.isContiguous false trigger on 1‑dim axes, improve HostBuffer.slice1d error messages, simplify module.zig output to show .mlir file path, correct setFlags handling of comptime int/float, make tokenizer.zig return <oob> for out‑of‑range detokenization, and speed up Buffer.constant creation up to 2.5 GB/s on CUDA. 2024-02-19 12:34:18 +00:00
169a24307c Migrate workspace and XLA module definitions to Bazel 8, updating MODULE.bazel files, BUILD rules, and related migration patches. 2024-02-12 12:43:23 +00:00
7e6103d876 Upgrade XLA to version 20250122.0-cc075be, switch to nvptx compiler and nvlink with nvjitlink support, add warning for CUDA path in LD_LIBRARY_PATH, and revert the previous CUDA sandbox fix. 2024-02-06 09:31:48 +00:00
b8a0aaee5a Update tokenizer to handle byte_fallback for Llama3 GPT2 vocab and add a Llama3‑specific normalizer; adjust tinyllama.zig and hostbuffer.zig to use the new tokenization logic. 2024-02-05 15:22:44 +00:00
a7b7ae0180 Fix async hangs by reworking the libxev epoll backend and using callBlocking for PJRT plugin loading, improving performance across async and runtime modules. 2024-01-16 14:13:45 +00:00
434cee3a6c Fix CUDA and ROCm sandbox discovery, update epoll libxev patch to prevent high CPU usage, enable XLA GPU latency‑hiding scheduler, and upgrade cuDNN to 9.6.0. 2024-01-15 09:41:42 +00:00
68dbc290e9 zml: revamp scatterSlices
The main issue with the current `scatter` implementation is that it uses
the broadcasting dims of `stablehlo.scatter`.
While nice in theory, the optimizer doesn't handle them well and they
are often unrolled into a while loop.
Here I convert the batching dim to extra iota indices.
2024-01-08 17:55:20 +00:00
83b5e1ec48 fix
Before, we were using `module.op().writeBytecode(writer)` to compute the
hash of a model,
but it crashes on some inputs, notably for unused variables.

So I used the text representation of the MLIR instead.
2024-01-05 16:44:41 +00:00
acc492454f Add operator name to source locations and introduce QoL enhancements: remove bias from sdpa, support shape literals in gatherSlices, add Shape.outer, Tensor.all, and infer argMax dtype. 2024-01-01 15:31:41 +00:00
5bd7f8aae9 zml: HostBuffer.prettyPrint()
Add pretty printing of HostBuffer.

This will be leveraged by the debug helper `x.print()`.
It can also be used like this: `std.log.info("my buffer: {}", .{host_buffer.pretty()})`
2023-12-25 13:01:17 +00:00