05944b5cc9
Update FnCache to copy and reuse non‑tensor fields in fixed‑size structs, preventing undefined memory in core modules.
2024-05-15 17:54:52 +00:00
a34190679b
Fix llama token handling and remove redundant prompt token reuse in core Zig modules (aio, module, nn, pjrtx, tensor)
2024-05-02 17:10:11 +00:00
959bc48c42
Add HuggingFace tokenizer bindings and SentencePiece integration; update BUILD files, async utilities, and FFI modules to support the new tokenizers.
2024-02-28 15:47:37 +00:00
68dbc290e9
zml: revamp scatterSlices
...
The main issue with the current `scatter` implementation is that it uses
the broadcasting dims of `stablehlo.scatter`.
While nice in theory, the optimizer doesn't handle them well and they
are often unrolled into a while loop.
Here I convert the batching dim to extra iota indices.
2024-01-08 17:55:20 +00:00
83b5e1ec48
fix
...
Before, we were using `module.op().writeBytecode(writer)` to compute the
hash of a model,
but it crashes on some inputs, notably for unused variables.
So I used the text representation of the MLIR instead.
2024-01-05 16:44:41 +00:00
acc492454f
Add operator name to source locations and introduce QoL enhancements: remove bias from sdpa, support shape literals in gatherSlices, add Shape.outer, Tensor.all, and infer argMax dtype.
2024-01-01 15:31:41 +00:00
5bd7f8aae9
zml: HostBuffer.prettyPrint()
...
Add pretty printing of HostBuffer.
This will be leveraged by the debug helper `x.print()`.
It can also be used like this: `std.log.info("my buffer: {}",
.{host_buffer.pretty()})`
2023-12-25 13:01:17 +00:00
7ef87236ce
Rewrite simple transpose as reshape in core ZML modules and raise default profiler event limit to 1,000,000.
2023-12-18 13:56:45 +00:00
6e4fef8844
zml: Introduce arena allocator in CompilationContext. Expose arena allocator to replace existing allocator, enabling safe allocation for ops without misusing std.BoundedArray. Includes breaking changes to chunkAllowTrailing and split. Upgrade axis_ types to anytype for tag handling and add TODOs for upcoming Tensor API.
2023-11-16 15:11:23 +00:00
5122ca0203
Refactor rope implementation to compute only required offsets, eliminating full cos/sin matrix generation in module, nn, and tensor code.
2023-09-27 11:45:33 +00:00
b5c4fb7c58
zml: fix float8 <-> float32 conversions, support for Tensor.constant(.{}, .{ .f8 = 1.0})
...
Mostly:
* fix float8 <-> float32 conversions
* support for `Tensor.constant(.{}, .{ .f8 = 1.0})`
Misc:
* fix small inconsistencies between different versions of sdpa
* better error message for broadcast
* bazelrc: --config=debug
2023-09-21 11:15:50 +00:00
0709b1b32f
zml: reduce memory usage of sdpaMemEfficient by using zml.ops.while instead of zml.ops.for, avoiding concatenation of intermediate results.
2023-08-14 14:24:11 +00:00
01eff33fa0
Update workspace dependencies to newer LLVM, XLA, StableHLO, and PJRT versions and expose new pjrt plugin attribute APIs and stablehlo version APIs in build and runtime configurations.
2023-08-07 12:28:36 +00:00
b53462b515
Fix crash in for_ by ensuring values are pushed to their block before opening a new block, adding asserts for block state, and guaranteeing first_step is used. Adjust padding syntax to improve usability.
2023-07-25 14:25:47 +00:00
f675a203c2
zml.ops.makeBlock now returns the inner tensor to propagate tags. The function returns both the created mlir.Block and tensors from the supplied function, allowing shape and tag propagation without exposing mlir.Values. Updated tests to run on non‑CPU platforms.
2023-07-21 09:01:01 +00:00
9b7eea8ac2
Add stdx utilities and rework async signature inference; tidy executable logging.
2023-06-21 14:45:14 +00:00
c30aa018dc
zml: small cleanup
...
- Add more scatterSlices test cases.
- Replace helpers.mapTensors with zml.meta.map.
- Fix shape handling when a for loop is fully unrolled.
- Allow zml.Tensor.pad to accept i64 for dimension compatibility.
- Enable arrays of tensors inside model structs.
- Split Buffer.asViewOf into asViewOfHostBuffer and asViewOfDeviceBuffer.
2023-06-19 15:29:29 +00:00
f00538667e
zml.nn: add dynamic sampling with support for top‑k, top‑p, and min‑p settings. Implements token index computation based on the selected sampling strategy, including options for top_k, max_top_k, top_p, and min_p.
2023-06-16 14:34:18 +00:00
b244a18621
zml: set iota default dtype to .i32, with fallback to .i64 for axes with many elements, simplifying usage.
2023-06-15 12:45:52 +00:00
344e07fb6e
stablehlo: extend dot_general API to include DotAlgorithm support by merging precision and algorithm attributes into a union, aligning with spec requirements. Currently not exposed to users due to limited algorithm support.
2023-06-07 11:20:25 +00:00
6d720126ac
Add PJRT custom call integration with generic zmlHostBufferCallback to copy tensors to host and invoke user callbacks. Introduce Tensor.print() method to output runtime tensor values (CUDA‑specific, uses a pre‑allocated host buffer).
2023-06-05 13:42:45 +00:00
499b0d20e5
pjrtx: change behavior to return an error when OpenXLA fails to serialize the new batching_dim attribute for gather/scatter, instead of panicking.
2023-05-29 17:18:19 +00:00
2f54e2a5f3
zml.tensor: add triangular operator to zero out the upper‑right matrix region with configurable offset, and toDiagonal (diag_embed) to embed a vector as a diagonal matrix, correcting previous diag naming. Also add ELU activation under zml.nn.Activation.
2023-05-18 16:39:21 +00:00
05faa5021e
zml.tensor: add cumulativeSum operator and refactor maxPoolND. Introduce cumulative sum using reduceWindow. Simplify reduceWindow signature by merging padding_shape and padding_value. Update maxPool1D/2D to accept tuple arguments. Revise pad to use tagged or AOS syntax; remove SOA syntax.
2023-05-17 09:01:27 +00:00
fefd84b1bb
Replace silu implementation with stablehlo.logistic for higher precision, move logistic logic into sigmoid and alias logistic to sigmoid (breaking change).
2023-05-01 10:40:50 +00:00
ed6444b775
Add Tensor.concatenate support, begin deprecating broadcastLeft, and compute transformer head scaling constant in f32 for higher precision.
2023-04-21 15:55:07 +00:00
a4f0fc96c0
Integrate user sharding hints and HLO sharding annotations across MLIR dialects and ZML core, and remove the now‑unused module options arguments.
2023-03-21 10:50:39 +00:00
7ef67eea27
zml: Relocate tests next to the functions they verify and remove obsolete dynamicSlice1d test.
2023-03-08 14:10:11 +00:00
dfa71018a5
zml: Remove pjrtx wrapper, migrate remaining helpers to their native modules, and fix blocking issue in Event.await.
2023-03-06 17:05:56 +00:00
2f129f76c9
Add in-process sharding support across core ZML components (platform, shape, tensor, MLIR generation, buffers, and PJRT integration)
2023-02-24 17:33:14 +00:00
24a7c98476
Implement scatterSlices functionality.
2023-02-14 13:52:49 +00:00
934acb35a8
zml: initialize Tensor.min and Tensor.max reductions with proper extreme values to ensure correct results
2023-02-10 12:28:41 +00:00
058e1415fa
zml: deprecate buggy Tensor.chunk; introduce chunkExact and chunkAllowTrailing with clarified behavior
2023-02-07 12:42:34 +00:00
0606ea1d7c
Update Bazel workspace and runtime BUILD files to newer XLA, StableHLO, and LLVM versions, enabling batching‑dims support for the gather operator.
2023-02-01 15:58:30 +00:00
7dcd8b516c
zml/nn: fix resize implementations (resizeBilinear and resizeBicubic) and expand refAllDecls usage; all tests pass
2023-01-27 14:35:11 +00:00
ebdb8db213
zml/tests: re‑enable all Zig tests, fix precision issue by switching to f32, and add refAllDecls to ensure all declarations are tested
2023-01-23 16:28:19 +00:00
b961856e5f
zml/tensor: correct typo in uniform comment ('substract' → 'subtract')
2023-01-19 12:20:40 +00:00
ccdf218961
Add multi‑axis, batched gatherValues support to tensor, shape, nn, quantization, and torch modules.
2023-01-18 12:03:48 +00:00
266da6d4be
Add initial Bazel build configuration, async runtime implementation, and core MLIR dialect definitions for ZML.
2023-01-02 14:28:25 +00:00