|
|
e7323be10b
|
runtimes/rocm: switch to in-process LLD, removing the need for sandboxed lld.
|
2025-04-23 11:43:18 +00:00 |
|
|
|
4294a4d08f
|
Bump hftokenizers dependency versions in Bazel and Cargo lockfiles (MODULE.bazel.lock, Cargo.toml, Cargo.lock)
|
2025-04-04 12:54:33 +00:00 |
|
|
|
2d321d232d
|
runtimes/cuda: sandbox CUDA dependencies by removing them from the leaf binary, sandboxing the dependency graph, marking dlopen direct dependencies as NEEDED, setting RPATH to the sandbox, loading the PJRT plugin from the sandbox, and enabling weak CUDA symbols without direct linking.
|
2025-03-26 11:18:29 +00:00 |
|
|
|
a5420068b1
|
pjrt: emit warning instead of panic when FFI Extension is missing (e.g., on TPU).
|
2025-03-24 09:40:44 +00:00 |
|
|
|
f27a524f31
|
Update rules_zig: add zig_srcs target, fix source handling bug, clean up BUILD files, adjust async/coro.zig tests, and disable nemo and yaml model loaders.
|
2025-03-13 12:27:21 +00:00 |
|
|
|
ff1433d998
|
pjrt: bind PJRT_Client_CreateUninitializedBuffer.
|
2025-02-25 10:37:45 +00:00 |
|
|
|
8456a0d073
|
zml/pjrt: add binding for PJRT_Device_MemoryStats.
|
2025-02-19 12:14:05 +00:00 |
|
|
|
4d6d975dc0
|
Patch aio.zig: update loadBuffersWithPrefix argument type to match the conditional type of loadBuffers init_args.
|
2025-02-13 09:48:13 +00:00 |
|
|
|
af8844c1f1
|
Add model prefix support when loading a model from safetensors, enabling use of a specific model prefix (e.g., ModernBertModel) instead of the full model. Tested with the text embeddings server project.
|
2025-02-12 13:18:27 +00:00 |
|
|
|
0a2ab7c8cb
|
Remove usingnamespace from MLIR.
|
2025-01-28 09:35:58 +00:00 |
|
|
|
f8ab0d7b2a
|
Remove dead imports.
|
2025-01-22 10:45:04 +00:00 |
|
|
|
51a6cab753
|
Wire has_side_effect field in zml/ops.
|
2025-01-20 16:45:13 +00:00 |
|
|
|
99a2001e63
|
Rename PJRT BufferType to follow Zig and ZML naming conventions.
|
2025-01-16 13:00:47 +00:00 |
|
|
|
09c43b8759
|
Add customCall operation to zml/ops.
|
2025-01-09 15:01:33 +00:00 |
|
|
|
9f1cc762cd
|
Fix map tests in zml/meta.
|
2025-01-06 17:49:50 +00:00 |
|
|
|
fbf1ecb8b7
|
Introduce Executable.getCompiledMemoryStats in PJRT.
|
2025-01-02 16:36:13 +00:00 |
|
|
|
4b1a3ff48a
|
Add union support to mapping helpers in zml/meta.zig.
|
2025-01-01 13:35:17 +00:00 |
|
|
|
e6286b6097
|
Update Buffer.from to be blocking by default and add options for async loading and memory placement, adjusting aio, hostbuffer, pjrtx, and tensor implementations.
|
2024-12-25 17:14:44 +00:00 |
|
|
|
6aa9aa5a7b
|
Add preliminary implementation for custom call support.
|
2024-12-10 09:36:37 +00:00 |
|
|
|
f5ab2c3a55
|
zml: eliminate compile-time fields from Bufferized, removing the need to pass undefined to exe.call for inlined arguments. Introduce BufferizedWithArgs in zml.testing for compileAndCall utility.
|
2024-11-28 12:24:39 +00:00 |
|
|
|
95453c7242
|
Update XLA dependency to version 20250527.0‑cb67f2f and refresh related Bazel BUILD, MODULE, overlay and patch files.
|
2024-11-22 16:50:20 +00:00 |
|
|
|
3849eb10b7
|
Add buffer and hostbuffer utilities with precise f32→bf16 conversion, type inference for loadBuffers, store expected input shapes, enhance meta.visit and JSON TaggedUnion support, and improve logging.
|
2024-10-28 11:21:46 +00:00 |
|
|
|
4ef81b89ea
|
stdx.fmt: add slice formatting support, improving on previous prettyPrinter implementation by leveraging internal fmt mechanisms.
|
2024-10-18 15:05:08 +00:00 |
|
|
|
aacbf2ee04
|
Fix Llama3 rope scaling implementation in the neural network module (zml/nn.zig)
|
2024-10-07 12:53:03 +00:00 |
|
|
|
2863c1f5e0
|
zml/tensor: fix returned value in Tensor.toMemory – ensure _output_memory_kind is set correctly in the result.
|
2024-09-18 13:18:08 +00:00 |
|
|
|
aec7072837
|
pjrt: add FFI bindings for custom calls
|
2024-09-10 09:14:28 +00:00 |
|
|
|
1f5ff96c10
|
zml/ops: add wiring for operand output alias in zml.ops.triton
|
2024-09-09 15:00:28 +00:00 |
|
|
|
4b7e618b43
|
zml/aio: add bool handling in struct population within populateStruct
|
2024-09-02 14:11:47 +00:00 |
|
|
|
ac63c30e12
|
add mini-DSL for creating MLIR common attributes and types, leveraging Zig 0.14 to simplify mlir.Type and mlir.Attribute creation
|
2024-08-26 14:19:00 +00:00 |
|
|
|
63ef78efcc
|
zml: add support for NVTX tracing
|
2024-08-21 14:41:40 +00:00 |
|
|
|
7df89301dc
|
Bump XLA version and import llvm, stablehlo, triton, and zig‑protobuf modules in workspace BUILD files.
|
2024-08-06 10:28:43 +00:00 |
|
|
|
3f36506f1c
|
zml: remove usingnamespace from floats.zig and related dependencies; note that incremental compilation does not improve overall build time due to linking overhead
|
2024-07-23 17:43:43 +00:00 |
|
|
|
42dee5d0e0
|
mlir: rework stablehlo custom call implementation and add a Triton example
|
2024-07-16 13:23:07 +00:00 |
|
|
|
aec1d96e6d
|
mlir: rework DenseElementsAttribute to correctly slice inputs and modify .as() to return a concrete value instead of an optional
|
2024-07-15 12:32:24 +00:00 |
|
|
|
30f6be0e2f
|
Update core Zig modules (async, mlir, pjrt, stdx) and third‑party Bazel definitions for the Zig 0.14.0 release.
|
2024-07-02 14:19:04 +00:00 |
|
|
|
18eb0e5a7b
|
Add async I/O, SentencePiece, NN, and tensor utilities for ModernBERT support and update Bazel build configuration.
|
2024-06-14 15:27:06 +00:00 |
|
|
|
221ece647d
|
zml/ops.zig: Added zml.ops.case operation
This can be used to select which branch will be run at runtime.
It wraps the `stablehlo.case` operation.
|
2024-05-30 14:11:08 +00:00 |
|
|
|
3aac788544
|
Update Bazel build configurations (zig.bzl, BUILD files) for MLIR, PJRT, Neuron, ROCm, tokenizer, and tools, fixing broken dependencies.
|
2024-05-20 11:28:25 +00:00 |
|
|
|
05944b5cc9
|
Update FnCache to copy and reuse non‑tensor fields in fixed‑size structs, preventing undefined memory in core modules.
|
2024-05-15 17:54:52 +00:00 |
|
|
|
a34190679b
|
Fix llama token handling and remove redundant prompt token reuse in core Zig modules (aio, module, nn, pjrtx, tensor)
|
2024-05-02 17:10:11 +00:00 |
|
|
|
13eff4e661
|
pjrt,zml: add memory bindings
This preliminary PR binds PJRT memory endpoints and adds them to
`zml.Buffer`.
A follow-up PR will properly integrate them into `zml.Buffer`.
|
2024-04-11 15:43:24 +00:00 |
|
|
|
d4db5ccc6b
|
Integrate TinyLlama support, restore the homemade tokenizer, and align Zig API naming across stdx and zml tokenizer modules.
|
2024-04-05 15:07:29 +00:00 |
|
|
|
8a25b1eb74
|
Revert CUDA PJRT plugin version to 0.4.38 to address performance regression on XLA master.
|
2024-03-05 17:04:42 +00:00 |
|
|
|
959bc48c42
|
Add HuggingFace tokenizer bindings and SentencePiece integration; update BUILD files, async utilities, and FFI modules to support the new tokenizers.
|
2024-02-28 15:47:37 +00:00 |
|
|
|
c109b12e1b
|
Various minor fixes: rewrite tinyllama tokenizer newline token, prevent HostBuffer.isContiguous from falsely triggering on 1‑dim axes, improve HostBuffer.slice1d error messages, simplify module.zig output to show .mlir file path, correct setFlags handling of comptime int/float, make tokenizer.zig return <oob> for out‑of‑range detokenization, and speed up Buffer.constant creation to up to 2.5 GB/s on CUDA.
|
2024-02-19 12:34:18 +00:00 |
|
|
|
169a24307c
|
Migrate workspace and XLA module definitions to Bazel 8, updating MODULE.bazel files, BUILD rules, and related migration patches.
|
2024-02-12 12:43:23 +00:00 |
|
|
|
7e6103d876
|
Upgrade XLA to version 20250122.0-cc075be, switch to nvptx compiler and nvlink with nvjitlink support, add warning for CUDA path in LD_LIBRARY_PATH, and revert the previous CUDA sandbox fix.
|
2024-02-06 09:31:48 +00:00 |
|
|
|
b8a0aaee5a
|
Update tokenizer to handle byte_fallback for Llama3 GPT2 vocab and add a Llama3‑specific normalizer; adjust tinyllama.zig and hostbuffer.zig to use the new tokenization logic.
|
2024-02-05 15:22:44 +00:00 |
|
|
|
a7b7ae0180
|
Fix async hangs by reworking the libxev epoll backend and using callBlocking for PJRT plugin loading, improving performance across async and runtime modules.
|
2024-01-16 14:13:45 +00:00 |
|
|
|
434cee3a6c
|
Fix CUDA and ROCm sandbox discovery, update epoll libxev patch to prevent high CPU usage, enable XLA GPU latency‑hiding scheduler, and upgrade cuDNN to 9.6.0.
|
2024-01-15 09:41:42 +00:00 |
|