Commit Graph

308 Commits

SHA1 Message Date
8073e45894 Update examples/MODULE.bazel.lock to reflect bumped hftokenizers dependency. 2025-04-09 10:21:44 +00:00
4294a4d08f Bump hftokenizers dependency versions in Bazel and Cargo lockfiles (MODULE.bazel.lock, Cargo.toml, Cargo.lock) 2025-04-04 12:54:33 +00:00
78d7b672e7 runtimes/cpu: sandbox CPU PJRT plugin, simplifying as there are no additional NEEDED dependencies. 2025-04-03 11:57:46 +00:00
2d321d232d runtimes/cuda: sandbox CUDA dependencies by removing them from the leaf binary, sandboxing the dependency graph, marking dlopen direct dependencies as NEEDED, setting RPATH to the sandbox, loading the PJRT plugin from the sandbox, and enabling weak CUDA symbols without direct linking. 2025-03-26 11:18:29 +00:00
a5420068b1 pjrt: emit warning instead of panic when FFI Extension is missing (e.g., on TPU). 2025-03-24 09:40:44 +00:00
dc121fce4f Update example MODULE.bazel and lockfile to reflect toolchains_llvm_bootstrapped bump to 0.2.4. 2025-03-20 12:17:30 +00:00
907577525f Update MODULE.bazel and lockfile to bump toolchains_llvm_bootstrapped to version 0.2.4. 2025-03-18 11:47:22 +00:00
f27a524f31 Update rules_zig: add zig_srcs target, fix source handling bug, clean up BUILD files, adjust async/coro.zig tests, and disable nemo and yaml model loaders. 2025-03-13 12:27:21 +00:00
6fc1148206 async/coro: make coroutines unwindable by zeroing the initial stack region, preventing random unwinding behavior and SIGSEGV during _Unwind_Backtrace. 2025-03-10 16:25:45 +00:00
f63c673f45 bazel: add RPATH manipulation to patchelf 2025-03-05 11:56:40 +00:00
9488672d4b workspace: bump xla to version 20250710.0-22ea002
Also:
- Bump XLA deps: `com_github_grpc_grpc` and `com_google_protobuf`
- Inject `rules_ml_toolchain`
- Fix `zig_proto_library` rule
2025-03-04 17:12:34 +00:00
fa0ed045ef runtimes/cuda: downgrade cuda and cudnn
This commit reverts part of https://github.com/zml/zml/pull/238/files
This is required because XLA has a strong dependency on CUDA 12.8 and
upgrading to 12.9 is impossible due to
https://github.com/NVIDIA/cccl/issues/4967
2025-02-28 17:36:12 +00:00
ff1433d998 pjrt: bind PJRT_Client_CreateUninitializedBuffer. 2025-02-25 10:37:45 +00:00
8456a0d073 zml/pjrt: add binding for PJRT_Device_MemoryStats. 2025-02-19 12:14:05 +00:00
a580f2a398 Async: use stronger memory ordering to prevent potential segfaults due to ordering issues. 2025-02-18 11:38:56 +00:00
4d6d975dc0 Patch aio.zig: update loadBuffersWithPrefix argument type to match the conditional type of loadBuffers init_args. 2025-02-13 09:48:13 +00:00
af8844c1f1 Add model prefix support when loading a model from safetensors, enabling use of a specific model prefix (e.g., ModernBertModel) instead of the full model. Tested with the text embeddings server project. 2025-02-12 13:18:27 +00:00
1cafcc3c60 Workspace: bump XLA to newer version. 2025-02-05 17:35:27 +00:00
9ef838be25 Update neuron runtime BUILD.bazel to use Bazel manual tag and S3 cache integration. 2025-02-03 14:03:33 +00:00
dd52e988b4 Update example Bazel build files (MODULE.bazel, llama, modernbert) to test the revamped commit workflow. 2025-01-31 16:28:38 +00:00
0a2ab7c8cb Remove usingnamespace from MLIR. 2025-01-28 09:35:58 +00:00
f8ab0d7b2a Remove dead imports. 2025-01-22 10:45:04 +00:00
51a6cab753 Wire has_side_effect field in zml/ops. 2025-01-20 16:45:13 +00:00
99a2001e63 Rename PJRT BufferType to follow Zig and ZML naming conventions. 2025-01-16 13:00:47 +00:00
7324a49da3 Remove .print() calls from globalAttnMask() and localAttnMask() in ModernBERT example to resolve compilation sharding error. 2025-01-15 16:59:26 +00:00
09c43b8759 Add customCall operation to zml/ops. 2025-01-09 15:01:33 +00:00
9f1cc762cd Fix map tests in zml/meta. 2025-01-06 17:49:50 +00:00
fbf1ecb8b7 Introduce Executable.getCompiledMemoryStats in PJRT. 2025-01-02 16:36:13 +00:00
4b1a3ff48a Add union support to mapping helpers in zml/meta.zig. 2025-01-01 13:35:17 +00:00
c961d705f1 Set default values for operand_layouts and result_layouts in StableHLO dialect. 2024-12-26 09:29:45 +00:00
e6286b6097 Update Buffer.from to be blocking by default and add options for async loading and memory placement, adjusting aio, hostbuffer, pjrtx, and tensor implementations. 2024-12-25 17:14:44 +00:00
da1fd2d9dc Add examples demonstrating Buffer.from options, non-blocking loading, and memory copy behavior. 2024-12-20 09:30:35 +00:00
bb2b77d7de Correctly set model.norm.eps in Llama examples. 2024-12-18 11:48:23 +00:00
6aa9aa5a7b Add preliminary implementation for custom call support. 2024-12-10 09:36:37 +00:00
1d5b79111a modernbert: set default epsilon value for embeddings layernorm. 2024-12-09 16:43:29 +00:00
a63d0a4aa3 Update example MODULE.bazel and lockfile to use the toolchains_llvm_bootstrapped configuration. 2024-12-04 11:30:44 +00:00
5464281c91 Update workspace configuration to use the toolchains_llvm_bootstrapped toolchain for Zig builds. 2024-12-03 13:50:58 +00:00
f5ab2c3a55 zml: eliminate compile-time fields from Bufferized, removing the need to pass undefined to exe.call for inlined arguments. Introduce BufferizedWithArgs in zml.testing for compileAndCall utility. 2024-11-28 12:24:39 +00:00
364a222dc1 Update example MODULE.bazel and lockfile to target XLA version 20250527.0-cb67f2f. 2024-11-25 17:57:45 +00:00
95453c7242 Update XLA dependency to version 20250527.0-cb67f2f and refresh related Bazel BUILD, MODULE, overlay and patch files. 2024-11-22 16:50:20 +00:00
fa13287931 workspace: upgrade to Zig 0.14.1 and handle empty tuple syntax &.{} being detected as *const @TypeOf(.{}). 2024-11-19 11:45:36 +00:00
d8a83830e8 runtimes: switch to Cloudflare Debian snapshots for more reliable dependency pinning. 2024-11-15 09:40:58 +00:00
ea3ce685a9 runtimes/neuron: bump runtime version and expose nrt.h header to Zig. 2024-11-14 13:37:47 +00:00
09da9c2982 Make zls.sh example explicitly set the ZLS runner target. 2024-11-06 16:22:44 +00:00
948c577205 Make ZLS runner target explicit in workspace BUILD files and update the zls.sh script accordingly. 2024-11-04 13:57:59 +00:00
47a4eda5f6 runtimes/cuda: expose cuda.h in the C namespace for CUDA runtimes, enabling custom calls to CUDA functions. 2024-11-01 13:27:24 +00:00
3849eb10b7 Add buffer and hostbuffer utilities with precise f32→bf16 conversion, type inference for loadBuffers, store expected input shapes, enhance meta.visit and JSON TaggedUnion support, and improve logging. 2024-10-28 11:21:46 +00:00
1540c6e85e Update loader example to demonstrate new HostBuffer helpers and type-inferred buffer loading. 2024-10-25 10:20:04 +00:00
048d7eb38e third_party/sentencepiece: add missing protobuf_lite dependency and bump version. 2024-10-22 16:41:52 +00:00
4ef81b89ea stdx.fmt: add slice formatting support, improving on previous prettyPrinter implementation by leveraging internal fmt mechanisms. 2024-10-18 15:05:08 +00:00