1427286716
runtimes/neuron: fix neuron runtime
This PR fixes the neuron runtime with the following changes:
- Proxy the PJRT API method to enforce the client struct sizes, since the
  neuron PJRT plugin asserts them with `==` instead of `>=`, breaking
  PJRT compatibility guarantees.
  Fixes https://github.com/aws-neuron/aws-neuron-sdk/issues/1095
- Reimplement `libneuronxla` in Zig to control neuronx-cc sandboxing and
  invocation.
- Implement a Python bootstrapper in Zig to create a full-blown
  `neuronx-cc` executable, avoiding the infamous chicken-and-egg problem
  of bootstrapping Python executables when sandboxed (due to fixed-path
  shebangs).
---------
Co-authored-by: Corentin Kerisit <corentin.kerisit@gmail.com>
2025-07-15 15:26:03 +00:00
e1ee340306
runtimes/cuda: implement zmlxcuda in Zig
2025-07-08 09:25:25 +00:00
c488b634fc
runtimes/rocm: implement zmlxrocm in Zig
Also, sandbox `amdgpu.ids` and restore safetensors json parsing.
2025-07-07 16:48:07 +00:00
cf00506dbb
Switch workspace build rules from zig_cc_binary to zig_binary, removing the hack and using the C linker directly.
2025-07-03 15:10:36 +00:00
e789e26008
Remove examples workspace and clean up related Bazel BUILD/MODULE files and Zig build scripts.
2025-06-19 09:30:29 +00:00
1a2b862ec2
Add sandbox neuron dependencies: define a trampoline PJRT, create an empty repository for distroless deps, and update Bazel build files and Zig/C sources accordingly.
2025-05-19 17:35:33 +00:00
55c5b540f8
Add XLA 20250718.0‑6319f0d with ROCm 6.4.1 support, update Bazel module files and runtime configs, and apply migration, FFI‑handler and header‑cleanup patches.
2025-05-12 12:10:27 +00:00
ed5ae31338
runtimes/rocm: fetch libdrm from amdgpu repository and add amdgpu.ids layer
2025-04-30 15:53:51 +00:00
e7323be10b
runtimes/rocm: switch to in-process LLD, removing the need for sandboxed lld.
2025-04-23 11:43:18 +00:00
7d9fdf94e7
runtimes/rocm: sandbox ROCm dependencies and ensure they load on the main thread due to TLS usage in static C++ destructors.
2025-04-14 16:38:15 +00:00
eba0e72532
runtimes/tpu: sandbox TPU PJRT plugin; no external dependencies.
2025-04-10 14:47:16 +00:00
78d7b672e7
runtimes/cpu: sandbox CPU PJRT plugin, simplifying as there are no additional NEEDED dependencies.
2025-04-03 11:57:46 +00:00
2d321d232d
runtimes/cuda: sandbox CUDA dependencies by removing them from the leaf binary, sandboxing the dependency graph, marking dlopen direct dependencies as NEEDED, setting RPATH to the sandbox, loading the PJRT plugin from the sandbox, and enabling weak CUDA symbols without direct linking.
2025-03-26 11:18:29 +00:00
f27a524f31
Update rules_zig: add zig_srcs target, fix source handling bug, clean up BUILD files, adjust async/coro.zig tests, and disable nemo and yaml model loaders.
2025-03-13 12:27:21 +00:00
9488672d4b
workspace: bump xla to version 20250710.0-22ea002
Also:
- Bump XLA deps: `com_github_grpc_grpc` and `com_google_protobuf`
- Inject `rules_ml_toolchain`
- Fix `zig_proto_library` rule
2025-03-04 17:12:34 +00:00
fa0ed045ef
runtimes/cuda: downgrade cuda and cudnn
This commit reverts part of https://github.com/zml/zml/pull/238/files
This is required because XLA has a strong dependency on CUDA 12.8, and
upgrading to 12.9 is impossible due to
https://github.com/NVIDIA/cccl/issues/4967
2025-02-28 17:36:12 +00:00
1cafcc3c60
Workspace: bump XLA to newer version.
2025-02-05 17:35:27 +00:00
9ef838be25
Update neuron runtime BUILD.bazel to use Bazel manual tag and S3 cache integration.
2025-02-03 14:03:33 +00:00
95453c7242
Update XLA dependency to version 20250527.0‑cb67f2f and refresh related Bazel BUILD, MODULE, overlay and patch files.
2024-11-22 16:50:20 +00:00
d8a83830e8
runtimes: switch to Cloudflare Debian snapshots for more reliable dependency pinning.
2024-11-15 09:40:58 +00:00
ea3ce685a9
runtimes/neuron: bump runtime version and expose nrt.h header to Zig.
2024-11-14 13:37:47 +00:00
47a4eda5f6
runtimes/cuda: expose cuda.h in the C namespace for CUDA runtimes, enabling custom calls to CUDA functions.
2024-11-01 13:27:24 +00:00
4a0b1cce50
Update Bazel workspace and XLA overlay (MODULE.bazel, BUILD files, patches) to prevent dual LLVM builds and apply migration/bump patches.
2024-09-27 14:00:44 +00:00
63ef78efcc
zml: add support for NVTX tracing
2024-08-21 14:41:40 +00:00
ca4e061ad5
Add Bazel build configurations for macOS x86_64 CPU runtime and ZLS third‑party integration.
2024-07-25 15:58:14 +00:00
efcf955a4e
workspace, third_party/rules_zig: adjust ZLS to require --version as the first parameter and add missing keys to the BuildConfig object for code completion
2024-07-10 15:20:12 +00:00
967eeb928f
Update Bazel workspace and runtime configs: rework sandboxing, bump PJRT to 7.0.0, and upgrade CUDA (12.8), cuDNN (9.8), and ROCm (6.3.4).
2024-06-25 11:00:29 +00:00
3aac788544
Update Bazel build configurations (zig.bzl, BUILD files) for MLIR, PJRT, Neuron, ROCm, tokenizer, and tools, fixing broken dependencies.
2024-05-20 11:28:25 +00:00
f5ab6ff2c6
Update XLA to version 20250204.0-6789523 and adjust Bazel module and runtime files for Bazel 8 compatibility.
2024-05-03 15:57:56 +00:00
5a2171793d
workspace: MODULE.bazel cleanup
Title says it all!
2024-04-22 09:27:44 +00:00
980f1b17fb
Ensure all runtime plugins have correct SONAME values, fixing issues with prebuilt PJRT plugins.
2024-03-11 10:15:22 +00:00
8a25b1eb74
Revert CUDA PJRT plugin version to 0.4.38 to address performance regression on XLA master.
2024-03-05 17:04:42 +00:00
169a24307c
Migrate workspace and XLA module definitions to Bazel 8, updating MODULE.bazel files, BUILD rules, and related migration patches.
2024-02-12 12:43:23 +00:00
7e6103d876
Upgrade XLA to version 20250122.0-cc075be, switch to nvptx compiler and nvlink with nvjitlink support, add warning for CUDA path in LD_LIBRARY_PATH, and revert the previous CUDA sandbox fix.
2024-02-06 09:31:48 +00:00
edc2ac26f8
Adjust ROCm runtime sandboxing to hook only the PJRT plugin and make hipblaslt bytecodes optional.
2024-01-26 13:02:23 +00:00
a7b7ae0180
Fix async hangs by reworking the libxev epoll backend and using callBlocking for PJRT plugin loading, improving performance across async and runtime modules.
2024-01-16 14:13:45 +00:00
434cee3a6c
Fix CUDA and ROCm sandbox discovery, update epoll libxev patch to prevent high CPU usage, enable XLA GPU latency‑hiding scheduler, and upgrade cuDNN to 9.6.0.
2024-01-15 09:41:42 +00:00
145e60b4dd
workspace: Update LLVM, XLA, StableHLO, and PJRT plugins to latest versions.
2023-12-13 10:10:32 +00:00
37725cdaa6
Update PJRT, runtime, and ZML modules to use per‑target output folders and expose profiler.dumpDataAsJson for JSON profiling output.
2023-12-04 10:38:10 +00:00
455bb3877f
runtimes/cuda: obtain NCCL from the pip package, matching XLA behavior.
2023-09-20 17:41:44 +00:00
0d5389ceda
Update CUDA runtime sandboxing and dynamic symbol renaming, switch to pre‑built jax‑cuda‑pjrt plugin, and bump CUDA to 12.6.2 and cuDNN to 9.5.1.
2023-09-14 13:28:25 +00:00
7d24329d0a
Add Bazel build rules and runtime implementation for AWS Neuron/Trainium/Inferentia support.
2023-08-18 17:11:27 +00:00
01eff33fa0
Update workspace dependencies to newer LLVM, XLA, StableHLO, and PJRT versions and expose new pjrt plugin attribute APIs and stablehlo version APIs in build and runtime configurations.
2023-08-07 12:28:36 +00:00
54e7eb30b4
Introduce a thin abstraction layer between ZML and PJRT to manage plugin loading decisions, enable compile‑time detection of linked runtimes, and handle cases such as libtpu blocking metadata access.
2023-05-15 09:36:41 +00:00
cfe38f27ca
Switch ROCm dlopen handling to patchelf's rename_dynamic_symbols for more robust dynamic symbol import.
2023-05-03 17:33:46 +00:00
833ff5f28d
Upgrade PJRT CUDA Plugin to version 0.2.3, adding NCCL support for correct sharding.
2023-04-12 15:47:06 +00:00
70d40208a2
runtimes/cuda: Fix version variable definitions in the build script to enable successful CUDA builds.
2023-03-09 11:31:02 +00:00
0c126c2e12
runtimes/cuda: Upgrade CUDA to 12.6.2 and cuDNN to 9.4.0.
2023-03-03 15:17:26 +00:00
f595d22134
runtimes/rocm: Upgrade ROCm to version 6.2.2.
2023-03-01 13:15:50 +00:00
0606ea1d7c
Update Bazel workspace and runtime BUILD files to newer XLA, StableHLO, and LLVM versions, enabling batching‑dims support for the gather operator.
2023-02-01 15:58:30 +00:00