pin update #8908

Merged: 10 commits from lsiyuan/pin-update-0331 into master on Apr 4, 2025

Conversation

@lsy323 (Collaborator) commented Mar 31, 2025

Accommodate the following changes:

  • xla::Shape::rank() is renamed to xla::Shape::dimensions_size()
  • Change to the xla::Shape constructor
  • PJRT compilation API change

@ysiraichi (Collaborator)

I was able to compile by adding the following patch to OpenXLA:

diff --git a/xla/service/gpu/model/gpu_collective_performance_model.cc b/xla/service/gpu/model/gpu_collective_performance_model.cc
index 496969f545..2d9f73ee36 100644
--- a/xla/service/gpu/model/gpu_collective_performance_model.cc
+++ b/xla/service/gpu/model/gpu_collective_performance_model.cc
@@ -34,7 +34,7 @@ limitations under the License.

 #if GOOGLE_CUDA
 #include "third_party/gpus/cuda/include/cuda.h"
-#include "third_party/gpus/cuda/nvml/include/nvml.h"
+#include "third_party/gpus/cuda/include/nvml.h"
 #endif  // GOOGLE_CUDA
 namespace xla {
 namespace gpu {
diff --git a/xla/service/gpu/model/gpu_collective_performance_model.h b/xla/service/gpu/model/gpu_collective_performance_model.h
index 01c3f3eb45..f44057602b 100644
--- a/xla/service/gpu/model/gpu_collective_performance_model.h
+++ b/xla/service/gpu/model/gpu_collective_performance_model.h
@@ -32,7 +32,7 @@ limitations under the License.
 #include <dlfcn.h>
 #endif

-#include "third_party/gpus/cuda/nvml/include/nvml.h"
+#include "third_party/gpus/cuda/include/nvml.h"
 // Below is a list of function pointers to be used
 // for querying device properties through nvml library.
 #define NVML_FUNCTOR(name, rettype, args) \

@lsy323 (Collaborator, Author) commented Apr 2, 2025

Thank you @ysiraichi! I added this patch for now.

@lsy323 (Collaborator, Author) commented Apr 2, 2025

The persistent cache test is failing on GPU due to a deserialization issue. Skipping the test for now; will file a GitHub issue for it.

[ RUN      ] PersistentCacheTest.test_persistent_cache_mp
E0402 19:38:44.492582203   21190 server_chttp2.cc:40]        {"created":"@1743622724.492561846","description":"Only 1 addresses added out of total 2 resolved","file":"external/com_github_grpc_grpc/src/core/ext/transport/chttp2/server/chttp2_server.cc","file_line":404,"referenced_errors":[{"created":"@1743622724.492558996","description":"Address family not supported by protocol","errno":97,"file":"external/com_github_grpc_grpc/src/core/lib/iomgr/socket_utils_common_posix.cc","file_line":420,"os_error":"Address family not supported by protocol","syscall":"socket","target_address":"[::1]:8547"}]}
E0402 19:38:58.918947906   22327 server_chttp2.cc:40]        {"created":"@1743622738.918928838","description":"Only 1 addresses added out of total 2 resolved","file":"external/com_github_grpc_grpc/src/core/ext/transport/chttp2/server/chttp2_server.cc","file_line":404,"referenced_errors":[{"created":"@1743622738.918926006","description":"Address family not supported by protocol","errno":97,"file":"external/com_github_grpc_grpc/src/core/lib/iomgr/socket_utils_common_posix.cc","file_line":420,"os_error":"Address family not supported by protocol","syscall":"socket","target_address":"[::1]:8547"}]}
2025-04-02 19:39:05.879197: W torch_xla/csrc/runtime/pjrt_computation_client.cc:680] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-04-02 19:39:05.880553: W torch_xla/csrc/runtime/pjrt_computation_client.cc:680] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-04-02 19:39:05.887184: W torch_xla/csrc/runtime/pjrt_computation_client.cc:680] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-04-02 19:39:05.892825: W torch_xla/csrc/runtime/pjrt_computation_client.cc:680] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
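
For context, below is a minimal sketch (not the actual test) of how the persistent compilation cache is typically exercised, assuming the public torch_xla.runtime.initialize_cache API and an illustrative cache path. On a second run the cached executable is read back; the GPU PJRT client cannot deserialize it, which produces the warning above and forces a fresh compilation.

# persistent_cache_sketch.py -- illustrative only; the file name and cache path
# are assumptions, not part of this PR.
import torch
import torch_xla.core.xla_model as xm
import torch_xla.runtime as xr

# Enable the persistent compilation cache before any XLA computation runs.
xr.initialize_cache("/tmp/xla_cache", readonly=False)

device = xm.xla_device()
x = torch.randn(4, 4, device=device)
y = (x @ x).sum()
xm.mark_step()  # compiles the graph; the executable is serialized into the cache
print(y.cpu())

# Running the script a second time loads the serialized executable; on GPU the
# deserialization currently returns UNIMPLEMENTED, so the warning above is
# logged and the graph is recompiled.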

@lsy323 marked this pull request as ready for review April 3, 2025 19:20

@ManfeiBai (Collaborator) left a comment

Thanks for the amazing work, it's a really huge change-adoption PR. LGTM

@vanbasten23 (Collaborator)

Thanks @lsy323 for updating the pin.

Regarding the paged_attention hang, could you update this line

step = torch.zeros((1,), dtype=torch.int32).to("xla")

to

step = torch.ones((1,), dtype=torch.int32).to("xla")

It should make the test pass. I tested locally.

@lsy323 requested a review from tengyifei April 3, 2025 22:54
@lsy323 enabled auto-merge (squash) April 3, 2025 22:54
@lsy323 (Collaborator, Author) commented Apr 3, 2025

Thanks @vanbasten23! Updated the PR. Also, do you mind elaborating a bit on this?

@lsy323 merged commit 6400e16 into master Apr 4, 2025
18 of 21 checks passed
@lsy323 deleted the lsiyuan/pin-update-0331 branch April 4, 2025 01:00
tengyifei added a commit that referenced this pull request Apr 4, 2025
#8908 accidentally enabled some pallas tests on CPU, which is not supported
@tengyifei mentioned this pull request Apr 4, 2025
@vanbasten23 (Collaborator)

Yeah, jax-ml/jax@8c73799 made a change (it's not a bug but a valid change). As a result, the torch_xla wrapper needs to change accordingly.
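
For reference, a minimal sketch of the resulting one-line change in the test; only the two step initializations are quoted from this thread, and the import lines and comments are added for illustration.

import torch
import torch_xla  # assumption: running under torch_xla so the "xla" device is available

# Old initialization, which hangs with the updated JAX kernel referenced above:
# step = torch.zeros((1,), dtype=torch.int32).to("xla")
# New initialization used after this PR, which makes the test pass:
step = torch.ones((1,), dtype=torch.int32).to("xla")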
