BLD: Modify cpu detection and printing to get working aarch64 build #11568

Merged
merged 5 commits into from
Jul 16, 2018

ksunden
Contributor

@ksunden ksunden commented Jul 14, 2018

Closes #11564

@jjhelmus
Contributor

Once the build is working again on aarch64 we could use the new ARMv8-A on Shippable for an altarch CI.

@charris
Member

charris commented Jul 14, 2018

LGTM, failure looks unrelated.

@charris
Member

charris commented Jul 14, 2018

Has this been tested to see if it fixes the build?

@@ -2714,7 +2714,7 @@ Dragon4_PrintFloat_Intel_extended128(
  * becomes more common.
  */
 static npy_uint32
-Dragon4_PrintFloat_IEEE_binary128(
+Dragon4_PrintFloat_IEEE_binary128_le(
Member

Why this change?

Member

This function is for quad precision floats, I doubt that ARM implements that.

Contributor Author

Without this change, I got warnings about an implicit declaration of Dragon4_PrintFloat_IEEE_binary128_le and an unused symbol for Dragon4_PrintFloat_IEEE_binary128.

Additionally, it failed on import, looking for Dragon4_PrintFloat_IEEE_binary128_le.

It is very possible that the root cause is actually the identification of which representation it should be using. I've been looking at long_double_representation(lines) to see if there's something incorrect there.

I've created the object file outside of running setup_common, and confirmed that the sequence that identifies IEEE_QUAD_LE is indeed present in the od -b dump.
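
For readers unfamiliar with this kind of detection, here is a minimal sketch of identifying a quad precision layout by searching compiled bytes for a known bit pattern. It is not the setup_common code: the helper names encode_binary128 and detect_quad_layout, the probe value, and the returned labels are assumptions made for illustration only.

```python
from typing import Optional


def encode_binary128(value: int) -> int:
    """Return the IEEE 754 binary128 bit pattern of a nonzero integer.

    Illustrative helper: only handles integers that fit exactly in the
    113-bit significand, which is enough for a probe constant.
    """
    if value == 0:
        raise ValueError("probe must be nonzero")
    sign = 1 if value < 0 else 0
    magnitude = abs(value)
    exponent = magnitude.bit_length() - 1      # unbiased exponent
    if exponent > 112:
        raise ValueError("probe does not fit exactly in binary128")
    # Drop the implicit leading 1 and left-align the rest in 112 fraction bits.
    fraction = (magnitude - (1 << exponent)) << (112 - exponent)
    return (sign << 127) | ((exponent + 16383) << 112) | fraction


def detect_quad_layout(blob: bytes, probe: int = -123456789) -> Optional[str]:
    """Search raw bytes for the probe stored as binary128 in either byte order."""
    bits = encode_binary128(probe)
    if bits.to_bytes(16, "little") in blob:
        return "IEEE_QUAD_LE"
    if bits.to_bytes(16, "big") in blob:
        return "IEEE_QUAD_BE"
    return None
```

Given the bytes of an object file that stores the probe as a long double, detect_quad_layout reports which byte order matched, or None if neither pattern is present.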

Contributor Author

Given the error that was thrown, I tried the naive approach of just editing the function name. I did not fully expect it to work, but it did (at least as far as building is concerned; there are still some test failures).

Member

@charris charris Jul 14, 2018

Hmm, LLVM does seem to have software support for quad precision, so I think we need to track this down. What is your compiler toolchain? I'm concerned about the endian part. @ahaldane Comment?

Member

Maybe wrong flag, but it is there somewhere: https://llvm.org/docs/LangRef.html#floating-point-types. We should be detecting on the byte representation, so either there is an error in that or the compiler is actually producing quad floats.

Member

OK, we need to have two functions, one with the _be extension and one with the _le extension. This function, despite the comment, looks to be big endian, with the little endian version in #11570. The difference is only four lines, so it would be nice to have wrappers for the endian-specific versions that call into the common code, but the easy fix would be to copy the function from #11570 here with the appropriate name.

Member

See similar wrappers for the various extended precision alignments.

Member

I think all the wrappers need to do is swap the a and b members of buf128, so maybe only one wrapper to do that for the _le version and name this function with the _be extension.
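
A rough Python model of that idea, for illustration: read the 16 bytes as two 64-bit words, swap them for the little endian layout, and hand (hi, lo) to shared decoding code. The function names split_words and decode_fields are hypothetical, and the assumption that a and b are simply the two 64-bit halves in memory order is not confirmed by anything shown in this thread.

```python
import struct


def split_words(raw: bytes, little_endian: bool):
    """Return the (hi, lo) 64-bit words of a 16-byte binary128 value."""
    fmt = "<QQ" if little_endian else ">QQ"
    a, b = struct.unpack(fmt, raw)
    # In memory order, little endian stores the low word first, so the
    # "_le wrapper" swaps the halves; big endian is already (hi, lo).
    return (b, a) if little_endian else (a, b)


def decode_fields(hi: int, lo: int):
    """Endian-independent common code: extract sign, exponent, mantissa."""
    sign = hi >> 63
    exponent = (hi >> 48) & 0x7FFF
    mantissa = ((hi & 0xFFFFFFFFFFFF) << 64) | lo
    return sign, exponent, mantissa
```

With this split, only split_words knows about byte order, which mirrors the suggestion of a single swapping wrapper in front of one shared implementation.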

Contributor Author

I've pushed the easy fix by cherry-picking the commit from my other PR. I'll work on the wrapper function now.

@charris
Member

charris commented Jul 14, 2018

The availability of half precision is also interesting.

@ksunden
Contributor Author

ksunden commented Jul 14, 2018

I have indeed built this on my machine, and @jjhelmus has built it as well on a remote machine.

@charris
Member

charris commented Jul 15, 2018

Out of curiosity, what does np.finfo(np.float128) show?

@ksunden
Contributor Author

ksunden commented Jul 15, 2018

>>> np.finfo(np.float128)
finfo(resolution=1e-33, min=-1.189731495357231765085759326628007e+4932, max=1.189731495357231765085759326628007e+4932, dtype=float128)

@charris
Member

charris commented Jul 15, 2018

OK, looks like genuine IEEE quad precision long doubles; the finfo function should be making the identification based on the byte representation. It would be interesting to know why they are there, as that does not seem to be widely known, but the signs and portents do seem to indicate that the long delayed move to quad precision is about to get underway.
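
As a cross-check on that reading, the reported max can be compared with the largest finite value of a 113-bit significand format. The sketch below uses exact integer arithmetic from the standard library only. The comparison against the 80-bit x87 extended format (64-bit significand, same exponent range) is an addition for illustration and is not something discussed in this thread; max_finite and leading_digits are hypothetical helper names.

```python
from decimal import Decimal, getcontext


def max_finite(precision_bits: int, emax: int) -> int:
    """Largest finite value of a binary float format, as an exact integer.

    precision_bits counts all significand bits, including the leading one.
    """
    # All significand bits set, scaled so the leading bit has weight 2**emax.
    return ((1 << precision_bits) - 1) << (emax - (precision_bits - 1))


def leading_digits(n: int, digits: int = 34) -> str:
    """Round an exact integer to the given number of significant digits."""
    getcontext().prec = digits
    return str(+Decimal(n))  # unary plus applies the context rounding


quad_max = leading_digits(max_finite(113, 16383))  # IEEE binary128
x87_max = leading_digits(max_finite(64, 16383))    # 80-bit extended
```

If the long double really is binary128, quad_max should reproduce the digits of max in the finfo output above, while x87_max diverges around the twentieth significant digit.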

@charris charris changed the title BLD: Modify cpu detection and function to get build working on aarch64 BLD: Modify cpu detection and printing to get working aarch64 build Jul 16, 2018
@charris charris merged commit f07359b into numpy:master Jul 16, 2018
@charris
Member

charris commented Jul 16, 2018

Thanks @ksunden .

@QuLogic
Contributor

QuLogic commented Aug 11, 2018

I think this broke some big-endian systems. Dragon4_PrintFloat_IEEE_binary128 exists if defined(HAVE_LDOUBLE_IEEE_QUAD_LE) || defined(HAVE_LDOUBLE_IEEE_QUAD_BE) and calls LogBase2_128, but that only exists if defined(HAVE_LDOUBLE_IEEE_QUAD_LE), which means it's not available on big-endian systems.

@QuLogic
Contributor

QuLogic commented Aug 11, 2018

BigInt_Set_2x_uint64 is also not defined for HAVE_LDOUBLE_IEEE_QUAD_BE and called by this function.
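
The mismatch can be pictured with a toy model of the preprocessor guards. This is only a sketch of the two comments above, not the contents of the C source: the GUARDS and CALLS tables and the missing_callees helper are hypothetical, and the guard listed for BigInt_Set_2x_uint64 is an assumption (the comment only says it is not defined for HAVE_LDOUBLE_IEEE_QUAD_BE).

```python
# Macros under which each symbol is compiled, per the comments above.
GUARDS = {
    "Dragon4_PrintFloat_IEEE_binary128": {
        "HAVE_LDOUBLE_IEEE_QUAD_LE",
        "HAVE_LDOUBLE_IEEE_QUAD_BE",
    },
    "LogBase2_128": {"HAVE_LDOUBLE_IEEE_QUAD_LE"},
    # Assumed LE-only guard; the thread does not state the exact condition.
    "BigInt_Set_2x_uint64": {"HAVE_LDOUBLE_IEEE_QUAD_LE"},
}

# Which helpers each function calls.
CALLS = {
    "Dragon4_PrintFloat_IEEE_binary128": ["LogBase2_128", "BigInt_Set_2x_uint64"],
}


def missing_callees(func: str, macro: str) -> list:
    """List callees of func that are compiled out when only macro is defined."""
    if macro not in GUARDS[func]:
        return []  # the caller itself is compiled out, so nothing is missing
    return [c for c in CALLS.get(func, []) if macro not in GUARDS[c]]
```

Under this model, a big-endian build compiles the caller but neither helper, which is the failure described here; widening the helpers' guards to cover both macros makes the list empty.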

QuLogic added a commit to QuLogic/numpy that referenced this pull request Aug 11, 2018
Both these functions are used by `Dragon4_PrintFloat_IEEE_binary128`,
which was recently made available on big-endian systems without these
in numpy#11568.
charris pushed a commit to charris/numpy that referenced this pull request Aug 12, 2018
Both these functions are used by `Dragon4_PrintFloat_IEEE_binary128`,
which was recently made available on big-endian systems without these
in numpy#11568.