@nastra
Contributor

@nastra nastra commented Mar 22, 2024

No description provided.

@github-actions github-actions bot added the API label Mar 22, 2024
  @Override
  public int hashCode() {
-   return JavaHashes.hashCode(wrapped);
+   int hash = hashCode;
Contributor Author

This uses a local hash variable on purpose, see https://jeremymanson.blogspot.com/2008/12/benign-data-races-in-java.html. Reading the field once into a local means the method never returns the result of a second, possibly stale, read of the field. The String class itself uses the same approach when lazily calculating and caching a hashCode.
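
For readers following along, here is a minimal sketch of the single-read pattern described in this comment. The class name `CachedHashSketch` and the helper `computeHash` are hypothetical and exist only for illustration; this is not the code under review. It assumes 0 as the "not computed yet" marker, which is a simplification.

```java
/** Hypothetical wrapper for illustration; not the class under review. */
final class CachedHashSketch {
  private final CharSequence wrapped;

  // Lazily computed cache. Not volatile on purpose: racing threads may each
  // compute the hash, but they all compute and store the same value.
  // In this simplified sketch, 0 means "not computed yet".
  private int hashCode;

  CachedHashSketch(CharSequence wrapped) {
    this.wrapped = wrapped;
  }

  @Override
  public int hashCode() {
    int hash = hashCode; // read the shared field exactly once
    if (hash == 0) {
      hash = computeHash(wrapped);
      hashCode = hash;
    }
    return hash; // return the local, never a second read of the field
  }

  // Same polynomial as java.lang.String.hashCode(), chosen only for this sketch.
  static int computeHash(CharSequence seq) {
    int h = 0;
    for (int i = 0; i < seq.length(); i++) {
      h = 31 * h + seq.charAt(i);
    }
    return h;
  }
}
```

A variant that checks the field and then returns the field again performs two unsynchronized reads, and the Java memory model allows the second read to observe the default 0 even when the first saw a non-zero value. Returning the local avoids that.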

private JavaHashes() {}

public static int hashCode(CharSequence str) {
if (null == str) {
Contributor Author

There are places in the codebase that use CharSequenceWrapper.wrap(...) and then set the wrapper to null. The issue was never uncovered because equals/hashCode was never called on those CharSequenceWrapper instances.

When adding TestCharSequenceWrapper I noticed that this failed with an NPE, so I fixed it here and also added some tests around it.
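
A rough sketch of a null-tolerant hash helper, to make the fix concrete. The class name `NullSafeHashSketch` is hypothetical, and both the return value of 0 for null and the hash polynomial are assumptions: the diff hunk above shows only the null check, not what follows it.

```java
/** Hypothetical helper for illustration; not the JavaHashes class itself. */
final class NullSafeHashSketch {
  private NullSafeHashSketch() {}

  static int hashCode(CharSequence str) {
    if (null == str) {
      // Assumption: treat a missing sequence as hash 0 instead of throwing.
      return 0;
    }

    // Assumption: a String-style polynomial hash over the characters.
    int result = 0;
    for (int i = 0; i < str.length(); i++) {
      result = 31 * result + str.charAt(i);
    }
    return result;
  }
}
```

Without the guard, `str.length()` would throw a NullPointerException as soon as hashCode was called on a wrapper holding null.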

public class TestCharSequenceWrapper {

@Test
public void nullWrapper() {
Contributor Author

This test uncovered an NPE in JavaHashes.

@nastra nastra requested a review from rdblue March 22, 2024 14:23

private CharSequence wrapped;
// lazily computed & cached hashCode
private transient int hashCode = 0;
Contributor

Hmm, I'm not sure I'd set a default value of 0, since that can technically be a legitimate hashCode value for some sequence of characters, no?
I think I'd make this an Integer hashCode and use null to indicate that nothing is cached yet.

Contributor

But good call making it transient. That's important, since the cached value shouldn't be serialized and then reused across JVMs.
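
To illustrate the point about transient, here is a small sketch showing that Java serialization skips a transient cache field, so a deserialized copy starts with an empty cache and recomputes on demand. The class `TransientCacheSketch` and its accessor are hypothetical names for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical serializable holder; not the class under review. */
final class TransientCacheSketch implements Serializable {
  private static final long serialVersionUID = 1L;

  private final String value;
  // transient: not written by ObjectOutputStream, so it is 0 after deserialization
  private transient int cachedHash;

  TransientCacheSketch(String value) {
    this.value = value;
  }

  @Override
  public int hashCode() {
    int hash = cachedHash;
    if (hash == 0) {
      hash = value.hashCode();
      cachedHash = hash;
    }
    return hash;
  }

  // Exposes the raw cache field for the demonstration below.
  int cachedHashField() {
    return cachedHash;
  }

  // Serializes and deserializes in memory, wrapping checked exceptions.
  static TransientCacheSketch roundTrip(TransientCacheSketch original) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(original);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (TransientCacheSketch) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

After `roundTrip`, the copy's cache field is back at its default, and the first `hashCode()` call on the copy repopulates it.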

Contributor Author

@nastra nastra Apr 16, 2024

Ah, good point, thanks for finding that. Instead of making this an Integer, I introduced a boolean flag. The String implementation does the same thing so that the hash isn't recalculated when the hash is actually 0.
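
A minimal sketch of the boolean-flag approach mentioned here, modeled on how recent JDK versions of java.lang.String track a zero hash. The class name `FlaggedHashSketch`, the field names, and the `computations` counter are assumptions for illustration and may differ from what was actually committed.

```java
/** Hypothetical wrapper for illustration; field names are assumptions. */
final class FlaggedHashSketch {
  private final CharSequence wrapped;

  private int hash;           // cached value, meaningful only when non-zero
  private boolean hashIsZero; // true once the hash was computed and is really 0
  int computations;           // demo-only counter, not part of the pattern

  FlaggedHashSketch(CharSequence wrapped) {
    this.wrapped = wrapped;
  }

  @Override
  public int hashCode() {
    int h = hash; // single read of the cached field
    if (h == 0 && !hashIsZero) {
      h = computeHash(wrapped);
      computations++;
      if (h == 0) {
        hashIsZero = true; // remember that 0 is the real hash
      } else {
        hash = h;
      }
    }
    return h;
  }

  // String-style polynomial hash, chosen only for this sketch.
  private static int computeHash(CharSequence seq) {
    int h = 0;
    for (int i = 0; i < seq.length(); i++) {
      h = 31 * h + seq.charAt(i);
    }
    return h;
  }
}
```

With a plain 0 sentinel, a sequence whose hash is genuinely 0 (the empty sequence in this sketch) would be rehashed on every call. The flag avoids that without boxing the cached value into an Integer.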

Contributor

Awesome, maybe just comment in the code that we're following what java.lang.String does

Contributor Author

good point, I've added a comment mentioning that this follows java.lang.String

@nastra nastra force-pushed the charseqwrapper-hashcode branch from 93fe63f to 3eb4bf4 Compare April 16, 2024 12:53
@nastra nastra requested a review from amogh-jahagirdar April 16, 2024 12:55
Contributor

@amogh-jahagirdar amogh-jahagirdar left a comment

This looks great to me!


@nastra nastra force-pushed the charseqwrapper-hashcode branch from 3eb4bf4 to ba1127d Compare April 22, 2024 06:41
@nastra nastra merged commit a23021d into apache:main Apr 22, 2024
@nastra nastra deleted the charseqwrapper-hashcode branch April 22, 2024 07:42
sasankpagolu pushed a commit to sasankpagolu/iceberg that referenced this pull request Oct 27, 2024
zachdisc pushed a commit to zachdisc/iceberg that referenced this pull request Dec 23, 2024