
Use php_random_bytes() for uniqid() #2123

Closed
wants to merge 1 commit into from

@yohgaki yohgaki commented Sep 9, 2016

RFC: https://wiki.php.net/rfc/uniqid

uniqid() currently uses php_combined_lcg() for its "more entropy" option. This PR uses php_random_bytes() instead, because it is a much better entropy source than php_combined_lcg().

Additionally, this PR enables the $more_entropy option by default, which makes the default behavior much faster.

Entropy space (php_combined_lcg() output is also of very poor quality)

More entropy should mean "more randomness".

  • php_combined_lcg() : About 1 billion
  • php_random_bytes() : 2^50, about 1.1 million billion (1.13 × 10^15)

The new code has a much larger entropy space and better quality.
Although uniqid() should never be used for crypto-related purposes even with this PR, the risk from misuse is greatly reduced.
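
For scale, the two entropy-space figures can be compared directly. This is only a worked calculation; it takes "about 1 billion" to mean 10^9, which is an assumption, since the thread does not give an exact count for php_combined_lcg().

```latex
\begin{align*}
2^{50} &= 2^{20} \cdot 2^{30} = 1{,}048{,}576 \times 1{,}073{,}741{,}824 \\
       &= 1{,}125{,}899{,}906{,}842{,}624 \approx 1.13 \times 10^{15}, \\
\frac{2^{50}}{10^{9}} &\approx 1.13 \times 10^{6}.
\end{align*}
```

Under that assumption the new entropy space is roughly a million times larger than the old one.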

Changed

  • uniqid() adds more entropy by default.
  • More entropy is generated by php_random_bytes() rather than php_combined_lcg().
  • More entropy uses digits and alphabetic characters and does not use '.'. The old code uses digits and '.'.

Not changed.

  • Return value length is the same regardless of more entropy parameter.
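
One way random bytes could be turned into a digit/alphabet suffix is sketched below. This is not the PR's code: the loop body is not shown in this thread, so the 32-symbol alphabet, the 5-bit mask, and the names sketch_alphabet, sketch_encode and SKETCH_RAND_CHARS are all hypothetical. The sketch assumes 10 output characters at 5 bits each, which would be consistent with the 2^50 figure quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constant: number of suffix characters (assumed, not from the PR). */
#define SKETCH_RAND_CHARS 10

/* Hypothetical 32-symbol alphabet: digits plus 'a'..'v', no '.' character. */
static const char sketch_alphabet[] = "0123456789abcdefghijklmnopqrstuv";

/*
 * Map n raw random bytes to n characters of the alphabet.
 * Only the low 5 bits of each byte are used, so every symbol is equally
 * likely when the input bytes are uniform. out must hold n + 1 bytes.
 */
static void sketch_encode(const unsigned char *raw, size_t n, char *out)
{
    assert(raw != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        out[i] = sketch_alphabet[raw[i] & 0x1f];
    }
    out[n] = '\0';
}
```

In real use the raw buffer would be filled from a CSPRNG (php_random_bytes() in the PR); masking to a power-of-two alphabet size avoids the bias a plain modulo by 36 would introduce.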

BC

Performance

With more entropy, uniqid() does not call usleep(1), so it is about 20x faster than without it. php_combined_lcg() is a simple arithmetic function, so the old more-entropy code is about 2x faster than the new php_random_bytes() code.

With entropy. (More entropy is TRUE in new code by default)

[yohgaki@dev github-php-src]$ time ./php-bin -r 'for($i=0; $i<100000;$i++) uniqid("", TRUE);'

real 0m0.272s
user 0m0.068s
sys 0m0.204s

No entropy.

[yohgaki@dev github-php-src]$ time ./php-bin -r 'for($i=0; $i<100000;$i++) uniqid("", FALSE);'

real 0m5.561s
user 0m0.103s
sys 0m0.309s

With entropy by php_combined_lcg()

[yohgaki@dev PHP-master]$ time ./php-bin -r 'for($i=0; $i<100000;$i++) uniqid("", TRUE);'

real 0m0.124s
user 0m0.074s
sys 0m0.050s
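
The "20x" and "2x" figures can be checked from the wall-clock ("real") times above. The sketch below only redoes that arithmetic; the helper name per_call_us and the constant names are made up for illustration.

```c
#include <assert.h>

/* Average wall-clock time per call, in microseconds. */
static double per_call_us(double total_seconds, long calls)
{
    assert(calls > 0);
    return total_seconds * 1e6 / (double)calls;
}

/* "real" times quoted in the benchmark runs, 100000 iterations each. */
static const long   CALLS         = 100000;
static const double NEW_ENTROPY_S = 0.272; /* patched build, more entropy (php_random_bytes) */
static const double NO_ENTROPY_S  = 5.561; /* patched build, no entropy (usleep(1) path)     */
static const double OLD_ENTROPY_S = 0.124; /* master, more entropy (php_combined_lcg)        */

/*
 * Derived values:
 *   per_call_us(NEW_ENTROPY_S, CALLS)  is about  2.7 us
 *   per_call_us(NO_ENTROPY_S,  CALLS)  is about 55.6 us
 *   per_call_us(OLD_ENTROPY_S, CALLS)  is about  1.2 us
 *   NO_ENTROPY_S  / NEW_ENTROPY_S      is about 20.4
 *   NEW_ENTROPY_S / OLD_ENTROPY_S      is about  2.2
 */
```

In the no-entropy run, user plus sys time is only about 0.41 s of the 5.56 s total, so most of the wall-clock time is spent waiting rather than computing, which fits the usleep(1) explanation.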


     /* The max value usec can have is 0xF423F, so we use only five hex
      * digits for usecs.
      */
     if (more_entropy) {
-        uniqid = strpprintf(0, "%s%08x%05x%.8F", prefix, sec, usec, php_combined_lcg() * 10);
+        uniqid = strpprintf(0, "%s%08x%05x%s", prefix, sec, usec, bytes);

I can already see half the world breaking here...

@yohgaki yohgaki Sep 10, 2016

@Ocramius
Why? It could break some code, but almost no code depends on the format of uniqid()'s return value.
e.g. https://searchcode.com/?q=uniqid&loc=0&loc2=10000&lan=24

@smalyshev

Why? The manual clearly states it's not crypto-secure.

@yohgaki
yohgaki commented Sep 10, 2016

@smalyshev The new code is not crypto-secure either. It improves the "more_entropy" option. We no longer have to keep using a poor entropy source.

@smalyshev

Yes, but why improve it? What does it make possible now that wasn't possible before? Why does it matter which entropy source it uses? What is the practical value?

@yohgaki
yohgaki commented Sep 10, 2016

@smalyshev My main objective is to mitigate the risk of misuse. Fault tolerance is an important security factor. It should not be used for crypto purposes anyway, though. It's also faster :)

@yohgaki
yohgaki commented Sep 10, 2016

@smalyshev Even if a user misuses uniqid() and attackers find the exploitable code, with this change attackers will probably not bother trying to exploit it. Let's be more forgiving of users' mistakes. Misuse still seems to be common.

@smalyshev

I don't understand. You admit it's not secure, so what's the improvement? That if you severely misuse it, such as by using it in a security context, it gives you a bit more of a false sense of security?

@yohgaki
yohgaki commented Sep 10, 2016

@smalyshev I agree that we shouldn't give a false sense of security.

Think of this as similar to hardening unserialize(). IMO, users should never unserialize external data without validation such as an HMAC (the manual states this too; I added it), but users do it anyway, so you fixed many unserialize vulnerabilities. I don't think those fixes gave a false sense of security, and neither does this change.

@@ -72,12 +74,17 @@ PHP_FUNCTION(uniqid)
     gettimeofday((struct timeval *) &tv, (struct timezone *) NULL);
     sec = (int) tv.tv_sec;
     usec = (int) (tv.tv_usec % 0x100000);
+    php_random_bytes(bytes, UNIQID_RAND_CHARS*2, 1);
+    for (i = 0; i < UNIQID_RAND_CHARS; i++) {

@CarwynNelson CarwynNelson Sep 21, 2016

I'm by no means knowledgeable on security (in fact I know almost nothing about it). But doesn't building up the uniqid like this (assuming that's what it's doing) open up a user to timing attacks?

EDIT: In other words, shouldn't the uniqid be built up in constant time?

@yohgaki yohgaki Sep 22, 2016

@CarwynNelson
uniqid() produces a timestamp-based unique ID. This is what it is supposed to do. (AFAIK, it was made for SMTP message IDs, which are timestamp-based.)
The "more entropy" option should do a better job. This RFC improves the "more entropy" option by replacing php_combined_lcg() with php_random_bytes().

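As a rough illustration of the timestamp-based part of the ID, the sketch below formats seconds and microseconds with the "%08x%05x" pattern that appears in the format strings quoted in this thread. The helper name sketch_timestamp_id is hypothetical, and the sketch uses plain snprintf rather than PHP's strpprintf.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write the 13-character timestamp part of an ID.
 * sec is printed as 8 hex digits and usec as 5 hex digits, since the
 * largest microsecond value, 999999, is 0xF423F.
 * out must hold at least 14 bytes. Returns the number of characters written.
 */
static int sketch_timestamp_id(unsigned int sec, unsigned int usec,
                               char *out, size_t out_len)
{
    assert(out != NULL && out_len >= 14);
    assert(usec <= 0xF423F);
    return snprintf(out, out_len, "%08x%05x", sec, usec);
}
```

Two calls within the same microsecond would produce the same 13 characters, which is why the timestamp alone is not enough and the "more entropy" suffix (or a short sleep) is needed to keep IDs distinct.
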
@smalyshev smalyshev added the RFC label Oct 30, 2016
@KalleZ
KalleZ commented Mar 2, 2019

Closing this due to inactivity. Please open a new PR and link to this if active work is being put back into it.

@KalleZ KalleZ closed this Mar 2, 2019
5 participants