Skip to content

std::rand: correct an off-by-one in the Ziggurat code. #10196

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 1, 2013

Conversation

huonw
Copy link
Member

@huonw huonw commented Oct 31, 2013

The code was using (in the notation of Doornik 2005) f(x_{i+1}) - f(x_{i+2}) rather than f(x_i) - f(x_{i+1}). This corrects that, and
removes the F_DIFF tables which caused this problem in the first place.

They F_DIFF tables are a micro-optimisation (in theory, they could
easily be a micro-pessimisation): that if gets hit about 1% of the
time for Exp/Normal, and the rest of the condition involves RNG calls
and a floating point exp, so it is unlikely that saving a single FP
subtraction will be very useful (especially as more tables means more
memory reads and higher cache pressure, as well as taking up space in
the binary (although only ~2k in this case)).

Closes #10084. Notably, unlike that issue suggests, this wasn't a
problem with the Exp tables. It affected Normal too, but since it is
symmetric, there was no bias in the mean (as the bias was equal on the
positive and negative sides and so cancelled out) but it was visible as
a variance slightly lower than it should be.

New plot:

exp-density

I've started writing some tests in huonw/random-tests (not in the main repo because they can and do fail occasionally, due to randomness, but it is on Travis and Rust-CI so it will hopefully track the language), unsurprisingly, they're currently failing (note that both exp and norm are failing, the former due to both mean and variance the latter due to just variance), but pass at the 0.01 level reliably with this change.

(Currently the only test is essentially a quantitative version of the plots I've been showing, which is run on the f64 Rand instance (uniform 0 to 1), and the Normal and Exp distributions.)

The code was using (in the notation of Doornik 2005) `f(x_{i+1}) -
f(x_{i+2})` rather than `f(x_i) - f(x_{i+1})`. This corrects that, and
removes the F_DIFF tables which caused this problem in the first place.

They `F_DIFF` tables are a micro-optimisation (in theory, they could
easily be a micro-pessimisation): that `if` gets hit about 1% of the
time for Exp/Normal, and the rest of the condition involves RNG calls
and a floating point `exp`, so it is unlikely that saving a single FP
subtraction will be very useful (especially as more tables means more
memory reads and higher cache pressure, as well as taking up space in
the binary (although only ~2k in this case)).

Closes rust-lang#10084. Notably, unlike that issue suggests, this wasn't a
problem with the Exp tables. It affected Normal too, but since it is
symmetric, there was no bias in the mean (as the bias was equal on the
positive and negative sides and so cancelled out) but it was visible as
a variance slightly lower than it should be.
bors added a commit that referenced this pull request Nov 1, 2013
The code was using (in the notation of Doornik 2005) `f(x_{i+1}) -
f(x_{i+2})` rather than `f(x_i) - f(x_{i+1})`. This corrects that, and
removes the F_DIFF tables which caused this problem in the first place.

They `F_DIFF` tables are a micro-optimisation (in theory, they could
easily be a micro-pessimisation): that `if` gets hit about 1% of the
time for Exp/Normal, and the rest of the condition involves RNG calls
and a floating point `exp`, so it is unlikely that saving a single FP
subtraction will be very useful (especially as more tables means more
memory reads and higher cache pressure, as well as taking up space in
the binary (although only ~2k in this case)).

Closes #10084. Notably, unlike that issue suggests, this wasn't a
problem with the Exp tables. It affected Normal too, but since it is
symmetric, there was no bias in the mean (as the bias was equal on the
positive and negative sides and so cancelled out) but it was visible as
a variance slightly lower than it should be.

New plot:

![exp-density](https://f.cloud.github.com/assets/1203825/1445796/42218dfe-422a-11e3-9f98-2cd146b82b46.png)

I've started writing some tests in [huonw/random-tests](https://github.com/huonw/random-tests) (not in the main repo because they can and do fail occasionally, due to randomness, but it is on Travis and Rust-CI so it will hopefully track the language), unsurprisingly, they're [currently failing](https://travis-ci.org/huonw/random-tests/builds/13313987) (note that both exp and norm are failing, the former due to both mean and variance the latter due to just variance), but pass at the 0.01 level reliably with this change.

(Currently the only test is essentially a quantitative version of the plots I've been showing, which is run on the `f64` `Rand` instance (uniform 0 to 1), and the Normal and Exp distributions.)
@bors bors closed this Nov 1, 2013
@bors bors merged commit 1e2283d into rust-lang:master Nov 1, 2013
@huonw huonw deleted the fix-zig branch November 25, 2013 10:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

std::rand::distributions::Exp1 is biased
3 participants