
BENCH: Split bench_function_base.Sort into Sort and SortWorst. #11889


Merged
merged 1 commit into numpy:master on Sep 22, 2018

Conversation

michaelsaah
Contributor

SortWorst houses the worst-case setup code and the time_sort_worst method.
Everything else remains in Sort.
Done in response to issue #11875

@charris
Member

charris commented Sep 11, 2018

I note that the same data is passed in every time. Is it possible to split that out into class attributes and just make copies in the setup functions? @pv Thoughts?

@charris
Member

charris commented Sep 17, 2018

Wow, looks like asv reloads the whole module for every test run. That is just horribly wrong. Anyway, I can get a 3x speedup of the sort benchmark with the following.

# Benchmark comes from the benchmark suite's common module.
from .common import Benchmark
import numpy as np


class Sort(Benchmark):
    is_first = True

    def make_data(self):
        # Generate the test arrays once and cache them as class attributes;
        # setup() then only has to copy them on each call.
        Sort.e = np.arange(10000, dtype=np.float32)
        Sort.o = np.arange(10001, dtype=np.float32)
        np.random.seed(25)
        np.random.shuffle(Sort.o)

        # quicksort implementations can have issues with equal elements
        Sort.equal = np.ones(10000)
        Sort.many_equal = np.sort(np.arange(10000) % 10)

        # quicksort median of 3 worst case
        Sort.worst = np.arange(1000000)
        x = Sort.worst
        while x.size > 3:
            mid = x.size // 2
            x[mid], x[-2] = x[-2], x[mid]
            x = x[:-2]

    def setup(self):
        if Sort.is_first:
            self.make_data()
            Sort.is_first = False
        self.e = Sort.e.copy()
        self.o = Sort.o.copy()
        self.equal = Sort.equal.copy()
        self.many_equal = Sort.many_equal.copy()
        self.worst = Sort.worst.copy()

    ...

How does that compare? I expect that, with your split on top of this, it will be even faster.

@michaelsaah
Contributor Author

michaelsaah commented Sep 17, 2018

@charris I got a 7.25x speedup on just the setup calls; I didn't time the entire benchmark. I suspect your strategy is marginally slower than mine in this case, since there's still the overhead of copying Sort.worst for every method. I don't think combining them would yield a significant boost over mine, since the other test arrays are fairly small. Could be wrong though.

Maybe the most coherent strategy given how asv does things would be to enforce one test method per class. So the resulting Sort suite would have the classes:
Sort
SortManyEqual
SortWorst
SortArgsort
SortArgsortRandom

or some other naming convention. This would prevent redundant setup work.
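
A minimal sketch of what that one-benchmark-per-class layout could look like (the setup bodies below are illustrative, reusing arrays already shown in this thread; the real classes would also subclass Benchmark from the benchmarks' common module):

import numpy as np

class SortWorst:
    def setup(self):
        # quicksort median-of-3 worst case, generated only when this
        # particular benchmark runs
        self.worst = np.arange(1000000)
        x = self.worst
        while x.size > 3:
            mid = x.size // 2
            x[mid], x[-2] = x[-2], x[mid]
            x = x[:-2]

    def time_sort_worst(self):
        np.sort(self.worst)

class SortManyEqual:
    def setup(self):
        # cheap setup stays with its own benchmark, so it never pays
        # for the expensive worst-case generation above
        self.many_equal = np.sort(np.arange(10000) % 10)

    def time_sort_many_equal(self):
        np.sort(self.many_equal)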

Also relevant is what's at stake. I'm not too familiar with how these benchmarks are used, maybe it's not worth the work.

@charris
Member

charris commented Sep 17, 2018

I checked with your PR, and it gives another 2x improvement in the time_sort_worst time, at least on my machine.

I'm not sure how far it is worth pursuing, but it is worth understanding how to improve the performance.

@pv
Member

pv commented Sep 17, 2018

If you want to go fancy, I imagine you can write a metaclass in 10 lines that does name-based setup assignment, setup_SOMETHING <-> time_SOMETHING (and transforms the classes to singleton classes).
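
A rough sketch of that idea (untested; it assumes asv picks up a setup attribute attached to an individual benchmark method, which would need checking, and all names are illustrative rather than anything in this PR):

import numpy as np

class PairedSetupMeta(type):
    # For every time_<name> method, attach the matching setup_<name>
    # function as that benchmark's own setup attribute.
    def __new__(mcls, clsname, bases, ns):
        cls = super().__new__(mcls, clsname, bases, ns)
        for attr, func in ns.items():
            if attr.startswith('time_'):
                paired = ns.get('setup_' + attr[len('time_'):])
                if paired is not None:
                    func.setup = paired
        return cls

class Sort(metaclass=PairedSetupMeta):
    def setup_sort_worst(self):
        # expensive data generation runs only for time_sort_worst
        self.worst = np.arange(1000000)

    def time_sort_worst(self):
        np.sort(self.worst)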

@michaelsaah
Contributor Author

@charris a 2x speedup in the setup? or in the entire benchmark?

That's interesting. I had assumed asv was just running each test method once, but if it's doing repeated calls to setup() and then time_sort_worst(), your speedup on SortWorst would make sense.

I tried digging into the asv codebase but didn't get very far.

@pv
Member

pv commented Sep 17, 2018

@charris
Member

charris commented Sep 17, 2018

Building, see build.log...
Build OK
· Discovering benchmarks
· Running 1 total benchmarks (1 commits * 1 environments * 1 benchmarks)
[  0.00%] ·· Benchmarking existing-py_usr_bin_python
[ 50.00%] ··· Running (bench_function_base.Sort.time_sort_worst--).
[100.00%] ··· bench_function_base.Sort.time_sort_worst                                                                                             61.9±0.4ms

real	0m3.738s
user	0m2.873s
sys	0m0.455s

@charris
Member

charris commented Sep 17, 2018

> I had assumed asv was just running each test method once

More like six times. You can do a print in the setup function to see that.
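
For example, something like this at the top of the existing setup() makes the call count visible in the console:

def setup(self):
    print('setup called')    # temporary: count how many times asv invokes setup()
    ...                      # existing setup body unchanged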

@michaelsaah
Contributor Author

@pv yes, I read through that. What I couldn't figure out was how many times a given test method is called by asv. If setup precedes time_sort_worst on every call, and time_sort_worst is called 100 times, then caching the setup data makes sense. If time_sort_worst is only called once, then it doesn't, but it does make sense to split the slower data generation out from the other test methods.

@michaelsaah
Contributor Author

@charris ok, that clears that up, thanks. How do you propose we move forward? I can rewrite my PR to integrate your strategy.

@michaelsaah
Contributor Author

@charris Actually, looking back at my notes, your strategy can probably be used to speed up the other slow setups as well. I'll look into that.

@charris
Member

charris commented Sep 17, 2018

I think your improvement is probably adequate, as we aren't really pressed for performance yet. But I'd like to fool around with this a bit more. One strange thing happening with the latest asv 0.3 is that an error message from the totally unrelated bench_ufunc.py file gets printed for every test function. The _arg ufunc is missing from the list of functions to benchmark.

@michaelsaah
Contributor Author

@charris yeah, I was getting that too. I seem to remember doing something that mitigated it, but it was a couple of weeks ago and I didn't make a note of it. I'll look at the code and let you know if I remember anything.

Thanks for the discussion, I definitely understand this better now.

@charris charris merged commit 611bd6c into numpy:master Sep 22, 2018
@charris
Member

charris commented Sep 22, 2018

Thanks @michaelsaah. I'll leave further optimizations to you if you want to pursue them.

I fixed the asv messages in bench_ufunc.py by making an exception for _arg, but just locally for the time being, as it seems like something asv should not be doing in the first place. I'm also thinking we should remove that function, as it was only added for testing purposes long ago.
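
For what it's worth, the kind of exception described above might look roughly like this (hypothetical sketch; the actual discovery check and list in bench_ufunc.py may be structured differently):

import numpy as np

# ufuncs that already have benchmarks (illustrative subset, not the real list)
benchmarked = {'add', 'multiply', 'sin'}
# private/test-only ufuncs deliberately excluded from the warning
ignored = {'_arg'}

for name in dir(np.core.umath):
    obj = getattr(np.core.umath, name, None)
    if isinstance(obj, np.ufunc) and name not in benchmarked | ignored:
        print("Missing ufunc %r" % (name,))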
