adding tutorial for creating a Moore's law linear regression #31
Check out this pull request on ReviewNB. See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB
I think the failure might be related to the outputs from the notebook. Can you try removing all outputs before submitting and see if that helps? Thanks!
@melissawm, I ran
Clearly a lot of thought and effort went into this! I have just a few comments on the opening section. Regarding this sentence
I'd propose this opening:
What you'll do
In 1965, engineer Gordon Moore predicted that transistors on a chip would double every two years in the coming decade. We'll compare that against actual transistor counts in the 53 years following his prediction.
Skills you'll learn
What you'll need
imported with the commands
We'll be using these libraries from those packages:
I don't think I allowed for this in my tutorial guidelines, but since there are many steps it might be good to summarize them first -- possibly just with a table of contents.
@bjnath, I really like the suggested edits. Thanks! I can update the tutorial. @melissawm, I think the failed checks now stem from the missing
It looks like the subsequent errors come from missing
I just opened #32 to fix that, should work. Thanks @bjnath for the suggestions, @cooperrc you can update the notebook and as soon as #32 is merged we should have no other problems.
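Would it be clearer to start from the law itself?
Building Moore's law
Making "doubles every two years" into an equation: if the transistor count in year $Y$ is $T$, in year $n$ it will be
$$T_n = T \cdot 2^{(n - Y)/2}$$
That is, two years later $T$ is multiplied by a factor of $2^{2/2} = 2$, in four years by $2^{4/2} = 4$, and so on. If we plot this, it's an exponential curve. But we can make it a straight line by taking the log of both sides:
$$\log_2 T_n = \log_2 T + \frac{n - Y}{2}$$
This has the familiar form $y = ax + b$, where $y = \log_2 T_n$, $x = n$, $a = 1/2$, and $b = \log_2 T - Y/2$. There were 2250 transistors in 1971, so $T = 2250$, $Y = 1971$, giving the straight-line formula
$$\log_2 T_n = \frac{n - 1971}{2} + \log_2 2250$$
It's more convenient to work with base-10 logs than base-2 logs; since $\log_{10} x = \log_{10}(2)\,\log_2 x$, we can multiply both sides by $\log_{10}(2)$ to get the equivalent formula
$$\log_{10} T_n = \frac{\log_{10} 2}{2}\,(n - 1971) + \log_{10} 2250$$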
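As a quick numerical check of the "doubles every two years" rule discussed above, here is a minimal sketch. It assumes the 1971 reference point of 2250 transistors quoted in the comment; the function name `moores_law` and the variable names are illustrative and not taken from the tutorial.

```python
import numpy as np

# Reference point quoted in the comment: 2250 transistors in 1971.
T0 = 2250
Y0 = 1971


def moores_law(year, t0=T0, y0=Y0):
    """Transistor count predicted by doubling every two years (illustrative helper)."""
    return t0 * 2.0 ** ((np.asarray(year, dtype=float) - y0) / 2.0)


years = np.arange(1971, 1982, 2)  # 1971, 1973, ..., 1981
counts = moores_law(years)

# On a log2 scale the prediction is a straight line with slope 1/2 per year.
slope_log2 = np.diff(np.log2(counts)) / np.diff(years)

# Switching to base 10 only rescales that slope by the constant log10(2).
slope_log10 = np.diff(np.log10(counts)) / np.diff(years)
```

Both `slope_log2` and `slope_log10` are constant arrays, which is the sense in which the exponential curve becomes a straight line after taking logs.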
I had considered this too, but I thought it would be worthwhile to keep the two functions as similar as possible, i.e. linear on a natural log scale. You have to go through a few steps to get to log2, log10, and loge, so I just kept it as loge, since that is the default NumPy/Mathematics logarithm. In engineering, loge is ln and log is log10, so there is that extra layer of confusion.
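To illustrate the point about log bases, here is a minimal sketch. `np.log` is the natural log, and a straight-line fit on the natural-log scale converts to base 10 by a constant factor. The data are synthetic (an exact doubling-every-two-years rule plus random noise), not the real transistor counts, and `np.polyfit` is used here only as a stand-in for whatever fitting routine the tutorial ends up using.

```python
import numpy as np

# NumPy's default log is base e; base-2 and base-10 have their own functions.
assert np.isclose(np.log(np.e), 1.0)
assert np.isclose(np.log2(8.0), 3.0)
assert np.isclose(np.log10(100.0), 2.0)

# Synthetic data (assumption): exact Moore's-law trend plus Gaussian noise
# on the natural-log scale. These are NOT real transistor counts.
rng = np.random.default_rng(0)
years = np.arange(1971, 2020)
x = years - 1971  # shift the origin so the fit is well conditioned
log_counts = np.log(2250) + (np.log(2) / 2) * x + rng.normal(0.0, 0.3, x.size)

# Straight-line fit on the natural-log scale: ln(T) = slope * x + intercept.
slope, intercept = np.polyfit(x, log_counts, 1)

# Doubling time implied by the fitted slope, in years.
doubling_time = np.log(2) / slope

# The same line expressed in base 10 differs only by the factor 1 / ln(10).
slope_log10 = slope / np.log(10)
intercept_log10 = intercept / np.log(10)
```

Keeping everything in natural logs avoids the conversion factors until the very end, which appears to be the trade-off the comment describes.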
@cooperrc if you merge master now, the tests should pass. Sorry for taking so long!
Need to add statsmodels to dependencies
No need to apologize! I took some time to remove passive voice and add the docs/stable links in the notebook. Should be much better now. Plus, it passes all checks!
This looks good to me! I have only a few (final) minor comments, then I feel like this can be merged.
I think those are the final comments from me.
As per the discussion today, merging as-is, but there may be more tweaks
thanks @cooperrc
You're very welcome! Thanks @melissawm and @bjnath for your careful review!
@cooperrc @melissawm @rossbar The Markdown versions/conversions of GSoD notebooks, like Moore's Law or Deep Learning with MNIST, are effectively new PRs authored by the maintainers. So, would it be OK if we add the authors' names to
Thanks @8bitmp3, an author tag is definitely a good idea. I think this can already be specified in the notebook metadata, but we'd need to check that and then verify that the theme picks that up and displays it prominently in the article. If not, we can always just add our own to the template.
For the record, displaying author names used to be done in doc sections a very long time ago, and we removed it because it's usually a bad idea. There are a couple of reasons for that:
I may be missing something here, but authorship needs to be preserved across commits/reworks. If what happened is that a contributor wrote a lot of content but didn't end up in the commit log as an author, that's problematic. Even if you copy over content to a new file format, you should use
This is the step we were missing - it was simply something I failed to consider when converting from .ipynb -> .md formats. Any suggestions how to fix for the existing tutorials? We could amend the commit authors but that will change the hashes.
So in this case the MNIST tutorial looks fine to me. @8bitmp3 has many commits in master, and @melissawm has one for the
Yeah that's bad practice in general. Here we could still consider it, in case there's something important enough to fix. There are only 15 forks right now, so fixing those up by hand isn't too painful.
Yeah, the history as a whole should be fine, the commits related to the
I'm not sure that matters that much. People mostly look at number of commits or full history (e.g. Related: we need a new way at some point to acknowledge contributions on the website. The old The idea is that then we could simply have an issue there that maintainers can comment on whenever a new significant contribution is made:
I commented on #57 before reading this - still catching up. I did not realize this problem beforehand, sorry about that.
This tutorial builds a linear regression model based on historical semiconductor manufacturing data for the number of transistors per microchip. It teaches a typical user workflow by:
Some notes on this PR:
I do not know how to build intersphinx references for the functions I used. In the notebook I made HTML comments <!--need ... reference--> in each spot where I thought there should be a Sphinx link.
@mattip said our best practice is numpy.org/doc/stable, so I updated the references.
Thanks to @melissawm, @mattip, @bjnath, and @rgommers for guidance and initial reviews.