Jay Godse
Ruby Data Processing: Using Map, Reduce, and Select
Jay Godse
Kanata, Ontario, Canada
About the Author
Jay Godse is an active software and web applications developer with
expertise in Ruby, Rails, various databases, and Ansible. He is also an
active contributor on Stack Overflow. He graduated with an
engineering degree and then went to work as a digital circuit designer.
After a year of that, he switched to software development, and he has
been there ever since in some form. His early work was mostly real-time
telecommunication device control and provisioning using languages
such as C and Protel. He then transitioned into designing distributed
computing systems using languages such as C++. After that, he moved into
web applications. Along the way, he did stints as a software development
manager and a software architect. But for the last nine years, he has
written web applications in Ruby and DevOps applications in Ansible and
PowerShell.
About the Technical Reviewer
Massimo Nardone has more than 24 years
of experience in security, web/mobile
development, cloud, and IT architecture. His
true IT passions are security and Android.
He has been programming and teaching
how to program with Android, Perl, PHP, Java,
VB, Python, C/C++, and MySQL for more than
20 years.
He holds a Master of Science in Computing
Science from the University of Salerno, Italy.
He has worked as a project manager, software engineer, research
engineer, chief security architect, information security manager, PCI/SCADA
auditor, and senior lead IT security/cloud/SCADA architect for many years.
Technical skills include security, Android, cloud, Java, MySQL, Drupal,
Cobol, Perl, web and mobile development, MongoDB, D3, Joomla,
Couchbase, C/C++, WebGL, Python, Pro Rails, Django CMS, Jekyll,
Scratch, and more.
He worked as a visiting lecturer and supervisor for exercises at the
Networking Laboratory of the Helsinki University of Technology (Aalto
University). He holds four international patents (in the PKI, SIP, SAML,
and Proxy areas).
He currently works as chief information security officer (CISO) for
Cargotec Oyj and is a member of the ISACA Finland Chapter Board.
Massimo has reviewed more than 45 IT books for different publishers,
and in addition to reviewing this book he is also the coauthor of
Pro Android Games (Apress, 2015).
Acknowledgments
I would like to acknowledge a few people who helped make me a better
programmer and a better writer. Thanks go to Kevin Szabo and Ronnie
Taylor, both of whom helped me be a better programmer, and to Christina
Hardy, who helped me to become a better writer. And thanks also go to
Mark Powers and the publication team at Apress who helped produce this
book.
Introduction
I wrote mostly reactive software for many years, but there was always
a user interface or reporting component that had string manipulation,
data synthesis, or data formatting. Since I was not trained as a computer
scientist, I did not learn of higher-order functions such as map(), reduce(),
and select() that were found in languages like Lisp or Smalltalk. Also,
Ruby, Python, C++, and JavaScript were not around when I was in school.
As a result, I struggled with cumbersome, error-prone imperative code for
some tasks.
A few years after learning Ruby, but while still using the imperative
programming style for data processing, I discovered the Ruby Enumerable
library and started using its higher-order functions, such as map(),
reduce(), and select(). What happened to my data-processing code?
However, I couldn’t find any but the most trivial examples of how to
use these functions to solve data-processing problems. It took me a lot
of time-consuming trial and error with map, reduce, and select to solve
these kinds of problems.
I decided to write this book to
I’ll also admit that I was forced to learn many of the programming
nuances of these functions as I wrote the examples.
Prerequisites
You should be familiar with some Ruby programming or at least some
programming in another language. If you don’t know Ruby, search out
the book Learn Ruby the Hard Way1 and go through it as prescribed by its
author, Zed Shaw.
If you are fluent with Python, you might benefit from this book if you
go slowly and look up Ruby information online as you move through
this book.
An internet connection helps if you want to search for online help
using Google, Bing, or another search engine.
You should have a computer with Ruby 2.2.x installed. Windows works,
as do Linux and Mac OS X.
You should have a good syntax-highlighting text editor. I recommend
Notepad++ on Windows or gedit on Linux or Macintosh, both of which are
free. Sublime works on Windows, Linux, or Macintosh and is slightly better,
but costs about $70 at the time of this writing.
1 http://learnrubythehardway.org/book/
CHAPTER 1
Basic Ruby
This section will acquaint or refresh you with basic ways to use the Ruby
command line, as well as some relevant Ruby coding.
If you are comfortable programming in Ruby and understand the Ruby
Enumerable Library reasonably well, you can skip this section.
C:\> irb
irb(main):001:0>
For brevity, I won’t write the full irb prompt every time.
You can execute Ruby statements line by line. The value returned by
each expression is preceded by =>.
irb> a = 4
=> 4
irb> a
=> 4
irb> a + 7
=> 11
You can put the code into a file called sample.rb, located in the same
directory or folder from which you ran irb.
[2,3,4,5].each do |n|
  if n%2 == 0
    puts "even"
  else
    puts "odd"
  end
end
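The step that loads the file is not shown here. A minimal sketch, assuming the standard Kernel#load method is what is meant: in irb you would type load "./sample.rb" from the directory that contains the file. So that the sketch runs on its own, it first writes the listing above to a temporary directory. The SAMPLE_SOURCE constant and the temporary path are illustrative assumptions, not part of the original text.

```ruby
require "tmpdir"

# The listing above, held in a string so this sketch can create the file itself.
SAMPLE_SOURCE = <<SAMPLE
[2,3,4,5].each do |n|
  if n % 2 == 0
    puts "even"
  else
    puts "odd"
  end
end
SAMPLE

Dir.mktmpdir do |dir|
  path = File.join(dir, "sample.rb")
  File.write(path, SAMPLE_SOURCE)
  # In irb, with sample.rb in the current directory, you would type:
  #   load "./sample.rb"
  load path
end
```

Loading the file runs it from top to bottom, so this prints even, odd, even, odd on separate lines.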
You could also copy the code block from your text editor and paste it
directly into your command line and get the same result, as long as you use
only spaces for indentation. (Using tabs for indentation works fine if
you load the file from the command prompt, but pasting tabs directly
into irb will generate errors.)
You can either type the code samples from this book into irb directly
or type them into a text editor and then load the file as just shown.
Object Scope
When you are in a Ruby program, the general method of executing a
method f on an object obj is as follows:
obj.f
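As a small illustration of this obj.f form, here are a few calls that use only core Ruby methods. The variable names are arbitrary and chosen for this sketch.

```ruby
word    = "ruby"
numbers = [3, 1, 2]

word_length = word.length    # calls length on a String object
sorted      = numbers.sort   # calls sort on an Array object
is_even     = 4.even?        # integers are objects too

puts word_length   # prints 4
p sorted           # prints [1, 2, 3]
puts is_even       # prints true
```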
String
Strings are basic constructs in all languages. Let’s look at a few basic
operations used in this book. Try them out on the irb command line for
yourself.
length or size
This yields the size of the string.
downcase
This converts all letters to lowercase.
upcase
This converts all letters to uppercase.
capitalize
This capitalizes the first letter of a string.
split()
This searches the string for the argument substring
and splits the string into an array comprising the
substrings on either side of each match; the matched
substring itself is discarded. Splitting on a single
space is a special case that splits on any run of
whitespace. If there are no matches, an array is
returned with the whole string. In this example,
base_string has two spaces between def and ghi:
irb> base_string = "abc def  ghi"
irb> base_string.split(" ")
=> ["abc","def","ghi"]
irb> base_string.split("  ")
=> ["abc def","ghi"]
irb> base_string.split(" d")
=> ["abc","ef  ghi"]
irb> base_string.split("efghi")
=> ["abc def  ghi"]
join()
string interpolation
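The string operations listed above can be tried together. The entries for join() and string interpolation have no description here, so the following sketch assumes the standard behavior of Array#join and of #{...} inside a double-quoted string. The sample strings are made up for illustration.

```ruby
title = "ruby DATA processing"

size        = title.length       # number of characters
lowered     = title.downcase     # every letter lowercased
raised      = title.upcase       # every letter uppercased
capitalized = title.capitalize   # first letter uppercased, the rest lowercased

# join() is the inverse of split(): it glues array elements into one string.
joined = ["abc", "def", "ghi"].join("-")

# Interpolation evaluates the Ruby expression inside #{...}.
language = "Ruby"
greeting = "Hello, #{language}! 2 + 3 = #{2 + 3}"

puts size, lowered, raised, capitalized, joined, greeting
```

Note that capitalize also lowercases everything after the first letter, so "DATA" becomes "data".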
Array
Arrays in Ruby are like arrays in other languages. They are a collection of
things indexed by a whole number (e.g., 0,1,2,…). In Ruby, an array can
contain any Ruby object at an index. Array indices in Ruby start at 0.
Arrays implement the Ruby Enumerable interface, so they will have
key methods such as each, map, reduce, select, and others.
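A short sketch of these points, using made-up values: an array holding different kinds of objects, zero-based indexing, and the Enumerable methods that this book concentrates on.

```ruby
# An array can hold any mix of Ruby objects, including other arrays and nil.
mixed = [42, "forty-two", :answer, [4, 2], nil]

first  = mixed[0]      # indices start at 0
nested = mixed[3]      # the element at index 3 is itself an array
last   = mixed[-1]     # negative indices count from the end
count  = mixed.length

# Enumerable methods available on every array.
doubled = [1, 2, 3].map { |n| n * 2 }
evens   = [1, 2, 3, 4].select { |n| n.even? }
total   = [1, 2, 3, 4].reduce(0) { |memo, n| memo + n }

p first, nested, last, count, doubled, evens, total
```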
Special Methods
compact()
irb> [1,nil,2,nil,nil,3,nil,[]].compact
=> [1,2,3,[]]
flatten()
irb> [[1,[2,3]],4,[5,6]].flatten
=> [1,2,3,4,5,6]
push()
irb> [1,2,3,4,5].push(666)
=> [1,2,3,4,5,666]
pop
unshift()
shift
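The entries for pop, unshift(), and shift have no examples here. The following sketch assumes the standard Array behavior: pop removes and returns the last element, unshift() adds an element at the front, and shift removes and returns the first element. All three modify the array in place.

```ruby
queue = [1, 2, 3, 4, 5]

popped = queue.pop        # removes and returns the last element
after_pop = queue.dup     # snapshot: [1, 2, 3, 4]

queue.unshift(0)          # adds 0 at the front
after_unshift = queue.dup # snapshot: [0, 1, 2, 3, 4]

shifted = queue.shift     # removes and returns the first element
after_shift = queue.dup   # snapshot: [1, 2, 3, 4]

p popped, after_pop, after_unshift, shifted, after_shift
```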
Hash
Another term for a hash is an “associative array” or even a “dictionary.”
Hashes are indexed by a key object, and there is a value (another object)
for each key object.
These are two ways of creating a new hash. The key of a hash can be
any object, such as a string or a symbol.
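The two ways referred to are not shown here. The following sketch shows two common ways to create a hash, a literal and Hash.new followed by assignment; these are assumptions and may not be the exact forms the text had in mind. The variable names are hypothetical.

```ruby
# 1. A hash literal, here with string keys.
by_literal = { "a" => 1, "b" => 2, "c" => 3 }

# 2. An empty hash filled by assignment, here with symbol keys.
by_assignment = Hash.new
by_assignment[:a] = 1
by_assignment[:b] = 2
by_assignment[:c] = 3

p by_literal["b"]       # looks up the value stored under the key "b"
p by_assignment[:c]     # looks up the value stored under the key :c
p by_literal.keys       # keys come back in insertion order
```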
=> ["The key is a and the value is 1", "The key is b and the
value is 2", "The key is c and the value is 3"]
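The output above matches what map returns when it is called on a hash with a block that takes the key and the value. The code that produced it is not shown, so this is a sketch under that assumption; the hash name letters is hypothetical.

```ruby
letters = { "a" => 1, "b" => 2, "c" => 3 }

# For a hash, the block receives each key and its value.
messages = letters.map do |key, value|
  "The key is #{key} and the value is #{value}"
end

puts messages
```

map always returns an array, even when it is called on a hash.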
Block-passing Syntax
Ruby is one of many languages that allow lexical closures, otherwise
known (kind of) as anonymous functions or blocks. These functions are
dynamically created. The function can take the form of a defined function.
For example:
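The definition itself is missing at this point. A minimal sketch, assuming that printit (the name used in the next example) is a lambda that prints its argument:

```ruby
# Assumed definition: a lambda stored in a variable.
printit = lambda { |x| puts x }

# A lambda can be called directly...
printit.call("hello")

# ...or converted to a block with & and handed to a method such as each.
[10, 20].each(&printit)
```

The & in front of the variable tells Ruby to use the lambda as the block for that method call.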
Now, suppose you want to print each element of a range. You will call
each on the range, which will run the block passed to each for each element
of the range.
irb> (1..5).each(&printit)
1
2
3
4
5
=> (1..5)
You could also pass an actual block of code to run. The block is
surrounded by do and end or { and }. Objects between the vertical bars are
optionally passed to the block (depending on the definition of the block),
and they are used by the block to execute. The return value of a block is the
value of the last expression in the block.
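The two block forms can be sketched as follows. The listing that belongs here is not shown, so the code below is an illustration of the syntax just described, not the original example.

```ruby
# do-end form: n is the object passed to the block on each iteration.
(1..5).each do |n|
  puts n
end

# Brace form: same behavior, usually used for one-liners.
(1..5).each { |n| puts n }

# The value of the last expression is what the block returns.
squares = (1..5).map do |n|
  doubled = n * 2   # an intermediate value, not returned
  n * n             # last expression: this is what map collects
end

p squares   # prints [1, 4, 9, 16, 25]
```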
You will get the same output as earlier. Ditto for the following:
In Ruby, map, reduce, and select all take a code block as a parameter,
and the code block is executed as defined by the function. In this book,
I will use the do-end syntax most often, and sometimes the { } syntax.
1,John
2,Jack
3,Jim
4,Jared
5,John
irb> names.split("\n")
or
irb> "John\r".chomp
=> "John"
irb> "John".chomp
=> "John"
So, for our array, just to ensure that we don’t pick up stray line feeds,
we do the following:
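The listing for this step is not shown. A minimal sketch, assuming names is a multi-line string like the sample data above and that it was read on a system that ends lines with "\r\n" (an assumption made here so that the effect of chomp is visible):

```ruby
# Hypothetical input with Windows-style line endings.
names = "1,John\r\n2,Jack\r\n3,Jim\r\n4,Jared\r\n5,John\r\n"

# Splitting on "\n" leaves a stray "\r" on each line; chomp removes it.
raw_lines   = names.split("\n")
clean_lines = raw_lines.map { |line| line.chomp }

p raw_lines.first     # prints "1,John\r"
p clean_lines.first   # prints "1,John"
```

Calling chomp on a line that has no trailing line ending returns it unchanged, so the extra map is harmless on systems that use "\n" alone.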
I assume that the examples in this book don’t use “\r\n”, but rather
just “\n”. If that doesn’t work on your operating system, use the chomp()
method as shown before working with the array.
CHAPTER 2
Function Overview
and Simple Examples
The Ruby library includes the module Enumerable. This library contains
map(), reduce(), select(), and other functions. This section will outline
the syntax and meanings of the different parts of code that use these three
functions.
If you can get through this section comfortably, both typing the code
into irb and understanding the results, then you will be in a good position
to deepen your understanding with the complex examples and reverse
engineering that follow in the next chapters.
Map
This function of the Ruby Enumerable library is simple but profound. The
map() method is applied to an array or a hash. The job of map() is to apply
a function or block to each member of the array and return a new array.
So, when you see
def f(x)
  x*x
end
output_array =
  [1,2,3,4,5].map do |number|
    f(number)
  end
you read, for each number in [1,2,3,4,5] apply f(x) to return the array
[f(1), f(2), f(3), f(4), f(5)]. In this case, the answer is [1,4,9,16,25].
One could encode it in traditional imperative programming as follows:
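output_array = []
for number in 1..5
  output_array.push( f(number) )
end
You can also put the calculation directly in the block instead of calling f:
output_array =
  [1,2,3,4,5].map do |number|
    number*number
  end
or, using braces:
output_array =
  [1,2,3,4,5].map{|element| element*element}
Each of these returns the following:
[1,4,9,16,25]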
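The block can also return a compound value for each element. The listing for that case is not shown here, so the following is a sketch under that assumption; the variable name points is hypothetical.

```ruby
# Each iteration returns a two-element array: the number and its square.
points = [1, 2, 3, 4, 5].map do |number|
  [number, number * number]
end

p points   # prints [[1, 1], [2, 4], [3, 9], [4, 16], [5, 25]]
```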
In this case, each “point” is a tuple (array) with an element and its
square. This returns the following:
[[1,1],[2,4],[3,9],[4,16],[5,25]]
The difference is that the block returns an array each time with the
number and its square, so the result is an array of arrays.
Since map returns an array, you can cascade map calls. For example,
you could have one map block square each element and then a second map
block add 100:
[1,2,3,4,5].map do |number|
  number*number
end.map do |square|
  square + 100
end
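To see what the cascade produces, it can help to run the two steps separately and compare them with the chained form. The variable names below are illustrative.

```ruby
# Step by step: the first map squares, the second adds 100.
squares = [1, 2, 3, 4, 5].map { |number| number * number }
shifted = squares.map { |square| square + 100 }

# The same work written as one chain.
chained = [1, 2, 3, 4, 5].map do |number|
  number * number
end.map do |square|
  square + 100
end

p squares   # prints [1, 4, 9, 16, 25]
p shifted   # prints [101, 104, 109, 116, 125]
p chained   # prints [101, 104, 109, 116, 125]
```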
Reduce
This function of the Ruby Enumerable library is more complex than map()
and is quite powerful. The method is applied to a collection (for example,
an array or a hash). The job of reduce() is to apply a function cumulatively
to each member of the collection and then return an object (which could
be an array, a single value, a hash, or anything).
For example:
[1,2,3,4,5].reduce(0, :+)
This is like saying for each member of the list, add (:+) the member to
memo and save the memo for the next element. The initial value of memo is 0
(the first parameter). Another way to code it is as follows:
memo = 0
[1,2,3,4,5].each{|element| memo = memo.+(element)}
Or, you can pass reduce a block of code to run. The block comes after
the closing parenthesis and is surrounded by a do-end construct or a {}
construct. The block parameters are always the memo first and the element
second. (In this case, the initial value of memo is 33.)
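The listing for this case is not shown. A sketch that is consistent with the description, an initial memo of 33 and a block that adds each element to the memo:

```ruby
total = [1, 2, 3, 4, 5].reduce(33) do |memo, element|
  memo + element   # the block's value becomes the next memo
end

p total   # prints 48
```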
=> 48
(You don’t have to call them “memo” or “element,” but that is how they
behave.) After each iteration, the memo takes the value of the last
expression evaluated in the block. For example:
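The listing is missing here as well. A sketch, assuming a block whose last expression is a constant, which shows that the sum computed on the line before it is simply thrown away:

```ruby
result = [1, 2, 3, 4, 5].reduce(33) do |memo, element|
  memo + element   # computed, then discarded
  77.7             # last expression: this becomes the memo
end

p result   # prints 77.7
```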
=> 77.7
This returned 77.7, because that was the value of the last expression in the block.
Another thing to remember is that the value returned by the block becomes
the memo for the next iteration, so it should be the same kind of object as
the memo passed in by reduce. For example, when you are using reduce() to
build a Hash object, you must return the whole Hash object:
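The listing is not shown here. A minimal sketch, assuming the goal is a hash that maps each number to its square (the choice of data is made up for illustration):

```ruby
squares = [1, 2, 3, 4, 5].reduce({}) do |memo, element|
  memo[element] = element * element
  memo   # return the whole Hash, not the value just assigned
end

p squares[4]     # prints 16
p squares.keys   # prints [1, 2, 3, 4, 5]

# Without the final memo line, the block returns the assigned Integer,
# and the next iteration tries to call []= on that Integer.
failure = begin
  [1, 2, 3].reduce({}) { |memo, element| memo[element] = element * element }
rescue NoMethodError => error
  error.class
end

p failure   # prints NoMethodError
```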
This worked because memo is a Hash object and was returned every time.
You can think of the reduce operation as operating on each element
and saving some kind of memo to carry forward to the operation on the
next element. In the preceding example, where you calculate the sum of
five numbers,
[1,2,3,4,5].reduce(0,:+)
or