SQL For Data Analysis Advanced Techniques For Transforming Data Into Insights 1nbsped 1492088781 9781492088783 241 245
Text Analysis
In the last two chapters, we explored applications of dates and numbers with
time series analysis and cohort analysis. But data sets are often more than just
numeric values and associated timestamps. From qualitative attributes to free
text, character fields are often loaded with potentially interesting information.
Although databases excel at numeric calculations such as counting, summing,
and averaging things, they are also quite good at performing operations on text
data.
I’ll begin this chapter by providing an overview of the types of text analysis
tasks that SQL is good for, and of those for which another programming
language is a better choice. Next, I’ll introduce our data set of UFO sightings.
Then we’ll get into coding, covering text characteristics and profiling, parsing
data with SQL, making various transformations, constructing new text from
parts, and finally finding elements within larger blocks of text, including by using
regular expressions.