Reading CSV and JSON
Our first task is to decide how to parse the two formats that hold the bulk of our data: CSV and JSON. Each format has several good libraries to choose from, and we will pick one for each. For JSON, two very good options are simdjson (simdjson.org) and RapidJSON (rapidjson.org); we’re going to use RapidJSON because its interface is slightly simpler. There are likewise many CSV parsers, including csv2 (https://github.com/p-ranav/csv2) and lazycsv (https://github.com/ashtum/lazycsv); we’re going to use csv2 because it is cross-platform, whereas lazycsv does not seem to be. With the libraries chosen, we need to work out how to parse the sample files we have been given (which are hopefully indicative of what the rest of the data looks like) and turn each entry into a piece of data that we can use later.
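To make the goal of "turn each entry into a piece of data" concrete, here is a minimal sketch that uses only the standard library. It is not csv2 or RapidJSON code. The Entry type and its two fields are hypothetical placeholders, since the real columns depend on the sample files, and split_fields and to_entry are names invented for this illustration. The splitter is deliberately naive: it breaks on every comma and knows nothing about quoted fields, which is one reason to hand the job to a dedicated parser.

```cpp
#include <cstddef>
#include <exception>
#include <string>
#include <vector>

// Hypothetical record type. The real fields depend on the sample data;
// "name" and "value" are placeholders for illustration only.
struct Entry {
    std::string name;
    double value = 0.0;
};

// Naive splitter: breaks a line on every delimiter character.
// It does NOT understand quoting, escaping, or embedded newlines.
std::vector<std::string> split_fields(const std::string& line, char delim = ',') {
    std::vector<std::string> fields;
    std::size_t start = 0;
    while (true) {
        const std::size_t end = line.find(delim, start);
        if (end == std::string::npos) {
            fields.push_back(line.substr(start));  // last (possibly empty) field
            break;
        }
        fields.push_back(line.substr(start, end - start));
        start = end + 1;
    }
    return fields;
}

// Converts one row of raw fields into an Entry.
// Returns false (leaving `out` untouched) if the row does not have exactly
// two fields or the second field is not entirely a number.
bool to_entry(const std::vector<std::string>& fields, Entry& out) {
    if (fields.size() != 2) {
        return false;
    }
    try {
        std::size_t used = 0;
        const double value = std::stod(fields[1], &used);
        if (used != fields[1].size()) {
            return false;  // trailing junk after the number
        }
        out.name = fields[0];
        out.value = value;
        return true;
    } catch (const std::exception&) {
        return false;  // std::stod throws on non-numeric or out-of-range input
    }
}
```

Calling to_entry(split_fields("widget,3.5"), e) fills e with the name widget and the value 3.5. A row such as "Smith, J",3.5 is rejected, because the naive splitter sees three fields instead of two. Whatever parser produces the raw fields, the conversion step keeps the same shape: validate the row, convert each field, and report failure rather than returning a half-filled record.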
...