Similar Visions, Independent Work, and an Eventual Collaboration
Both the Unicon Research Corporation and the Minnesota Population Center recognized these limitations of the CPS and set out to simplify and streamline its use. The organizations had similar yet distinct views about how exactly to do this. Unicon created CPS Utilities (described in the section that follows), which enabled researchers to easily access several variables and years of data, largely unchanged from the original data, via one system. With IPUMS-CPS, the Minnesota Population Center took a more interventionist approach, simplifying and harmonizing original variable names and limiting the redundancy in variables offered to users.
CPS Utilities and IPUMS-CPS operated largely independently of one another until 2011, when staff from the two projects began to collaborate on cleaning and documenting CPS data. The Unicon-MPC collaboration is based on our shared goals of preserving CPS data and documentation and making the data easy to access at no charge to users through the IPUMS online system. Unicon ceased its production of CPS Utilities at the end of 2014. Currently, most data files and documentation have been incorporated into the IPUMS-CPS system by MPC staff. The work that Unicon began in 1989 will continue through the Minnesota Population Center.
History of Unicon's CPS Utilities
The CPS Utilities software originated in 1989 as a tool to give in-house Unicon researchers easy and accurate access to the March Annual Demographic CPS data files. Variable concepts were analyzed over all the available years, and each concept was assigned a fixed variable name (later referred to as the 'Unicon name') to be used across all years. Variable column locations and lengths were hard-coded into the software. All known variable documentation was gathered in the Utilities dictionary files and appendices to relieve researchers of the need to read through each individual Census March CPS manual. The tool proved so useful and reliable that Unicon expanded the software to include other CPS data series. As of 2014, there were 457 data files spread over 15 unique series.
The initial funding and most of the ongoing funding for creating and expanding the Utilities was provided by Unicon's president and founder, Dr. Finis Welch. The product was also funded in part by Small Business Innovation Research (SBIR) grants from the National Institute on Aging, the National Library of Medicine, the National Institute of Child Health and Human Development, and the U.S. Census Bureau. The Utilities contents are solely the responsibility of the authors and do not necessarily represent the official views of the funding institutions. Accepting these SBIR grants obligated Unicon to present the product as a commercially viable commodity. For this reason, when Unicon offered the CPS Utilities to researchers outside of the company in 1994, it had to set a minimum charge for the service and the product.
As Unicon expanded its range to the non-March series, it conducted an intensive search for the missing files. Early files were collected from several data facilities, including the U.S. Census Bureau, the U.S. Bureau of Labor Statistics (BLS), the U.S. National Archives, the National Bureau of Economic Research (NBER), the Inter-university Consortium for Political and Social Research (ICPSR), and the Center for Demography and Ecology at the University of Wisconsin, Madison (CDE). A list of the provenance of the data and the documentation for the CPS files housed in Unicon's library is provided. The early data were received on 9-track tapes. With the introduction of more modern I/O equipment, the 9-track drives were phased out. The collection of Census tapes was systematically destroyed once the data were copied to DVDs. More recently, the data have been downloaded directly from the online Census Ferret FTP site.
History of IPUMS-CPS
The Minnesota Population Center's effort to increase the accessibility of CPS data began in the early 2000s and built on the successful infrastructure developed to provide web-based access to decennial census data. This infrastructure, IPUMS-USA, revolutionized the research community's access to microdata. IPUMS-USA contained multiple years of decennial census data, allowing users to study long-run change in the United States. IPUMS-CPS was initially conceived as a natural complement to the information provided in the decennial census data; it provided the best source of data for understanding social and economic patterns between decennial censuses (King & Tertilt, 2003). As such, it was designed to be compatible with IPUMS-USA.
The first phase of IPUMS-CPS, funded by the National Science Foundation and National Institutes of Health, yielded a web-based dissemination system for the delivery of harmonized data and documentation from the March Annual Demographic and Economic Characteristics file (hereafter referred to as the ASEC). By the end of the initial funding period, the database provided researchers with ready-to-use data spanning the period from 1962 to 2015. Each variable delivered via the new system came with a description of its contents and information about comparability over time, and the majority of variables also had at-a-glance frequencies. Variables were recoded so they were consistent over time without any loss of information, and universe definitions were empirically verified and documented online.
The second major phase of work, currently ongoing, involves expanding IPUMS-CPS to incorporate non-ASEC CPS monthly and supplement data. Funded by the National Institutes of Health and carried out in collaboration with Unicon Corporation, the current iteration of IPUMS-CPS will, by completion, house the majority of publicly available CPS basic and supplement data, allow users to easily link observations over time, and continue to provide the high level of research support that Unicon Corporation and IPUMS have always provided.