Importing data in Python
In this course we will learn to import data from a large variety of sources,
for example:
(i) flat files such as .txt and .csv files;
(ii) files native to other software, such as Excel spreadsheets and Stata, SAS
and MATLAB files.
First off, we're going to learn how to import basic text files,
which we can broadly classify into 2 types of files:
1. those containing plain text,
such as the opening of Mark Twain's novel The
Adventures of Huckleberry Finn;
2. those containing table data, in which each
column is a characteristic or feature, such
as gender, cabin and 'survived or not'.
The latter is known as a flat file.
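As a rough illustration of the second type, the sketch below parses a tiny flat file with Python's standard csv module. The rows are made up for this example (only the column names gender, cabin and survived come from the notes), and the text is read from an in-memory string rather than a real file.

```python
import csv
import io

# Made-up rows purely for illustration; not taken from any real dataset.
flat_file_text = """gender,cabin,survived
female,C85,1
male,,0
"""

# Each line is one record; the first line holds the column names.
reader = csv.DictReader(io.StringIO(flat_file_text))
rows = list(reader)
columns = reader.fieldnames

print(columns)   # the features (columns) of the table
print(rows[0])   # the first record, as a dict keyed by column name
```

Every value comes back as a string here, including the empty cabin entry in the second row.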
To read a plain text file, first open a connection to the file.
To do so, call open(filename, 'r') in a with statement, where 'r' opens the file in read mode.
Then assign the text from the file to a variable text by applying the method read to the file object.
What you're doing here is called 'binding' a variable in the context manager construct:
while still within this construct, the variable file will be bound to open(filename, 'r'). It is
best practice to use the with statement, because the file is closed automatically when the
block ends and you never have to concern yourself with closing it.
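A minimal sketch of this pattern follows. The file name example.txt and its contents are hypothetical; the file is created in a temporary directory only so that the example runs on its own.

```python
import os
import tempfile

# Hypothetical file, created here only so the example is self-contained.
filename = os.path.join(tempfile.mkdtemp(), "example.txt")
with open(filename, "w") as file:
    file.write("first line\nsecond line\n")

# Open a connection in read mode; 'file' is bound to the open file
# for the duration of the with block.
with open(filename, "r") as file:
    text = file.read()

# Once the block ends, the file has been closed for us.
print(file.closed)  # True
print(text)
```

Without the with statement, you would have to call file.close() yourself after reading.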
The importance of flat files in data science
Flat Files:
Flat files are basic text files containing
File extension:
Why NumPy?
You can also import different datatypes into NumPy arrays: for example, setting the
argument dtype to 'str' ensures that all entries are imported as strings.
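The notes do not name the import function at this point, so the sketch below assumes np.loadtxt, which accepts a dtype argument. The comma-separated values are made up and read from an in-memory string instead of a file on disk.

```python
import io

import numpy as np

# Made-up numeric data, held in memory instead of a file on disk.
raw = io.StringIO("1.5,2.0\n3.25,4.0\n")

# Assumption: np.loadtxt is the import function; dtype=str keeps every entry as text.
data = np.loadtxt(raw, delimiter=",", dtype=str)

print(data.dtype.kind)  # 'U' means a unicode string array
print(data)
```

Leaving dtype at its default would import the same entries as floats instead.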