Time Series Plot or Line Plot with Pandas
Last Updated: 26 Nov, 2020
Prerequisite: Create a Pandas DataFrame from Lists
Pandas is an open-source library for data manipulation and analysis in Python. It is a fast and powerful tool that offers data structures and operations for manipulating numerical tables and time series. Examples of these operations include merging, reshaping, selecting, data cleaning, and data wrangling. The library can import data from various sources and file formats, such as SQL databases, JSON, Microsoft Excel, and comma-separated values (CSV). This article explains how to use pandas to generate a time series plot, or line plot, for a given set of data.
A line plot is a graphical display that shows the relationship between variables, or changes in data over time, using points that are usually ordered by their x-axis value and connected by straight line segments. The independent variable is represented on the x-axis, while the y-axis represents the dependent variable, that is, the data that changes with the x-axis variable.
To generate a line plot with pandas, we typically create a DataFrame with the dataset to be plotted. Then, the plot.line() method is called on the DataFrame.
Syntax:
DataFrame.plot.line(x, y)
The table below explains the main parameters of the method:
Parameter | Value | Default Value | Use |
x | Int or string | DataFrame index | Set the values to be represented on the x-axis. |
y | Int or string | Remaining numeric columns in DataFrame | Set the values to be represented on the y-axis. |
Additional parameters include color (specifies the color of the line) and title (specifies the title of the plot). The more general DataFrame.plot() method also accepts kind, which specifies the type of plot to draw. Its default value is 'line', so you don't have to set it in order to create a line plot.
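The syntax and parameters described above can be sketched as follows. This is a minimal illustration, not part of the original examples: the DataFrame df_demo and its columns hour and visitors are hypothetical, and the Agg backend line is only there so the script also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import pandas as pd

# Hypothetical data, used only for illustration
df_demo = pd.DataFrame({"hour": [1, 2, 3, 4],
                        "visitors": [12, 18, 9, 21]})

# plot.line() with explicit x and y columns, plus color and title
ax_line = df_demo.plot.line(x="hour", y="visitors",
                            color="green", title="Visitors per hour")

# plot() defaults to kind='line', so this draws the same type of chart
ax_plot = df_demo.plot(x="hour", y="visitors", kind="line")
```

Both calls return a matplotlib Axes object, which can be used afterwards to adjust labels or save the figure.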
Example 1:
This example illustrates how to generate a basic line plot of a DataFrame with one y-axis variable. We use pandas in Python3 to plot someone's calorie intake throughout one week. First, we create the dataframe.
Code:
Python3
import pandas as pd
# Create a list of data to be represented on the x-axis
days = ['Saturday', 'Sunday', 'Monday', 'Tuesday',
        'Wednesday', 'Thursday', 'Friday']
# Create a list of data to be
# represented on the y-axis
calories = [1670, 2011, 1853, 2557,
            1390, 2118, 2063]
# Create a dataframe using the two lists
df_days_calories = pd.DataFrame(
    {'day': days, 'calories': calories})
df_days_calories
Output:
Now, plot the variable.
Python3
# Use the plot() method on the dataframe
df_days_calories.plot('day', 'calories')
# Alternatively, you can use set_index() to make
# 'day' the index and plot 'calories' against it:
# df_days_calories.set_index('day')['calories'].plot()
Output:
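Example 1 uses day names stored as plain strings. For a true time series, the x-axis values are usually dates. The sketch below shows one way to do this with a DatetimeIndex. The start date 2020-11-21 (a Saturday) is an assumption made purely for illustration, since the original data only gives day names.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import pandas as pd

# Assumed start date (a Saturday); the original data only names the days
dates = pd.date_range(start="2020-11-21", periods=7, freq="D")
calories = [1670, 2011, 1853, 2557, 1390, 2118, 2063]

# A Series indexed by dates is plotted with the dates on the x-axis
ts = pd.Series(calories, index=dates, name="calories")
ax_ts = ts.plot(title="Calorie intake over one week")
ax_ts.set_xlabel("date")
ax_ts.set_ylabel("calories")
```

With a DatetimeIndex, pandas formats the tick labels as dates, which is usually what a time series plot needs.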
Example 2:
This example explains how to create a line plot with two variables in the y-axis.
A student was asked to rate his stress level during midterm week for each school subject on a scale from 1 to 10 (10 being the highest). He was also asked about his grade on each midterm (out of 20).
Code:
Python3
import pandas as pd
# Create a list of data to
# be represented on the x-axis
subjects = ['Math', 'English', 'History',
            'Chem', 'Geo', 'Physics', 'Bio', 'CS']
# Create a list of data to be
# represented on the y-axis
stress = [9, 3, 5, 1, 8, 5, 10, 2]
# Create a second list of data
# to be represented on the y-axis
grades = [15, 10, 7, 8, 11, 8, 17, 20]
# Create a dataframe using the three lists
df = pd.DataFrame(list(zip(stress, grades)),
                  index=subjects,
                  columns=['Stress', 'Grades'])
df
Output:
Create a line plot that shows the relationships between these three variables.
Code:
Python3
# use plot() method on the dataframe.
# No parameters are passed so it uses
# variables given in the dataframe
df.plot()
Output:
An alternative is to get the current axes with the gca() function from the matplotlib.pyplot module and pass them to both plot() calls, as follows:
Python3
import pandas as pd
import matplotlib.pyplot as plt
# Create a list of data
# to be represented on the x-axis
subjects = ['Math', 'English', 'History',
            'Chem', 'Geo', 'Physics', 'Bio', 'CS']
# Create a list of data
# to be represented on the y-axis
stress = [9, 3, 5, 1, 8, 5, 10, 2]
# Create a second list of data to be represented on the y-axis
grades = [15, 10, 7, 8, 11, 8, 17, 20]
# Create a dataframe using the three lists
df_subjects = pd.DataFrame(
    {'Subject': subjects,
     'Stress': stress,
     'Grade': grades})
# Get the current axes so both lines are drawn on the same plot
ax = plt.gca()
# Use the plot() method on the dataframe, once per y-axis column
df_subjects.plot(x='Subject', y='Stress', ax=ax)
df_subjects.plot(x='Subject', y='Grade', ax=ax)
Output:
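As a variation on the gca() approach, both columns can be drawn in a single call by passing a list of column names to y. This is a sketch under the assumption that creating the axes explicitly with plt.subplots() is acceptable. The name df_subjects is chosen here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

df_subjects = pd.DataFrame({
    "Subject": ["Math", "English", "History", "Chem",
                "Geo", "Physics", "Bio", "CS"],
    "Stress": [9, 3, 5, 1, 8, 5, 10, 2],
    "Grade": [15, 10, 7, 8, 11, 8, 17, 20],
})

# Create the figure and axes explicitly instead of relying on gca()
fig, ax = plt.subplots()

# A list for y draws one line per column on the same axes
df_subjects.plot(x="Subject", y=["Stress", "Grade"], ax=ax)
```

Creating the axes explicitly avoids depending on whichever figure happens to be current, which can matter when several plots are built in one script.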
Example 3:
In this example, we will create a plot without explicitly defining variable lists. We will also add a title and change the color.
A coin collector initially has 30 coins. After that, for a duration of one month, he finds one coin every day. Show in a line plot how many coins he has each day of that month.
Python3
import pandas as pd
# Initialize the number of coins on the first day of the month
c = 30
# Create a dataframe from a single list.
# The y-axis variable is a list built with a list
# comprehension: each element adds 1 to the previous value.
# The x-axis variable (the index) is a list of values ranging
# from 1 to 30 (range(1, 31) excludes 31) with a step of 1
df = pd.DataFrame([c + x for x in range(0, 30)],
                  index=[*range(1, 31, 1)],
                  columns=['Coins'])
# Use the plot() method on the dataframe.
# The x and y parameters are not passed, so it
# plots the dataframe's column against its index
df.plot(color='red', title='Total Coins per Day')
Output:
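The same data can also be built without a list comprehension, by passing range objects directly to a Series. This is a minimal sketch of that alternative. The names start_coins, days and coins are introduced here for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import pandas as pd

start_coins = 30     # coins held on day 1
days = range(1, 31)  # days 1 to 30

# 30 values, starting at 30 and increasing by 1 each day
coins = pd.Series(range(start_coins, start_coins + 30),
                  index=days, name="Coins")

# Series.plot() accepts the same color and title parameters
ax_coins = coins.plot(color="red", title="Total Coins per Day")
```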
Example 4:
In this example, we will plot specific columns of a dataframe. The dataframe is built from three lists; however, we will add only two of them to the plot.
Code:
Python3
import pandas as pd
# Create a dataframe using three lists
df = pd.DataFrame(
    {'List1': [1, 2, 3, 4, 5, 6],
     'List2': [5, 10, 15, 20, 25, 30],
     'List3': ['a', 'b', 'c', 'd', 'e', 'f']})
# Use the plot() method on the dataframe.
# List3 is on the x-axis and List2 on the y-axis
df.plot('List3', 'List2')
Output:
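Column selection can also be done before plotting. The sketch below makes List3 the index with set_index() and then selects the numeric columns to draw. The names df_lists and ax_sel are chosen here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import pandas as pd

df_lists = pd.DataFrame(
    {"List1": [1, 2, 3, 4, 5, 6],
     "List2": [5, 10, 15, 20, 25, 30],
     "List3": ["a", "b", "c", "d", "e", "f"]})

# Make List3 the index (x-axis), then select the columns to plot
ax_sel = df_lists.set_index("List3")[["List1", "List2"]].plot()
```

Selecting with a list of column names returns a DataFrame, so each selected column becomes its own line on the plot.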