14/4/22, 00:26 https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DS0101EN-SkillsNetwork/labs/Module 2/Watson_Gallery.md.html?origin=www.coursera.
org
IBM Cloud Watson Studio
Estimated Time (45 min)
IBM Watson Studio is an IBM service that provides a suite of tools and a collaborative environment for data scientists, developers and
domain experts. In this lab, you will use Watson Studio to explore different datasets. As we have learned in the course, data is not only about
numbers: it can be numeric data, text, images, video, audio and so on. You will work on three samples.
Sample 1, in which you will look at a dataset that contains only numeric attributes.
Sample 2, in which you will look at a dataset that contains both numeric and text attributes.
Sample 3, in which you will see what a Jupyter Notebook looks like and how a data scientist creates models.
Let's take a look at how data scientists use different datasets.
Objectives:
You will learn to:
Launch Watson Studio for accessing Data Science Problems
Evaluate Numeric dataset
Evaluate dataset with Non-Numeric attributes
Evaluate Jupyter Notebook
Prerequisite:
Before you start, you need an IBM Cloud account. If you do not have one, follow the instructions given in the link.
Exercise 1: Launch Watson Studio for accessing Data Science Problems
1. Login to IBM Cloud: https://cloud.ibm.com/login
2. Scroll down and click Services given in Resource Summary.
3. When you click on Services, all your existing services will be shown in the list. Click the Watson Studio service you created:
4. Click Launch IBM Cloud Pak for Data.
Exercise 2: Evaluate Numeric dataset
1. Click on Navigation Menu.
2. Click on Gallery.
3. Select All Filters. From Format select Data, and from Topic select Energy & Utilities, Environment and Industry Accelerator.
4. Click on UCI: Forest Fires.
5. Preview the data using the Preview option.
Explore the data
The data relates to forest fires in the northeast region of Portugal. The aim is to predict the burned area of a fire from
meteorological and other data.
Attribute Information:
1. X - x-axis spatial coordinate within the Montesinho park map: 1 to 9
2. Y - y-axis spatial coordinate within the Montesinho park map: 2 to 9
3. month - month of the year: 'jan' to 'dec'
4. day - day of the week: 'mon' to 'sun'
5. FFMC - FFMC index from the FWI system: 18.7 to 96.20
6. DMC - DMC index from the FWI system: 1.1 to 291.3
7. DC - DC index from the FWI system: 7.9 to 860.6
8. ISI - ISI index from the FWI system: 0.0 to 56.10
9. temp - temperature in Celsius degrees: 2.2 to 33.30
10. RH - relative humidity in %: 15.0 to 100
11. wind - wind speed in km/h: 0.40 to 9.40
12. rain - outside rain in mm/m2: 0.0 to 6.4
13. area - the burned area of the forest (in ha): 0.00 to 1090.84
(this output variable is very skewed towards 0.0, thus it may make sense to model with the logarithm transform).
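The logarithm transform mentioned above can be sketched as follows. This is a minimal illustration, not part of the lab: it assumes the transform ln(area + 1), so that the many zero values stay defined. Apart from the minimum (0.00) and maximum (1090.84) listed above, the sample values are hypothetical.

```python
import math

# Burned-area values in hectares. Only 0.0 and 1090.84 come from the
# attribute range above; the other values are hypothetical.
areas = [0.0, 0.0, 0.36, 2.14, 10.73, 1090.84]

# log1p(x) computes ln(x + 1), which is defined at x = 0 (assumed transform).
log_areas = [math.log1p(a) for a in areas]

# expm1 is the inverse of log1p and maps a value back to hectares.
recovered = [math.expm1(v) for v in log_areas]

for a, v in zip(areas, log_areas):
    print(f"area={a:8.2f} ha  ->  ln(area + 1)={v:.3f}")
```

After the transform, the largest value is no longer hundreds of times larger than the typical one, which is why a skewed target like this is often modelled on the log scale.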
Exercise 2: Evaluate Non-Numeric dataset
The data doesn't have to be only based on numbers. Data can be text, images and other types as well. Let's look into data having text values.
1. Use the All Filters. From Format select Data and from Topic select Economy and Business.
Multiple datasets will be listed. Scroll down and select Airbnb Data for Analytics: Trentino Reviews. (If you do not see the dataset, use the Load
More option.)
2. Preview the data using the Preview option.
Explore the data
Airbnb, Inc. is an American company that operates an online marketplace for lodging, primarily homestays for vacation rentals, and tourism
activities. Airbnb guests may leave a review after their stay, and these reviews can be used as an indicator of Airbnb activity. The minimum stay, price and
number of reviews have been used to estimate the occupancy rate, the number of nights per year and the income per month for each listing.
This data can be used in various ways: to analyze the star ratings of places, the location preferences of customers, the
tone and sentiment of customer reviews, and more. Airbnb uses location data to improve guest satisfaction.
💡 Can you think of what you can use this data for?
The dataset comprises three main tables:
listings - Detailed listings data showing 96 attributes for each of the listings. Some of the attributes used in the analysis are
price (continuous), longitude (continuous), latitude (continuous), listing_type (categorical), is_superhost (categorical), neighbourhood
(categorical) and ratings (continuous), among others.
reviews - Detailed reviews given by the guests, with 6 attributes. Key attributes include date (datetime), listing_id (discrete), reviewer_id
(discrete) and comment (textual).
calendar - Provides details about bookings for the next year by listing. Four attributes in total: listing_id (discrete), date (datetime),
available (categorical) and price (continuous).
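To make the mix of numeric, categorical and textual attributes concrete, here is a small sketch of how two of these tables might be represented and linked. It is an illustration only. It assumes that each listing is keyed by a listing_id matching the one in reviews, and every row shown is hypothetical rather than taken from the dataset.

```python
import datetime
from collections import Counter
from dataclasses import dataclass


@dataclass
class Listing:
    listing_id: int      # assumed key, matching reviews.listing_id
    price: float         # continuous
    neighbourhood: str   # categorical
    is_superhost: bool   # categorical


@dataclass
class Review:
    listing_id: int      # discrete
    reviewer_id: int     # discrete
    date: datetime.date  # datetime
    comment: str         # textual


# Hypothetical rows, for illustration only.
listings = [
    Listing(1, 80.0, "Trento", True),
    Listing(2, 55.0, "Rovereto", False),
]
reviews = [
    Review(1, 101, datetime.date(2019, 5, 1), "Lovely flat"),
    Review(1, 102, datetime.date(2019, 6, 12), "Great host"),
    Review(2, 103, datetime.date(2019, 7, 3), "Quiet and clean"),
]

# Count reviews per listing, a simple proxy for listing activity.
review_counts = Counter(r.listing_id for r in reviews)

for listing in listings:
    n = review_counts[listing.listing_id]
    print(f"listing {listing.listing_id} ({listing.neighbourhood}): {n} review(s)")
```

The numeric fields can be summarized directly, while the comment field would need text analysis (for example, sentiment scoring) before it can be used in a model.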
Exercise 3: Evaluate Jupyter Notebook
Use the All Filters. From Format select Notebook, and then select Finding optimal locations of new stores using Decision Optimization.
This notebook shows how Decision Optimization, used from Python, can prescribe decisions for a complex constrained problem:
determining the optimal location for a new store.
The objective is to minimize the total distance from libraries to coffee shops so that a book reader can always reach our coffee shop easily. A
first step is to analyze and display the locations on a map.
When we validate the dataset, the locations on the map are separated.
However, it is impossible to determine the ideal places to open the coffee shops just by looking at the map.
An optimization model solves this by determining where to locate the coffee shops in an optimal way.
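The idea behind such a model can be sketched on a toy example. The notebook itself uses Decision Optimization. The code below instead uses a plain exhaustive search, which only works for very small problems, and it assumes straight-line (Euclidean) distance. All coordinates, the candidate sites and the number of shops are hypothetical.

```python
import math
from itertools import combinations

# Hypothetical library locations and candidate shop sites (x, y).
libraries = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
candidates = [(0.0, 1.0), (5.0, 1.0), (10.0, 1.0)]
num_shops = 2  # hypothetical number of shops to open


def total_distance(shops):
    """Sum, over all libraries, of the distance to the nearest chosen shop."""
    return sum(min(math.dist(lib, shop) for shop in shops) for lib in libraries)


# Try every way of choosing num_shops sites and keep the cheapest one.
best_sites = min(combinations(candidates, num_shops), key=total_distance)
best_cost = total_distance(best_sites)

print("Open shops at:", best_sites)
print("Total distance:", round(best_cost, 3))
```

Exhaustive search checks every combination, so its cost grows very quickly with the number of candidate sites. That is the reason a real problem of this kind is handed to an optimization solver.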
Summary
In this lab, you have learned how different datasets are made available and how a data scientist builds models and makes predictions
in a Jupyter Notebook in IBM Watson Studio.
Author(s)
Malika Singla
Other Contributor(s)
Lavanya
Change log
Date         Version  Changed by     Change Description
2022-02-16   1.1      Niveditha      Updated Watson screenshot
2021-06-010  1.0      Malika Singla  Initial Version
© IBM Corporation 2021. All rights reserved.