Watson Gallery - MD

This document describes exploring different datasets using IBM Watson Studio. It discusses three samples: 1) a numeric dataset on forest fires, 2) a dataset with numeric and text attributes on Airbnb reviews, and 3) a Jupyter notebook using optimization to determine optimal locations for new coffee shops. The objective is to learn how data scientists evaluate different types of datasets and build models in Jupyter notebooks.

14/4/22, 00:26 https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DS0101EN-SkillsNetwork/labs/Module 2/Watson_Gallery.md.html?origin=www.coursera.org

IBM Cloud Watson Studio


Estimated Time (45 min)

IBM Watson Studio is an IBM service that provides a suite of tools and a collaborative environment for data scientists, developers, and
domain experts. In this lab, you will use Watson Studio to explore different datasets. As we have learned in the course, data is not only about
numbers; it can be numeric data, text, images, video, audio, and more. You will work on three samples.

Sample 1, in which you will explore a dataset that contains only numeric attributes.

Sample 2, in which you will explore a dataset that contains both numeric and text attributes.

Sample 3, in which you will examine what a Jupyter Notebook looks like and how a data scientist creates models in it.

Let's take a look at how data scientists use different datasets.

Objectives:
You will learn to:

Launch Watson Studio for accessing Data Science Problems
Evaluate Numeric dataset
Evaluate dataset with Non-Numeric attributes
Evaluate Jupyter Notebook

Pre-requisite:
Before you start, you need an IBM Cloud account. If you do not have one, follow the instructions given in the link.

Exercise 1: Launch Watson Studio for accessing Data Science Problems


1. Login to IBM Cloud: https://cloud.ibm.com/login

2. Scroll down and click Services in the Resource Summary.


3. Clicking Services lists all your existing services. Click the Watson Studio service you created.

4. Click Launch IBM Cloud Pak for Data.

Exercise 2: Evaluate Numeric dataset


1. Click on Navigation Menu.


2. Click on Gallery.

3. Select All Filters. From Format, select Data, and from Topic, select Energy & Utilities, Environment, and Industry Accelerator.


4. Click on UCI: Forest Fires.

5. Preview the data using the Preview option.


Explore the data


The data is related to forest fires. The aim is to predict the burned area of forest fires in the northeast region of Portugal using
meteorological and other data.

Attribute Information:

1. X - x-axis spatial coordinate within the Montesinho park map: 1 to 9
2. Y - y-axis spatial coordinate within the Montesinho park map: 2 to 9
3. month - month of the year: 'jan' to 'dec'
4. day - day of the week: 'mon' to 'sun'
5. FFMC - FFMC index from the FWI system: 18.7 to 96.20
6. DMC - DMC index from the FWI system: 1.1 to 291.3
7. DC - DC index from the FWI system: 7.9 to 860.6
8. ISI - ISI index from the FWI system: 0.0 to 56.10
9. temp - temperature in Celsius degrees: 2.2 to 33.30
10. RH - relative humidity in %: 15.0 to 100
11. wind - wind speed in km/h: 0.40 to 9.40
12. rain - outside rain in mm/m2: 0.0 to 6.4
13. area - the burned area of the forest (in ha): 0.00 to 1090.84

(This output variable is heavily skewed towards 0.0, so it may make sense to model it with a logarithm transform.)
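The note on the area attribute can be made concrete with a small sketch. The lab does not say which logarithm transform to use; the code below assumes ln(1 + area), chosen because many rows have area = 0.0 and a plain logarithm is undefined at zero. The function names are hypothetical and are not part of the lab or the dataset.

```python
import math


def log_transform(area_ha: float) -> float:
    """Return ln(1 + area). Assumed form of the transform; keeps area = 0.0 defined."""
    if area_ha < 0:
        raise ValueError("burned area cannot be negative")
    return math.log1p(area_ha)


def inverse_log_transform(value: float) -> float:
    """Map a transformed value (for example, a model prediction) back to hectares."""
    return math.expm1(value)


# Endpoints of the documented range of `area` (0.00 to 1090.84 ha).
area_min, area_max = 0.00, 1090.84

print(log_transform(area_min))  # 0.0
print(log_transform(area_max))  # the long right tail is compressed to a single-digit value
```

A model trained on the transformed target predicts on the log scale, so its predictions would need inverse_log_transform before being read as hectares.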

Exercise 2: Evaluate Non-Numeric dataset


Data does not have to be numeric. It can also be text, images, and other types. Let's look at data that has text values.

1. Use All Filters. From Format, select Data, and from Topic, select Economy and Business.

You will see multiple datasets. Scroll down and select Airbnb Data for Analytics: Trentino Reviews (if you do not see it, use the Load
More option).


2. Preview the data using the Preview option.

Explore the data


Airbnb, Inc. is an American company that operates an online marketplace for lodging, primarily homestays for vacation rentals, and tourism
activities. Airbnb guests may leave a review after their stay, and these reviews can be used as an indicator of Airbnb activity. The minimum stay,
price, and number of reviews have been used to estimate the occupancy rate, the number of nights per year, and the income per month for each listing.

This data can be used in various ways: to analyze the star ratings of places, the location preferences of customers, the tone and sentiment of
customer reviews, and more. Airbnb uses location data to improve guest satisfaction.

💡 Can you think of what you can use this data for?

The dataset comprises three main tables:



listings - Detailed listings data showing 96 attributes for each of the listings. Some of the attributes used in the analysis are
price (continuous), longitude (continuous), latitude (continuous), listing_type (categorical), is_superhost (categorical), neighbourhood
(categorical), and ratings (continuous), among others.

reviews - Detailed reviews given by the guests, with 6 attributes. Key attributes include date (datetime), listing_id (discrete), reviewer_id
(discrete), and comment (textual).

calendar - Provides details about bookings for the next year by listing. Four attributes in total: listing_id (discrete), date (datetime),
available (categorical), and price (continuous).
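To see how the three tables relate, here is a minimal sketch that models two of them as Python records and links reviews to listings. It assumes the listings table carries a listing_id key matching the one in reviews; the lab names listing_id only for reviews and calendar, so that is an assumption. All rows are made-up placeholders, not values from the dataset, and the function names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Listing:
    listing_id: int       # assumed join key (not named for this table in the lab)
    price: float          # continuous
    neighbourhood: str    # categorical
    is_superhost: bool    # categorical


@dataclass
class Review:
    listing_id: int       # discrete
    reviewer_id: int      # discrete
    date: str             # datetime, kept as an ISO string for simplicity
    comment: str          # textual


def reviews_per_listing(reviews):
    """Count reviews for each listing_id (numeric summary of a text table)."""
    counts = defaultdict(int)
    for review in reviews:
        counts[review.listing_id] += 1
    return dict(counts)


def comments_by_neighbourhood(listings, reviews):
    """Group review comments by the neighbourhood of the reviewed listing."""
    by_id = {listing.listing_id: listing for listing in listings}
    grouped = defaultdict(list)
    for review in reviews:
        listing = by_id.get(review.listing_id)
        if listing is not None:  # skip reviews whose listing is missing
            grouped[listing.neighbourhood].append(review.comment)
    return dict(grouped)


# Made-up placeholder rows, for illustration only.
listings = [
    Listing(1, 80.0, "neighbourhood_a", True),
    Listing(2, 55.0, "neighbourhood_b", False),
]
reviews = [
    Review(1, 101, "2019-05-01", "Great stay"),
    Review(1, 102, "2019-06-12", "Very clean"),
    Review(2, 103, "2019-07-03", "Too noisy"),
]

print(reviews_per_listing(reviews))
print(comments_by_neighbourhood(listings, reviews))
```

Grouping comments this way is a typical first step before analyzing the tone and sentiment of reviews per area.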

Exercise 3: Evaluate Jupyter Notebook


Use All Filters. From Format, select Notebook, then select Finding optimal locations of new stores using Decision Optimization.

This notebook shows how Decision Optimization, using Python, can help prescribe decisions for a complex constrained problem, in this case
determining the optimal location for a new store.

The objective is to minimize the total distance from libraries to coffee shops so that a book reader can always get to our coffee shop easily. This
is done by analyzing and displaying the location of the coffee shops on a map.


When we validate the dataset, the locations on the map are separated.

However, it is impossible to determine the ideal places to open the coffee shops just by looking at the map.

This is solved by an optimization model that helps us determine where to locate the coffee shops in an optimal way.
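The objective described above can be illustrated with a toy version of the problem. This sketch is not the notebook's method: the notebook uses Decision Optimization, while the code below simply tries every combination of candidate sites, which only works for very small inputs. The coordinates, the use of straight-line distance, and the function names are all assumptions made for illustration.

```python
from itertools import combinations
from math import hypot
from typing import Sequence, Tuple

Point = Tuple[float, float]


def total_distance(libraries: Sequence[Point], shops: Sequence[Point]) -> float:
    """Sum, over all libraries, of the distance to the nearest coffee shop."""
    return sum(
        min(hypot(lx - sx, ly - sy) for sx, sy in shops)
        for lx, ly in libraries
    )


def best_shop_locations(
    libraries: Sequence[Point], candidates: Sequence[Point], n_shops: int
) -> Tuple[Point, ...]:
    """Exhaustively pick the n_shops candidate sites with the lowest total distance."""
    return min(
        combinations(candidates, n_shops),
        key=lambda shops: total_distance(libraries, shops),
    )


# Hypothetical coordinates on a flat grid (not real library locations).
libraries = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
candidates = [(0.0, 1.0), (5.0, 1.0), (10.0, 1.0)]

best = best_shop_locations(libraries, candidates, n_shops=2)
print(best)                             # ((0.0, 1.0), (10.0, 1.0))
print(total_distance(libraries, best))  # 4.0
```

The number of combinations grows very quickly with the number of candidate sites, which is why a real problem of this kind is handed to an optimization solver rather than enumerated.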


Summary
In this lab, you have learned about the different datasets that are available and how a data scientist creates models and makes predictions
with them in a Jupyter Notebook in IBM Watson Studio.

Author(s)
Malika Singla

Other Contributor(s)
Lavanya

Change log
Date         Version  Changed by     Change Description
2022-02-16   1.1      Niveditha      Updated Watson Screenshot
2021-06-010  1.0      Malika Singla  Initial Version

© IBM Corporation 2021. All rights reserved.
