IS4834 Lab 7

Uploaded by

Kapzy
IS4834 Business Intelligence and Analytics
Lab 7 – Logistic Regression
Today we are going to
• Explore the crowdfunding data
• Run the logistic regression
– Examine the factors that may affect the probability of
projects being successfully funded
Preparations
• Create a folder “Lab7” on your desktop
• Create a folder “Library” under the folder “Lab7”
• Download Crowdfunding.sas7bdat from Canvas to the
Library folder
Create a project

• Set your created folder “Lab7” as the SAS Server Directory.
• Name the project “Crowdfunding”.
Create a library
1. Select File -> New -> Library
2. Name: Lab7
3. Path: C:\Users\Your LoginID\Desktop\Lab7\Library

The Story – Crowdfunding

• You are consulting for a crowdfunding platform. The platform gives poor
people in many countries the opportunity to raise money for their projects
from funders worldwide.
• The platform provides you with the Crowdfunding data set, which contains
information about more than 11,000 projects listed during January 2015.
• The platform wants to know which factors may affect a project's fundraising
success. The findings may help borrowers frame their projects better and
help the platform attract lenders.
• We can run a logistic regression to do the analysis.
Define a Data Source
Select (File -> New -> Data Source)
Select a SAS Table.
Set the Role and Level of Variables.
❑ Our target is the status of the project (whether it is fully funded or not), so we
change the role of Status to Target and its level to Binary.
The Data
Name           Model Role  Measurement Level  Description
ID             ID          Nominal            Project ID
Name           Rejected    Nominal            Name of borrowers
Status         Target      Binary             Whether the project was funded successfully
GrouporNot     Input       Nominal            Whether the project is a group project
Country        Input       Nominal            Country of borrowers
Industry       Input       Nominal            Industry of the project
LoanAmount     Input       Interval           Funding goal (requested amount) of the project
RepaymentTerm  Input       Interval           Number of months to repay the loan
ListedDate     TimeID      Interval           Listed date of the project
Create Diagram
• Create a new diagram Crowdfunding (File -> New -> Diagram).
• Drag the Crowdfunding data source into the diagram workspace.
• Run the Crowdfunding node.
Exploring the Data
• Right-click the Crowdfunding data source and select Edit Variables
• Select all listed inputs by dragging the cursor across all of the
input names and click Explore
Exploring the Data
– Maximize the Sample Properties window
– Sample Method: Random; Fetch size: Max
– We can now examine histograms of all the variables
Successful Projects
• Maximize the Status histogram
• The histogram shows the distribution of the projects (fully funded
or not)
• We have around 11,000 projects. The histogram shows that around
1,200 projects failed (around 10%), which means almost 90% of the
projects were fully funded through the platform.
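The shares quoted above can be checked with a few lines of arithmetic. This is only a sketch: the two counts are the rounded figures read off the histogram, not exact values from the data set, and the variable names are made up for illustration.

```python
# Approximate counts read off the Status histogram (both are rounded figures).
total_projects = 11_000
failed_projects = 1_200

funded_projects = total_projects - failed_projects
failed_share = failed_projects / total_projects
funded_share = funded_projects / total_projects
# Baseline odds of success: funded projects per failed project.
baseline_odds = funded_projects / failed_projects

print(f"failed: {failed_share:.1%}, funded: {funded_share:.1%}, odds: {baseline_odds:.2f}")
# prints: failed: 10.9%, funded: 89.1%, odds: 8.17
```

With roughly 8 funded projects for every failed one, the target is quite unbalanced, which is worth keeping in mind when reading the regression results.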
Logistic Regression
• We can run a logistic regression using the Regression node (the same node
used for linear regression)
• Select the Model tab. Drag a Regression node into the diagram workspace.
• Connect the Data node to the Regression node.
Logistic Regression
• Select the Regression node and examine the Property panel
• By default, the regression type is logistic, so we do not need to change the
setting.
• Select the Variable ellipsis (…) to select the variables for our regression model.
• Choose No if you do not want to include the variable in the regression
model.
• Run the Regression node
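The coefficients the Regression node reports are on the log-odds scale. As a minimal sketch of what that scale means, the two helper functions below (hypothetical names, not part of SAS Enterprise Miner) convert between log odds and probability using the standard logistic and logit functions.

```python
import math

def probability_from_log_odds(log_odds: float) -> float:
    """Map a log-odds value to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-log_odds))

def log_odds_from_probability(p: float) -> float:
    """Inverse mapping (the logit): probability back to log odds."""
    return math.log(p / (1.0 - p))

# Log odds of 0 correspond to even odds, i.e. a probability of 0.5.
print(probability_from_log_odds(0.0))
```

A positive coefficient pushes the log odds, and therefore the probability of being funded, up; a negative coefficient pushes it down.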
Logistic Regression
• Check the results – model fit

Our model is statistically significantly different from a model
without any independent variables ☺
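The output screenshot is not reproduced here, so the following assumes the model-fit comparison is the usual likelihood ratio test of the global null hypothesis. In that test, L_0 is the likelihood of the intercept-only model, L_1 the likelihood of the fitted model, and k the number of estimated slope parameters (these symbols are introduced for illustration):

```latex
G = -2\left(\ln L_{0} - \ln L_{1}\right), \qquad
G \sim \chi^{2}_{k} \ \text{(approximately) under } H_{0}\colon \beta_{1} = \dots = \beta_{k} = 0
```

A small p-value for G leads us to reject H_0, which is what "statistically different from a model without any independent variable" means.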
Logistic Regression
• Check the results – coefficients estimation

● If the project is not a group project, the log odds of the project
being successfully funded decrease by 0.3237 compared to a
group project, holding other variables unchanged.

● Equivalently, if the project is not a group project, the odds of it being
successfully funded decrease by 27.65% (e^(-0.3237) - 1 = -0.2765)
compared to a group project, holding other variables
unchanged.

● The coefficient is statistically different from 0 at the 95% confidence level.
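The step from a log-odds coefficient to a percentage change in the odds can be reproduced as follows. This is a small sketch using the estimate quoted above; the function name is hypothetical.

```python
import math

def percent_change_in_odds(coefficient: float) -> float:
    """Relative change in the odds implied by a logistic regression coefficient."""
    return math.exp(coefficient) - 1.0

group_coefficient = -0.3237  # estimate quoted above for non-group projects
odds_ratio = math.exp(group_coefficient)
change = percent_change_in_odds(group_coefficient)

print(f"odds ratio: {odds_ratio:.4f}, change in odds: {change:.2%}")
```

The odds ratio is below 1, so the change is negative: the odds for a non-group project are about 72% of the odds for a comparable group project.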


Logistic Regression
• Check the results – coefficients estimation

• A one-unit increase in loan amount decreases the log odds of the
project being successfully funded by 0.00027, holding other
variables unchanged.

• A one-unit increase in loan amount decreases the odds of the
project being successfully funded by 0.027% (e^(-0.00027) - 1 ≈ -0.00027),
holding other variables unchanged.

• The coefficient is statistically different from 0 at the 95% confidence level.
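A per-unit effect of 0.027% looks tiny, but loan amounts differ by far more than one unit. The sketch below scales the quoted coefficient to larger increases; the increments of 100 and 1000 units are illustrative choices, not values from the lab, and the function name is hypothetical.

```python
import math

loan_coefficient = -0.00027  # per-unit estimate for LoanAmount quoted above

def odds_change_for_increase(coefficient: float, increase: float) -> float:
    """Relative change in odds when the input rises by `increase` units."""
    return math.exp(coefficient * increase) - 1.0

for increase in (1, 100, 1000):
    change = odds_change_for_increase(loan_coefficient, increase)
    print(f"+{increase} units: {change:.4%}")
```

Under this model, a 1000-unit increase in the requested amount lowers the odds of being funded by roughly 24%, because the per-unit effects multiply rather than add.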


Deliverables
Please download the Worksheet “IS4834 Lab 7.docx”, fill in
your answers accordingly and submit on Canvas.

Please submit the compressed folder Lab7 to Canvas!!!
