Startup Case Study

This document analyzes Indian startup funding data from a CSV file. It cleans the data by handling missing values and formatting issues, then plots the number of fundings per year, the top 10 Indian cities by startup count, and each city's share of funding. Key findings include that 2015 and 2016 recorded 935 and 993 fundings respectively, and that Bangalore, Mumbai, and New Delhi have the most startups.

Uploaded by

Anubhav Dutta


9/7/2021 STARTUP_CASE_STUDY(GRP - 3)

Indian Startup Case Study

Problem Statement: To perform a case study analysis of Indian startup funding data

Importing Necessary Libraries

In [1]: # importing necessary libraries
        import numpy as np
        import pandas as pd
        import matplotlib.pyplot as plt
        import seaborn as sns

Reading Data
In [2]: data_1 = pd.read_csv('./Datasets/startup_funding.csv')
        data = data_1.copy()   # work on a copy so the raw frame stays untouched
        data.head()

Out[2]:
   Sr No  Date dd/mm/yyyy  Startup Name                  Industry Vertical    SubVertical                            City Location  Investors Name
0  1      09/01/2020       BYJU’S                        E-Tech               E-learning                             Bengaluru      Tiger Global Management
1  2      13/01/2020       Shuttl                        Transportation       App based shuttle service              Gurgaon        Susquehanna Growth Equity
2  3      09/01/2020       Mamaearth                     E-commerce           Retailer of baby and toddler products  Bengaluru      Sequoia Capital India
3  4      02/01/2020       https://www.wealthbucket.in/  FinTech              Online Investment                      New Delhi      Vinod Khatumal
4  5      02/01/2020       Fashor                        Fashion and Apparel  Embroiled Clothes For Women            Mumbai         Sprout Venture Partners

In [3]: data.shape

Out[3]: (3044, 10)

Cleaning Data
In [4]: data.isnull().sum()

Out[4]: Sr No                   0
        Date dd/mm/yyyy         0
        Startup Name            0
        Industry Vertical     171
        SubVertical           936
        City Location         180
        Investors Name         24
        InvestmentnType         4
        Amount in USD         960
        Remarks              2625
        dtype: int64
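The raw counts are easier to compare as a share of the 3044 rows reported by `data.shape`. The sketch below simply re-enters the counts from `Out[4]` by hand (the variable names `missing` and `pct` are illustrative, not from the notebook) and expresses each as a percentage.

```python
import pandas as pd

# Row count and missing-value counts copied from Out[3] and Out[4]
n_rows = 3044
missing = pd.Series({
    "Industry Vertical": 171,
    "SubVertical": 936,
    "City Location": 180,
    "Investors Name": 24,
    "InvestmentnType": 4,
    "Amount in USD": 960,
    "Remarks": 2625,
})

# Share of missing values per column, largest first
pct = (missing / n_rows * 100).round(1).sort_values(ascending=False)
print(pct)
```

On the live frame the same figures should come from `data.isnull().mean() * 100`. Remarks is missing in roughly 86% of rows and the amount in roughly 32%, which matters for any later sum over amounts.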

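In [5]: # changing the names of the columns inside the data
        # (the last name is cut off in the export; "Remarks" is assumed from the column list used in In [6])
        data.columns = ["SNo", "Date", "StartupName", "IndustryVertical", "SubVertical",
                        "City", "InvestorsName", "InvestmentType", "AmountInUSD", "Remarks"]

        # the year has to be extracted from the Date column, so check its dtype first
        data.Date.dtype

Out[5]: dtype('O')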

In [6]: # let's clean the strings: strip the literal "\xc2\xa0" escape sequences left in the text
        def clean_string(x):
            # note: str(x) also turns missing values (NaN) into the string 'nan'
            return str(x).replace("\\xc2\\xa0","").replace("\\\\xc2\\\\xa0", "")

        # let's apply the function to clean the data
        for col in ["StartupName", "IndustryVertical", "SubVertical", "City",
                    "InvestorsName", "InvestmentType", "AmountInUSD", "Remarks"]:
            data[col] = data[col].apply(clean_string)
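One side effect of cleaning through `str(x)` is worth seeing on a small example: missing values become the text 'nan' and stop being detected as null. The sketch below uses a hypothetical two-row sample and a shortened copy of the helper; the `.str.replace` variant is an alternative suggested here, not something the notebook does.

```python
import numpy as np
import pandas as pd

def clean_string(x):
    # shortened version of the notebook's helper, for the demo only
    return str(x).replace("\\xc2\\xa0", "")

# hypothetical sample: one dirty string and one missing value
sample = pd.Series(["Bangalore\\xc2\\xa0", np.nan])

# Cleaning through str(x): the missing value turns into the text 'nan'
via_str = sample.apply(clean_string)
print(via_str.tolist())          # ['Bangalore', 'nan']
print(via_str.isnull().sum())    # 0

# The vectorised string method leaves missing values missing
kept_missing = sample.str.replace("\\xc2\\xa0", "", regex=False)
print(kept_missing.isnull().sum())  # 1
```

This is the likely reason a `nan` city with 180 entries shows up in the city counts later on, matching the 180 missing `City Location` values in `Out[4]`.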

Checking the trend of investments by plotting number of fundings done in each year.
In [9]: # list the unique dates to spot issues in the Date column, such as '.' or '//' in place of '/'
        unique_dates = data.Date.unique().tolist()
        # unique_dates

In [12]: # fixing the separator issues in the Date column (regex=False treats '.' as a literal dot)
         data.Date = data.Date.str.replace('.', '/', regex=False)
         data.Date = data.Date.str.replace('//', '/', regex=False)

         # extracting year from date column
         year = data.Date.str.split('/' , expand = True)[2]

         # counting fundings per year and sorting the years in chronological order
         year = year.value_counts().sort_index()
         x = year.index
         y = year.values

         # plotting line plot
         plt.plot(x,y)
         plt.title('Trend of investments')
         plt.xlabel("Year")
         plt.ylabel("Number of Fundings")
         plt.show()

         # printing the first three entries
         for i in range(3):
             print('Year : ' , x[i],', No. of fundings : ' , y[i])

Year : 015 , No. of fundings : 1

Year : 2015 , No. of fundings : 935

Year : 2016 , No. of fundings : 993
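The `015` entry in the output suggests that at least one date has a malformed year which the separator fixes do not catch. A minimal sketch of how such rows could be flagged, using a hypothetical sample of date strings that mimics the issues described above:

```python
import pandas as pd

# Hypothetical sample covering the separator problems and a three-digit year
dates = pd.Series(["09/01/2020", "12/05.2015", "22/01//2015", "01/07/015"])

# Normalise separators literally, then take the third field as the year
normalised = (dates.str.replace(".", "/", regex=False)
                   .str.replace("//", "/", regex=False))
year = normalised.str.split("/", expand=True)[2]

# Flag every row whose year is not exactly four digits
bad_year = ~year.str.fullmatch(r"\d{4}")
print(normalised[bad_year].tolist())   # ['01/07/015']
```

Flagged rows can then be corrected by hand or dropped before counting fundings per year.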

In [13]: # function to clean the AmountInUSD column
         def clean_amount(x):
             # keep digits only (the list is cut off after '7' in the export; '8' and '9' are assumed)
             x = ''.join([c for c in str(x) if c in ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']])
             x = str(x).replace(",","").replace("+","")
             x = str(x).lower().replace("undisclosed","")
             x = str(x).lower().replace("n/a","")
             # missing or non-numeric amounts become the sentinel -999
             if x == '':
                 x = '-999'
             return x

         # let's apply the function on the column
         data["AmountInUSD"] = data["AmountInUSD"].apply(lambda x: float(clean_amount(x)))

         # let's plot the column after cleaning it
         plt.rcParams['figure.figsize'] = (15, 3)
         data['AmountInUSD'].plot(kind = 'line', color = 'black')
         plt.title('Distribution of Amount', fontsize = 15)
         plt.show()
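Encoding unknown amounts as -999 keeps the column numeric, but the sentinel then takes part in every later sum. An alternative, sketched here on a hypothetical sample and not taken from the notebook, is to coerce unparseable values to NaN with `pd.to_numeric`, since pandas aggregations skip NaN by default. Unlike `clean_amount`, the pattern below also keeps the decimal point, which is a choice made for this sketch.

```python
import pandas as pd

# Hypothetical raw amounts in the styles the cleaning function handles
raw = pd.Series(["20,00,00,000", "undisclosed", "N/A", "1,000,000+", None])

# Drop everything except digits and the decimal point
stripped = raw.fillna("").astype(str).str.replace(r"[^0-9.]", "", regex=True)

# Empty strings cannot be parsed and become NaN instead of a sentinel
amount = pd.to_numeric(stripped, errors="coerce")

print(amount.isna().sum())   # 3
print(amount.sum())          # 201000000.0
```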

Top 10 Indian cities with the most startups
In [14]: # dropping rows having NaN values in the City column
         data_temp = data.copy()
         data_temp = data_temp[data_temp['City'].notnull()]

         # sorting out issues in city names: keep only the first city of entries like 'A / B'
         def separateCity(city):
             return city.split('/')[0].strip()

         data_temp.City = data_temp.City.apply(separateCity)
         data_temp.City = data_temp.City.replace('Delhi', 'New Delhi')
         data_temp.City = data_temp.City.replace('bangalore', 'Bangalore')

In [15]: ## Counting startups in each city
         ## (note: this reads from data, not the cleaned data_temp, so 'nan' and both
         ## 'Bangalore' and 'Bengaluru' show up in the counts)
         city_num = data.City.value_counts()[0:10]
         city = city_num.index
         num_city = city_num.values

         ## plotting a pie chart showing the percentage share of each city in no. of startups
         plt.rcParams['figure.figsize'] = (15,9)
         ## (the original call also passed wedgeprops; its value is cut off in the export)
         plt.pie(num_city , labels = city , autopct='%.2f%%' , startangle = 90)
         plt.show()

         for i in range(len(city)):
             print('City : ' , city[i] ,' , Number of Startups :' , num_city[i])

City : Bangalore , Number of Startups : 701

City : Mumbai , Number of Startups : 568

City : New Delhi , Number of Startups : 424

City : Gurgaon , Number of Startups : 291

City : nan , Number of Startups : 180

City : Bengaluru , Number of Startups : 141

City : Pune , Number of Startups : 105

City : Hyderabad , Number of Startups : 99

City : Chennai , Number of Startups : 97

City : Noida , Number of Startups : 93
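The counts above list Bangalore and Bengaluru separately and include a `nan` bucket. Since Bengaluru is the official name of Bangalore, one way to merge such variants before counting is an alias table. The sketch below runs on a hypothetical sample; the `aliases` mapping, including the choice to fold Gurugram into Gurgaon, is an assumption of this example rather than part of the notebook.

```python
import pandas as pd

# Hypothetical sample with spelling variants, a multi-city entry and a missing value
city = pd.Series(["Bangalore", "Bengaluru", "bangalore", "Delhi",
                  "New Delhi / US", None, "Gurugram", "Gurgaon"])

# Assumed alias table mapping each variant to one canonical name
aliases = {"Bengaluru": "Bangalore", "Delhi": "New Delhi", "Gurugram": "Gurgaon"}

cleaned = (city.dropna()                # exclude missing cities from the count
               .str.split("/").str[0]   # keep the first city of 'A / B'
               .str.strip()
               .str.title()             # 'bangalore' -> 'Bangalore'
               .replace(aliases))

counts = cleaned.value_counts()
print(counts)
```

Here `counts` holds three cities: Bangalore with 3 entries, and New Delhi and Gurgaon with 2 each.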

Calculating the percentage of funding each city has received

In [16]: data_temp.City = data_temp.City.apply(separateCity)
         data_temp.City = data_temp.City.replace('Delhi', 'New Delhi')
         data_temp.City = data_temp.City.replace('bangalore', 'Bangalore')

         # Removing ',' in Amount column and converting it to float
         # (the call is cut off in the export; removing ',' is taken from the comment above)
         data_temp.AmountInUSD = data_temp.AmountInUSD.apply(lambda x : float(str(x).replace(',', '')))
         data_temp.AmountInUSD = pd.to_numeric(data_temp.AmountInUSD)

         # Calculating citywise amount of funding received.
         # (the line is cut off after "ascending ="; descending order and a top-10 cut are
         # assumed from the output, which lists 10 cities from largest to smallest share)
         # note: amounts set to -999 in In [13] are included in these sums
         city_amount = data_temp.groupby('City')['AmountInUSD'].sum().sort_values(ascending = False)[0:10]
         city = city_amount.index
         amountCity = city_amount.values

         ## calculating percentage of the funding each city has received
         perAmount = np.true_divide(amountCity , amountCity.sum())*100

         for i in range(len(city)):
             print(city[i] , format(perAmount[i], '.2f'),'%')

         plt.bar(city, perAmount, color = sns.color_palette("flare"))

Bangalore 31.10 %

Bengaluru 23.45 %

Mumbai 13.51 %

Gurgaon 9.52 %

New Delhi 9.18 %

Noida 3.50 %

nan 3.46 %

Gurugram 2.36 %

Chennai 1.96 %

Pune 1.95 %

Out[16]: <BarContainer object of 10 artists>
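The share calculation itself is a group-by sum divided by the overall total. A minimal sketch on a hypothetical five-row frame, with an unknown amount stored as NaN rather than a sentinel so that it drops out of the sums (the frame and the name `share` are illustrative):

```python
import numpy as np
import pandas as pd

# Hypothetical funding rows; one amount is unknown
df = pd.DataFrame({
    "City": ["Bangalore", "Bangalore", "Mumbai", "Mumbai", "Pune"],
    "AmountInUSD": [100.0, np.nan, 50.0, 25.0, 25.0],
})

# Total funding per city, largest first (NaN is skipped by sum)
city_amount = df.groupby("City")["AmountInUSD"].sum().sort_values(ascending=False)

# Each city's share of the total, in percent
share = (city_amount / city_amount.sum() * 100).round(2)
print(share)
```

In this sample the shares are 50.0, 37.5 and 12.5 percent. The ten percentages printed above add up to about 100, so in the notebook the shares appear to be relative to the listed cities only, not to all funding in the dataset.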
