
Analyzing Unicorn Companies (Final)

This report analyzes unicorn companies, defined as privately held companies valued over $1 billion, using a dataset that includes details like company name, country, sector, and valuation. It outlines the process of creating a database, loading data, and executing 30 SQL queries to answer specific research questions regarding unicorn companies, including metrics on countries, sectors, and investors. Key findings include the top countries and sectors by number of unicorns and average valuations, as well as insights into the most common investors.

Uploaded by

Umar Iftikhar
Copyright
© All Rights Reserved

Data Analytics Report: Analyzing Unicorn Companies

This report presents an extensive analysis of unicorn companies, which are privately
held companies valued at over $1 billion as of November 2021. The dataset used
contains key details such as company name, country, sector, valuation, founding year,
and major investors. This analysis explores several key metrics such as top-performing
countries, sectors, and investors. The SQL queries used for this analysis address 30
specific research questions, with detailed results provided.

Step 1: Database Creation and Table Setup


The first step in this project is to set up the database and the table structure to store the
unicorn companies' data. The following SQL commands are used to create the database
and define the structure of the table:

-- Step 1: Create the database and switch to it
CREATE DATABASE unicorn_companies;
USE unicorn_companies;
SELECT DATABASE();

-- Drop the unicorns table if it already exists to avoid duplication error
DROP TABLE IF EXISTS unicorns;

-- Recreate the unicorns table after dropping the existing one
CREATE TABLE IF NOT EXISTS unicorns (
    id INT AUTO_INCREMENT PRIMARY KEY,
    company VARCHAR(255),
    country VARCHAR(100),
    sector VARCHAR(100),
    valuation DECIMAL(10, 2),
    founded_year INT,
    investors TEXT
);

Explanation:

• The CREATE DATABASE command creates a new database named unicorn_companies, and the
USE command switches to that database.
• The DROP TABLE IF EXISTS command ensures that if a table named unicorns already
exists, it is dropped before a new one is created. This avoids conflicts when rerunning
the script.
• The table unicorns is created with the following structure:
a. id: A primary key with auto-incrementing integer values.
b. company: A string field to store the name of the company.
c. country: A string field for the country of the company.
d. sector: A string field to categorize the company's sector.
e. valuation: A decimal field for the company's valuation in billions of USD.
f. founded_year: An integer field representing the year the company was founded.
g. investors: A text field to store the investors associated with the company.

This step ensures that the database and table structure are correctly defined before any
data is loaded.

Step 2: Loading Data into the Table


-- Load data from CSV into the unicorns table with appropriate cleaning and mapping
LOAD DATA INFILE 'C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/Unicorn_Companies.csv'
INTO TABLE unicorns
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(@company, @valuation, @date_joined, @country, @city, @industry, @investors,
 @founded_year, @total_raised, @financial_stage, @investor_count, @deal_terms,
 @portfolio_exits)
SET
    company = @company,
    country = @country,
    sector = @industry,
    valuation = REPLACE(REPLACE(@valuation, '$', ''), ',', ''), -- Clean the valuation field
    founded_year = CASE
        WHEN @founded_year = 'None' OR @founded_year = '' THEN NULL
        ELSE CAST(@founded_year AS UNSIGNED)
    END, -- Handle invalid founded_year values
    investors = @investors;

Explanation:

a. The LOAD DATA INFILE command reads data from the provided CSV file and loads it into
the unicorns table.
b. The fields are terminated by commas and enclosed by double quotes, and the file uses
newline characters (\n) to separate rows. IGNORE 1 ROWS skips the header row.
c. Since the CSV file contains fields not relevant to the table (such as city and
total_raised), every column is first read into a user variable (e.g., @company,
@valuation), and only the relevant ones (company, country, sector, etc.) are mapped to
table columns in the SET clause.
d. Data Cleaning:
• The valuation field is cleaned to remove dollar signs ($) and commas before
being inserted into the table.
• For the founded_year field, invalid values such as the string None or an empty
string are converted to NULL.
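
The two cleaning rules in item d can also be checked outside MySQL. The following is a minimal Python sketch that mirrors the nested REPLACE calls and the CASE expression. The helper names clean_valuation and clean_founded_year are illustrative and not part of the original script, and the input strings are assumed to look like "$140".

```python
from decimal import Decimal
from typing import Optional


def clean_valuation(raw: str) -> Decimal:
    """Strip '$' and ',' (like the nested REPLACE calls), then convert to a decimal."""
    return Decimal(raw.replace("$", "").replace(",", ""))


def clean_founded_year(raw: str) -> Optional[int]:
    """Map 'None' or an empty string to None (SQL NULL); otherwise parse the year."""
    if raw in ("None", ""):
        return None
    return int(raw)


# Example inputs; the exact CSV formatting is an assumption.
print(clean_valuation("$140"))
print(clean_founded_year("None"))
print(clean_founded_year("2012"))
```

Using Decimal rather than float keeps the values exact, in the same spirit as the DECIMAL(10, 2) column in the table definition.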

Step 3: Answering Analysis Questions

This section contains 30 SQL queries designed to answer specific research questions
about unicorn companies. Each query is followed by an explanation of its purpose and
the result obtained from the analysis.

Query 1: Top 5 Countries by Number of Unicorns

SQL Code:

SELECT country, COUNT(*) AS num_unicorns
FROM unicorns
GROUP BY country
ORDER BY num_unicorns DESC
LIMIT 5;

Explanation:
• This query groups unicorns by country, counts how many unicorns are present in
each country, and orders the results by the number of unicorns in descending order.
• The query limits the result to the top 5 countries with the most unicorns.

Result:
Country Number of Unicorns
United States 9112
China 2856
India 1071
United Kingdom 714
Germany 408
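
COUNT(*) counts rows, not distinct companies, so it overstates the number of unicorns if the same CSV file is loaded more than once without recreating the table. The counts in this report (for example 9112, 2856 and 1071 above) are all multiples of 17, which is the pattern repeated loads would produce, so a cross-check may be worthwhile. The sketch below compares COUNT(*) with COUNT(DISTINCT company). It uses Python's built-in sqlite3 module as a stand-in for MySQL, which is an assumption made only to keep the example self-contained, and it inserts three rows from this report twice to mimic a double load.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE unicorns (company TEXT, country TEXT)")

# Company/country pairs taken from the Query 3 result in this report.
# Each row is inserted twice on purpose to mimic loading the same file twice.
rows = [
    ("Bytedance", "China"),
    ("Canva", "Australia"),
    ("Checkout.com", "United Kingdom"),
]
conn.executemany("INSERT INTO unicorns VALUES (?, ?)", rows + rows)

query = """
    SELECT country,
           COUNT(*) AS num_rows,
           COUNT(DISTINCT company) AS num_companies
    FROM unicorns
    GROUP BY country
    ORDER BY country
"""
result = conn.execute(query).fetchall()
for country, num_rows, num_companies in result:
    print(country, num_rows, num_companies)
```

If num_rows and num_companies differ for a country, the table holds repeated rows and the row counts should be read with that in mind.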

Query 2: Top 3 Sectors by Average Valuation

SQL Code:

SELECT sector, AVG(valuation) AS avg_valuation
FROM unicorns
GROUP BY sector
ORDER BY avg_valuation DESC
LIMIT 3;

Explanation:
• This query calculates the average valuation of unicorn companies in each sector
by using the AVG() function. It groups the data by sector and orders the
sectors by average valuation in descending order.
• The query is limited to the top 3 sectors with the highest average valuations.

Result:
Sector                                   Average Valuation (Billion USD)
Sequoia Capital, Thoma Bravo, Softbank   32
Finttech                                 10
Other                                    4.73

Query 3: Unicorns Founded After 2010

SQL Code:

SELECT company, country, sector, valuation, founded_year
FROM unicorns
WHERE founded_year > 2010;

Explanation:

• This query retrieves unicorns founded after the year 2010 by filtering the results
using the founded_year field.
• It provides the company's name, country, sector, valuation, and year of founding.

Result:
Company        Country          Sector                         Valuation (Billion USD)   Founded Year
Bytedance      China            Artificial Intelligence        140.0                     2012
Canva          Australia        Internet software & services   40.00                     2012
Checkout.com   United Kingdom   Fintech                        40.00                     2012

Query 4: Total Valuation of Unicorns in the FinTech Sector

SQL Code:

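SELECT SUM(valuation) AS total_valuation
FROM unicorns
WHERE sector = 'FinTech';

Explanation:

• This query calculates the total valuation of unicorn companies operating in the
FinTech sector using the SUM() function.
• MySQL's default collations compare strings case-insensitively, so the literal
'FinTech' also matches rows stored as 'Fintech'.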

Result:
Total Valuation (Billion USD)

13341.94
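
Whether a filter such as sector = 'FinTech' matches depends on the string comparison rules of the database and on how consistently the sector labels are spelled. The sketch below illustrates both points using Python's built-in sqlite3 module, where comparisons are case-sensitive by default. SQLite is an assumed stand-in for MySQL here, and "Placeholder Co" is a hypothetical company added only to carry the Finttech label that appears in the Query 2 result and looks like a misspelling of Fintech.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE unicorns (company TEXT, sector TEXT, valuation REAL)")
conn.executemany(
    "INSERT INTO unicorns VALUES (?, ?, ?)",
    [
        ("Checkout.com", "Fintech", 40.0),                # row from the Query 3 result
        ("Placeholder Co", "Finttech", 10.0),             # hypothetical row, misspelt label
        ("Bytedance", "Artificial Intelligence", 140.0),  # row from the Query 3 result
    ],
)


def total_valuation(where_clause: str) -> float:
    """Return SUM(valuation) for the given WHERE clause, or 0.0 if no rows match."""
    row = conn.execute(
        f"SELECT COALESCE(SUM(valuation), 0.0) FROM unicorns WHERE {where_clause}"
    ).fetchone()
    return row[0]


exact = total_valuation("sector = 'FinTech'")               # case-sensitive in SQLite
normalised = total_valuation("LOWER(sector) = 'fintech'")   # ignores letter case
with_typo = total_valuation("LOWER(sector) IN ('fintech', 'finttech')")

print(exact, normalised, with_typo)
```

Normalising the case makes the filter portable across engines, while the misspelt label is only picked up when it is listed explicitly or corrected in the data.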

Query 5: Most Common Investors

SQL Code:

SELECT investors, COUNT(*) AS investor_count
FROM unicorns
GROUP BY investors
ORDER BY investor_count DESC
LIMIT 5;

Explanation:

• This query groups unicorns by the full text of the investors field and counts how many
rows share each value. It orders the results by that count in descending order and
limits the output to the top 5 values.
• Because the investors field holds a comma-separated list, each group is a combination
of investors rather than a single investor.

Result:
Investor Name                                                      Number of Unicorns
Sequoia Capital                                                    51
Sequoia Capital China, Qiming Venture Partners, Tencent Holdings   34
Speedinvest, Valar Ventures, Uniqa Ventures                        34
Insight Partners, Sequoia Capital, Index Ventures                  34
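
To count individual investors instead of investor combinations, the comma-separated field has to be split first. The following Python sketch shows one way to do that. The function name count_investors is illustrative, the input strings are investor lists copied from results in this report, and treating None and Undisclosed as "no investor recorded" is an assumption.

```python
from collections import Counter

# Investor strings copied from result tables in this report, one per company row.
investor_fields = [
    "Sequoia Capital",
    "Sequoia Capital China, Qiming Venture Partners, Tencent Holdings",
    "Insight Partners, Sequoia Capital, Index Ventures",
    "Institutional Venture Partners, Sequoia Capital, General Atlantic",
    "None",
]

# Values assumed to mean that no investor was recorded for the row.
MISSING = {"None", "", "Undisclosed"}


def count_investors(fields):
    """Count how many rows list each individual investor."""
    counts = Counter()
    for field in fields:
        if field in MISSING:
            continue
        # A set avoids counting an investor twice within the same row.
        names = {name.strip() for name in field.split(",") if name.strip()}
        counts.update(names)
    return counts


counts = count_investors(investor_fields)
print(counts["Sequoia Capital"], counts["Sequoia Capital China"])
```

With this split, Sequoia Capital is credited for every row in which it appears, whether alone or alongside other investors.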

Query 6: Top 10 Companies by Valuation


SQL Code:

SELECT company, valuation
FROM unicorns
ORDER BY valuation DESC
LIMIT 10;

Explanation:

• This query retrieves the top 10 companies based on their valuation. It orders the
companies in descending order by their valuation and limits the result to the top 10.

Result:
Company Valuation (Billion USD)

Bytedance 140.0
SpaceX 100.3
Stripe 95.0
Klarna 46.0
Checkout.com 40.0
Instacart 39.0
Databricks 38.0
UiPath 35.0
Epic Games 32.0
Rivian 27.6

Query 7: Average Valuation of Companies Founded in 2010 or Later

SQL Code:

SELECT AVG(valuation) AS avg_valuation
FROM unicorns
WHERE founded_year >= 2010;

Explanation:

• This query calculates the average valuation of unicorn companies that were founded
in 2010 or later, using the AVG() function.

Result:
Average Valuation (Billion USD)

3.193929

Query 8: Number of Unicorns in Each Sector

SQL Code:

SELECT sector, COUNT(*) AS num_unicorns
FROM unicorns
GROUP BY sector
ORDER BY num_unicorns DESC;

Explanation:

• This query groups unicorns by sector, counts how many unicorns are in each sector,
and orders the results by the number of unicorns in descending order.

Result:
Sector Number of Unicorns

Fintech 3485
Internet software & services 3264
E-commerce & direct-to-consumer 1819
Artificial intelligence 1326
Health 1173

Query 9: Number of Unicorns per Decade

SQL Code:

SELECT FLOOR(founded_year/10)*10 AS decade, COUNT(*) AS num_unicorns
FROM unicorns
GROUP BY decade
ORDER BY decade;

Explanation:

• This query groups unicorn companies by the decade they were founded and counts
how many unicorns were founded in each decade.

Result:
Decade   Number of Unicorns
1970     34
1980     17
1990     391
2000     2465
2010     13447
2020     527

Query 10: Countries with the Most Unicorns in the FinTech Sector

SQL Code:
SELECT country, COUNT(*) AS num_unicorns
FROM unicorns
WHERE sector = 'FinTech'
GROUP BY country
ORDER BY num_unicorns DESC;

Explanation:

• This query counts how many unicorns in the FinTech sector are in each country
and orders the results by the number of unicorns in descending order.

Result:
Country          Number of FinTech Unicorns
United States    1836
United Kingdom   425
India            238
China            136
Germany          85

Query 11: Total Valuation of Unicorns in Each Country

SQL Code:
SELECT country, SUM(valuation) AS total_valuation
FROM unicorns
GROUP BY country
ORDER BY total_valuation DESC;

Explanation:

• This query calculates the total valuation of unicorns in each country by using the
SUM() function to aggregate valuations for each country.

Result:
Country          Total Valuation (Billion USD)
United States    31105.58
China            9735.73
India            3179.85
United Kingdom   3020.56
Germany          1133.90

Query 12: Average Valuation by Country

SQL Code:
SELECT country, AVG(valuation) AS avg_valuation
FROM unicorns
GROUP BY country
ORDER BY avg_valuation DESC;

Explanation:

• This query calculates the average valuation of unicorn companies in each country
using the AVG() function and orders the results by average valuation in descending
order.

Result:
Country          Average Valuation (Billion USD)
Indonesia        4.414286
United Kingdom   4.230476
Turkey           3.823333
Austria          3.805000
United States    3.413694

Query 13: Unicorns with Valuation Greater than $50 Billion

SQL Code:
SELECT company, valuation
FROM unicorns
WHERE valuation > 50;

Explanation:

• This query retrieves all unicorns with a valuation greater than $50 billion.

Result:
Company Valuation (Billion USD)
Bytedance 140.0
SpaceX 100.3
Stripe 95.0

Query 14: Number of Unicorns by Year of Founding

SQL Code:
SELECT founded_year, COUNT(*) AS num_unicorns
FROM unicorns
GROUP BY founded_year
ORDER BY founded_year;

Explanation:

• This query counts how many unicorns were founded in each year by grouping the
results by the founded_year field.

Result:
Founded Year Number of Unicorns

1919 17
1973 17
1979 17
1984 17
1990 17
1991 17
1992 34
1993 17
1994 34
1995 34
1996 17
1997 17
1998 68
1999 136
2000 204
2001 136
2002 51
2003 119
2004 136
2005 255
2006 221
2007 391
2008 391
2009 561
2010 629
2011 1292
2012 1479
2013 1445
2014 1785
2015 2448
2016 1717
2017 1139
2018 935
2019 578
2020 391
2021 136

Query 15: Investors with Most FinTech Unicorns

SQL Code:
SELECT investors, COUNT(*) AS num_unicorns
FROM unicorns
WHERE sector = 'FinTech'
GROUP BY investors
ORDER BY num_unicorns DESC
LIMIT 5;

Explanation:

• This query identifies the investor combinations with the most unicorn companies in
the FinTech sector by grouping the results by the investors field and counting the
number of FinTech unicorns in each group.

Result:
Investor Name                                                       Number of FinTech Unicorns
Tiger Global Management, Sequoia Capital India, Ribbit Capital      34
Khosla Ventures, LowercaseCapital, capitalG                         17
Institutional Venture Partners, Sequoia Capital, General Atlantic   17
Tiger Global Management, Insight Partners, DST Global               17
index Ventures, DST Global, Ribbit Capital                          17

Query 16: Investors with the Most Valuation in Unicorns

SQL Code:
SELECT investors, SUM(valuation) AS total_valuation
FROM unicorns
GROUP BY investors
ORDER BY total_valuation DESC
LIMIT 5;

Explanation:

• This query calculates the total valuation of unicorns for each value of the investors
field and ranks the results by total unicorn portfolio valuation.

Result:
Investor Name                                                             Total Valuation (Billion USD)
Sequoia Capital China, SIG Asia Investments, Sina Weibo, Softbank Group   2380.00
Founders Fund, Draper Fisher Jurvetson, Rothenberg Ventures               1705.10
Khosla Ventures, LowercaseCapital, capitalG                               1615.00
None                                                                      944.35
Institutional Venture Partners, Sequoia Capital, General Atlantic         775.20

Query 17: Sectors with Unicorns Founded in the Last 5 Years

SQL Code:
SELECT sector, COUNT(*) AS num_unicorns
FROM unicorns
WHERE founded_year >= YEAR(CURDATE()) - 5
GROUP BY sector
ORDER BY num_unicorns DESC;

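Explanation:

• This query counts, for each sector, the unicorn companies whose founding year falls
within the last five years relative to the date on which the query is run
(YEAR(CURDATE()) - 5).
• Because the dataset is a snapshot from November 2021, the result changes depending on
the year in which the query is executed.

Result: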
Sector Number of Unicorns
Fintech 221
E-commerce & direct-to-consumer 170
Internet software & services 153
Cybersecurity 119
Health 102
Artificial Intelligence 85
Other 68
Data management & analytics 51
Mobile & telecommunications 34
Consumer & retail 34
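
Since the cutoff in Query 17 moves with the current date, one option is to pin it to a fixed reference year so that the result stays reproducible. The sketch below passes the reference year as a query parameter. It uses Python's built-in sqlite3 module as an assumed stand-in for MySQL, the choice of 2021 as reference year follows the November 2021 snapshot date stated in this report, and the four rows are placeholders invented for illustration only.

```python
import sqlite3

REFERENCE_YEAR = 2021  # assumption: matches the November 2021 snapshot of the dataset

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE unicorns (company TEXT, sector TEXT, founded_year INTEGER)")
conn.executemany(
    "INSERT INTO unicorns VALUES (?, ?, ?)",
    [
        # Placeholder rows, invented for illustration only.
        ("Example A", "Fintech", 2019),
        ("Example B", "Fintech", 2016),
        ("Example C", "Health", 2015),
        ("Example D", "Health", None),  # missing founding year (NULL)
    ],
)

recent_by_sector = conn.execute(
    """
    SELECT sector, COUNT(*) AS num_unicorns
    FROM unicorns
    WHERE founded_year >= ? - 5
    GROUP BY sector
    ORDER BY num_unicorns DESC
    """,
    (REFERENCE_YEAR,),
).fetchall()

print(recent_by_sector)
```

Rows with a NULL founding year never satisfy the comparison, so they are left out of the count in this sketch, which is also how the comparison behaves in MySQL.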

Query 18: Countries with Unicorns Valued Over $10 Billion

SQL Code:
SELECT country, COUNT(*) AS num_unicorns
FROM unicorns
WHERE valuation > 10
GROUP BY country
ORDER BY num_unicorns DESC;

Explanation:

• This query retrieves countries with unicorn companies valued over $10 billion and
counts how many such companies exist in each country.

Result:
Country          Number of Unicorns
United States    374
China            153
United Kingdom   51
India            34
Sweden           17
Australia        17
Bahamas          17
Indonesia        17
Germany          17


Query 19: Top 5 Investors in the Robotics Sector

SQL Code:
SELECT investors, COUNT(*) AS num_unicorns
FROM unicorns
WHERE sector = 'Robotics'
GROUP BY investors
ORDER BY num_unicorns DESC
LIMIT 5;

Explanation:

• This query retrieves the top 5 investors who have the most unicorn companies in the
Robotics sector.

Result:
There are no companies specifically categorized under the "Robotics" sector.

Query 20: Valuation of Unicorns Founded in 2005

SQL Code:
SELECT company, valuation
FROM unicorns
WHERE founded_year = 2005;

Explanation:

• This query retrieves unicorn companies that were founded in 2005 along with their
valuation.

Result:
Company Valuation (Billion USD)
Klarna 45.60
reddit 10.00
1Password 6.80
RELEX Solutions 5.70
Odoo 2.30
Huaqin Telecom Technology 2.19
SoundHound 2.10
Automattic 1.80
Yiguo 1.20
Yanolja 1.00

Query 21: Average Valuation of Unicorns per Year

SQL Code:
SELECT founded_year, AVG(valuation) AS avg_valuation
FROM unicorns
GROUP BY founded_year
ORDER BY founded_year;

Explanation:

• This query calculates the average valuation of unicorn companies for each founding
year.

Result:
Founded Year Average Valuation (Billion USD)
1919 3.52
1973 2.00
1979 1.59
1984 1.40

Query 22: Sectors with Unicorns Valued Under $5 Billion

SQL Code:
SELECT sector, COUNT(*) AS num_unicorns
FROM unicorns
WHERE valuation < 5
AND sector NOT LIKE '%Capital%'
AND sector NOT LIKE '%Ventures%'
AND sector NOT LIKE '%Partners%'
AND sector NOT LIKE '%Management%'
AND sector NOT LIKE '%Investments%'
GROUP BY sector
ORDER BY num_unicorns DESC;

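Explanation:

• This query retrieves sectors with unicorns that have a valuation of less than $5 billion
and counts how many such companies exist in each sector.
• The NOT LIKE conditions exclude rows whose sector value contains investor-style words
(Capital, Ventures, Partners, Management, Investments), such as the
"Sequoia Capital, Thoma Bravo, Softbank" value that appears as a sector in Query 2.

Result:
Sector                                Number of Unicorns
Internet software & services          2890
Fintech                               2873
E-commerce & direct-to-consumer       1581
Artificial intelligence               1139
Health                                1003
Supply chain, logistics, & delivery   850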
Supply chain, logistics, & delivery 850

Query 23: Top 5 Countries with Highest Average Unicorn Valuation

SQL Code:
SELECT country, AVG(valuation) AS avg_valuation
FROM unicorns
GROUP BY country
ORDER BY avg_valuation DESC
LIMIT 5;

Explanation:

• This query retrieves the top 5 countries with the highest average unicorn valuations.

Result:
Country Average Valuation (Billion USD)
Bahamas 32
Sweden 10.08
Australia 8.43
Estonia 4.95
Lithuania 4.5
Step 5: Challenges and In-Depth Analysis

In this step, we address the challenges posed by the dataset, focusing on trend
identification, investor analysis, and growth analysis. Each challenge will be
explored through SQL queries, and the results will be presented in table format to
provide detailed insights into the trends and patterns within the unicorn
companies dataset.

Challenge 1: Identifying Trends – Growth of Unicorns in Specific Sectors or Countries

SQL Code:
SELECT sector, country, COUNT(*) AS num_unicorns
FROM unicorns
GROUP BY sector, country
ORDER BY num_unicorns DESC;

Explanation:

• This query counts unicorn companies across two dimensions at once: sector and
country. It shows where unicorns are concentrated; it does not use the founding year,
so it does not measure growth over time.

• By grouping the dataset by both sector and country, this query provides insights into
which sectors are producing the most unicorns, and in which countries these
unicorns are concentrated.

• The result is ordered in descending order, showing which sector-country
combinations have the highest number of unicorns.

Analysis:

• Sector and Country Pairings: This challenge explores the relationship between
different sectors and countries, helping to identify global trends in specific industries.
For instance, it is important to see which countries are dominating in high-growth
sectors such as FinTech, Artificial Intelligence, and E-commerce.

• Strategic Insights: This data can be invaluable for investors or policymakers, as it
highlights regions that may be leading in innovation and entrepreneurship within
specific industries.

Result:
Sector                         Country         Number of Unicorns
Internet software & services   United States   2431
Fintech                        United States   1836
Health                         United States   850
Artificial intelligence        United States   731
Cybersecurity                  United States   646

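Conclusion:

• The five largest sector-country combinations are all in the United States, led by
Internet software & services (2431) and Fintech (1836), followed by Health (850),
Artificial intelligence (731), and Cybersecurity (646).

• The concentration of FinTech unicorns in the United States is consistent with
Query 10, where the United States (1836) is well ahead of the United Kingdom (425),
India (238), China (136), and Germany (85).

• The positions of China in Artificial Intelligence and India in E-commerce cannot be
read from the top five rows shown above and would need the full result set to confirm.
No conclusion can be drawn about Robotics, because Query 19 found no companies
categorized under that sector.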

Challenge 2: Investor Analysis – Which Investors Have the Most Unicorns in Their Portfolio?

SQL Code:
SELECT investors, COUNT(*) AS num_unicorns
FROM unicorns
GROUP BY investors
ORDER BY num_unicorns DESC;

Explanation:

• This query counts how many rows share each value of the investors field and orders
the results in descending order by that count.

• The investors field contains the comma-separated names of the investors associated
with each unicorn, so each group is a distinct investor combination. An investor that
appears in several combinations is spread across several groups.

Analysis:

• This analysis gives an overview of which investors are most influential in the unicorn
ecosystem. Investors with a high count of unicorn companies in their portfolio are
likely key players in the global startup scene.

• Investors that are consistently associated with high-growth companies provide
valuable insights for emerging startups seeking investment. Knowing which investors
frequently back unicorns can help guide entrepreneurs toward potential funding
opportunities.

Result:
Investor Name                                                      Number of Unicorns
None                                                               289
Sequoia Capital                                                    51
Sequoia Capital China, Qiming Venture Partners, Tencent Holdings   34
Speedinvest, Valar Ventures, Uniqa Ventures                        34
Insight Partners, Sequoia Capital, Index Ventures                  34
Undisclosed                                                        34

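Conclusion:

• The largest group is None (289 rows), which marks rows with no investor recorded and
therefore does not represent an investor. Undisclosed (34) is a similar case.

• Among named entries, Sequoia Capital listed on its own leads with 51, followed by
three investor combinations with 34 each: Sequoia Capital China, Qiming Venture
Partners, Tencent Holdings; Speedinvest, Valar Ventures, Uniqa Ventures; and Insight
Partners, Sequoia Capital, Index Ventures.

• Because the grouping is by the full investors string, these counts describe investor
combinations. Sequoia Capital also appears inside other combinations, so its total
across the table is higher than 51.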

Challenge 3: Growth Analysis – Compare Valuations of Companies Founded in Different Decades

SQL Code:
SELECT FLOOR(founded_year/10)*10 AS decade, SUM(valuation) AS total_valuation
FROM unicorns
GROUP BY decade
ORDER BY total_valuation DESC;

Explanation:

• This query analyzes the growth of unicorn valuations over time by grouping the
unicorn companies into decades based on their founding year.

• The FLOOR() function is used to round the founding year down to the nearest
decade (e.g., 2010 becomes 2010, 2005 becomes 2000), and the SUM() function
calculates the total valuation of all unicorns founded in that decade.

• The results are ordered by total valuation in descending order, showing which
decades produced the most valuable unicorns.
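
The decade bucketing described above can be worked through on a few rows. The following Python sketch mirrors FLOOR(founded_year/10)*10 with integer division and sums valuations per decade. The function name founding_decade is illustrative, None stands in for SQL NULL, and the four rows are taken from the Query 3 and Query 20 results in this report.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Optional


def founding_decade(founded_year: Optional[int]) -> Optional[int]:
    """Mirror FLOOR(founded_year/10)*10; None stands in for SQL NULL."""
    if founded_year is None:
        return None
    return (founded_year // 10) * 10


# (company, valuation in billion USD, founding year), taken from this report.
rows = [
    ("Bytedance", Decimal("140.0"), 2012),
    ("Canva", Decimal("40.00"), 2012),
    ("Klarna", Decimal("45.60"), 2005),
    ("reddit", Decimal("10.00"), 2005),
]

# Sum valuations per decade, like SUM(valuation) with GROUP BY decade.
totals = defaultdict(Decimal)
for _company, valuation, year in rows:
    totals[founding_decade(year)] += valuation

# Order by total valuation, highest first, like ORDER BY total_valuation DESC.
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
print(ranked)
```

Rows without a founding year end up in a separate group keyed by None, just as NULL forms its own group in the SQL result.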

Analysis:

• This analysis is crucial for understanding how unicorn valuations have evolved over
time. It highlights which periods saw the greatest surge in unicorn company
formation and valuation growth.

• By examining growth patterns by decade, we can see the impact of broader
economic conditions, technological advancements, and shifts in consumer behavior
on unicorn formation and valuation.

Result:
Decade   Total Valuation (Billion USD)
2010     43721.96
2000     9144.47
1990     2376.26
-        1738.76
2020     910.01
1970     61.03
1910     59.84
1980     23.80

Conclusion:

a. The 2010s stand out as the most prolific decade for unicorn companies, with a total
valuation of 43721.96 billion USD, far surpassing the 2000s (9144.47 billion USD) and
every other decade.

b. Companies founded in the 2020s already account for 910.01 billion USD in total
valuation, even though the November 2021 dataset covers less than two years of that
decade.

c. The 2000s were also a foundational period, especially for early tech startups, but the
much larger total for the 2010s marks a clear shift in the pace and scale of
unicorn company formation.

Conclusion
Through these three challenges, we can observe:

1. The United States and China have the most unicorns, and Fintech and Internet
software & services are the largest sectors by number of unicorns.

2. Sequoia Capital is the most frequently listed named investor, appearing both on its
own and within other investor combinations, which indicates its pivotal role in
shaping the global startup landscape.

3. The 2010s represent the golden age of unicorn formation, with a far higher total
valuation than any other decade. Early data also shows that companies founded in the
2020s are already reaching unicorn status.
