YouTube Architecture
Wednesday, March 12, 2008 at 3:54PM
Todd Hoff in Apache, CDN, Example, Linux, MySQL, Python, Shard, lighttpd
YouTube grew incredibly fast, to over 100 million video views per day,
with only a handful of people responsible for scaling the site. How did
they manage to deliver all that video to all those users? And how have
they evolved since being acquired by Google?
Information Sources
1. Google Video
Platform
1. Apache
2. Python
3. Linux (SuSe)
4. MySQL
5. psyco, a dynamic python->C compiler
6. lighttpd for video instead of Apache
http://weibo.com/developerworks compiled 2012-11-11
http://highscalability.com/youtube-architecture
What's Inside?
The Stats
1. Supports the delivery of over 100 million videos per day.
2. Founded 2/2005
3. 3/2006 30 million video views/day
4. 7/2006 100 million video views/day
5. 2 sysadmins, 2 scalability software architects
6. 2 feature developers, 2 network engineers, 1 DBA
Web Servers
1. A NetScaler is used for load balancing and caching static content.
2. Run Apache with mod_fastcgi.
3. Requests are routed for handling by a Python application server.
4. The application server talks to various databases and other information
sources to get all the data and formats the HTML page.
5. Can usually scale web tier by adding more machines.
6. The Python web code is usually NOT the bottleneck; it spends most
of its time blocked on RPCs.
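The web-tier flow above can be sketched in Python. This is a hypothetical illustration, not YouTube's code: the function names (`fetch_video_metadata`, `fetch_related_videos`) and the returned data are made up, standing in for the backend RPCs the page assembly waits on.

```python
# Illustrative sketch: Apache (via mod_fastcgi) hands the request to a
# Python application server, which gathers data from several backends
# and formats the HTML page. All names and data here are hypothetical.

def fetch_video_metadata(video_id):
    # In production this would be an RPC to a metadata store; the
    # Python code mostly sits blocked on calls like this one.
    return {"id": video_id, "title": "example", "views": 12345}

def fetch_related_videos(video_id):
    # Another backend call; also hypothetical.
    return [{"id": video_id + i} for i in range(1, 4)]

def render_watch_page(video_id):
    meta = fetch_video_metadata(video_id)
    related = fetch_related_videos(video_id)
    items = "".join("<li>%s</li>" % r["id"] for r in related)
    return "<h1>%s</h1><ul>%s</ul>" % (meta["title"], items)

print(render_watch_page(100))
```

Because each page blends several such calls, adding web machines scales the tier easily; the real cost sits behind the RPCs.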
Video Serving
Costs include bandwidth, hardware, and power consumption.
Each video is hosted by a mini-cluster, so each video is served by more
than one machine.
Using a cluster means:
- More disks serving content which means more speed.
- Headroom. If a machine goes down others can take over.
- There are online backups.
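A minimal sketch of the mini-cluster idea, under assumptions of my own: each video maps deterministically to a small group of machines, and any live machine in the group can serve it. The machine names, replica count, and placement scheme are all illustrative.

```python
# Hypothetical mini-cluster placement: each video is served by a small
# group of machines, so one failure leaves headroom.
MACHINES = ["vid-a", "vid-b", "vid-c", "vid-d", "vid-e", "vid-f"]
REPLICAS = 3  # illustrative replica count

def cluster_for_video(video_id):
    # Deterministic placement: consecutive machines starting from a
    # position derived from the video id.
    start = video_id % len(MACHINES)
    return [MACHINES[(start + i) % len(MACHINES)] for i in range(REPLICAS)]

def pick_server(video_id, down=()):
    # Any machine in the mini-cluster that is up can serve the video.
    for machine in cluster_for_video(video_id):
        if machine not in down:
            return machine
    raise RuntimeError("whole mini-cluster is down")

print(cluster_for_video(10))            # three machines hold video 10
print(pick_server(10, down={"vid-e"}))  # failover within the cluster
```

More disks per video means more read throughput, and the surviving replicas double as online backups.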
Servers use the lighttpd web server for video:
- Apache had too much overhead.
- Uses epoll to wait on multiple fds.
- Switched from a single-process to a multiple-process configuration to
handle more connections.
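The lighttpd settings implied above might look like the following config fragment. The directive names (`server.event-handler`, `server.max-worker`, `server.max-fds`) are real lighttpd options, but the values are illustrative guesses, not YouTube's actual configuration.

```
# lighttpd.conf fragment (illustrative values)
server.event-handler = "linux-sysepoll"  # epoll: wait on many fds cheaply
server.max-worker    = 4                 # multiple processes, more connections
server.max-fds       = 8192              # raise the fd ceiling for video serving
```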
Most popular content is moved to a CDN (content delivery network):
- CDNs replicate content in multiple places. There's a better chance of
content being closer to the user, with fewer hops, and content will run
over a more friendly network.
- CDN machines mostly serve out of memory because the content is so
popular there's little thrashing of content into and out of memory.
Less popular content (1-20 views per day) uses YouTube servers in
various colo sites.
- There's a long-tail effect. A video may have a few plays, but lots of
videos are being played. Random disk blocks are being accessed.
- Caching doesn't do a lot of good in this scenario, so spending money on
more cache may not make sense. This is a very interesting point: if you
have a long-tail product, caching won't always be your performance savior.
- Tune RAID controller and pay attention to other lower level issues to
help.
- Tune memory on each machine so there's not too much and not too little.
Serving Thumbnails
Surprisingly difficult to do efficiently.
There are about four thumbnails for each video, so there are a lot more
thumbnails than videos.
Databases
1. The Early Years
- Use MySQL to store metadata like users, tags, and descriptions.
- Served data off a monolithic RAID 10 volume with 10 disks.
- Living off credit cards so they leased hardware. When they needed
more hardware to handle load it took a few days to order and get
delivered.
- They went through a common evolution: single server, went to a
single master with multiple read slaves, then partitioned the
database, and then settled on a sharding approach.
- Suffered from replica lag. The master is multi-threaded and runs
on a large machine so it can handle a lot of work. Slaves are
single-threaded, usually run on lesser machines, and replication is
asynchronous, so the slaves can lag significantly behind the master.
- Updates cause cache misses, which go to disk, where slow I/O
causes slow replication.
- With a replicated architecture you need to spend a lot of money
for incremental bits of write performance.
- One of their solutions was to prioritize traffic by splitting the data
into two clusters: a video watch pool and a general cluster. The idea
is that people want to watch video so that function should get the
most resources. The social networking features of YouTube are less
important so they can be routed to a less capable cluster.
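That traffic split can be sketched as a simple routing rule. The pool names and request types below are hypothetical, chosen only to illustrate the idea of steering core traffic to the better-provisioned cluster.

```python
# Illustrative routing for the two-cluster split described above.
# Pool names and request-type labels are made up for this sketch.
WATCH_POOL = "watch-cluster"      # well-provisioned: video watching
GENERAL_POOL = "general-cluster"  # less capable: everything else

WATCH_REQUESTS = {"watch", "video_metadata"}

def pick_db_pool(request_type):
    # Watching video is the core function, so it gets the best
    # resources; social features tolerate the lesser cluster.
    return WATCH_POOL if request_type in WATCH_REQUESTS else GENERAL_POOL

print(pick_db_pool("watch"))
print(pick_db_pool("comments"))
```

The design choice is graceful degradation: under load, comments and profiles slow down before playback does.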
2. The later years:
- Went to database partitioning.
- Split into shards with users assigned to different shards.
- Spreads writes and reads.
- Much better cache locality, which means less I/O.
- Resulted in a 30% hardware reduction.
- Reduced replica lag to 0.
- Can now scale database almost arbitrarily.
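User-based sharding as described can be sketched in a few lines. This is a minimal illustration under my own assumptions: the shard count and the hashing scheme are hypothetical, and the point is only that a user's id maps deterministically to one shard, keeping that user's reads and writes together.

```python
# Minimal user->shard mapping sketch. NUM_SHARDS and the md5-based
# hash are illustrative choices, not YouTube's actual scheme.
import hashlib

NUM_SHARDS = 16

def shard_for_user(user_id):
    # Stable hash so the mapping survives process restarts and is the
    # same on every application server.
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# All of a given user's traffic lands on the same shard:
print({u: shard_for_user(u) for u in (1, 2, 3)})
```

Because each shard holds a fixed slice of users, its working set fits in cache (less I/O), writes spread across shards, and a slave only replays its own shard's writes, which is how replica lag drops to zero.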
Lessons Learned
1. Stall for time. Creative and risky tricks can help you cope in the
short term while you work out longer term solutions.
2. Prioritize. Know what's essential to your service and prioritize your
resources and efforts around those priorities.
3. Pick your battles. Don't be afraid to outsource some essential
services. YouTube uses a CDN to distribute their most popular
content. Creating their own network would have taken too long and
cost too much. You may have similar opportunities in your system.
Take a look at Software as a Service for more ideas.
4. Keep it simple! Simplicity allows you to rearchitect more quickly
so you can respond to problems. It's true that nobody really knows
what simplicity is, but if you aren't afraid to make changes then
that's a good sign simplicity is happening.
5. Shard. Sharding helps to isolate and constrain storage, CPU,
memory, and I/O. It's not just about getting more write performance.
6. Constant iteration on bottlenecks:
- Software: DB, caching
- OS: disk I/O
- Hardware: memory, RAID
7. You succeed as a team. Have a good cross discipline team that
understands the whole system and what's underneath the system.
People who can set up printers, machines, install networks, and so
on. With a good team all things are possible.