Inner Architecture of A Social Networking System: Petr Kunc, Jaroslav Škrabálek, Tomáš Pitner

This document summarizes the key aspects of the social networking system Takeplace, which was built using Hadoop, HBase, and Memcached. It discusses the functional and technical requirements including high performance, scalability, and handling billions of rows. It provides an overview of the technologies used, including Hadoop for distributed processing, HBase for distributed storage, and Memcached for caching. It also includes diagrams of the system architecture showing how user data is stored across multiple tables in HBase and how the news feed is generated and cached in Memcached.


Inner Architecture of a Social Networking System

Petr Kunc, Jaroslav Škrabálek, Tomáš Pitner

Who am I?
Master's student at FI MU
Member of LaSArIS
Webtops
Modern web applications
Cloud (and distributed) solutions

First-time speaker at a conference

Social network systems


Hundreds of millions of users => advanced software architecture and technologies
High performance
Scalability
Billions of rows

Table of contents
What and why?
Takeplace
Which way?
Hadoop
HBase
Memcached
How?
Architecture and design
Was it worth it?
Testing

Takeplace

Takeplace and Social Networking
Web-based service facilitating the organization of events, based on meeting, sharing and communication
Emphasis on social and interpersonal interaction
Easy tool for commenting on conferences (feedback)
Professional user network: creating relations between the academic and professional worlds around common interests
Analysis and statistics
Goal: behave like Facebook, with relations like Twitter, and be used like LinkedIn

Functional requirements
Entities can create asymmetric relations
Posts
Walls and news feed
Comments and likes

Technology requirements
Linux and Cloud
Data-oriented application
High throughput
Heavy loads
Concurrent requests

Caching tool

Relational databases
Fixed schema, ACID, indexes, joins
Problems
Scaling up dataset size
Read/write concurrency

Typical use of MySQL: production => Memcached (losing ACID) => costly server => denormalizing => materializing the most common queries => dropping triggers and indexes (each step is a compromise or expensive)

HBase

Inspired by Google BigTable


Regions
4 dimensions
Multidimensional, sorted, persistent, distributed key-value map
Keys & values = arrays of bytes
Row, column family (CF), column & version

Example
{
  aa : {
    cf : {
      c1 : data,
      c2 : data
    },
    cf2 : {
      anyByteArray : true
    }
  },
  ab : { }
}
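The nested structure above can be modelled in plain Java as a map of sorted maps. The following is only a minimal in-memory sketch of the four dimensions (row, column family, column, version); it is not the HBase client API. The class name `SortedCellMap` and its methods are hypothetical, and `String` keys are used for readability even though HBase keys are byte arrays.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory model of the data layout; NOT the HBase client API.
class SortedCellMap {
    // row -> column family -> column -> version (timestamp) -> value
    private final NavigableMap<String,
            NavigableMap<String,
                    NavigableMap<String,
                            NavigableMap<Long, byte[]>>>> rows = new TreeMap<>();

    // Stores one cell version, creating the intermediate levels on demand.
    void put(String row, String family, String column, long version, byte[] value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(family, f -> new TreeMap<>())
            .computeIfAbsent(column, c -> new TreeMap<>())
            .put(version, value);
    }

    // Returns the value with the highest version, or null if the cell is absent.
    byte[] getLatest(String row, String family, String column) {
        NavigableMap<String, NavigableMap<String, NavigableMap<Long, byte[]>>> families = rows.get(row);
        if (families == null) return null;
        NavigableMap<String, NavigableMap<Long, byte[]>> columns = families.get(family);
        if (columns == null) return null;
        NavigableMap<Long, byte[]> versions = columns.get(column);
        if (versions == null || versions.isEmpty()) return null;
        return versions.lastEntry().getValue();
    }

    // Row keys come back in sorted order because the outer map is a TreeMap.
    Iterable<String> rowKeys() {
        return rows.keySet();
    }
}
```

Because every level is sorted, a scan over neighbouring row keys is a simple in-order walk, which is why the choice of row key matters so much later in the talk.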

Hadoop
Software framework, the backbone of the distributed environment
MapReduce

HDFS

HBase

No real indexes
Automatic partitioning
Scales linearly and automatically
Parallel
Cheap
Not for everyone
Write once, read many
Built on top of Hadoop

Memcached
Distributed cache
Typical usage
public Data getData(String query) {
    // Try the cache first.
    Data data = memcached.get(query);
    if (data == null) {
        // Cache miss: read from the database and populate the cache.
        data = database.get(query);
        memcached.set(query, data);
    }
    return data;
}
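The read path above says nothing about writes, which is where the loss of ACID guarantees shows up. A minimal sketch of one common way to handle it, assuming the cache entry is simply invalidated on every write, could look as follows. Both maps are in-memory stand-ins (not a Memcached client or a real database), and the class name `CacheAsideDemo` is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo: both maps are in-memory stand-ins, not real clients.
class CacheAsideDemo {
    private final Map<String, String> database = new HashMap<>(); // stand-in for the database
    private final Map<String, String> cache = new HashMap<>();    // stand-in for Memcached
    int databaseReads = 0;                                         // counts cache misses

    // Read path: cache first, database only on a miss.
    String get(String key) {
        String value = cache.get(key);
        if (value == null) {
            databaseReads++;
            value = database.get(key);
            if (value != null) {
                cache.put(key, value);
            }
        }
        return value;
    }

    // Write path: update the database, then invalidate the stale cache entry.
    // Keeping the two in sync is the application's job (assumed strategy).
    void put(String key, String value) {
        database.put(key, value);
        cache.remove(key);
    }

    public static void main(String[] args) {
        CacheAsideDemo demo = new CacheAsideDemo();
        demo.put("user:1", "Alice");
        System.out.println(demo.get("user:1")); // miss, served by the database
        System.out.println(demo.get("user:1")); // hit, served by the cache
        System.out.println("database reads: " + demo.databaseReads);
    }
}
```

In this sketch two consecutive reads of the same key cost a single database read; a later `put` forces the next read back to the database.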

Architecture

Architecture (2)

To be used in any system


Interface of services (REST, SOAP, ...)
User tables
Services: Follow, Wall, Like and Discussion
Security

Architecture (3)

User ID transformation

Data!
Three tables
Entities
Followers, Following, Blocked, Count, News

Walls
Info, text, likes

Discussions (similar to Walls)
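To make the asymmetric relations behind the Followers, Following, Blocked and Count data more concrete, here is a small in-memory sketch. It only illustrates the bookkeeping; how Takeplace actually lays these out in HBase is not shown in the slides, so the class `FollowGraph`, its methods and the rule that a blocked entity cannot follow are all assumptions.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of asymmetric relations; not the actual table schema.
class FollowGraph {
    private final Map<String, Set<String>> following = new HashMap<>(); // entity -> whom it follows
    private final Map<String, Set<String>> followers = new HashMap<>(); // entity -> who follows it
    private final Map<String, Set<String>> blocked = new HashMap<>();   // entity -> whom it blocked

    // The owner refuses future follow requests from the blocked entity.
    void block(String owner, String blockedEntity) {
        blocked.computeIfAbsent(owner, o -> new HashSet<>()).add(blockedEntity);
    }

    // Creates a one-way relation; returns false if it is not allowed.
    boolean follow(String follower, String target) {
        if (follower.equals(target)) return false;
        if (blocked.getOrDefault(target, Collections.emptySet()).contains(follower)) return false;
        following.computeIfAbsent(follower, f -> new HashSet<>()).add(target);
        followers.computeIfAbsent(target, t -> new HashSet<>()).add(follower);
        return true;
    }

    boolean isFollowing(String follower, String target) {
        return following.getOrDefault(follower, Collections.emptySet()).contains(target);
    }

    // Corresponds to the stored count: derived here instead of being persisted.
    int followerCount(String entity) {
        return followers.getOrDefault(entity, Collections.emptySet()).size();
    }
}
```

Storing both directions duplicates data, but it lets either question ("whom do I follow?" and "who follows me?") be answered from a single row without a join.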

Storing data

Row IDs! Performance!


Sorted lexicographically
Sequential scanner
UID (constant length)
yyyymmddhhmmssSSS
Inverted bytes -> newest to oldest
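The row key scheme can be sketched as follows: a constant-length UID followed by the timestamp with every byte inverted, so that an ascending lexicographic scan returns the newest rows first. This is a sketch under assumptions: the UID length of 8, the ASCII encoding and the names `RowKeys`, `rowKey` and `timestampOf` are hypothetical, and `Arrays.compareUnsigned` (Java 9+) stands in for the unsigned byte ordering a sorted store would apply.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical row key builder; lengths and encoding are assumptions.
class RowKeys {
    static final int UID_LENGTH = 8;        // assumed constant UID length
    static final int TIMESTAMP_LENGTH = 17; // yyyymmddhhmmssSSS

    // Key = UID bytes followed by the bitwise-inverted timestamp bytes.
    static byte[] rowKey(String uid, String timestamp) {
        byte[] uidBytes = uid.getBytes(StandardCharsets.US_ASCII);
        byte[] tsBytes = timestamp.getBytes(StandardCharsets.US_ASCII);
        if (uidBytes.length != UID_LENGTH || tsBytes.length != TIMESTAMP_LENGTH) {
            throw new IllegalArgumentException("UID and timestamp must have constant length");
        }
        byte[] key = Arrays.copyOf(uidBytes, UID_LENGTH + TIMESTAMP_LENGTH);
        for (int i = 0; i < TIMESTAMP_LENGTH; i++) {
            key[UID_LENGTH + i] = (byte) ~tsBytes[i];
        }
        return key;
    }

    // Recovers the readable timestamp by inverting the bytes again.
    static String timestampOf(byte[] key) {
        byte[] ts = new byte[TIMESTAMP_LENGTH];
        for (int i = 0; i < TIMESTAMP_LENGTH; i++) {
            ts[i] = (byte) ~key[UID_LENGTH + i];
        }
        return new String(ts, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[][] keys = {
            rowKey("user0001", "20120101120000000"),
            rowKey("user0001", "20120301090000000"),
            rowKey("user0001", "20120201180000000"),
        };
        // Unsigned lexicographic order, as a sorted byte-array store would use.
        Comparator<byte[]> unsignedOrder = Arrays::compareUnsigned;
        Arrays.sort(keys, unsignedOrder);
        for (byte[] key : keys) {
            System.out.println(timestampOf(key));
        }
    }
}
```

Running `main` prints the March timestamp first, then February, then January: the newest post of a user is the first row a sequential scan reaches, with no secondary index involved.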

News feed
One by one (slow)

OR
Store news at each profile (great redundancy)

MEMCACHED!
Post put in DB => search followers => store minimized post in Memcached => add links to followers' news feeds => 1 normal query & 1 batch query to Memcached
TTL (LRU)
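The flow above can be sketched end to end with in-memory stand-ins. This is an illustration under assumptions, not the Takeplace code: the maps replace HBase and Memcached, expiry (TTL/LRU) is left out, "minimized" is taken to mean a truncated summary, and all names (`NewsFeedSketch`, `publish`, `readFeed`, `SUMMARY_LENGTH`) are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical fan-out-on-write sketch; maps stand in for HBase and Memcached.
class NewsFeedSketch {
    static final int SUMMARY_LENGTH = 40; // assumed size of a "minimized" post

    private final Map<String, String> postTable = new HashMap<>();          // stand-in for the DB
    private final Map<String, Set<String>> followersOf = new HashMap<>();   // author -> followers
    private final Map<String, String> cachedPosts = new HashMap<>();        // cache: post id -> summary
    private final Map<String, Deque<String>> cachedFeeds = new HashMap<>(); // cache: user -> post links

    void follow(String follower, String author) {
        followersOf.computeIfAbsent(author, a -> new HashSet<>()).add(follower);
    }

    // Write path: store the post, cache a minimized copy, link it into each follower's feed.
    void publish(String postId, String author, String text) {
        postTable.put(postId, text);
        String summary = text.length() <= SUMMARY_LENGTH ? text : text.substring(0, SUMMARY_LENGTH);
        cachedPosts.put(postId, summary);
        for (String follower : followersOf.getOrDefault(author, Collections.emptySet())) {
            cachedFeeds.computeIfAbsent(follower, f -> new ArrayDeque<>()).addFirst(postId);
        }
    }

    // Read path: one lookup for the list of links, then one batch lookup for the posts.
    List<String> readFeed(String user) {
        List<String> feed = new ArrayList<>();
        Deque<String> links = cachedFeeds.get(user); // the "1 normal query"
        if (links == null) return feed;
        for (String postId : links) {                // stands in for the "1 batch query"
            String summary = cachedPosts.get(postId);
            if (summary != null) {                   // an expired entry would simply be skipped
                feed.add(summary);
            }
        }
        return feed;
    }
}
```

Only links are duplicated per follower, while each minimized post is cached once, which avoids both the slow one-by-one assembly and the heavy redundancy of storing full news items at every profile.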

Conclusion
Pros
High volume data distribution
Scalability
High throughput
Heavy data load (write once, read many)

Cons
Losing relations, indexes, triggers, ...
Responsibility for consistent data
Still not sure how it will behave when deployed to production
