CST3624 - Lab 1
WEEKLY READINGS
Lab 1: Part 1. Based on the material above, write a 2-page summary for the
Reading/Instructional videos part and a 1-page discussion for the topic in the Discussions part.
Include your personal opinion on the matter. Total 3 pages, using your own words.
Unlike traditional relational databases, which store data in tables of rows and columns, NoSQL
databases use flexible or schema-less data models that make it easier to scale to larger data sets.
NoSQL databases are generally classified into four main categories: key-value, column-based,
document-oriented, and graph-based. Document databases, for example, store data in formats such
as JSON or XML and can be queried using document-oriented query languages; the popular MongoDB
is one of them. NoSQL databases can be faster for some workloads because they avoid the fixed
schemas and joins of relational systems.
High Scalability
NoSQL databases can handle huge amounts of data. How much depends on the specific database, but
most scale horizontally (adding more servers) rather than vertically (adding more power to one
server).
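Horizontal scaling can be illustrated with a minimal sketch of hash-based sharding, where each key is assigned to one of several servers. This is an assumption-laden toy: the three in-memory dictionaries stand in for separate servers, and the names shard_for, put, and get are hypothetical, not any database's real API.

```python
import hashlib

# Three in-memory dicts stand in for three separate servers (assumption).
shards = [dict() for _ in range(3)]

def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

def put(key: str, value) -> None:
    """Store the value on whichever shard owns the key."""
    shards[shard_for(key, len(shards))][key] = value

def get(key: str):
    """Look the key up on the shard that owns it."""
    return shards[shard_for(key, len(shards))].get(key)

put("user:1", "Ada")
put("user:2", "Grace")
print(get("user:1"))
```

Adding capacity then means adding more shards rather than buying a bigger machine. Real systems usually use schemes such as consistent hashing so that adding a shard moves only a fraction of the keys.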
High Availability
NoSQL databases such as MongoDB are highly available because data is replicated across several
nodes; if one node fails, another replica takes over and the data is brought back to a
consistent state.
Types of NoSQL
Key-Value
Data is stored as key-value pairs. This design handles a large data load. In some ways it is
similar to a document database, with values such as JSON, BLOBs, and strings. Examples include
Redis, Dynamo, Riak, Couchbase, and Aerospike.
Column-Based
Column-based databases store data in columns instead of rows.
Examples include Bigtable, Cassandra, HBase, and Hypertable.
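The difference between a row layout and a column layout can be sketched with plain Python structures. The sample records are made up for illustration, and this shows only the layout idea; real systems such as Cassandra or HBase group columns into column families and are considerably more involved.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "name": "Ada", "age": 36},
    {"id": 2, "name": "Grace", "age": 45},
    {"id": 3, "name": "Alan", "age": 41},
]

# Column-oriented layout: all values of one field are kept together.
columns = {field: [row[field] for row in rows] for field in rows[0]}

# An aggregate over one field only has to touch that one column.
average_age = sum(columns["age"]) / len(columns["age"])
print(columns["age"], average_age)
```

The design point is that a query reading one field across many records scans a single contiguous list instead of every full record.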
Document-Oriented
Document-oriented databases store and retrieve data as key-value pairs, but the value part is
stored as a document. Examples include MongoDB (the most popular), CouchDB, Amazon SimpleDB,
and Lotus Notes.
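To make the contrast with key-value stores concrete, here is a small sketch in which the value is a structured document whose fields can be queried. The stores are ordinary dictionaries and the find helper is hypothetical; it is not MongoDB's API, only an illustration of the idea.

```python
import json

# Key-value style: the store treats the value as an opaque blob.
kv_store = {"user:1": b"opaque-binary-blob"}

# Document style: the value is a structured document with named fields.
doc_store = {
    "user:1": {"name": "Ada", "city": "London", "tags": ["admin"]},
    "user:2": {"name": "Grace", "city": "New York", "tags": []},
}

def find(store, **criteria):
    """Return every document whose fields match all the given criteria."""
    return [
        doc for doc in store.values()
        if all(doc.get(field) == value for field, value in criteria.items())
    ]

print(find(doc_store, city="London"))
print(json.dumps(doc_store["user:1"]))
```

In the key-value case the only possible lookup is by key, whereas the document store can answer a question about the contents of the value.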
Graph-Based
In a graph-based database, entities are stored as nodes, and the edges represent the
relationships between nodes. Examples include Neo4j and OrientDB.
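The node-and-edge model can be sketched in a few lines. The people, company, and relationship names below are invented for illustration, and the neighbors helper is hypothetical rather than part of any graph database's query language.

```python
# Nodes are entities; each one carries a few properties.
nodes = {
    "alice": {"label": "Person"},
    "bob": {"label": "Person"},
    "acme": {"label": "Company"},
}

# Edges are (source, relationship, destination) triples.
edges = [
    ("alice", "FRIEND_OF", "bob"),
    ("alice", "WORKS_AT", "acme"),
    ("bob", "WORKS_AT", "acme"),
]

def neighbors(node, relation):
    """Follow outgoing edges of one relationship type from a node."""
    return [dst for src, rel, dst in edges if src == node and rel == relation]

print(neighbors("alice", "WORKS_AT"))
```

Questions such as "who works at the same company as Alice" become short traversals over edges instead of multi-table joins.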
Discussion Video
The video, How Facebook Tracks Your Data, is short, but it lays out the complex data tracking
that Facebook performs. Tracking users' data is how Facebook makes a profit and the reason the
app and website are free. With your consent, it can track your location, as well as your
interests, job, political affiliation, and lifestyle. According to the video, this data is then
sold to advertisers and/or third parties.
Location Data – Self-explanatory: based on your permissions, your location gets tracked and
stored in a database.
Based on the data profile that Facebook stores on you, you get personalized ads; that's how
Facebook makes money. Through the ad services Facebook provides, advertisers know how many
clicks an ad received and how long someone viewed it.
I would like to end here. This video has left me contemplating this digital age of data
collection and the privacy invasions that come with it. It's good to analyze the pros and cons of
how social media companies conduct business. Personally, I feel indifferent about how Facebook
tracks users. I'm not a regular user of social media, and when I do use it the ads don't interest
me, so I don't click on them. But this is what keeps social media free and makes a profit for
stockholders.
1
What are the key challenges that traditional databases face in handling the increasing volume of
data in today's digital landscape, as discussed in the first chapter?
There are many issues with relational databases. Firstly:
Unstructured data cannot easily be stored in relational databases.
2
According to this field, what are some of the factors that have driven the evolution of databases
from traditional relational models to newer approaches? How are these factors reshaping the way
organizations manage and process data?
The huge rise of the IoT and the wireless revolution (smartphones, PDAs).
The high cost of maintaining programs, software, or websites that have huge traffic, such as
Facebook and Google.
3
In the first chapter of our textbook (also the slides), the author introduces the concept of "Big Data"
and its impact on database management. Can you explain the main characteristics of Big Data
and how they challenge traditional database systems?
4
As outlined in the book's first chapter, what are the distinguishing features of NoSQL databases?
How do they differ from traditional relational databases, and what types of use cases are better
suited for each type?
NoSQL databases are designed to handle large volumes and varied types of data. They are
designed to be scalable and more reliable compared to SQL databases, though many of them don't
provide full ACID guarantees. Most of the time they provide BASE (Basically Available, Soft
state, Eventual consistency) instead.
5
In the context of database evolution, what role do distributed systems play? How does
the first chapter of the book discuss the shift from centralized to distributed
architectures?
Distributed
A set of databases stored on multiple computers.
It often replicates data across multiple nodes to ensure data availability and reduce the
risk of data loss.
Centralized
Stored and managed at a single location.
Boost efficiency and productivity
Much cheaper than distributed databases.
If there is system failure, then users won’t have access to the database.
C - Consistency
A - Availability
P – Partition Tolerance
When a network partition occurs, programmers must choose whether consistency or availability
is more important. This choice will have an impact on UX.
Programmers need to carefully consider the application's requirements and user expectations to
decide which trade-offs are acceptable.
The CAP theorem also influences the choice of which database system to use.
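The consistency-versus-availability choice can be sketched with a toy two-replica cluster. Everything here is an assumption made for illustration: the Cluster class, the "CP" and "AP" mode labels, and the partitioned flag are hypothetical and do not model any particular database.

```python
class Cluster:
    """Two replicas, a and b. Writes arrive at a and are copied to b."""

    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP" (hypothetical labels)
        self.a = {}
        self.b = {}
        self.partitioned = False  # True when a and b cannot communicate

    def write(self, key, value):
        if not self.partitioned:
            # Healthy network: both replicas receive the write.
            self.a[key] = value
            self.b[key] = value
            return
        if self.mode == "CP":
            # Consistency first: refuse the write rather than diverge.
            raise RuntimeError("write rejected: replica b is unreachable")
        # Availability first: accept the write locally; b is now stale.
        self.a[key] = value


def demo(mode):
    """Write once, cut the network, write again, then compare replicas."""
    cluster = Cluster(mode)
    cluster.write("balance", 100)
    cluster.partitioned = True
    try:
        cluster.write("balance", 50)
        outcome = "accepted"
    except RuntimeError:
        outcome = "rejected"
    return outcome, cluster.a["balance"], cluster.b["balance"]

print(demo("CP"))
print(demo("AP"))
```

In the CP run the second write fails, so the user sees an error but both replicas still agree. In the AP run the write succeeds, but a reader on replica b sees the old value until the partition heals, which is the UX trade-off described above.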
7
What distinguishes NewSQL databases from traditional relational databases and NoSQL
databases, and what are some examples of NewSQL technologies?
Performance.
Modern Architecture.
ACID Compliance.
Examples of NewSQL
VoltDB
8
The lecture slides discuss the challenges of maintaining data consistency in distributed databases.
In the context of distributed systems, what are the trade-offs between strong consistency and high
availability, and how do different database technologies approach this trade-off?
Strong consistency means that all nodes in a system always have the same data. In a strongly
consistent system, all nodes agree on the order in which operations occurred. Reads always
return the most recent version of the data, and writes are visible to all nodes immediately after
they occur.
Weak consistency means that there is no guarantee that all nodes will have the same data at
any time. There are many different implementations of weak consistency.
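One common form of weak consistency is eventual consistency, where replicas catch up after a delay. The following is a minimal sketch under that assumption; the EventuallyConsistentStore class and its sync method are hypothetical names, and the queue stands in for network replication lag.

```python
from collections import deque

class EventuallyConsistentStore:
    """Writes land on replica 0 at once and reach the others only on sync()."""

    def __init__(self, replicas=3):
        self.replicas = [dict() for _ in range(replicas)]
        self.pending = deque()  # updates not yet applied to other replicas

    def write(self, key, value):
        self.replicas[0][key] = value
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, value))

    def read(self, replica_index, key):
        return self.replicas[replica_index].get(key)

    def sync(self):
        """Apply all queued updates, bringing every replica up to date."""
        while self.pending:
            index, key, value = self.pending.popleft()
            self.replicas[index][key] = value

store = EventuallyConsistentStore()
store.write("status", "shipped")
before = store.read(1, "status")  # replica 1 has not seen the write yet
store.sync()
after = store.read(1, "status")   # replica 1 has now caught up
print(before, after)
```

A strongly consistent design would instead apply the write to every replica before acknowledging it, so a read from any replica would never return the older value.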
9
What do you expect to learn in this class, and how do you plan to use the knowledge in your
career in the future?
I expect to learn which type of database is right for specific use cases.
I also expect exposure to large data sets and to the security of NoSQL databases.
This will be useful to my career, as I plan to become a data engineer.