ABP W5-W6 Big Data Analytics Lab-CASSANDRA

The document provides a comprehensive guide on using Cassandra for big data analytics, focusing on keyspace operations, CRUD operations, and collection types like Set, List, and Map. It includes detailed syntax and commands for creating keyspaces, tables, and performing various operations such as inserting, updating, and deleting data. Additionally, it covers altering table structures and demonstrates practical examples for each operation.

BIG DATA ANALYTICS LAB

(A7902) (VCE-R21)

Week-5 Cassandra

a) Implement keyspace operations to group column families together for the given
application data.
b) Implement CRUD operations on the given dataset using Cassandra.
5.a) Implement keyspace operations
A keyspace is a container that holds application data. It is comparable to a database in a
relational system and is used to group column families together. Typically, a cluster has one
keyspace per application. Replication is controlled on a per-keyspace basis.
When creating a keyspace, a replication strategy class must be specified. There are two
choices: “SimpleStrategy” and “NetworkTopologyStrategy”. Use “SimpleStrategy” when
evaluating Cassandra, and “NetworkTopologyStrategy” for production deployments.
5.1. Create keyspace: To create a new keyspace.

Syntax: CREATE KEYSPACE <keyspace_name> WITH <properties>

i. Create a keyspace by the name “Students”


Command:
cqlsh> CREATE KEYSPACE Students WITH REPLICATION = {
'class' : 'SimpleStrategy',
'replication_factor' : 1 };
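
For comparison, a production-style keyspace could be declared with “NetworkTopologyStrategy”, which takes a replication factor per data center. The following is only a sketch: the keyspace name students_prod and the data center name dc1 are assumptions for illustration, and the data center name has to match what the cluster itself reports (for example in the output of nodetool status).

```cql
-- Hypothetical production keyspace: 3 replicas in the data center named 'dc1'.
-- 'dc1' is an assumed name; substitute the data center names of your own cluster.
CREATE KEYSPACE students_prod WITH REPLICATION = {
  'class' : 'NetworkTopologyStrategy',
  'dc1' : 3 };
```

If this keyspace is created, it is also listed by DESCRIBE keyspaces.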

ii. To describe all the existing keyspaces.


Command: cqlsh> DESCRIBE keyspaces;

Output: system students system_traces

5.2. Use: connects the client session to the specified keyspace.

Syntax: USE keyspace_name;

Command: cqlsh> USE Students;

Output: cqlsh:students>

Note: Cassandra stored the Students keyspace name in lowercase because the name was not
enclosed in double quotation marks.
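
The effect of quoting can be sketched as follows. The keyspace name "StudentsDemo" is hypothetical and is used only to show that a quoted identifier keeps its case.

```cql
-- Unquoted identifier: folded to lowercase, so this selects the keyspace students.
USE Students;

-- Quoted identifier: case is preserved. "StudentsDemo" (a hypothetical name)
-- must be written with double quotes every time it is referenced.
CREATE KEYSPACE "StudentsDemo" WITH REPLICATION = {
  'class' : 'SimpleStrategy',
  'replication_factor' : 1 };
USE "StudentsDemo";

-- Return to the keyspace used in the rest of the exercises.
USE students;
```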

A. Bhanu Prasad, Associate Professor of CSE, VCE



5.3. To create a column family or table by the name “student_info”.


cqlsh:students> CREATE TABLE Student_Info (
RollNo int PRIMARY KEY,
StudName text,
DateofJoining timestamp,
LastExamPercent double );

The table “student_info” gets created in the keyspace “students”.


Note: Tables can have either a single or compound primary key.
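
As an illustration of a compound primary key, the sketch below declares the key in a separate PRIMARY KEY clause. The table student_marks and its columns are hypothetical and are not part of the given exercises.

```cql
-- Hypothetical table: one row per student per subject.
-- rollno is the partition key; subject is a clustering column.
CREATE TABLE student_marks (
  rollno int,
  subject text,
  marks double,
  PRIMARY KEY (rollno, subject) );
```

If this table is created, DESCRIBE TABLES lists it alongside student_info.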
5.4. To look up the names of all tables in the current keyspace, or in all the keyspaces
if there is no current keyspace.
cqlsh:students> DESCRIBE TABLES;
Output: student_info

5.5. To describe the table “student_info” use the below command.


cqlsh:students> DESCRIBE TABLE student_info;
Output:

Note: The output is a list of CQL statements with which the table “student_info”
can be recreated.


5.b) CRUD (CREATE, READ, UPDATE, AND DELETE) OPERATIONS


1. CREATE: To create a column family or table in a keyspace.
Syntax: CREATE TABLE table_name (
column1_name data_type PRIMARY KEY,
column2_name data_type,
column3_name data_type ...
);

To create a column family or table by the name “student_info”.


cqlsh:students> CREATE TABLE Student_Info (
RollNo int PRIMARY KEY,
StudName text,
DateofJoining timestamp,
LastExamPercent double );

The table “student_info” gets created in the keyspace “students”.


Note: Tables can have either a single or compound primary key.

2. INSERT: To insert data into the column family “student_info”. An insert writes one or
more columns to a record in a Cassandra table atomically. An insert statement does not
return an output. The batch below inserts four records.
cqlsh:students> BEGIN BATCH
INSERT INTO student_info (RollNo,StudName,DateofJoining,LastExamPercent)
VALUES (1,'Michael Storm','2012-03-29', 69.6);
INSERT INTO student_info (RollNo,StudName,DateofJoining,LastExamPercent)
VALUES (2,'Stephen Fox','2013-02-27', 72.5);
INSERT INTO student_info (RollNo,StudName,DateofJoining,LastExamPercent)
VALUES (3,'David Flemming','2014-04-12', 81.7);
INSERT INTO student_info (RollNo,StudName,DateofJoining,LastExamPercent)
VALUES (4,'Ian String','2012-05-11', 73.4);
APPLY BATCH;
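
A single record can also be inserted without a batch. The sketch below additionally shows that an INSERT on an existing primary key overwrites the listed columns rather than failing. RollNo 5 and its values are hypothetical sample data, and the row is deleted at the end so that later outputs still show only the four batch records.

```cql
-- Single-row insert; RollNo 5 and its values are hypothetical sample data.
INSERT INTO student_info (RollNo, StudName, DateofJoining, LastExamPercent)
VALUES (5, 'Sample Student', '2014-06-01', 65.0);

-- Inserting again with the same primary key does not fail or create a duplicate:
-- it overwrites the listed columns of the existing row (an "upsert").
INSERT INTO student_info (RollNo, StudName)
VALUES (5, 'Sample Student Renamed');

-- Remove the sample row so that later outputs match the four batch records.
DELETE FROM student_info WHERE RollNo = 5;
```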

3. READ: To retrieve or fetch the data from a table.


Syntax: SELECT <column1_name>,<column2_name> ..
FROM table_name
WHERE <condition>;


3.1. To view the data from the table “student_info”.


cqlsh:students> SELECT * FROM student_info;

The above select statement retrieves data from the “student_info” table.

3.2. To display only those student records where the RollNo column has the value
1, 2 or 3.
cqlsh:students> SELECT * FROM student_info WHERE RollNo IN (1,2,3);

3.3. To create an index on the “studname” column and execute a query that uses it.
cqlsh:students> CREATE INDEX ON student_info(studname);
cqlsh:students> SELECT * FROM student_info WHERE studname = 'Stephen Fox';

4. UPDATE: An update changes one or more column values for a given row of a Cassandra
table. It does not return anything.
4.1. To update the value held in the “StudName” column of the “student_info” column family
to “David Sheen” for the record where the RollNo column has the value 2.
cqlsh:students> UPDATE student_info SET StudName = 'David Sheen'
WHERE RollNo = 2;
cqlsh:students> SELECT * FROM student_info WHERE RollNo = 2;


4.2. Let us try updating the value of a primary key column.


cqlsh:students> UPDATE student_info SET rollno=6 WHERE rollno=3;

Note: The statement fails, because Cassandra does not allow a primary key column to be updated.
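
Because the primary key identifies the row, a common workaround is to write the data under the new key and then delete the old row. The following is a minimal sketch that reuses the values inserted earlier for RollNo 3; it is optional, and it changes the data set, so skip it if the table should stay exactly as inserted.

```cql
-- Write the values of the RollNo 3 record under the new key 6 ...
INSERT INTO student_info (RollNo, StudName, DateofJoining, LastExamPercent)
VALUES (6, 'David Flemming', '2014-04-12', 81.7);

-- ... then remove the row stored under the old key.
DELETE FROM student_info WHERE RollNo = 3;
```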

4.3. Updating more than one column of a row of a Cassandra table.
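
No command is listed for this step. One possible form, assuming the student_info table created earlier, separates the column assignments with commas in the SET clause. The new values are hypothetical.

```cql
-- Update two columns of the row with RollNo = 1 in one statement.
-- The new values are illustrative and are not part of the given data.
UPDATE student_info SET StudName = 'Michael Strom', LastExamPercent = 70.5
WHERE RollNo = 1;
SELECT * FROM student_info WHERE RollNo = 1;
```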

5. DELETE: Delete statement removes one or more columns from one or more rows of a
Cassandra table or removes entire rows if no columns are specified.

5.1. To delete the column “LastExamPercent” from the “student_info” table for the record
where the RollNo = 2.
cqlsh:students> DELETE LastExamPercent FROM student_info WHERE RollNo= 2;
cqlsh:students> SELECT * FROM student_info WHERE RollNo = 2;

5.2. To delete a row (where RollNo = 2) from the table “student_info”.


cqlsh:students> DELETE FROM student_info WHERE RollNo= 2;
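
Deleting from more than one row, as mentioned in the description of DELETE, can be sketched with an IN list on the primary key. The roll numbers chosen here are only an example.

```cql
-- Delete the rows with RollNo 1 and RollNo 4 in a single statement.
DELETE FROM student_info WHERE RollNo IN (1, 4);
SELECT * FROM student_info;
```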


Week-6 Cassandra

a) Design a table/column family and work with the collection types Set, List and Map
using Cassandra.
b) Design a table/column family and perform Alter table commands using Cassandra.
COLLECTIONS: Cassandra provides collection types as a way to group and store data
together in a column, i.e., to store multiple values in one column, such as several mobile
numbers for a user. They are used to store or denormalize a small amount of data.
6.a) CQL makes use of the following collection types: SET, LIST, MAP
6.1. SET: A column of type set holds unique values with no insertion order. When the
column is queried, it returns the values in sorted order.
Ex.: text values are returned in alphabetical order.

Problem statement: Create a table “users” with columns “user_id” (as primary key),
“first_name”, “last_name” and “emails” column as set-collection. Insert the given values to
the table: 'AB', 'Albert', 'Baggins', {'[email protected]', '[email protected]'}. Add an element
to the emails set {'[email protected]'}. Retrieve all email addresses. Now, remove an
element from the set and view the record. Then remove all elements from the set and
display the table.

1.1. To create a table “users” with an “emails” column. The type of this column “emails” is
“set” collection.
cqlsh:students> CREATE TABLE users (
user_id text PRIMARY KEY,
first_name text,
last_name text,
emails set<text> );

1.2. To insert values into the “emails” column of the “users” table.
Note: Set values must be unique.
cqlsh:students> INSERT INTO users (user_id, first_name, last_name, emails)
VALUES ('AB', 'Albert', 'Baggins', {'[email protected]', '[email protected]'});

1.3 Add an element to a set using the UPDATE command and the addition (+) operator.
cqlsh:students> UPDATE users SET emails = emails + {'[email protected]'}
WHERE user_id = 'AB';


1.4. To retrieve email addresses for Albert from the set.


cqlsh:students> SELECT user_id, emails FROM users WHERE user_id = 'AB';

1.5. To remove an element from a set using the subtraction (-) operator.
cqlsh:students> UPDATE users SET emails = emails - {'[email protected]'}
WHERE user_id = 'AB';
cqlsh:students> SELECT * FROM users;

1.6. To remove all elements from a set by using the UPDATE or DELETE statement.
cqlsh:students> UPDATE users SET emails = {} WHERE user_id = 'AB';
(OR)
cqlsh:students> DELETE emails FROM users WHERE user_id = 'AB';
cqlsh:students> SELECT * FROM users;


6.2. LIST Collection:

• A list is used when the order of elements matters.
• A list allows the same value to be stored multiple times.
E.g., when we want to store a user's preferences of places to visit, we would like to
keep the preference order and retrieve the values in that same order rather than in
sorted order.
6.2.1. To alter the “users” table to add a column, “top_places” of type list Collection.
cqlsh:students> ALTER TABLE users ADD top_places list<text>;

6.2.2. Update the list column “top_places” with values ['Lonavla', 'Khandala'] in the “users”
table for user_id = ‘AB’.
cqlsh:students> UPDATE users SET top_places = ['Lonavla', 'Khandala']
WHERE user_id = 'AB';
cqlsh:students> SELECT * FROM users WHERE user_id = 'AB';

6.2.3. Prepend an element 'Mahabaleshwar' to the list by enclosing it in square brackets
and using the addition (+) operator.
cqlsh:students> UPDATE users SET top_places = ['Mahabaleshwar'] + top_places
WHERE user_id = 'AB';
cqlsh:students> SELECT * FROM users;

6.2.4. To append an element to the list, switch the order of the new element data and
the list name in the update command.
cqlsh:students> UPDATE users SET top_places = top_places + ['Tapola']
WHERE user_id = 'AB';
cqlsh:students> SELECT * FROM users;


6.2.5. To query the database for a list of top places.


cqlsh:students> SELECT user_id, top_places FROM users WHERE user_id = 'AB';

6.2.6. To remove an element from a list using the DELETE command and the list index
position in square brackets. List indexes start at 0, so index 3 refers to the fourth element.
cqlsh:students> DELETE top_places[3] FROM users WHERE user_id = 'AB';
cqlsh:students> SELECT * FROM users;
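
Two further list operations can be sketched in the same way: replacing an element at a given index, and removing elements by value. This assumes the list still holds at least two elements after the steps above, since setting an index that does not exist is rejected. The value 'Panchgani' is hypothetical.

```cql
-- Replace the element at index 1 (indexes start at 0).
-- 'Panchgani' is a hypothetical value used for illustration.
UPDATE users SET top_places[1] = 'Panchgani' WHERE user_id = 'AB';

-- Remove every occurrence of a value without needing its index.
UPDATE users SET top_places = top_places - ['Khandala'] WHERE user_id = 'AB';
SELECT user_id, top_places FROM users WHERE user_id = 'AB';
```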

6.3. MAP Collection: As the name implies, a map is used to map one thing to another. A
map is a collection of key-value pairs in which the keys and the values each have a
declared type. It is often used to store timestamp-related information. Each
element of the map is stored as a Cassandra column. Each element can be individually
queried, modified, and deleted.

Objective: Using Map: Key, Value Pair


6.3.1. To alter the “users” table to add a map column “todo”.
cqlsh:students> ALTER TABLE users ADD todo map<timestamp, text>;
cqlsh:students> DESCRIBE TABLE users;


6.3.2. To update “todo” values for the record for user (user_id = 'AB') in the “users” table.
cqlsh:students> UPDATE users SET todo = {'2024-09-24' : 'Cassandra Session',
'2024-10-02' : 'MongoDB Session'}
WHERE user_id = 'AB';
cqlsh:students> SELECT user_id, todo FROM users WHERE user_id = 'AB';

6.3.3. To delete an element from the map using the DELETE command and enclosing the
timestamp of the element in square brackets.
cqlsh:students> DELETE todo['2024-09-24'] FROM users WHERE user_id = 'AB';
cqlsh:students> SELECT user_id, todo FROM users WHERE user_id = 'AB';
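
Individual map entries can also be added or overwritten without rewriting the whole map. A minimal sketch, assuming the “todo” column defined above; the dates and session names are hypothetical.

```cql
-- Add or overwrite a single entry by its key; the entry is hypothetical.
UPDATE users SET todo['2024-10-10'] = 'Hive Session' WHERE user_id = 'AB';

-- Merge further entries into the existing map with the + operator.
UPDATE users SET todo = todo + { '2024-10-17' : 'Pig Session' }
WHERE user_id = 'AB';
SELECT user_id, todo FROM users WHERE user_id = 'AB';
```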


6.b) ALTER commands: to bring about changes to the structure of the table/column
family.
1. Create a table “sample” with columns “sample_id” and “sample_name”.
cqlsh:students> CREATE TABLE sample(
sample_id text, sample_name text,
PRIMARY KEY (sample_id) );

2. Insert a record into the table “sample”.


cqlsh:students> INSERT INTO sample( sample_id, sample_name)
VALUES ('S101', 'Big Data');

3. View the records of the table “sample”.


cqlsh:students> SELECT * FROM sample;

6b.1. Alter Table to Change the Data Type of a Column


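1. Alter the schema of the table “sample”. Change the data type of the column “sample_id” to
integer from text.
cqlsh:students> ALTER TABLE sample ALTER sample_id TYPE int;
Note: Recent Cassandra releases no longer support changing the data type of an existing
column, so this statement may be rejected there.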

2. After the data type of the column “sample_id” is changed from text to integer, try
inserting a record as follows and observe the error message:
cqlsh:students> INSERT INTO sample(sample_id, sample_name)
VALUES( 'S102', 'Big Data');

3. Try inserting a record as given below into the table “sample”.


cqlsh:students> INSERT INTO sample(sample_id, sample_name) VALUES( 102, 'Big Data');

4. Alter the data type of the “sample_id” column to varchar from integer.
cqlsh:students> ALTER TABLE sample ALTER sample_id TYPE varchar;

5. Check the records after the data type of “sample_id” has been changed to varchar from
integer.
cqlsh:students> SELECT * FROM sample;


6b.2. Alter Table to Delete a Column


1. Drop the column “sample_id” from the table “sample”.
cqlsh:students> ALTER TABLE sample DROP sample_id;

Note: The request to drop the “sample_id” column from the table “sample” does not
succeed as it is the primary key column.

2. Drop the column “sample_name” from the table “sample”.


cqlsh:students> ALTER TABLE sample DROP sample_name;
Note: the above request to drop the column “sample_name” from table “sample” succeeds.
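
The reverse operation, adding a column to an existing table, uses ALTER TABLE with ADD. A minimal sketch; the column name sample_desc is hypothetical.

```cql
-- Add a new column; sample_desc is a hypothetical column name.
ALTER TABLE sample ADD sample_desc text;

-- Existing rows show null for the new column until a value is written.
SELECT * FROM sample;
```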

3. Drop a Table: Drop the column family/table “sample”.

cqlsh:students> DROP columnfamily sample;
cqlsh:students> DESCRIBE TABLE sample;

The DROP request succeeds. The table/column family no longer exists in the keyspace, so
the DESCRIBE command reports that it is not found.

4. Drop a Database: Drop the keyspace “students”.

cqlsh:students> DROP keyspace students;
cqlsh:students> DESCRIBE keyspace students;
