
Book Database

The document describes a dataset from Kaggle containing information on Amazon's Top 50 bestselling books from 2009 to 2019, with details on 550 books including title, author, user rating, reviews, price, year, and genre. It outlines the structure of the dataset and introduces a Book class to manage the book data, along with Java files for reading the dataset and performing various tasks such as counting books by an author and listing books by rating. The document also specifies tasks that can be performed on the dataset, such as retrieving books by a specific author or rating.

Uploaded by

Ujjwal Jain

For this project, we have taken a dataset from Kaggle. This dataset is on Amazon’s Top 50 bestselling books from 2009 to 2019. It keeps the record of 550 books in a .csv file.

Amazon’s top 50 bestselling books

This is a Kaggle dataset in .csv format. It includes information on the name, author, user rating, reviews, price, year, and genre of 550 different books. The data is arranged in the seven columns below.

Dataset entries

Name | Author | User Rating | Reviews | Price | Year | Genre
10-Day Green Smoothie Cleanse | JJ Smith | 4.7 | 17350 | 8 | 2016 | Non fiction
12 Rules for Life: An Antidote to Chaos | Jordan B. Peterson | 4.7 | 18979 | 15 | 2018 | Non fiction
1984 (Signet Classics) | George Orwell | 4.7 | 21424 | 6 | 2017 | Fiction
5,000 Awesome Facts (About Everything!) (National Geographic Kids) | National Geographic Kids | 4.8 | 7665 | 12 | 2019 | Non fiction
A Dance with Dragons (A Song of Ice and Fire) | George R. R. Martin | 4.4 | 12643 | 11 | 2011 | Fiction
... | ... | ... | ... | ... | ... | ...

Each row of the table above represents a book, with attributes detailing its characteristics and performance on Amazon. Let’s discuss these columns as follows:

● Name: This column contains the title of the book.
● Author: This column lists the author’s name.
● User Rating: It shows the average Amazon user rating, which
ranges from 3.3 to 4.9.
● Reviews: It indicates the number of reviews written by users
on Amazon, with a minimum of 37 and a maximum of
87,800 reviews.
● Price: It provides the cost of the book, spanning from $0 to
$105.
● Year: It specifies the year or years the book appeared on the
bestseller list, covering the period from 2009 to 2019.
● Genre: Lastly, it classifies the book as either fiction or
nonfiction.

Reading the Dataset

To begin working with the dataset, we need to read the data from
a CSV file named data.csv. This file contains information about
various books, structured in a tabular format. Each row
represents a book and includes details such as the title, author,
user rating, number of reviews, price, bestseller-list year, and
genre.

Define Book class

In this section, we will define a Book class that models the attributes of a book based on the dataset provided. The Book class will contain all the necessary details about each book, such as its title, author, user rating, number of reviews, price, bestseller-list year, and genre.

This class is designed to provide a structured way to manage and manipulate book data within our application.

● Attributes:
○ title: The title of the book.
○ author: The author of the book.
○ userRating: The average user rating of the book.
○ reviews: The number of user reviews.
○ price: The price of the book.
○ year: The year the book appeared on the bestseller list.
○ genre: The genre of the book (either fiction or
non-fiction).
● Constructor: Initializes a Book object with the provided
values for each attribute.
● Getters and setters: These methods provide access to and
modification of the book's attributes.
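
The description above can be sketched as a small Java class. This is only a minimal illustration: the field types (double for the rating and price, int for reviews and year) are assumptions chosen to fit the sample rows, not something the dataset description fixes.

```java
// Sketch of a Book class: one object per row of the dataset.
// Field types are assumptions based on the sample rows shown earlier.
class Book {
    private String title;
    private String author;
    private double userRating; // average Amazon user rating, e.g. 4.7
    private int reviews;       // number of user reviews
    private double price;      // price in dollars
    private int year;          // year the book appeared on the bestseller list
    private String genre;      // fiction or non-fiction

    // Constructor: initializes every attribute from the provided values.
    Book(String title, String author, double userRating, int reviews,
         double price, int year, String genre) {
        this.title = title;
        this.author = author;
        this.userRating = userRating;
        this.reviews = reviews;
        this.price = price;
        this.year = year;
        this.genre = genre;
    }

    // Getters give read access to each attribute.
    String getTitle() { return title; }
    String getAuthor() { return author; }
    double getUserRating() { return userRating; }
    int getReviews() { return reviews; }
    double getPrice() { return price; }
    int getYear() { return year; }
    String getGenre() { return genre; }

    // Setters allow each attribute to be modified after construction.
    void setTitle(String title) { this.title = title; }
    void setAuthor(String author) { this.author = author; }
    void setUserRating(double userRating) { this.userRating = userRating; }
    void setReviews(int reviews) { this.reviews = reviews; }
    void setPrice(double price) { this.price = price; }
    void setYear(int year) { this.year = year; }
    void setGenre(String genre) { this.genre = genre; }
}
```

For example, new Book("1984 (Signet Classics)", "George Orwell", 4.7, 21424, 6, 2017, "Fiction") would build the object for the third sample row.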

The program uses three Java files to read the dataset. Let's
explore the objective of each file as follows:
● The Book.java file defines the Book class. This class
represents a Book object with attributes for the title, author,
user rating, reviews, price, year, and genre. It includes
getters for each attribute and a printDetails method to print
the details of the book in a formatted manner.
● The DatasetReader.java file is responsible for reading a CSV
file and creating a list of Book objects. It handles the parsing
of each line in the CSV, ensuring that each book has the
required data fields, and skips malformed lines.
● The driver.java file contains the main method, which serves
as the entry point of the program. It uses DatasetReader to
read the dataset from the CSV file, and then iterates over
the list of Book objects to print their details using the
printDetails method of the Book class.
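
The reading step described above might look roughly like the sketch below. It is not the original project code: the method names splitCsvLine, parseLine, and readBooks, the class name Driver, and the output format of printDetails are all hypothetical. It also assumes that titles containing commas are wrapped in double quotes, as standard CSV does. The Book class here is a shortened stand-in so the example compiles on its own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Shortened stand-in for Book.java (only one getter shown).
class Book {
    private final String title, author, genre;
    private final double userRating, price;
    private final int reviews, year;

    Book(String title, String author, double userRating, int reviews,
         double price, int year, String genre) {
        this.title = title;
        this.author = author;
        this.userRating = userRating;
        this.reviews = reviews;
        this.price = price;
        this.year = year;
        this.genre = genre;
    }

    String getTitle() { return title; }

    // Prints one book on a single formatted line (format is an assumption).
    void printDetails() {
        System.out.printf("%s by %s | rating %.1f | %d reviews | $%.2f | %d | %s%n",
                title, author, userRating, reviews, price, year, genre);
    }
}

class DatasetReader {
    // Splits one CSV line into fields, keeping commas inside double quotes.
    static List<String> splitCsvLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"'); // doubled quote inside a quoted field
                    i++;
                } else {
                    inQuotes = !inQuotes;
                }
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    // Returns a Book for a well-formed line, or null for a malformed one.
    static Book parseLine(String line) {
        List<String> f = splitCsvLine(line);
        if (f.size() != 7) {
            return null; // wrong number of columns
        }
        try {
            return new Book(f.get(0).trim(), f.get(1).trim(),
                    Double.parseDouble(f.get(2).trim()),
                    Integer.parseInt(f.get(3).trim()),
                    Double.parseDouble(f.get(4).trim()),
                    Integer.parseInt(f.get(5).trim()),
                    f.get(6).trim());
        } catch (NumberFormatException e) {
            return null; // non-numeric rating, reviews, price, or year
        }
    }

    // Reads the file and keeps only the lines that parse. A header row is
    // dropped by the same check, since its numeric columns do not parse.
    static List<Book> readBooks(String path) throws IOException {
        List<Book> books = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(Paths.get(path))) {
            String line;
            while ((line = reader.readLine()) != null) {
                Book book = parseLine(line);
                if (book != null) {
                    books.add(book);
                }
            }
        }
        return books;
    }
}

// Entry point: reads data.csv and prints every book.
class Driver {
    public static void main(String[] args) throws IOException {
        for (Book book : DatasetReader.readBooks("data.csv")) {
            book.printDetails();
        }
    }
}
```

Splitting on quotes matters here because a title such as "5,000 Awesome Facts (About Everything!)" contains a comma, so a plain split on "," would produce eight fields and the row would be rejected as malformed.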

Tasks
1. Total number of books by an author
○ It takes the name of an author and the dataset as input
and returns the total number of books written by that
author.
2. All the authors in the dataset
○ It prints the names of all authors in the dataset.
3. Names of all the books by an author
○ It takes the author as an input and returns all the
books written by the author. Just for reference, Author
is the second column, and Name (name of the book) is
the first column in the dataset.
4. Classify with a user rating
○ It takes the rating as an input and returns all books
with a user rating equal to the given rating.
5. Price of all the books by an author
○ It takes the name of the author as an input and returns
the names and prices of all the books written by the
author.
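
The five tasks above can be sketched as static helper methods over a list of Book objects. This is one possible shape, not a reference solution: the class name BookTasks and all method names are hypothetical, the Book class is trimmed to the four fields the tasks use, and authors are matched case-insensitively as a convenience.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Trimmed-down Book holding only the fields these tasks need.
class Book {
    private final String title;
    private final String author;
    private final double userRating;
    private final double price;

    Book(String title, String author, double userRating, double price) {
        this.title = title;
        this.author = author;
        this.userRating = userRating;
        this.price = price;
    }

    String getTitle() { return title; }
    String getAuthor() { return author; }
    double getUserRating() { return userRating; }
    double getPrice() { return price; }
}

class BookTasks {
    // Task 1: number of dataset rows whose author matches.
    static int countBooksByAuthor(List<Book> books, String author) {
        int count = 0;
        for (Book b : books) {
            if (b.getAuthor().equalsIgnoreCase(author)) {
                count++;
            }
        }
        return count;
    }

    // Task 2: every distinct author, in alphabetical order.
    static Set<String> allAuthors(List<Book> books) {
        Set<String> authors = new TreeSet<>();
        for (Book b : books) {
            authors.add(b.getAuthor());
        }
        return authors;
    }

    // Task 3: titles of all books by the given author.
    static List<String> titlesByAuthor(List<Book> books, String author) {
        List<String> titles = new ArrayList<>();
        for (Book b : books) {
            if (b.getAuthor().equalsIgnoreCase(author)) {
                titles.add(b.getTitle());
            }
        }
        return titles;
    }

    // Task 4: all books whose user rating equals the given rating.
    static List<Book> booksWithRating(List<Book> books, double rating) {
        List<Book> matches = new ArrayList<>();
        for (Book b : books) {
            if (Double.compare(b.getUserRating(), rating) == 0) {
                matches.add(b);
            }
        }
        return matches;
    }

    // Task 5: title -> price for all books by the given author.
    static Map<String, Double> pricesByAuthor(List<Book> books, String author) {
        Map<String, Double> prices = new LinkedHashMap<>();
        for (Book b : books) {
            if (b.getAuthor().equalsIgnoreCase(author)) {
                prices.put(b.getTitle(), b.getPrice());
            }
        }
        return prices;
    }
}
```

Task 1 counts rows, so if the same title appears in the dataset for more than one year it is counted once per row; collecting the titles into a Set first would count distinct titles instead. In Task 5, a repeated title keeps the price from its last row.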
