
ECS 116 Databases for Non-Majors / Data Management for Data Science

Programming Assignment 3

Prelude
1. The goal of this programming assignment is to provide some hands-on experience with MongoDB, one of the
early and still widely used NoSQL databases.

2. The assignment is worth 10 points.

3. The assignment is due Sunday, June 1, 2025, at 11:59 pm.

4. This assignment is to be completed in teams of 3 or 4 people.

5. Each team member should have a full installation of the data and working code on their laptop, so that they
can run everything. (Teammates might develop different parts of the code, but in the end each teammate
should be able to run everything on their own machine. Optionally, teammates may also create some or all
parts of the code on their own.) In particular, each teammate will be required to create the json files and the
csv file described below, all based on execution on their own machine.

6. Each team should create a single document that includes the jointly written paragraphs describing the results
obtained (Part 6).

7. Some parts of this project may take a long time to run. (E.g., one part of Step 1.1 can take about an hour or
more.) So be sure to start on the project early, and plan to complete it well before the deadline. Also, if you
have a very slow machine, then contact the professor about having a modified requirement for some parts.

8. As with Programming Assignments 1 and 2, ChatGPT and/or other LLMs can be used, and we ask that you
give a brief statement about how they were used and what each teammate's experience was.

9. Late submissions will be graded according to the late policy. Specifically, 10% of the grade is deducted if you
are up to 24 hours late, 20% is deducted if you are 24 to 48 hours late, and no credit is given if the assignment
is turned in after 48 hours.

10. Plagiarism is strictly prohibited. You’re free to discuss high-level concepts amongst your peers. However,
cheating will result in no points on the assignment and reporting to OSSJA.

Step 1: Creating MongoDB collection holding listings with embedded reviews, using df and dict operations
For this step you are to:

1. Create a notebook that populates the MongoDB on your laptop with the full set of 37,434 listing objects, with
reviews data embedded, as illustrated in Loading Local MongoDB with Listings & Reviews-vXX.ipynb.

2. The collection you create should be named listings_with_reviews.

3. The notebook that you submit for this part should be named 1--Building-listings-with-reviews-using-python.ipynb.

4. Once you have MongoDB populated, you are to add 4 cells to your notebook with pymongo scripts/queries that
do the following (an illustrative sketch of these four queries appears at the end of this Step):

• Query 1: Output is the number of listings whose last_review date is between February 1, 2021, and
March 15, 2023, inclusive.

• Query 2: Output is the number of listings that have an array of reviews with length at least 50. (You may
want to take inspiration from https://stackoverflow.com/questions/41918605/mongodb-find-array-length-greater-than-specified-size.)

• Query 3: Output is the number of listings that have a review containing the word "awesome" (case
sensitive) OR a review containing the word "amazing" (case sensitive).

• Query 4: Same as Query 3, but ignoring case.

5. Please prepare a csv file named Step1_query_counts.csv. It should have the following properties:

(a) The csv file has a header row.

(b) Column 1 is labeled "Query Number"

(c) Column 2 is labeled "Count"

(d) There are 4 rows, corresponding to the 4 queries, in that order.

(e) The entries of the first column are simply 1, 2, 3, and 4.

(f) The entries of the second column should be the counts that you obtained for each of the 4 queries by
running your notebook on your machine.

Here is an example of the format for the csv file that you are to create. The Count values here are completely
made up.

Query Number    Count
1               10
2               150
3               23
4               104

Table 1: Example structure of the table in the csv file to be submitted

(You can create the csv file by hand, i.e., you do not need to have your notebook produce it.)

6. Final Note: In this Step, you brought slightly pre-processed data from PostgreSQL into your PyMongo
environment, and used python and pandas to further format the data for insertion into a MongoDB collection.
That approach was not time efficient – it may have taken around 50 to 70 minutes for some of the processing.
In Steps 2, 3 and 4, you will do things primarily within the MongoDB system, and primarily using the
aggregate operation with pipelines. You will see that if you can do something within MongoDB, then it is
generally much more time efficient.
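
As referenced in item 4 above, here is a minimal sketch of the four counting queries. The database name airbnb
and the review-text field reviews.comments are assumptions, not part of the assignment; adjust them to match
your own setup:

    import csv
    from datetime import datetime
    from pymongo import MongoClient

    coll = MongoClient('localhost', 27017)['airbnb']['listings_with_reviews']

    # Query 1: last_review between Feb 1, 2021 and Mar 15, 2023, inclusive.
    q1 = coll.count_documents({'last_review': {'$gte': datetime(2021, 2, 1),
                                               '$lte': datetime(2023, 3, 15)}})

    # Query 2: the reviews array has at least 50 elements (element 49 exists).
    q2 = coll.count_documents({'reviews.49': {'$exists': True}})

    # Query 3: some review contains 'awesome' OR 'amazing' (case sensitive).
    q3 = coll.count_documents({'$or': [
        {'reviews.comments': {'$regex': 'awesome'}},
        {'reviews.comments': {'$regex': 'amazing'}}]})

    # Query 4: same as Query 3, but case insensitive ('i' option).
    q4 = coll.count_documents({'$or': [
        {'reviews.comments': {'$regex': 'awesome', '$options': 'i'}},
        {'reviews.comments': {'$regex': 'amazing', '$options': 'i'}}]})

    # Write the counts into Step1_query_counts.csv (this can also be done by hand).
    with open('Step1_query_counts.csv', 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['Query Number', 'Count'])
        for i, count in enumerate([q1, q2, q3, q4], start=1):
            writer.writerow([i, count])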

Step 2: Creating MongoDB collection holding listings with embedded
calendar availability, using an aggregation pipeline
For this step, please refer to the notebook "2--Loading-Local-MongoDB-with-calendar-csv-data--vXX.ipynb" in the
PA3 Materials folder in Canvas. That notebook shows how you can load the calendar.csv file into a dataframe, and
from there into MongoDB. In the notebook, the resulting collection is called "calendar". The datatype for fields
holding dates should be datetime, and any boolean fields should have type Boolean.
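
For illustration, here is a minimal sketch of that loading step. The column names (listing_id, date, available,
price, minimum_nights, maximum_nights), the 't'/'f' encoding of available, and the database name are all
assumptions based on the Airbnb-style calendar.csv, not the official notebook code:

    import pandas as pd
    from pymongo import MongoClient

    df = pd.read_csv('calendar.csv', dtype={'listing_id': str})
    df['date'] = pd.to_datetime(df['date'])                          # datetime type
    df['available'] = df['available'].map({'t': True, 'f': False})   # Boolean type
    df['price'] = pd.to_numeric(df['price'].str.replace(r'[$,]', '', regex=True))

    db = MongoClient('localhost', 27017)['airbnb']   # assumed database name
    db.calendar.drop()                               # start fresh if re-running
    db.calendar.insert_many(df.to_dict('records'))   # pandas Timestamps encode as BSON dates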

Your goal is to build a notebook, called "2--Building-listings-with-calendar-using-aggregate.ipynb", that creates
another collection, called "listings_with_calendar", which includes a document for each listing. The document
should have fields for

1. id: holds the listing id value of the listing

2. average_price: holds the average price across all of the records in calendar.csv associated with this listing,
having numeric type (integer or float is OK)

3. first_available_date: holds the minimum date of any calendar record associated with the listing, having
type datetime

4. last_available_date: holds the maximum date of any calendar record associated with the listing, having type
datetime

5. dates_list: holds an array of documents, one for each calendar entry associated with the listing. Each of
these documents should include the following fields:

(a) date, of type datetime

(b) available, of type Boolean

(c) price, of type numeric (not string)

(d) minimum_nights, of type integer

(e) maximum_nights, of type integer

The notebook includes some code illustrating how you can check the data types of documents in a MongoDB
collection. It also includes a function convert_lwc_to_json that illustrates how you can convert MongoDB
documents into python dictionaries that can be written into json files.

For this step of the Programming Assignment, you are to:

1. Create a pipeline specification that will build the listings with calendar collection, and use that pipeline as part
of the notebook to create the collection. Here are some notes:

(a) In one approach to building a pipeline that works, the first step is a $group operator. That can be used
to define how the scalar fields are to be populated. Also, to populate the dates_list field you can use
the $push operator, which forms an array of all elements that are being grouped.

The second (and final) step is to use the $out operator to write the result of the aggregation into the
collection listings_with_calendar. (An illustrative sketch of such a pipeline appears at the end of this Step.)

(b) When you are working to define the pipeline, you may want to work with a small collection that corresponds
to, e.g., the first 5000 documents in calendar.

(c) Your collection listings_with_calendar should hold 37,431 documents. (Why is that three less than the
number of documents in the collection listings_with_reviews that you built for Step 1 above?)

2. Select a subset of the collection which holds documents for all listings whose id has prefix '1001', convert
these documents into something that can be written into a json file, and write them into a file named
listings_with_calendar_subset_1001.json. This file is to be included in your zip submission. Here are
some notes:

(a) The notebook "2--Loading-Local-MongoDB-with-calendar-csv-data--vXX.ipynb" provides an illustration
of how to convert data from MongoDB into dictionaries that can be written out to json files.

(b) You can find a json file similar to the file you are to produce in Canvas in the PA3 Materials folder; it is
named listings_with_calendar_subset_avg_price_18370.json.

Note: One good way to inspect a big json document is to open a new tab in Firefox (not Chrome) and
then drag the file into that browser window.

(c) Your file should hold 28 documents.
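
As mentioned in note 1(a), one workable pipeline pairs a $group with an $out. Here is a minimal sketch, assuming
the calendar field names used above and an assumed database name; note that $group's required _id key serves as
the listing's id here:

    from pymongo import MongoClient

    db = MongoClient('localhost', 27017)['airbnb']   # assumed database name

    pipeline = [
        # Group the calendar records by listing, computing the scalar fields and
        # pushing one sub-document per calendar entry into dates_list.
        {'$group': {
            '_id': '$listing_id',
            'average_price': {'$avg': '$price'},
            'first_available_date': {'$min': '$date'},
            'last_available_date': {'$max': '$date'},
            'dates_list': {'$push': {
                'date': '$date',
                'available': '$available',
                'price': '$price',
                'minimum_nights': '$minimum_nights',
                'maximum_nights': '$maximum_nights'}}}},
        # Write the result of the aggregation into the target collection.
        {'$out': 'listings_with_calendar'},
    ]

    # allowDiskUse lets the $group spill to disk, since the pushed arrays are large.
    db.calendar.aggregate(pipeline, allowDiskUse=True)
    print(db.listings_with_calendar.count_documents({}))   # expect 37,431

The subset extraction for listings_with_calendar_subset_1001.json can then follow the convert_lwc_to_json
pattern from the provided notebook.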

Step 3: Creating MongoDB collection holding listings with embedded reviews, using an aggregation pipeline
In this step you will revisit the goal of Step 1, which was to build a MongoDB collection listings_with_reviews.
But for this step, you will use the aggregate function rather than doing things with pandas and python.

Specifically, you are to build a notebook called "3--Building-listings-with-reviews-using-aggregate.ipynb" that:

1. Imports the listings.csv and reviews.csv files into dataframes. (As in the notebook 2--Building-listings-with-
calendar-using-aggregate.ipynb, as you import listings.csv into a dataframe make sure that the datatypes for
id and host_id are strings, and similarly for reviews.csv and field names id, listing_id, and reviewer_id.)

2. Modifies the dataframe for listings in the following ways:

(a) It has only the 18 columns for the listings table that were used in Step 1. (Use a command like
df_listings.drop(cols_to_drop, axis=1, inplace=True), where cols_to_drop is a list of all columns
to be dropped from the dataframe.)

(b) The types of the price and reviews_per_month columns are converted to numeric using commands like:
df_listings['reviews_per_month'] = pd.to_numeric(df_listings['reviews_per_month']). (For
price you may need to drop some '$' and ',' characters.)

3. Puts these dataframes into MongoDB collections listings and reviews. (Please ensure that date columns
are converted to the datetime data type, and address issues with NaT, as in the notebook 2--Building-listings-
with-calendar-using-aggregate.ipynb.)

4. Using an aggregation pipeline, builds a collection listings_with_reviews_m. This should hold data very
similar to the collection listings_with_reviews that you built for Step 1. (There are some minor differences
because of some operations performed in Step 1 vis-a-vis some operations performed here. Can you find them?)

Some notes:

(a) One way to build the pipeline would be to start with a $lookup, and then use $out to write the output
into the target collection. (An illustrative sketch appears at the end of this Step.)

(b) IMPORTANT NOTE: To make your pipeline run quickly (e.g., in about 7 to 15 seconds), you should
create an index on 'listing_id' in your db.reviews collection. You can use a command such as the following:

db.reviews.create_index(’listing_id’)

(If you don’t have this index, your pipeline would probably run for 2 to 4 hours!)

5. Finally, produce a file "listings_with_reviews_m_subset_1001.json" that holds documents that correspond to
the documents in your listings_with_reviews_m collection whose id value has prefix "1001". Some notes:

(a) In addition to dealing with the datetime values, you will have to modify the ObjectId values (make them
strings) and the NaN values (test for that using the math.isnan function, and map to None).

(b) Your file should hold 28 documents, as in Step 2.
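
As referenced in note (a), here is a minimal sketch of the $lookup-based pipeline plus the json-serialization
clean-up from item 5. It assumes the listings and reviews collections built in items 1-3 above; the helper name
to_json_ready and the database name are hypothetical:

    import json
    import math
    from datetime import datetime
    from bson import ObjectId
    from pymongo import MongoClient

    db = MongoClient('localhost', 27017)['airbnb']   # assumed database name

    # Without this index, the $lookup would scan db.reviews once per listing.
    db.reviews.create_index('listing_id')

    db.listings.aggregate([
        {'$lookup': {'from': 'reviews',
                     'localField': 'id',           # listing id in db.listings
                     'foreignField': 'listing_id',
                     'as': 'reviews'}},
        {'$out': 'listings_with_reviews_m'},
    ])

    def to_json_ready(value):
        # Recursively map BSON and NaN values to json-serializable ones.
        if isinstance(value, ObjectId):
            return str(value)                      # ObjectId -> string
        if isinstance(value, datetime):
            return value.isoformat()               # datetime -> ISO string
        if isinstance(value, float) and math.isnan(value):
            return None                            # NaN -> null
        if isinstance(value, dict):
            return {k: to_json_ready(v) for k, v in value.items()}
        if isinstance(value, list):
            return [to_json_ready(v) for v in value]
        return value

    docs = [to_json_ready(d) for d in
            db.listings_with_reviews_m.find({'id': {'$regex': '^1001'}})]
    with open('listings_with_reviews_m_subset_1001.json', 'w') as f:
        json.dump(docs, f, indent=2)
    print(len(docs))                               # expect 28 documents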

Step 4: Creating MongoDB collection holding listings with embedded data for both reviews and calendar availability
For this step, you are to create a notebook named "4--Building-listings-with-reviews-and-cal.ipynb" that forms a
kind of join of your collections listings_with_reviews_m and listings_with_calendar, and puts it into a collection
called listings_with_reviews_and_cal. In particular, each document in the collection listings_with_reviews_and_cal
should include information about a listing, including

1. all scalar fields about that listing from both listings_with_reviews_m and listings_with_calendar, except
for the id field from listings_with_calendar.

2. a field reviews, holding the array of data about reviews associated with the listing

3. a field dates_list holding the array of data about calendar entries associated with the listing.

Here are some notes about one way to build a pipeline to create the collection listings_with_reviews_and_cal;
a sketch combining these notes appears after the list.

1. Run the aggregate command on the listings_with_reviews_m collection.

Remember that listings_with_reviews_m has 3 listings that listings_with_calendar does not have.

2. Start the pipeline with a $lookup that will form, intuitively speaking, something close to the left join of
listings_with_reviews_m and listings_with_calendar. In my $lookup, I used the name cal_docs for
holding the array of docs from listings_with_calendar.

3. Now use the $unwind operator on the cal_docs field. This has the effect of breaking each cal_docs array
into separate documents. The intermediate result after the $unwind is quite close to being the left join of
listings_with_reviews_m and listings_with_calendar. In order to retain data about the three listings not
in listings_with_calendar, you need to use the following formulation:

{ '$unwind': { 'path': '$cal_docs',
               'preserveNullAndEmptyArrays': True
             }
},

4. Now use an $addFields operator to add in the fields for average_price, first_available_date,
last_available_date, and dates_list. For each of these you will need a formulation something like:

'first_available_date': '$$ROOT.cal_docs.first_available_date',

What is $$ROOT here? This step of the pipeline is basically operating on a stream of documents. For a given
document, $$ROOT refers to the root of that document.

5. Now use an $unset operator to remove the cal_docs field.

6. Finally, use $out to write the output of the pipeline into the collection listings with reviews and cal.
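
Here is a minimal sketch combining notes 1 through 6. It assumes that the _id of each listings_with_calendar
document holds the listing id (as in the $group sketch for Step 2), and an assumed database name; everything
else follows the notes above:

    from pymongo import MongoClient

    db = MongoClient('localhost', 27017)['airbnb']   # assumed database name

    db.listings_with_reviews_m.aggregate([
        # Note 2: something close to a left join against listings_with_calendar.
        {'$lookup': {'from': 'listings_with_calendar',
                     'localField': 'id',
                     'foreignField': '_id',   # assumed to hold the listing id
                     'as': 'cal_docs'}},
        # Note 3: unwind, keeping the 3 listings with no calendar data.
        {'$unwind': {'path': '$cal_docs',
                     'preserveNullAndEmptyArrays': True}},
        # Note 4: copy the calendar fields up to the top level.
        {'$addFields': {
            'average_price': '$$ROOT.cal_docs.average_price',
            'first_available_date': '$$ROOT.cal_docs.first_available_date',
            'last_available_date': '$$ROOT.cal_docs.last_available_date',
            'dates_list': '$$ROOT.cal_docs.dates_list'}},
        # Note 5: remove the now-redundant cal_docs field.
        {'$unset': 'cal_docs'},
        # Note 6: write the output of the pipeline into the target collection.
        {'$out': 'listings_with_reviews_and_cal'},
    ])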

As with Steps 2 and 3, you are to produce a json file called "listings_with_reviews_and_cal_subset_1001.json"
that holds documents that correspond to the documents in your listings_with_reviews_and_cal collection whose
id value has prefix "1001".

Step 5: Comments about what you observed


As a team, write a short paragraph for each of the following questions based on your work on this assignment.

1. For Part 3, if you include the index then the pipeline runs in 1 or 2 minutes, but if you leave the index out
then it would take about 4 hours. For this question, assume that on a particular machine it takes 2 minutes
with index, and 4 hours without index.

Compute the (approximate) time it takes for MongoDB to make one full scan of the db.reviews collection.

Compute the (approximate) time it takes, on average, for MongoDB to perform an index-based retrieval of all
documents in db.reviews having a particular listing_id value.

Hint: The pipeline used for this Step is essentially doing a left-join of db.listings with db.reviews.

2. For Part 2 we did not use an index. Why does your pipeline for Part 2 run in a minute or two, even though
the calendar.csv file has many more entries than the reviews.csv file?

3. For Part 4 we again did not use an index. Why does your pipeline for Part 4 run in a minute or two, even
though both collections being joined have 37K+ documents in them?

4. Would your pipeline for Part 4 run faster if you included an index on id for one or both of the collections?

Step 6: Submission Instructions


There are 2 components to your submission:

1. Each TEAM should create a single pdf report with the filename Assignment-3-TEAM-REPORT.pdf. (We have
realized that when a file is submitted into Canvas, it is automatically pre-pended with the student's last name
and first name.) The report should include the following named sections, in this order:

(a) "Teammates": List your teammate names here.

(b) "Statement about ChatGPT (and/or other LLMs)": Please include here a short statement about which
teammates, if any, used ChatGPT and/or other LLMs to help to generate any of your code. If there
was use of LLMs, then please indicate who used them and for what purposes. Also, please describe the
experience: was it helpful or not, and how/why?

(c) "Statement about distributed work": Please include here a short statement about which team members
did which work to create your codebase.

(d) "Comments on Observed Performance": Include here 4 subsections that include the answers to the 4
questions posed in Step 5 above.

(e) "References": If you used any outside sources, please include them in this section. (If you didn't use any
outside sources, then include the statement "We did not use any outside sources".)

2. Each STUDENT should submit a zip including several files. The name of the file should have the form
PA 3.zip. The zip should include the following things:

(a) The report Assignment-3-TEAM-REPORT.pdf produced by your team.

(b) The following json files:

i. listings_with_reviews_m_subset_1001.json

ii. listings_with_calendar_subset_1001.json

iii. listings_with_reviews_and_cal_subset_1001.json

(c) The csv file that you produced for Part 1, with file name "Step1_query_counts.csv"

(d) All notebooks and helper function files that you used to create the json files and visualizations on your
machine. (This code might have been developed individually by you and/or jointly by your team.) Please
use the suggested file names for the notebooks, and meaningful file names for other files.
