Termproject

The document outlines project ideas for a Database Management System course, with a final report due by April 15th, 2023. It includes various project categories such as logical database design, buffer management, indexing and hashing, query processing, and distributed databases, each with specific objectives and methodologies. Students are encouraged to submit project proposals for approval and will be evaluated based on the volume of work and completeness.

Uploaded by

vinothediting2004


Database Management System Project Ideas

Guidelines:

Project groups will be the same as the mini-project groups.

Final report submission and demonstration will be done by April 15th, 2023. There will be intermediate
evaluations.

Projects will be evaluated based on the volume of work and completeness.

Some broad project ideas are given below. You are free to choose other ideas too. Multiple groups may
take up the same project idea.

You have to submit a concrete project proposal within a week and get it approved by us.

Deliverables: Report on the project (with group member names and roll numbers) and a demonstration.

A. Projects on Logical Database Design for Specialized Databases:

1. Spatial Databases for Disaster Management:

Design and populate a database of road networks, population, water levels, etc., using data provided from multiple
data sources. The goal of the project is to provide an information system for decision support in disaster
management tasks such as evacuation and relief. It is part of a larger project for a National Spatial Data
Infrastructure specific to a coastal disaster management system on the eastern coast of India. The database
should be supported by a web interface. Data may be pulled from the Google Earth Engine.

Similar spatial database project ideas are listed in:

https://sites.google.com/view/summerofearthengine/projects

2. High QPS Text Search Engine

Objective: The project will use the Apache Solr framework, running with the ZooKeeper framework, to develop
scalable text search engines.

Methodology: Develop a keyword-based search engine. Then benchmark it to see how search
performance scales in a very high query per second (QPS) setting. High QPS load can be generated either
with a multithreaded HTTP client using a ThreadPool and ConnectionPool, or with a tool
like Siege, which can simulate multiple concurrent HTTP requests on a server.

Solr is an open source enterprise search platform from the Apache Lucene project. Its major features
include full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and
rich document (e.g., Word, PDF) handling. Providing distributed search and index replication, Solr is
highly scalable.

Siege is an HTTP load testing and benchmarking utility. It was designed to let web developers measure
their code under duress, to see how it will stand up to load on the internet. Siege supports basic
authentication, cookies, and the HTTP and HTTPS protocols. It lets its user hit a web server with a configurable
number of simulated web browsers. Those browsers place the server "under siege."

Outcome: QPS benchmark data obtained using multithreaded HTTP clients with the help of Siege, a connection pool,
and a thread pool.
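
The multithreaded-client approach described above can be sketched roughly as follows. This is a minimal illustration, not a full benchmarking harness: the names fetch and measure_qps are hypothetical, and urllib opens a new connection per request, so a real client would add a connection pool as the methodology suggests.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url):
    """Issue one HTTP GET, read the body, and return the status code."""
    with urlopen(url, timeout=5) as response:
        response.read()
        return response.status


def measure_qps(send_request, total_requests, workers):
    """Call send_request() total_requests times on a thread pool and
    return completed requests per second of wall-clock time."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces every request to finish before the timer stops.
        list(pool.map(lambda _: send_request(), range(total_requests)))
    elapsed = time.perf_counter() - start
    return total_requests / elapsed
```

A call such as measure_qps(lambda: fetch(url), 200, 16) would then report the achieved rate, where url points at the select endpoint of your own Solr core (for example http://localhost:8983/solr/docs/select?q=database, assuming a core named docs on the default port). Repeating the run with increasing worker counts shows where throughput stops scaling.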

3. Large scale graph processing

The goal of the project is to process large graphs in a database.

i. Install a graph processing system, e.g., ApacheGraph, Pregel (GoldenOrb), Giraph, or Stanford GPS.

ii. Load a large graph from the Stanford SNAP large graph repository.

iii. Provide interface to run simple graph queries. Bonus for computing PageRank.

iv. Profile performance
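
For the PageRank bonus, a single-machine reference implementation is useful for checking the output of whichever graph system you install. The sketch below is one such reference, assuming the graph arrives as whitespace-separated edge lines with "#" comment lines (the layout SNAP edge-list files typically use); load_edge_list and pagerank are illustrative names.

```python
def load_edge_list(lines):
    """Build node -> list of out-neighbours from 'src dst' text lines."""
    graph = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        src, dst = line.split()[:2]
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])  # make sure sink nodes are present
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank; ranks always sum to 1."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-edges is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not graph.get(v))
        base = (1 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in nodes}
        for v, targets in graph.items():
            for t in targets:
                new_rank[t] += damping * rank[v] / len(targets)
        rank = new_rank
    return rank
```

This version keeps the whole graph in memory, so it only suits a sample of a large graph; the point of the project is to run the same computation inside the graph processing system and profile it.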

4. XML database

Build an XQuery interface for the Wikipedia/DBLP XML data. The interface should support structured as
well as unstructured queries.

5. Multimedia database

Build a multimedia database consisting of text/structured data/music/images/video. Public data may be
downloaded to populate the database. Basic queries should be supported.

6. Datalog query language

Datalog is a declarative, logic-based query language that, unlike plain relational calculus, supports
recursion. Write Datalog programs for a graph database to answer various graph queries.
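
As an illustration of the kind of graph query meant here, reachability can be written in Datalog with two rules, shown as comments below. The Python function is a small sketch of how an engine might evaluate those rules bottom-up (semi-naive evaluation); the relation names edge and path are assumptions chosen for the example.

```python
# Datalog program being evaluated:
#   path(X, Y) :- edge(X, Y).
#   path(X, Z) :- path(X, Y), edge(Y, Z).


def reachable_pairs(edges):
    """Return every (x, z) such that z is reachable from x via edges."""
    edges = set(edges)
    successors = {}
    for x, y in edges:
        successors.setdefault(x, set()).add(y)
    path = set(edges)   # facts derived so far (first rule)
    delta = set(edges)  # facts that were new in the previous round
    while delta:
        # Apply the recursive rule only to the newly derived facts.
        derived = {(x, z) for (x, y) in delta for z in successors.get(y, ())}
        delta = derived - path
        path |= delta
    return path
```

The loop stops when a round derives nothing new, which is the fixpoint that defines the meaning of a recursive Datalog program.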

B. Projects on Buffer Management

7. Simulating Buffer Manager Strategies

The goal of this project is to simulate a small buffer pool for simple Join/Selection queries on a few small tables
in the C/C++ language. Popular buffer manager strategies like LRU/MRU/CLOCK/Pinned blocks may be
simulated. The strategies may be compared in terms of the number of disk I/Os required.

You may use the SQLite C Library for more realistic simulation. https://sqlite.org/index.html
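
Before writing the C/C++ version, it can help to prototype the replacement policies on a page-reference trace. The sketch below is such a prototype in Python (not the required C/C++ deliverable); simulate and simulate_clock are hypothetical names, and a "disk read" is simply a buffer miss.

```python
from collections import OrderedDict


def simulate(trace, capacity, policy="LRU"):
    """Count disk reads for LRU or MRU replacement on a page trace."""
    pool = OrderedDict()  # page ids, least recently used first
    disk_reads = 0
    for page in trace:
        if page in pool:
            pool.move_to_end(page)  # hit: now the most recently used
            continue
        disk_reads += 1
        if len(pool) >= capacity:
            # LRU evicts the oldest entry, MRU evicts the newest one.
            pool.popitem(last=(policy == "MRU"))
        pool[page] = None
    return disk_reads


def simulate_clock(trace, capacity):
    """Count disk reads for CLOCK (second-chance) replacement."""
    frames = [None] * capacity
    ref_bit = [False] * capacity
    where = {}  # page id -> frame index
    hand = 0
    disk_reads = 0
    for page in trace:
        if page in where:
            ref_bit[where[page]] = True  # hit: give a second chance
            continue
        disk_reads += 1
        # Sweep the hand, clearing reference bits, until a victim is found.
        while ref_bit[hand]:
            ref_bit[hand] = False
            hand = (hand + 1) % capacity
        if frames[hand] is not None:
            del where[frames[hand]]
        frames[hand] = page
        where[page] = hand
        ref_bit[hand] = True
        hand = (hand + 1) % capacity
    return disk_reads
```

On a repeated sequential scan such as [0, 1, 2, 3] * 3 with three frames, LRU misses on every reference while MRU keeps part of the scan resident, which is the classic case where MRU beats LRU. Pinned blocks are not modelled here.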

C. Indexing and Hashing

8. High Dimensional Indexing

Implement the R-Tree/KD-Tree for high dimensional indexing. You may choose a high dimensional dataset
(say, audio or image) and use the tree to index and search it.
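
A minimal KD-tree with nearest-neighbour search might look like the sketch below. It assumes points are equal-length tuples of numbers (for audio or images these would be feature vectors you extract yourself); Node, build, and nearest are illustrative names.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    point: Tuple[float, ...]
    axis: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(points, depth=0):
    """Recursively split on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return Node(points[mid], axis,
                build(points[:mid], depth + 1),
                build(points[mid + 1:], depth + 1))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, query, best=None):
    """Return the stored point closest to query (Euclidean distance)."""
    if node is None:
        return best
    if best is None or sq_dist(node.point, query) < sq_dist(best, query):
        best = node.point
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if diff * diff < sq_dist(best, query):
        best = nearest(far, query, best)
    return best
```

The pruning test becomes less effective as the number of dimensions grows, so comparing the number of nodes visited against a linear scan is a reasonable experiment for the report.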

9. Implementing Extendable Hashing

Implement the extendable hashing algorithm. Implement the data structures for the hash table. Assume
that the data only grows (insertions, no deletions). Benchmark the access time on a suitable dataset. You
may compare it with the SQLite implementation.
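
The core data structures (a directory with a global depth, and buckets with local depths) can be sketched as follows. This is an in-memory illustration under the insert-only assumption; the class names are hypothetical, the low-order bits of Python's built-in hash select the directory slot, and there is no overflow handling for more than bucket_size keys that share a full hash value.

```python
class Bucket:
    def __init__(self, depth):
        self.depth = depth  # local depth: hash bits this bucket distinguishes
        self.items = {}


class ExtendibleHash:
    def __init__(self, bucket_size=4):
        self.bucket_size = bucket_size
        self.global_depth = 1
        self.directory = [Bucket(1), Bucket(1)]

    def _index(self, key):
        # Use the global_depth low-order bits of the hash as the slot.
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key, default=None):
        return self.directory[self._index(key)].items.get(key, default)

    def put(self, key, value):
        while True:
            bucket = self.directory[self._index(key)]
            if key in bucket.items or len(bucket.items) < self.bucket_size:
                bucket.items[key] = value
                return
            self._split(bucket)  # bucket is full: split, then retry

    def _split(self, bucket):
        if bucket.depth == self.global_depth:
            self.directory += self.directory  # double the directory
            self.global_depth += 1
        bucket.depth += 1
        sibling = Bucket(bucket.depth)
        new_bit = 1 << (bucket.depth - 1)
        # Slots whose newly significant bit is 1 now point at the sibling.
        for i, b in enumerate(self.directory):
            if b is bucket and i & new_bit:
                self.directory[i] = sibling
        # Redistribute the old bucket's entries between the two buckets.
        old_items, bucket.items = bucket.items, {}
        for k, v in old_items.items():
            self.directory[self._index(k)].items[k] = v
```

For benchmarking, timing put and get over a growing dataset and recording global_depth at each point shows how directory doubling affects access time.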

10. Auto-admin for Index Creation

Given a set of query workloads and some table statistics, along with an index storage budget, write a
program to decide the best indices to create. Auto-admin tools are often available to recommend
indices, etc. Design a tool in SQLite that recommends a set of indices to build, given a particular workload
and a set of statistics in a database.
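
One simple way to approach this, sketched below, is to create each candidate index temporarily, ask SQLite through EXPLAIN QUERY PLAN whether each workload query would use it, and then pick indices greedily by benefit per unit of storage. The scoring (number of queries helped) and the size estimates passed in by the caller are assumptions made for illustration; a real tool would use the table statistics to estimate cost savings.

```python
import sqlite3


def queries_helped(conn, index_name, create_sql, workload):
    """Count workload queries whose plan mentions the candidate index."""
    conn.execute(create_sql)
    try:
        helped = 0
        for query in workload:
            plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
            # The last column of each plan row is a readable detail string.
            if any(index_name in row[-1] for row in plan):
                helped += 1
        return helped
    finally:
        conn.execute(f"DROP INDEX {index_name}")


def recommend(conn, candidates, workload, budget):
    """candidates: list of (index_name, create_sql, estimated_size)."""
    scored = []
    for name, create_sql, size in candidates:
        benefit = queries_helped(conn, name, create_sql, workload)
        if benefit:
            scored.append((benefit / size, name, size))
    chosen, used = [], 0
    # Greedy choice: best benefit-per-size first, while the budget allows.
    for _, name, size in sorted(scored, reverse=True):
        if used + size <= budget:
            chosen.append(name)
            used += size
    return chosen
```

The greedy heuristic is not guaranteed to find the best set, and it evaluates each index in isolation, so interactions between indices are ignored.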

D. Query Processing

11. External Memory Join Algorithms

Implement external memory join computation algorithms and profile their performance on large data
sets.
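
As one example of an external memory join, the sketch below outlines a Grace hash join: both inputs are hash-partitioned to temporary files on the join key, and then each pair of partitions is joined with an in-memory hash table. Rows are assumed to be tuples, pickle is used only as a convenient on-disk format, and the result is collected in a list for brevity (a real implementation would stream it out).

```python
import os
import pickle
import tempfile


def partition(rows, key_index, num_parts, directory, prefix):
    """Hash-partition rows into files on disk and return the file paths."""
    paths = [os.path.join(directory, f"{prefix}_{i}.bin") for i in range(num_parts)]
    files = [open(p, "wb") for p in paths]
    try:
        for row in rows:
            pickle.dump(row, files[hash(row[key_index]) % num_parts])
    finally:
        for f in files:
            f.close()
    return paths


def read_rows(path):
    """Stream rows back from one partition file."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                return


def grace_hash_join(left, right, left_key, right_key, num_parts=4):
    """Equi-join two iterables of tuples on the given column positions."""
    results = []
    with tempfile.TemporaryDirectory() as tmp:
        left_parts = partition(left, left_key, num_parts, tmp, "L")
        right_parts = partition(right, right_key, num_parts, tmp, "R")
        for lpath, rpath in zip(left_parts, right_parts):
            # Build phase: only one left partition is in memory at a time.
            table = {}
            for row in read_rows(lpath):
                table.setdefault(row[left_key], []).append(row)
            # Probe phase: stream the matching right partition.
            for row in read_rows(rpath):
                for match in table.get(row[right_key], []):
                    results.append(match + row)
    return results
```

For profiling, num_parts plays the role of the memory budget: more partitions mean smaller build tables but more files, and the same harness can be reused to compare against block nested-loop or sort-merge joins.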

12. Metric Reporting

Build a wrapper/interface that collects query processing metrics, such as table statistics and CPU/memory
usage at run time, while executing a query. You may use the built-in commands of Postgres.
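
The shape of such a wrapper can be sketched with client-side measurements alone. The function below works with any Python DB-API connection and is demonstrated with the standard sqlite3 module as a stand-in for Postgres; run_with_metrics is a hypothetical name, and the numbers describe the client process, not the database server.

```python
import sqlite3
import time
import tracemalloc


def run_with_metrics(conn, sql, params=()):
    """Execute a query and return (rows, metrics) measured on the client."""
    tracemalloc.start()
    wall_start = time.perf_counter()
    cpu_start = time.process_time()

    cursor = conn.cursor()
    cursor.execute(sql, params)
    rows = cursor.fetchall()

    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    metrics = {
        "rows": len(rows),
        "wall_seconds": wall_seconds,
        "cpu_seconds": cpu_seconds,
        "peak_python_bytes": peak_bytes,  # Python allocations only
    }
    return rows, metrics
```

For server-side figures in Postgres, the same wrapper could additionally run the query under EXPLAIN (ANALYZE, BUFFERS) and read table statistics from views such as pg_stat_user_tables, then merge those values into the metrics dictionary.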

13. Rule based query rewriting

Specify some fixed rules for query optimization. Write a query rewriter for simple queries which takes as
input a relational algebra query and returns an equivalent, optimized version.
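
As an example of one such fixed rule, the sketch below pushes a selection below a join when the predicate only uses attributes of one join input. The tuple-based tree encoding ("rel", "select", "join") is an assumption made for this illustration, and the predicate is kept as an opaque string together with the list of attributes it uses.

```python
# Tree encoding (hypothetical):
#   ("rel", name, [attributes])
#   ("select", predicate_text, [attributes_used], child)
#   ("join", left, right)


def attributes(node):
    """Return the set of attributes produced by a subtree."""
    kind = node[0]
    if kind == "rel":
        return set(node[2])
    if kind == "select":
        return attributes(node[3])
    return attributes(node[1]) | attributes(node[2])  # join


def push_selections(node):
    """Rewrite select(join(L, R)) into join(select(L), R) where legal."""
    kind = node[0]
    if kind == "rel":
        return node
    if kind == "join":
        return ("join", push_selections(node[1]), push_selections(node[2]))
    _, pred, used, child = node
    child = push_selections(child)
    if child[0] == "join":
        _, left, right = child
        if set(used) <= attributes(left):
            return ("join", push_selections(("select", pred, used, left)), right)
        if set(used) <= attributes(right):
            return ("join", left, push_selections(("select", pred, used, right)))
    return ("select", pred, used, child)
```

Pushing selections down is a heuristic that usually shrinks join inputs; it does not by itself guarantee an optimal plan, so a full rewriter would combine several such rules (projection pushdown, join reordering, merging cascaded selections).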

E. Distributed Databases

14. Design of large, heterogeneous, distributed database systems

The goal of the project is to design a large database running on a distributed map-reduce platform that
can handle heterogeneous data obtained from different sources. The Hadoop distributed map-reduce
platform will be used. The database will use the Apache HBase system as the table structure. It will integrate
multiple data sources using the Protocol Buffer architecture.

Such databases are widely used for processing the large, unstructured data found in search engines
and online social networks.

Steps:

1. Install Hadoop/HBase on a laptop/server cluster

2. Load data to nodes

3. Write map-reduce operations

4. Pipe map-reduce outputs using Protocol Buffer

5. Run simple queries
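
Step 3 can be prototyped locally before moving to Hadoop. The sketch below simulates the map, shuffle, and reduce phases in a single Python process with a word count job; all function names are illustrative, and nothing here uses the Hadoop, HBase, or Protocol Buffers APIs.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and yield (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the reducer to each key and its list of values."""
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    for word in line.lower().split():
        yield word, 1


def sum_reducer(key, values):
    return sum(values)


def word_count(lines):
    return reduce_phase(shuffle(map_phase(lines, word_mapper)), sum_reducer)
```

Keeping the mapper and reducer as plain functions makes it straightforward to test them on a small sample before running the same logic on the cluster.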

Technologies Involved: Hadoop, Apache HBase, Protocol Buffers

Data APIs (each group may select a separate data and appropriate queries):

Twitter, Amazon public data sets (http://aws.amazon.com/datasets)

15. Data warehouse on Hadoop

The goal of the project is to run the Hive data warehouse system on Hadoop and to run aggregate/
reporting queries on large data sets in a map-reduce framework.

Steps:

1. Install Hive on Hadoop running on a laptop/server cluster

2. Load data to Hive

3. Write and execute queries in HiveQL

16. Text search in a map-reduce framework

The project will use the Apache Solr framework running with the ZooKeeper framework.

In this project you will develop a keyword-based search engine and then benchmark it to see
how search performance scales in a very high query per second (QPS) setting. High QPS load can be
generated either with a multithreaded HTTP client using a ThreadPool and ConnectionPool, or with a
tool like Siege, which can simulate multiple concurrent HTTP requests on a server.

Solr is an open source enterprise search platform from the Apache Lucene project. Its major features
include full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and
rich document (e.g., Word, PDF) handling. Providing distributed search and index replication, Solr is
highly scalable.

Siege is an HTTP load testing and benchmarking utility. It was designed to let web developers measure
their code under duress, to see how it will stand up to load on the internet. Siege supports basic
authentication, cookies, and the HTTP and HTTPS protocols. It lets its user hit a web server with a
configurable number of simulated web browsers. Those browsers place the server "under siege."

Outcome: high-QPS search benchmarking results obtained using multithreaded HTTP clients with the help
of Siege, a connection pool, and a thread pool.
