Final Project Description
This course has a semester-long programming project, done in groups of 2-3 students.
Project Timeline
Component Due Date
Milestones
• Group formation: find your project partner(s) and begin to discuss project problems and
ideas.
• Final report and software/source code: the final report should include:
• Team Organization (Team members and their role in the project)
• Survey of related work in the related work section
• Server/System Software/Hardware configuration
• Project Definition
o Present the problem and summarize your contributions
o List of functional/non-functional requirements
• Architecture (architecture outline and architectural diagram)
• Include a detailed description of your methodology, analysis, and implementation
in the technical section
• Describe evaluation methodology and significant results in the evaluation section
• The report should also include a paragraph explaining, for each group member,
their contributions and duties in the project.
• Please specify a hyperlink at the end of your report through which we can
download your source code and data set for reproducing your experimental
results.
• Source code submission: Please provide your COMPLETE source code, datasets, and
runnable software in one package. Upload the package to GitHub or Canvas and provide a
link to it. Please include a README file specifying how to compile and run your code.
You may use libraries or online code in your implementation, but such code will not
count toward your own workload.
Project Ideas
Below are some high-level project ideas. You can search Google Scholar and/or IEEE to
find reference papers for your proposal and implementation. You are encouraged to improve on
the approach described in the paper you choose.
I will be happy to discuss these ideas with you as well as help you refine your own ideas.
• Spatial Database
A spatial database is a database that is optimized for storing and querying data that represents
objects defined in a geometric space. Most spatial databases allow representing simple geometric
objects such as points, lines and polygons. Spatial databases use a spatial index to speed up
database operations. Some basic operations are: Spatial Measurements, Spatial Functions, Spatial
Predicates, and Geometry Constructors. Some databases support only simplified or modified sets
of these operations, especially in cases of NoSQL systems like MongoDB and CouchDB.
The following are some well-known relational spatial databases:
PostGIS, SQL Server, Oracle Spatial & Oracle Locator, IBM DB2 Spatial Extender, and
SpatiaLite
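To make the role of a spatial index concrete, here is a minimal sketch of a uniform-grid index over 2D points with a rectangular range query. It is only an illustration: the class GridIndex, its methods, and the cell size are hypothetical names chosen for this example, and systems such as PostGIS use more sophisticated structures (for example R-trees) rather than this simple grid.

```python
from collections import defaultdict


class GridIndex:
    """Toy uniform-grid spatial index over 2D points (illustrative only)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        # Maps a (column, row) grid cell to the points stored in that cell.
        self.cells = defaultdict(list)

    def _cell(self, x, y):
        # Grid cell containing the coordinate (x, y).
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, x, y, payload):
        self.cells[self._cell(x, y)].append((x, y, payload))

    def range_query(self, xmin, ymin, xmax, ymax):
        """Return payloads of points inside the axis-aligned box (inclusive)."""
        c0, r0 = self._cell(xmin, ymin)
        c1, r1 = self._cell(xmax, ymax)
        hits = []
        # Only cells overlapping the query box are visited, not every point.
        for col in range(c0, c1 + 1):
            for row in range(r0, r1 + 1):
                for x, y, payload in self.cells.get((col, row), ()):
                    if xmin <= x <= xmax and ymin <= y <= ymax:
                        hits.append(payload)
        return hits


index = GridIndex(cell_size=10.0)
index.insert(1.0, 1.0, "a")
index.insert(12.0, 3.0, "b")
index.insert(55.0, 60.0, "c")
result = sorted(index.range_query(0.0, 0.0, 15.0, 5.0))
```

The range query plays the role of a spatial predicate ("which points fall inside this box?"), and the grid lets it skip cells that cannot contain a match.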
• Distributed Database
A distributed database is a set of databases stored on multiple computers that typically appears to
applications as a single database. Consequently, an application can simultaneously access and
modify the data in several databases in a network. Each database in the system is controlled by
its local server but cooperates to maintain the consistency of the global distributed database. This
project is very easy to describe: you should design and implement a real, live distributed
database that handles transaction management, concurrency control, and recovery.
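One common way to keep several local databases consistent is an atomic commit protocol such as two-phase commit. The sketch below shows the basic idea in a single process. It is a simplification under stated assumptions: the Participant class and the two_phase_commit function are hypothetical names, and the sketch omits the logging, timeouts, network messaging, and failure recovery that a real project would need.

```python
class Participant:
    """One local database taking part in a distributed transaction (toy model)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # Simulates a site that votes "no".
        self.data = {}                # Committed state.
        self.pending = None           # Writes staged during the prepare phase.

    def prepare(self, writes):
        # Phase 1: stage the writes and vote yes, or vote no.
        if not self.can_commit:
            return False
        self.pending = dict(writes)
        return True

    def commit(self):
        # Phase 2 (commit): make the staged writes visible.
        if self.pending is not None:
            self.data.update(self.pending)
            self.pending = None

    def abort(self):
        # Phase 2 (abort): discard the staged writes.
        self.pending = None


def two_phase_commit(participants, writes):
    """Coordinator: commit everywhere only if every participant votes yes."""
    votes = [p.prepare(writes) for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False


a, b = Participant("a"), Participant("b")
ok = two_phase_commit([a, b], {"x": 1})

c = Participant("c", can_commit=False)
failed = two_phase_commit([a, c], {"x": 2})
```

In the second call, site c votes no, so the write of x = 2 is discarded at every site and a keeps its previously committed value.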
• Streaming Database
A data stream is an unbounded data set that is produced incrementally over time, rather than
being available in full before its processing begins. A traditional database management system
typically processes a stream of ad-hoc queries over relatively static data. In contrast, a Data
Stream Management System (DSMS) evaluates static (long-running) queries on streaming data,
making a single pass over the data and using limited working memory. One important feature of
a DSMS is its ability to handle potentially infinite and rapidly changing data streams with
flexible processing, even though only limited resources such as main memory are
available.
Examples of Data Stream Management systems are:
SQLstream, STREAM, AURORA, QSTREAM, TelegraphCQ, SAP Event Stream Processor,
InfoSphere Streams.
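The single-pass, limited-memory style of processing can be illustrated with a long-running query that reports the average of the most recent values in a stream. This is a minimal sketch, not how any of the systems listed above is implemented; the function name sliding_average and the sample readings are assumptions made for the example.

```python
from collections import deque


def sliding_average(stream, window_size):
    """Yield the average of the last `window_size` values after each arrival.

    Memory use is bounded by the window size, and each value is seen once.
    """
    window = deque()
    total = 0.0
    for value in stream:
        window.append(value)
        total += value
        if len(window) > window_size:
            # Evict the oldest value so the window never grows past its limit.
            total -= window.popleft()
        yield total / len(window)


readings = [10, 20, 30, 40]
averages = list(sliding_average(iter(readings), window_size=2))
```

Because the function is a generator, it also works on an unbounded source: results are produced as each value arrives, without waiting for the stream to end.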
• Image Retrieval
While searching for textual data on the World Wide Web and in other databases has become
common practice, search engines for pictorial data are still rare. This comes as no
surprise, since it is a much more difficult task to index, categorize and analyze images
automatically, compared with similar operations on text. An easy way to make a searchable
image database is to label each image with a text description, and to perform the actual search on
those text labels. However, a huge amount of work is required in manually labelling every
picture, and the system would not be able to deal with any new pictures not labelled before.
Furthermore, it is difficult to give complete descriptions for most pictures. Early approaches to
the content-based image retrieval problem include the IBM QBIC (Query-By-Image Content)
System, where users can query an image database by average color, histogram, texture, shape,
sketch, etc. Later, techniques based on query by example were developed. Query by example is a
query technique that involves providing the system with an example image that it will then base
its search upon. Below is a list of publicly available Content-based image retrieval (CBIR)
engines. These image search engines look at the content (pixels) of images in order to return
results that match a particular query:
Pixolution, JustVisual, Elastic Vision, Google Image Search, SearchByImage, Picalike, ID My
Pill, PicScout.
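Query by example using color histograms can be sketched in a few lines. The example below is only an illustration under simple assumptions: images are represented as non-empty lists of (R, G, B) tuples with values from 0 to 255, similarity is measured by histogram intersection, and the function names and the tiny in-memory "database" are hypothetical. It does not reflect how QBIC or any of the engines listed above works internally.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized RGB histogram of a non-empty list of (r, g, b) pixels."""
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index in 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1.0
    n = len(pixels)
    return [h / n for h in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def query_by_example(query_pixels, database, top_k=1):
    """Return the names of the top_k images most similar to the query image."""
    q = color_histogram(query_pixels)
    scored = [
        (histogram_intersection(q, color_histogram(pixels)), name)
        for name, pixels in database.items()
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


database = {
    "red": [(250, 5, 5)] * 4,
    "blue": [(5, 5, 250)] * 4,
}
query = [(240, 10, 10)] * 3 + [(10, 10, 240)]
best = query_by_example(query, database)
```

The query image is mostly red, so the red image scores higher than the blue one. A real system would precompute and index the database histograms instead of recomputing them for every query.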