ThesisFS: Online Document Management System
Joseph Christian G. Noel, William Yu, Pierre Tagle, PhD
4. Dynamic
Search functionality is needed to help the user find specific
files quickly. This feature goes hand in hand with scalability, as
it lets the user's productivity scale with the number of documents
in the collection. ThesisFS will have multiple options for
conducting searches: metadata and content search, Labels, and
Search Folders.
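
As a rough illustration of how name, Label, and content matches might be combined into a single ranking, the sketch below applies the point values stated in Section 5.2 (100 per name match, 50 per matching Label value, and 5 per content occurrence). The class and field names are hypothetical, and the matching rules (case-insensitive substring match for names and Labels, exact word lookup for contents) are assumptions; the paper does not specify them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the Section 5.2 scoring rules; not the ThesisFS source.
final class SearchScoreSketch {
    static final int NAME_POINTS = 100;   // per search word found in the file name
    static final int LABEL_POINTS = 50;   // per Label value matching a search word
    static final int CONTENT_POINTS = 5;  // per occurrence of a search word in the contents

    /** Stand-in for a stored file: its name, Label values, and word counts from the index table. */
    static final class IndexedFile {
        final String name;
        final List<String> labels;
        final Map<String, Integer> wordCounts;

        IndexedFile(String name, List<String> labels, Map<String, Integer> wordCounts) {
            this.name = name;
            this.labels = labels;
            this.wordCounts = wordCounts;
        }
    }

    /** A file paired with its search score. */
    static final class Hit {
        final IndexedFile file;
        final int score;

        Hit(IndexedFile file, int score) {
            this.file = file;
            this.score = score;
        }
    }

    /** Sums the points earned by every whitespace-separated word of the query. */
    static int score(IndexedFile file, String query) {
        int total = 0;
        for (String word : query.toLowerCase().trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            // Assumption: a name or Label "match" means a case-insensitive substring match.
            if (file.name.toLowerCase().contains(word)) {
                total += NAME_POINTS;
            }
            for (String label : file.labels) {
                if (label.toLowerCase().contains(word)) {
                    total += LABEL_POINTS;
                }
            }
            // Assumption: indexed words are stored in lower case, one count per word.
            total += CONTENT_POINTS * file.wordCounts.getOrDefault(word, 0);
        }
        return total;
    }

    /** Keeps files scoring above zero, ordered by descending score. */
    static List<Hit> search(List<IndexedFile> files, String query) {
        List<Hit> hits = new ArrayList<>();
        for (IndexedFile file : files) {
            int s = score(file, query);
            if (s > 0) {
                hits.add(new Hit(file, s));
            }
        }
        hits.sort((a, b) -> Integer.compare(b.score, a.score));
        return hits;
    }

    public static void main(String[] args) {
        List<IndexedFile> files = List.of(
            new IndexedFile("Thesis Draft.txt", List.of("thesis"), Map.of("thesis", 3)),
            new IndexedFile("notes.txt", List.of(), Map.of("thesis", 1)));
        for (Hit hit : search(files, "thesis")) {
            System.out.println(hit.file.name + " " + hit.score);
        }
    }
}
```

In this sketch a file that matches nothing is simply omitted from the result, which mirrors the rule that only files with a score greater than zero are returned.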
iDisk is the online file-storage system by Apple Computer, Inc.,
offered as part of its .Mac package of Internet services. It
supports the creation of folders for grouping files, and a public
folder for making files available online. Rather than being a
web-based system, it uses the WebDAV protocol for access.
Accessing it requires a Mac OS X computer, which has built-in
iDisk WebDAV support, or the iDisk Utility for Windows XP. A
major feature of iDisk, at least on Apple computers, is its
integration with Mac OS X.

4.1 Filesystem
By defining their own Action classes when the need arises, users
can adapt the system to their needs and meet emerging
requirements for workflows and document management.

4.5 Usability
The system is meant to be used like any other web-based
filesystem. There will be clear links for downloading files and
entering folders, and HTML forms for uploads and searches. Users
are encouraged to create Search Folders and Action Folders as
much as possible to ease the job of managing their files; both are
designed to be easy to create for this reason. Proper use of
Search Folders and Action Folders makes the user's collection
more dynamic and gives the user more options.

To summarize, the key components of any document management
system are effective indexing and searching functionality, and a
way to automate workflow for easy management and collaboration.
ThesisFS has content and metadata indexing and searching, and
also allows users to add their own metadata through Labels.
Searches look through all of these and return the relevant results
to the user. A key feature of ThesisFS is the ability to save
those searches as Search Folders, which can be accessed later and
act as folders with dynamic contents. Action Folders are folders
with Actions attached to them. Actions are triggered whenever a
file is moved into the folder. Actions can perform any operation
on a file and can be used to manage the workflow and free the
user from repetitive tasks.

These are the main features of ThesisFS. Taken together, they
can serve as a framework for later, more advanced document
management systems.

5. IMPLEMENTATION DETAILS

5.1 Indexing
Indexing is done through specific Indexer classes. Every Indexer
class has a method called contentIndex. This method takes as
arguments the contents of the file, the file id, a
java.sql.Statement object, and the username of the user. The
Indexer is responsible for parsing the contents of the file and
inserting them into the index table of the database. The index
table has the following schema:

    file TEXT REFERENCES file(id),
    word TEXT,
    count INTEGER,

The index table records which words appear in a file and how
many times each word appears. The Indexer class inserts data into
the table through the java.sql.Statement argument of the
contentIndex method. Indexers are mapped to a specific file [...]

To get the appropriate Indexer class for a specific file type,
the getIndexer method of the indexer.IndexerUtil class is called.
This method takes a file type as an argument, parses the
indexer.xml file with a SAX parser, and returns the Indexer for
that file type. The complete code for parsing during uploads is:

    // Look up the Indexer registered for this file's type in indexer.xml.
    IndexerUtil indexerUtil = new IndexerUtil(context.getRealPath(indexerConfigPath));
    Indexer indexer = indexerUtil.getIndexer(bf.getType());
    // File types without a registered Indexer are not content indexed.
    if (indexer != null) {
        indexer.contentIndex((new String(bf.getFile())).toLowerCase(), id, statement, bf.getOwner());
    }

5.2 Searching
During searches, ThesisFS tokenizes the query into search words
and tries to match each word against a file's name, Labels, and
contents. A search score is awarded for every matched search
word: a name match is worth 100 points, each matching Label value
is worth 50 points, and a content match is worth five points
multiplied by the number of occurrences of the word. All files
with a search score greater than zero are sorted in descending
order of score and returned to the user as a list.

5.3 Action Folders
Action Folders are the authors' implementation of automated
actions. Action classes are mapped to an Action Folder through
the is_action and action columns of the folder table. Action
classes are defined in the action.xml configuration file and
implemented in the action package. A sample entry in the
action.xml file, for the Zip action, is:

    <action>
        <name>Zip</name>
        <class>action.Zip</class>
    </action>

All Action classes should implement the Action interface and
define the doAction method. doAction takes a beans.BinaryFile
object as an argument. This beans.BinaryFile contains all the
information about a file needed for editing. To get the Action
class for a specific Action Folder, the getAction method of the
action.ActionUtil class is used. This method takes as an argument
a String object specifying the name of the Action. In the Upload
servlet, the code for activating an action when a file is
uploaded to an Action Folder is:

    String actionName = result.getString("action");
    ServletContext context = getServletConfig().getServletContext();
    ActionUtil actionUtil = new ActionUtil(context.getRealPath(actionConfigPath));
    Action action = actionUtil.getAction(actionName);
    action.doAction(bf);

6. RESULTS
The system was run on an iBook G4 with an 800 MHz G4 processor
and 640 MB of RAM. The operating system is Mac OS X 10.3.8. The
application server used is JBoss 3.2.5, and the database server
is PostgreSQL 7.3.

6.1 Comparison of Features

Feature                  DMS       Yahoo!  FileNet  iDisk  ThesisFS
Web-access               Ideal     Yes     *        Yes    Yes
File system features     Required  Yes     Yes      Yes    Yes
Content Indexing         Required  No      Yes      No     Yes
Plug-in based Indexing   Ideal     No      *        No     Yes
Search features          Required  No      Yes      Yes    Yes
Search Folders           Ideal     No      No       No     Yes
Workflow                 Required  No      Yes      No     Yes

[...] indexing content. ThesisFS, on the other hand, goes further
by allowing the user to save those searches, so that dynamic
results can be retrieved at a later date. ThesisFS also implements
the concept of Labels, which are user-defined metadata that can
be added to files. This rests on the premise that a user will
almost certainly know best what information should be attached to
a file for indexing. Lastly, ThesisFS provides Action Folders to
help better manage the user's workflow. The possibilities for
advanced Action classes are nearly limitless and are constrained
only by the imagination and skill of the programmer. It is these
features (advanced searching functionality, Search Folders, and
Action Folders) that separate ThesisFS from other document
management systems.

8. RECOMMENDATIONS
The authors recommend that further Indexer classes for different
file types be created to increase the range of files the system
can content index. The authors further recommend that more Action
classes be created to support more advanced automated actions
upon files.

Another way to improve the system is to integrate keywords into
the search functionality, similar to Google. This would enable
users to do more advanced searches by constraining or expanding
the search criteria as they see fit.

Lastly, we recommend adding more advanced filesystem features to
the system. These may include Access Control Lists, copy and move
commands, and online editing of file contents.

9. REFERENCES
[1] “Working with Spotlight”, Apple Developer Connection,
<http://developer.apple.com/macosx/tiger/spotlight.html>
[9] FileNet Content Manager Brochure, FileNet Corporation,
<http://www.filenet.com/English/Products/Datasheets/023250027.pdf>
[11] Lawrence, Steve, Bollacker, Kurt, Giles, C. Lee, “Indexing
and Retrieval of Scientific Literature”, Eighth International
Conference on Information and Knowledge Management, pp. 139-146
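
The Action lookup described in Section 5.3 could be sketched roughly as follows: read the name-to-class entries from an action.xml-style document and instantiate the class registered under a given name by reflection. This is only an illustration under stated assumptions. The paper mentions a SAX parser for indexer.xml only, so using SAX here is an assumption; the `<actions>` root element, the class name ActionLookupSketch, the sample Reverse action, and the simplified Action interface (which takes a StringBuilder in place of beans.BinaryFile) are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch of an action.xml lookup; not the ThesisFS ActionUtil source.
final class ActionLookupSketch {

    /** Simplified stand-in for the Action interface; the real doAction takes a beans.BinaryFile. */
    interface Action {
        void doAction(StringBuilder file);
    }

    /** Trivial sample action used only to exercise the lookup. */
    static final class Reverse implements Action {
        public void doAction(StringBuilder file) {
            file.reverse();
        }
    }

    /** Reads <action><name>..</name><class>..</class></action> entries into a name-to-class map. */
    static Map<String, String> parse(InputStream xml) throws Exception {
        Map<String, String> classByName = new HashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private String name;
            private String className;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                text.setLength(0);
                if (qName.equals("action")) {
                    name = null;
                    className = null;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String value = text.toString().trim();
                if (qName.equals("name")) {
                    name = value;
                } else if (qName.equals("class")) {
                    className = value;
                } else if (qName.equals("action") && name != null && className != null) {
                    classByName.put(name, className);
                }
                text.setLength(0);
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(xml, handler);
        return classByName;
    }

    /** Instantiates the Action registered under actionName, or returns null if none is registered. */
    static Action getAction(Map<String, String> classByName, String actionName) throws Exception {
        String className = classByName.get(actionName);
        if (className == null) {
            return null;
        }
        return (Action) Class.forName(className).getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<actions><action><name>Reverse</name><class>"
            + Reverse.class.getName() + "</class></action></actions>";
        Map<String, String> registry =
            parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Action action = getAction(registry, "Reverse");
        StringBuilder file = new StringBuilder("abc");
        if (action != null) {
            action.doAction(file);
        }
        System.out.println(file);
    }
}
```

Returning null for an unregistered name lets the caller skip the action, in the same way the upload code in Section 5.1 skips indexing when no Indexer is found.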