The Implementation of A Web Crawler URL Filter Algorithm Based On Caching
Abstract — For large-scale Web information collection, the URL filter module plays an important role in a Web crawler, which is a central component of a search engine. The performance of the URL filter module directly influences the efficiency of the entire collection system. This paper introduces a URL filter algorithm based on caching and its implementation. The stability and parallelism of the algorithm are verified by experiments on Websites which handle a large number of web pages. Experiment results show that the algorithm proposed in this paper can achieve satisfactory performance through reasonable adjustment of some of its parameters, and that it is especially suitable for URL filtering of a Website which has a large number of page navigator links and index pages.

Keywords - Web Crawler; URL Filter; Caching
I. INTRODUCTION

Accompanied by the informationization of society and the rapid development of the Internet, more and more information is available on the Internet. This makes the search engine, one of the main applications of modern information technology, a necessary tool by which people can get information through the Internet more easily and quickly. A web crawler, which is a central component of a search engine, impacts not only the recall ratio and precision of a search engine, but also its data storage capacity and efficiency.

The World Wide Web can be viewed as a directed graph without rules and borders, and a wide range of cycles exists in it. Due to the large number of index pages and navigation contents in a Website, as well as the large number of relevant content links in a web page, the URLs extracted from web pages are far more numerous than the actual number of distinct URLs on the Internet.

The URL filter module, which is an important component of a crawler, is used to filter the URLs extracted from the web pages downloaded by the crawler of a search engine, so as to improve the efficiency of the crawler. We design a URL filter algorithm based on caching and implement it. Experiment results show that the algorithm proposed in this paper achieves satisfactory performance.

II. A WEB CRAWLER

A Web crawler (also known as a Web spider or a Web robot) is a program or an automated script which can get specified pages from the Internet, extract the links from these pages and follow the links to find new pages and links [1,2]. A Web crawler usually starts from a single URL address or a URL list, visits each page specified by each URL, extracts the links in each page by content analysis, eliminates the repetitive URLs after downloading each page, and then adds the new links to the URL list.

Figure 1. A simplified web crawler (components: Seed URL, Internet, URL List, Page Downloader, Link Extractor, URL Filter, Crawling Parameter Assistor)

Figure 1 shows a simplified Web crawler. According to Figure 1, a Web crawler starts from a URL, called the Seed URL, to visit the Internet. The Page Downloader gets a URL from the URL List, downloads the corresponding page from the Internet, and transfers the page to the Link Extractor. The Page Downloader checks the parameters in accordance with the requirements of crawling to decide whether or not to download a page. As the crawler visits these URLs, the Link Extractor identifies all the hyperlinks in the pages in accordance with the requirements of crawling and transfers them to the URL Filter, which stores the results into the URL List. The Crawling Parameter Assistor provides the parameter settings needed by all parts of the crawler.
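The control flow just described is easy to state in code. The following is a minimal sketch of the Figure 1 loop, not the paper's implementation; the helper names (download_page, extract_links) and the crude regular-expression link extraction are illustrative assumptions.

```python
import re
from collections import deque
from urllib.parse import urljoin
from urllib.request import urlopen

def download_page(url):
    # Page Downloader: fetch the raw HTML of one page; failures are skipped.
    try:
        return urlopen(url, timeout=10).read().decode("utf-8", errors="ignore")
    except Exception:
        return None

def extract_links(base_url, html):
    # Link Extractor: a crude href scan; a real crawler would parse the HTML.
    return [urljoin(base_url, h) for h in re.findall(r'href="([^"#]+)"', html)]

def crawl(seed_url, max_pages=100):
    # The URL List, initialized with the Seed URL.
    url_list = deque([seed_url])
    # A plain set stands in for the URL Filter; Sections III and IV refine it.
    seen = {seed_url}
    pages = 0
    while url_list and pages < max_pages:
        url = url_list.popleft()
        html = download_page(url)
        if html is None:
            continue
        pages += 1
        for link in extract_links(url, html):
            if link not in seen:          # eliminate repetitive URLs
                seen.add(link)
                url_list.append(link)     # new links go back into the URL List
    return pages
```

Here the seen set plays the role of the URL Filter; the following sections replace it with the caching structure this paper proposes.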
III. THE COMMON METHODS OF THE URL FILTER

In order to avoid downloading the same page repeatedly, we need to eliminate the repetitive URLs. The URL filter can easily become a performance bottleneck of the crawler system, because the URL list to be queried grows bigger and bigger. The main idea of the URL filter is to determine whether a URL already exists in the known URL list, so the URL query algorithm is the key. The basic process is for each string of
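As a point of reference before the caching algorithm, the membership test named above can be sketched with a single in-memory set. This baseline is our illustration, not a method from the paper, and it shows why the lookup structure becomes a bottleneck: the set must hold every known URL.

```python
class NaiveUrlFilter:
    """Baseline URL filter: one in-memory set holding every known URL.

    Lookup is O(1) on average, but the set must grow with the known URL
    list, which is exactly the bottleneck described above.
    """

    def __init__(self):
        self.known = set()

    def is_new(self, url):
        # Return True (and remember the URL) only the first time it is seen.
        if url in self.known:
            return False
        self.known.add(url)
        return True
```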
1) If there is a URL list of the Websites to be crawled in the database, select the top initNum URLs, ordered descending by their number of repetitions, to initialize the visitingList; otherwise, initialize the visitingList with the index page of the Website.

Figure 2. URL filter algorithm based on caching (flowchart: a parsed URL is looked up in the visitedList, then the visitingList, then the database; each hit increases the URL's duplicated number, and a miss everywhere adds a new URL to the visitingList)
2) To add the URLs parsed from one page, with duplicates eliminated, into the related URL lists in the URL Filter. The specific filter steps are as follows (see the code sketch after step 3): for a parsed URL, search the visitedList of the URL Filter first; if it is found, it is a repetition of some URL, so increase its duplicated number. Otherwise, search the visitingList; if it is found there, it is likewise a repetition, so increase its duplicated number. Otherwise, search the database; if it is not found, it is a new URL, so add it to the visitingList. If it is found in the database, it is a repetition of some URL, so increase its duplicated number and add it to the related URL list according to whether or not it has been visited.

3) If the size of the URL caching lists in the URL Filter is larger than the value set in the crawling parameters, transfer a URL collection of the set capacity into the database for storage; otherwise, load a block of URLs from the database into the URL lists according to the replacement strategy.
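Steps 1-3 can be made concrete with a short sketch. This is a hedged reconstruction rather than the authors' code: the database is modeled as an in-memory dict, the replacement strategy is simplified to FIFO spilling (the load-back path of step 3 is omitted), and the names CachedUrlFilter, init_num, and cache_capacity are illustrative assumptions.

```python
from collections import OrderedDict

class CachedUrlFilter:
    """Illustrative sketch of the caching URL filter (steps 1-3, Figure 2)."""

    def __init__(self, db, index_page, init_num=1000, cache_capacity=10000):
        # db models the external database: {url: (duplicated_number, visited)}.
        self.db = db
        self.cache_capacity = cache_capacity
        self.visited_list = OrderedDict()    # cached URLs already crawled
        if db:
            # Step 1: seed the visitingList with the initNum most-repeated URLs.
            top = sorted(db, key=lambda u: db[u][0], reverse=True)[:init_num]
            self.visiting_list = OrderedDict((u, db[u][0]) for u in top)
        else:
            # Step 1, fallback: start from the index page of the Website.
            self.visiting_list = OrderedDict([(index_page, 0)])

    def is_new(self, url):
        # Step 2: cascade visitedList -> visitingList -> database.
        if url in self.visited_list:          # hit in the visitedList
            self.visited_list[url] += 1
            return False
        if url in self.visiting_list:         # hit in the visitingList
            self.visiting_list[url] += 1
            return False
        if url not in self.db:                # miss everywhere: a new URL
            self.visiting_list[url] = 0
            self._spill()
            return True
        dup, visited = self.db[url]           # hit in the database
        target = self.visited_list if visited else self.visiting_list
        target[url] = dup + 1
        self._spill()
        return False

    def _spill(self):
        # Step 3 (simplified): when a cached list outgrows its capacity,
        # move its oldest entries back to the database (FIFO replacement).
        for lst, visited in ((self.visiting_list, False),
                             (self.visited_list, True)):
            while len(lst) > self.cache_capacity:
                url, dup = lst.popitem(last=False)
                self.db[url] = (dup, visited)
```

Starting from an empty database, is_new returns True exactly once per distinct URL, while the accumulated duplicated numbers supply the repetition statistics used in step 1 and in the evaluation below.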
V. ALGORITHM IMPLEMENTATION AND PERFORMANCE EVALUATION

The performance of the URL filter algorithm based on caching, which is used in large-scale webpage collection to eliminate duplication, was analyzed and validated through experiments on the prototype system.

In the experiments, having set the parameters (caching capacity, exchange rate between cache and external storage, and depth of crawling) according to different crawling situations, we obtained basic data from a wide variety of Websites, and then derived the URL filter data from a synthetic analysis of the basic data. The effect of the URL filter algorithm implemented in this paper is compared in Figure 3, in which the X-coordinate is the average links of a page (the average number of effective links per page in a Website) and the Y-coordinate is the rate of hitting the cache (the ratio of pending URLs that can be found in the cache).

Figure 3. The effect comparison of URL filter (X-axis: the average links of a page; Y-axis: the rate of hitting the cache, 50-100%; series: First and Non-First crawls)
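The two quantities plotted in Figure 3 reduce to simple counters; a small sketch under assumed names (links_extracted, cache_hits, lookups, none of which appear in the paper) follows.

```python
def average_links_of_a_page(links_extracted, pages_downloaded):
    # X-coordinate of Figure 3: effective links per downloaded page.
    return links_extracted / pages_downloaded

def rate_of_hitting_the_cache(cache_hits, lookups):
    # Y-coordinate of Figure 3: share of pending URLs resolved in the cache, %.
    return 100.0 * cache_hits / lookups

# Worked example with the http://www.jtstar.com row of Table I below:
# a first-crawl hit rate of 89.7% over 13986 links means roughly
# 0.897 * 13986, i.e. about 12545 pending URLs were resolved in the cache.
print(rate_of_hitting_the_cache(12545, 13986))  # ~89.7
```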
We know from the experimental data that the higher the average links of a page, the higher the rate of hitting the cache and the more efficient the URL filter. In particular, when a Website is crawled again, the rate of hitting the cache increases significantly and the performance of the URL filter improves accordingly. It is therefore very efficient to crawl a Website which has a large number of page navigator links and index pages by using the web crawler URL filter algorithm based on caching. In addition, a large amount of experimental data shows that the algorithm performs with consistent stability and parallelism while crawling all types of Websites. The experimental data are shown in Table I.

The experimental data in Table I indicate that as the number of links increases, the rate of hitting the cache also increases; in particular, the higher the average links of a page, the higher the efficiency of eliminating duplication. The prototype system, which centers on the database storage, has more than one processing node, and each node can crawl in parallel. The duplicate-elimination performance of each crawling node does not decrease as nodes are added, and the processing performance of the whole system is influenced only by database concurrency and the network environment, so the stability and parallelism of the algorithm are good.
Table I. URL filter data for different types of Websites

Website URL                   NL      NLAF    RHCF (%)   RHCNF (%)
http://cs.scu.edu.cn           1408    399     59.5       100
http://www.jtstar.com         13986   1321    89.7       96.6
http://www.scu.edu.cn         23550   7837    64.5       77.9
http://news.sina.com.cn       38763   11861   67.1       78.7
www.xinhuanet.com             61918   10368   68.0       88.6

where:
NL — Number of Links
NLAF — Number of Links After Filtering
RHCF — the Rate of Hitting the Cache on the First crawl (%)
RHCNF — the Rate of Hitting the Cache on a Non-First crawl (%)

VI. CONCLUSION

This paper introduces a URL filter algorithm based on caching and its implementation. The stability and parallelism of the algorithm are verified by experiments on Websites which handle a large number of web pages. Experiment results show that the algorithm proposed in this paper can achieve satisfactory performance through reasonable adjustment of some of its parameters, and that it is especially suitable for URL filtering of a Website which has a large number of page navigator links and index pages.
REFERENCES

[1] Yuan Wan and Hengqing Tong, "URL Assignment Algorithm of Crawler in Distributed System Based on Hash," IEEE International Conference on Networking, Sensing and Control, 2008.

[2] Wu Lihui, Wang Bin, and Yu Zhihua, "Design and Realization of a General Web Crawler," Computer Engineering, February 2005, pp. 123-124.

[3] Christopher Martinez, Wei-Ming Lin, and Parimal Patel, "Optimal XOR Hashing for a Linearly Distributed Address Lookup in Computer Networks," Symposium on Architecture for Networking and Communications Systems, 2005, pp. 203-210.

[4] Xiao-Guang Liu and Jun Lee, "K-Divided Bloom Filter Algorithm and Its Analysis," Future Generation Communication and Networking (FGCN 2007), 2007, 1(6-8), pp. 220-224.