
ABSTRACT

Nowadays the Internet plays a vital role in the world, and Internet searching is part of everyday life for all generations. This project aims at effective categorization of an individual user's browsing behaviour, which is represented using a search matrix. The search matrix houses the various search queries and the pages visited by the user. The visited pages are indexed using personalized weights based on the factors that affect personalization, such as user actions, page-view time and page hits. Over a period of time the search matrix grows in dimension. To search effectively for relevant pages from the search matrix, Latent Semantic Indexing is used to derive a smaller subset of the original search matrix. The reduced search matrix is then clustered using the Particle Swarm Optimization (PSO) algorithm to identify semantically related searches. These semantically related searches are then analysed for recommending pages to the user, and the project also aims to evaluate the search results.
CHAPTER – I

INTRODUCTION

As many research communities are increasingly concerned with

issues of interaction design, one of the current foci in information

science is on user behaviour in seeking information on the World Wide

Web. A frequently applied methodology for studying this behaviour is log

analysis. This approach has several advantages: users do not need to be

directly involved in the study, a picture of user behaviour is captured in

non-invasive conditions, and every activity inside the system can be

tracked.

User log studies mainly rely on the standard analytical capabilities of existing software packages for statistical reporting. Such software provides limited knowledge of user behaviour, since it only produces comparatively general insights into aspects of information services, such as the number of users per month or the most frequently followed hyperlink, and thus tells little about specific navigation behaviour.

A variety of aspects of user information-seeking behaviour have been studied previously using log analysis, in digital libraries, web search engines, and other web-based information services. Browsing behaviour, however, has received far less attention.
The common belief seems to be that users prefer searching to

browsing: Lazonder claims “…students strongly prefer searching to

browsing. Our usability studies show that more than half of all users are

search-dominant, about a fifth of the users are link-dominant, and the

rest exhibit mixed behaviour. The search-dominant users will usually go

straight for the search button when they enter a website: they are not

interested in looking around the site; they are task-focused and want to

find specific information as fast as possible. In contrast, the link-

dominant users prefer to follow the links around a site: even when they

want to find specific information, they will initially try to get to it by

following promising links from the home page. Only when they get

hopelessly lost will link-dominant users admit defeat and use a search

command. Mixed-behaviour users switch between search and link-

following, depending on what seems most promising to them at any

given time but do not have an inherent preference”.

These observations have implications for building searching-

oriented user interfaces. However, those results could be dependent on

a number of issues that might not yet have been recognized. One such issue is, for example, the role of the web page layout in “favouring” either of the two strategies. Hong conducted a study on browsing strategies and their implications for the design of web search engines. The study reports that the existing browsing features of search engines are insufficient for users. Even within the CUBB project, an initial belief about potential user requirements was that end-users preferred searching to browsing [9]. After the browsing interface had been built, it turned out that browsing was much favoured.

The overall purpose of the project is to gain insights into real users' navigation, and especially browsing behaviour, in a large service on the web. This knowledge could be used to improve such services, in this case the CUBB service, which offers a large LSI/PSO-based browsing structure. The

project aimed at studying the following topics: the unsupervised usage

behaviour of all CUBB users, complementing the initial CUBB user

enquiry; detailed usage patterns (quantitative/qualitative, paths through

the system); the balance between browsing, searching and mixed

activities; typical sequences of user activities and transition probabilities

in a session, especially in traversing the hierarchical LSI/PSO browsing

structure; the degree of usage of the browsing support features; and

typical entry points, referring sites, points of failure and exit points.

Because of the high cost of full usability lab studies, we also wanted to

explore whether a thorough log analysis could provide valuable insights

and working hypotheses as the basis for good usage and usability

studies at a reasonable cost.

BACKGROUND

CUBB service

CUBB exploits the success of subject gateways, where subject

experts select quality resources for their users, usually within the

academic and research communities. This approach has been shown to

provide a high quality and valued service, but encounters problems with

the ever increasing number of resources available on the Internet. CUBB

is based on a distributed model where major subject gateway services

across Europe can be searched and browsed together through a single

interface provided by the CUBB broker. The CUBB partner gateways

cover over 80,000 predominantly digital, web-based resources from

within most areas of academic interest, mainly written in English.

The CUBB service allows searching several subject gateways

simultaneously. What is searched are “catalogue records” (metadata) of

quality controlled web resources, not the actual resources. There are

two ways to search the service, either through a simple search box that

is available on the CUBB “Home” page or through the “Advanced

search” page allowing combination of terms and search fields and

providing options to limit searches in a number of different ways.

Apart from searching, CUBB offers subject browsing in a

hierarchical directory-style. It is based on intellectual mapping of

classification systems used by the distributed gateway services using

LSI, PSO. There are also several browsing-support features. The

graphical fish-eye display presents the classification hierarchy as an

overview of all available categories that surround the category one

started from, normally one level above and two levels below in the

hierarchy. This allows users to speed up the browsing and get an

immediate overview of the relevant CUBB browsing pages for a subject.

The feature “Search entry into the browsing pages” offers a short-cut to

categories in the browsing tree where the search term occurs.

CHAPTER – II

PROBLEM DEFINITION AND METHODOLOGY

Hit Rate calculation

Hit rate calculation produces a numeric value that represents how important a page is on the web. Google's view is that when one page links to another page, it is effectively casting a vote for the other page. The more votes that are cast for a page, the more important the page must be.

To calculate the PageRank of a page, all of the pages linking to it are taken into account, together with the number of outbound links each of those pages has. The number of pages that match a given query is also calculated.

PR (A) = (1-d) + d (PR (t1)/C (t1) + ... + PR (tn)/C (tn))

In the equation, PR is the PageRank, 't1 … tn' are the pages linking to page A, 'C' is the number of outbound links a page has, and 'd' is a damping factor, usually set to 0.85.

For example, suppose the pages linking to page A contribute PR/C values of 1/1, 2/1 and 3/5:

PR (A) = (1-0.85) + 0.85((1/1)+(2/1)+(3/5))
       = 0.15 + 0.85(1 + 2 + 0.6)
       = 0.15 + 0.85 × 3.6
       = 0.15 + 3.06
       = 3.21
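The same arithmetic can be expressed in a few lines of code. The sketch below is illustrative only; the PR/C values are the hypothetical ones from the worked example above, not data from the project.

// Minimal PageRank-style calculation for a single page, following
// PR(A) = (1 - d) + d * sum(PR(ti) / C(ti)) with d = 0.85.
using System;

class PageRankExample
{
    static double PageRank(double damping, double[] inboundPageRanks, int[] outboundCounts)
    {
        double sum = 0.0;
        for (int i = 0; i < inboundPageRanks.Length; i++)
        {
            // Each linking page contributes its own rank divided by
            // the number of outbound links it carries.
            sum += inboundPageRanks[i] / outboundCounts[i];
        }
        return (1 - damping) + damping * sum;
    }

    static void Main()
    {
        // Hypothetical values from the worked example: PR/C = 1/1, 2/1, 3/5.
        double[] pr = { 1.0, 2.0, 3.0 };
        int[] c = { 1, 1, 5 };
        Console.WriteLine(PageRank(0.85, pr, c)); // prints approximately 3.21
    }
}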

HTTP GET OPERATION / HTTP POST OPERATION

HTTP GET: We can pass parameters to a Web Service by calling the ASMX page with query string parameters naming the method to call and giving the values of the simple parameters to pass.

HTTP POST: Works the same as the GET operation, except that the parameters are passed as standard URL-encoded form variables. If we use a client library such as wwIPStuff, we can use AddPostKey() to add each parameter in the proper parameter order.

SOAP: This is the proper way to call a Web Service in .NET, and it is also the way that .NET uses internally to call Web Services.
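As a rough illustration of the GET and POST styles, the snippet below calls a hypothetical ASMX method using the standard WebClient class; the service URL, method name and "query" parameter are placeholders, not part of the actual project.

// Calling an ASMX web method over HTTP GET and HTTP POST.
// The URL and the parameter name are hypothetical placeholders.
using System;
using System.Collections.Specialized;
using System.Net;
using System.Text;

class AsmxCallExample
{
    static void Main()
    {
        using (var client = new WebClient())
        {
            // HTTP GET: method name and parameters in the query string.
            string getResult = client.DownloadString(
                "http://localhost/SearchService.asmx/GetPages?query=data+mining");

            // HTTP POST: the same parameters as URL-encoded form variables.
            var form = new NameValueCollection { { "query", "data mining" } };
            byte[] response = client.UploadValues(
                "http://localhost/SearchService.asmx/GetPages", form);
            string postResult = Encoding.UTF8.GetString(response);

            Console.WriteLine(getResult);
            Console.WriteLine(postResult);
        }
    }
}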

Browser Creation

The collaboration space object model provides for a place consisting of rooms. A room is made up of pages, and folders are used to organize pages. Members belong to rooms and are those users authorized to access them. A place type controls the creation of a place, including, for example, how many rooms it has, while a room type controls the appearance and content of rooms. A form manages the display of data notes; a form can contain fields for holding data and can employ scripts to process and compute data. A page is the basic vehicle for content. Content is created using an editor or by importing content from an external source. A member is also a data note, and each place contains its own member directory. A place is created and managed from a client browser, in on-line mode or in offline mode against a replicated copy of the space. Room security is independently managed, and the security and aesthetic characteristics of subrooms are selectively inherited. Room navigation and workflow processing are provided, as are forms creation and uploading from browser to server.

Track User Search Behaviour

Five user experiments are presented on incorporating behavioural information into the relevance feedback process. In particular, the work concentrates on ranking terms for query expansion and on selecting new terms to add to the user's query. The experiments are an attempt to widen the evidence used for relevance feedback from simply the relevant documents to include information on how users are searching, and they show that this information can lead to more successful relevance feedback techniques. It is also shown that the presentation of relevance feedback to the user is important to its success. The user characteristics considered include:

1. Domain expertise

2. Search experience

3. Cognitive style

4. Goal type

5. Mode of seeking

6. Situational idiosyncrasies

Local page calculation

The local page is built from the web using a set of search links. Within this local page we can handle ordinary searches and individual words; repeated words and simple definitions are counted here. Using this local page, we can also search the web as part of the word-calculation process, and we can browse some links on the web from it. Some local browser history can also be found out, so we know which user uses which browser.

Increasing page optimization

One of the most important ways to increase link popularity is to gain quality one-way links and achieve high performance. Once we stop our link-building campaign, our link popularity will drop comparatively. When our site ranks high on the search engine results page (SERP), it gets more exposure and lots of targeted traffic for free. For any website to appear high in search engine rankings, it is required to have a decent track record of both on-page and off-page website optimization in the eyes of the search engines. Off-page optimization is done by getting inbound links, or simply backlinks. The higher the quality and relevancy of those inbound links, the better it is for higher ranking in the search engines.

CHAPTER – III

REQUIREMENT ANALYSIS DESIGN

HARDWARE REQUIREMENT

Processor : Dual Core

RAM : 1GB

Hard Disk : 320GB

Clock Speed : 2 GHZ

Display Type : 17 LED Monitor

Mouse : Any Mouse

Keyboard : Any Keyboard with 104 keys

SOFTWARE REQUIREMENTS:

Operating System : Microsoft Windows XP Professional

Front End : ASP.NET with C#

Back End : SQL Server 2000

Code Behind : VB.Net

Web Browser : IE 7.0

Web Server : IIS 5.0

CHAPTER – IV

SYSTEM STUDY

EXISTING SYSTEM

• Normal search gives both related and unrelated links.

• The semantics of the user query, the intention of the user, and the conceptual link between the search query and the relevant pages are not considered.

• Problems with server-side data collection.

• Recommendation for a new user is difficult without search history.

• The results often frustrate users and consume their precious time.

PROPOSED SYSTEM

The aim of the proposed system is to perform personalized search by recording a user profile from the user's browsing pattern and to retrieve more relevant and contextually related documents. The user's search history is represented by a search matrix. The proposed system also focuses on effective categorization of the user's browsing history. The dimension of the search matrix increases over a period of time. To search effectively for relevant pages from the search matrix, Latent Semantic Indexing is used to derive a smaller subset of the original search matrix. The reduced search matrix is then clustered using the Particle Swarm Optimization (PSO) algorithm to identify semantically related searches. Such semantically related searches are then analyzed for recommending pages to the user.
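To make the clustering step more concrete, the following is a minimal sketch of the standard PSO velocity and position update that such clustering builds on. It is illustrative only: the inertia weight, acceleration constants and the two-dimensional particle are assumed values, not parameters taken from this project.

// One PSO iteration for a single particle: the particle moves under the
// influence of its own best position (pbest) and the swarm's best (gbest).
// v = w*v + c1*r1*(pbest - x) + c2*r2*(gbest - x);  x = x + v
using System;

class PsoSketch
{
    static void Main()
    {
        var rnd = new Random(42);
        double w = 0.7, c1 = 1.5, c2 = 1.5;   // assumed PSO constants
        double[] x = { 0.2, 0.8 };            // current particle position
        double[] v = { 0.0, 0.0 };            // current velocity
        double[] pbest = { 0.3, 0.6 };        // particle's own best so far
        double[] gbest = { 0.5, 0.5 };        // best position in the swarm

        for (int d = 0; d < x.Length; d++)
        {
            double r1 = rnd.NextDouble(), r2 = rnd.NextDouble();
            v[d] = w * v[d] + c1 * r1 * (pbest[d] - x[d]) + c2 * r2 * (gbest[d] - x[d]);
            x[d] += v[d];
        }

        Console.WriteLine("new position: {0:F3}, {1:F3}", x[0], x[1]);
    }
}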

DATA FLOW DIAGRAM

[Figure: data flow diagram. The user's query goes through the search interface to an existing search engine and the WWW. The user's search behaviour is tracked and, after preprocessing, a search matrix is constructed and reduced with LSI (Phase I). In Phase II the user data and the search matrix are preprocessed (parser, POS tagging, noun extraction, stemming, TF*IDF weighting), clustered with PSO to obtain semantically related user searches, and used for page recommendation.]
Proposed work for Phase I

[Figure: Phase I pipeline — Track user's search behaviour → Data collection → Preprocessing → Search matrix → Dimensionality reduction.]

Proposed work for Phase II

[Figure: Phase II pipeline — Clustering → Page recommendation → Evaluation.]
DATABASE DESIGN

TABLE NAME: QUERY

FIELD NAME     DATATYPE    KEY
USER QUERY     VARCHAR     NO PRIMARY KEY

TABLE NAME: BOOK MASTER

FIELD NAME     DATATYPE    KEY
S.NO           NUM         PRIMARY KEY DEFINED
QUERY          VARCHAR     NO PRIMARY KEY
URL            VARCHAR     NO PRIMARY KEY
HISTORY        VARCHAR     NO PRIMARY KEY
DATE & TIME    DATE        NO PRIMARY KEY

CHAPTER – V

IMPLEMENTATION AND TESTING

IMPLEMENTATION

After finishing the development of any computer-based system, the next complicated and time-consuming process is system testing. Only during testing can the development company know how far the user requirements have been met. The following are some of the testing methods applied to this project:

SOURCE CODE TESTING

This examines the logic of the system. If we are getting the output

that is required by the user, then we can say that the logic is perfect.

SPECIFICATION TESTING
Here we specify what the program should do and how it should perform under various conditions. This testing is a comparative study of the evaluation of system performance against the system requirements.

MODULE LEVEL TESTING

In this, errors are found at each individual module; it encourages the programmer to find and rectify errors without affecting the other modules.

UNIT TESTING

Unit testing focuses the verification effort on the smallest unit of software, the module. The local data structure is examined to ensure that the data stored temporarily maintains its integrity during all steps in the algorithm's execution. Boundary conditions are tested to ensure that the module operates properly at boundaries established to limit or restrict processing.

INTEGRATION TESTING

Data can be tested across an interface. One module can have an inadvertent, adverse effect on another. Integration testing is a systematic technique for constructing a program structure while conducting tests to uncover errors associated with interfacing.

VALIDATION TESTING

It begins after the integrated software has been successfully assembled.

Validation succeeds when the software functions in a manner that can

be reasonably accepted by the client. In this the majority of the

validation is done during the data entry operation where there is a

maximum possibility of entering wrong data. Other validation will be

performed in all process where correct details and data should be

entered to get the required results.

PERFORMANCE TESTING

Performance Testing is used to test runtime performance of

software within the context of an integrated system. Performance tests are often coupled with stress testing and usually require both hardware and software instrumentation.

OUTPUT TESTING

After performing the validation testing, the next step is output

testing of the proposed system since no system would be termed as

useful until it does produce the required output in the specified format.

Output format is considered in two ways, the screen format and the

printer format.

USER ACCEPTANCE TESTING

User Acceptance Testing is the key factor for the success of any

system. The system under consideration is tested for user acceptance

by constantly keeping in touch with prospective system users at the

time of developing and making changes whenever required.

SYSTEM DESIGN

System Design transforms a logical representation of a given

system into the physical specification. The specifications are converted

into a physical reality during development. The design forms a blueprint

of the system and how the components relate to each other. The design

of the system reflects the strength of the software. The better the design, the better the quality, efficiency and reliability of the software.

System design goes through two phases of development.

ELEMENTS OF DESIGN

The elements to be designed are as follows.

Data Flows:

Movements of data into, around, out of the system.

Data Stores:

Temporary or permanent collection of data.

Processes:

Activities to accept, manipulate and deliver data and information.

Controls:

Standards and guidelines for determining whether activities are occurring in the anticipated or accepted manner, that is, under control.

Roles:

The responsibilities of all persons involved with the new system.

CHAPTER – VI

SOFTWARE DESCRIPTION

INTRODUCTION TO .NET FRAMEWORK

.NET is a Microsoft operating system platform that incorporates applications, a suite of tools and services, and a change in the infrastructure of the company's Web strategy. There are four main principles of .NET from the perspective of the user:

• It erases the boundaries between applications and the Internet. Instead of interacting with an application or a single Web site, .NET will connect the user to an array of computers and services that will exchange and combine objects and data.

• Software will be rented as a hosted service over the Internet instead of purchased on a store shelf. Essentially, the Internet will be housing all the applications and data.

• Users will have access to their information on the Internet from any device, anytime, anywhere.

• There will be new ways to interact with application data, such as speech and handwriting recognition.

.NET depends on four Internet standards:

• HTTP
• XML
• SOAP
• UDDI

Microsoft views this new technology as revolutionary, enabling

Internet users to do things that were never before possible, such as

integrate fax, e-mail and phone services, centralize data storage and

synchronize all of a user's computing devices to be automatically

updated.

The presence of many off-the-shelf libraries in .NET Framework

can assist us in developing our applications in a faster, cheaper and

easier manner. The most recent .Net Framework version is capable of

supporting over 20 different programming languages today.

The functionality of .Net Framework supporting many

programming languages is due to the use of the powerful CLR, the

Common Language Runtime engine. The application source code is first compiled into Microsoft Intermediate Language (MSIL) code instead of native code, and the MSIL, which is nothing but an instruction set, is in turn compiled by the CLR into native code for running the application.

The main advantage of the language- and platform-independent nature of the .NET Framework can be attributed to the CLR, and the same CLR also takes care of run-time services such as memory management, security enforcement, language integration, and thread management. Hence,

you can make use of the various infrastructures that have been provided

in .NET Framework for creating your web-applications.

As per Microsoft's classification, there are two main parts of the .NET Framework: the CLR and the .NET Framework class library.

Common Language Runtime: The CLR is responsible for providing

a common runtime environment or services with which all .NET

applications can run. Further, the various capabilities of CLR can enable

any developer to write even big applications with ease, using features such as strong type naming, life-cycle management, dynamic binding that can turn any business logic into a re-usable component, and cross-language exception handling.

.NET Framework class library: This class library constitutes various predefined functional sets that are very useful to developers while developing applications. There are three main components in this class library:

• ASP.NET
• Windows Forms
• ADO.NET

With the .NET Framework you can write your code in fewer lines, and other favourable features such as easy web settings, easy deployment of applications, easy compilation procedures and easy Web configuration make the .NET Framework a great platform to work with. Overall, developers are able to concentrate more on Web controls, spend their time efficiently on application design and implementation, and keep effective control over the flow of the application sequence.

Another great feature that any developer can take note of is that the .NET Framework takes into account all the Web controls, server-side blocks of code and Web forms, and gets them compiled whenever a page compilation is called for.

Once the components of .NET framework are compiled in the

machine, the compiled version can easily be uploaded with all the

relevant pages in the /bin directory of the system. The process of

uploading is very easy when compared to the complicated process of

web-application in ASP, wherein you have to first upload the application

pages with the relevant components and you also need to register them

with the operating system.

In .NET Framework the simple uploading in /bin directory of the

operating system is enough and you need not carry out the complicated

process of registering the components of web-application with the

operating system.

With the help of an XML-based web.config file you can carry out the web settings, which is nothing but configuring the .NET applications for successful running. The web.config file can be modified through a program, and when any such modification is made, the system recognises the change and registers it immediately, which makes configuration of .NET applications easy and quick.

Caching is a process or a method with which the most commonly

and frequently used resources and data will get loaded onto the memory

for easy and fast access. There are three types of caching in .Net

Framework and they are output caching, data caching, and fragment

caching.
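As a small illustration of data caching, the sketch below stores a computed value in the ASP.NET cache with an absolute expiration; the cache key, the placeholder work and the ten-minute lifetime are assumed values for the example, not settings from this project.

// Data caching sketch: keep frequently used data in the ASP.NET cache.
// Assumes this runs inside an ASP.NET application (System.Web is available).
using System;
using System.Web;
using System.Web.Caching;

public static class SearchResultCache
{
    public static string GetCachedResult(string query)
    {
        Cache cache = HttpRuntime.Cache;
        string key = "result:" + query;            // hypothetical cache key

        string cached = cache[key] as string;
        if (cached == null)
        {
            cached = "results for " + query;       // placeholder for the real work
            // Absolute expiration of ten minutes, no sliding expiration.
            cache.Insert(key, cached, null,
                DateTime.Now.AddMinutes(10), Cache.NoSlidingExpiration);
        }
        return cached;
    }
}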

About Securing Web Services in .Net Framework

Web Services is the promising technology that allows enterprises

to share and integrate applications across different platforms. Since

anybody can consume web services from anywhere and from any

platform, this makes it prone to security threats. By security threats, we

mean that no unauthorized user should access, modify, or damage the

information.

Web Services are mostly used in distributed environments and

their data, code, and description are widely moved across different

security domains. If a web service passes to another domain, it should carry the same security restrictions provided by the sender.

Simple Object Access Protocol (SOAP) is the communication level

protocol that has security extensions. These extensions have been

defined by W3C XML Encryption Working Group. They have defined a

standard, XML Encryption, which is extensively used to encrypt and

decrypt the messages.

Though SOAP is the default protocol for web services, the .NET Framework has built-in options that allow you to expose or consume web services. The .NET Framework provides three classes: Uri, WebRequest, and WebResponse. The Uri class holds the Uniform Resource Identifier (URI) through which you can call the required web services. The WebRequest class encapsulates a request to access web services from a network resource. The WebResponse class acts as a warehouse for all the incoming responses from the network resource.
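A minimal sketch of these three classes working together is shown below; the endpoint URL is a placeholder, not an actual service from this project.

// Consuming a web resource with Uri, WebRequest and WebResponse.
// The URL below is a hypothetical placeholder.
using System;
using System.IO;
using System.Net;

class WebRequestSketch
{
    static void Main()
    {
        Uri serviceUri = new Uri("http://localhost/SearchService.asmx/GetPages?query=lsi");

        WebRequest request = WebRequest.Create(serviceUri);
        using (WebResponse response = request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            // The raw response body (XML for an ASMX call) is read as text.
            Console.WriteLine(reader.ReadToEnd());
        }
    }
}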

Apart from the above security standards, W3C with the

cooperation of IBM, Microsoft and VeriSign has developed a common

standard called WS-Security. WS-Security is almost similar to the XML

Encryption method. The only difference is that WS-Security also allows

the sender of web services to sign through XML Digital Signature. Apart

from XML Encryption and XML Digital Signature methods, W3C is also

planning to launch technologies like XML Key Management Specification

and Security Assertion Markup Language (SAML).

Understanding ASP.NET Web Server Controls

ASP.NET Web Server Controls are controls that run at the web

server. All ASP.NET Web Server Controls can be identified by their

attribute ‘runat=”server”’. ASP.NET Web Server Controls are similar to

HTML controls. The only difference is that HTML controls run at the

client-side and the developers have to write the code for each type of

browsers. The ASP.NET Web Server Controls run at the server-side and

automatically adapt to the type of browser that requests them.

ASP.NET Web Server Controls also encapsulate and generate large amounts of HTML tags, freeing developers' time to concentrate on coding. ASP.NET Web Server Controls make exhaustive use of View State management, which keeps data and values consistent across ASP.NET pages.

All web controls are obtained from a common base class. This

ensures that the object model remains consistent across various

controls. For example, in order to move the cursor consistently across a

form you can set the Web control's TabIndex property. This is very difficult while using normal HTML. You can also disable a particular web control through its Enabled property. This process is also difficult in

HTML and ASP.
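As an illustration of these properties, the code-behind sketch below sets TabIndex and Enabled on a couple of hypothetical server controls; txtQuery and btnSearch are assumed names declared in the page markup, not controls from this project.

// Code-behind sketch for an ASP.NET Web Form. The controls are assumed to be
// declared in the .aspx markup with runat="server":
//   <asp:TextBox ID="txtQuery" runat="server" />
//   <asp:Button  ID="btnSearch" runat="server" Text="Search" />
using System;
using System.Web.UI;
using System.Web.UI.WebControls;

public class SearchPage : Page
{
    protected TextBox txtQuery;
    protected Button btnSearch;

    protected void Page_Load(object sender, EventArgs e)
    {
        txtQuery.TabIndex = 1;          // cursor lands here first
        btnSearch.TabIndex = 2;

        // Disable the button until the user has typed something.
        btnSearch.Enabled = txtQuery.Text.Length > 0;
    }
}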

Web Server - Meaning

A Web server is a software program which serves web pages to

web users (browsers).

A web server delivers requested web pages to users who enter the

URL in a web browser. Every computer on the Internet that

contains a web site must have a web server program. The computer in which a

web server program runs is also usually called a "web server". So,

the term "web server" is used to represent both the server

program and the computer in which the server program runs.

Characteristics of web servers

In short, a 'web server' is a computer which is connected to the internet/intranet and runs software called a 'web server'. The web server program will always be running on the computer. When any user tries to access a website hosted by the web server, it is actually the web server program which delivers the web page the client asks for.

All web sites on the internet are hosted on web servers sitting in different parts of the world.

The URL can be broken into two parts:

1. The protocol we are going to use to connect to the server (http)
2. The server name (www.aspspider.com)

The browser breaks up the URL into these parts and then tries to communicate with the server by looking up the server name. Actually, the server is identified through an IP address, but the alias for the IP address is maintained in the DNS server, or naming server. The browser looks up these naming servers, identifies the IP address of the requested server, fetches the site and gets the HTML tags for the web page. Finally, it displays the HTML content in the browser.
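The same split can be seen programmatically with the Uri class; the URL below just reuses the example server name mentioned above with a hypothetical page path.

// Splitting a URL into its protocol and server-name parts with System.Uri.
using System;

class UrlPartsExample
{
    static void Main()
    {
        Uri url = new Uri("http://www.aspspider.com/index.aspx");

        Console.WriteLine(url.Scheme);       // "http" - the protocol part
        Console.WriteLine(url.Host);         // "www.aspspider.com" - the server name
        Console.WriteLine(url.AbsolutePath); // "/index.aspx" - the requested page
    }
}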

When the user tries to access a web site, he/she doesn't really need to know where the web server is located. The web server may be located in another city or country; all we need to do is type the URL of the web site we want to access into a web browser. The web browser will send this information to the internet and find the web server. Once the web server is located, the browser requests the specific web page from the web server program running on the server. The web server program processes the request and sends the resulting web page to the user's browser. It is the responsibility of the browser to format and display the web page to us.

Number of web servers needed for a web site

Typically, only one web server is required for a web site. But large web sites like Yahoo, Google, MSN etc. will have millions of visitors every minute. One computer cannot process such huge numbers of requests, so they have hundreds of servers deployed in different parts of the world so that they can provide a faster response.

Number of websites can be hosted in one server

A web server can host hundreds of web sites. Most of the small

web sites in the internet are hosted on shared web servers. There are

several web hosting companies who offer shared web hosting. If we buy

a shared web hosting from a web hosting company, they will host our

web site in their web server along with several other web sites for a fee.

Visual Studio .NET (VS.NET)

Visual Studio .NET allows us to easily create web pages. Some of the benefits of using Visual Studio .NET are:

• We can simply drag and drop HTML controls onto the web page and VS.NET will automatically write the HTML tags for us.

• Start typing an HTML tag and VS.NET will complete it. When we start typing a tag, VS.NET shows the HTML tags starting with the characters we typed, so it is not necessary to remember all the tags.

• If we type any HTML tag wrong, VS.NET will highlight the error and tell us how to correct it.

.NET (dot-net) is the name Microsoft gives to its general vision of

the future of computing, the view being of a world in which many

applications run in a distributed manner across the Internet. We can

identify a number of different motivations driving this vision.

Firstly, distributed computing is rather like object oriented

programming, in that it encourages specialized code to be collected in

one place, rather than copied redundantly in lots of places. There are

thus potential efficiency gains to be made in moving to the distributed

model.

Secondly, by collecting specialized code in one place and opening

up a generally accessible interface to it, different types of machines

(phones, handhelds, desktops, etc.) can all be supported with the same

code. Hence Microsoft's 'run-anywhere' aspiration.

Thirdly, by controlling real-time access to some of the distributed

nodes (especially those concerning authentication), companies like

Microsoft can control more easily the running of its applications. It

moves applications further into the area of 'services provided' rather

than 'objects owned'.

Interestingly, in taking on the .NET vision, Microsoft seems to have

given up some of its proprietary tendencies (whereby all the technology

it touched was warped towards its Windows operating system). Because

it sees its future as providing software services in distributed

applications, the .NET framework has been written so that applications

on other platforms will be able to access these services. For

example, .NET has been built upon open standard technologies like XML

and SOAP.

At the development end of the .NET vision is the .NET Framework.

This contains the Common Language Runtime, the .NET Framework

Classes, and higher-level features like ASP.NET (the next generation of

Active Server Pages technologies) and Win Forms (for developing

desktop applications).

The Common Language Runtime (CLR) manages the execution of

code compiled for the .NET platform. The CLR has two interesting

features. Firstly, its specification has been opened up so that it can be

ported to non-Windows platforms. Secondly, any number of different

languages can be used to manipulate the .NET framework classes, and

the CLR will support them. This has led one commentator to claim that

under .NET the language one uses is a 'lifestyle choice'.

Not all of the supported languages fit entirely neatly into the .NET

framework, however (in some cases the fit has been somewhat

Procrustean). But the one language that is guaranteed to fit in perfectly

is C#. This new language, a successor to C++, has been released in

conjunction with the .NET framework, and is likely to be the language of

choice for many developers working on .NET applications.

ADO .NET

Most applications need data access at some point, making it a crucial component when working with applications. Data access means making the application interact with a database, where all the data is stored. Different applications have different requirements for database access. VB.NET uses ADO.NET (ActiveX Data Objects) as its data access and manipulation protocol, which also enables us to work with data on the Internet.

Evolution of ADO.NET

The first data access model, DAO (Data Access Objects), was created for local databases with the built-in Jet engine, and it had performance and functionality issues. Next came RDO (Remote Data Objects) and ADO (ActiveX Data Objects), which were designed for client-server architectures, but ADO soon took over from RDO. ADO was a good architecture, but as languages change, so does the technology. With ADO, all the data is contained in a recordset object, which caused problems when working across the network and penetrating firewalls. ADO was a connected data access model, which means that when a connection to the database is established, the connection remains open until the application is closed. Leaving the connection open for the lifetime of the application raises concerns about database security and network traffic.

Also, as databases are becoming increasingly important and as

they are serving more people, a connected data access model makes us

think about its scalability. For example, an application with connected data access may do well when connected to two clients, but the same may do poorly when connected to 10 and might be unusable when connected to 100 or more. Also, open database connections consume system resources heavily, making system performance less effective.

Need for ADO.NET

To cope with some of the problems mentioned above,

ADO .NET came into existence. ADO .NET addresses the above

mentioned problems by maintaining a disconnected database access

model which means, when an application interacts with the database,

the connection is opened to serve the request of the application and is

closed as soon as the request is completed. Likewise, if a database is

updated, the connection is opened long enough to complete the update

operation and is closed. By keeping connections open for only a

minimum period of time, ADO .NET conserves system resources and

provides maximum security for databases and also has less impact on

system performance. Also, ADO .NET when interacting with the

database uses XML and converts all the data into XML format for

database related operations making them more efficient.

DataSet

The dataset is a disconnected, in-memory representation of data.

It can be considered as a local copy of the relevant portions of the

database. The DataSet is persisted in memory and the data in it can be

manipulated and updated independent of the database. When the use of

this DataSet is finished, changes can be made back to the central

database for updating. The data in DataSet can be loaded from any valid

data source like Microsoft SQL server database, an Oracle database or

from a Microsoft Access database.

The ADO.NET Data Architecture

Data Access in ADO.NET relies on two components: DataSet and Data

Provider.

Data Provider

The Data Provider is responsible for providing and maintaining the

connection to the database. A DataProvider is a set of related

components that work together to provide data in an efficient and

performance driven manner. The .NET Framework currently comes with

two DataProviders: the SQL Data Provider which is designed only to work

with Microsoft's SQL Server 7.0 or later and the OleDb DataProvider

which allows us to connect to other types of databases like Access and

Oracle. Each DataProvider consists of the following component classes.

• The Connection object, which provides a connection to the database.

• The Command object, which is used to execute a command.

• The DataReader object, which provides a forward-only, read-only, connected recordset.

• The DataAdapter object, which populates a disconnected DataSet with data and performs updates.

Data access with ADO.NET can be summarized as follows: a Connection object establishes the connection between the application and the database. The Command object provides direct execution of a command against the database. If the command returns more than a single value, the Command object returns a DataReader to provide the data. Alternatively, the DataAdapter can be used to fill a DataSet object. The database can be updated using the Command object or the DataAdapter.
The Connection Object

The Connection object creates the connection to the database.

Microsoft Visual Studio .NET provides two types of Connection classes:

the SqlConnection object, which is designed specifically to connect to

Microsoft SQL Server 7.0 or later, and the OleDbConnection object,

which can provide connections to a wide range of database types like

Microsoft Access and Oracle. The Connection object contains all of the

information required to open a connection to the database.

The Command Object

The Command object is represented by two corresponding classes:

SqlCommand and OleDbCommand. Command objects are used to

execute commands to a database across a data connection. The

Command objects can be used to execute stored procedures on the

database, SQL commands, or return complete tables directly. Command

objects provide three methods that are used to execute commands on

the database:

• ExecuteNonQuery: executes commands that have no return values, such as INSERT, UPDATE or DELETE.

• ExecuteScalar: returns a single value from a database query.

• ExecuteReader: returns a result set by way of a DataReader object.

The DataReader Object

The DataReader object provides a forward-only, read-only,

connected stream recordset from a database. Unlike other components

of the Data Provider, DataReader objects cannot be directly instantiated.

Rather, the DataReader is returned as the result of the Command

object's ExecuteReader method. The SqlCommand.ExecuteReader

method returns a SqlDataReader object, and the

OleDbCommand.ExecuteReader method returns an OleDbDataReader

object. The DataReader can provide rows of data directly to application

logic when you do not need to keep the data cached in memory.

Because only one row is in memory at a time, the DataReader provides

the lowest overhead in terms of system performance but requires the

exclusive use of an open Connection object for the lifetime of the

DataReader.

The DataAdapter Object

The DataAdapter is the class at the core of ADO .NET's

disconnected data access. It is essentially the middleman facilitating all

communication between the database and a DataSet. The DataAdapter

is used either to fill a DataTable or DataSet with data from the database

with it's Fill method. After the memory-resident data has been

manipulated, the DataAdapter can commit the changes to the database

by calling the Update method. The DataAdapter provides four properties

that represent database commands:

SelectCommand

InsertCommand

DeleteCommand

UpdateCommand

When the Update method is called, changes in the DataSet are

copied back to the database and the appropriate InsertCommand,

DeleteCommand, or UpdateCommand is executed.
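A short sketch of the fill-then-update pattern is given below; the connection string, the BOOKMASTER table name and the modified HISTORY value are assumptions made for illustration only.

// Disconnected access with SqlDataAdapter: Fill a DataSet, edit it in
// memory, then push the changes back with Update.
// Connection string and table/column names are illustrative assumptions.
using System.Data;
using System.Data.SqlClient;

class DataAdapterSketch
{
    static void Main()
    {
        string connectionString =
            "Data Source=localhost;Initial Catalog=SearchDb;Integrated Security=True";

        using (SqlConnection connection = new SqlConnection(connectionString))
        {
            var adapter = new SqlDataAdapter("SELECT * FROM BOOKMASTER", connection);

            // Let a command builder generate the Insert/Update/Delete commands
            // from the SelectCommand.
            var builder = new SqlCommandBuilder(adapter);

            var dataSet = new DataSet();
            adapter.Fill(dataSet, "BOOKMASTER");   // connection opened/closed by Fill

            // Work on the in-memory copy.
            DataTable table = dataSet.Tables["BOOKMASTER"];
            if (table.Rows.Count > 0)
            {
                table.Rows[0]["HISTORY"] = "visited";
            }

            adapter.Update(dataSet, "BOOKMASTER"); // changes copied back to the database
        }
    }
}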

SQL Server

SQL Server is one of the most popular and advanced database

systems currently available. SQL Server is provided by Microsoft.

Microsoft SQL Server is sometimes called "Sequel Server". It can be

managed using Structured Query Language.

While MS Access is meant for small applications, SQL Server

supports large applications with millions of users or huge databases.

SQL Server is much more powerful than Access and provides several

other advanced features and much better security. SQL Server is

compatible with MS Access. The user can easily import/export data

between these two.

SQL Server is a Relational database where data is stored and

retrieved very efficiently.

Uses

Access is used by small businesses, within departments of large

corporations, and hobby programmers to create ad hoc customized

desktop systems for handling the creation and manipulation of data.

Access can also be used as the database for basic web based

applications hosted on Microsoft's Internet Information Services and

utilizing Microsoft Active Server Pages ASP. More complex web

applications may require tools like PHP/MySQL or ASP/Microsoft SQL

Server.

Some professional application developers use Access for rapid

application development, especially for the creation of prototypes and

standalone applications that serve as tools for on-the-road salesmen.

Access does not scale well if data access is via a network, so

applications that are used by more than a handful of people tend to rely

on a Client-Server based solution such as Oracle, DB2, Microsoft SQL

Server, Windows SharePoint Services, PostgreSQL, MySQL, Alpha Five,

MaxDB, or FileMaker. However, an Access "front end" (the forms,

reports, queries and VB code) can be used against a host of database

backends, including JET (file-based database engine, used in Access by

default), Microsoft SQL Server, Oracle, and any other ODBC-compliant

product.

Many developers who use Access use the Leszynski naming convention, though this is not universal; it is a programming convention, not a DBMS-enforced rule.

Features

One of the benefits of Access from a programmer's perspective is its

relative compatibility with SQL (structured query language) —queries

may be viewed and edited as SQL statements, and SQL statements can

be used directly in Macros and VBA Modules to manipulate Access

tables. In this case, "relatively compatible" means that SQL for Access

contains many quirks, and as a result, it has been dubbed "Bill's SQL" by

industry insiders. Users may mix and use both VBA and "Macros" for

programming forms and logic, and this offers object-oriented possibilities.

MSDE (Microsoft SQL Server Desktop Engine) 2000, a mini-version of MS

SQL Server 2000, is included with the developer edition of Office XP and

may be used with Access as an alternative to the Jet Database

Engine. Unlike a complete RDBMS, the Jet Engine lacks database triggers

and stored procedures. Starting in MS Access 2000 (Jet 4.0), there is a

syntax that allows creating queries with parameters, in a way that looks

like creating stored procedures, but these procedures are limited to one

statement per procedure.[1] Microsoft Access does allow forms to

contain code that is triggered as changes are made to the underlying

table (as long as the modifications are done only with that form), and it

is common to use pass-through queries and other techniques in Access

to run stored procedures in RDBMSs that support these.

In ADP files (supported in MS Access 2000 and later), the

database-related features are entirely different, because this type of file

connects to a MSDE or Microsoft SQL Server, instead of using the Jet

Engine. Thus, it supports the creation of nearly all objects in the

underlying server (tables with constraints and triggers, views, stored

procedures and UDFs). However, only forms, reports, macros and

modules are stored in the ADP file (the other objects are stored in the

back-end database).

MSDE Facts

• Microsoft does not provide a graphical user interface (like SQL Enterprise Manager) to manage the databases. However, there are several third-party management tools available, or you can use a console-based interface provided by Microsoft.

• Performance throttling prevents too many users accessing the database at the same time (Microsoft does not want you to use MSDE for high-traffic applications; instead, they want you to buy SQL Server).

• MSDE is based on the SQL Server engine, but is a scaled-down version of SQL Server.

• Migrating from MSDE to SQL Server is very easy. All you have to do is uninstall MSDE and install SQL Server. Your databases and code will work without any change.

• MSDE is not for sale as a separate product. It is available for royalty-free redistribution by vendors under certain MSDE licensing conditions.

• MSDE allows a maximum database size of 2 GB.

• MSDE can use a maximum of 2 GB RAM.

• Supports only 2 CPUs on the server.

• Features like full-text search, profiler, import/export wizards, index tuning wizard etc. are not available.

CHAPTER – VII

CONCLUSION

The aim of this project was to support the user in getting search results efficiently and in less time. Whilst computers and associated technologies have had a beneficial impact on daily life, on flexibility and efficiency in the workplace, and on people's ability to easily access services and information from their home computers, increased dependence on computers in society may be endangering workers' health and safety and negatively affecting personal privacy. Nowadays the Internet plays a vital role in everyone's life, directly or indirectly. The major part of the project is to categorize user behaviour, which is represented using a search matrix. The search matrix differs for each user based on the search queries and on personalized weights, and it supports the user in finding similar search results available on the web. It is this web-based technology that I used to implement the categorization of user browsing behaviour.

BIBLIOGRAPHY

SUBJECT    BOOK NAME                                                         AUTHOR
DOT NET    .NET - A COMPLETE DEVELOPMENT CYCLE                               ADDISON-WESLEY
ASP        ASP.NET FOR ASP PROGRAMMERS                                       BUDI KURNIAWAN
SQL        DELIVERING BUSINESS INTELLIGENCE WITH MICROSOFT SQL SERVER 2008   BRIAN LARSON
HTML       DYNAMIC HTML IN ACTION                                            MICHELE PETROVSKY
