Topic 7 Data Collection Approaches - Summarized
At a high level, a tag (a small snippet of JavaScript) is placed on each page of your
website, and it sends data to your web analytics tool. The tool collects and processes the
data, assigning it to a specific reporting container for each of your digital properties.
People can then analyze a set of reports for a specific digital property.
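As a rough illustration of what a tag sends, the sketch below builds the kind of request URL a page tag might issue for one page view. The endpoint `analytics.example.com/collect` and the parameter names (`pid`, `cid`, `url`, `title`) are assumptions made up for this example; each analytics tool defines its own.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical collection endpoint; real tools each define their own.
COLLECT_ENDPOINT = "https://analytics.example.com/collect"


def build_beacon_url(property_id: str, client_id: str,
                     page_url: str, page_title: str) -> str:
    """Return the URL a page tag might request to report one page view."""
    params = {
        "pid": property_id,  # which reporting container the hit belongs to
        "cid": client_id,    # visitor identifier read from a cookie
        "url": page_url,     # the page being viewed
        "title": page_title,
    }
    # urlencode escapes each value so it is safe inside a query string
    return f"{COLLECT_ENDPOINT}?{urlencode(params)}"


beacon = build_beacon_url("PROP-1", "abc123",
                          "https://news.com/sports?id=7", "Sports")
print(beacon)
```

The property ID is what lets the tool route each hit to the right reporting container.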
Cookie Overview
JavaScript is a major component of the page tagging approach. Another essential
component of page tagging is the HTTP cookie. A cookie is simply a small piece of text
(data, not a program) that is stored by the web browser on a visitor’s computer or device.
Cookies were originally designed to keep track of unique visitors across a single session,
and were later extended to track visitors across multiple visits.
Cookies are important in the everyday online experience. They help sites remember your
preferences between visits, verify that you’re still logged in to the site, and track the
products you’ve added to your shopping cart.
In web analytics, cookies are used to calculate metrics such as visits, unique visitors, and
new/returning visitors. They also associate different attributes and behaviors with unique
visitors for segmentation purposes.
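To make the link between cookies and metrics concrete, here is a minimal sketch that derives visits, unique visitors, and new/returning visitors from a list of hits. The `Hit` record and its fields are hypothetical, and the new/returning rule (a visitor is new if the persistent cookie was created within the reporting period) is a simplifying assumption; real tools differ in the details.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    visitor_id: str       # value of the persistent cookie
    session_id: str       # value of the session cookie
    is_first_visit: bool  # True if the persistent cookie was created in this session


def summarize(hits):
    """Count visits and visitors from cookie values attached to each hit."""
    visits = {(h.visitor_id, h.session_id) for h in hits}
    visitors = {h.visitor_id for h in hits}
    new_visitors = {h.visitor_id for h in hits if h.is_first_visit}
    return {
        "visits": len(visits),
        "unique_visitors": len(visitors),
        "new_visitors": len(new_visitors),
        "returning_visitors": len(visitors - new_visitors),
    }


hits = [
    Hit("v1", "s1", True),   # v1 arrives for the first time
    Hit("v1", "s1", True),   # second page view in the same visit
    Hit("v1", "s2", False),  # v1 comes back in a later session
    Hit("v2", "s3", False),  # v2 already had a cookie from an earlier period
]
summary = summarize(hits)
print(summary)
```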
Types of cookies
In terms of web analytics, you need to be aware of two major types of cookies.
• A session cookie lasts only the duration of a single session. The cookie file is stored
in temporary memory by the web browser and then deleted when the browser
closes. Web analytics tools rely on session cookies to monitor visit-related
behaviors.
• A persistent cookie persists until it reaches a specified expiration date, and then
the web browser deletes it. Persistent cookies are saved on a user’s computer or
device. Web analytics tools need persistent cookies to track the multi-session
behaviors of unique visitors.
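The difference between the two types comes down to whether the cookie carries an expiry attribute. The sketch below uses Python's standard `http.cookies` module to build both kinds; the cookie names `session_id` and `visitor_id` and the one-year lifetime are assumptions chosen for illustration.

```python
from http.cookies import SimpleCookie


def make_cookies(session_id: str, visitor_id: str,
                 lifetime_days: int = 365) -> SimpleCookie:
    """Build one session cookie and one persistent cookie."""
    jar = SimpleCookie()

    # Session cookie: no Expires or Max-Age attribute, so the browser
    # discards it when the browser closes.
    jar["session_id"] = session_id
    jar["session_id"]["path"] = "/"

    # Persistent cookie: Max-Age (in seconds) tells the browser how long
    # to keep it across sessions.
    jar["visitor_id"] = visitor_id
    jar["visitor_id"]["path"] = "/"
    jar["visitor_id"]["max-age"] = lifetime_days * 24 * 60 * 60
    return jar


jar = make_cookies("s-42", "v-1001")
for morsel in jar.values():
    # Each line is the value a server would send in a Set-Cookie header
    print(morsel.OutputString())
```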
Because a first-party cookie can be read only on the domain that set it, web analytics
vendors have developed two approaches for handling cross-domain tracking. Some vendors
use the link approach to pass the session and persistent cookie information from one
domain (news.com) to the next domain (blogs.com) by appending the client ID to the links
via a query string. When the visitor lands on the next domain (blogs.com), JavaScript code
writes the information to both of blogs.com’s first-party session and persistent cookies,
preserving the visit and visitor information across the two domains. The same process is
repeated if the visitor goes on to yet another root domain. To work properly, this approach
depends on every cross-domain link passing the necessary cookie information on to the
next domain’s first-party cookies.
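The link approach described above can be sketched as two small helpers: one that decorates an outbound link with the IDs, and one that reads them back on the landing page. The parameter names `_cid` and `_sid` are hypothetical; each vendor uses its own naming and usually its own encoding.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical query-string parameter names for this sketch.
CLIENT_ID_PARAM = "_cid"
SESSION_ID_PARAM = "_sid"


def decorate_link(href: str, client_id: str, session_id: str) -> str:
    """Append the visitor and session IDs to a cross-domain link."""
    parts = urlsplit(href)
    # Keep the link's existing parameters, dropping any stale copies of ours
    query = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k not in (CLIENT_ID_PARAM, SESSION_ID_PARAM)
    ]
    query += [(CLIENT_ID_PARAM, client_id), (SESSION_ID_PARAM, session_id)]
    return urlunsplit(parts._replace(query=urlencode(query)))


def read_ids(landing_url: str):
    """On the destination domain, recover the IDs from the landing URL."""
    params = dict(parse_qsl(urlsplit(landing_url).query))
    return params.get(CLIENT_ID_PARAM), params.get(SESSION_ID_PARAM)


link = decorate_link("https://blogs.com/post?id=7#top", "v-1001", "s-42")
print(link)
print(read_ids(link))
```

On the destination domain, the recovered values would then be written into that domain's own first-party cookies. A link that is not decorated returns `(None, None)`, which is why the approach breaks if any cross-domain link is missed.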
Alternatively, to track visits and visitors across multiple domains, a web analytics tool
may use the friendly third-party approach in addition to setting a first-party cookie.
2) Packet Sniffing
Technically, packet sniffing is one of the most sophisticated ways of collecting web
data. It has been around for quite some time, but for a number of reasons it is not
as popular as the other options outlined in this chapter. Among the vendors that
provide packet-sniffing web analytics solutions is Clickstream Technologies. Some
interesting ways of leveraging packet sniffers are also emerging. For example,
SiteSpect is using the technology for multivariate testing, eliminating the reliance on
tagging your website to do testing.
Data collection takes five steps:
1. The customer types your URL in a browser.
2. The request is routed to the web server, but before it gets there, it passes
through a software- or hardware-based packet sniffer that collects attributes of
the request.
3. The packet sniffer sends the request on to the web server.
4. The web server sends the page back toward the customer, but it first passes
through the packet sniffer. The packet sniffer captures information about the page
going back and stores that data. Some vendor packet-sniffing solutions append a
JavaScript tag that can send more data about the visitor back to the packet sniffer.
5. The packet sniffer sends the page on to the visitor’s browser.
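The five steps above can be sketched as a pass-through component that sits between the browser and the web server. This is an in-memory simulation only, not real network capture; the `Request`, `Response`, `PacketSniffer`, and `web_server` names are all hypothetical and exist just to show the order of events.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    url: str
    headers: dict


@dataclass
class Response:
    status: int
    body: str


def web_server(request: Request) -> Response:
    """Stand-in for the real web server."""
    return Response(200, f"<html><body>Page for {request.url}</body></html>")


@dataclass
class PacketSniffer:
    log: list = field(default_factory=list)

    def handle(self, request: Request, server) -> Response:
        # Step 2: collect attributes of the request on its way in
        record = {
            "url": request.url,
            "user_agent": request.headers.get("User-Agent"),
        }
        # Step 3: send the request on to the web server
        response = server(request)
        # Step 4: capture information about the page going back and store it
        record["status"] = response.status
        record["bytes"] = len(response.body)
        self.log.append(record)
        # Step 5: send the page on to the visitor's browser
        return response


# Step 1: the customer requests a URL
sniffer = PacketSniffer()
resp = sniffer.handle(Request("/home", {"User-Agent": "TestBrowser"}), web_server)
print(sniffer.log)
```

Note that the pages themselves need no tag in this flow, since the data is captured from the traffic in transit.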
A packet sniffer can be a layer of software that is installed on the web servers and runs
“on top” of the web server data layer. Alternatively, it can be a physical piece of hardware
that is hooked up in your data center, and all traffic is then routed to your web server via
the packet sniffer solution.