Report
Submitted by:
Avantika Singh (2000970130043)
Hardik Srivastava (2000970130051)
Harsha Verma (2000970130054)
ACKNOWLEDGEMENT
We give special thanks to our Mini Project coordinator, Dr. Javed Miya, for his timely
advice and valuable guidance during the design and implementation of this project.
We also express our sincere thanks and gratitude to Dr. Sanjeev Kumar Singh, Head of
Department (HOD), and the Information Technology Department for providing us with the
facilities and for all their encouragement and support.
Finally, we express our sincere thanks to all staff members of the Information
Technology Department for their support and cooperation.
Abstract
1. Introduction
2. Literature Survey
3. Problem Statement
4. Methodology
5. Code
6. Result
8. References
List of Figures
4. Methodology
Fig 4.1
6. Results
Fig 6.1
Fig 6.2
Fig 6.3
Fig 6.4
ABSTRACT
The large and growing industry of price comparison websites (PCWs) or ‘web aggregators’ is
poised to benefit consumers by increasing competitive pricing pressure on firms by acquainting
shoppers with more prices. However, these sites also charge firms for sales, which feeds back to
raise prices. I investigate the impact of introducing PCWs to a market for a homogeneous good. I
find that introducing a single PCW increases prices for all consumers, both those who use the
sites, and those who do not. Under competing PCWs, prices tend to rise with the number of
PCWs. I also conduct various extensions and use the analysis to discuss relevant industry
practices and policies. Price comparison websites make it easy for online buyers to compare
products and services. Well-established comparison sites already exist in the marketplace, but a
niche market still leaves room to offer something unique, an idea that is worth exploring. If the idea is
monetized well, our price comparison site could generate significant revenue.
1. INTRODUCTION
Every shopper looks for the best deals & discounts before buying any product. Nowadays before
purchasing anything the buyers do some online research of the products on the internet. One of
the major factors which lead to purchasing of any product is cost or pricing. The buyers tend to
compare prices before purchasing any product.
But since it is very difficult to visit each & every website for price comparison, there needs to be
a solution to automate this process. The Price comparison website project proposed here gathers
information on product prices from various websites & presents it to the users. The users can
then choose to buy from the best options available. Even Ecommerce traders can use this price
comparison website to study their competitors and form new strategies accordingly to attract new
customers & stay ahead of their competitors.
This price comparison website helps users compare product prices across various
ecommerce websites. It is especially helpful for frequent online
shoppers, who can check prices on different online stores in one place.
The system shows product prices from different retailers so that users can see where to buy
the product at an affordable price. The HTML classes of two static websites are analyzed to get
the pricing details. To do so, the system visits each website based on the user’s search and
downloads the HTML search page of that website. Once prices from both websites are
retrieved, they are displayed on our website in the form of a price comparison.
Web scraping is the practice of extracting data from the internet, and it
has a wide range of applications. One of them is comparing prices across different websites.
Online shopping is booming, and comparing the prices of
products has become a necessity. We all visit multiple websites when we need to purchase a
particular product, but have you ever thought of making a price comparison tool that does the
same job for you and places the best deal in front of you?
In this project, we build a price comparison tool in Python that lets you
track the price of products across different sources and informs you about the performance of
different competitors in the market. Furthermore, it lets a business know whether
the price of a specific product rises above or falls below the predicted price.
Web scraping is the process of collecting structured web data in an automated fashion. It’s also
called web data extraction.
To simplify: suppose you want to visit a website, fetch some information, and save
it locally on your computer. How would you do it? Manually, right?
We automate the entire process. The question is how. The key observation is that every
webpage we see on the internet is an HTML document,
and any HTML document can be processed using its tags and their attributes. So we
just need a set of rules (a collection of functions) to which we feed this HTML; that
set of rules uses the HTML tags and their attributes to
return the information we want. The good news is that this collection of
functions already exists; we just need to know how to use it. Such a collection of
functions is called a package or library.
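The idea of feeding HTML to a set of rules that work on tags and attributes can be sketched with Python's built-in html.parser module. This is only an illustration under stated assumptions: the sample markup, the class names "title" and "price", and the helper extract_by_class are hypothetical and not taken from any real store.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; they must not change the nesting depth.
VOID_TAGS = {"br", "img", "input", "hr", "meta", "link"}


class ClassTextExtractor(HTMLParser):
    """Collect the text of every element that carries a given class."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.depth = 0        # > 0 while the parser is inside a matching element
        self.results = []     # one string per matching element
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth > 0:
            self.depth += 1   # nested tag inside a match
        elif self.wanted_class in classes:
            self.depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self.depth == 0:
            return
        self.depth -= 1
        if self.depth == 0:
            self.results.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self.depth > 0:
            self._buffer.append(data)


def extract_by_class(html, wanted_class):
    """Hypothetical helper: return the text of all elements with wanted_class."""
    parser = ClassTextExtractor(wanted_class)
    parser.feed(html)
    parser.close()
    return parser.results


# Hypothetical markup standing in for one product card on a search page.
SAMPLE_HTML = (
    '<div class="item"><span class="title">Phone X</span>'
    '<span class="price big">Rs. <b>49,999</b></span></div>'
)
print(extract_by_class(SAMPLE_HTML, "price"))
```

The parser only reacts to the tag and attribute information it is given, which is why the same class can be reused for a different site by passing a different class name.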
2. LITERATURE SURVEY
➢ In 2013, Hoffman, Novak, & Chatterjee created the first successful applications of the
Internet in the online shopping environment. These models rationalize price dispersion in
homogeneous goods markets. Indeed, price dispersion has persisted despite advances
in technology such as the internet and comparison sites.
➢ Haubl & Trifts in 2015 created shopbots, which would drastically affect all aspects of the
consumers' buying process. A recent study by Gorodnichenko et al. (2015) finds
substantial cross-seller variation in prices, and voices support for clearing-house models
that categorize consumers into loyal and shopping consumers. The equilibria in my model
feature price dispersion regardless of whether there is an aggregator (or ‘clearing-house’).
Without a PCW, this is because some price comparison is undertaken by consumers.
Producing price dispersion with a PCW that employs the pricing mechanism seen in
practice is a challenge. The shift in the aggregator industry away from charging one-off
fixed fees, toward pay-per-sale fees to firms, is rationalized as profit-maximizing PCW
behavior by Baye et al. (2011).
➢ In 2003, Iyer & Pazgal and in 2004, Pan, Ratchford, & Shankar created shopping agents,
which are more widespread than ever before. These articles model platforms where buyers
and sellers meet to trade, focusing on platform pricing and the effect of network
externalities with differentiated products and platforms. These models do not explicitly
model seller-side competition, which is central to my setting.
➢ They become a significant element for online shoppers by providing access to price and
product information (Shin and Park, 2011, Su, 2011).
➢ Price comparison sites reduce buyers' search costs and help their decision-making by
providing price comparison information, which is seldom present in the physical retail
shopping context (Brynjolfsson & Smith, 2000).
➢ Although evidence suggests that online consumers become increasingly sensitive to price
after using price comparison sites (Cho & Song, 2002), what happens in terms of their
internal price and value perceptions remains unexplored in relation with the adoption of a
price comparison site.
3. PROBLEM STATEMENT
➢ To find the best deals on any product for the customer using web scraping with PHP:
• While purchasing any product, one researches various websites for the
best deal. To avoid this hassle, we came up with this idea.
• The Price comparison website project proposed here gathers information on product
prices from various websites & presents it to the users.
• The users can then choose to buy from the best options available.
➢ To study competitors of Ecommerce websites:
• Even Ecommerce traders can use this price comparison website to study their competitors
and form new strategies accordingly to attract new customers & stay ahead of their
competitors.
• This will help them to stay in trend with the marketplace.
4. METHODOLOGY
This price comparison website helps users compare product prices across various
ecommerce websites. It is especially helpful for frequent online
shoppers, who can check prices on different online stores in one place. The system shows
product prices from different retailers so that users can see where to buy the product at an affordable price.
The HTML classes of two static websites are analyzed to get the pricing details:
the system visits each website based on the user’s search and downloads the HTML search page of that
website. Once prices from both websites are retrieved, they are displayed on our website
in the form of a price comparison.
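The comparison step described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the store names, the price strings, and the helpers parse_price and compare_prices are assumptions made for the example.

```python
import re


def parse_price(text):
    """Convert a displayed price such as 'Rs. 49,999.00' to a float."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no price found in {text!r}")
    # Drop thousands separators before converting.
    return float(match.group(0).replace(",", ""))


def compare_prices(offers):
    """Return (store, price) pairs sorted from cheapest to most expensive."""
    priced = [(store, parse_price(raw)) for store, raw in offers.items()]
    return sorted(priced, key=lambda pair: pair[1])


# Hypothetical price strings as they might appear on two store pages.
offers = {"Store A": "Rs. 71,999", "Store B": "Rs. 69,900.00"}
ranking = compare_prices(offers)
best_store, best_price = ranking[0]
print(f"Cheapest: {best_store} at {best_price:.2f}")
```

Normalizing the scraped strings to numbers before sorting matters because the two sites may format the same amount differently.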
There are different programming languages that you can use to scrape the web, and within every
programming language, there are different libraries to achieve the same goal.
We will use the Requests library to fetch the HTML code from a specific URL and we will use
pure Python to organize the data.
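A rough sketch of the fetching step is given below. To keep it free of third-party dependencies it uses the standard-library urllib in place of the Requests library named above. The host shop.example, the query parameter name q, the User-Agent string, and both helper names are assumptions for illustration only.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_search_url(base_url, query, param="q"):
    """Append the user's search term to a store's search page URL."""
    return f"{base_url}?{urlencode({param: query})}"


def fetch_html(url, timeout=10):
    """Download a page and return its HTML as text (performs a network call)."""
    request = Request(url, headers={"User-Agent": "price-compare-demo/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Only the URL is built here; fetch_html(url) would perform the actual download.
url = build_search_url("https://shop.example/search", "iphone 13")
print(url)
```

Before scraping a real site, its terms of use and robots.txt should be checked, since many stores restrict automated access.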
Basic HTML
HTML stands for HyperText Markup Language. It is used to design web pages using a markup
language. HTML is the combination of hypertext and a markup language: hypertext defines the
links between web pages, and the markup language defines the text within the tags that
define the structure of web pages.
HTML tags that are used are:
<head> : Used to set the document head
Basic Scraping
Web scraping is a technique to fetch data from websites. While surfing the web, many
websites don’t allow the user to save data for personal use. One way is to manually copy-paste
the data, which is both tedious and time-consuming. Web scraping is the automation of the data
extraction process from websites. This is done with the help of web scraping software
known as web scrapers. They automatically load and extract data from the websites based on
user requirements. These can be custom built to work for one site or can be configured to work
with any website.
How web scraping basically works:
➢ Find the URL.
➢ Inspect the page at that URL.
➢ Find the data you want to extract/scrape.
➢ Write and Run the Code.
➢ Save the Extracted/Scraped Data into any required format.
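The last step, saving the scraped data in a required format, might look like the sketch below, which serialises records to CSV and JSON with the standard library. The records and the helper names rows_to_csv and rows_to_json are hypothetical.

```python
import csv
import io
import json


def rows_to_csv(rows, fieldnames):
    """Serialise a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def rows_to_json(rows):
    """Serialise the same records to indented JSON text."""
    return json.dumps(rows, indent=2)


# Hypothetical scraped records, one per store.
rows = [
    {"store": "Store A", "product": "Phone X", "price": 71999.0},
    {"store": "Store B", "product": "Phone X", "price": 69900.0},
]
csv_text = rows_to_csv(rows, ["store", "product", "price"])
json_text = rows_to_json(rows)
print(csv_text)
```

To keep the data on disk, the returned text can be written with open(path, "w", newline="") for CSV or a plain open(path, "w") for JSON.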
System Requirements
Hardware Requirement
a. Laptop or PC
5. CODE
Index.php
<link rel="icon" href="img/mdb-favicon.ico" type="image/x-icon">
<link rel="stylesheet" href="https://use.fontawesome.com/releases/v5.11.2/css/all.css">
<link rel="stylesheet"
href="https://cdn.jsdelivr.net/npm/[email protected]/dist/css/bootstrap.min.css">
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/jquery.slim.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/umd/popper.min.js"></script>
<script
src="https://cdn.jsdelivr.net/npm/[email protected]/dist/js/bootstrap.bundle.min.js"></script>
<style>
.card{margin:15px;border-radius:15px; }
img{height:180px;} span{font-weight:bolder;
font-size: large;} #visitsite{text-decoration:none; color:white;}
.card-text{ font-weight:bolder; font-size: large;}
</style>
<nav class="navbar navbar-expand-lg navbar-dark btn-dark">
<a class="navbar-brand" href="#">Ecommerce Price Comparison website Using Web
Scrapping</a>
<button class="navbar-toggler" type="button" data-toggle="collapse"
data-target="#basicExampleNav"
aria-controls="basicExampleNav" aria-expanded="false" aria-label="Toggle navigation">
<span class="navbar-toggler-icon"></span>
</button>
<div class="collapse navbar-collapse" id="basicExampleNav">
<ul class="navbar-nav mr-auto">
<a class="nav-link" href="#">Home<span class="sr-only">(current)</span></a>
<!--li class="nav-item dropdown">
<a class="nav-link dropdown-toggle" id="navbarDropdownMenuLink" data-toggle="dropdown"
aria-haspopup="true" aria-expanded="false">Dropdown</a>
<div class="dropdown-menu dropdown-dark"
aria-labelledby="navbarDropdownMenuLink">
<a class="dropdown-item" href="#">Action</a>
<a class="dropdown-item" href="#">Another action</a>
<a class="dropdown-item" href="#">Something else here</a>
</div>
</li-->
</ul>
<form class="form-inline">
<div class="md-form my-0">
<input class="form-control mr-sm-2" type="text" placeholder="Search"
aria-label="Search">
</div>
</form>
</div>
</nav>
<br><br>
<div class="jumbotron jumbotron-fluid">
<div class="container">
<h1>Ecommerce Price Comparison website Using Web Scrapping</h1>
<p>Price Comparison Tool : Here you can compare prices of a product on various ecommerce
platforms..</p>
</div>
</div>
<footer class="page-footer font-small pt-4 btn-dark" >
<div class="container-fluid text-center text-md-left">
<div class="row">
<div class="col-md-6 mt-md-0 mt-3">
<h5 class="text-uppercase">Price Comparison Tool</h5>
<p>Here you can compare prices of a product on various e-commerce platforms.</p>
</div>
<hr class="clearfix w-100 d-md-none pb-3">
<div class="col-md-3 mb-md-0 mb-3">
<h5 class="text-uppercase">E-commerce sites</h5>
<ul class="list-unstyled">
<li><a href="#!">Amazon</a></li>
<li> <a href="#!">Flipkart</a></li>
</ul></div>
<div class="col-md-3 mb-md-0 mb-3">
<h5 class="text-uppercase">Links</h5>
</div></div></div>
</footer>
<script type="text/javascript" src="js/jquery.min.js"></script>
<script type="text/javascript" src="js/popper.min.js"></script>
<script type="text/javascript" src="js/bootstrap.min.js"></script>
<script type="text/javascript" src="js/mdb.min.js"></script>
<script type="text/javascript"></script>
Simple_html_dom.php
<?php
define('HDOM_TYPE_ELEMENT', 1);define('HDOM_TYPE_COMMENT', 2);
define('HDOM_TYPE_TEXT', 3);define('HDOM_TYPE_ENDTAG', 4);
define('HDOM_TYPE_ROOT', 5);define('HDOM_TYPE_UNKNOWN', 6);
define('HDOM_QUOTE_DOUBLE', 0);define('HDOM_QUOTE_SINGLE', 1);
define('HDOM_QUOTE_NO', 3);define('HDOM_INFO_BEGIN', 0);
define('HDOM_INFO_END', 1);define('HDOM_INFO_QUOTE', 2);
define('HDOM_INFO_SPACE', 3);define('HDOM_INFO_TEXT', 4);
define('HDOM_INFO_INNER', 5);define('HDOM_INFO_OUTER', 6);
define('HDOM_INFO_ENDSPACE', 7);
defined('DEFAULT_TARGET_CHARSET') || define('DEFAULT_TARGET_CHARSET',
'UTF-8');
defined('DEFAULT_BR_TEXT') || define('DEFAULT_BR_TEXT', "\r\n");
defined('DEFAULT_SPAN_TEXT') || define('DEFAULT_SPAN_TEXT', ' ');
defined('MAX_FILE_SIZE') || define('MAX_FILE_SIZE', 600000);
define('HDOM_SMARTY_AS_TEXT', 1);

// Fetch a URL or file and parse it into a DOM object; returns false on failure.
function file_get_html(
$url,$use_include_path = false,$context = null,$offset = 0,$maxLen = -1,$lowercase = true,
$forceTagsClosed = true,$target_charset = DEFAULT_TARGET_CHARSET,$stripRN = true,
$defaultBRText = DEFAULT_BR_TEXT,$defaultSpanText = DEFAULT_SPAN_TEXT)
{ if($maxLen <= 0) { $maxLen = MAX_FILE_SIZE; }
$dom = new simple_html_dom(
null,$lowercase,$forceTagsClosed,$target_charset,$stripRN,$defaultBRText,
$defaultSpanText);
// Download the page; without this call $contents would be undefined below.
$contents = file_get_contents($url, $use_include_path, $context, $offset, $maxLen);
if (empty($contents) || strlen($contents) > $maxLen) {$dom->clear();return false;}
return $dom->load($contents, $lowercase, $stripRN);}

// Parse an HTML string that is already in memory.
function str_get_html(
$str, $lowercase = true,$forceTagsClosed = true,
$target_charset = DEFAULT_TARGET_CHARSET,$stripRN = true,
$defaultBRText = DEFAULT_BR_TEXT,$defaultSpanText = DEFAULT_SPAN_TEXT){
$dom = new simple_html_dom(
null,$lowercase,$forceTagsClosed,$target_charset,$stripRN,$defaultBRText,
$defaultSpanText);
return $dom->load($str, $lowercase, $stripRN);}

function dump_html_tree($node, $show_attr = true, $deep = 0){
$node->dump($node);}

class simple_html_dom_node{
public $nodetype = HDOM_TYPE_TEXT; public $tag = 'text';
public $attr = array(); public $children = array();
public $nodes = array(); public $parent = null;
public $_ = array(); public $tag_start = 0; private $dom = null;
function __construct($dom){
$this->dom = $dom; $dom->nodes[] = $this;}
function dump($show_attr = true, $depth = 0){
echo str_repeat("\t", $depth) . $this->tag;
if ($show_attr && count($this->attr) > 0) {
}
echo "\n";}
function dump_node($echo = true){
$string = $this->tag;
if (count($this->_) > 0) {
$string .= ' $_ (';
foreach ($this->_ as $k => $v) {
if (is_array($v)) {
$string .= "[$k]=>(";
foreach ($v as $k2 => $v2) {
$string .= "[$k2]=>\"$v2\", ";}$string .= ')';} else {$string .= "[$k]=>\"$v\", ";}}
$string .= ')';}
$string .= ' HDOM_INNER_INFO: ';
// Read from $this: no $node variable is defined inside this method.
if (isset($this->_[HDOM_INFO_INNER])) {
$string .= "'" . $this->_[HDOM_INFO_INNER] . "'";
} else {$string .= ' NULL ';}
$string .= ' children: ' . count($this->children);
$string .= ' nodes: ' . count($this->nodes);
$string .= ' tag_start: ' . $this->tag_start;
$string .= "\n";
if ($echo) { echo $string; return; }
return $string;}
function prev_sibling(){
if ($this->parent === null) { return null; }
$idx = array_search($this, $this->parent->children, true);
if ($idx !== false && $idx > 0) { return $this->parent->children[$idx - 1]; }
return null;
}
function outertext(){
if (isset($this->_[HDOM_INFO_OUTER])) {
return $this->_[HDOM_INFO_OUTER];
}
if (isset($this->_[HDOM_INFO_TEXT])) {
return $this->dom->restore_noise($this->_[HDOM_INFO_TEXT]);
}
$ret = '';
if ($this->dom && $this->dom->nodes[$this->_[HDOM_INFO_BEGIN]]) {
$ret = $this->dom->nodes[$this->_[HDOM_INFO_BEGIN]]->makeup();
}
if (isset($this->_[HDOM_INFO_INNER])) {
if ($this->tag !== 'br') {
$ret .= $this->_[HDOM_INFO_INNER];
}
} elseif ($this->nodes)
{ foreach ($this->nodes as $n)
{
$ret .= $this->convert_text($n->outertext());
}
}
if (isset($this->_[HDOM_INFO_END]) && $this->_[HDOM_INFO_END] != 0) {
$ret .= '</' . $this->tag . '>';
}
return $ret;
}
function text(){
if (strcasecmp($this->tag, 'script') === 0) { return ''; }
if (strcasecmp($this->tag, 'style') === 0) { return ''; }
$ret = '';
if (!is_null($this->nodes)) {
foreach ($this->nodes as $n) {
if ($n->tag === 'p') {
$ret = trim($ret) . "\n\n";
}
$ret .= $this->convert_text($n->text());
if ($n->tag === 'span') {
$ret .= $this->dom->default_span_text;
}}}
return $ret;
}
function xmltext(){
$ret = $this->innertext();
$ret = str_ireplace('<![CDATA[', '', $ret);
$ret = str_replace(']]>', '', $ret);
return $ret;
}
function makeup(){
if (isset($this->_[HDOM_INFO_TEXT])) {
return $this->dom->restore_noise($this->_[HDOM_INFO_TEXT]);
}
$ret = '<' . $this->tag;
$i = -1;
foreach ($this->attr as $key => $val) {
++$i;
if ($val === null || $val === false) { continue; }
$ret .= $this->_[HDOM_INFO_SPACE][$i][0];
if ($val === true) {
$ret .= $key;
} else {
switch ($this->_[HDOM_INFO_QUOTE][$i]){
case HDOM_QUOTE_DOUBLE: $quote = '"'; break;
case HDOM_QUOTE_SINGLE: $quote = '\''; break;
default: $quote = '';
}
$ret .= $key
. $this->_[HDOM_INFO_SPACE][$i][1]
. '='
. $this->_[HDOM_INFO_SPACE][$i][2]
. $quote. $val. $quote;}}
$ret = $this->dom->restore_noise($ret);
return $ret . $this->_[HDOM_INFO_ENDSPACE] . '>';}
function __get($name){
if (isset($this->attr[$name])) {
return $this->convert_text($this->attr[$name]);
}
switch ($name) {
case 'outertext': return $this->outertext();
case 'innertext': return $this->innertext();
case 'plaintext': return $this->text();
case 'xmltext': return $this->xmltext();
default: return array_key_exists($name, $this->attr);
}}
function __set($name, $value)
{
global $debug_object;
if (is_object($debug_object)) { $debug_object->debug_log_entry(1); }
switch ($name) {
case 'outertext': return $this->_[HDOM_INFO_OUTER] = $value;
case 'innertext':
if (isset($this->_[HDOM_INFO_TEXT])) {
return $this->_[HDOM_INFO_TEXT] = $value;
}
return $this->_[HDOM_INFO_INNER] = $value;
}
if (!isset($this->attr[$name])) {
$this->_[HDOM_INFO_SPACE][] = array(' ', '', '');
$this->_[HDOM_INFO_QUOTE][] = HDOM_QUOTE_DOUBLE;
}
$this->attr[$name] = $value;}
function __isset($name)
{ switch ($name)
{ case 'outertext': return true;
case 'innertext': return true;
case 'plaintext': return true;
}
return (array_key_exists($name, $this->attr)) ? true : isset($this->attr[$name]);
}
function __unset($name){
if (isset($this->attr[$name])) { unset($this->attr[$name]); }
}
function convert_text($text){
global $debug_object;
if (is_object($debug_object)) { $debug_object->debug_log_entry(1); }
$converted_text = $text;
$sourceCharset = '';
$targetCharset = '';
if ($this->dom) {
$sourceCharset = strtoupper($this->dom->_charset);
$targetCharset = strtoupper($this->dom->_target_charset);
}
if (is_object($debug_object)) {
$debug_object->debug_log(3,
'source charset: '
. $sourceCharset
. ' target charaset: '
. $targetCharset
);
}
if (!empty($sourceCharset)
&& !empty($targetCharset)
&& (strcasecmp($sourceCharset, $targetCharset) != 0))
{ if ((strcasecmp($targetCharset, 'UTF-8') == 0)
&& ($this->is_utf8($text))) {
$converted_text = $text;
} else {
$converted_text = iconv($sourceCharset, $targetCharset, $text);
}
}
if ($targetCharset === 'UTF-8') {
if (substr($converted_text, 0, 3) === "\xef\xbb\xbf") {
$converted_text = substr($converted_text, 3);
}
if (substr($converted_text, -3) === "\xef\xbb\xbf") {
$converted_text = substr($converted_text, 0, -3);
}
}
return $converted_text;
}
static function is_utf8($str){$c = 0; $b = 0;$bits = 0;$len = strlen($str);
for($i = 0; $i < $len; $i++) {$c = ord($str[$i]);
if($c > 128) {if(($c >= 254)) { return false; }elseif($c >= 252) { $bits = 6; }
elseif($c >= 248) { $bits = 5; } elseif($c >= 240) { $bits = 4; }
elseif($c >= 224) { $bits = 3; } elseif($c >= 192) { $bits = 2; }
else { return false; }
if(($i + $bits) > $len) { return false; }
// Consume the continuation bytes; each must lie in the range 128-191.
while($bits > 1) {$i++; $b = ord($str[$i]);
if($b < 128 || $b > 191) { return false; } $bits--; }
}}
return true;
}
function get_display_size(){
global $debug_object;
$width = -1; $height = -1;
if ($this->tag !== 'img') { return false;}
if (isset($this->attr['width'])) {$width = $this->attr['width'];}
if (isset($this->attr['height'])) {$height = $this->attr['height'];}
if (isset($this->attr['style'])) {$attributes = array();
preg_match_all('/([\w-]+)\s*:\s*([^;]+)\s*;?/',
$this->attr['style'],$matches,PREG_SET_ORDER);
foreach ($matches as $match) {
$attributes[$match[1]] = $match[2];
}
if (isset($attributes['width']) && $width == -1) { if
(strtolower(substr($attributes['width'], -2)) === 'px')
{ $proposed_width = substr($attributes['width'], 0, -2);
if (filter_var($proposed_width, FILTER_VALIDATE_INT)) {
$width = $proposed_width;
}}}
if (isset($attributes['height']) && $height == -1) { if
(strtolower(substr($attributes['height'], -2)) == 'px')
{ $proposed_height = substr($attributes['height'], 0, -2);
if (filter_var($proposed_height, FILTER_VALIDATE_INT)) {
$height = $proposed_height;
}}}}
$result = array(
'height' => $height,
'width' => $width
);
return $result;
}
function save($filepath = ''){
$ret = $this->outertext();
if ($filepath !== '') {
file_put_contents($filepath, $ret, LOCK_EX);
}
return $ret;
}
function addClass($class){
if (is_string($class)) {
$class = explode(' ', $class);
}
if (is_array($class))
{ foreach($class as $c) {
if (isset($this->class))
{ if ($this->hasClass($c))
{ continue;
} else {
$this->class .= ' ' . $c;
}
} else {
$this->class = $c;
}}
} else { if
(is_object($debug_object)) {
$debug_object->debug_log(2, 'Invalid type: ', gettype($class));
}}}
function hasClass($class){
if (is_string($class))
{ if (isset($this->class)) {
return in_array($class, explode(' ', $this->class), true);
}
} else {
if (is_object($debug_object)) {
$debug_object->debug_log(2, 'Invalid type: ', gettype($class));
}}
return false;
}
function removeClass($class = null){
if (!isset($this->class)) { return;
}
if (is_null($class)) {
$this->removeAttribute('class');
return;
}
if (is_string($class)) {
$class = explode(' ', $class);
}
if (is_array($class)) {
$class = array_diff(explode(' ', $this->class), $class);
if (empty($class)) {
$this->removeAttribute('class');
} else {
$this->class = implode(' ', $class);}}}
function hasAttribute($name){return $this->__isset($name);}
function removeAttribute($name){$this->__set($name, null);}
function remove(){if ($this->parent) {$this->parent->removeChild($this);}}
function removeChild($node){
$nidx = array_search($node, $this->nodes, true);
$cidx = array_search($node, $this->children, true);
$didx = array_search($node, $this->dom->nodes, true); if
($nidx !== false && $cidx !== false && $didx !== false) {
foreach($node->children as $child) {
$node->removeChild($child);
}
foreach($node->nodes as $entity) {
$enidx = array_search($entity, $node->nodes, true);
$edidx = array_search($entity, $node->dom->nodes, true);
if ($enidx !== false && $edidx !== false)
{ unset($node->nodes[$enidx]);
unset($node->dom->nodes[$edidx]);
}}
unset($this->nodes[$nidx]); unset($this->children[$cidx]);
unset($this->dom->nodes[$didx]);
$node->clear();
}}
function getElementById($id){return $this->find("#$id", 0); }
function getElementsById($id, $idx = null){return $this->find("#$id", $idx); }
function getElementByTagName($name){ return $this->find($name, 0); }
function getElementsByTagName($name, $idx = null){ return $this->find($name, $idx); }
function parentNode(){return $this->parent(); }
$node->_[HDOM_INFO_TEXT] = '<' . $tag . $this->copy_until('<>');
if ($this->char === '<') {
$this->link_nodes($node, false);
return true;
}
if ($this->char === '>') { $node->_[HDOM_INFO_TEXT] .= '>'; }
$this->link_nodes($node, false);
$this->char = (++$this->pos < $this->size) ? $this->doc[$this->pos] : null; // next
return true;
}
$node->nodetype = HDOM_TYPE_ELEMENT;
$tag_lower = strtolower($tag);
$node->tag = ($this->lowercase) ? $tag_lower : $tag;
if (isset($this->optional_closing_tags[$tag_lower])) {
while (isset($this->optional_closing_tags[$tag_lower][strtolower($this->parent->tag)])) {
$this->parent->_[HDOM_INFO_END] = 0;
$this->parent = $this->parent->parent;
}
$node->parent = $this->parent;
}
$guard = 0; // prevent infinity loop
$space = array($this->copy_skip($this->token_blank), '', '');
do {
$name = $this->copy_until($this->token_equal);
if ($name === '' && $this->char !== null && $space[0] === '') { break;
}
if ($guard === $this->pos) {$this->char = (++$this->pos < $this->size) ?
$this->doc[$this->pos] : null; // next
continue;
}
$guard = $this->pos;
if ($this->pos >= $this->size - 1 && $this->char !== '>') {
$node->nodetype = HDOM_TYPE_TEXT;
$node->_[HDOM_INFO_END] = 0;
$node->_[HDOM_INFO_TEXT] = '<' . $tag . $space[0] . $name;
$node->tag = 'text';
$this->link_nodes($node, false);
return true;
}
if ($this->doc[$this->pos - 1] == '<') {
$node->nodetype = HDOM_TYPE_TEXT;
$node->tag = 'text';
$node->attr = array();
$node->_[HDOM_INFO_END] = 0;
$node->_[HDOM_INFO_TEXT] = substr(
$this->doc,
$begin_tag_pos,
$this->pos - $begin_tag_pos - 1
);
$this->pos -= 2;
$this->char = (++$this->pos < $this->size) ? $this->doc[$this->pos] : null; // next
$this->link_nodes($node, false);
return true;
}
if ($name !== '/' && $name !== '') { // this is a attribute name
$space[1] = $this->copy_skip($this->token_blank);
$name = $this->restore_noise($name); // might be a noisy name
if ($this->lowercase) { $name = strtolower($name); }
$node->_[HDOM_INFO_SPACE][] = $space;
$space = array(
$this->copy_skip($this->token_blank),
'',
'');
}
} while ($this->char !== '>' && $this->char !== '/');
$this->link_nodes($node, true);
$node->_[HDOM_INFO_ENDSPACE] = $space[0];
if ($this->copy_until_char('>') === '/') {
$node->_[HDOM_INFO_ENDSPACE] .= '/';
$node->_[HDOM_INFO_END] = 0;
} else {
if (!isset($this->self_closing_tags[strtolower($node->tag)])) {
$this->parent = $node;
}
}
$this->char = (++$this->pos < $this->size) ? $this->doc[$this->pos] : null; // next
if ($node->tag === 'br') {
$node->_[HDOM_INFO_INNER] = $this->default_br_text;
}
return true;
}
protected function link_nodes(&$node, $is_child){
$node->parent = $this->parent;
$this->parent->nodes[] = $node;
if ($is_child) {
$this->parent->children[] = $node;
}}
protected function as_text_node($tag){
$node = new simple_html_dom_node($this);
++$this->cursor;
$node->_[HDOM_INFO_TEXT] = '</' . $tag . '>';
$this->link_nodes($node, false);
$this->char = (++$this->pos < $this->size) ? $this->doc[$this->pos] : null; // next
return true;
}
protected function skip($chars){
$this->pos += strspn($this->doc, $chars, $this->pos);
$this->char = ($this->pos < $this->size) ? $this->doc[$this->pos] : null; // next
}
protected function copy_skip($chars){
$pos = $this->pos;
$len = strspn($this->doc, $chars, $pos);
$this->pos += $len;
$this->char = ($this->pos < $this->size) ? $this->doc[$this->pos] : null; // next
if ($len === 0) { return ''; } return substr($this->doc, $pos, $len);
}
protected function copy_until($chars){
$pos = $this->pos;
$len = strcspn($this->doc, $chars, $pos);
$this->pos += $len;
$this->char = ($this->pos < $this->size) ? $this->doc[$this->pos] : null; // next
return substr($this->doc, $pos, $len);
}
function search_noise($text)
{ global $debug_object;
if (is_object($debug_object)) { $debug_object->debug_log_entry(1); }
foreach($this->noise as $noiseElement) { if (strpos($noiseElement,
$text) !== false) { return $noiseElement;
}}}
function __toString(){return $this->root->innertext();}
function childNodes($idx = -1){return $this->root->childNodes($idx);}
function firstChild(){return $this->root->first_child();}
function lastChild(){return $this->root->last_child();}
}
Phpcompatibility.xml
<?xml version="1.0" encoding="UTF-8"?>
<ruleset name="PHPCompatibility">
<description>Defines rules for PHPCompatibility</description>
<exclude-pattern>./app</exclude-pattern>
<exclude-pattern>./example</exclude-pattern>
<exclude-pattern>./manual</exclude-pattern>
<exclude-pattern>./testcase</exclude-pattern>
<exclude-pattern>./tests</exclude-pattern>
<config name="testVersion" value="5.6"/>
<rule ref="PHPCompatibility" />
</ruleset>
6. RESULT
6.1. Search Bar
6.3. Results for Iphone 13
what PCWs are or how they work, and there were a number of misconceptions about the search
results which could lead to consumers selecting products that do not meet needs or expectations.
Many interpret the simple layout and presentation of information about the insurance products on
PCWs as all they need to make a good decision, and a cognitive nudge not to look further.
The search results on PCWs are largely taken at face value and many assume the different
products and add-ons will work the same way or offer similar cover. Once the search results
appear, the headline cost and key features dominate consumer attention and this is the wrong
time to engage them with details about how the add-ons or features work. The data-entry stage is
the optimal time to engage consumers about how the products work.
Allowing them to filter on levels of cover or features at this initial stage would not only raise
awareness about differences in the policies, but also provide more tailored and relevant results
that would allow a true comparison of both price and product. It is currently hard to find detailed
and accessible information on PCWs summarizing how the policies work and the expectation
was for more detail to summarize the levels of cover or key product features.
Some examples offered little more than the PCW search results, reinforcing perceptions that
there is little more to know, or that the GI policies are all broadly the same. Only by committing
to the next steps and moving through to the insurer website do consumers find the product detail
they need to make a decision, and this is coming too far into the process. Many want to see a
greater distinction between search results and the product summary, with a more standardized
approach across the PCWs that sits between the search results and the full product. Consumers
go on a journey through the research.
They begin by feeling that PCWs are a good thing and allow them to quickly and easily get a
large number of quotes for insurance; compare the price of the quotes presented; and gain a very
high level sense for what features or add-ons are included.
However, as the research proceeds, many find that the PCWs have a number of limitations: that
they cannot compare the actual policies in a meaningful way (although they thought they could);
they cannot be sure that the add-ons that appear in the search results will be included at no extra
cost when they go to purchase that product; they cannot rely on the filter options at the data entry
stage to appear in the search results; nor can they easily
interrogate the policies or find out exactly what they are covered for.
In future, we hope to add more features to this website, such as email notifications or other channels for updating
customers about new deals. We could add a newsletter to collect feedback, so that we
learn what users cannot find on the website and add it later. We could also advertise the website on
various social media platforms, add subscriptions for more discounts and refined searches, and
add other services such as ticket booking for buses, trains, and flights.
8. REFERENCES
[1] “Bundling as a strategy for new product introduction: Effects on consumers' reservation prices
for the bundle, the new product, and its tie-in.” Journal of Business Research (1995)
[2] X. Pan et al. “Price dispersion on the internet: A review and directions for future research.”
Journal of Interactive Marketing (2004)
[3] https://simplehtmldom.sourceforge.io/docs/1.9/index.html
[4] https://www.php.net/manual/en/function.xml-parse.php
[5] PHP Web Services: APIs for the Modern Web