
Database query


rofl90


Wait, now it doesn't work. It worked for that one page, but... well, I'll explain:

 

There are about 5,000,000 pages, and about 100,000 of them are attractions, which are the ones I want. So it needs to be able to differentiate between them and, when it finds an attraction, send it to MySQL.
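As a rough illustration of the "differentiate" step, here is a minimal sketch that treats a page as an attraction only when the ATTRACTION_REVIEW block is present. The helper name extractAttraction and the two sample strings are made up for the example; the pattern is the non-greedy one that appears later in this thread.

```php
<?php
// Hypothetical helper: returns the inner HTML of the attraction block,
// or null when the page does not contain one.
function extractAttraction($html)
{
    $pattern = '~<div id="ATTRACTION_REVIEW" class="listing">(.*?)</div><!--/ ATTRACTION_REVIEW\.listing-->~is';
    if (preg_match($pattern, $html, $match) === 1) {
        return $match[1];
    }
    return null;
}

// Made-up sample pages, only to exercise both outcomes.
$attraction = '<html><div id="ATTRACTION_REVIEW" class="listing"><p>Old Mill</p></div><!--/ ATTRACTION_REVIEW.listing--></html>';
$other      = '<html><div id="HOTEL_REVIEW">Some hotel</div></html>';

var_dump(extractAttraction($attraction)); // the captured inner markup
var_dump(extractAttraction($other));      // NULL: not an attraction page
```

Returning null for a non-attraction page lets the caller skip it and move on to the next page number.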

 

It just won't load now.

 

Code:

 

mysql_connect($CONF["HOST"], $CONF["USERNAME"], $CONF["PASSWORD"]);
mysql_select_db($CONF["DATABASE"]);
$first = 100000;
while($first < 1500000) {
$fileName = "http://www.x.com/x-g--d" . $first . ".html";
$file = file_get_contents($fileName);
$match = array();
if(preg_match('~<div id="ATTRACTION_REVIEW" class="listing">(.+)</div><!--/ ATTRACTION_REVIEW\.listing-->~is', $file, $match)) {
	$result = mysql_query("INSERT INTO pages (linkName, theText) VALUES('$fileName', '$match[1]')") or die(mysql_error()); //important
	if($result) {
		echo "Node " . $first . ": Success";
		$first++;
	}
	else {
		echo "Failure\n\n";
	}
}
else {
	echo "Node" . $first . ": Failed";
}
}
?>

Hmm, I ran this all night and it didn't do a single one. Here's the code:

 

<?php
set_time_limit(1000000000000);
$CONF = array();
$CONF["DATABASE"] 				= 'x';
$CONF["USERNAME"] 				= 'x';
$CONF["PASSWORD"] 				= 'x';
$CONF["HOST"] 					= 'x';
mysql_connect($CONF["HOST"], $CONF["USERNAME"], $CONF["PASSWORD"]);
mysql_select_db($CONF["DATABASE"]);
$first = 5000;
while($first < 1500000) {
$fileName = "http://www.x.com/x-g--d" . $first . ".html";
$file = file_get_contents($fileName);
$match = array();
if(preg_match('~<div id="ATTRACTION_REVIEW" class="listing">(.+)</div><!--/ ATTRACTION_REVIEW\.listing-->~is', $file, $match)) {
	$result = mysql_query("INSERT INTO pages (linkName, theText) VALUES('$fileName', '$match[1]')") or die(mysql_error()); //important
	if($result) {
		echo "Node " . $first . ": Success";
		?>
            <script type="text/javascript">
		d = document.getElementById("d");
		d.innerHTML = <?php
		echo $first;
		?>;
		</script>
		<?php
		$first++;
	}
	else {
		echo "Failure\n\n";
	}
}
else {
	echo "Node" . $first . ": Failed";
}
}
?>

 

It does work on a single page, though.

Running it on a different page gives me this SQL error:

 

You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 's permanent collection spans the period from about 1250 to 1900 and consists of ' at line 1

Having a stab in the dark here, but you are getting that error because the text inside your DIV has unescaped quote marks, which is, I think, killing the query. Try this:

 

$div_data = mysql_real_escape_string($match[1]);
$result = mysql_query("INSERT INTO pages (linkName, theText) VALUES('$fileName', '$div_data')") or die(mysql_error());
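For anyone reading this on a current PHP version: the mysql_* functions were removed in PHP 7, so the usual alternative is a prepared statement, which keeps the scraped text out of the SQL string altogether. The sketch below is only an illustration and swaps in PDO for the code above. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database as a stand-in for the MySQL server; the sample text is made up to resemble the fragment in the error message.

```php
<?php
// Assumption: in-memory SQLite stands in for MySQL so the sketch runs
// without a server. With MySQL only the DSN and credentials would differ.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE pages (linkName TEXT, theText TEXT)');

// Made-up text containing an apostrophe, like the text that broke the query.
$fileName = 'http://www.x.com/x-g--d100000.html';
$divData  = "The museum's permanent collection spans the period from about 1250 to 1900";

// The ? placeholders are bound separately, so quotes in the data
// cannot terminate the SQL string literal.
$stmt = $pdo->prepare('INSERT INTO pages (linkName, theText) VALUES (?, ?)');
$stmt->execute(array($fileName, $divData));

// Read the row back to confirm the apostrophe survived intact.
$stored = $pdo->query('SELECT theText FROM pages')->fetchColumn();
var_dump($stored === $divData);
```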

Well... you're using, I'm assuming, remote files, which means your server has to open every HTML file in the while loop and get its contents, so it all depends on the number of pages.

 

If you want a good estimate of the time, limit it to 10 or so pages and see how long that takes. If that takes 10 seconds, you could almost assume that 300 records are going to take 5 minutes.
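The same arithmetic can be written out as a tiny sketch. The helper name estimateSeconds is hypothetical, and the 10 pages in 10 seconds figure is just the example rate from the post above, not a measurement.

```php
<?php
// Hypothetical helper: scale a timed sample run up to a larger crawl.
function estimateSeconds($samplePages, $sampleSeconds, $totalPages)
{
    return ($sampleSeconds / $samplePages) * $totalPages;
}

// 300 pages at the example rate, shown in minutes.
echo estimateSeconds(10, 10, 300) / 60, " minutes\n";

// The first posted loop runs from 100000 up to (not including) 1500000.
$pages = 1500000 - 100000;
echo round(estimateSeconds(10, 10, $pages) / 86400, 1), " days\n";
```

At that assumed rate the full range works out to roughly 16 days, so a single overnight run would not get through it.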

 

 

handle downwards:

 

// Single quotes, so the \t in the path is not turned into a tab character.
$handle = fopen('c:\my_prog\tracking_info.txt', 'a+') or die("Unable to open file");
while($first < 999999) {
$fileName = "http://www.x.com/x-g--d" . $first . ".html";
if($file = file_get_contents($fileName)) {
	$match = array();
	if(preg_match('~<div id="ATTRACTION_REVIEW" class="listing">(.*?)</div><!--/ ATTRACTION_REVIEW\.listing-->~is', $file, $match)) {
		$match[1] = mysql_real_escape_string($match[1]);
		$result = mysql_query("INSERT INTO pages (linkName, theText) VALUES('$fileName', '$match[1]')") or die(mysql_error()); //important
		if($result) {
			fwrite($handle, "Node " . $first . ": Success\n\n");
			$first++;
		}
		else {
			fwrite($handle, "Node" . $first . ": Failed\n\n");
		}
	}
	else {
		fwrite($handle, "Node" . $first . ": Not an attraction\n\n");
		$first++;
	}
}
else {
	// Log before incrementing so the message names the node that failed.
	fwrite($handle, "Could not open Node" . $first . "\n\n");
	$first++;
}
}
?>
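Compared with the earlier versions, the revised loop above also advances $first when a page cannot be opened or is not an attraction, so it no longer retries the same page forever. A minimal sketch of that control flow follows. The function crawl and the $fetch and $store callbacks are hypothetical stand-ins for file_get_contents() and the INSERT, so it can be run without a network or database.

```php
<?php
// Hypothetical sketch: $fetch returns the page HTML or false,
// $store receives the node number and the captured text.
function crawl($start, $end, $fetch, $store)
{
    $log = array();
    // The for loop advances $id on every path, so a page that fails to
    // load or is not an attraction cannot stall the crawl.
    for ($id = $start; $id < $end; $id++) {
        $html = $fetch($id);
        if ($html === false) {
            $log[] = "Could not open Node $id";
            continue;
        }
        if (!preg_match('~<div id="ATTRACTION_REVIEW" class="listing">(.*?)</div><!--/ ATTRACTION_REVIEW\.listing-->~is', $html, $match)) {
            $log[] = "Node $id: Not an attraction";
            continue;
        }
        $store($id, $match[1]);
        $log[] = "Node $id: Success";
    }
    return $log;
}

// Made-up pages: node 1 is an attraction, node 2 is not, node 3 is missing.
$pages = array(
    1 => '<div id="ATTRACTION_REVIEW" class="listing">A</div><!--/ ATTRACTION_REVIEW.listing-->',
    2 => '<p>not an attraction</p>',
);
$saved = array();
$log = crawl(1, 4,
    function ($id) use ($pages) { return isset($pages[$id]) ? $pages[$id] : false; },
    function ($id, $text) use (&$saved) { $saved[$id] = $text; }
);
print_r($log);
```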

