
Create a PHP Site Map Generator

This PHP tutorial gives step-by-step instructions for building your own PHP site map generator that follows the Google XML sitemap format. It is a fully customized method: you paste all of your canonical URLs into the program, and it outputs the XML syntax for the sitemap.xml file, which will be stored in a specified location on your server. You should be familiar with XML before you start; if you are not, it is best to learn the basics first.

There are two main advantages to building this PHP site map generator rather than using other site map generators.

  1. Google strongly advises webmasters to submit XML site maps that contain ONLY canonical URLs.
  2. Even if you use an online XML site map generator, you still have to spend time filtering its output down to the canonical URLs, and such tools are commonly limited to 300-500 URLs, which is inadequate if you are running a large or rapidly growing site.

PHP Site Map Generator Design

Most, if not all, PHP web programs begin with a design plan. Here, the canonical URLs are entered into a web form. A text area is used to accommodate a large number of canonical URLs (assuming you have a large or rapidly growing site).

The PHP script will be used to carry out 4 tasks:

  1. Parse the URLs from the form
  2. Compute the priority of each URL, which is used in the sitemap
  3. Generate the XML syntax
  4. Output the generated syntax with the computed priorities back to the user's web browser
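
As a rough illustration of the first task, the sketch below splits the raw text area contents into a clean list of URLs. The function name parse_url_list is hypothetical and not part of the tutorial's script; it also assumes one URL per line and that blank lines should be ignored.

```php
<?php
// Hypothetical helper (not part of the original script): turns the raw
// textarea contents into a clean list of URLs, one per line.
function parse_url_list(string $raw): array
{
    // Split on Windows (\r\n), old Mac (\r) and Unix (\n) line endings.
    $lines = preg_split('/\r\n|\r|\n/', $raw);

    // Trim each line and keep only the non-empty ones.
    $urls = array();
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line !== '') {
            $urls[] = $line;
        }
    }
    return $urls;
}
```

Calling parse_url_list($_POST['url']) would return an array in the same order the user typed the URLs, which matters because the order later determines the priorities.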

The XML syntax must conform to the Google XML sitemap standard (the Sitemaps protocol documented at sitemaps.org). An example of conforming XML syntax is:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.example.com/</loc>
<priority>1</priority>
</url>
<url>
<loc>http://www.example.com/testing-page-1/</loc>
<priority>0.90</priority>
</url>
<url>
<loc>http://www.example.com/testing-page-2/</loc>
<priority>0.94</priority>
</url>
</urlset>
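
For comparison with the color-coded browser output used later in this tutorial, here is a minimal sketch that builds the same structure as plain XML text. The function name build_sitemap_xml and its two parameters are assumptions made for illustration; it expects unescaped URLs and one priority per URL.

```php
<?php
// Hypothetical helper: builds the sitemap as plain XML text from a list
// of URLs and a matching list of priorities (same order, same length).
function build_sitemap_xml(array $urls, array $priorities): string
{
    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($urls as $i => $url) {
        // Escape &, <, >, " and ' so the URL is valid inside XML.
        $loc = htmlspecialchars($url, ENT_QUOTES | ENT_XML1, 'UTF-8');
        $xml .= "<url>\n";
        $xml .= "<loc>" . $loc . "</loc>\n";
        // Always print two decimals, e.g. 1.00 or 0.90.
        $xml .= "<priority>" . number_format($priorities[$i], 2, '.', '') . "</priority>\n";
        $xml .= "</url>\n";
    }
    $xml .= "</urlset>\n";
    return $xml;
}
```

For example, build_sitemap_xml(array('http://www.example.com/'), array(1.0)) returns a document with a single url entry whose priority is printed as 1.00.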

You might consider color-coding the XML syntax in the output to make the URLs easy to tell apart, which comes in handy when you are troubleshooting the code. A captcha system is also wise, to prevent robots or spammers from using the form.
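
The captcha logic itself is not shown in this tutorial, so the following is only a sketch of one way it could work. The helper names make_captcha_code and captcha_is_correct are hypothetical; the only detail taken from the tutorial is that the expected code lives in $_SESSION['answer']. Drawing the code as an image is left out.

```php
<?php
// Hypothetical helper: creates a random security code.
function make_captcha_code(int $length = 5): string
{
    // Digits and letters that are hard to confuse (no 0/O and no 1/I).
    $alphabet = '23456789ABCDEFGHJKLMNPQRSTUVWXYZ';
    $code = '';
    for ($i = 0; $i < $length; $i++) {
        $code .= $alphabet[random_int(0, strlen($alphabet) - 1)];
    }
    return $code;
}

// Hypothetical helper: checks what the user typed against the stored code.
function captcha_is_correct(string $typed, string $expected): bool
{
    // Ignore case and surrounding spaces; never accept an empty expected code.
    return $expected !== ''
        && hash_equals(strtoupper($expected), strtoupper(trim($typed)));
}
```

Under these assumptions, the captcha image script would store $_SESSION['answer'] = make_captcha_code(); before rendering the image, and the form handler would call captcha_is_correct($_POST['captcha'], $_SESSION['answer']).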

The web form HTML code:

Create the code for the form to serve as the user input:

<?php
//Start the session for the captcha (security code).
//session_start() must run before any output is sent to the browser.
session_start();
?>
<!--Display the head tag and page title-->
<html>
<head>
<title>PHP site map code generator</title>
</head>
<body>
<?php
//Check if the form has been submitted
if (!isset($_POST['submit']))
{
//If the web form has not been submitted, display the form
?>
<!--Web form-->
<br />
<!--This form submits the data to itself using PHP_SELF-->
<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>"
method="post">
<font face="Arial" size="3">
<!--Below is the form content and the instructions on how to use it-->
<!--Enter content here for your own web program-->
<!--The text area accepts the user's input: one URL per line-->
<!--The form also includes the captcha image (security code)-->
<textarea name="url" rows="18" cols="120"></textarea>
<br /><br />
<img src="/xmlsitemapgenerator/antibot.php" />
<br />
Type the anti-bot code above:
<br /> <br />
<input type="text" name="captcha" size="10">
<br /> <br />
<input type="submit" name="submit" value="Generate Sitemap XML code">
</font>
</form>
<a href="/xmlsitemapgenerator/xmlsitemapgenerator.php">Reset / Clear Form</a>

PHP Script Code:

As mentioned above, the PHP script carries out 4 tasks (in addition to retrieving and validating the data):

<?php
}
else
{
//The web form has been submitted: grab the data from POST and remove surrounding white space using trim
$url = isset($_POST['url']) ? trim($_POST['url']) : '';
$captcha = isset($_POST['captcha']) ? trim($_POST['captcha']) : '';
//Check if the submitted form contains any data.
//Also check if the security code is correct.
//More information about captcha design here http://www.devshed.com/c/a/PHP/Designing-a-Captcha-System-with-PHP-and-MySQL/
if (empty($url) || !isset($_SESSION['answer']) || $captcha !== (string)$_SESSION['answer'])
{
//Tell the user that the form is empty or the captcha is wrong
die('ERROR: Enter at least one URL and the correct captcha code. <a href="/xmlsitemapgenerator/xmlsitemapgenerator.php">Click here to go back to the form</a>');
}
else
{
//$url is a single string that holds one URL per line.
//Split it into an array with the PHP explode function, using the newline character as the separator.
//trim removes the carriage returns sent with Windows line endings, and array_filter drops blank lines.
//The result is assigned to the $data variable.
//This does the job of actually parsing the URLs from the web form.
$data = array_filter(array_map('trim', explode("\n", $url)), 'strlen');
//Display to the web browser the heading sections of the XML syntax using the Google XML sitemap standard
echo '<font face="Courier New" size="2">';
echo '<font color="#C0547F"><i>&lt;&#63;xml version&#61;&#34;1.0&#34; encoding&#61;&#34;UTF&#45;8&#34;&#63;&#62;</i></font>';
echo '<br />';
echo '<font color="#7C137F"><b>&lt;urlset</b></font> <b>xmlns</b>&#61;<font color="blue">&#34;http://www.sitemaps.org/schemas/sitemap/0.9&#34;</font>&gt;';
echo '<br /><br />';
//The code below computes the priority of the URLs in the sitemap.
//The most important URL (the one entered first by the user in the web form) has a priority of 1.
//The lowest assigned priority is 0.5, regardless of how many URLs are being processed.
//The decrement, i.e. the priority difference between one URL and the next, is given by the formula:
// (0.5)/((Number of URLs)-1)
//In PHP, the count function returns the number of URLs in the $data variable.
//With a single URL there is nothing to decrement, which also avoids a division by zero.
$total = count($data);
$difference = ($total > 1) ? (-0.5)/($total-1) : 0;
//The foreach loop below does the actual work of generating the XML syntax for all the URLs.
//(The each function used in older versions of this script was removed in PHP 8.)
$priority=1.0;
foreach ($data as $value) {
//Make the URL safe to display in the browser
$value=htmlspecialchars($value);
$roundpriority=round($priority,2);
echo "<font color='#7C137F'><b>&lt;url</b></font>&gt;";
echo "<br />";
echo "&nbsp;&nbsp;&lt;<font color='#7C137F'><b>loc</b></font>&gt;$value&lt;&#47;<font color='#7C137F'><b>loc</b></font>&gt;";
echo "<br />";
echo "&nbsp;&nbsp;&lt;<font color='#7C137F'><b>priority</b></font>&gt;$roundpriority&lt;&#47;<font color='#7C137F'><b>priority</b></font>&gt;";
echo "<br />";
echo "&lt;&#47;<font color='#7C137F'><b>url</b></font>&gt;";
echo "<br /><br />";
//After each pass, the priority is decreased by the difference computed earlier; the $difference variable is a negative number.
$priority=$priority+$difference;
}
echo "&lt;&#47;<font color='#7C137F'><b>urlset</b></font>&gt;";
echo '</font>';
}
}
?>
<!--Clear out the session variables-->
<?php
$_SESSION = array();
session_destroy();
?>
</body>
</html>
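
The priority formula described in the script comments, 0.5 / ((number of URLs) - 1), can also be isolated in a small function to make it easier to check by hand. This is a sketch only; the name compute_priorities is hypothetical, and giving a single URL a priority of 1.0 is an assumption made here to avoid dividing by zero. As a worked example, 6 URLs give a step of 0.5 / 5 = 0.1, so the priorities are 1.0, 0.9, 0.8, 0.7, 0.6 and 0.5.

```php
<?php
// Hypothetical helper: returns one priority per URL, falling linearly
// from 1.0 (first URL) to 0.5 (last URL).
function compute_priorities(int $count): array
{
    if ($count <= 0) {
        return array();
    }
    if ($count === 1) {
        // Assumption: a single URL simply gets the top priority.
        // This also avoids dividing by zero in the step formula below.
        return array(1.0);
    }
    // Step between consecutive URLs: 0.5 / ((number of URLs) - 1).
    $step = 0.5 / ($count - 1);
    $priorities = array();
    for ($i = 0; $i < $count; $i++) {
        $priorities[] = round(1.0 - $i * $step, 2);
    }
    return $priorities;
}
```

Computing each value directly from its position, rather than subtracting the step repeatedly, keeps floating-point rounding errors from accumulating on long URL lists.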

Implementing the Code:

This is a customized XML PHP site map generator, so you must keep 3 vital points in mind:

  1. Make sure that the URLs inserted into the form are arranged from the most important (e.g. the index page) to the least important (e.g. the policy page).
  2. Use XML entity escape codes for the special characters in your URLs (the sitemap protocol requires escaping &, ', ", < and >). Apply the escape codes before pasting the entire URL list into the form.
  3. Copy the result from the web program's output, paste it into a blank text file, and save it as sitemap.xml.
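
If you would rather not escape the URLs by hand, the escaping in point 2 can be sketched with PHP's built-in htmlspecialchars function. The wrapper name escape_sitemap_url is hypothetical, and the sketch assumes each URL is passed in unescaped and escaped exactly once; running it on an already escaped URL would escape the ampersands a second time.

```php
<?php
// Hypothetical helper: applies XML entity escaping to one raw URL.
// ENT_XML1 with ENT_QUOTES turns & < > " ' into
// &amp; &lt; &gt; &quot; &apos; respectively.
function escape_sitemap_url(string $url): string
{
    return htmlspecialchars($url, ENT_QUOTES | ENT_XML1, 'UTF-8');
}
```

For instance, escape_sitemap_url('http://www.example.com/page.php?a=1&b=2') returns the same URL with the ampersand written as &amp;amp; in the XML source, that is, http://www.example.com/page.php?a=1&amp;b=2.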

Lastly, use any FTP program (such as FileZilla or CuteFTP) to upload the file to the root directory of your site. You can then submit the link pointing to your sitemap.xml file (e.g. www.example.com/sitemap.xml) to Google Webmaster Tools.
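
As an alternative to copying, pasting and uploading by hand, a script running on the server could write the file itself. The sketch below is an assumption-laden illustration, not part of the tutorial's script: the function name save_sitemap is hypothetical, and it only works if the web server has write permission for the target path.

```php
<?php
// Hypothetical helper: writes the sitemap text to a file and reports
// whether the write succeeded.
function save_sitemap(string $xml, string $path): bool
{
    // LOCK_EX takes an exclusive lock while writing, so two simultaneous
    // requests do not write to the file at the same time.
    // file_put_contents returns false on failure.
    return file_put_contents($path, $xml, LOCK_EX) !== false;
}
```

A call such as save_sitemap($xml, $_SERVER['DOCUMENT_ROOT'] . '/sitemap.xml') would place the file in the site's root directory, assuming $xml holds plain XML text rather than the color-coded HTML shown in the browser.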
