Sitemap generator class
This is the Curl multi sitemap class, which generates sitemaps by crawling sites.
It can crawl one or more sites, retrieve their pages, follow links recursively, and determine the addresses of all pages to include in an XML sitemap.
It can ignore given URLs, so that they are neither crawled nor included in the sitemap.
The class can also notify Google, Bing, Yahoo, Ask and Weblogs when the sitemap is updated.
An example of this class in use: SitemapGenerator.co.cc
Contents
- Download
- Example codes
- Examples in action
- Method list
- Possible error messages
- Latest changes
- Rate us
- Support
- Awards
Download
Example codes
<?php
// remove the execution time limit
set_time_limit(0);

// load the class and create an instance
include("./sitemap.class.php");
$sitemap = new sitemap();

// optionally set the proxy server name and port, or IP and port;
// comment out or set to an empty string to disable proxy use
//$sitemap->set_proxy('10.1.1.1:8080');

// ignore URLs which contain any of these substrings
$sitemap->set_ignore(array("javascript:", ".css", ".js", ".ico", ".jpg", ".png", ".jpeg", ".swf", ".gif"));

// parse one site and gather its links
$sitemap->get_links("http://code-snippets.co.cc");

// parse another site and gather its links
$sitemap->get_links("http://cms.annar2r.info");

// return the URL list as an array
//$arr = $sitemap->get_array();
//echo "<pre>";
//print_r($arr);
//echo "</pre>";

header("content-type: text/xml");

// generate the sitemap
$map = $sitemap->generate_sitemap();

// submit the sitemap to Google, Yahoo, Bing, Ask and Moreover services
$sitemap->ping("http://" . $_SERVER['SERVER_NAME'] . $_SERVER['REQUEST_URI']);

echo $map;
?>
Examples in action
Example scripts provided with package in action:
Method list
- Ignore URL substrings
- Set proxy
- Get links
- Get array of URLs
- Notify services about your sitemap update
- Generate XML sitemap
Ignore URL substrings
| Method name | set_ignore($ignore_list) |
| Description | Exclude from the sitemap every URL that contains one of these substrings |
| Input parameters | array $ignore_list - array of substrings (*wildcards*) to ignore |
| Example input | set_ignore(array("javascript:", ".css", ".js", ".ico", ".jpg", ".png", ".jpeg", ".swf", ".gif")) |
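The substring rule described above can be sketched as follows. This is only an illustration: `url_is_ignored` is a hypothetical helper, not a method of the class, and the case-insensitive comparison is an assumption about how the matching might work.

```php
<?php
// Hypothetical helper, not part of sitemap.class.php: returns true when
// $url contains any of the substrings in $ignore_list.
function url_is_ignored($url, array $ignore_list)
{
    foreach ($ignore_list as $needle) {
        // stripos() compares case-insensitively; the real class may differ.
        if ($needle !== '' && stripos($url, $needle) !== false) {
            return true;
        }
    }
    return false;
}

$ignore = array("javascript:", ".css", ".js", ".ico", ".jpg", ".png", ".jpeg", ".swf", ".gif");

var_dump(url_is_ignored("http://code-snippets.co.cc/style.css", $ignore));  // bool(true)
var_dump(url_is_ignored("http://code-snippets.co.cc/about.html", $ignore)); // bool(false)
```

With plain substring matching, a short entry such as ".js" would also exclude URLs ending in ".jsp" or ".json", so the ignore list is worth choosing with that in mind.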
Set proxy
| Method name | set_proxy($host_port) |
| Description | Set proxy host and port |
| Input parameters | string $host_port - proxy host and port, for example someproxy:8080 or 10.1.1.1:8080 |
| Example input | set_proxy("10.1.1.1:8080") |
Get links
| Method name | get_links($domain) |
| Description | Parse the page at the provided URL and collect all URLs from the same domain |
| Input parameters | string $domain - URL of the website |
| Example input | get_links("http://code-snippets.co.cc") |
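To illustrate what "collect all URLs from the same domain" could involve, here is a minimal sketch that extracts same-host links from one page of HTML using PHP's DOM extension. The function `same_host_links` is hypothetical and is not the class's actual implementation. It handles only absolute and root-relative links, and it does no fetching or recursion.

```php
<?php
// Hypothetical helper, not part of sitemap.class.php.
// Returns the unique absolute links in $html that point to the host of $base_url.
function same_host_links($html, $base_url)
{
    $scheme = parse_url($base_url, PHP_URL_SCHEME);
    $host   = parse_url($base_url, PHP_URL_HOST);
    if (trim($html) === '' || !$scheme || !$host) {
        return array();
    }

    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $found = array();
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = trim($anchor->getAttribute('href'));
        // Turn root-relative links ("/about.html") into absolute ones.
        if ($href !== '' && $href[0] === '/' && substr($href, 0, 2) !== '//') {
            $href = $scheme . '://' . $host . $href;
        }
        // Links without a host (mailto:, javascript:, "#top") never match.
        if ($href !== '' && parse_url($href, PHP_URL_HOST) === $host) {
            $found[$href] = true; // array keys remove duplicates
        }
    }
    return array_keys($found);
}

$html = '<a href="/about.html">About</a>'
      . '<a href="http://other.example/x">Elsewhere</a>'
      . '<a href="/about.html">About again</a>';
print_r(same_host_links($html, "http://code-snippets.co.cc"));
```

A recursive crawler would call such a function on every newly found URL and keep a set of already visited addresses to avoid loops.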
Get array of URLs
| Method name | get_array() |
| Description | Return an array of all URLs collected by earlier calls to get_links |
| Input parameters | none |
| Example input | get_array() |
Notify services about your sitemap update
| Method name | ping($sitemap_url, $title ="", $siteurl = "") |
| Description | Notify services such as Google, Yahoo, Ask, Bing and Moreover about your sitemap and its updates |
| Input parameters | string $sitemap_url - URL of the generated sitemap; string $title - your website title (optional); string $siteurl - URL of your site (optional) |
| Example input | ping("http://code-snippets.co.cc/sitemap.xml", "Code snippets", "http://code-snippets.co.cc") |
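This page does not show which endpoints the class contacts. As a rough sketch of the general idea, a sitemap ping is often a plain HTTP GET request that carries the URL-encoded sitemap address as a query parameter. The endpoint `http://search.example/ping`, the parameter name `sitemap`, and the helper `build_ping_url` below are all assumptions made for illustration.

```php
<?php
// Hypothetical helper, not part of sitemap.class.php.
// $endpoint is a placeholder, not a real service address.
function build_ping_url($endpoint, $sitemap_url)
{
    // urlencode() makes the sitemap address safe inside a query string.
    return $endpoint . '?sitemap=' . urlencode($sitemap_url);
}

echo build_ping_url("http://search.example/ping", "http://code-snippets.co.cc/sitemap.xml");
// prints http://search.example/ping?sitemap=http%3A%2F%2Fcode-snippets.co.cc%2Fsitemap.xml
```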
Generate XML sitemap
| Method name | generate_sitemap() |
| Description | Generate an XML sitemap from the URLs collected by get_links and return it as a string |
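For orientation, the sketch below shows how a list of URLs might be turned into a minimal sitemap document that follows the sitemaps.org 0.9 schema. `build_sitemap_xml` is a hypothetical function written for this example. The output of the real `generate_sitemap()` may contain additional elements.

```php
<?php
// Hypothetical helper, not part of sitemap.class.php.
// Builds a minimal sitemap with one <url><loc> entry per address.
function build_sitemap_xml(array $urls)
{
    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($urls as $url) {
        // Escape &, <, > and quotes so the result stays well-formed XML.
        $loc = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
        $xml .= '  <url><loc>' . $loc . '</loc></url>' . "\n";
    }
    $xml .= '</urlset>' . "\n";
    return $xml;
}

echo build_sitemap_xml(array(
    "http://code-snippets.co.cc/",
    "http://code-snippets.co.cc/?page=2&sort=new",
));
```

Escaping matters because query strings commonly contain `&`, which must appear as `&amp;` inside XML.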
Possible error messages
List of all errors and meanings
| Error text | Meaning | Solution |
| Provided file IMAGE_PATH isn't correct image format | The file you are trying to add isn't an image, or has an unsupported format. | Convert it to a supported format, such as png, jpg or gif |
| Provided file IMAGE_PATH doesn't exist | The file you are trying to add doesn't exist | Check the specified path and the file and directory permissions |
Latest changes
15.09.2010 - added proxy support
Rate us
Try it out and rate it on PHPclasses.org
Support
PHP classes support forum or comments below
Awards
Curl multi sitemap was nominated for the Innovation Award and achieved 5th place. Thanks to everyone for voting.










