Wednesday, December 06, 2006

Using Google Sitemaps

(Joining us as a guest blogger today is Brian Hoke, Principal of Bentley & Hoke, a web development and internet consulting firm in Syracuse, New York. His topic: using Google sitemaps to help search engines find the content on your site. Thanks to Brian for the contribution. -John Whiteside)

Google and other search engines rely on spidering – automated searches of your site by Google servers – to update their stored copy of your page's content. A Google search for "The Opinionated Marketers", for instance, returns a #1 ranking in part because yesterday Google's spiders crawled this blog, found the phrase "The Opinionated Marketers", stored it in their servers, and matched it with my search just now. Other factors play a role here (most notably the number of links from other sites to the page in question) but making sure Google has spidered your pages is key to being found.

Hopefully other pages (lots of other pages) link to all of the pages on your site, but that's almost certainly not 100% true. One way to ensure that Google's spiders will find all of the pages on your site, including the ones you just put up last night, is to include an XML sitemap. Fashioned in a format dictated by Google, this file offers Google a directory of all of the pages on your site - a friendly invitation to drop by each of the pages you want found. Optionally, you can include extra info for each page: date of last modification, how often the page changes, and the relative importance of the page compared to the rest of your site.

Here's an example from my Google sitemap file:

<url>
   <loc>http://www.bentleyhoke.com/index.php</loc>
   <lastmod>2006-11-14</lastmod>
   <changefreq>weekly</changefreq>
   <priority>0.9</priority>
</url>

Submitting this file to Google tells the search engine that this page of my site was last modified on November 14th, changes on a weekly basis, and is (at 0.9 out of a possible 1.0) among the most important pages on my site.

Many web authoring systems (blog software, content-management systems, and the like) make it easy to create a sitemap file. There are also free sites that will generate one for you. And you can always do it yourself, by hand or programmatically: the sitemap file I built for my site populates its listing of pages from the database in which I store page content. After each content update, I resubmit the sitemap to Google using their webmaster tools.
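For readers who want to try the programmatic route, here is a minimal sketch in Python of building a sitemap from a list of page records. It is not the code behind my own site: the build_sitemap function, the hard-coded pages list (standing in for a database query), and the example.com URL are all assumptions made for illustration. The namespace shown is the one used by the shared sitemaps.org format; an older Google-only file may declare a different one.

```python
import xml.etree.ElementTree as ET

# Namespace of the shared sitemaps.org format (an assumption about which
# version of the format you are targeting).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as a string for an iterable of page dicts.

    Each dict needs a "loc" key; "lastmod", "changefreq", and "priority"
    are optional and are written only when present.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical rows standing in for the result of a database query.
pages = [
    {
        "loc": "http://www.bentleyhoke.com/index.php",
        "lastmod": "2006-11-14",
        "changefreq": "weekly",
        "priority": 0.9,
    },
    {"loc": "http://www.example.com/about.php"},
]

sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Write the resulting string to a file such as sitemap.xml at the root of your site, then submit that file's URL. Letting an XML library build the elements, rather than pasting strings together, means characters like & in a URL are escaped for you.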

A few weeks ago, Google, Yahoo, and MSN all agreed on a common format for sitemaps - Google's format. Microsoft says they will implement this in 2007; Yahoo and Google have already implemented it. This is good news (and less work) for those of us already using sitemap files.

Will adding an XML sitemap file to your site dramatically boost search engine rankings? Certainly not. But Google's sitemap is one more tool to help ensure that the content you want found gets found.

2 comments:

Anonymous said...

Spidering wasted too much money. No need for wasting too much money.
I need you support my ideas, Sitebases Protocol, thank you for your advice.

Anonymous said...

Also, more than half a year ago now, they agreed on XML sitemap auto-discovery using robots.txt, i.e. search engines can find out where your XML sitemap is just by looking at robots.txt - I found this blog post about it: http://www.micro-sys.dk/blogs/2007/04/12/new-search-engines-support-xml-sitemaps-protocol/