
Monthly Website Tips: Sitemap and Ping February


This month I'll get into how to get your site noticed by several big parties on the web like Google, Yahoo! and MSN/Live, plus several pingback sites.

Sitemap

The sitemap specification allows webmasters to tell crawlers which pages they have on their site. This is the basic goal of a sitemap file.

Priority

Besides that you can specify the latest update date and the priority of the page within the website. NOTE: A lot of people seem to think the priority field will get them a higher position in search engines. It DOESN'T! It only tells the importance of a page relative to the other pages on your site, therefore it should be used with a few things in mind. First of all you don't want your SE (Search Engine) visitors to enter your site on a list of topics (I'll use a forum as example in this post) but on a topic. So you give the topic pages a higher priority than the topic list pages. A memberlist/member profile is 99.99999% of the time useless for someone who searches (unless you're specifically searching for it), so we award it the lowest priority within the site. When you have a menu on your site the pages in it are generally important pages with information you want your visitors to see, so award them the highest possible priority. So the priority values for these pages are: menu page 1.0, topic 0.7, topic list 0.5 and memberlist/member profile 0.1.
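To make the priority idea concrete, here is a minimal sketch of a sitemap generator that uses the forum values from this post. The function name, the page-type labels and the example.com URLs are made up for illustration; the element names (urlset, url, loc, lastmod, priority) and the namespace come from the sitemap protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Priorities from the forum example in this post; the keys are hypothetical labels.
PRIORITIES = {"menu": 1.0, "topic": 0.7, "topiclist": 0.5, "member": 0.1}


def build_sitemap(pages):
    """Build a sitemap from (url, page_type, lastmod) tuples.

    lastmod is a date string such as "2008-02-01".
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, page_type, lastmod in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "lastmod").text = lastmod
        # Priority only ranks pages against each other within this one site.
        ET.SubElement(entry, "priority").text = f"{PRIORITIES[page_type]:.1f}"
    return ET.tostring(urlset, encoding="unicode")
```

Calling build_sitemap with a topic page and a member profile gives the topic 0.7 and the profile 0.1, so a crawler that honours the field knows which of the two you care about more.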

Files

The sitemap spec also allows separating the sitemap into different files. This gives you a 'root' file and several 'child' files (note that it doesn't allow nesting). The 'root' file is a list of 'child' files to include (note that the 'root' file cannot have any URL entries in it, only references to 'child' files). Each file has a limit of 10MB in size or 50,000 entries. (A normal site shouldn't reach the theoretical limit of 2,500,000,000 pages, which is what you get with 50,000 entries in the 'root' file and 50,000 in each 'child' file. That is roughly 2 humans on the sphere of mud per page. And otherwise you can always create a second sitemap file set that handles anything over that limit.)
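The root/child split can be sketched like this. It is only an illustration: the helper names and the example.com file names are my own, and it only checks the 50,000 entry limit, not the file size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_ENTRIES = 50_000  # per-file entry limit mentioned above


def split_into_children(urls, max_entries=MAX_ENTRIES):
    """Cut a flat list of page URLs into chunks that each fit one 'child' file."""
    return [urls[i:i + max_entries] for i in range(0, len(urls), max_entries)]


def build_index(child_file_urls):
    """Build the 'root' file: it only lists 'child' files, never pages."""
    if len(child_file_urls) > MAX_ENTRIES:
        raise ValueError("too many child files for a single root file")
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for child_url in child_file_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = child_url
    return ET.tostring(root, encoding="unicode")
```

With 120,000 pages you end up with three child files (50,000 + 50,000 + 20,000) and one root file listing them. The theoretical maximum is simply MAX_ENTRIES * MAX_ENTRIES = 2,500,000,000 pages.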

Google

Google's Webmasters Centre is a very big tool (especially compared to Yahoo!'s and MSN/Live's) to manage and maintain your site within their index. One of the sections is sitemap management, and one of its most useful features is that it shows how many pages are found in your sitemap and how many of them are indexed. If you (like me) split up your sitemap into different files, it also shows those numbers per file. And if any errors occur, it shows you which file they are in.

Besides sitemap management it also shows which keywords you rank highest for and which keywords bring you the most visitors, giving great insight into how well your site is doing in their index. Besides those numbers you can use the "site:" + domain syntax to find out how many pages, and which ones, are included from the specified domain. (Note that this does include subdomains.)

Yahoo!

The interface of Yahoo! is very simple and imo not very useful compared to the Google one. Two months ago I added my sitemap there and it barely visits my site to check for updates. It also seems to hold on to old pages for a very long time and still regularly checks pages that haven't been updated in ages. (They are still checking my 3-year-old (probably 5 by now) portal.php, which has shown the same message ever since I started using bAdaptive for my site.)

MSN/Live

MSN/Live has a somewhat better interface than Yahoo!. Not a lot better, but it gives you a bit more insight into what their crawler has indexed and what they think of your site. It gives your site a score (x out of 5), for which I can't find the criteria. My guess is that it's their version of PageRank, and if so they seem to have failed. (My site gets a 5 out of 5 and there are a lot bigger sites than mine.) Or it could be an index of how well SEOed your site is.

Images

Images are a bit harder to get into search engine results. I've been trying to get my screenshots into images.google.com but for some reason Google doesn't pick them up. MSN/Live does (although only 1,500 out of 14,000). A colleague joked that I got in at MSN/Live by not following standards and that Google doesn't let me in because of that :P. (I've got both the title and alt attribute on them.) It kinda made me wonder why Google wouldn't let me into their index. More on this some other month, when I've figured it out :).

Pinging the engines

Once you've got your neat little sitemap you also want to let the engines know that it's ready or has changed. They run ping beacons for that. You simply send an HTTP request with your full sitemap URL and they add it to their queue of things to check. There are also services like AAA that ping them for you and save you the trouble of pinging 5 different engines. But if you want to do this yourself (like me), here are the most used search engines and how to ping them:

  • Normal Engines
    • Ask.com: http://submissions.ask.com/ping?sitemap=[SITEMAPURL]
    • Google: http://www.google.com/webmasters/sitemaps/ping?sitemap=[SITEMAPURL]
    • Yahoo: http://search.yahooapis.com/SiteExplorerService/V1/updateNotification?appid=YahooDemo&url=[SITEMAPURL]
    • MSN/Live.com: http://webmaster.live.com/ping.aspx?siteMap=[SITEMAPURL]
  • Ping Services
    • http://www.sitemapwriter.com/notify.php?crawler=all&url=[SITEMAPURL]
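Sending these pings from a script is just a plain HTTP GET per engine. Below is a minimal sketch using only the Python standard library. The URL templates are copied from the list above; the function names are my own, and whether each engine still answers on these addresses is something the script can only find out by trying. The sitemap URL is percent-encoded before it replaces [SITEMAPURL], so it can't break the query string.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Ping URL templates copied from the list above.
PING_TEMPLATES = {
    "ask": "http://submissions.ask.com/ping?sitemap=[SITEMAPURL]",
    "google": "http://www.google.com/webmasters/sitemaps/ping?sitemap=[SITEMAPURL]",
    "yahoo": "http://search.yahooapis.com/SiteExplorerService/V1/"
             "updateNotification?appid=YahooDemo&url=[SITEMAPURL]",
    "live": "http://webmaster.live.com/ping.aspx?siteMap=[SITEMAPURL]",
}


def build_ping_url(template, sitemap_url):
    """Fill in the placeholder with the percent-encoded sitemap URL."""
    return template.replace("[SITEMAPURL]", quote(sitemap_url, safe=""))


def ping_all(sitemap_url, fetch=urlopen, timeout=10):
    """Send one GET per engine; return the HTTP status or the error text."""
    results = {}
    for name, template in PING_TEMPLATES.items():
        url = build_ping_url(template, sitemap_url)
        try:
            with fetch(url, timeout=timeout) as response:
                results[name] = response.status
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            results[name] = str(exc)
    return results
```

Run ping_all("http://example.com/sitemap.xml") from a cronjob right after the sitemap is regenerated, and log the returned dict so you can see which engines accepted the ping.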

Pinging the engines 2

(This next bit only applies to blogs.)

Ok, this is the same but with a bit of a twist. Instead of pinging the engines about a changed sitemap, you'll be pinging blog-related services that either ping others (Ping-O-Matic), crawlers like Google Blogsearch, or services like Feedburner, to let them know you created or edited a post.

Here is my previous post on what services to ping:

  • http://rpc.blogcatalog.com/
  • http://ping.feedburner.com/
  • http://blogsearch.google.com/ping/RPC2
  • http://rpc.weblogs.com/RPC2
  • http://api.moreover.com/RPC2
  • http://www.blogdigger.com/RPC2
  • http://blog.goo.ne.jp/XMLRPC
  • http://ping.bloggers.jp/rpc/
  • http://ping.syndic8.com/xmlrpc.php
  • http://ping.weblogalot.com/rpc.php
  • http://bblog.com/ping.php
  • http://pinger.blogflux.com/rpc/
  • http://www.holycowdude.com/rpc/ping/
  • http://blogbot.dk/io/xml-rpc.php
  • http://holycowdude.com/rpc/ping/
  • http://blog.with2.net/ping.php
  • http://ping.ask.jp/xmlrpc.m
  • http://ping.blog360.jp/rpc
  • http://ping.fc2.com/
  • http://ping.kutsulog.net/
  • http://ping.namaan.net/rpc
  • http://pinger.blogflux.com/rpc
  • http://r.hatena.ne.jp/rpc
  • http://www.wasalive.com/ping/
  • http://www.overskrift.dk/ping/
  • http://rpc.twingly.com/
  • http://www.britishblogs.co.uk/xmlrpc.php
  • http://www.syndic8.com/xmlrpc.php
  • http://rpc.blogcatalog.com
  • http://www.zhuaxia.com/rpc/server.php
  • http://www.xianguo.com/xmlrpc/ping.php
  • http://rpc.technorati.com/rpc/ping
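Blog ping services are usually called over XML-RPC rather than with a plain GET. The sketch below assumes each endpoint implements the common weblogUpdates.ping(blog name, blog URL) method, as Ping-O-Matic and weblogs.com do; I haven't checked that for every address in the list, so failures are recorded instead of stopping the run. The function name and the make_proxy parameter are my own.

```python
import xmlrpc.client


def ping_blog_services(endpoints, blog_name, blog_url,
                       make_proxy=xmlrpc.client.ServerProxy):
    """Call weblogUpdates.ping on every endpoint and collect the replies."""
    results = {}
    for endpoint in endpoints:
        try:
            proxy = make_proxy(endpoint)
            # Assumed method: the weblogs.com-style ping most services accept.
            results[endpoint] = proxy.weblogUpdates.ping(blog_name, blog_url)
        except (xmlrpc.client.Error, OSError) as exc:
            # Mimic the usual reply shape so callers can treat all results alike.
            results[endpoint] = {"flerror": True, "message": str(exc)}
    return results
```

A successful reply is normally a struct with an flerror flag and a message, so after posting you can loop over the results and log every endpoint where flerror is true.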

Pingbacks

Pingbacks are an automated way of linking back to a site (or blog for that matter) that links to you. When you have a site with certain pages on it that allow comments, make sure you allow pingbacks. For my own site I busted out code from WordPress to make half of the system work. Using phpxmlrpc made it really easy to set up both the server and client side scripting (the other half).
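My own setup is PHP, but the client half is small enough to sketch with the Python standard library. It follows the two discovery routes of the Pingback 1.0 spec (an X-Pingback response header, or a link rel="pingback" tag in the page) and then calls pingback.ping with the source and target URLs. The function names are made up for this example, and fetching the target page is left out.

```python
import html
import re
import xmlrpc.client

# Pattern for the <link> element as the Pingback spec writes it.
LINK_RE = re.compile(r'<link rel="pingback" href="([^"]+)" ?/?>')


def discover_pingback_endpoint(headers, page_html):
    """Return the pingback server URL for a page, or None if it has none."""
    for name, value in headers.items():
        if name.lower() == "x-pingback":  # the header wins over the link tag
            return value.strip()
    match = LINK_RE.search(page_html)
    if match:
        return html.unescape(match.group(1))
    return None


def send_pingback(endpoint, source_url, target_url,
                  make_proxy=xmlrpc.client.ServerProxy):
    """Tell the target's pingback server that source_url links to target_url."""
    return make_proxy(endpoint).pingback.ping(source_url, target_url)
```

The source is your page containing the link, the target is the page you linked to. The target site then checks that the link is really there before it shows the pingback.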

Referers

Referers are sites linking to your page. Normally this is just static information, but you can make it semi-useful by turning it into a referer-pingback. Here is a simple TODO list on how to do this; later I'll make a How To about it:

1) Make sure your site supports pingbacks.

2) Save your incoming referers somewhere (doesn't matter where as long as you can access them it's fine).

3) Create a script that checks the incoming referers list and sends a pingback (to your own site) if it finds a new referer.

4) Set up a cronjob to run this once every hour or every day (depending on how fresh you want this info on your site).
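Steps 2 to 4 could look roughly like this. It is a sketch under a few assumptions of my own: referers are available as (referer URL, page on your site) pairs, already handled pairs are kept in a JSON file, and your own site exposes a pingback endpoint that accepts pingback.ping. All names are hypothetical.

```python
import json
import xmlrpc.client
from pathlib import Path


def load_seen(path):
    """Load the already handled (referer, target) pairs from a JSON file."""
    p = Path(path)
    if not p.exists():
        return set()
    return {tuple(pair) for pair in json.loads(p.read_text())}


def save_seen(path, seen):
    """Store the handled pairs so the next cron run skips them."""
    Path(path).write_text(json.dumps(sorted(seen)))


def process_referers(entries, seen, send):
    """Send a pingback for every pair not seen before; return the new pairs."""
    handled = []
    for pair in entries:
        pair = tuple(pair)
        if pair in seen:
            continue
        try:
            send(*pair)  # source = the referer, target = your own page
        except (xmlrpc.client.Error, OSError):
            continue  # leave it unseen so the next run retries it
        seen.add(pair)
        handled.append(pair)
    return handled


def make_sender(own_endpoint):
    """Build a send(source, target) function for your own pingback server."""
    proxy = xmlrpc.client.ServerProxy(own_endpoint)
    return lambda source, target: proxy.pingback.ping(source, target)
```

The cronjob then only has to load the seen set, call process_referers with make_sender pointed at your own endpoint, and save the set again. A crontab line starting with 0 * * * * runs it every hour.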

So this month was about automatically extending your reach; next month I'll write about what you can do by hand to increase traffic (comments, forum posts, social networking).

WyriHaximus

http request to send the pings
By brent @ 24 March 2008, 16:35
Do you know how to send an HTTP request with PHP to send the pings?  My goal is to set up a cron job to run the PHP script weekly after the sitemap is created, unfortunately the stuff written on google merely says to use wget or curl without any direction on how... The manuals for both are extremely confusing to me. brent@mimoymima.com 

