Thursday, July 26, 2007

Spam and Statistics

Mark Cahill at Vario Creative has an interesting post on spam and web statistics, and raises some great questions about how we interpret web stats:

One trend I’ve been seeing in web traffic would suggest that a good percentage of the web traffic we see in our reports is bogus as well. I attribute this to scripted robots looking for contact forms, forum registration forms and blog comments on which they hope to post their message unnoticed.

Here’s what got me thinking: I was looking at traffic stats for a local business. They don’t do any business outside of their local shops, no online product sales, etc. On the surface, they had fairly good traffic. The problem came in when I started to look at the geographic distribution of their visitors. Nearly half were coming from Asia - and on top of it, they had a huge depth of visit. Almost as if they were reading every single page of the site…


As Mark observed, these aren't real visitors. But then how do we understand what's happening on a site? Some of it is filtering things like this out, of course, but I think it's helpful to remember why the site is there. It's generally not just so that people can see it. It's usually so that people can do something: buy something, download something, ask for more information, whatever.

And so traffic stats can be worse than misleading: they can take you down the wrong path. What matters is the visitors who do what you want them to do when they reach the site. More interesting than how many people (people, not robots) come to your site is what they do there: how they move through the site and what actions they take.
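To make that concrete, here is a rough sketch in Python of counting goal actions among plausible human visitors instead of counting raw visits. Everything in it is an assumption for illustration: the `Session` fields, the "local region" set, the depth threshold, and the sample data are all made up, and the robot heuristic simply mirrors the two signals Mark noticed (visitors from far outside the business's area who read nearly every page). It is not how any particular analytics tool works.

```python
from dataclasses import dataclass


@dataclass
class Session:
    region: str        # where the visit came from (hypothetical field)
    pages_viewed: int  # depth of visit
    converted: bool    # did the visitor buy, download, or ask for info?


# Hypothetical settings for a business that only serves local customers.
LOCAL_REGIONS = {"North America"}
MAX_PLAUSIBLE_DEPTH = 30  # illustrative threshold, not a recommended value


def looks_like_robot(session: Session) -> bool:
    """Flag visits from outside the service area that crawl very deep."""
    return (
        session.region not in LOCAL_REGIONS
        and session.pages_viewed > MAX_PLAUSIBLE_DEPTH
    )


def conversion_rate(sessions: list[Session]) -> float:
    """Share of plausible human sessions that completed a goal action."""
    humans = [s for s in sessions if not looks_like_robot(s)]
    if not humans:
        return 0.0
    return sum(s.converted for s in humans) / len(humans)


# Made-up sample data: two local visits, two deep crawls from far away.
sessions = [
    Session("North America", 4, True),
    Session("North America", 2, False),
    Session("Asia", 120, False),
    Session("Asia", 95, False),
]

raw_rate = sum(s.converted for s in sessions) / len(sessions)
filtered_rate = conversion_rate(sessions)
print(f"raw: {raw_rate:.0%}, filtered: {filtered_rate:.0%}")
```

With this sample, half of the "traffic" is robots, so the raw numbers make the site look both busier and less effective than it really is. A real filter would need more care, since a heuristic this crude could also throw out genuine visitors.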

Sounds obvious, of course, but it's not as simple as it sounds.

2 comments:

rchmura said...

If you use a service like GoStats you can weed out all of the spam traffic and actually get a realistic picture of your traffic. Since GoStats is a third party tool, it is constantly being updated. You will also find that the stats-collection model is much more resistant to the effects of spam traffic.

Mark Cahill said...

Ah, it is a good tool, no doubt, but I don't want my analytics tool filtering anything I didn't tell it to filter.

The big point that I really didn't come right out and say is this: much of the traffic on the web, be it email or sites, is spam traffic. I worry what will happen when legitimate traffic is a minor percentage...

Thanks for the link, John!