Why counters are (worse than) useless
David Dorn explains why the figures you see in
your counters are misleading at best and dangerous most of the time
To understand why the figures you see on
your counters are almost certainly nowhere near
accurate, you need to understand how the Web works – and the concept
of cacheing. When you surf to a Web page, what you’re actually doing
is requesting a set of linked files from a potential sequence of
servers. The first stop is usually your browser cache…
Browser Cache
Almost every Web browser maintains a directory on the host
PC into which the currently viewed page’s files are copied.
Normally, when you return to that page, in that session, your
browser takes the information from that directory – the browser
cache. The result is that the Web site sees no access. Its files are
not needed.
Local cache
“Local” in this sense means local to the IP block in
which your PC resides. Your ISP may implement caches specific to
ranges of dynamically assigned IP addresses. If that’s the case,
when you request the page, the Local Cache
is searched for its files first, and, if they’re there, the Web site
doesn’t get a request for them – so it doesn’t see an access.
“ISP” Cache
Many ISPs (like AOL) maintain cacheing servers set
up purely to store pages from the WorldWide Web that their users
have accessed. So, if someone from the same ISP as you has already
been to the Web site you’re trying to get to, you’ll be served the
pages from the ISP Cache – again, the Web site itself won’t be
troubled to send the files – so it doesn’t know you’ve had a look.
The files may then pass through a local cache on their way to you.
As you can see, there are numerous opportunities for
your browser to find the pages you’re looking for without the Web
site you’re trying to get to ever knowing you’ve read it. If the
site doesn’t know you’ve “been there” (that is, seen the pages) then
it cannot add you to the statistics, or increment the counter.
Any count you get, then, is likely to be on the low
side.
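To make the undercount concrete, here is a minimal sketch of the three cache layers described above sitting in front of a site. Everything in it is an assumption for illustration: the class names, the `/index.html` path, the visitor numbers, and above all the idea that caches never expire anything and that every visitor shares one local cache and one ISP cache. Real caches are messier, but the direction of the error is the same.

```python
class Site:
    """The origin Web site. Its counter only moves when a request reaches it."""

    def __init__(self):
        self.counter = 0

    def get(self, url):
        self.counter += 1
        return f"contents of {url}"


class Cache:
    """An idealised cache: it asks upstream once per URL, then serves its copy."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.store = {}

    def get(self, url):
        if url not in self.store:
            self.store[url] = self.upstream.get(url)
        return self.store[url]


def simulate(visitors, views_each):
    """Return (views that really happened, views the site's counter saw)."""
    site = Site()
    isp_cache = Cache(site)          # shared by everyone at the ISP
    local_cache = Cache(isp_cache)   # shared by everyone in the IP block
    actual_views = 0
    for _ in range(visitors):
        browser_cache = Cache(local_cache)  # one per visitor's PC
        for _ in range(views_each):
            browser_cache.get("/index.html")
            actual_views += 1
    return actual_views, site.counter


actual, counted = simulate(visitors=1000, views_each=3)
print(actual, counted)  # 3000 1
```

In this deliberately extreme setup, 3,000 page views move the counter exactly once. Real figures will fall somewhere between the two numbers, and the site has no way of telling where.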
Can I apply any tricks to help?
If you’re the Web master, there are tricks that will
subvert the cacheing process – but I wouldn’t ever recommend that
you implement them. The main reason that cacheing exists in the
first place is to speed up the Internet, or, more specifically, the
Web. Consider that some Web sites are hosted on machines that really
couldn’t handle the enormous numbers of visitors that they could
potentially get. If local and ISP caches didn’t exist to remove some
of the load, the site would slow to a crawl, or maybe crash
altogether.
Then there are bandwidth considerations. It’s much
less costly to provide you, as a visitor, with a set of files to
make up an HTML page on the first hop of your connection than it is
to serve them remotely. If, by sending a site once to a cache, you
can then send it on to, say, 20,000 visitors, each of whom is
dialling directly into just one server, the reduction in bandwidth
necessary is massive.
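The arithmetic behind that claim is simple enough to sketch. The 20,000 visitors come from the example above; the 50 KB page size is purely an assumed figure, as is the simplification that a single cache serves every one of those visitors after fetching the page once.

```python
PAGE_SIZE_KB = 50      # assumed size of the page and its files
VISITORS = 20_000      # figure used in the text


def origin_traffic_kb(visitors, page_size_kb, cached):
    """Kilobytes the Web site itself has to send.

    With a shared cache in front of it, the site sends the page once
    (if anyone visits at all); without one, it sends it to every visitor.
    """
    requests_reaching_site = min(visitors, 1) if cached else visitors
    return requests_reaching_site * page_size_kb


without_cache = origin_traffic_kb(VISITORS, PAGE_SIZE_KB, cached=False)
with_cache = origin_traffic_kb(VISITORS, PAGE_SIZE_KB, cached=True)

print(without_cache)                # 1000000
print(with_cache)                   # 50
print(without_cache // with_cache)  # 20000
```

Under these assumptions the site sends 50 KB instead of roughly a gigabyte. The same factor of 20,000 is also the amount by which its counter is out.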
In other words, cacheing helps prevent the Web
becoming clogged during periods of peak usage. Using zero-length
refreshes and pragma statements to force a cache bypass, then,
subverts the whole process, and could lead to the Web grinding to a
halt.
Can I apply a multiplier to get the right
figures?
Nope! There is absolutely no way of knowing what
access figures you do see represent, and no way of estimating what
scaling factor to use. About the only thing you can say for certain
is that you have had at least the number of accesses your counters
show. That’s the only thing – you can’t even estimate the number of
people, since accesses do not equal people and individual IP
addresses don’t equal people, either. When you log into AOL, you’re
assigned an IP address. Next time you log in, chances are you’ll
have a different IP address. You could, then, appear twice in the
stats for a given Web site on any given day – but you’re the same
person.
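A small sketch shows how the three numbers drift apart. The log entries below are invented for illustration (the addresses come from a range reserved for documentation), and the "person" column is something a real site never gets to see; it is included only so the comparison can be made.

```python
# Hypothetical access log: (IP address the site saw, who was really there).
LOG = [
    ("192.0.2.1", "alice"),
    ("192.0.2.7", "alice"),   # same person, new session, new address
    ("192.0.2.12", "alice"),  # and again
    ("192.0.2.1", "bob"),     # alice's old address, reassigned to someone else
    ("192.0.2.9", "carol"),
    ("192.0.2.9", "carol"),   # same session, second access
]


def summarise(log):
    """Return (accesses, distinct IP addresses, distinct people)."""
    accesses = len(log)
    distinct_ips = len({ip for ip, _ in log})
    distinct_people = len({person for _, person in log})
    return accesses, distinct_ips, distinct_people


print(summarise(LOG))  # (6, 4, 3)
```

Six accesses, four addresses, three people: none of the figures the site can actually measure matches the one it wants.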
There’s no reliable relationship between access
counts and actual views – there’s no way of determining one, and no
way to get even close to an accurate count.
So why have counters?
That’s the $64,000 question. I would be very
wary of having visible counters on any Web site – they’re not going
to be even slightly accurate, and your visitors are not likely to
understand everything that you do (now) about how they work, and why
they can never be trusted. At the very best, they’re only a very
rough guide as to whether or not you’re actually getting any
visitors at all. To actually base a strategy for your site on a
counter would be folly indeed, unless you were absolutely certain
that you’d forced a no-cache policy that worked on every last file
in your site.
Trust me – that’s all but impossible, unless you
know the internal workings of every form of cache that’s in use by
every ISP on the planet.
The bottom line is this: Counters are, at best,
misleading, and, at worst, totally, completely, and utterly useless.
If you absolutely must have one, treat it as a bit of fun, and don’t
let your visitors see it.
This page has been viewed 135,790,876,330,749,001 times since you
logged on... yeah, right!
David Dorn