Hi all,
I run a server hosting several websites with low and fairly constant traffic. Last night my monitors showed a sudden increase in traffic and, more surprisingly, database usage went up ten-fold compared with the average of the preceding weeks (though, like I said, these are very low-traffic sites).
Naturally I checked the logs, and it turns out that Googlebot is making these hits:
Code:
07/Feb/2005:08:42:46 +0100 "GET /forum/index.php?sid=some-number HTTP/1.1" 200 19979 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
07/Feb/2005:08:42:52 +0100 "GET /forum/index.php?sid=some-other-number HTTP/1.1" 200 19979 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
That is about 300 hits an hour, for about 6 hours so far (and it is still going on).
Of course this would be fine if it weren't for the weird URLs.
I obfuscated them for privacy (it is not my site), but I checked manually and there is no "clickable" way to get thousands of different "sid" values in the URL. It is a forum, but none of the Googlebot hits actually seem to view a forum or a posting. They all just hit the index page with a new sid (session ID, I presume) and therefore all return the same page size (the forum main page).
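For anyone who wants to check the same pattern in their own logs, here is a minimal sketch. It assumes log lines shaped like the excerpt above; the regex and the function name `summarize_googlebot_hits` are hypothetical, not part of any existing tool. It counts Googlebot hits, distinct "sid" values, requested paths, and response sizes.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Assumption: each line contains the request, status, size, referrer and
# user-agent fields in the same order as the excerpt above.
LOG_RE = re.compile(
    r'"GET (?P<url>\S+) HTTP/[\d.]+" (?P<status>\d{3}) (?P<size>\d+) '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

def summarize_googlebot_hits(lines):
    """Summarize hits whose user-agent string mentions Googlebot."""
    hits = 0
    sids = set()
    paths = Counter()
    sizes = Counter()
    for line in lines:
        m = LOG_RE.search(line)
        if m is None or "Googlebot" not in m.group("agent"):
            continue  # skip unparseable lines and other clients
        hits += 1
        parts = urlsplit(m.group("url"))
        paths[parts.path] += 1
        sizes[int(m.group("size"))] += 1
        # Collect every sid value found in the query string.
        sids.update(parse_qs(parts.query).get("sid", []))
    return {
        "hits": hits,
        "distinct_sids": len(sids),
        "paths": dict(paths),
        "sizes": dict(sizes),
    }
```

If "distinct_sids" is close to "hits" while "paths" and "sizes" each hold a single entry, every request carried a fresh sid and got the same page back, which is the behaviour described above.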
Does anyone have any idea what could be causing this? I assume Google is supposed to be indexing the forum postings, but it never seems to get there.
(While just clicking around on the site, it is easy to get to a posting, and I can't even get a URL with a different "sid" just by clicking...)
I wonder where Google gets its "sid" from.
Or is this a Googlebot look-alike that is trying to capture sessions or something?
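One way to tell a real Googlebot from a look-alike is a reverse DNS lookup on the client IP followed by a forward lookup to confirm it. The sketch below assumes the client IP is available from the full log (it is not shown in the obfuscated excerpt) and that genuine crawler hostnames end in googlebot.com or google.com; the function name `is_real_googlebot` and its resolver parameters are hypothetical, added so the check can be tried without network access.

```python
import socket

def is_real_googlebot(ip, reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Reverse-resolve ip, check the domain, then forward-resolve to confirm."""
    try:
        host = reverse(ip)[0]  # (hostname, aliases, addresses)
    except OSError:
        return False  # no reverse DNS record
    # Assumption: genuine crawler hosts live under these two domains.
    if not host.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        addresses = forward(host)[2]  # IPv4 addresses for the hostname
    except OSError:
        return False
    # The forward lookup must map back to the original IP.
    return ip in addresses
```

A user-agent string alone proves nothing, since any client can send it; the forward lookup matters because whoever controls an IP range can set its reverse DNS to any name.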
Follow-up: I e-mailed Google on the subject. I'll post whatever comes of that.