[Fixed] Google banned from crawling the forum
I was trying to search the forum with site:forums.informaction.com today, and noticed no useful results. Searching for 'noscript' finds this:
InformAction Forums • Information - NoScript
forums.informaction.com/ - Cached - Similar
Information. You have been permanently banned from this board. Please contact the Board Administrator for more information. A ban has been issued on your ...
https://encrypted.google.com/search?out ... .com&gbv=1
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0
Re: Google banned from crawling the forum
"You have been permanently banned from this board."
I noticed that on a different board & was like, huh?
Thinking it is just something going on, like there isn't enough, with phpBB boards.
I just ignored that message & logged in normally.
Or maybe that message is just meant for robots.txt?
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.19) Gecko/20110420 SeaMonkey/2.0.14 Pinball NoScript FlashGot AdblockPlus
Mozilla/5.0 (Windows NT 5.1; rv:26.0) Gecko/20100101 SeaMonkey/2.23a2
- GµårÐïåñ
- Lieutenant Colonel
- Posts: 3365
- Joined: Fri Mar 20, 2009 5:19 am
- Location: PST - USA
Re: Google banned from crawling the forum
This may indicate that at some point crawling access by bots was open (or *), meaning there was either no robots.txt or one with no deny rules. Later a robots.txt may have been added with deny rules, and/or updated to deny bots of certain types, and future updates to the cache were met with that message. It should update or flush over time as the change propagates.
~.:[ Lï£ê ï§ å Lêmðñ åñÐ Ì Wåñ† M¥ Mðñê¥ ßå¢k ]:.~
________________ .: [ Major Mike's ] :. ________________
Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.1.0.0 Safari/537.36
- access2godzilla
- Senior Member
- Posts: 109
- Joined: Sun May 20, 2012 5:09 pm
Re: Google banned from crawling the forum
The issue still exists; it must be resolved ASAP. Could one of the mods, or maybe Giorgio, look into the problem?
@Guardian: it can't be about robots.txt; it entirely depends upon the application to respect it. It must be phpBB's internal blacklist that's behind this.
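To illustrate the point that robots.txt is only honored on the crawler's side, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content and the helper name `crawler_may_fetch` are made up for illustration; they are not taken from this forum's configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that asks every crawler to stay away from the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

def crawler_may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return what a well-behaved crawler would decide for this URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# The decision is made by the client; the server is never asked to enforce it.
allowed = crawler_may_fetch(ROBOTS_TXT, "Googlebot", "https://forums.informaction.com/")
print(allowed)
```

A crawler that obeys such a file simply never requests the page, so it would have nothing to cache. A cached "permanently banned" page therefore suggests the request was made and the board itself answered with the ban message.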
Mozilla/5.0 (Windows NT 6.1; rv:21.0) Gecko/20130401 Firefox/21.0
- Giorgio Maone
- Site Admin
- Posts: 9454
- Joined: Wed Mar 18, 2009 11:22 pm
- Location: Palermo - Italy
Re: Google banned from crawling the forum
I'm checking the blacklisted IPs.
It might take a while: there are thousands of them and I need to perform a reverse DNS lookup for each.
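A rough sketch of what such a per-IP check could look like in Python, assuming the usual reverse-then-forward DNS verification for Googlebot. The function names are hypothetical, the resolver functions are injectable so the logic can be exercised without network access, and `socket.gethostbyname_ex` only covers IPv4.

```python
import socket

# Host name suffixes assumed to identify Google's crawlers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")

def looks_like_google_host(hostname: str) -> bool:
    """True if the reverse-DNS name falls under one of the assumed Google domains."""
    return hostname.rstrip(".").lower().endswith(GOOGLE_SUFFIXES)

def is_verified_googlebot(ip, reverse=socket.gethostbyaddr,
                          forward=socket.gethostbyname_ex) -> bool:
    """Reverse lookup, suffix check, then forward lookup that must return the same IP."""
    try:
        hostname = reverse(ip)[0]
        if not looks_like_google_host(hostname):
            return False
        return ip in forward(hostname)[2]
    except OSError:
        # socket.herror and socket.gaierror are OSError subclasses: lookup failed.
        return False
```

Running this over a list of banned IPs would flag any entry that resolves to a crawler host, at the cost of two DNS queries per address.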
[EDIT]:
actually there were just 3 IP bans, and almost 40,000 user bans.
Yet, I've got no idea how Googlebot got banned, exactly, since as far as I know it shouldn't try to log in (or should it?)
[EDIT2]:
OK, I've found it. Apparently known search bots get assigned a conventional user account by phpBB, and Googlebot's (userid=16) has been banned by someone (not me) with reason: spam
Reactivating...
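For anyone wanting to check a board for the same problem, the lookup can be sketched as a join between the ban list and the bot accounts. This example runs against an in-memory SQLite database with a heavily simplified schema; the table and column names are modeled on phpBB 3.x but are assumptions here and should be verified against the actual installation. Only userid=16 for Googlebot comes from the post above; the other rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified stand-ins for the real tables (assumed names and columns).
CREATE TABLE phpbb_users   (user_id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE phpbb_bots    (bot_id INTEGER PRIMARY KEY, bot_name TEXT, user_id INTEGER);
CREATE TABLE phpbb_banlist (ban_id INTEGER PRIMARY KEY, ban_userid INTEGER,
                            ban_ip TEXT, ban_reason TEXT);

INSERT INTO phpbb_users VALUES (16, 'Google [Bot]'), (17, 'Example [Bot]'),
                               (5001, 'spammer123');
INSERT INTO phpbb_bots VALUES (1, 'Google [Bot]', 16), (2, 'Example [Bot]', 17);
INSERT INTO phpbb_banlist VALUES (1, 16, '', 'spam'), (2, 5001, '', 'spam');
""")

def banned_bot_accounts(db):
    """List user bans whose target is one of the built-in bot accounts."""
    return db.execute("""
        SELECT u.user_id, u.username, b.ban_reason
        FROM phpbb_banlist AS b
        JOIN phpbb_bots  AS bots ON bots.user_id = b.ban_userid
        JOIN phpbb_users AS u    ON u.user_id = b.ban_userid
        ORDER BY u.user_id
    """).fetchall()

print(banned_bot_accounts(conn))  # [(16, 'Google [Bot]', 'spam')]
```

Filtering on the bot table keeps the result small even when the ban list holds tens of thousands of ordinary user bans.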
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0
- GµårÐïåñ
- Lieutenant Colonel
- Posts: 3365
- Joined: Fri Mar 20, 2009 5:19 am
- Location: PST - USA
Re: Google banned from crawling the forum
access2godzilla wrote:@Guardian: it can't be about robots.txt; it entirely depends upon the application to respect it. It must be phpBB's internal blacklist that's behind this.
Read what I said; I didn't say robots.txt was the issue. I said it might have been any number of changes to the way crawlers behave on the site, which turned out to be true, since Giorgio just posted that it was due to Googlebot being blocked by someone, which is a very rookie mistake. So it's been resolved; let's move on.
~.:[ Lï£ê ï§ å Lêmðñ åñÐ Ì Wåñ† M¥ Mðñê¥ ßå¢k ]:.~
________________ .: [ Major Mike's ] :. ________________
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0
- GµårÐïåñ
- Lieutenant Colonel
- Posts: 3365
- Joined: Fri Mar 20, 2009 5:19 am
- Location: PST - USA
Re: Google banned from crawling the forum
Giorgio Maone wrote:[EDIT2]:
OK, I've found it. Apparently known search bots get assigned a conventional user account by phpBB, and Googlebot's (userid=16) has been banned by someone (not me) with reason: spam
Reactivating...
This is why I only ban people that ACTUALLY post SPAM, unlike some of my fellow mods who think that once an account is created, we should get them before they post. Sometimes, yeah, waiting a day has resulted in having to delete 30 spam posts, but at least we don't end up crippling our users like this. So the lesson is that jumping the gun can have unintended consequences. And not to sound like a broken record, but IP blocking is valid only in VERY rare circumstances and should not be used as the primary method of banning.
Thank you Giorgio for taking care of this, sorry for the hassle.
~.:[ Lï£ê ï§ å Lêmðñ åñÐ Ì Wåñ† M¥ Mðñê¥ ßå¢k ]:.~
________________ .: [ Major Mike's ] :. ________________
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0
Re: [Fixed] Google banned from crawling the forum
Yet just who is "Google [Bot]" ?
And how do you ban the "user" if the user doesn't show as one, & cannot, seemingly, "post"?
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.19) Gecko/20110420 SeaMonkey/2.0.14 Pinball NoScript FlashGot AdblockPlus
Mozilla/5.0 (Windows NT 5.1; rv:26.0) Gecko/20100101 SeaMonkey/2.23a2
- Giorgio Maone
- Site Admin
- Posts: 9454
- Joined: Wed Mar 18, 2009 11:22 pm
- Location: Palermo - Italy
Re: [Fixed] Google banned from crawling the forum
therube wrote:Yet just who is "Google [Bot]" ?
It's the name given by phpBB to its built-in Google Bot account, reserved for use by the bot.
There are about 50 such accounts for known bots and crawlers, all with user IDs < 50.
therube wrote: And how do you ban the "user" if the user doesn't show as one, & cannot, seemingly, "post"?
You just need to type "Google [Bot]" in the "ban by username" field (there may be other ways, though, I guess).
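The mechanism described here can be sketched roughly as follows: each request's User-Agent is matched against a table of known bots, the request is attributed to that bot's reserved account, and the normal ban check then applies to it. This is a simplified illustration, not phpBB's actual code; the table contents, the matching rule and the function names are assumptions, and only userid=16 for Googlebot comes from this thread.

```python
from typing import Optional

# Hypothetical bot table: (reserved user_id, account name, User-Agent fragment).
BOTS = [
    (16, "Google [Bot]", "Googlebot"),
    (17, "Example [Bot]", "ExampleCrawler/"),
]

# User IDs currently on the ban list (here: the mistaken Googlebot ban).
BANNED_USER_IDS = {16}

def match_bot(user_agent: str) -> Optional[int]:
    """Return the reserved user_id if the User-Agent contains a known bot fragment."""
    ua = user_agent.lower()
    for user_id, _name, fragment in BOTS:
        if fragment.lower() in ua:
            return user_id
    return None

def page_for(user_agent: str) -> str:
    """Serve the ban notice when the request maps to a banned bot account."""
    user_id = match_bot(user_agent)
    if user_id is not None and user_id in BANNED_USER_IDS:
        return "You have been permanently banned from this board."
    return "forum index"

googlebot_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
print(page_for(googlebot_ua))
```

Under these assumptions the crawler never logs in or posts; it is simply treated as the reserved account, which is why banning that username was enough to make every crawled page return the ban notice.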
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0
- GµårÐïåñ
- Lieutenant Colonel
- Posts: 3365
- Joined: Fri Mar 20, 2009 5:19 am
- Location: PST - USA
Re: [Fixed] Google banned from crawling the forum
therube wrote:Yet just who is "Google [Bot]" ?
And how do you ban the "user" if the user doesn't show as one, & cannot, seemingly, "post"?
According to Giorgio, Google Bot actually has an account. Anyone familiar with Google Analytics and Webmaster Tools knows that, given login access, it can crawl better. That is obviously not the case here, since Giorgio did not give them explicit access. Instead, the bot can detect certain platforms, such as our forum software, and when the ability exists it will create an account or use an existing one provided by the platform (i.e. phpBB), so that as a user it has access to more information to crawl than would probably be available if it were accessing the site anonymously. That account it uses for crawling is what seems to have been banned, hence causing the bot to have issues. It believed that it was explicitly being prohibited from crawling the site, so it "broke" it. Does that help?
EDIT: When I posted this, I noticed Giorgio had posted at the same time; he gave you the same answer, just a bit more concisely.
~.:[ Lï£ê ï§ å Lêmðñ åñÐ Ì Wåñ† M¥ Mðñê¥ ßå¢k ]:.~
________________ .: [ Major Mike's ] :. ________________
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0