Joomla Tips & Tricks
May 28, 2008
Google Bug Affecting Joomla Sites
Written by Steve Burge   

Back in January I mentioned that I'd met a puzzle I couldn't solve ... Google was indexing a lot of search results on Joomla sites.

The problem appeared on all kinds of Joomla sites and with all kinds of URL extensions. I just couldn't work out where the bug was in Joomla.

It turns out the bug was in Google.

The Google Bug

Google is trying a new crawling method ... automatically filling in forms such as search boxes in order to try and find new URLs. Matt Cutts discusses it here. Unfortunately, they are creating new pages as well as finding them, and in Joomla the main outcome is that random search result pages get indexed.

Example of the Problem URLs

  • Default Joomla URLs: /index.php?option=com_search&searchword=stuff
  • Default SEF URLs: /component/option,com_search/Itemid,38/index.php?searchword=stuff
  • sh404SEF: /search/newest-first.html?searchphrase=any&searchword=stuff

Solution

Add the search component to your robots.txt file. With the examples above, you would use this code:

  • Default Joomla URLs: Disallow: /*com_search
  • Default SEF URLs: Disallow: /*com_search*/
  • sh404SEF: Disallow: /search/
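
If you want to check a rule against your own URLs before relying on it, here is a minimal Python sketch of wildcard matching as Google documents it for robots.txt: `*` matches any run of characters, a trailing `$` anchors the end of the URL, and everything else is a prefix match. The function names `rule_to_regex` and `is_blocked`, the combined rule list and the last sample URL are made up for this illustration. The sketch also ignores Allow lines and User-agent groups, so treat it as a rough check and not as a copy of what Googlebot actually does.

```python
import re


def rule_to_regex(pattern):
    """Turn a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Without '$', the pattern behaves as a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(url_path, disallow_rules):
    """Return True if any Disallow pattern matches the path plus query string."""
    return any(rule_to_regex(rule).match(url_path) for rule in disallow_rules)


# Hypothetical combined rule set, for illustration only.
RULES = ["/*com_search", "/search/"]

SAMPLE_URLS = [
    "/index.php?option=com_search&searchword=stuff",
    "/component/option,com_search/Itemid,38/index.php?searchword=stuff",
    "/search/newest-first.html?searchphrase=any&searchword=stuff",
    "/index.php?option=com_content&view=article&id=1",  # made-up normal page
]

if __name__ == "__main__":
    for url in SAMPLE_URLS:
        print("blocked" if is_blocked(url, RULES) else "allowed", url)
```

One thing the sketch makes easy to see: under these matching rules, a pattern that ends in `*/` needs a literal slash somewhere after `com_search`. That is true of the SEF form of the URL but not of the plain `index.php?option=com_search` form, whereas `/*com_search` with no trailing slash covers both.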


Comments  

 
#1 Tina 2008-05-28 17:28
Thank you for putting such useful information online and for pointing out that this was Google. Anyway, I think you should have written a detailed email to Google describing the problems so that they can fine-tune their services.

Thanks for your post again

Regards - Tina
 
 
#2 Sean Cook 2008-05-28 17:37
Thanks for the heads-up Steve!! ;-)
 
 
#3 WadeO 2008-05-29 13:24
I use Joomla to host a computer discount site called ConsumerCowboy, and I noticed that a ton of new search queries were being indexed. I have had trouble getting any sort of PG, and this may be because of this and because I did not have nofollow on all my banners.

This is good advice. I will be placing the code in my robots.txt.

thanks
 
 
#4 Steve Burge 2008-05-30 09:22
Hi Wade - glad to hear you're fixing it.

I see 300,000 problem URLs with basic Joomla URLs:
http://www.google.com/search?hl=en&q=allinurl: index.php?option=com_search

I'm sure there are a lot more with other URL variations.
 
 
#5 C. Scott Lovejoy 2008-05-30 14:05
Awesome find, thank you Steve!
 
 
#6 Chirag 2008-06-03 23:05
Hello,

I am writing on alledia.com for the very first time. If I am doing anything wrong, please advise me. I am not a techie guy.

Google is crawling our site, but some URLs like the following are still showing up even after we blocked them through the robots.txt file.

When we search for "folding gate ottawa", see the following URL:

http://www.google.com/search?q=folding gate ottawa&sourceid=navclient-ff&ie=UTF-8&rlz=1B3GGGL_enCA237CA238

Google shows Joomla search results, randomly created ones. Is Google creating URLs with the Joomla searchbot?

Is it due to the VirtueMart product searchbot, a permissions issue, or is it still a robots.txt problem?

Any clue how to overcome this?


 
 
#7 Ulas ALKAN 2008-06-06 20:29
Your article shows, once again, the importance of using robots.txt. Thanks.
 
 
#8 Allwin 2010-01-22 05:17
Yeah, that is correct. Most people are not using robots.txt files, fearing that they will block the search engines and so will not be listed in search results. Anyway, this article is very helpful. Thanks, Steve.
 
