A few weeks ago we mentioned some good news for people who have duplicate content on their site: a Google staff member said there would no longer be an active penalty for websites that made this particular mistake. Some people were happy. Some were dubious and believed the penalty still existed. Some simply said, "What the beep is duplicate content and how does it affect my Joomla site?" This post is for that last group of people.
What is duplicate content? When a website has several pages, all with substantially the same content.
Why is it disliked by search engines? Because spammers can use it to make their site appear much bigger to search engines than it really is. If you come across an apparently small site that has 50,000 pages indexed by Google, it's a fair bet they are using duplicate content to trick the search engines.
Do you think the penalty still exists? No, but I believe duplicate content can still cause you a lot of problems and that it should be avoided. Aaron Wall, writer of the web's most popular SEO book, has recently mentioned that you can increase your search engine ranking simply by keeping junk pages out of the index. Duplicate content produces a lot of junk pages, and eventually Google's bots will get tired of visiting your many useless URLs.
Why is it a problem with Joomla? Because Joomla has a tendency to produce many different URLs for just one page. We'll use this page as an example. The following six URLs can reach this page. Each URL has the same content and the same metadata. It's duplicate content hell:
- Regular, non-search-engine-friendly (non-SEF) URL
- Regular, non-SEF URL with a menu Itemid
- URL to make the page display as a PDF
- URL to make the page display in print view
- URL to make the page display in print view with a menu Itemid
- URL with the search-engine-friendly (SEF) URL component turned on
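To make the problem concrete, here is a minimal Python sketch that collapses several URL variants onto one key per article. The example URLs, the article id 42 and the set of query parameters treated as "presentation only" are all assumptions made for illustration. They are not the real URLs of this page, and SEF URLs would need their own mapping.

```python
from urllib.parse import urlsplit, parse_qsl

# Query parameters assumed (for this sketch only) to change how a page is
# presented, not which article is shown.
PRESENTATION_PARAMS = {"itemid", "pop", "page", "format", "tmpl", "print"}


def article_key(url):
    """Return a key identifying the content behind a non-SEF Joomla-style URL."""
    query = parse_qsl(urlsplit(url).query)
    kept = sorted(
        (name.lower(), value)
        for name, value in query
        if name.lower() not in PRESENTATION_PARAMS
    )
    return tuple(kept)


# Hypothetical URL variants for a single article.
variants = [
    "http://example.com/index.php?option=com_content&task=view&id=42",
    "http://example.com/index.php?option=com_content&task=view&id=42&Itemid=65",
    "http://example.com/index2.php?option=com_content&task=view&id=42&pop=1&page=0",
    "http://example.com/index2.php?option=com_content&task=view&id=42&pop=1&page=0&Itemid=65",
]

keys = {article_key(u) for u in variants}
print(len(variants), "URLs,", len(keys), "distinct article")
```

Running something like this over a list of your own indexed URLs gives a rough idea of how many of them are really the same page.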
Adding more components can produce even more URLs.
How can you stop your Joomla site being penalised?
- Unpublish your PDF and Print buttons for all articles.
- Use JPromoter. Analyze your site and then go to "Optimize Your Site". Search by using "Group by Same Titles". Make sure you choose "No index" and "No follow" for all but one copy of each page. This means that Google should only index the pages you want indexed.
- Start your site right by choosing one SEF URL component and sticking with it as long as you possibly can. Different SEF components often render links in different ways.
- Instead of simply creating menu links to a component, create a URL link to the SEF URL for that component. For example, instead of having a menu link to "index.php?option=com_login&Itemid=65" you can have a menu link to "login". This makes sure that only the "login" URL is read by search engines.
- If you're a spammer .... stop!
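The "group by same titles" idea from the list above can be sketched in a few lines of Python. This is not how JPromoter works internally; it only illustrates the principle. The page list is hypothetical, and keeping the first URL seen for each title is an assumption. In practice you would pick the copy you want indexed.

```python
from collections import defaultdict

# Hypothetical crawl result: (URL, page title) pairs.
pages = [
    ("/login", "Login"),
    ("/index.php?option=com_login&Itemid=65", "Login"),
    ("/about", "About Us"),
]


def robots_directives(pages):
    """Map each URL to the robots meta content it should carry.

    The first URL seen for a title is treated as the copy to keep;
    every later URL with the same title is marked noindex, nofollow.
    """
    by_title = defaultdict(list)
    for url, title in pages:
        by_title[title].append(url)

    directives = {}
    for urls in by_title.values():
        keep, *duplicates = urls
        directives[keep] = "index, follow"
        for url in duplicates:
            directives[url] = "noindex, nofollow"
    return directives


for url, content in robots_directives(pages).items():
    print(f'{url}: <meta name="robots" content="{content}">')
```

Each printed line shows the robots meta tag that would go in the head of that page, so only one copy per title stays indexable.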
Comments
As for JPromoter, contrary to what they're claiming on their website, there is the Joomla SEO Patch from joomlatwork.com which, at a fraction of the price, also gives full control over meta tags on non-com_content pages, besides doing other great things. (No I'm not affiliated.)
I'm with you again completely on item 3, and there is a strong reason for using a SEF URL component that you didn't even mention: It can rectify Joomla's inherent ItemID issues and make sure that a certain page is always reached via *ONE* URL no matter what ItemID Joomla thinks it should have. That's what I'm using OpenSEF for on almost all my sites.
OpenSEF also rewrites all internal links so you can leave them the standard way. I haven't tried SEF Advance but would imagine it operates much the same.
Thanks for the article and kind regards.
Thanks for the great comment. Open-SEF has become a great product - it really is driven by talented developers.
I normally find that my choice of SEF component is determined by which ones have sef_ext.php files for the other components we're using. For example, if we're using SOBI we go for Artio SEF, but if we're using Community Builder it's SEF Advanced.
Even though I am happy with OpenSEF, after reading several good reviews of SH404SEF I felt the need to try the component, which I can definitely say was a big mistake.
It messed up my content as well as my PageRank. After several hours of headache I just installed OpenSEF again.
regards
In version 1.5 of Joomla, there is no need to disable PDFs, as they have the nofollow attribute built into the link. The print buttons, however, do not have that attribute built in.
So what's your advice? Do you recommend disabling the print button on Joomla 1.5 sites, then?
From an accessibility and usability point of view, it's really nice having a print option!
You're right - the PDF problem is much worse than the print problem.
In fact the PDF nofollow came from an idea by XTraze.net and a post on this site:
http://www.alledia.com/blog/search-engine-optimisation-(seo)/get-out-of-joomla-pdf-hell/
One easy way to remove the print pages might just be to use robots.txt and insert this:
disallow: /*print*
Printing through a "print" CSS is obviously the best and cleanest way - but Joomla doesn't support that out of the box. You have to switch off the print icon and make your own "print" CSS for this method to work.
In Joomla 1.5, disallowing index2.php doesn't work any more since they stopped using the index2.php method and now run everything through index.php. Steve's tip will remove the print pages, and with an additional
disallow: /*pdf*
you can remove the PDF pages as well if you don't trust the nofollow method.
Kind regards,
Zorro
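For anyone wondering what those wildcard rules would actually catch, here is a rough Python sketch of wildcard matching for Disallow patterns. It assumes the wildcard behaviour Google documents ("*" matches any run of characters, a trailing "$" anchors the end of the URL). Other crawlers may not honour wildcards at all. The function names and the example paths are hypothetical.

```python
import re


def pattern_to_regex(pattern):
    """Translate a wildcard Disallow pattern into a compiled regex.

    '*' matches any run of characters and a trailing '$' anchors the end
    of the path; everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path, disallow_patterns):
    """Return True if any Disallow pattern matches from the start of the path."""
    return any(pattern_to_regex(p).match(path) for p in disallow_patterns)


rules = ["/*print*", "/*pdf*"]

# Hypothetical paths on a Joomla 1.5 site.
paths = [
    "/index.php?view=article&id=42&tmpl=component&print=1",
    "/index.php?view=article&id=42&format=pdf",
    "/index.php?view=article&id=42",
    "/blog/printing-tips",
]

for path in paths:
    print(path, "->", "blocked" if is_blocked(path, rules) else "allowed")
```

Note the last path: a pattern as broad as /*print* also blocks any ordinary page that merely has "print" somewhere in its URL, so it is worth checking your own URLs before relying on it.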
disallow: /*pdf*
/*print*
Is that right?
Kind regards,
Zorro
Can one set noindex and nofollow for all but one URL without JPromoter, i.e. simply using OpenSEF?