03 May 2007

Alledia.com Blog
Get out of Joomla PDF Hell
Search Engine Optimisation (SEO)
Written by Steve Burge   
One of Joomla's major causes of duplicate content is the PDF generator. Brian Teeman has even pointed out that when he does his in-depth searches for Joomla Weekly News, he finds many PDF pages ranking higher than the original pages.

 

The problem is so bad, and the PDF so useless, that if you check the demo of Joomla 1.5, you'll see that it's about to be dropped. For those of us running the current version of Joomla, what do we do to avoid Joomla PDF hell?

 

  1. Unpublish the PDFs completely.
  2. Use robots.txt to stop Google from picking up the PDF pages.
  3. A very simple but useful tip from XTraze.net: add rel="nofollow" to the PDF links. No-follow is often used by sites that suffer heavy spam attacks or have lots of extra pages that can reduce the value of their site as a whole.

 

Open up /components/com_content/content.html.php and find the PDF link:

    <a href="<?php echo $link; ?>" target="_blank" onclick="window.open('<?php echo $link; ?>','win2','<?php echo $status; ?>'); return false;" title="<?php echo _CMN_PDF;?>">


Add the rel="nofollow" attribute:

    <a href="<?php echo $link; ?>" rel="nofollow" target="_blank" onclick="window.open('<?php echo $link; ?>','win2','<?php echo $status; ?>'); return false;" title="<?php echo _CMN_PDF;?>">
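
After editing the template, it is worth confirming that every PDF link on a rendered page actually carries the attribute. The following is a minimal Python sketch, not part of Joomla: the names `NofollowAudit` and `links_missing_nofollow` and the `do_pdf=1` marker are assumptions for illustration, so change the marker to whatever your own PDF URLs contain.

```python
from html.parser import HTMLParser


class NofollowAudit(HTMLParser):
    """Collect hrefs of <a> tags that contain `marker` but lack rel="nofollow"."""

    def __init__(self, marker):
        super().__init__()
        self.marker = marker
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href") or ""
        # rel may hold several space-separated tokens, in any letter case
        rel_tokens = (attr.get("rel") or "").lower().split()
        if self.marker in href and "nofollow" not in rel_tokens:
            self.missing.append(href)


def links_missing_nofollow(html, marker="do_pdf=1"):
    """Return the hrefs of matching links that have no nofollow token."""
    parser = NofollowAudit(marker)
    parser.feed(html)
    parser.close()
    return parser.missing


# Hypothetical page fragment: the first link is fixed, the second is not.
sample = (
    '<a href="index2.php?option=com_content&amp;do_pdf=1&amp;id=12" rel="nofollow">PDF</a>'
    '<a href="index2.php?option=com_content&amp;do_pdf=1&amp;id=13">PDF</a>'
)
print(links_missing_nofollow(sample))
```

Feed it the HTML of any page on your site; an empty list means no matching link was found without the attribute.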


 

Comments (15)
Anthony Olsen
May 03, 2007

Yes, it's funny, isn't it? This was one of the features that attracted me to Joomla when I first saw it ... I thought wow! People can print, email or PDF your text, that's got to be useful. But I have never used it once and neither have any of my users (as far as I'm aware) ...

Johan Janssens
May 03, 2007

Actually, for 1.5 we have completely refactored the PDF library. It now supports images and is fully internationalised. I have made a quick change to 1.5 and added the nofollow to the PDF links. Thanks for the tip!

Steve Burge
May 03, 2007

Thanks Johan - that's great news.

It's not a 100% perfect solution ... no-follow doesn't work on Ask.com, but fortunately I don't think anyone uses them anymore :)

Should work fine on Google, Yahoo and MSN.

Steve

sean
May 09, 2007

It all makes sense now. I was checking the "friendly" URLs in Open-SEF on one of my sites and I saw about 400 extra URLs that could hardly be termed friendly. The week before, when I'd initially implemented Open-SEF, those URLs weren't there. Does that mean I have hundreds of PDFs lurking somewhere on my site, and, if so, where?
This remains my number one rated blog for SEO, period!

XTraze
May 17, 2007

I just went through that file's source and found this, but I couldn't find the Print button's link in there. Can you let me know if you found it?

Thanks for these lines mate.

A very simple, but useful tip from XTraze.net. He suggests simply adding a "no-follow" to the PDF links. No-follow is often used by sites that suffer heavy spam attacks or have lots of extra pages that can reduce the value of their site as a whole.

Gavin
May 30, 2007

I have been thinking about this and I'm not 100% sure if this is causing any problems. I just read this article on WebmasterWorld: http://www.webmasterworld.com/forum44/711.htm

I don't publish the PDF option in Joomla myself. If I don't publish the PDF option, is it still necessary to add the rel="nofollow"?

Steve Burge
May 30, 2007

Hi Gavin

Thanks for the interesting link.

Even if they are right (I'd still disagree and say it can still cause problems with Google spidering your site), there's still the problem of PDF pages ranking above regular pages.

For example, you want to find the Joomla.org article about 100,000 forum members. Try searching Google for "joomla.org 100000" and the PDF comes up first. The original article is nowhere to be seen.

This means that visitors won't go to your site - they'll download the PDF. In all likelihood you've lost that visitor.

Keith Schilling
July 12, 2007

Hate to rehash an old topic, but Google treated my URLs just fine, as I've never had the PDF button published... HOWEVER, when using Site Explorer with Yahoo I have PDF hell everywhere. So, I'll try this trick and see if Yahoo treats me better.

rachel
November 22, 2007

thank you so much for all the info on this site. i just found it through google and love it! it's definitely going in my bookmarks.

Kingdom
April 29, 2008

I found I had an extra > which showed on the page when I changed the code!

Dking
May 23, 2008

Be careful if you are copying/pasting this code. I had several character errors with the ' character.

Wärmepumpe
September 17, 2008

Hello, thanks for this good tip. Where can I set a nofollow tag for the print page? I haven't found it in content.html.php. Where can I find the print page link, and where can I set a nofollow tag on it? Big thanks!

TimTim2
January 06, 2009

2. Use robots.txt to stop Google from picking up the PDF pages.

How?

Steve Burge
January 06, 2009

Enter this into your robots.txt file:

Disallow: /*.PDF$

TimTim2
January 07, 2009

Yikes.. so which is a good one?
Disallow: /*.PDF$
Or
disallow: /*pdf*
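
To see how the two patterns in the question above differ, here is a rough Python sketch of the wildcard matching that the major search engines document for robots.txt: `*` matches any run of characters, a trailing `$` anchors the end of the URL, and matching is case-sensitive. The function name and the sample URLs (including the `index2.php?...do_pdf=1` form) are assumptions for illustration, not taken from any crawler's code.

```python
import re


def robots_pattern_matches(pattern, url_path):
    """Return True if a robots.txt path pattern matches url_path.

    '*' matches any run of characters, a trailing '$' anchors the end,
    and the match is case-sensitive and starts at the beginning of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each '*'
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += r"\Z"
    return re.match(regex, url_path) is not None


# Hypothetical URLs; replace them with real PDF links from your own site.
sample_paths = [
    "/docs/GUIDE.PDF",
    "/docs/guide.pdf",
    "/index2.php?option=com_content&do_pdf=1&id=12",
    "/blog/pdf-tips.html",
]

for rule in ("/*.PDF$", "/*pdf*"):
    for path in sample_paths:
        print(rule, path, robots_pattern_matches(rule, path))
```

Under these assumptions, `/*.PDF$` only catches URLs that end in an upper-case `.PDF`, while `/*pdf*` catches any URL containing a lower-case `pdf`, including ordinary pages such as the hypothetical `/blog/pdf-tips.html`. Either rule also has to sit under a `User-agent:` line, so check the pattern against your own PDF URLs before relying on it.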
