DAM Assets Issue on disabling Google search | Community
BhargavThogata
New Participant
October 16, 2015
Solved

DAM Assets Issue on disabling Google search

  • October 16, 2015
  • 7 replies
  • 3530 views

Hi,


The requirement is:

We need to keep a CQ page out of Google search results. For that we have added <meta name="robots" content="noindex"> at the template level and provided a checkbox in page properties so that an author can disable search for a specific page.
Now the challenge is assets (PDFs). I added a custom checkbox but don't know where to place this <meta> tag, since assets don't have templates the way pages do. Need help on this.

 

Thanks,
Bhargav
7 replies

edubey
Accepted solution
New Participant
October 16, 2015

Yes,

As you said, you have implemented a checkbox for the asset. Your next task is to implement an event handler / workflow / scheduler that updates the robots.txt file dynamically.

Example:

Whenever a user checks or unchecks the custom checkbox, the value is stored in the JCR. The event handler / workflow / scheduler then reads that stored property and updates the robots.txt file accordingly.
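
To make the generation step concrete, here is a minimal sketch in Python of turning a list of flagged asset paths into robots.txt content. It is only an illustration of the logic: the function name build_robots_txt is hypothetical, and in AEM the same logic would live inside your event handler / workflow / scheduler and read the flagged paths from the JCR property your checkbox writes.

```python
def build_robots_txt(blocked_paths):
    """Return robots.txt text that disallows each given path for all crawlers.

    blocked_paths is assumed to be the list of asset paths whose
    custom checkbox is ticked (how you collect them is up to you).
    """
    # De-duplicate and sort so the generated file is stable between runs.
    paths = sorted({p.strip() for p in blocked_paths if p.strip()})

    lines = ["User-agent: *"]
    if paths:
        lines.extend(f"Disallow: {path}" for path in paths)
    else:
        # An empty Disallow value means nothing is blocked.
        lines.append("Disallow:")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    flagged = ["/content/dam/geometrixx/documents/GeoSphere_Datasheet.pdf"]
    print(build_robots_txt(flagged), end="")
```

Regenerating the whole file from the current list of flagged assets, rather than editing it line by line, means unchecking the box removes the rule automatically on the next run.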

Have doubts? Let me know.

edubey
New Participant
October 16, 2015

You're welcome :)

edubey
New Participant
October 16, 2015

Hi Bhargav,

A similar question was asked some time back, where a user did not want PDFs to be searchable.

Please see the thread and let me know if you have any doubts about it.

Thread : http://help-forums.adobe.com/content/adobeforums/en/experience-manager-forum/adobe-experience-manager.topic.html/forum__xdoz-hi_im_runningcq.html

Thanks

BhargavThogata
New Participant
October 16, 2015

If we need to test whether it's working properly, how can we do that?

 

Thanks,

Bhargav

BhargavThogata
New Participant
October 16, 2015

So I need to update the robots.txt at /content/<project> with the asset path (for example /content/dam/geometrixx/documents/GeoSphere_Datasheet.pdf)?

One more doubt: the <meta> tag I mentioned in the question is no longer needed, right, since we are controlling this from the checkbox?

Please correct me if I am wrong.

 

Thanks,

Bhargav

edubey
New Participant
October 16, 2015

Yes, you are on the right track. You need to list your PDF file path in robots.txt.

PDFs have no <meta> tag like pages do, but as you mentioned, you can still use that tag for pages.

Here is more for you:

1) Use robots.txt to block the files from search engine crawlers:

User-agent: *
Disallow: /pdfs/     # Block the /pdfs/ directory.
Disallow: /*.pdf$    # Block PDF files. Wildcards are non-standard but work for major search engines.

2) Use rel="nofollow" on links to those PDFs

<a href="something.pdf" rel="nofollow">Download PDF</a>

Complete Documentation: http://www.robotstxt.org/robotstxt.html
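
On the question of how to test this: one option is to run the rules through a robots.txt parser before publishing them. Below is a minimal sketch using Python's standard urllib.robotparser module. The helper name is_blocked and the second asset path are hypothetical, and the sample rules are just an illustration. As far as I know this parser matches plain path prefixes and does not interpret wildcard patterns, so the sketch uses explicit paths.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt, path, user_agent="Googlebot"):
    """Return True if the given robots.txt text disallows path for user_agent."""
    parser = RobotFileParser()
    # parse() accepts the file content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, path)


# Sample rules for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdfs/
Disallow: /content/dam/geometrixx/documents/GeoSphere_Datasheet.pdf
"""

if __name__ == "__main__":
    for path in (
        "/content/dam/geometrixx/documents/GeoSphere_Datasheet.pdf",
        "/content/dam/geometrixx/documents/Other.pdf",  # hypothetical asset
        "/pdfs/brochure.pdf",  # hypothetical asset
    ):
        state = "blocked" if is_blocked(ROBOTS_TXT, path) else "allowed"
        print(f"{path}: {state}")
```

This only confirms how a crawler would read the rules. It does not tell you whether a search engine has already indexed the file, so also check the live robots.txt URL on your published site.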

Any doubts? Let me know.

thanks