The following post is an update to the original ‘SharePoint 2007 and Adobe PDF’ post written in 2007. These notes are based on SharePoint 2010 Beta 2 (made publicly available in November 2009). Once the product has officially launched on 12 May 2010, an update will be posted if any changes are made to the process. The process is very similar to SharePoint 2007, with minor changes to the folder location (14 instead of 12) and a slightly different administration user interface in the browser.
SharePoint Server 2010, like its predecessors, includes indexing and search capabilities. What doesn’t come out of the box is the ability to index and search PDF documents. PDF is a format owned by Adobe, not Microsoft. If you want to be able to find Adobe PDF documents, or have the PDF icon appear when viewing PDF files in a SharePoint document library (see image above), you will need to set it up yourself. This post describes how.
- Download and install Adobe’s 64-bit PDF iFilter*1 – http://www.adobe.com/support/downloads/detail.jsp?ftpID=4025
- Download the Adobe PDF icon (select Small 17 x 17) – http://www.adobe.com/misc/linking.html
- Give the icon a name or accept the default: ‘pdficon_small.gif’
- Save the icon (or copy to) C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\TEMPLATE\IMAGES
- Edit the DOCICON.XML file to include the PDF icon
- In Windows Explorer, navigate to C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\TEMPLATE\XML
- Open the DOCICON.XML file for editing (I use Notepad; you can also use the built-in XML Editor)
- Ignore the section <ByProgID> and scroll down to the <ByExtension> section of the file
- Within the <ByExtension> section, insert a <Mapping Key="pdf" Value="pdficon_small.gif" /> element. The easiest way is to copy an existing one: I usually copy the line that starts <Mapping Key="png"… and replace the values for Key and Value (see image below)
- Save and close the file
- Add PDF to the list of supported file types within SharePoint
- In the web browser, open SharePoint Central Administration
- Under Application Management, click on Manage service applications
- Scroll down the list of service apps and click on Search Service Application
- Within the Search Administration dashboard, in the sidebar on the left, click File Types
- Click ‘New File Type’ and enter PDF in the File extension box. Click OK
- Scroll down the list of file types and check that PDF is now listed and displaying the pdf icon.
- Close the web browser
- Stop and restart Internet Information Services (IIS)*2. Note: this will temporarily take SharePoint offline. Open a command prompt (Start – Run – enter ‘cmd’) and type ‘iisreset’
- Perform a full crawl of your index. Note: An incremental crawl is not sufficient when you have added a new file type. SharePoint only indexes file names with the extensions listed under File Types and ignores everything else. When you add a new file type, you then have to perform a full crawl to forcibly identify all files with the now relevant file extension.
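If you would rather script the DOCICON.XML edit than make it by hand, that step could be sketched in Python roughly as follows. This is only a minimal sketch and not part of the process I have tested: the function name `add_icon_mapping` and the trimmed sample XML are my own illustration, and the sample stands in for the real file, which has many more entries. The sketch works on a string, so to apply it for real you would read and write DOCICON.XML yourself. Back up the original first, because re-serialising the XML may change its formatting and drop comments.

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative stand-in for DOCICON.XML (assumption: the real file
# has the same two sections but many more <Mapping> entries).
SAMPLE_DOCICON = """<DocIcons>
  <ByProgID>
    <Mapping Key="Excel.Sheet" Value="ichtmxls.gif" />
  </ByProgID>
  <ByExtension>
    <Mapping Key="png" Value="icpng.gif" />
  </ByExtension>
</DocIcons>"""


def add_icon_mapping(xml_text, extension, icon_file):
    """Return xml_text with a <Mapping> for `extension` under <ByExtension>.

    If a mapping for the extension already exists, its Value is updated
    instead, so running the function twice does not create duplicates.
    The <ByProgID> section is left alone.
    """
    root = ET.fromstring(xml_text)
    by_extension = root.find("ByExtension")
    if by_extension is None:
        raise ValueError("No <ByExtension> section found")

    for mapping in by_extension.findall("Mapping"):
        if mapping.get("Key", "").lower() == extension.lower():
            mapping.set("Value", icon_file)
            break
    else:
        # No existing entry for this extension, so append a new one.
        ET.SubElement(by_extension, "Mapping",
                      {"Key": extension, "Value": icon_file})

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(add_icon_mapping(SAMPLE_DOCICON, "pdf", "pdficon_small.gif"))
```

The icon file itself still has to be copied into the IMAGES folder, and the change only shows up after the IIS reset described above.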
* Additional Notes:
- At time of writing (March 2010), Adobe has published PDF iFilter 9 for 64-bit applications, tested on SharePoint 2007 but not yet listed as tested on SharePoint 2010. So far, it is working fine on my builds of SharePoint 2010 (Beta versions)
- When setting this up, I initially just restarted the search service rather than IIS but found myself locked out of SharePoint. Resetting IIS fixed it. I don’t know for certain if you also need to restart the search service. Will test on the next build and update here.
- As with SharePoint 2007, there are alternative PDF iFilters. The best known is Foxit Pro – http://www.foxitsoftware.com/. It is rumoured to index faster than Adobe’s iFilter. I can’t comment, as I haven’t tested it. Given that PDFs don’t change (they are usually PDFs specifically so that they are not edited), they are only indexed when first uploaded or when you perform a full crawl. Most organisations should primarily be performing incremental crawls, updating the index with content that has been added or changed rather than re-indexing everything.
- An absolute cheat for getting round the need to do registry edits is to install Adobe Reader on your server…
- There’s a PowerShell script for doing all of this, see Johan Skoglund’s blog – Use Powershell to configure PDF search in SharePoint 2010. I haven’t tried it yet and will give it a go on the next demo build, and it’s been confirmed in the comments
- Another suggestion from the comments – you don’t have to use the default icons (but do make sure whatever ones you do use are licensed or free to use). Thanks to Jon for suggesting http://www.iconmaker.com where you can search for icons free to use commercially. Nothing stopping you from using your own icons for all the different file types…