Desktop Search IFilters


Windows Desktop Search uses plug-ins called IFilters to index new file types. IFilters are also used by several other Microsoft products, including Index Server, SharePoint, and SQL Server. By downloading new IFilters (for example, from http://addins.msn.com) you can search more file types. You can even write your own!

What filetypes does Windows Desktop Search index by default?


Windows Desktop Search comes with IFilters for the following file types (see the official list):

Document Type | IFilter DLL
ASCX, ASP, ASPX, CSS, HHC, HTA, HTM, HTML, HHT, HTW, HTX, ODC, STM | nlhtml.dll
DOC, DOT, POT, PPS, PPT, XLB, XLC, XLS, XLT | offfilt.dll
TXT, ASM, BAT, C, CPP, CXX, CMD, DEF, DIC, H, HPP, XML, ... as plain text | query.dll
RTF | rtffilt.dll
EML | mimefilt.dll

It can also index WMA, MP3, and JPG files, because the shell provides document properties for those filetypes. This may prevent custom IFilters for these filetypes from working - see DesktopSearchBugReports for details.


What other filetypes might work?


Several Microsoft applications add their own IFilters when they're installed. This means that any files you create with those applications will automatically be indexed by Windows Desktop Search.

Document Type | IFilter installed by
MDI, TIF, TIFF | Microsoft Office Document Imaging
ONE | OneNote 2003
JNT | Tablet PC Journal application


Where can I download new IFilters?


Any well-written IFilter should work with Windows Desktop Search. The MSN team have links to several of them at http://addins.msn.com

Note that when you install a new IFilter, Windows Desktop Search won't automatically re-index your existing documents. You can force a re-index by going to Desktop Search Options and selecting Rebuild Index, or by moving the files so that Desktop Search thinks that they have changed.

Document Type | Download From
CAB | Citeknet
CEL | Alna AB
CHM | Citeknet
DAT (Palm Desktop) | Bloggit
DGN | Alna AB
DWF | IFilterShop
DWG | Autodesk, CAD & Company
GIF | IFilterShop
EPS, PS, PSD | IFilterShop
EXE | Citeknet
HLP | Citeknet
JPG, JPEG | AimingTech, IFilterShop, PixVue
LF4T | LB
MHT | Citeknet
MP3 | meticulus
MPP | Net Intent
MSG | Alna AB, Hallogram Publishing, IFilterShop
PNG | IFilterShop
PDF | Adobe, IFilterShop
PRT | Net Intent
RAR | Alna AB, Citeknet
RTF | Microsoft
SHTML | IFilterShop
SLDPRT, SLDDRW, SLDASM | Net Intent
SVG | IFilterShop
TIF, TIFF | PixVue, IFilterShop
VCF | IFilterShop
VDX, VSD, VSS, VST, VSX, VTS | Microsoft
WMA, WMV | IFilterShop
WP | Corel
XML | Microsoft, QuiLogic
ZIP | 4-Share, Alna AB, Citeknet, IFilterShop

Mozilla Thunderbird email requires a Protocol Handler from Citeknet rather than an IFilter.

See also Scott Stonehouse's list of IFilters at http://www.ifilter.org/. Note that these third-party IFilters have not been tested and certified by Microsoft!


How do I see what IFilters are installed?


* Long way: Read the MSDN documentation on Finding the Filter DLL for a File.
* Fast way: Download the excellent IFilter Explorer tool from citeknet.com
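
The "long way" can be sketched in a few lines. The following is a minimal Python sketch, assuming the usual Indexing Service registration chain under HKEY_CLASSES_ROOT (extension, then persistent handler, then the IFilter entry under PersistentAddinsRegistered, then InprocServer32). The function names, the GUIDs in FAKE_REGISTRY, and foofilt.dll are hypothetical and only illustrate the walk; check the MSDN article for the authoritative steps.

```python
# IID of the IFilter interface, used as a key name under PersistentAddinsRegistered.
IID_IFILTER = "{89BCB740-6119-101A-BCB7-00DD010655AF}"


def find_filter_dll(extension, read_default):
    """Return the filter DLL registered for `extension`, or None.

    `read_default(key_path)` returns the default value of a key under
    HKEY_CLASSES_ROOT, or None if the key or value is missing.
    """
    ext = extension if extension.startswith(".") else "." + extension

    # Shortcut: a persistent handler registered directly on the extension.
    handler = read_default(ext + "\\PersistentHandler")
    if handler is None:
        # Otherwise go through the document class: .ext -> class name -> CLSID.
        prog_id = read_default(ext)
        clsid = read_default(prog_id + "\\CLSID") if prog_id else None
        if clsid:
            handler = read_default("CLSID\\" + clsid + "\\PersistentHandler")
    if handler is None:
        return None

    # The persistent handler names the CLSID of the actual filter object.
    filter_clsid = read_default(
        "CLSID\\" + handler + "\\PersistentAddinsRegistered\\" + IID_IFILTER
    )
    if filter_clsid is None:
        return None
    return read_default("CLSID\\" + filter_clsid + "\\InprocServer32")


def winreg_reader(key_path):
    """Reader backed by the real registry (Windows only)."""
    import winreg
    try:
        return winreg.QueryValue(winreg.HKEY_CLASSES_ROOT, key_path) or None
    except OSError:
        return None


# Hypothetical registry contents for a made-up .foo file type.
HANDLER = "{00000000-0000-0000-0000-0000000000A1}"
FILTER = "{00000000-0000-0000-0000-0000000000B2}"
FAKE_REGISTRY = {
    ".foo\\PersistentHandler": HANDLER,
    "CLSID\\" + HANDLER + "\\PersistentAddinsRegistered\\" + IID_IFILTER: FILTER,
    "CLSID\\" + FILTER + "\\InprocServer32": "C:\\Filters\\foofilt.dll",
}

if __name__ == "__main__":
    print(find_filter_dll("foo", FAKE_REGISTRY.get))
```

On a Windows machine, passing winreg_reader instead of FAKE_REGISTRY.get should walk the real registry. Note that this only follows the generic chain; Windows Desktop Search also consults its own keys, as described under the load order question below.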


How do I write my own IFilter?


Start with the official guide (the "Adding a New File Type" section) on the developer's page (note: this may supersede some of the information below).

* MSDN documentation:
  * Introduction to IFilters
  * Using Custom IFilters
  * How to Write a Filter
* A couple of blogs:
  * I Coffee, therefore IFilter by Michael Kaplan
  * Testing Full-Text IFilters by Andrew Cencini
* Additional information: http://www.columbiasoft.com/download/ifilters.htm

Make sure that you implement IPersistStream as well as the normal IPersistFile. To optimize your IFilter for Windows Desktop Search, you can also output additional properties such as DocAuthor (document author) when implementing the GetValue() method of the IFilter interface. Many of these properties are used to display the Desktop Search results view correctly. For example, outputting DocAuthor lets users sort documents of your file type by author in the results view. The most important properties to output are:
* DocAuthor - the document author.
* PrimaryDate - the most important or most significant date.
* DocTitle - the title that will be displayed for the item in the search results view.
* PerceivedType (see below) - ensures that your file type shows up under the right Desktop Search category.

For a complete list of supported properties used by Windows Desktop Search, see
C:\Documents and Settings\<YOURNAME>\Local Settings\Application Data\MSN Toolbar Suite\DS\Config\Schema.txt

Use the PerceivedType property to classify your file type so that users can filter their search results by category. The supported values are:
* contact
* communications
* communications/e-mail
* communications/calendar
* communications/task
* communications/im (coming soon!)
* document/note
* document
* document/text
* document/spreadsheet
* document/presentation
* music
* images
* images/picture
* images/video
* folder
* favorite
* program

When you implement the GetChunk() method of the IFilter interface, make sure that you output a propid of D5CDD505-2E9C-101B-9397-08002B2CF9AE/PerceivedType. Then make sure that the GetValue() method returns one of the above strings. For example, if you create an IFilter for a picture file format with the extension .FOO, implement GetChunk() to return D5CDD505-2E9C-101B-9397-08002B2CF9AE/PerceivedType and GetValue() to return VT_LPWSTR = "images/picture".
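
The pairing of GetChunk() and GetValue() may be easier to see in a small model. The following Python sketch is not COM code and is not how a real IFilter is written (that would be C++ against the IFilter interface); it only mimics the call sequence for the PerceivedType property. The class name FooFilterModel and its methods are hypothetical, and returning None stands in for the end-of-chunks and no-more-values results a real filter would return.

```python
# Property set GUID and property name taken from the text above.
PERCEIVED_TYPE_SPEC = "D5CDD505-2E9C-101B-9397-08002B2CF9AE/PerceivedType"

# Category strings listed above.
PERCEIVED_TYPES = {
    "contact", "communications", "communications/e-mail",
    "communications/calendar", "communications/task", "communications/im",
    "document/note", "document", "document/text", "document/spreadsheet",
    "document/presentation", "music", "images", "images/picture",
    "images/video", "folder", "favorite", "program",
}


class FooFilterModel:
    """Toy stand-in for an IFilter that reports only PerceivedType."""

    def __init__(self, perceived_type):
        if perceived_type not in PERCEIVED_TYPES:
            raise ValueError("unsupported PerceivedType: " + perceived_type)
        # Chunks still to be reported, as (property spec, value) pairs.
        self._pending = [(PERCEIVED_TYPE_SPEC, perceived_type)]
        self._value = None

    def get_chunk(self):
        """Advance to the next chunk and return its property spec.

        Returns None once every chunk has been reported.
        """
        if not self._pending:
            self._value = None
            return None
        spec, self._value = self._pending.pop(0)
        return spec

    def get_value(self):
        """Return the current chunk's value once, then None."""
        value, self._value = self._value, None
        return value


if __name__ == "__main__":
    f = FooFilterModel("images/picture")  # a .FOO picture format
    print(f.get_chunk(), "=", f.get_value())
```

Validating the category string in the constructor is a design choice of this sketch: a misspelled category would otherwise only show up as files missing from the expected Desktop Search category.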

Lastly, if you register your IFilter using the same registration method as Indexing Service, Windows Desktop Search will automatically pick up your IFilter when the end user installs it. Once your IFilter is complete and tested, make it available to end users through this forum!


What order does Windows Desktop Search load IFilters in?


Ben from Citeknet explains in this newsgroup thread:

''I think MSN DS does load IFilters the correct way (It seems to be the same way as SharePoint 2003 does, using a custom tquery.dll). The only problem I can see is that it hasn't been documented correctly. It was briefly described in the SharePoint 2001 SDK.

From what I've seen, Windows Desktop Search looks for suitable IFilters in this kind of order:
* From Extension and CLSID (at HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\RSSearch\ContentIndexCommon\Filters)
* From the content type of the file (at HKEY_LOCAL_MACHINE\SOFTWARE\Classes\MIME\Database\Content Type)
* From the extension of the file (the same way the Win32 LoadIFilter API does)
* From Default (at HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\RSSearch\ContentIndexCommon\Filters)

In this case, for JPG files, the content type image/jpeg has a default IFilter defined, so it gets loaded first. If you want to register a custom JPG IFilter, you need:
  1. to replace the CLSID at HKEY_LOCAL_MACHINE\SOFTWARE\Classes\MIME\Database\Content Type\image/jpeg
  2. or add the .jpg extension to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\RSSearch\ContentIndexCommon\Filters
Note that (2) will make the IFilter available only to MSN DS (and not to SharePoint or other applications that use the content type to find IFilters).

You can check with the IFilter Explorer to see which IFilters get loaded by MSN DS or other applications. I hope this will help to solve your problem.''
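
The lookup order Ben describes can be modeled as a simple first-match search. The following Python sketch is only an illustration of that order as quoted above, not of how Windows Desktop Search is actually implemented; the function name, the dictionary parameters, and the CLSID placeholder strings are all hypothetical.

```python
def resolve_filter(extension, content_type, ds_filters, mime_filters,
                   shell_filters, ds_default=None):
    """Return (source, clsid) for the first filter found, or None.

    ds_filters    - extension -> CLSID, modeling the RSSearch Filters key
    mime_filters  - content type -> CLSID, modeling MIME\\Database\\Content Type
    shell_filters - extension -> CLSID, modeling a LoadIFilter-style lookup
    ds_default    - CLSID of the Desktop Search default filter, if any
    """
    ext = extension.lower()
    if not ext.startswith("."):
        ext = "." + ext

    # Candidates in the order quoted above; the first one that exists wins.
    candidates = [
        ("desktop search extension", ds_filters.get(ext)),
        ("content type", mime_filters.get(content_type)),
        ("file extension", shell_filters.get(ext)),
        ("desktop search default", ds_default),
    ]
    for source, clsid in candidates:
        if clsid:
            return source, clsid
    return None


if __name__ == "__main__":
    mime = {"image/jpeg": "{BUILT-IN-JPEG-FILTER}"}
    shell = {".jpg": "{CUSTOM-JPEG-FILTER}"}

    # A custom filter registered only by extension loses to the content type.
    print(resolve_filter(".jpg", "image/jpeg", {}, mime, shell))

    # Option 2 from the quote: add .jpg to the Desktop Search Filters key.
    ds = {".jpg": "{CUSTOM-JPEG-FILTER}"}
    print(resolve_filter(".jpg", "image/jpeg", ds, mime, shell))
```

The two calls correspond to the JPG case in the quote: with only an extension-based registration the content-type filter is picked first, while an entry in the Desktop Search Filters key takes precedence over it.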


Credits


Initial content from the Windows Desktop Search team and Mike Smith-Lonergan


============================================================================


WINDOWS VISTA - Indexing TIF files


I have a large archive with several years of files scanned, OCR-ed and stored in TIF format. The MODI filter that has been included with MS Office since version 2000 took care of indexing the OCR-ed text. Not any more. In Vista it does not work. Can anybody advise how to re-employ the MODI filter?