Desktop Search IFilters
Windows Desktop Search uses plug-ins called IFilters to enable it to index new file types. IFilters are used by several other Microsoft products, including Index Server, SharePoint, and SQL Server. By downloading new IFilters (for example, from http://addins.msn.com), you can search more filetypes. You can even write your own!
What filetypes does Windows Desktop Search index by default?
Windows Desktop Search comes with IFilters for the following file types (see the official list):
| Document Type | IFilter DLL |
| ASCX, ASP, ASPX, CSS, HHC, HTA, HTM, HTML, HHT, HTW, HTX, ODC, STM | nlhtml.dll |
| DOC, DOT, POT, PPS, PPT, XLB, XLC, XLS, XLT | offfilt.dll |
| TXT, ASM, BAT, C, CPP, CXX, CMD, DEF, DIC, H, HPP, XML, ... as plain text | query.dll |
| RTF | rtffilt.dll |
| EML | mimefilt.dll |
It can also index WMA, MP3, and JPG files, because the shell provides document properties for those filetypes. This may prevent custom IFilters for these filetypes from working; see DesktopSearchBugReports for details.
What other filetypes might work?
Several Microsoft applications add their own IFilters when they're installed. This means that any files you create with those applications will automatically be indexed by Windows Desktop Search.
| Document Type | IFilter installed by |
| MDI, TIF, TIFF | Microsoft Office Document Imaging |
| ONE | OneNote 2003 |
| JNT | Tablet PC Journal application |
Where can I download new IFilters?
Any well-written IFilter should work with Windows Desktop Search. The MSN team have links to several of them at http://addins.msn.com.
Note that when you install a new IFilter, Windows Desktop Search won't automatically re-index your existing documents. You can force a re-index by going to Desktop Search Options and selecting Rebuild Index, or by moving the files so that Desktop Search thinks that they have changed.
| Document Type | Download From |
| CAB | Citeknet |
| CEL | Alna AB |
| CHM | Citeknet |
| DAT (Palm Desktop) | Bloggit |
| DGN | Alna AB |
| DWF | IFilterShop |
| DWG | Autodesk , CAD & Company |
| GIF | IFilterShop |
| EPS, PS, PSD | IFilterShop |
| EXE | Citeknet |
| HLP | Citeknet |
| JPG, JPEG | AimingTech , IFilterShop , PixVue |
| LF4T | LB |
| MHT | Citeknet |
| MP3 | meticulus |
| MPP | Net Intent |
| MSG | Alna AB , Hallogram Publishing , IFilterShop |
| PNG | IFilterShop |
| PDF | Adobe , IFilterShop |
| PRT | Net Intent |
| RAR | Alna AB , Citeknet |
| RTF | Microsoft |
| SHTML | IFilterShop |
| SLDPRT, SLDDRW, SLDASM | Net Intent |
| SVG | IFilterShop |
| TIF, TIFF | PixVue , IFilterShop |
| PVAU | PVUA |
| VCF | IFilterShop |
| VDX, VSD, VSS, VST, VSX, VTS | Microsoft |
| WMA, WMV | IFilterShop |
| WP | Corel |
| XML | Microsoft , QuiLogic |
| ZIP | 4-Share , Alna AB , Citeknet , IFilterShop |
Mozilla Thunderbird email requires a Protocol Handler from Citeknet rather than an IFilter.
See also Scott Stonehouse's list of IFilters at http://www.ifilter.org/. Note that these third-party IFilters have not been tested and certified by Microsoft!
How do I see what IFilters are installed?
* Long way: Read the MSDN documentation on Finding the Filter DLL for a File.
* Fast way: Download the excellent IFilter Explorer tool from citeknet.com.
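As a rough illustration of the "long way", the sketch below follows the usual registry chain from a file extension to its filter DLL: extension, then persistent handler, then filter CLSID, then DLL path. It is only a sketch under stated assumptions. The registry is simulated with an in-memory dictionary of default values under HKEY_CLASSES_ROOT, the `.foo` extension, both GUIDs, and the DLL path are invented for the example, and only the direct case is covered, where the extension key itself has a PersistentHandler subkey. The MSDN page remains the authoritative description.

```python
# IID of the IFilter interface; it appears as a subkey name under
# PersistentAddinsRegistered.
IID_IFILTER = "{89BCB740-6119-101A-BCB7-00DD010655AF}"


def make_registry(entries):
    """Lower-case the key paths, because registry key names are case-insensitive."""
    return {path.lower(): value for path, value in entries.items()}


def find_filter_dll(registry, extension):
    """Follow extension -> persistent handler -> filter CLSID -> DLL path.

    `registry` maps key paths under HKEY_CLASSES_ROOT to their default values.
    Returns None as soon as any link in the chain is missing.
    """
    def default_value(path):
        return registry.get(path.lower())

    handler = default_value(extension + "\\PersistentHandler")
    if handler is None:
        return None
    filter_clsid = default_value(
        "CLSID\\" + handler + "\\PersistentAddinsRegistered\\" + IID_IFILTER
    )
    if filter_clsid is None:
        return None
    return default_value("CLSID\\" + filter_clsid + "\\InprocServer32")


# Hypothetical entries for an invented .foo file type (not real GUIDs).
HANDLER = "{11111111-1111-1111-1111-111111111111}"
FILTER = "{22222222-2222-2222-2222-222222222222}"
SAMPLE = make_registry({
    ".foo\\PersistentHandler": HANDLER,
    "CLSID\\" + HANDLER + "\\PersistentAddinsRegistered\\" + IID_IFILTER: FILTER,
    "CLSID\\" + FILTER + "\\InprocServer32": "C:\\Program Files\\FooFilter\\foofilt.dll",
})

print(find_filter_dll(SAMPLE, ".FOO"))
```

On a real machine the same lookups could be made against the live registry instead of the dictionary, which is what a tool such as IFilter Explorer automates.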
How do I write my own IFilter?
Start with the official guide from the developer's page (note: this may supersede some of the information below).
* MSDN documentation:
  * Introduction to IFilters
  * Using Custom IFilters
  * How to Write a Filter
* A couple of blogs:
  * I Coffee, therefore IFilter by Michael Kaplan
  * Testing Full-Text IFilters by Andrew Cencini
* Additional information: http://www.columbiasoft.com/download/ifilters.htm
Make sure that you implement IPersistStream as well as the normal IPersistFile. To optimize your IFilter for Windows Desktop Search, you can also output additional properties such as DocAuthor (document author) when implementing the GetValue() method of the IFilter interface. Many of these properties are used to correctly display the Desktop Search results view. For example, outputting DocAuthor enables users to sort documents of your file type by author in the Desktop Search results view. The most important properties to output are:
* DocAuthor - the document author.
* PrimaryDate - the most important or most significant date.
* DocTitle - the title that will be displayed for the item in the search results view.
* PerceivedType (see below) - ensures that your file type shows up under the right Desktop Search category.
For a complete list of supported properties used by Windows Desktop Search, see
C:\Documents and Settings\<YOURNAME>\Local Settings\Application Data\MSN Toolbar Suite\DS\Config\Schema.txt
Use the PerceivedType property to classify your file type so that users can filter their search results by category. The supported values are:
* contact
* communications
* communications/e-mail
* communications/calendar
* communications/task
* communications/im (coming soon!)
* document/note
* document
* document/text
* document/spreadsheet
* document/presentation
* music
* images
* images/picture
* images/video
* folder
* favorite
* program
When you implement the GetChunk() method within the IFilter interface, make sure that you output a property chunk identified by D5CDD505-2E9C-101B-9397-08002B2CF9AE/PerceivedType, that is, the property named PerceivedType in the property set with that GUID. Then make sure that the GetValue() method returns one of the above strings. For example, if you create an IFilter for a file with the extension .FOO, and it’s a picture file format, you would implement GetChunk() to return D5CDD505-2E9C-101B-9397-08002B2CF9AE/PerceivedType and GetValue() to return a VT_LPWSTR value of "images/picture".
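The chunk-then-value pairing described above can be modelled in a few lines. This is a simplified Python model, not the COM interface: a real IFilter is a COM object, usually written in C++, whose GetChunk() fills in a STAT_CHUNK structure and whose GetValue() returns a PROPVARIANT. The class name `FooPictureFilter` and the method names `get_chunk` and `get_value` are made up for the illustration. Only the property identifier and the category strings come from this page.

```python
PERCEIVED_TYPE_PROPERTY = "D5CDD505-2E9C-101B-9397-08002B2CF9AE/PerceivedType"

# Category strings listed on this page.
VALID_PERCEIVED_TYPES = {
    "contact", "communications", "communications/e-mail",
    "communications/calendar", "communications/task", "communications/im",
    "document/note", "document", "document/text", "document/spreadsheet",
    "document/presentation", "music", "images", "images/picture",
    "images/video", "folder", "favorite", "program",
}


class FooPictureFilter:
    """Toy model of a filter for the hypothetical .FOO picture format."""

    def __init__(self, perceived_type="images/picture"):
        if perceived_type not in VALID_PERCEIVED_TYPES:
            raise ValueError("unsupported PerceivedType: " + perceived_type)
        # Each pending entry is (property identifier, value).
        self._pending = [(PERCEIVED_TYPE_PROPERTY, perceived_type)]
        self._current = None

    def get_chunk(self):
        """Advance to the next property chunk; None means no more chunks."""
        if not self._pending:
            self._current = None
            return None
        self._current = self._pending.pop(0)
        return self._current[0]

    def get_value(self):
        """Return the value of the chunk most recently announced by get_chunk()."""
        if self._current is None:
            raise RuntimeError("get_value() called without a current chunk")
        value = self._current[1]
        self._current = None
        return value


foo_filter = FooPictureFilter()
print(foo_filter.get_chunk(), "->", foo_filter.get_value())
```

The point of the model is the ordering: the indexer first learns which property is coming from the chunk, then asks for its value, and the value must be one of the category strings.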
Lastly, if you register your IFilter using the same registration method as Indexing Service, Windows Desktop Search will automatically pick up your IFilter when the end user installs it. Once your IFilter is complete and tested, make it available to end users through this forum!
What order does Windows Desktop Search load IFilters in?
Ben from Citeknet explains in this newsgroup thread:
''I think MSN DS does load IFilters the correct way (it seems to be the same way as SharePoint 2003 does, using a custom tquery.dll). The only problem I can see is that it hasn't been documented correctly. It was briefly described in the SharePoint 2001 SDK.
From what I've seen, Windows Desktop Search looks for suitable IFilters in this kind of order:
* From extension and CLSID (at HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\RSSearch\ContentIndexCommon\Filters)
* From the content type of the file (at HKEY_LOCAL_MACHINE\SOFTWARE\Classes\MIME\Database\Content Type)
* From the extension of the file (the same way the Win32 LoadIFilter API does)
* From Default (at HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\RSSearch\ContentIndexCommon\Filters)
In this case, for JPG files, the content type image/jpeg has a default IFilter defined, so it gets loaded first. If you want to register a custom JPG IFilter, you need either:
(1) to replace the CLSID at HKEY_LOCAL_MACHINE\SOFTWARE\Classes\MIME\Database\Content Type\image/jpeg
(2) or to add the .jpg extension to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\RSSearch\ContentIndexCommon\Filters
Note that (2) will make the IFilter available only to MSN DS (and not to SharePoint or other applications that use content type to find IFilters). You can check with the IFilter Explorer to see which IFilters get loaded by MSN DS or other applications. I hope this will help to solve your problem.''
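The lookup order Ben describes can be sketched as a simple chain of fallbacks. This is a model of his description only, not of Desktop Search's actual code. The three dictionaries stand in for the registry locations he names, the filter names are placeholders instead of real CLSIDs, and reading his first step as a plain per-extension lookup in the Desktop Search Filters key is an interpretation.

```python
import os


def resolve_filter(filename, content_type, ds_filters, mime_filters, extension_filters):
    """Return (source, filter) for the first matching step, or None.

    ds_filters        -- stand-in for ...\\RSSearch\\ContentIndexCommon\\Filters
    mime_filters      -- stand-in for ...\\MIME\\Database\\Content Type
    extension_filters -- stand-in for the per-extension lookup that the
                         Win32 LoadIFilter API performs
    """
    ext = os.path.splitext(filename)[1].lower()
    steps = [
        ("desktop search extension", ds_filters.get(ext)),
        ("content type", mime_filters.get(content_type)),
        ("extension", extension_filters.get(ext)),
        ("desktop search default", ds_filters.get("default")),
    ]
    for source, found in steps:
        if found is not None:
            return source, found
    return None


# Placeholder names, not real CLSIDs.
mime = {"image/jpeg": "default-jpeg-filter"}
by_extension = {".jpg": "custom-jpeg-filter"}

# The content-type entry wins over the per-extension registration.
print(resolve_filter("photo.jpg", "image/jpeg", {}, mime, by_extension))

# Option (2): registering .jpg in the Desktop Search Filters key wins instead.
print(resolve_filter("photo.jpg", "image/jpeg",
                     {".jpg": "custom-jpeg-filter"}, mime, by_extension))
```

The first call shows why a custom JPG IFilter registered only by extension is never reached; the second shows the effect of option (2), which applies to Desktop Search alone.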
Credits
Initial content from the Windows Desktop Search team and Mike Smith-Lonergan.
============================================================================
WINDOWS VISTA - Indexing TIF files
I have a large archive with several years of files scanned, OCR-ed and stored in TIF format. The MODI filter that has been included with MS Office since version 2000 took care of indexing the OCR-ed text. Not any more. In Vista it does not work. Can anybody advise how to re-employ the MODI filter?