FAST Search – Repair or Rebuild your Corrupt Index from FIXML?

The FAST Search environment is a robust search solution for your SharePoint environment, and once it is configured correctly and optimized it will purr in the background, surfacing rich search results for your users. However, on the rare occasion that your index gets corrupted, you as a SharePoint administrator need to know the tools and methods you can use to get it working again. If nothing else, this should be part of the recovery procedure for your SharePoint environment.

So what options do you have when your index gets corrupted? Firstly, you could completely delete the index, both from SharePoint in Search Administration and via PowerShell on your FAST Search admin server, and re-crawl the content. Or, if your index would take too long to rebuild that way and you would not meet your SLA, you can recover the index using the FIXML. When FAST Search for SharePoint indexes items, it not only stores the physical index itself in the location ‘%drive%\FASTSearch\data\data_index’ but also stores each indexed item as FIXML in the location ‘%drive%\FASTSearch\data\data_fixml’. The FIXML contains all the information that is to become that item in the index.
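As a rough illustration of the two storage locations, the sketch below totals the size of the ‘data_index’ and ‘data_fixml’ folders so you know what you are working with before a repair or rebuild. The install root `D:\FASTSearch\data` in the usage note is a hypothetical example, and the helper names are made up for this post.

```python
import os

def dir_size_bytes(path):
    """Return the total size, in bytes, of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total

def report(fast_data_dir):
    """Print the size of the index and FIXML folders under the FAST data directory."""
    for sub in ("data_index", "data_fixml"):
        path = os.path.join(fast_data_dir, sub)
        if os.path.isdir(path):
            print(f"{sub}: {dir_size_bytes(path) / 1024 ** 3:.2f} GiB")
        else:
            print(f"{sub}: not found at {path}")
```

Calling `report(r"D:\FASTSearch\data")` on an indexer server would print one line per folder; substitute your own drive and install path.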

Now that we know we can use the FIXML, there are two options available to you that are detailed below.

Option A – Repair a corrupt Index from FIXML

This process can be performed from any FS4SP server and on any column or row. The index reset does not rebuild the column from scratch. Instead, the indexer validates each item within the FS4SP column against the original FIXML, and any item that is out of sync or corrupt is updated in the column / index.

  1. Ensure all crawls are stopped in Search Administration and the FS4SP column is idle. Use the command below in the FAST Search PowerShell to show the status of the indexer.

     indexerinfo --row=0 --column=0 status

  2. Stop the web analyser and relevancy admin processes.

     waadmin abortprocessing
     spreladmin abortprocessing

  3. Issue an index reset. Re-run the status command afterwards to monitor progress, or check the indexer.txt file in the logs directory (FASTSearch\var\log\indexer).

     indexeradmin --row=0 --column=0 resetindex
     indexerinfo --row=0 --column=0 status

  4. Once the repair is complete, resume the web analyser and relevancy admin.

     waadmin enqueueview
     spreladmin enqueue

Your FAST Search Index will now be repaired and will be operational.
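If you want to script Option A as part of a recovery runbook, the sequence above could be captured along these lines. This is only a sketch: the function names are invented for illustration, it assumes the FS4SP tools (indexerinfo, indexeradmin, waadmin, spreladmin) are on the PATH of the shell it runs in, and it queries status with indexerinfo as in step 1. It defaults to a dry run that just prints the commands.

```python
import subprocess

def repair_commands(row=0, column=0):
    """Return the Option A commands, in order, as argument lists."""
    target = [f"--row={row}", f"--column={column}"]
    return [
        ["indexerinfo", *target, "status"],       # confirm the column is idle
        ["waadmin", "abortprocessing"],           # stop web analyser processing
        ["spreladmin", "abortprocessing"],        # stop relevancy admin processing
        ["indexeradmin", *target, "resetindex"],  # validate the index against the FIXML
        ["indexerinfo", *target, "status"],       # re-run to monitor progress
    ]

def resume_commands():
    """Return the commands that resume processing once the repair is complete."""
    return [["waadmin", "enqueueview"], ["spreladmin", "enqueue"]]

def run_all(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Run `run_all(repair_commands())` first to review the commands, and only call `run_all(resume_commands(), dry_run=False)` once the status output shows the reset has finished.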

Option B – Rebuild an Index from FIXML

OK, so you attempted Option A, it didn’t resolve your issue, and your index is still corrupted. The next step is to rebuild the index from your FIXML. Rebuilding the index requires a lot more disk space than Option A because temporary files are created and released within the FAST Search directory, which means you are likely to consume twice as much disk space. If free disk space drops to 2GB the rebuild will fail, so you will need to manage your disk space. Follow the steps below to complete this task.
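A quick pre-flight check for the disk space warning above might look like the sketch below. It rests on a loud assumption: that the rebuild temporarily needs extra space roughly equal to the size of the current index (my reading of "twice as much disk space"), on top of the 2GB floor. Treat the result as a rough guide, not a guarantee. The function names are hypothetical.

```python
import shutil

GIB = 1024 ** 3

def rebuild_headroom(free_bytes, index_bytes, floor_bytes=2 * GIB):
    """Bytes left above the floor if the rebuild needs another index-sized chunk.

    Assumption: temporary files take about as much space as the index itself.
    """
    return free_bytes - index_bytes - floor_bytes

def can_rebuild(free_bytes, index_bytes, floor_bytes=2 * GIB):
    """Return True when the estimated headroom stays above zero."""
    return rebuild_headroom(free_bytes, index_bytes, floor_bytes) > 0

def free_bytes_on(path):
    """Return the free space, in bytes, on the volume holding path."""
    return shutil.disk_usage(path).free
```

For example, with 10 GiB free and a 5 GiB index, `can_rebuild(10 * GIB, 5 * GIB)` is true with 3 GiB of headroom, whereas 6 GiB free would not be enough under this estimate.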

  1. Ensure all crawls are stopped in Search Administration and the FS4SP column is idle. Use the command below in the FAST Search PowerShell to show the status of the indexer, and make a note of the “document size” and “indexed” entries so you can compare them after the rebuild.

     indexerinfo --row=0 --column=0 status

  2. Stop the web analyser and relevancy admin processes.

     waadmin abortprocessing
     spreladmin abortprocessing

  3. Rebuild the primary index column. Start by running the command below on the primary indexer server to stop the FAST Search processes.

     nctrl stop

  4. Delete the folder ‘data_index’ within the directory ‘FASTSearch\data’ and start the processes again using the nctrl command. When the processes start, the ‘data_index’ folder is re-created and populated with a rebuild of the index.

     nctrl start
     indexerinfo --row=0 --column=0 status

     Note the “document size” entry, which comes from the FIXML, and check that “indexed=0” is displayed, which means the index is currently empty. Keep re-running the status query until the number of indexed items is back at its original value. At that point the rebuild is complete.

  5. Once the rebuild is complete, resume the web analyser and relevancy admin.

     waadmin enqueueview
     spreladmin enqueue

Option B is now in place and complete. This will bring your FAST Search Index back online and ready for use.
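Re-running the status query by hand during a long rebuild gets tedious, so here is a small polling sketch. It assumes the status text contains an entry of the form indexed=N (optionally with the number in quotes), based only on the “indexed=0” entry mentioned above; check the real output on your server and adjust the pattern before relying on it. The function names are hypothetical.

```python
import re
import time

# Assumption: the status text contains an entry such as indexed=0 or indexed="0".
# The \b stops the pattern matching inside a longer name that ends in "indexed".
_INDEXED = re.compile(r'\bindexed\s*=\s*"?(\d+)')

def indexed_count(status_text):
    """Return the indexed item count found in the status text, or None."""
    match = _INDEXED.search(status_text)
    return int(match.group(1)) if match else None

def wait_for_rebuild(original_count, get_status, interval_s=60, sleep=time.sleep):
    """Poll get_status() until the indexed count reaches original_count.

    get_status is any callable returning the status text.
    Returns the number of polls it took.
    """
    polls = 0
    while True:
        polls += 1
        count = indexed_count(get_status())
        if count is not None and count >= original_count:
            return polls
        sleep(interval_s)
```

In practice `get_status` could wrap `subprocess.run(["indexerinfo", "--row=0", "--column=0", "status"], capture_output=True, text=True).stdout`, with `original_count` set to the value you noted in step 1.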


FAST Search 2010 (FS4SP) – Configuring the Document Converter for Legacy Office Documents

As with most organisations that have implemented a new SharePoint 2010 farm, there is likely to be a need to import or migrate Office documents from shared file servers or other document management systems into SharePoint. This means there may well be documents created with Office 97, 2000 or XP onwards that will be migrated into SharePoint 2010. On a recent exercise using FAST Search for SharePoint 2010 as the search component, we found that the content of legacy Office documents was failing to be indexed and was therefore not appearing in users’ search queries. After further investigation and delving through logs, we discovered that the Microsoft iFilter 2.0 was not able to convert the legacy documents to HTML so that they could be indexed. However, this only affected the contents of the documents, as the metadata was still being crawled and made available in search results.

So how do we spot if this is occurring in your FAST environment? Well, you will need to go into FAST Search administration and check for the warnings in the crawl log for your content source. You will need to be looking for the warning message stated below:

“The FAST Search backend reported warnings when processing the item. ( Document conversion failed: LoadIFilter() failed: Bad extension for file (0x800401e6) )”

Unfortunately, there is not a fix for the Microsoft iFilter 2.0 but a workaround is possible with a configuration change in the FAST environment which will ensure that the legacy file types such as “.doc”, “.ppt” and “.xls” are converted using the SearchExportConverter and not the iFilterConverter. The SearchExportConverter still supports legacy formats and handles their conversion better than the iFilterConverter. The iFilterConverter will still be used to convert new versions of office documents such as “.docx” and “.xlsx” etc.

In order to make this change, I would strongly recommend implementing the process on your test environment to ensure you are happy that the legacy documents are being indexed and are available in search queries for your users. Please follow the steps below on each of your FAST Search servers to implement the SearchExportConverter into your FAST environment:

1) Go to your FAST Search root directory and set the Search Export Converter to “active=Yes” in the file \etc\config_data\documentprocessor\optionalprocessing.xml.

2) Open the converter_rules.xml file in the location \etc\formatdetector\ and comment out the legacy document types.

3) Open the pipelineconfig.xml and swap the order of the “iFilterConverter” and the “SearchExportConverter” in both pipeline sections.

4) In FAST PowerShell, run the command “nctrl reloadcfg”, which reloads the configuration.

5) Again in FAST PowerShell, restart all your procservers using the command “nctrl restart procserver_X”, where X is the number of the procserver to restart. There will be more than one, and “nctrl status” will list all the active procservers on your FAST server.
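Since step 5 has to be repeated for every procserver, a small helper can build the restart commands from the “nctrl status” output. This sketch assumes only that each procserver appears in that output under a name of the form procserver_N, as in the command above; it makes no other assumption about the layout of the status text. The function names are hypothetical.

```python
import re

def procserver_names(nctrl_status_text):
    """Return the unique procserver_N names found in the text, in first-seen order."""
    seen = []
    for name in re.findall(r"\bprocserver_\d+\b", nctrl_status_text):
        if name not in seen:
            seen.append(name)
    return seen

def restart_commands(nctrl_status_text):
    """Return one 'nctrl restart procserver_N' argument list per procserver."""
    return [["nctrl", "restart", name] for name in procserver_names(nctrl_status_text)]
```

Feed it the captured text of “nctrl status” and review the resulting commands before running them on each FAST Search server.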

I would also recommend a reboot of the FAST Search Servers and then you are free to run search queries against the contents of legacy office documents after running a full crawl.

Important Note: If you apply this fix before Service Pack 1, then you will need to re-apply step 3 after deploying Service Pack 1 as the update deploys a new version of the pipelineconfig.xml which includes a new entry called ‘CustomExtractor’ in the Elevated Linguistics pipeline section of the file.