Storage Metrics in SP2010 Service Pack 1

A rather nice feature that has gone relatively unnoticed in SharePoint 2010 Service Pack 1 is Storage Metrics. It is available under Site Collection Administration and gives you useful information about how content is being used within your site collection. The report provides a breakdown of the content, its physical size and the percentage of the site quota it consumes. This is particularly useful for site collection administrators and IT Pros who want to monitor the size of the content for particular sites. You can take this a step further with third party tools such as Axceler's ControlPoint application, which will give you trend analysis on your content.

You can drill down in the results to the individual file level, so you can troubleshoot a growing content database by finding out which file types are taking up space in a particular document library or list.
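If you want to pull similar numbers yourself outside of the UI, a rough sketch along these lines in the SharePoint 2010 Management Shell will report storage against quota per site collection (the web application URL is a placeholder, and this only approximates what the Storage Metrics page shows):

Get-SPSite -WebApplication "https://intranet.contoso.com" -Limit All | ForEach-Object {
    # SPSite.Usage.Storage and the quota maximum are both reported in bytes.
    $storageMB = [math]::Round($_.Usage.Storage / 1MB, 2)
    $quotaMB   = [math]::Round($_.Quota.StorageMaximumLevel / 1MB, 2)
    $percent   = if ($quotaMB -gt 0) { [math]::Round(($storageMB / $quotaMB) * 100, 1) } else { $null }
    New-Object PSObject -Property @{ Url = $_.Url; StorageMB = $storageMB; QuotaMB = $quotaMB; PercentOfQuota = $percent }
} | Format-Table Url, StorageMB, QuotaMB, PercentOfQuota -AutoSize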

Please note that this feature is only available in Service Pack 1 of SharePoint 2010.

Copenhagen for the European SharePoint Conference 2013

This February it was the turn of the European SharePoint community to get a piece of the SharePoint pie: four days of keynote speeches, workshops, seminars and a bustling array of third party vendors offering knowledgeable expertise, products and eye-catching prizes. The event was hosted at the Bella Center in Copenhagen, which was a great venue and well organised, but the wi-fi connection was dreary. So yes, I was one of the special people constantly manoeuvring to find a better signal (just like one of many lemmings each day!). I did raise a cheeky grin when the speakers' demos stuttered thanks to the same wi-fi connection. Steve Jobs would be smiling down with joy!

After the four days I have gathered some useful information and discussion points that I will be blogging about over the next few days. They will mainly be focused on the IT Pro arena, so I will delve into as much detail as I can and hopefully help you on your SharePoint journey.

On a separate note, there was good banter around the booths from the energetic third party vendors and their creative stands (well done to Axceler for the green smurfs). The AvePoint and Axceler parties were very enjoyable and provided a chance to chat and see the experts let loose. There was even a rumour of projectile vomiting from a delegate on the way to Wednesday's keynote speech; that would have been a long day for that poor chap!

Anyhow, keep an eye out for my next few blog posts relating to the European SharePoint Conference 2013.

FAST Search 2010 – Crawl\Indexing Baseline Performance Testing

In preparation for taking your FAST Search for SharePoint environment into production, it is imperative that you test your crawl performance to establish baseline statistics. Once you have your baseline you can re-test your production environment at monthly or six-monthly intervals to ensure your FAST Search environment is performing consistently and there is no service degradation.

In order to get your baseline results you need to know which perfmon counters to monitor during a full crawl and what the acceptable performance range is. The FAST servers have different roles, so there will be an overlap of components during the testing. I have detailed below the main perfmon counters to monitor during a full crawl and what to look for in your results (a sample collection script follows the counter list). The performance ranges detailed in the results below were the outcome of working with a Microsoft FAST Search specialist, so I am confident in them:

Processor Utilization – (System\Processor Queue Length): This is the number of threads queued and waiting for time on the CPU. Results: Divide this by the number of CPUs in the system; if the result is less than 10 then the system is running well.

Useful Additional Counter – (Processor\%Processor Time\_Total): Useful to show how busy the CPU is during the crawl process.

Memory Utilization – (Memory\Pages Input/Sec): This shows the rate at which pages are read from disk to resolve hard page faults; in other words, the number of times the system was forced to retrieve something from disk that should have been in RAM. Results: Occasional spikes are acceptable, but this should generally be very low, if not zero.

Useful Additional Counter – (Process\Working Set\_Total): This will show how much memory is in the working set.

Disk Utilization – (PhysicalDisk\Current Disk Queue Length\driveletter): A highly valuable counter to monitor. This shows how many read or write requests are waiting to execute against the disk. Results: For single disks it should idle at 2-3 or below, with occasional spikes being fine. For RAID arrays, divide by the number of active spindles in the array and again aim for 2-3 or below; ideally the queue should sit at zero. If the queue lengths are high then look at the Memory\Pages Input/sec counter and consider adding more RAM. Disk utilization will mainly be apparent on the indexing server, as the processing servers largely use CPU and memory to process documents.

Useful Additional Counter – (PhysicalDisk\Disk Bytes/sec\_Total): This shows the number of bytes per second being written to or read from the disk.

Network Utilization – (Network Interface\Output Queue Length\nic name): This is the number of packets queued waiting to be sent. Results: If there is a persistent average of more than two packets in the queue then you should investigate the network for a bottleneck; aim for zero packets in the queue during a crawl. (Network Interface\Packets Received Errors\nic name): This counts inbound packet errors that kept the TCP/IP stack from delivering the packets to higher layers. Results: This value should be low.

Useful Additional Counter – (Network Interface\Bytes Total/sec\nic name): This monitors the number of bytes sent or received.
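To make the collection repeatable, the minimal sketch below pulls the counters above into a .blg file using Get-Counter and Export-Counter. The sample interval, duration and output path are assumptions; size them to cover the whole of your full crawl and run it on each FAST server:

$counters = @(
    '\System\Processor Queue Length',
    '\Processor(_Total)\% Processor Time',
    '\Memory\Pages Input/sec',
    '\Process(_Total)\Working Set',
    '\PhysicalDisk(*)\Current Disk Queue Length',
    '\PhysicalDisk(_Total)\Disk Bytes/sec',
    '\Network Interface(*)\Output Queue Length',
    '\Network Interface(*)\Packets Received Errors',
    '\Network Interface(*)\Bytes Total/sec'
)

# 15-second samples for 4 hours (960 samples) - adjust to the length of your crawl.
Get-Counter -Counter $counters -SampleInterval 15 -MaxSamples 960 |
    Export-Counter -Path 'C:\PerfLogs\FASTCrawlBaseline.blg' -FileFormat BLG

The resulting .blg file is also exactly the sort of input the PAL tool mentioned below expects.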

Once you have completed your full crawl and evaluated your results, you will have a good idea of how your FAST Search environment has coped with the volume of content you are crawling, and you will have vital information for deciding whether your FAST Search topology is adequate or whether you need to add more hardware to your servers. I would recommend Microsoft's article on performance and capacity planning when preparing your FAST Search topology.

In addition, I can also recommend the Performance Analysis of Logs (PAL) tool from CodePlex. You can extract a threshold XML file containing a number of pre-defined counters for FAST Search, which you can then use with your perfmon data collector set to gather and analyse your results. The beauty of this tool is that you can feed your results file (.blg) into the application and it will analyse the results against the recommended thresholds for each counter. Very useful and an excellent way to help present your results!

FBA Users automatically being made inactive from a Site Collection

An issue that had been causing no end of trouble in our MOSS 2007 extranet environment was FBA users appearing to be mysteriously deleted from a site collection. This was happening with no administrative intervention from the SharePoint site administrators, and at times outside of when the profile import synchronisation timer job ran. This proved to be a challenge to investigate and to identify the cause of the issue.

The first step was to identify what was going on under the covers of SharePoint (i.e. in the content database). On a new occurrence of the issue I identified the user's FBA login and ran the SQL query below against the UserInfo table of the content database. The UserInfo table contains the information on the SharePoint profiles that have been created in the database; a SharePoint profile is created the first time a user logs on to the site. To illustrate, I will use 'd.newton' as an example FBA login name.

select * from [ContentDatabaseName].[dbo].[UserInfo] where tp_login like '%d.newton%'

The SQL query above returned more than one user with the same tp_login name and, more importantly, the column tp_IsActive was set to '0' for all of the 'd.newton' logins. This effectively means that the user profile for 'd.newton' is inactive in SharePoint and no access is granted to the site collection. In addition, the column tp_deleted was populated with the same value as the user's profile ID. After comparing the duplicated tp_login names, they appeared to be identical, but when I copied the login names from SQL into Notepad I found that the duplicates had whitespace at the end. An example is below:

tp_id = 551    tp_login = 'membershipprovider:d.newton'

tp_id = 552    tp_login = 'membershipprovider:d.newton '    (single trailing space)

tp_id = 553    tp_login = 'membershipprovider:d.newton  '    (double trailing space)
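Rather than copying login names into Notepad, a read-only query run from PowerShell can flag any tp_login values with trailing whitespace directly. This is only a sketch: the server and database names are placeholders, and it must stay strictly SELECT-only, as modifying content databases directly is unsupported:

$connection = New-Object System.Data.SqlClient.SqlConnection("Server=SQLSERVER;Database=ContentDatabaseName;Integrated Security=True")
$connection.Open()
$command = $connection.CreateCommand()
# DATALENGTH counts trailing spaces while RTRIM removes them, so this finds padded logins.
$command.CommandText = "SELECT tp_ID, tp_Login, tp_IsActive, tp_Deleted FROM dbo.UserInfo WHERE DATALENGTH(tp_Login) > DATALENGTH(RTRIM(tp_Login))"
$reader = $command.ExecuteReader()
while ($reader.Read()) {
    "{0}`t'{1}'`tIsActive={2}`tDeleted={3}" -f $reader["tp_ID"], $reader["tp_Login"], $reader["tp_IsActive"], $reader["tp_Deleted"]
}
$reader.Close()
$connection.Close()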

So how, when and why were the duplicated profiles created? After running a number of custom reports I found that the duplicated profiles were being created by the SharePoint system account, and after further testing against the custom extranet login.aspx it was discovered that the login page would accept the username 'd.newton' with whitespace at the end (note: the password still has to match the original account's password). So what happens in the UserInfo table when a username with trailing whitespace is accepted? The original FBA user is made inactive, with the tp_IsActive column set to '0', and a new profile is created with the same tp_login name plus the whitespace and a new profile ID, but with no access to the relevant SharePoint site. From the user's point of view, they can no longer see the site. When the user logs in with the incorrect login, SharePoint detects a conflict in the tp_login name, which is required to be distinct, and automatically makes the original SharePoint profile inactive to resolve the conflict.

Armed with this information, we know that profile 551, the 'd.newton' username with no whitespace, is the correct profile, and we can make this user active again from within the site permissions of the relevant site.

It is also useful to know that we can now replicate the issue, which is good news! However, we need to find a resolution to stop it from re-occurring. The first task I would recommend is to change the extranet's forms loginUrl to the out of the box Microsoft login.aspx and re-test the same issue. It is very likely that you will find that Microsoft's default login.aspx page does not exhibit the same behaviour, because it trims whitespace from the end of the username, which means a username conflict or duplication never occurs. The steps to change the forms login page back to the OOTB login.aspx are detailed below:

1. On your web front end server, open the web.config of your extranet site and navigate to the ‘authentication mode’ section.

2. Change the forms loginUrl to "/_layouts/login.aspx" (details below):

<authentication mode="Forms">
  <forms loginUrl="/_layouts/login.aspx" />
</authentication>

In our environment it turned out to be the custom login.aspx page that was causing the issue. Our custom login page was referencing code that was not trimming the username, thus allowing duplicate profiles to be created in the UserInfo table. What is annoying is that the UserInfo table accepts whitespace in the tp_login column! If you encounter the same issue as us then I would recommend either fixing your code or creating a new login.aspx page based on the Microsoft OOTB login.aspx.

Removing the Duplicated Users

As we now had a number of duplicated SharePoint profiles in the UserInfo table, it would be useful if the profile synchronisation timer job cleaned up the duplicates or inactive users. However, I have been advised by Microsoft that the profile synchronisation timer job will only delete inactive Active Directory users, not inactive FBA users. The only supported way to remove an FBA user from the UserInfo table is via PowerShell, and the article by Niklas Goude explains it very well.
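For reference, the general shape of the clean-up (on MOSS 2007, using PowerShell with the SharePoint object model) is roughly the sketch below: remove the duplicated login, including its trailing whitespace, from the site's user collection. The site URL and login are placeholders; see Niklas Goude's article for the full, supported approach.

[void][System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SharePoint")

$site = New-Object Microsoft.SharePoint.SPSite("https://extranet.contoso.com")
$web  = $site.RootWeb

# The trailing space copied from the UserInfo table is part of the duplicated login name.
$duplicateLogin = "membershipprovider:d.newton "
$web.SiteUsers.Remove($duplicateLogin)

$web.Dispose()
$site.Dispose()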

Web Analyzer Failing for FAST Search

The FAST Search implementation for SharePoint 2010 is a convoluted process, and mistakes in the deployment.xml can cause performance and functionality headaches further down the line during your testing and implementation. As part of the initial installation phase of FAST Search you are required to pre-configure the deployment.xml file, which determines which components are used on each server in your FAST Search topology. Every FAST Search topology has an admin server that hosts the web analyzer component. This component analyses search clickthrough logs and hyperlink structures, both of which contribute to better ranked and more relevant search results.

An issue that we discovered in testing was that the web analyzer was not functioning correctly: there were numerous error messages in the webanalyzer.log file stating that the web analyzer could not connect to the config server. Typical error messages are detailed below:

"Module (Web Analyzer) at "fast1.domain:13300" is not responding"

"Couldn't connect to the Config Server"

So what caused this error? It relates to the deployment.xml: the file is case sensitive, and if you name your admin server with a case mismatch (e.g. "Server1.domain.com") then the web analyzer will fail because it cannot identify the config server due to the upper case 'S', even if you use the FQDN. To avoid this issue, I recommend using a lower case naming convention for all server names when configuring your deployment.xml.
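A quick way to sanity check this on an existing install is to scan the deployment.xml for host names containing upper case characters. The sketch below assumes the default FS4SP layout, the FASTSEARCH environment variable set by the installer and <host name="..."> entries in the file; adjust the path if yours differs:

$deploymentFile = Join-Path $env:FASTSEARCH "etc\config_data\deployment\deployment.xml"
[xml]$deployment = Get-Content $deploymentFile

$deployment.deployment.host | ForEach-Object {
    # -cmatch is case sensitive, so this only flags names with upper case characters.
    if ($_.name -cmatch '[A-Z]') {
        Write-Warning ("Host '{0}' contains upper case characters - rename it to lower case." -f $_.name)
    }
}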

If you encounter this issue in your existing environment then there is a process that can be followed to update your FAST Search configuration. The steps are detailed below:

1. Stop the services on all FAST Search servers using "nctrl stop" in FAST PowerShell.

2. Edit the deployment.xml on the FAST Search Admin server and change the FQDN for the admin component to be all lower case.

3. On the admin server, edit ‘etc\waconfig.xml’ and ‘etc\node.xml’ and update the hostname to be all lowercase.

4. In FAST Powershell, run “Set-FASTSearchConfiguration” on the admin server and start the services by running “nctrl start”.

5. Now repeat steps 3 and 4 for all your other FAST servers in your topology.

Your FAST Search environment will now have a working web analyzer. For further reading on the deployment.xml, please refer to the MSDN article.

FAST Search 2010 (FS4SP) – Configuring the Document Converter for Legacy Office Documents

As with most organisations that implement a new SharePoint 2010 farm, there is likely to be a need to import or migrate Office documents from file shares or other document management systems into SharePoint. This means there may well be documents created with Office 97, 2000 or XP that will be migrated into SharePoint 2010. On a recent exercise using FAST Search for SharePoint 2010 as the search component, we found that the content of legacy Office documents was failing to be indexed and thus not appearing in users' search queries. After further investigation and delving through logs, it was discovered that the Microsoft iFilter 2.0 was not able to convert the legacy documents to HTML so that they could be indexed. However, this only affected the contents of the documents; the metadata was still being crawled and made available in search results.

So how do you spot whether this is occurring in your FAST environment? You will need to go into FAST Search administration and check the crawl log for your content source for warnings, looking for the message stated below:

“The FAST Search backend reported warnings when processing the item. ( Document conversion failed: LoadIFilter() failed: Bad extension for file (0x800401e6) )”

Unfortunately, there is no fix for the Microsoft iFilter 2.0, but a workaround is possible with a configuration change in the FAST environment which ensures that legacy file types such as .doc, .ppt and .xls are converted using the SearchExportConverter rather than the iFilterConverter. The SearchExportConverter still supports the legacy formats and handles their conversion better than the iFilterConverter, while the iFilterConverter will still be used to convert the newer Office formats such as .docx and .xlsx.

Before making this change, I would strongly recommend implementing the process in your test environment first to ensure you are happy that the legacy documents are being indexed and available in search queries for your users. Follow the steps below on each of your FAST Search servers to bring the SearchExportConverter into your FAST environment:

1) Go to your FAST Search root directory and set the Search Export Converter to “active=Yes” in the file \etc\config_data\documentprocessor\optionalprocessing.xml.

2) Open the converter_rules.xml file in the location \etc\formatdetector\ and comment out the legacy document types.

3) Open the pipelineconfig.xml and swap the order of the “iFilterConverter” and the “SearchExportConverter” in both pipeline sections.

4) In FAST PowerShell, run "nctrl reloadcfg" to reload the configuration.

5) Again in FAST PowerShell, restart all your procservers using "nctrl restart procserver_X", where X is the number of each procserver to restart (there will be more than one); "nctrl status" will list all the active procservers on your FAST server. A short sketch for restarting them all follows these steps.
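To save typing each restart by hand, something like the sketch below can drive it from the output of "nctrl status". The output format (process name in the first column) is an assumption, so check it against your build before relying on it; run it from the FAST Search shell on each server:

$procservers = nctrl status |
    Select-String 'procserver' |
    ForEach-Object { ($_.ToString().Trim() -split '\s+')[0] }

foreach ($proc in $procservers) {
    nctrl restart $proc
}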

I would also recommend a reboot of the FAST Search servers; after running a full crawl you will then be able to run search queries against the contents of legacy Office documents.

Important Note: If you apply this fix before Service Pack 1, then you will need to re-apply step 3 after deploying Service Pack 1 as the update deploys a new version of the pipelineconfig.xml which includes a new entry called ‘CustomExtractor’ in the Elevated Linguistics pipeline section of the file.

FAST Search 2010 (FS4SP) – Disabling TCP/IP Offloading Engine (TOE)

Installing and configuring FAST Search for SharePoint 2010 is a complex process, mainly because FAST Search was bought by Microsoft and then bolted onto the SharePoint 2010 product. If you are deploying your FAST Search servers as virtualised servers, you may well encounter an issue where the FAST servers fail to communicate with each other correctly using the default settings of the network adapter. In addition to the communication issues, content feeding in a multi-node FAST environment will not index properly and full crawls will take drastically longer than they should to complete. This is due to a known issue where IPsec does not work correctly with TCP/IP offloading.

The types of errors you will be getting in your FAST Search logs are:

WARNING systemmsg Module (Search Dispatcher) at fast.yourdomain.com:15102 is not responding
VERBOSE systemmsg The ping call resulted in the following exception: socket.error: [Errno 10061] Connection refused.
WARNING systemmsg Module (NodeControl) at fast.yourdomain.com:16015 is not responding
VERBOSE systemmsg The ping call resulted in the following exception: Timeout

TCP/IP Offloading Engine (TOE for short) is a technology used to optimise the throughput of Ethernet systems. Communication speeds in Ethernet have increased faster than processor speeds, which means the volume of data coming off the network can be too much for the processor to handle and becomes a bottleneck. TOE addresses this by offloading TCP/IP processing from the CPU and I/O subsystem to the network adapter. However, TOE does not work with IPsec, which is used by FAST Search for SharePoint, so TOE needs to be disabled on your virtual machines. In doing this, you will see a performance increase in FAST Search (and your SharePoint servers) and the communication errors during crawls will be resolved.

To disable TCP/IP offloading on your virtual machine, follow the steps below:

1) Open a Cmd prompt and run the following:

netsh int ip set global taskoffload=disabled

netsh int ip show global (this will show 'Task Offload' as disabled)

netsh int tcp set global chimney=disabled

netsh int tcp show global (this will show the 'Chimney Offload State' as disabled)

2) Now go to the 'Advanced' tab of the network adapter's device properties, change all the properties with 'Offload' in the property name to 'Disabled' and reboot the machine. Repeat this process for all your FAST Search servers.
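If you have a lot of adapters or servers to work through, a scripted alternative to the GUI step is sketched below. It assumes the adapter stores its offload settings as standardised '*...Offload...' keywords under the network class registry key, which is not true for every driver, so verify the Advanced tab afterwards (run elevated, then reboot):

$classKey = "HKLM:\SYSTEM\CurrentControlSet\Control\Class\{4D36E972-E325-11CE-BFC1-08002BE10318}"
Get-ChildItem $classKey -ErrorAction SilentlyContinue |
    Where-Object { $_.PSChildName -match '^\d{4}$' } |
    ForEach-Object {
        $adapterKey = $_.PSPath
        (Get-ItemProperty $adapterKey).PSObject.Properties |
            Where-Object { $_.Name -like '*Offload*' } |
            ForEach-Object {
                # "0" is the conventional 'disabled' value for these standardised keywords.
                Set-ItemProperty -Path $adapterKey -Name $_.Name -Value "0"
                Write-Host ("Disabled {0} on {1}" -f $_.Name, $adapterKey)
            }
    }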

This process is now complete; you will no longer have the communication errors during crawls and the warnings will no longer appear in your logs. If you wish to read more about this TCP/IP offloading issue, I would recommend Microsoft's KB article.

Optimizing SharePoint using IIS Compression

This is the first post of my blog, so I thought I would at least make it as interesting as possible and unravel some information on IIS compression for your SharePoint web front end servers. The installation and configuration of SharePoint can be complex depending on the farm topology, but little attention is paid to optimising IIS, so I thought I would share some information that may help your web front end servers shrink the data transmitted to users, cutting load times and improving page rendering.

A SharePoint page is built from two primary sources: static files in the SharePoint root directory and dynamic data stored in the content database (e.g. web parts, navigation, fields). At runtime SharePoint merges the page contents from both sources before returning them in an HTTP response to the user. Depending on the volume of dynamic data, this process can result in higher than necessary load and page rendering times, especially for users across the WAN. IIS 6 and IIS 7 include a compression feature that can reduce the payload of HTTP responses before transmitting them across the network. The compression level ranges from 0 (no compression) to 10 (full compression) depending on how aggressive you want it to be. When compression is turned on for your website in IIS, the default levels are applied: 7 for static data and 0 for dynamic data. This means that your dynamic data is not being compressed in the HTTP responses to users, and you could be missing out on faster load times and page rendering.

As with most things, there is a slight caveat to compression, which is the load on the server's CPU and memory, so I would avoid using a compression value of 10. The general consensus for getting an optimal balance between compression and system utilisation is to set 9 for static data and 4 for dynamic data. This is only a recommendation, so I would suggest experimenting with the settings in your test environment to find the correct levels for you. In addition, a really nice feature of compression is that when the server CPU goes above 90% compression is temporarily disabled (90% for dynamic data / 100% for static data), and it is only re-enabled when the CPU drops below 50%. Very useful!
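Before changing anything, it is worth listing the current settings so you have something to compare against. From an elevated PowerShell prompt on the web front end:

& "$env:windir\System32\inetsrv\appcmd.exe" list config -section:httpCompression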

To enable compression and set the optimised level follow the steps below:

  1. Enable compression on your web site: go to IIS Manager and select Compression in the console.

  2. Run the following from a batch file or from a command prompt:

C:\Windows\System32\Inetsrv\Appcmd.exe set config -section:httpCompression -[name='gzip'].staticCompressionLevel:9 -[name='gzip'].dynamicCompressionLevel:4

Note: By default the compression levels are not specified in the applicationHost.config file, but once you set them to new values they will be visible in applicationHost.config.
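One thing worth double checking: the dynamic compression module has to be installed and switched on before the dynamic level above has any effect. If step 1 only enabled static compression for you, the urlCompression section can be turned on from an elevated PowerShell prompt as well (syntax assumes IIS 7.x):

& "$env:windir\System32\inetsrv\Appcmd.exe" set config -section:urlCompression /doDynamicCompression:True /commit:apphost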

If you would like to read more about IIS compression, I can recommend Scott Forsyth's blog and Idera's white paper '10 Steps to Optimize SharePoint Performance'.