Troubleshooting a Memory Leak on your SharePoint Web Server

You may encounter an issue where your SharePoint web application repeatedly falls over and you need to find out why. One question to ask is whether you have a memory leak on your web front end servers. So how do we find out? In this instance, Performance Monitor is your friend: you can set up a Data Collector Set that includes the counters relevant to discovering a memory leak.

I have detailed below the XML code that you can use to create your Data Collector Set:

<?xml version="1.0" encoding="UTF-8"?>
<?Copyright (c) Microsoft Corporation. All rights reserved.?>
<DataCollectorSet>
<Name>SharePoint_Server_2010_Memory</Name>
<DisplayName>@%systemroot%\system32\wdc.dll,#10026</DisplayName>
<Description>@%systemroot%\system32\wdc.dll,#10027</Description>
<Keyword>Memory</Keyword>
<Keyword>Disk</Keyword>
<Keyword>Performance</Keyword>
<RootPath>%systemdrive%\perflogs\System\Performance</RootPath>
<SubdirectoryFormat>3</SubdirectoryFormat>
<SubdirectoryFormatPattern>yyyyMMdd\-NNNNNN</SubdirectoryFormatPattern>
<PerformanceCounterDataCollector>
    <Name>SharePoint_Server_2010_Memory</Name>
    <SampleInterval>15</SampleInterval>
    <Counter>\LogicalDisk(*)\% Idle Time</Counter>
    <Counter>\LogicalDisk(*)\Split I/O /sec</Counter>
    <Counter>\Memory\% Committed Bytes In Use</Counter>
    <Counter>\Memory\Available MBytes</Counter>
    <Counter>\Memory\Committed Bytes</Counter>
    <Counter>\Memory\Pages Input/sec</Counter>
    <Counter>\Memory\Pages/sec</Counter>
    <Counter>\Memory\Pool Nonpaged Bytes</Counter>
    <Counter>\Memory\Pool Paged Bytes</Counter>
    <Counter>\Process(_Total)\Private Bytes</Counter>
</PerformanceCounterDataCollector>
<DataManager>
    <Enabled>-1</Enabled>
    <CheckBeforeRunning>-1</CheckBeforeRunning>
    <MinFreeDisk>200</MinFreeDisk>
    <MaxSize>1024</MaxSize>
    <MaxFolderCount>100</MaxFolderCount>
    <ResourcePolicy>0</ResourcePolicy>
    <FolderAction>
        <Size>0</Size>
        <Age>1</Age>
        <Actions>3</Actions>
    </FolderAction>
    <FolderAction>
        <Size>0</Size>
        <Age>56</Age>
        <Actions>8</Actions>
    </FolderAction>
    <FolderAction>
        <Size>0</Size>
        <Age>168</Age>
        <Actions>26</Actions>
    </FolderAction>
    <ReportSchema>
        <Report name="PAL Report" version="1" threshold="100">
            <Import file="%systemroot%\pla\reports\Report.System.Common.xml"/>
            <Import file="%systemroot%\pla\reports\Report.System.Summary.xml"/>
            <Import file="%systemroot%\pla\reports\Report.System.Performance.xml"/>
            <Import file="%systemroot%\pla\reports\Report.System.CPU.xml"/>
            <Import file="%systemroot%\pla\reports\Report.System.Network.xml"/>
            <Import file="%systemroot%\pla\reports\Report.System.Disk.xml"/>
            <Import file="%systemroot%\pla\reports\Report.System.Memory.xml"/>
        </Report>
    </ReportSchema>
    <Rules>
        <Logging level="15" file="rules.log"/>
        <Import file="%systemroot%\pla\rules\Rules.System.Common.xml"/>
        <Import file="%systemroot%\pla\rules\Rules.System.Summary.xml"/>
        <Import file="%systemroot%\pla\rules\Rules.System.Performance.xml"/>
        <Import file="%systemroot%\pla\rules\Rules.System.CPU.xml"/>
        <Import file="%systemroot%\pla\rules\Rules.System.Network.xml"/>
        <Import file="%systemroot%\pla\rules\Rules.System.Disk.xml"/>
        <Import file="%systemroot%\pla\rules\Rules.System.Memory.xml"/>
    </Rules>
</DataManager>
</DataCollectorSet>
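Before importing the template, it can be worth checking that the XML is well formed and lists the counters you expect. The following is a minimal sketch using only Python's standard library. The function name list_counters and the trimmed SAMPLE document are illustrative assumptions, not part of the template itself; in practice you would read the bytes of your saved XML file instead.

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative copy of the template (assumption: your real file
# contains the full counter list shown in the article).
SAMPLE = rb"""<?xml version="1.0" encoding="UTF-8"?>
<DataCollectorSet>
<Name>SharePoint_Server_2010_Memory</Name>
<PerformanceCounterDataCollector>
    <SampleInterval>15</SampleInterval>
    <Counter>\Memory\Available MBytes</Counter>
    <Counter>\Process(_Total)\Private Bytes</Counter>
</PerformanceCounterDataCollector>
</DataCollectorSet>"""


def list_counters(xml_bytes):
    """Parse a Data Collector Set template and return its counter paths.

    ET.fromstring raises ParseError if the XML is not well formed,
    which is a quick way to catch copy-and-paste damage.
    """
    root = ET.fromstring(xml_bytes)
    return [counter.text for counter in root.iter("Counter")]


if __name__ == "__main__":
    for path in list_counters(SAMPLE):
        print(path)
```

If the template was pasted from a web page and still contains escaped markup such as &lt; or stray paragraph tags, the parse step will fail, which tells you the file needs cleaning before Performance Monitor will accept it.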

 

So now that we have run the Data Collector Set and have our results, how do we discover whether there is an issue with memory? The information below will help you interpret your results. The source information comes from an excellent blog post by CC Hamed, a Microsoft Support Engineer. The blog goes into more detail, so it is well worth a read.

Memory \ % Committed Bytes In Use:

If this value is consistently over 80% then your page file may be too small.
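"Consistently" is open to interpretation, so it helps to pin it down before reading the log. The sketch below treats a counter as consistently over a threshold when at least 90% of its samples exceed it. That 90% figure and the function names are my own assumptions for illustration; they do not come from the source blog.

```python
def fraction_over(samples, threshold):
    """Return the fraction of samples strictly above the threshold."""
    if not samples:
        return 0.0
    return sum(1 for value in samples if value > threshold) / len(samples)


def is_consistently_over(samples, threshold=80.0, min_fraction=0.9):
    """True when at least min_fraction of the samples exceed the threshold.

    min_fraction=0.9 is an assumed definition of "consistently";
    adjust it to suit your own environment.
    """
    return fraction_over(samples, threshold) >= min_fraction


if __name__ == "__main__":
    # Hypothetical % Committed Bytes In Use samples.
    print(is_consistently_over([85.0, 88.2, 91.0, 86.4]))
```

The same helper can be reused for any of the other counters that have a fixed threshold by passing a different threshold value.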

Memory \ Available MBytes:

If this value falls below 5% of installed RAM on a consistent basis, then you should investigate.  If the value drops below 1% of installed RAM on a consistent basis, there is a definite problem!
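The 5% and 1% thresholds are easier to apply once the counter is expressed as a percentage of installed RAM. The collector above records \Memory\Available MBytes, so the sketch below works in megabytes. Using the median of the samples as a stand-in for "on a consistent basis" is my assumption, as are the function name and the sample figures.

```python
from statistics import median


def classify_available(samples_mb, installed_mb):
    """Classify available memory against the 5% and 1% thresholds.

    samples_mb:   Available MBytes samples from the log.
    installed_mb: installed RAM in megabytes.
    The median is used so that a single brief dip does not trigger
    a warning (an assumed reading of "on a consistent basis").
    """
    if not samples_mb or installed_mb <= 0:
        raise ValueError("need at least one sample and a positive RAM size")
    percent_free = 100.0 * median(samples_mb) / installed_mb
    if percent_free < 1.0:
        return "problem"
    if percent_free < 5.0:
        return "investigate"
    return "ok"


if __name__ == "__main__":
    # Hypothetical server with 16 GB (16384 MB) of RAM.
    print(classify_available([700, 650, 720], 16384))
```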

Memory \ Committed Bytes:

Keep an eye on the trend of this value – if the value is constantly increasing without levelling off, you should investigate.
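A trend is easier to judge with a number than by eye. The sketch below fits a least-squares line through the samples and also checks whether the second half of the log is still rising, as a rough test for "without levelling off". The function names and the half-and-half split are illustrative assumptions rather than anything prescribed by the source blog.

```python
def slope_per_sample(values):
    """Least-squares slope of values against their sample index."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def is_still_climbing(values):
    """True if the series rises overall and is still rising in its second half.

    Splitting the log in half is an assumed heuristic for spotting a
    value that increases without levelling off.
    """
    if len(values) < 4:
        raise ValueError("need at least four samples")
    recent = values[len(values) // 2:]
    return slope_per_sample(values) > 0 and slope_per_sample(recent) > 0


if __name__ == "__main__":
    # Hypothetical Committed Bytes samples.
    print(is_still_climbing([1.0e9, 1.1e9, 1.2e9, 1.3e9, 1.4e9, 1.5e9]))
```

With the 15 second sample interval used in the collector, multiplying the slope by 4 gives an approximate growth rate per minute.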

Memory \ Pages / sec:

This will depend on the speed of the disk on which the page file is stored.  If there are consistently more than 40 per second on a slower disk or 300 per second on fast disks you should investigate.

Memory \ Pages Input / sec:

This will vary – based on the disk hardware and overall system performance.  On a slow disk, if this value is consistently over 20 you might have an issue.  A faster disk can handle more.

Memory \ Pool Nonpaged Bytes:

If Nonpaged pool is running at greater than 80% of its maximum size on a consistent basis, you may be headed for a Nonpaged Pool Depletion issue (Event ID 2019).

Memory \ Pool Paged Bytes:

Paged Pool is a larger resource than Nonpaged pool – however, if this value is consistently greater than 70% of the maximum configured pool size, you may be at risk of a Paged Pool depletion (Event ID 2020). 

Process (_Total) \ Private Bytes:

Similar to the Committed Bytes counter for memory, keep an eye on the trend of this value. A consistently increasing value may be indicative of a memory leak.

LogicalDisk (pagefile drive) \ % Idle Time:

If the drive(s) hosting the page file are idle less than 50% of the time, you may have an issue with high disk I/O.

LogicalDisk (pagefile drive) \ Split I/O / sec:

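A split I/O occurs when a single request to the disk has to be broken into multiple requests, for example because the request is too large or the disk is fragmented. What counts as a problem depends on the disk drive type and configuration.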
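To apply the thresholds above in a script, the collected log first needs to be in a readable form. One option is to convert the binary log to CSV, for example with relog and its -f CSV option, and then pull out a single counter column. The sketch below assumes the usual CSV layout of a timestamp column followed by one column per counter, with the full counter path (including the machine name) as the header. The server name WFE01, the sample rows and the function name are hypothetical.

```python
import csv
import io

# Hypothetical excerpt of a CSV performance log (assumed layout).
SAMPLE_CSV = r'''"(PDH-CSV 4.0) (GMT Standard Time)(0)","\\WFE01\Memory\Available MBytes"
"01/01/2024 10:00:00.000","2048"
"01/01/2024 10:00:15.000"," "
"01/01/2024 10:00:30.000","2000.5"
'''


def read_counter(csv_text, counter_suffix):
    """Return the numeric samples for the first column whose header ends
    with counter_suffix (case-insensitive). Blank or non-numeric cells,
    which can appear when a sample was missed, are skipped."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    wanted = counter_suffix.lower()
    matches = [i for i, name in enumerate(header) if name.lower().endswith(wanted)]
    if not matches:
        raise KeyError(counter_suffix)
    column = matches[0]
    values = []
    for row in reader:
        if column >= len(row):
            continue
        cell = row[column].strip()
        try:
            values.append(float(cell))
        except ValueError:
            continue  # empty or non-numeric sample
    return values


if __name__ == "__main__":
    print(read_counter(SAMPLE_CSV, r"\Memory\Available MBytes"))
```

Matching on the end of the header means the same call works regardless of which web front end server the log came from.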