I had a customer log a support case in which all Lync 2010 clients were displaying a red banner stating “Limited functionality is available due to an outage”, even though this customer had only a single front end server. This message is typically displayed when users registered against an SBA experience a WAN outage at their location and can no longer contact the associated front end server. The message shown in the client is illustrated below.
This issue affected all users registered against the front end server, and viewing the application, system and Lync Server event logs gave a picture of how the issue had arisen. Firstly, the Lync Server event log on the front end server contained the following error, produced at the time the customer reported the issue:
Pas with FQDN: lync.domain.co.uk has been detected to be down.
PAS stands for Presence Agent Server and is the component of Lync Server that handles presence logic and traffic. Turning to the system event logs on the front end showed why the problem had occurred and why contacts could not be seen within the Lync client. The first error displayed was the following:
Reset to device, \Device\RaidPort0, was issued.
This indicated that access to the virtual machine's storage subsystem had been interrupted for a short period of time, and as a result the following event was then also logged:
SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [c:\csdata\backendstore\rtc\dynlogpath\rtcdyn.ldf] in database [rtcdyn] (7). The OS file handle is 0x0000000000000740. The offset of the latest long I/O is: 0x000000060aca00
The above error was essentially the underlying cause of the message displayed in the Lync client. To resolve the problem, the virtual machine was moved to another virtualisation host, which immediately cleared the error in the client because reliable access to the storage volume was now possible.
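If you need to check exported event log text for the same SQL Server long I/O warning, something like the following Python sketch may help. It is only an illustration: the function name `parse_long_io` and the regular expression are my own, written against the wording of the message quoted earlier, and they assume the log text has already been exported as plain strings.

```python
import re

# Pattern built from the wording of the SQL Server warning quoted in this post.
# It is an assumption that other occurrences use the same wording.
LONG_IO = re.compile(
    r"encountered (?P<count>\d+) occurrence\(s\) of I/O requests taking longer than "
    r"(?P<seconds>\d+) seconds to complete on file \[(?P<file>[^\]]+)\] "
    r"in database \[(?P<database>[^\]]+)\]"
)


def parse_long_io(message):
    """Return the key fields of a long I/O warning, or None if it does not match."""
    match = LONG_IO.search(message)
    if match is None:
        return None
    return {
        "count": int(match["count"]),
        "seconds": int(match["seconds"]),
        "file": match["file"],
        "database": match["database"],
    }


# The event text from this case, used as sample input.
sample = (
    "SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than "
    "15 seconds to complete on file [c:\\csdata\\backendstore\\rtc\\dynlogpath\\rtcdyn.ldf] "
    "in database [rtcdyn] (7)."
)

print(parse_long_io(sample))
```

Any hit against the rtcdyn database on a front end server would be worth lining up against storage errors in the system log, as was the case here.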