It is great to end the year feeling like almost all of our clients had great operational results. Our clients had 99.99% uptime during business hours across all of their IT systems in December. 94%+ of our clients experienced 100% uptime. To those clients who did not enjoy the same result: as President of PCIT, I am sorry. I am committed to eliminating the causes that contributed to these small operational disruptions. Some of these issues were preventable, and that really hurts.
Lessons Learned in December
Microsoft Exchange 2013 (email, calendars, contacts) can behave in very unexpected ways. When we deployed a patch, ALL services on the server were not just shut off; they were set to disabled! So we were happily working away one morning while our client was not so happy. They had no email. Our systems gave us zero alerts that there was a problem (the patch had also disabled our monitoring tool's ability to send them). Perhaps every Exchange administrator in the world besides us knows this is possible, but we did not.
When I first saw the condition of the server while looking over a Customer Care technician's shoulder, I honestly thought the server had somehow been hacked and broken badly enough that we’d need to rebuild it. Services disabled? In my experience, that was supposed to require a human to set each Windows service to that state.
Enough whining. Our patching team has created a new solution to alert us to this problem. In addition, every server that is patched and rebooted should have a human visually inspect it before business hours. This visual inspection was already part of our server reboot checklist. Somehow that step was missed as well, compounding the problem.
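
For readers who want a concrete picture of what such a check could look like, here is a minimal sketch in Python. It is not our actual alerting solution. It assumes some other tool has already recorded each service's start mode before the patch and again after the reboot; the function name, the sample service names, and the snapshot format are all hypothetical, chosen only for illustration.

```python
from typing import Dict, List


def newly_disabled(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """Return the services that were not disabled before patching but are now.

    Both arguments map a service name to its start mode (for example
    "Auto", "Manual", or "Disabled"). Collecting these snapshots is
    outside the scope of this sketch.
    """
    flagged = []
    for name, mode_before in before.items():
        mode_after = after.get(name)
        if mode_after is None:
            continue  # service no longer present; not handled in this sketch
        if mode_after.lower() == "disabled" and mode_before.lower() != "disabled":
            flagged.append(name)
    return sorted(flagged)


# Hypothetical snapshots, for illustration only.
before_patch = {
    "MSExchangeTransport": "Auto",
    "MSExchangeIS": "Auto",
    "Fax": "Disabled",  # disabled on purpose before the patch, so not flagged
}
after_patch = {
    "MSExchangeTransport": "Disabled",
    "MSExchangeIS": "Disabled",
    "Fax": "Disabled",
}

if __name__ == "__main__":
    for service in newly_disabled(before_patch, after_patch):
        print(f"ALERT: {service} was set to Disabled during patching")
```

Because the patch in our case also silenced the monitoring on the server itself, a comparison like this would probably be most useful when run from a different machine than the one being patched.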