Guest blog post by Erin Jones, Partner Manager, xMatters
Access your monitoring platform and find the alert. Export the data report. Create an issue in Jira, then attach the data report. Search assignees and add all necessary parties to the ticket. Spin up a chat room for the incident to facilitate swarming. Log in to your StatusPage and let your users know about the incident at each stage. Only then can you finally get around to resolving it.
These are all things Incident Managers may find themselves doing when a major issue or outage occurs - and together, they delay the moment your teams can actually begin fixing the issue. And if the incident occurs in the middle of the night, imagine repeating all these steps under stress at 2 a.m.
When the average cost of downtime is $300,000 per hour, doing all these steps manually can easily add six figures to this total - so, why aren’t you leveraging proactive notifications to decrease this time?
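To put a rough number on that, here is a minimal sketch that assumes the $300,000 per hour figure above and a hypothetical 20 minutes spent on manual setup. The 20-minute value is an illustration, not a measurement.

```python
# Rough cost of manual incident setup, using the $300,000/hour average quoted above.
# The 20-minute delay is a hypothetical value chosen for illustration.
DOWNTIME_COST_PER_HOUR = 300_000  # USD


def cost_of_delay(minutes: float, cost_per_hour: float = DOWNTIME_COST_PER_HOUR) -> float:
    """Return the downtime cost (USD) accrued over `minutes` of delay."""
    return cost_per_hour * minutes / 60


print(cost_of_delay(20))  # 100000.0
```

Under those assumptions, 20 minutes of clicking through tools is already a six-figure delay.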
We’ve assembled a simple checklist of the top proactive measures you should be automating to cut the time it takes to report and engage, so you can resolve incidents faster.
1. Create a Jira Issue
With the push of a button from the xMatters notification, you can create a Jira issue in your team’s project, including the issue type (as you’ve customized it for your use case) and the proper assignee. Plus, the data from the incident alert in your monitoring tool (e.g., Splunk, Dynatrace, AppDynamics) is automatically added to the issue - so your assignee and watchers have all the information they need to get started.
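For a sense of what that button press replaces, here is a minimal sketch of the request body an automation might send to Jira's REST issue-creation endpoint (`POST /rest/api/2/issue`). The alert fields, the project key `OPS`, the issue type `Incident`, and the helper name `build_jira_issue` are all assumptions for illustration. This is not how xMatters implements its integration, and assignee handling is left out because its format differs between Jira Cloud and Jira Server.

```python
import json


def build_jira_issue(alert: dict, project_key: str = "OPS",
                     issue_type: str = "Incident") -> dict:
    """Build a Jira issue-creation body from a monitoring alert.

    `alert` is a hypothetical dict; real monitoring tools each use their own schema.
    """
    # Carry the raw alert data into the description so the assignee has context.
    details = "\n".join(f"{key}: {value}" for key, value in sorted(alert.items()))
    return {
        "fields": {
            "project": {"key": project_key},
            "issuetype": {"name": issue_type},
            "summary": f"[{alert['severity']}] {alert['title']}",
            "description": details,
        }
    }


alert = {"title": "Checkout latency above threshold", "severity": "P1", "source": "Splunk"}
print(json.dumps(build_jira_issue(alert), indent=2))
```

The point of the sketch is that the alert data travels with the issue, so nobody has to export and attach a report by hand.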
2. Start a Chat Room
Swarming is one of the most effective ways to get your team collaborating on incident resolution in real time. From xMatters, you can also push a button to start a chat room in HipChat, Stride, and/or Slack. Then, select from your on-call schedule to pull in the necessary people. And just like with the Jira automation, your monitoring alert data appears in the chat room without any additional work.
3. Spin up a Conference Bridge
Need to get the right people on the phone? You guessed it - another button click and you’ve got a conference bridge with the right people invited to join. No more logging into your call system, setting up a number, and texting or emailing folks to share this info. Now, you have more time to get to what matters.
4. Notify Stakeholders
At this point, you have stakeholders needing updates - but you probably don’t have the time to stop working and craft an email with all the information. By customizing your Comm Plan in xMatters, all this can be automated too. Simply set up who needs to know what and when (e.g., “Email this list of people only for a P1”) and you have one less thing pulling you away from resolving your incident.
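A rule like the one quoted above boils down to a small routing table. The sketch below is one assumed way to model such rules, not the xMatters configuration format; the priorities, list names, and the `recipients_for` helper are hypothetical.

```python
# Hypothetical routing table: which stakeholder lists get emailed at each priority.
NOTIFY_RULES = {
    "P1": ["executives", "support-leads", "on-call-engineers"],
    "P2": ["support-leads", "on-call-engineers"],
    "P3": ["on-call-engineers"],
}


def recipients_for(priority: str) -> list[str]:
    """Return the stakeholder lists to notify for a given incident priority."""
    # Unknown priorities notify no one rather than guessing.
    return NOTIFY_RULES.get(priority, [])


print(recipients_for("P1"))
```

Here the executives list is emailed only for a P1, which mirrors the "only for a P1" rule in the example.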
5. Post to StatusPage
Outside of your internal teams, you also have other important stakeholders: your customers. To keep them up to date on the issue and your steps to resolve it, click the StatusPage button in xMatters to automatically push updates. Not only does this let your clients know what’s going on (without taking time away from incident resolution), but it also significantly decreases the number of new tickets you’ll get from clients who don’t know you’re aware of (and working on fixing) the problem.
Automating any of the five steps above drastically reduces your mean time to resolve - and with xMatters, you can easily accomplish all five. Optimizing your Incident Management process saves your company money and keeps your teams, stakeholders, and clients happy. Plus, it makes those 2 a.m. outages a lot less stressful.
Ready to implement these 5 steps to faster MTTR?
Contact Praecipio Consulting about licensing and implementing xMatters.