Incident managers have been up to their armpits in process flows for years. Now they can use a visual workflow builder with an elegant drag-and-drop interface and no code to connect toolchains and build processes with ease.
How many of you have overly complicated and manual major incident processes? Have you ever wanted to click a link and just restart that crashed server? Do you copy the text of an email and paste it into a ticketing system so someone else can do something about it?
Whether you’re kicking off a major incident process with a conference call, creating a new ServiceNow ticket and a new Slack channel, or triggering an application restart after an alert from Splunk, you want to do something useful with the targeted notifications you receive.
At xMatters, we’ve been enabling our customers to do these kinds of things for a while now, but the Flow Designer visual workflow builder makes it all available through a drag-and-drop interface. You can build toolchains, from simple to complex, that connect all of your applications, whether cloud-based or behind a firewall, to help minimize the business impact of an incident.
In this article I’ll introduce the concepts and UI components of Flow Designer, and in subsequent articles we’ll add some additional complexity and dive into more advanced topics. But enough talk, let’s see the goods.
Getting familiar with Flow Designer
Flow Designer represents each flow as a graph of nodes called steps. Just drag these steps onto the canvas (the big grey area), and it’s easy to hook them together to make flows. The flows then get kicked off by triggers such as a notification response, event status, or notification comment. We’ll include the whole zoo of triggers and a variety of other exotic species in the near future.
The image above is a canvas showing a simple toolchain tied to a response in an event from Stackdriver. The canvas shows three response options: Acknowledge, Escalate, and Start MIM.
- Acknowledge is a pretty standard response option and, when selected, tells xMatters to stop harassing you and the rest of your team on all your devices.
- Escalate tells xMatters that you are unavailable and that it should go find someone else on your team to harass. Acknowledge and Escalate are both part of the standard response behaviors available in xMatters and other applications in the same class.
- Start MIM, however, is where xMatters stands out from the rest. This response option kicks off a flow that pulls together several different applications a team might be using.
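As a rough mental model of the three response options above, here is a minimal Python sketch. It is not xMatters code, since Flow Designer itself needs no code; `handle_response`, the escalation list, and the `start_flow` callback are all hypothetical names made up for illustration.

```python
def handle_response(option, responder, escalation_order, start_flow):
    """Return the list of people who should be notified next.

    Illustrative only: the function name, arguments, and return value are
    assumptions for this sketch, not part of any xMatters API.
    """
    if option == "Acknowledge":
        # Someone owns the problem, so nobody else needs to be notified.
        return []
    if option == "Escalate":
        # The responder is unavailable: move on to the next person, if any.
        position = escalation_order.index(responder)
        return escalation_order[position + 1:position + 2]
    if option == "Start MIM":
        # Kick off the attached flow. Assumption: notifications stop here too.
        start_flow()
        return []
    raise ValueError(f"unknown response option: {option}")


if __name__ == "__main__":
    team = ["alice", "bob", "carol"]  # hypothetical on-call order
    print(handle_response("Escalate", "alice", team, lambda: None))
```

The only option that does more than change who gets notified is Start MIM, which hands control to a flow.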
Let’s walk through some of the items in this video:
Exploring the major incident management process
The first step is to create an incident in Statuspage, Atlassian’s excellent tool for communicating service disruptions or maintenance issues. After that, the flow creates a new ServiceNow priority 1 incident, assigns it to a team, and copies in all of the details from the Stackdriver incident.
ServiceNow passes the incident number (INC10001) back to the step, and we use this unique number to create a Slack channel. This keeps all of the collaboration that is about to happen in one channel, which makes the post mortem easier. It also prevents unnecessary chatter from clouding the incident resolution process. Once the channel is created, we drop links to the Stackdriver incident, the Statuspage incident, and the ServiceNow incident into the Slack channel so the relevant parties can quickly navigate to the pieces they need to resolve the incident.
Finally, we go ahead and post the same information into the ServiceNow incident, again to help the post mortem folks collate the relevant information and to help anyone trying to track down the status.
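To make the order of the steps and the data passed between them concrete, here is a minimal Python sketch of the Start MIM flow described above. It is only an illustration: the real flow is assembled by drag and drop, and every function name, URL, and the `FlowContext` class below is a hypothetical stand-in that records what a step would do rather than calling Statuspage, ServiceNow, or Slack.

```python
from dataclasses import dataclass, field


@dataclass
class FlowContext:
    """Values shared between steps (an assumption made for this sketch)."""
    stackdriver_url: str
    links: dict = field(default_factory=dict)
    log: list = field(default_factory=list)


def create_statuspage_incident(ctx):
    # Stand-in for the Statuspage step; the URL is a placeholder.
    ctx.links["Statuspage"] = "https://status.example.com/incidents/1"
    ctx.log.append("statuspage: incident created")


def create_servicenow_incident(ctx):
    # Stand-in for the ServiceNow step, which hands back the incident number.
    number = "INC10001"
    ctx.links["ServiceNow"] = f"https://servicenow.example.com/{number}"
    ctx.log.append(f"servicenow: created priority 1 incident {number}")
    return number


def create_slack_channel(ctx, incident_number):
    # Name the channel after the incident number. Lowercasing is this
    # sketch's choice, made because Slack channel names are lowercase.
    channel = incident_number.lower()
    ctx.log.append(f"slack: created #{channel}")
    return channel


def post_links_to_slack(ctx, channel):
    ctx.log.append(f"slack: posted {sorted(ctx.links)} to #{channel}")


def post_links_to_servicenow(ctx, incident_number):
    ctx.log.append(f"servicenow: added {sorted(ctx.links)} to {incident_number}")


def run_start_mim(stackdriver_url):
    """Run the steps in the order the article walks through them."""
    ctx = FlowContext(stackdriver_url=stackdriver_url)
    ctx.links["Stackdriver"] = stackdriver_url
    create_statuspage_incident(ctx)
    number = create_servicenow_incident(ctx)     # output feeds the next step
    channel = create_slack_channel(ctx, number)
    post_links_to_slack(ctx, channel)
    post_links_to_servicenow(ctx, number)
    return ctx


if __name__ == "__main__":
    for line in run_start_mim("https://stackdriver.example.com/incidents/42").log:
        print(line)
```

The point of the sketch is the chaining: the incident number returned by the ServiceNow step becomes the input to the Slack step, which is the same output-to-input wiring you draw between steps on the canvas.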
As the flow wraps up, the creation of the ServiceNow incident launches a separate notification process (not shown) to the MIM team. By the time those notifications are in flight, the team has all the relevant information about the Stackdriver issue within the incident, a Slack channel in which to start collaborating on a resolution, and even a Statuspage incident communicating with the broader audience. All of this with some drag-and-drop at configure time and the touch of a button at runtime.
Truly minimizing the blast radius.
What are you going to build? In my next blog, I’ll explore orchestrated toolchain integrations with Flow Designer.