Building a Modern ChatOps Platform

If you have stumbled upon The Startup Zeitgeist post on Hacker News, then as an operations person there is one thing you cannot miss: the emergence of Slack and the ChatOps ecosystem surrounding it. Slack has definitely disrupted how we communicate with our teams, and with machines too. But beyond team communication, such tools have enabled a lot more:

  • All operations are now visible to the whole team instead of being known to a single member. Handoffs between team members are transparent and seamless.
  • Many more teams can query or modify infrastructure, subject to access control, while everything stays visible.
  • A continuous, searchable log and audit trail of activities is maintained.

Again, what is ChatOps?

In simple words, ChatOps enables people to get work done through chat tools. It makes complex tasks self-service in a team environment, so that the feedback loop is faster and people are empowered.

Capabilities of a Modern ChatOps Platform

In this post we will explore the capabilities that one should keep in mind when building a ChatOps platform. A lot of what is needed in a “ChatOps platform” might be application or organization specific, but we want to draw a blueprint from which you can pick and choose when building your own. We have intentionally focused on capabilities rather than on a particular tool or platform. At the same time, each section mentions some tools that largely provide the capability being discussed.

Collaboration/Chat platform

The chat platform is without doubt one of the main pieces of a ChatOps platform, and it is the way users interface with it. The interface allows forming groups and discussions, file sharing, etc. But probably the key differentiator of modern chat platforms compared to traditional ones is integrations. These platforms integrate with a variety of services and chatbots to accomplish a lot more than traditional chat platforms. A chat platform that does not integrate with anything external is an absolute deal breaker for building a ChatOps platform. The most common alternatives fulfilling this capability are Slack, HipChat, Mattermost, Campfire and of course good old IRC.

ChatBot

Chatbot platforms form the core of a ChatOps platform and do all the orchestration between multiple systems. These platforms provide a wide variety of plugins to interact with multiple systems, and the extensibility to write your own plugins easily. This is one area where a lot of customization will happen over time, and an out-of-the-box (OOTB) installation probably won’t be of much use. It is also important to choose a platform written in a programming language your team is comfortable with, so that customizations are easier to build. The popular options are Lita, written in Ruby; Hubot, written at GitHub in JavaScript; and the Python-based Err.
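To make the plugin idea concrete, here is a minimal sketch in plain Python of how a bot might map chat commands to handler functions. It does not use the API of Lita, Hubot or Err; the names COMMANDS, command and handle_message, as well as the "!" prefix, are assumptions made purely for illustration.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical registry mapping a command name to its handler function.
COMMANDS: Dict[str, Callable[[List[str]], str]] = {}


def command(name: str):
    """Register a handler under `name` (illustrative, not a real bot API)."""
    def register(handler: Callable[[List[str]], str]):
        COMMANDS[name] = handler
        return handler
    return register


@command("ping")
def ping(args: List[str]) -> str:
    # Simplest possible "plugin": confirms the bot is alive.
    return "pong"


@command("echo")
def echo(args: List[str]) -> str:
    # Repeats the arguments back to the channel.
    return " ".join(args)


def handle_message(text: str, prefix: str = "!") -> Optional[str]:
    """Return the bot's reply, or None if the message is not a command."""
    if not text.startswith(prefix):
        return None  # ordinary chat, ignore it
    parts = text[len(prefix):].split()
    if not parts:
        return None  # the prefix alone is not a command
    name, args = parts[0], parts[1:]
    handler = COMMANDS.get(name)
    if handler is None:
        return f"Unknown command: {name}"
    return handler(args)


if __name__ == "__main__":
    print(handle_message("!ping"))
    print(handle_message("!echo hello team"))
```

Adding a new capability then amounts to writing one more decorated function, which is roughly the kind of extensibility to look for when choosing a bot framework.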

Integrations

While the chat platform and chatbots provide plenty of integrations OOTB, there are some integrations which are an absolute must for a successful ChatOps platform.

Monitoring & log management systems

Most of a system’s health information comes from monitoring systems (such as Zabbix, Nagios, Sensu etc.) and log management platforms (the likes of the ELK stack, Splunk etc.). It is essential to be able to integrate with these systems and pull out as much data as possible without leaving the chat console. It should be possible to monitor the health not only of systems but also of services and APIs. For example, an API may not be down, yet the service might be degraded because one or two instances are down at times. If the API or service is public facing, updating its status through services such as StatusPage is also critical.
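The difference between "down" and "degraded" can be sketched in a few lines of Python. This is only an illustration: the per-instance health flags would in practice come from your monitoring system, and the function names, service name and instance names below are hypothetical.

```python
from typing import Dict


def service_status(instances: Dict[str, bool]) -> str:
    """Classify a service from per-instance health flags (True = healthy)."""
    if not instances:
        return "unknown"  # nothing reported, so make no claim
    healthy = sum(instances.values())
    if healthy == len(instances):
        return "up"
    if healthy == 0:
        return "down"
    return "degraded"  # some, but not all, instances are healthy


def status_message(service: str, instances: Dict[str, bool]) -> str:
    """Format a one-line summary suitable for posting to a chat channel."""
    healthy = sum(instances.values())
    return (f"{service}: {service_status(instances)} "
            f"({healthy}/{len(instances)} instances healthy)")


if __name__ == "__main__":
    # Hypothetical health data for an API running on three instances.
    print(status_message("orders-api", {"a": True, "b": False, "c": True}))
```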

CI/CD ecosystem

A lot of developers’ and support engineers’ time and focus is spent interacting with the systems that enable delivery of software. A ChatOps platform should enable interacting with such systems, for example getting the status of a certain deployment or of a given build. Some basic operations on the source code management system are also useful in enabling faster communication. Being able to interact with ticketing systems is another important feature of a ChatOps platform.

Configuration management platform

A ChatOps platform should enable ops teams to take action on infrastructure right from the chat interface. This has the advantage of enabling teams that have no access to the machines, while also letting the team track changes closely. What level of integration exists with the likes of Capistrano, Chef, Puppet, Ansible and SaltStack, and what additional work will be required to fully enable the team, are key criteria in building the platform.
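One way to picture "action from chat, with access control and tracking" is the sketch below. It is an assumption-laden illustration, not the behaviour of any particular bot or configuration management tool: the PERMISSIONS table, the role names and the run_command function are all hypothetical, and the actual call to the tool is left as a comment.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Set

# Hypothetical mapping of chat commands to the roles allowed to run them.
PERMISSIONS: Dict[str, Set[str]] = {
    "status": {"dev", "support", "ops"},
    "deploy": {"ops"},
}


@dataclass
class AuditLog:
    """In-memory audit trail; a real setup would persist and index this."""
    entries: List[dict] = field(default_factory=list)

    def record(self, user: str, command: str, allowed: bool) -> None:
        self.entries.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "command": command,
            "allowed": allowed,
        })


def run_command(user: str, role: str, command: str, log: AuditLog) -> str:
    """Check the role, record the attempt, and return the chat reply."""
    allowed = role in PERMISSIONS.get(command, set())
    log.record(user, command, allowed)  # denied attempts are logged too
    if not allowed:
        return f"@{user} is not allowed to run '{command}'"
    # A real bot would invoke the configuration management tool here.
    return f"@{user} ran '{command}'"


if __name__ == "__main__":
    log = AuditLog()
    print(run_command("asha", "dev", "deploy", log))
    print(run_command("ben", "ops", "deploy", log))
    print(len(log.entries), "audit entries recorded")
```

Recording denied attempts as well as successful ones keeps the whole history visible to the team, not just the changes that went through.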

Schedule and escalation management

Most systems today are built for a 24×7 world, and managing the on-call rotation can get fairly complex. Integrating with a system which handles escalation policies and on-call rotation, and which notifies the right person at the right time, is critical for the uptime and success of such online systems. Some systems which come to mind are PagerDuty and OpsGenie.
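As a rough illustration of what such a system computes, the sketch below models a fixed-length round-robin rotation and a simple escalation order. It is a deliberately simplified assumption, not how PagerDuty or OpsGenie work internally; the function names, the weekly shift length and the engineer names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List

WEEK = timedelta(days=7)  # assumed shift length


def on_call(rotation: List[str], start: datetime, now: datetime,
            shift: timedelta = WEEK) -> str:
    """Return who is primary on call at `now` in a round-robin rotation."""
    if not rotation:
        raise ValueError("rotation must not be empty")
    if now < start:
        raise ValueError("now is before the rotation start")
    # Number of whole shifts elapsed since the start, wrapped around.
    index = ((now - start) // shift) % len(rotation)
    return rotation[index]


def escalation_chain(rotation: List[str], start: datetime, now: datetime,
                     levels: int = 2, shift: timedelta = WEEK) -> List[str]:
    """Primary first, then the next people in the rotation as backups."""
    first = rotation.index(on_call(rotation, start, now, shift))
    return [rotation[(first + i) % len(rotation)] for i in range(levels)]


if __name__ == "__main__":
    team = ["asha", "ben", "chen"]  # names must be unique
    start = datetime(2024, 1, 1)
    print(on_call(team, start, datetime(2024, 1, 10)))
    print(escalation_chain(team, start, datetime(2024, 1, 10)))
```

Real schedules also need overrides, time zones and holidays, which is a large part of why integrating with a dedicated service is usually preferable to building this yourself.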

Summary

ChatOps enables a great deal of collaboration and openness between teams while getting things done quickly. We are at a very nascent stage of ChatOps and the possibilities are endless; for example, check out the talk here. There is a dedicated ChatOps topic on Reddit, and the discussions there are defining the future. We would love to hear your ChatOps story.

Vishal Biyani

Vishal Biyani has worked across the whole spectrum of the SDLC, from developing code to deploying code and supporting customers. Vishal's roles have spanned from being a consultant to Fortune 500 companies to hands-on platform building for Internet-scale companies. Vishal is a DevOps practitioner who likes to work in Agile environments with a focus on Test Driven Development. Vishal's interests span continuous delivery, Kubernetes, containers and security. Before founding InfraCloud he worked in companies like HCL and AudienceScience, building multiple Cloud and DevOps solutions.
