200 - Automating operations with Playbooks and Runbooks

Authors

  • Stephen Salim, Well-Architected Geo Solutions Architect.

Contributors

  • Brian Carlson, Well-Architected Operational Excellence Pillar Lead.
  • Jang Whan Han, Well-Architected Geo Solutions Architect.

Introduction

Manually running your runbooks and playbooks for operational activities has a number of drawbacks:

  • Activities are prone to errors and difficult to trace.
  • Manual activities do not allow your operational practice to scale in line with your business requirements.

In contrast, implementing automation in these activities has the following benefits:

  • Improved reliability by preventing the introduction of errors through manual processes.
  • Increased scalability, because the effort invested in operating your workload does not need to grow linearly with it.
  • Increased traceability of your operations through logs collected from the automation activity.
  • Improved incident response by reducing idle time and automatically triggering activity based on known events.

At a glance, runbooks and playbooks appear to be similar documents that technical users can use to perform operational activities. However, there is an essential difference between them:

  • A playbook documents the processes that guide you through the activities needed to investigate an issue. For example, gathering applicable information, identifying potential sources of failure, isolating faults, or determining the root cause of issues. Playbooks can follow multiple paths and yield more than one outcome.

  • A runbook contains the procedures necessary to achieve a specific outcome. For example, creating a user, rolling back a configuration, or scaling resources to resolve the issue identified.
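
The distinction above can be sketched in a few lines of Python. This is only an illustration and not the lab's implementation: the metric names, thresholds, and function names are hypothetical, and no AWS service is called. The playbook function follows several investigation paths and may return zero, one, or many findings, while the runbook function performs one fixed procedure with a single outcome.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Finding:
    """One possible outcome of an investigation (hypothetical structure)."""
    cause: str
    evidence: str


def playbook_investigate(metrics: Dict[str, float]) -> List[Finding]:
    """Playbook: gather information and check several potential sources of failure.

    Each check is an independent path, so the result can hold more than one finding.
    The metric names and thresholds are assumptions made for this sketch.
    """
    findings: List[Finding] = []

    cpu = metrics.get("cpu_percent", 0.0)
    if cpu > 90.0:
        findings.append(Finding("cpu_saturation", f"cpu_percent={cpu}"))

    error_rate = metrics.get("error_rate", 0.0)
    if error_rate > 0.05:
        findings.append(Finding("elevated_errors", f"error_rate={error_rate}"))

    return findings


def runbook_scale_out(current_instances: int, step: int = 1) -> int:
    """Runbook: a fixed procedure with one specific outcome, a larger instance count."""
    return current_instances + step


if __name__ == "__main__":
    # Investigate first, then remediate only the fault the playbook identified.
    found = playbook_investigate({"cpu_percent": 95.0, "error_rate": 0.01})
    instances = 2
    if any(f.cause == "cpu_saturation" for f in found):
        instances = runbook_scale_out(instances)
    print([f.cause for f in found], instances)
```

In this sketch the playbook output decides whether the runbook runs at all, which mirrors the investigate-then-remediate flow the lab describes.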

This hands-on lab will guide you through the steps to automate your operational activities using runbooks and playbooks built with AWS tools.

We will show how you can build automated runbooks and playbooks to investigate and remediate application issues using the following AWS services:

Goals:

  • Build and run automated playbooks to support your investigations
  • Build and run automated runbooks to remediate specific faults
  • Enable traceability of operations activities in your environment

Prerequisites:

Costs

NOTE: If you complete this lab, you will be billed for any applicable AWS resources used that are not covered by the AWS Free Tier.

Steps: