15 March 2021

How to write an AWS CLI script, Part I: Patterns

by Alex Harvey

This blog series presents patterns for writing and testing AWS CLI shell scripts.

Introduction

I have by now written a lot of AWS CLI scripts in Bash and, over time, have settled on programming patterns that, I think, AWS CLI scripts should follow. These patterns forced themselves on me, and I believe that anyone who writes enough of these scripts will eventually arrive at something similar. This post is intended to document those patterns. I also hope that others will adopt them.

My other hope in writing this is that people who may not have realised it yet will discover that Bash is in fact a fine language for automating AWS. I frequently encounter a misplaced prejudice against Bash in favour of Python, Golang, and other languages for automating AWS: a sense that these are the “real” programming languages, even though they can complicate development and testing, often for no real gain. If operators are going to use the AWS CLI to communicate with AWS, why write the automation in a different language?

In the first part of this series, I introduce the programming patterns needed to write a complex AWS CLI shell script, and along the way I present some real-life script examples. Then, in Part II, I will look at patterns for unit testing the scripts using the shunit2 framework.

Patterns

Structure of a script

Before getting into the nuts and bolts of Bash programming, I want to start by looking at how to organise a script. In my view, every script should have the following sections:

- a usage function
- a get_opts function
- a validate_opts function
- a section to implement the logic
- a main function
- a guard clause.

The following subsections look at what these are and why they are needed.

A usage function

Every program or script, whatever it does, should have a usage function that provides a help message as a bare minimum level of documentation. This function should be invoked, at least, when -h is passed to the command line, or when no options are passed to a script that expects at least one.

The usage function should remind the user of: how to run the script; what it does; and what its command line arguments are. Perhaps, it should also give a usage example or two.

Never hard code the script name in the usage function. Instead, use the special variable $0.

Here is an example of a usage function:

usage() {
  echo "Usage: $0 [-h] [-s STACK_NAME]"
  echo "Describe each EC2 instance in a stack"
  exit 1
}
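
For example, if the script were saved as (hypothetically) my_script.sh, with -h wired up to call usage as described in the next section, running it with -h would print:

▶ ./my_script.sh -h
Usage: ./my_script.sh [-h] [-s STACK_NAME]
Describe each EC2 instance in a stack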

For more on usage functions, see this page.

get_opts function

After the usage function, I nearly always have a get_opts function, whose purpose is to process the script’s command line arguments, usually by calling the Bash built-in getopts. Here are two examples of the get_opts function, one using getopts and the other not.

Pattern A - using getopts

The getopts command should be used most of the time unless the script is very simple. Here is an example of a get_opts function that calls getopts:

get_opts() {
  local opt OPTARG OPTIND
  while getopts "hs:" opt ; do
    case "$opt" in
      h) usage ;;
      s) stack_name="$OPTARG" ;;
      \?) echo "ERROR: Invalid option -$OPTARG"
          usage ;;
    esac
  done
  shift $((OPTIND-1))
}

Notice that the $OPTARG and $OPTIND variables have been localised. This is important. Aside from being good style, failure to localise these can lead to some odd behaviour in unit testing. More on that later.

How to use getopts is of course beyond the scope of this article, but see here for a good tutorial on getopts.

Pattern B - without getopts

Occasionally, it is better not to use getopts. Reasons include the limitations of getopts (for example, it has no support for long options), or that getopts may be overkill for a very simple use case. Whatever the reason, you may need to write your own option handling logic.

Here is an example that rewrites the function above without using getopts:

get_opts() {
  [ "$1" == "-h" ] && usage

  if [ "$1" == "-s" ] ; then
    stack_name="$2"
    return
  fi

  usage
}

Calling get_opts

The get_opts function operates on the command line arguments that are stored in the special Bash variable $@. It is therefore necessary to explicitly pass $@ to get_opts from the top scope.

get_opts "$@"

This line would normally be in the main function. More below.

validate_opts function

A safe, well-written script usually also has some input validation: sanity checks to ensure that the inputs passed to it are sane, and helpful feedback to the user if they are not. If the validation requirement is very simple, the logic could live inside the get_opts function, but often it makes sense to do it in a separate function. Here is an example:

validate_opts() {
  if [ -z "$stack_name" ] ; then
    echo "You must pass a stack_name with -s"
    usage
  fi
}

section to implement logic

Most of your script will be actual implementation. After the usage, get_opts and validate_opts functions, the middle section of an AWS CLI script is made up of the functions that actually do the work. These are expected to be called either by other functions or by the main function. See the next subsection.

main function

Every script, for reasons of consistency, testability and readability, should have a main function. The main function should present the high-level logic of the program in human-readable form. Of course, the main function will only be readable if your functions have good names, so keep the readability of the main function in mind when naming things. For example:

main() {
  get_opts "$@"
  validate_opts
  describe_instances_in_stack
}

It is hopefully clear, without any further implementation detail, that this is the main function of a script that runs describe-instances on all the instances in a CloudFormation stack.

guard clause

Finally, there is one more section in a script that is often overlooked, but also important. Every script should have a guard clause. A guard clause, in general, is a piece of code that prevents other code from running. In our case, the guard clause prevents the main function from running if the script is sourced into the running shell, as opposed to executed. This is what makes the script and its functions testable.

The guard clause always looks like this:

if [ "$0" == "${BASH_SOURCE[0]}" ] ; then
  main "$@"
fi

Notice here once again that we pass $@ to main so that main can then pass it to get_opts.
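
One nice consequence of the guard clause is that the script can be sourced into an interactive shell for ad hoc testing. A quick sketch, assuming the script is saved as (hypothetically) my_script.sh:

▶ source ./my_script.sh   # $0 != ${BASH_SOURCE[0]}, so main is not called
▶ declare -F get_opts     # but the script's functions are now defined in the shell
get_opts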

all together

Putting all of this together, here is a full script example. In this example, the script lists all EC2 instances that are inside a CloudFormation stack.

#!/usr/bin/env bash

usage() {
  echo "Usage: $0 [-h] [-s STACK_NAME]"
  exit 1
}

get_opts() {
  local opt OPTARG OPTIND
  while getopts "hs:" opt ; do
    case "$opt" in
      h) usage ;;
      s) stack_name="$OPTARG" ;;
      \?) echo "ERROR: Invalid option -$OPTARG"
          usage ;;
    esac
  done
  shift $((OPTIND-1))
}

validate_opts() {
  if [ -z "$stack_name" ] ; then
    echo "You must pass a stack_name with -s"
    usage
  fi
}

list_stack_resources() {
  local resource_type="AWS::EC2::Instance"
  local query=\
'StackResourceSummaries[?ResourceType==`'"$resource_type"'`].[PhysicalResourceId]'

  aws cloudformation list-stack-resources \
    --stack-name "$stack_name" \
    --query "$query"
}

describe_instances_in_stack() {
  local instance_id
  list_stack_resources | while read -r instance_id ; do
    aws ec2 describe-instances --instance-ids \
      "$instance_id" --output "table"
  done
}

main() {
  export AWS_DEFAULT_OUTPUT="text"
  get_opts "$@"
  validate_opts
  describe_instances_in_stack
}

if [ "$0" == "${BASH_SOURCE[0]}" ] ; then
  main "$@"
fi
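
If this script were saved as (hypothetically) describe_stack_instances.sh and made executable, running it without arguments would exercise the validation and usage paths:

▶ ./describe_stack_instances.sh
You must pass a stack_name with -s
Usage: ./describe_stack_instances.sh [-h] [-s STACK_NAME]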

Default output

To avoid sending --output text to commands in an AWS CLI script, as shown in the example above, it makes sense to set the environment variable AWS_DEFAULT_OUTPUT. This can be done in the main function:

main() {
  export AWS_DEFAULT_OUTPUT="text"
  get_opts "$@"
  validate_opts
  describe_instances_in_stack
}

Globals and locals

Note that Bash functions cannot return values (other than an exit status), so it is normal to use global variables to share data throughout the script. But variables that are internal to a function should be localised using local. Here is a function that correctly mixes globals and locals:

describe_key_pair() {
  local filter="Name=key-name,Values=$key_name"
  aws ec2 describe-key-pairs --filters "$filter"
}

Here, $key_name is a global variable - and also the input to this function - whereas $filter is a local variable. The global is expected to be defined elsewhere in the program before this function is called, whereas $filter exists only within the scope of this particular function.

Testing a function with globals

To test a function that depends on a global variable, do this:

▶ key_name=default ; describe_key_pair
{
  "KeyPairs": [
    {
      "KeyPairId": "key-0607ce5fb02240a92",
      "KeyFingerprint": "70:e2:fa:b1:97:e3:68:5f:6a:63:93:17:09:5a:43:29:60:94:53:ab",
      "KeyName": "default",
      "Tags": []
    }
  ]
}

Calling the AWS CLI

Simple calls to the AWS CLI that wrap a single AWS CLI subcommand should be named after the subcommand. The do-one-thing-well principle suggests to me that there should be one Bash function for every call to the AWS CLI. A very simple example might be:

describe_instances() {
  aws ec2 describe-instances
}

Filters, queries and jp

A full treatment of AWS CLI filters and the JMESPath query language is beyond the scope of this article. It needs to be said, however, that writing AWS CLI scripts requires you to understand both of these topics well. The AWS documentation on this here is quite good, and there are also many good JMESPath tutorials out there.

Note that you can test your JMESPath queries using the jp command line utility. I recommend installing it and becoming familiar with it.
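
For example, a query can be tried out by piping AWS CLI output straight into jp. A quick sketch, using the example data shown in the next section:

▶ aws ec2 describe-key-pairs --output json | jp 'KeyPairs[].KeyName'
[
  "alex",
  "default"
]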

Reading a single value from AWS CLI command output

Often, you will need to set variables in your script based on responses from AWS CLI commands. Suppose you have an EC2 key pair name, and you need the key pair ID. The AWS CLI returns JSON that looks like this:

{
  "KeyPairs": [
    {
      "KeyPairId": "key-09e9f9a9f6632d54e",
      "KeyFingerprint": "af:ef:61:07:42:f8:33:0a:e4:d6:89:cb:2b:bb:3a:2e:21:fb:16:19",
      "KeyName": "alex",
      "Tags": []
    },
    {
      "KeyPairId": "key-0607ce5fb02240a92",
      "KeyFingerprint": "70:e2:fa:b1:97:e3:68:5f:6a:63:93:17:09:5a:43:29:60:94:53:ab",
      "KeyName": "default",
      "Tags": []
    }
  ]
}

If I want the key pair ID given the key name, here are two ways of doing that:

Pattern A - using read -r

The read -r command is needed (see below) whenever you want to read more than one value from a single line. But it can also be used to read just one value, as here:

get_key_pair_id() {
  read -r key_pair_id <<< "$(
    aws ec2 describe-key-pairs \
      --filters "Name=key-name,Values=$key_name" \
      --query "KeyPairs[].KeyPairId"
  )"
}

Pattern B - using variable assignment

The second way to write this same function is:

get_key_pair_id() {
  key_pair_id="$(
    aws ec2 describe-key-pairs \
      --filters "Name=key-name,Values=$key_name" \
      --query "KeyPairs[].KeyPairId"
  )"
}

Which one should you use? The second pattern is probably more familiar to Bash users, but the first is more flexible because, as mentioned, it can also read more than one value at a time. Perhaps it makes sense to use the second pattern if you will not need the first at all in your script, but to use the first throughout for consistency if you need both.

Reading multiple values from AWS CLI command

But what if you need to read both the KeyPairId and the fingerprint? Many a naive Bash programmer might do something like this:

get_key_details() {
  key_pair_id="$(aws ec2 describe-key-pairs --query "KeyPairs[].KeyPairId" ...)"
  key_fingerprint="$(aws ec2 describe-key-pairs --query "KeyPairs[].KeyFingerprint" ...)"
}

But that is both expensive and messy. It is not necessary to call the same command twice to read two values. Instead, we do this:

get_key_details() {
  read -r key_pair_id key_fingerprint <<< "$(
    aws ec2 describe-key-pairs \
      --filters "Name=key-name,Values=$key_name" \
      --query "KeyPairs[].[KeyPairId, KeyFingerprint]"
  )"
}

This combines two patterns that need to be memorised:

- reading multiple values from a single line with read -r var1 var2 <<< "$(...)"
- returning multiple fields from one call with a JMESPath multiselect list, i.e. the [KeyPairId, KeyFingerprint] in the query.
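
Assuming text output (as set in main) and the example data above, the function can be tested the same way as before, by setting the global and calling it:

▶ export AWS_DEFAULT_OUTPUT=text
▶ key_name=alex ; get_key_details
▶ echo "$key_pair_id"
key-09e9f9a9f6632d54e
▶ echo "$key_fingerprint"
af:ef:61:07:42:f8:33:0a:e4:d6:89:cb:2b:bb:3a:2e:21:fb:16:19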

Iteration

Pattern A - using while loops

Suppose I have these additional key pairs with the word “runner” in them and I want to delete them all:

{
  "KeyPairs": [
    {
      "KeyPairId": "key-09e9f9a9f6632d54e",
      "KeyFingerprint": "af:ef:61:07:42:f8:33:0a:e4:d6:89:cb:2b:bb:3a:2e:21:fb:16:19",
      "KeyName": "alex",
      "Tags": []
    },
    {
      "KeyPairId": "key-0607ce5fb02240a92",
      "KeyFingerprint": "70:e2:fa:b1:97:e3:68:5f:6a:63:93:17:09:5a:43:29:60:94:53:ab",
      "KeyName": "default",
      "Tags": []
    },
    {
      "KeyPairId": "key-068c8e5fa546a4ddc",
      "KeyFingerprint": "2a:a9:63:3a:e5:eb:f6:81:75:f2:86:90:75:6c:07:8a",
      "KeyName": "runner-3szwaqc8-runner-1571372115-8810afca",
      "Tags": []
    },
    {
      "KeyPairId": "key-08db2361f992f0aef",
      "KeyFingerprint": "2f:c0:e7:6c:4c:68:2a:12:21:07:c2:3a:d7:c0:df:0f",
      "KeyName": "runner-devbsxdv-runner-1571318225-c715842e",
      "Tags": []
    }
  ]
}

I will need to iterate through the list of key pairs and delete all the ones that match “runner”. But first, we need to know how to iterate. Let us begin by creating a function that performs the query on key name. Here is that function:

describe_key_pairs() {
  local filter="*runner*"
  aws ec2 describe-key-pairs \
    --filters "Name=key-name,Values=$filter" \
    --query "KeyPairs[].[KeyPairId]"
}

This returns:

▶ describe_key_pairs
key-068c8e5fa546a4ddc
key-08db2361f992f0aef

Notice this time that the returned values are on separate lines. I have achieved this by coercing the key pair ID to a list in the query (see where I have [KeyPairId]). When converted to the text output format, these then appear one per line.
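
To see the difference the list coercion makes, compare the text output of the two query forms against the example data above: the first form puts all values on one line (tab-separated), the second puts one per line.

▶ aws ec2 describe-key-pairs --output text \
    --filters "Name=key-name,Values=*runner*" \
    --query "KeyPairs[].KeyPairId"
key-068c8e5fa546a4ddc	key-08db2361f992f0aef

▶ aws ec2 describe-key-pairs --output text \
    --filters "Name=key-name,Values=*runner*" \
    --query "KeyPairs[].[KeyPairId]"
key-068c8e5fa546a4ddc
key-08db2361f992f0aef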

Next, I need a function that actually iterates. In my example, this will be the delete function. Here it is:

delete_key_pairs() {
  local key_pair_id
  describe_key_pairs | \
  while read -r key_pair_id ; do
    aws ec2 delete-key-pair \
      --key-pair-id "$key_pair_id"
  done
}

Using arrays and for loops

Pattern B - using read -r -a

If I wanted to instead use a for loop to iterate over an array — maybe I need to keep these key pairs in the array for use later — I can refactor to do that this way:

describe_key_pairs() {
  local filter="*runner*"
  aws ec2 describe-key-pairs \
    --filters "Name=key-name,Values=$filter" \
    --query "KeyPairs[].KeyPairId"
}

delete_key_pairs() {
  local key_pair_ids key_pair_id
  read -r -a key_pair_ids <<< "$(describe_key_pairs)"
  for key_pair_id in "${key_pair_ids[@]}" ; do
    aws ec2 delete-key-pair \
      --key-pair-id "$key_pair_id"
  done
}

Pattern C - using readarray aka mapfile - Bash 4

Or, I could load the array using the Bash 4 feature readarray. This command has an alternative name, mapfile. To refactor to use this pattern:

describe_key_pairs() {
  local filter="*runner*"
  aws ec2 describe-key-pairs \
    --filters "Name=key-name,Values=$filter" \
    --query "KeyPairs[].[KeyPairId]"
}

delete_key_pairs() {
  local key_pair_ids key_pair_id
  readarray -t key_pair_ids <<< "$(describe_key_pairs)"
  for key_pair_id in "${key_pair_ids[@]}" ; do
    aws ec2 delete-key-pair \
      --key-pair-id "$key_pair_id"
  done
}

Complete example

By way of providing a complete example, let’s put all of this together in a script that deletes key pairs that match a pattern supplied by the user.

#!/usr/bin/env bash

usage() {
  echo "Usage: $0 [-h] PATTERN"
  exit 1
}

get_opts() {
  [ "$1" == "-h" ] && usage
  pattern="$1"
}

describe_key_pairs() {
  local filter="*$pattern*"
  aws ec2 describe-key-pairs \
    --filters "Name=key-name,Values=$filter" \
    --query "KeyPairs[].[KeyPairId]"
}

delete_key_pairs() {
  local key_pair_id
  describe_key_pairs | \
  while read -r key_pair_id ; do
    aws ec2 delete-key-pair \
      --key-pair-id "$key_pair_id"
  done
}

main() {
  export AWS_DEFAULT_OUTPUT="text"
  get_opts "$@"
  delete_key_pairs
}

if [ "$0" == "${BASH_SOURCE[0]}" ] ; then
  main "$@"
fi
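
If this were saved as (hypothetically) delete_key_pairs.sh, a run against the example data above would look something like this (any per-key output from delete-key-pair is omitted); the second command confirms that the matching key pairs are gone:

▶ ./delete_key_pairs.sh runner
▶ aws ec2 describe-key-pairs --output text \
    --filters "Name=key-name,Values=*runner*" \
    --query "KeyPairs[].[KeyPairId]"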

Concluding remarks

So, in this post, I have documented the Bash programming patterns that, in my opinion, are needed to write useful AWS CLI scripts, and I have shown those patterns in the context of two complete example scripts. Stay with me for the second part, where I’ll show how to unit test these scripts.


tags: bash - aws - shunit2