Testing Bash Scripts with Scriptkeeper

Shell scripts start small but can quickly grow bigger and more complicated. They may implement critical functionality (deployment, for example), yet they are almost always untested.

When you look at such a script's Git history, you find that there's no one to blame. The script started out very small, just a few lines of code, and Bash seemed like a good choice at the time. It seemed fine that it wasn't tested, since it was so small. Over time, people needed the script to do more, so it grew and grew. At every step it seemed fine to stick with Bash. And no one ever added any tests, either because they didn't bother, since it's just a Bash script, or because they didn't dare, since writing good tests in Bash is hard. Most likely a bit of both.

The end result is a maintainability nightmare: a piece of code that implements critical functionality, written in a language famous for being hard to use correctly, and completely untested.

You can try to add traditional unit tests to the Bash script (for example with shunit2). That seems like a good idea, but it usually means refactoring the script first, which is a big change to make without tests. And, if you're used to modern languages with good support for tests, the Bash testing frameworks won't feel ergonomic.

You might also reimplement the script in a different language. Again, in principle, that's a good idea, but it's an even larger undertaking.

Introducing scriptkeeper

Earlier this year, we at Originate started writing scriptkeeper. It lets you write and run tests for your Bash scripts. Unlike other tools, however, it works without the need to change your scripts at all. You can backfill tests without introducing regressions. Once there are tests, you can go ahead and refactor with confidence.

scriptkeeper is open-source and you can get it here: github.com/Originate/scriptkeeper.

Mocking Processes with Steps

Here's a short script deploy.sh that deploys an application to Heroku:

#!/usr/bin/env bash

set -euo pipefail

heroku container:push web --app my-fancy-app
heroku container:release web --app my-fancy-app

If you want to test this script with scriptkeeper, you need to create a YAML file next to the script that has the same name with an added .test.yaml extension. In our case, it's deploy.sh.test.yaml:

tests:
  - env:
      PATH: /snap/bin:/usr/bin:/bin
    steps:
      - heroku container:push web --app my-fancy-app
      - heroku container:release web --app my-fancy-app

The interesting part is the steps field. It says that the test expects the script to execute two commands, heroku container:push ... and heroku container:release ..., in that order.

(The env field allows scriptkeeper to find the heroku binary, which is in /snap/bin on my machine.)

Now execute the test by invoking scriptkeeper on the script:

$ scriptkeeper deploy.sh

This runs deploy.sh, but in a mode where it only verifies which programs your script would execute without actually running them. It compares the commands to the ones listed in deploy.sh.test.yaml. If the script behaves as expected, scriptkeeper outputs:

All tests passed.

What if our script doesn't behave as expected? Here's a script that calls the heroku commands in the wrong order:

#!/usr/bin/env bash

set -euo pipefail

heroku container:release web --app my-fancy-app
heroku container:push web --app my-fancy-app

When we invoke scriptkeeper with the same test file, it outputs the following error:

error:
  expected: heroku container:push web --app my-fancy-app
  received: heroku container:release web --app my-fancy-app

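As expected, scriptkeeper flags the misbehaving script: it reports the first command that deviates from the expected steps, here heroku container:release where the test expected heroku container:push.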
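
The in-order comparison can be pictured with a small Bash sketch. This is not scriptkeeper's implementation: compare_steps is a hypothetical helper that takes the expected and the received commands as newline-separated strings and reports the first mismatch in the same format as the error above. The "(nothing)" placeholder for a missing command is an assumption made for this sketch.

```bash
# Hypothetical helper, not part of scriptkeeper: compare two newline-separated
# command lists in order and report the first difference.
# Assumes that neither list contains empty lines.
compare_steps() {
  cs_i=1
  while :; do
    # Pick line number cs_i from each list (empty once a list is exhausted).
    cs_exp=$(printf '%s\n' "$1" | sed -n "${cs_i}p")
    cs_got=$(printf '%s\n' "$2" | sed -n "${cs_i}p")
    # Both lists exhausted: every step matched.
    if [ -z "$cs_exp" ] && [ -z "$cs_got" ]; then
      return 0
    fi
    if [ "$cs_exp" != "$cs_got" ]; then
      printf 'error:\n  expected: %s\n  received: %s\n' \
        "${cs_exp:-(nothing)}" "${cs_got:-(nothing)}"
      return 1
    fi
    cs_i=$((cs_i + 1))
  done
}
```

Calling compare_steps with the two heroku commands in the expected order and then in the swapped order returns 0 in the first case and prints the expected/received pair in the second. Because the lists are compared position by position, a missing or an extra command is reported as well.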

Standard Streams and Exit Codes

Here's an example of the same deploy.sh script with a lot more bells and whistles:

#!/usr/bin/env bash

set -euo pipefail

if ! cargo test ; then
  echo exiting, tests don\'t pass 1>&2
  exit 1
fi

APP=${1:-}
if [[ -z "$APP" ]] ; then
  echo please pass in the heroku app as an argument 1>&2
  exit 1
fi

if [[ -n "$(git status --porcelain)" ]] ; then
  echo exiting, git repo not clean 1>&2
  exit 1
fi

heroku container:push web --app "$APP"
heroku container:release web --app "$APP"

This script clocks in at sixteen lines of code. It now runs a test suite written in Rust (cargo test), our favorite new language, requires the Heroku app name as an argument, and checks for a dirty working tree before pushing to Heroku.

Without tests, at this size and complexity, the script has become much harder to maintain. With scriptkeeper, however, we can just add more tests to our test suite. Here's a test for making sure that the script aborts when cargo test fails:

tests:
  # exits when the tests don't pass
  - env:
      PATH: /snap/bin:~/.cargo/bin:/usr/bin:/bin
    steps:
      - command: cargo test
        exitcode: 1
    stderr: "exiting, tests don't pass\n"
    exitcode: 1

This specifies the expected command cargo test as a YAML object and sets its exitcode field to 1. When run, scriptkeeper mocks the cargo test process and simulates an exit code of 1. We then expect the script to:

  • Not issue any further commands
  • Print a nice message to stderr
  • Exit with an exit code of 1 itself
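
These three expectations can be checked by hand in plain Bash, which gives a feel for what the test asserts. The sketch below swaps in a much simpler technique than scriptkeeper's process mocking: a shell function named cargo shadows the real binary and returns 1. The names check_tests and run_capture are hypothetical, and the script's first check is wrapped in a function so that it can be called directly.

```bash
# Stand-in for a failing `cargo test`: a shell function that shadows the real
# binary and returns exit code 1. (scriptkeeper mocks the process instead.)
cargo() { return 1; }

# The first check from deploy.sh, wrapped in a function for illustration.
check_tests() {
  if ! cargo test; then
    echo "exiting, tests don't pass" >&2
    return 1
  fi
  echo "deploying"   # only reached when the tests pass
}

# Hypothetical helper: run a command and keep its stdout, stderr and exit code
# in the variables CAPTURED_STDOUT, CAPTURED_STDERR and CAPTURED_EXIT.
# Command substitution strips trailing newlines from the captured text.
run_capture() {
  rc_dir=$(mktemp -d)
  "$@" >"$rc_dir/out" 2>"$rc_dir/err" && CAPTURED_EXIT=0 || CAPTURED_EXIT=$?
  CAPTURED_STDOUT=$(cat "$rc_dir/out")
  CAPTURED_STDERR=$(cat "$rc_dir/err")
  rm -rf "$rc_dir"
}
```

After run_capture check_tests, CAPTURED_EXIT holds 1, CAPTURED_STDERR holds the error message, and CAPTURED_STDOUT is empty because nothing after the failed check runs.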

Arguments

Here's another test. We want to test that ./deploy.sh exits and prints a usage message when run without any arguments, like many Unix utilities do:

tests:
  # exits and prints a usage message when no arguments given
  - env:
      PATH: /snap/bin:~/.cargo/bin:/usr/bin:/bin
    arguments: ""
    steps:
      - cargo test
    stderr: "please pass in the heroku app as an argument\n"
    exitcode: 1

Notice arguments: "". That means scriptkeeper will not pass in any arguments to the tested script. ("" is the default value for the arguments field. It is set here to make the test more explicit.) Similar to before, we expect the script to abort with a nice error message.
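
The script reads its argument as ${1:-} rather than $1 for a reason: under set -u, Bash treats a reference to an unset positional parameter as an error and aborts with its own "unbound variable" message. The default expansion yields an empty string instead, so the script gets the chance to print its friendlier message. A minimal sketch, with the check wrapped in a hypothetical function require_app whose body runs in a subshell so that set -u does not leak into the calling shell:

```bash
# Hypothetical wrapper around the argument check from deploy.sh.
# The parentheses make the function body run in a subshell.
require_app() (
  set -u          # same unset-variable strictness as the script
  app=${1:-}      # empty string when no argument was passed
  if [ -z "$app" ]; then
    echo "please pass in the heroku app as an argument" >&2
    exit 1        # leaves only the subshell, i.e. this function
  fi
  echo "$app"
)
```

Calling require_app test-heroku-app prints the app name, while calling it without arguments prints the message to stderr and returns 1, which is the behaviour the test above expects from the whole script.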

Here's the last test for our deploy.sh script. It ensures that we don't accidentally deploy with uncommitted changes in our Git repo:

tests:
  # exits when the git repo is not clean
  - env:
      PATH: /snap/bin:~/.cargo/bin:/usr/bin:/bin
    arguments: test-heroku-app
    steps:
      - cargo test
      - command: git status --porcelain
        stdout: " M some-changed-file\n"
    stderr: "exiting, git repo not clean\n"
    exitcode: 1

The expected git status --porcelain command has the stdout field set to a string that indicates that Git found a modified file. scriptkeeper simulates that git status --porcelain writes that specified string to stdout. Again, the test specifies that, under these circumstances, no other commands should be executed and expects the script to abort with a message.
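
In git status --porcelain output, each line starts with a two-character status code: " M" marks a tracked file that was modified in the working tree, and "??" marks an untracked file. The script's -n check treats any non-empty output as dirty, so either kind of line aborts the deploy. A minimal sketch, with is_dirty and untracked_files as hypothetical helpers that take the porcelain output as a string, so they can be tried without a Git repository:

```bash
# Hypothetical helper: succeeds (exit code 0) when the given porcelain output
# lists at least one change, mirroring the script's `-n` check.
is_dirty() {
  [ -n "$1" ]
}

# Hypothetical helper: print only the untracked paths (status code `??`).
untracked_files() {
  printf '%s\n' "$1" | sed -n 's/^?? //p'
}
```

The string " M some-changed-file" from the test counts as dirty but yields no untracked files. A test with stdout set to "?? new-file\n" would exercise the untracked case and make the script abort in the same way.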

With these tests, our deploy.sh script is ready for refactoring or new features. We can be confident that we won't break things.

How it Works

scriptkeeper uses ptrace, a Linux kernel feature that lets one process monitor and intercept the syscalls of another process, typically a child that it has spawned. When scriptkeeper runs a test, this is roughly what happens:

  • scriptkeeper spawns a Bash process executing the script under test
  • The Bash process executes commands through syscalls
  • scriptkeeper intercepts these syscalls and compares them to the expected commands
  • scriptkeeper creates fake executables that behave as specified for the commands in the tests
  • scriptkeeper modifies the syscalls to run these fake executables instead of the real commands

This approach lets you run tests for your scripts without changing your scripts at all.
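
Intercepting syscalls with ptrace is out of reach for a few lines of Bash, but the effect of the last two steps can be approximated with a different and much cruder technique: placing fake executables first on PATH. The sketch below is not how scriptkeeper works, and make_fake and run_with_fakes are hypothetical helpers. Each fake appends its invocation to a log file and exits with a configured code, so the real command never runs.

```bash
# Hypothetical helper: create a fake executable NAME in directory DIR that logs
# its invocation to DIR/log and exits with EXITCODE instead of doing real work.
make_fake() {
  mf_dir=$1 mf_name=$2 mf_exit=$3
  cat >"$mf_dir/$mf_name" <<EOF
#!/bin/sh
echo "$mf_name \$*" >>"$mf_dir/log"
exit $mf_exit
EOF
  chmod +x "$mf_dir/$mf_name"
}

# Hypothetical helper: run a command with the fakes in DIR placed first on
# PATH, then print the commands that were recorded in the log.
run_with_fakes() {
  rf_dir=$1
  shift
  : >"$rf_dir/log"                                   # start with an empty log
  PATH="$rf_dir:$PATH" "$@" >/dev/null 2>&1 || true  # ignore the exit code
  cat "$rf_dir/log"
}

# Usage: record what a two-line deploy script would execute.
demo_dir=$(mktemp -d)
make_fake "$demo_dir" heroku 0
cat >"$demo_dir/deploy.sh" <<'EOF'
heroku container:push web --app my-fancy-app
heroku container:release web --app my-fancy-app
EOF
run_with_fakes "$demo_dir" sh "$demo_dir/deploy.sh"
```

The last line prints the two heroku commands in the order the script issued them. Unlike the ptrace approach, PATH shims only catch commands that are looked up by name: a script that calls a program by its absolute path bypasses them, and every command the script uses needs its own fake.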

scriptkeeper is written in Rust, and we think that was a good choice for this project. Rust is a well-thought-out programming language and has good tooling. In particular, Rust's algebraic data types are a pleasure to work with. For scriptkeeper, Rust seemed particularly attractive, since it has very good support for using low-level interfaces like kernel APIs.

Use and Contribute!

scriptkeeper is still somewhat experimental. We used real scripts to drive the development process. If you're trying out scriptkeeper, there's a chance that you'll be missing a feature or two. In that case, please open a new issue or vote on the existing ones.

Currently, scriptkeeper only runs on Linux. On other platforms you can run it inside a Docker container.

Outlook

The main use case that we've focused on for scriptkeeper is backfilling tests for existing Bash scripts. But the approach that it takes, mocking out syscalls with ptrace, also opens up some interesting possibilities beyond that:

  • You can add test cases before implementing the corresponding functionality. This means you can write Bash scripts in a test-driven development style.

  • scriptkeeper is, in principle, language-agnostic, so it could test scripts or programs in other languages. These other languages might use different syscalls than Bash, but scriptkeeper could start supporting these syscalls too.

  • Given that scriptkeeper could work for other languages, you could use it to reimplement scripts in a different language. Imagine the following workflow:

    • Backfill tests for an untested script by adding a test file
    • Delete the script
    • Create a new script in another language, using the existing test file to rewrite the old script in a TDD style

scriptkeeper started out as an experiment, and in the beginning we weren't sure that it was feasible to implement at all. While there are still features missing, we think a tool like scriptkeeper is not only possible to implement but also very valuable. We have used it successfully on a number of scripts internally, and we are planning to try it out more in the future. You're invited to do the same. We are grateful for any feedback, so please open issues on the issue tracker or vote on existing issues with a 👍.

github.com/Originate/scriptkeeper
