Today we are going to quickly cover the Linux return code and its use in the world of network automation.
What is a Return Code?
A return code (also known as an exit code) is:
a numeric value that a process returns to its parent process when it terminates, indicating whether it succeeded. A return code of 0 indicates success, whereas anything other than 0 indicates failure.
A quick example of this is shown below:
# Successful termination.
$ ls -l
total 128
-rw-r--r-- 1 rick rick 11357 Sep 22 21:51 LICENSE
-rw-r--r-- 1 rick rick 1307 Sep 22 21:51 Makefile
$ echo $?
0
As we can see, our return code is 0, which represents success. In the world of return codes, anything other than 0 is considered a failure, as shown below:
# Unsuccessful termination.
$ ls -l this_dir_name_is_not_present
ls: cannot access 'this_dir_name_is_not_present': No such file or directory
$ echo $?
2
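The same check can be made from a parent process written in Python. The sketch below is not from the original example: it uses the standard library's subprocess.run, and the helper name run_and_get_rc is hypothetical. To behave the same on any machine, it spawns a child Python interpreter (via sys.executable) that exits with a chosen code, rather than calling ls.

```python
import subprocess
import sys
from typing import Sequence


def run_and_get_rc(command: Sequence[str]) -> int:
    """Run a command and return the exit code it hands back to us, the parent."""
    completed = subprocess.run(command, capture_output=True)
    return completed.returncode


# A child process that terminates successfully.
ok_rc = run_and_get_rc([sys.executable, "-c", "pass"])

# A child process that deliberately exits with 2, like the failing ls above.
failed_rc = run_and_get_rc([sys.executable, "-c", "import sys; sys.exit(2)"])

print(f"ok_rc={ok_rc} failed_rc={failed_rc}")
```

The returncode attribute is the Python-side equivalent of reading $? in the shell.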
Why?
So why do we need return codes? Well, return codes are super useful for several use cases, all of which centre around the ability to perform flow control. For example:
- Shell scripts and one-liners - i.e. perform x if the command y is successful.
- Make - i.e. terminate if the executing command is unsuccessful.
- CI/CD - i.e. stop the workflow if one of the steps (typically a Linux command, think Pytest, Ansible etc.) is unsuccessful.
Below is an oversimplified example within a shell script, where a log message will be written (with the appropriate message) based on the success of running the Linux command tar.
$ cat backup.sh
tar czf backup.tgz /opt/
if [ $? -eq 0 ]
then
    logger -t linux_backup -p notice "Backup success."
else
    logger -t linux_backup -p notice "Backup failure."
fi
An example that is more relevant to network automation is a CI workflow, i.e. a workflow that is triggered when a PR is raised in GitHub and that runs a number of automated tests against our change. Think network group/host variable updates.
Below is an example of a config for a CI workflow (via GitHub Actions). If either of the two executed commands (pip, pytest) returns a non-zero return code, the workflow will fail.
name: CI
on:
  pull_request:
jobs:
  code-quality:
    name: "code-quality"
    ...
  test:
    name: test
    strategy:
      matrix:
        python-version: [3.8.6]
    runs-on: ubuntu-latest
    needs: code-quality
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install dependencies
        run: pip install -r requirements.txt  # <- workflow failure if rc is non-zero
      - name: Run tests
        run: pytest -vv --tb=short tests  # <- workflow failure if rc is non-zero
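The fail-fast behaviour described above can be approximated in a few lines of Python. This is only a sketch of the idea, not how GitHub Actions is implemented: the function name run_steps and the example step commands are hypothetical, and each step is just a child Python process so the example runs anywhere.

```python
import subprocess
import sys
from typing import List, Sequence


def run_steps(steps: Sequence[Sequence[str]]) -> int:
    """Run each step in order and stop at the first non-zero return code.

    Returns 0 if every step succeeded, otherwise the failing step's return code.
    """
    for step in steps:
        rc = subprocess.run(step).returncode
        if rc != 0:
            # Later steps are skipped, much like a CI job stopping after a failed step.
            return rc
    return 0


# Hypothetical steps: the second one fails, so the third never runs.
steps: List[List[str]] = [
    [sys.executable, "-c", "print('install dependencies')"],
    [sys.executable, "-c", "import sys; sys.exit(3)"],
    [sys.executable, "-c", "print('never reached')"],
]

workflow_rc = run_steps(steps)
print(f"workflow return code: {workflow_rc}")
```

Returning the failing step's own code, rather than a fixed 1, is a design choice made here so the caller can tell which kind of failure occurred.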
Knowing this behaviour is all well and good, and to be fair any pre-baked Linux CLI/command you use (Pytest, Ansible, Black etc.) will return the necessary return code upon success or failure. Where understanding this behaviour comes into play is when writing custom scripts, such as Python scripts using the Nornir framework. This now leads us to the final part of this post …
Return Codes via Python
So if we are writing our own scripts (in, let's say, Python), how do we issue a return code so that we can run our script via a CI/CD workflow, for example?
The answer to this is fairly simple: we use the sys module and supply the return code using exit, like so:
$ python3
>>> import sys
>>> sys.exit(1)
$ echo $?
1
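In a script rather than an interactive session, a common pattern is to have a main() function return the code and pass it to sys.exit. The sketch below assumes a hypothetical main() and checks_passed flag that are not part of this post's script. It also relies on the fact that sys.exit works by raising SystemExit, which is why the code can be captured in-process by the hypothetical helper exit_code_of.

```python
import sys


def main(checks_passed: bool) -> int:
    """Return 0 on success and 1 on failure (hypothetical example logic)."""
    if not checks_passed:
        print("checks failed", file=sys.stderr)
        return 1
    return 0


def exit_code_of(checks_passed: bool) -> int:
    """Call sys.exit(main(...)) and capture the code instead of terminating."""
    try:
        sys.exit(main(checks_passed))
    except SystemExit as exc:
        # sys.exit raises SystemExit; its code attribute holds the return code.
        return exc.code


print(exit_code_of(True), exit_code_of(False))
```

In a real script the capture helper would not be needed: the last line would simply be sys.exit(main(...)) under the usual __name__ guard.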
If we circle back to Nornir, we can use the failed attribute from our results to see if any task has failed and, if so, issue a return code of anything other than 0 to tell the “outside world”/parent process that our script has failed.
…
deploy_result = nr.run(task=deploy_config)
if deploy_result.failed:
    sys.exit(1)
Full Nornir config deployment script
#!/usr/bin/env python
import os
import sys
from pathlib import Path
from dotenv import load_dotenv
from nornir_napalm.plugins.tasks import napalm_configure
from nornir_utils.plugins.functions import print_result
from nornir import InitNornir
from nornir.core.task import Result, Task
# Load the environment variables from the .env file.
load_dotenv()
# Nornir config path and filename.
NORNIR_CONFIG_FILE = f"{Path(__file__).parent}/config.yaml"
# Variables for the Nornir tasks.
OUTPUT_PATH = f"{Path(__file__).parent}/output"
# Initialize Nornir against the supplied config file.
nr = InitNornir(config_file=NORNIR_CONFIG_FILE)
# Pull the device username/password from the environment variables and assign to the inventory defaults.
nr.inventory.defaults.username = os.getenv("LAB_USERNAME")
nr.inventory.defaults.password = os.getenv("LAB_PASSWORD")
# Filter the inventory for only the spine devices.
nr = nr.filter(role="spine")
def read_file(*, filename: str) -> str:
    """
    Read data from a file.

    Args:
        filename (str): Path of the file to read.

    Returns:
        The data from the file.
    """
    with open(filename) as f:
        return f.read()
def deploy_config(task: Task) -> Result:
    """
    Deploy the host's config file via the napalm_configure task.

    Args:
        task (Task): The Nornir task object that is passed to the function.

    Returns:
        A Result object with the host and result attributes.
    """
    napalm_result = task.run(
        task=napalm_configure,
        filename=f"{OUTPUT_PATH}/{task.host}_vlan.txt",
        replace=False,
    )
    return Result(
        host=task.host,
        result=f"{napalm_result.result}",
    )
# Allows the script to be run as a standalone script, or imported as a
# module.
if __name__ == "__main__":
    # Run the deploy_config task against the nr object.
    deploy_result = nr.run(task=deploy_config)
    # Print the result of the config deployment.
    print_result(deploy_result)
    # Exit with a non-zero return code if any task failed.
    if deploy_result.failed:
        sys.exit(1)