This was originally published on the Pluralsight tech blog.
So you’ve got an idea for something to build - awesome! - but maybe that code doesn’t really make sense to build into something servable, like a webapp. Instead, you want to make a shareable utility, so other users can run your tool on the fly. Building your code into a proper Python package is great, but you might not really need other users to integrate your code, at a granular level, into their own - rather, they might just need to be able to trigger tools in an ad-hoc fashion with relatively few options. At that point, you’re really talking about building an executable utility, rather than a package.
Unfortunately, Python doesn’t have a fantastic story for building standalone executables (though tools like PyInstaller
have come a long way) due to the way the Python environment handles dependencies.
There isn’t really a standardized way to bundle executable code, its dependencies, and (potentially) runtime information into a single distributable block in the way that, say, a fat Java .jar
file or a Go executable would.
However, there is a middle ground!
Python does have a good way to pack scripts along with associated code in its packages to create (relatively) enclosed command-line utilities.
While this does still have those environment dependencies (that is, it is installed to a particular Python environment and needs external dependencies installed there), such tools are pip
-installable and, from a user’s standpoint, can be treated as standalone.
This pattern is extremely frequent in common Python utilities, from core management tools like pip
and virtualenv
to code-quality tools like black
, flake8
, and mypy
.
Despite seeming daunting, building this type of command-line utility is actually pretty easy in Python! In this post, we’ll discuss how to build a CLI utility with some exciting new Python tools, including:
poetry
: this gets us a number of benefits, like integrated tooling across environment and build control, and fully-specified, deterministic dependency resolutiontyper
library for its clean design and clever use of Python type annotationsComplete dockerized code for this post, with additional configuration, build tools, and functionality, may be found here. Things we won’t be covering:
poetry
plays nicely with any standard Python package index (i.e., public PyPI, or a private artifact store)Let’s get going!
First, let’s consider an idea for what our tool will do. I’ve been playing a lot of Dungeons & Dragons recently (and we get fancy dice for writing blog posts), so a dice-rolling app sounds nice - let’s make something for that!
First, we’ll want the ability to specify a number of dice to roll, and their size (i.e., number of sides) - let’s have this return a list of the individual rolls (sorted in descending order in case we want to pick off larger rolls) and their total.
Next, we’ll want to be able to specify rolls in our common shorthand - e.g., writing “2D6” to specify rolling two six-sided dice - so we want a function that can parse a string like that into the numeric inputs for our first function.
To start with, we can write these functions in a file… let’s call it dice.py
:
import random
import re
from typing import Tuple, List
def roll(num_dice: int = 1, sides: int = 20) -> Tuple[List[int], int]:
rolls = sorted(
[random.choice(range(1, sides + 1)) for _ in range(num_dice)], reverse=True
)
return (rolls, sum(rolls))
def parse_dice_string(dice_string: str) -> Tuple[int, int]:
# extract digits from dice-roll strings like "2D6" with regex witchcraft
hit = re.search(r"(\d*)[dD](\d+)", dice_string)
if not hit:
raise ValueError("bad string")
count, sides = hit.groups()
count_int = int(count or 1) # regex hits on "" for 1st digit, munge to 1
sides_int = int(sides)
return (count_int, sides_int)
def roll_from_string(dice_string: str) -> Tuple[List[int], int, str]:
count, sides = parse_dice_string(dice_string)
rolls, total = roll(num_dice=count, sides=sides)
return (rolls, total, f"{count}D{sides}")
(We’ll skip writing docstrings for these for now, and rely on useful variable names and Python’s type hinting to guide people reading our code - the example code is more thoroughly documented).
Next, let’s try building this into a package - we’ll use the poetry
tool since it gives us quite a few niceties, like integrated tooling for environment & build control, and deterministic dependency resolution.
Simply running poetry new roll-the-dice
will create a project directory for us ready to go:
roll-the-dice/
|-- roll_the_dice/
| |-- __init__.py
| |-- dice.py
|-- tests/
| |-- __init__.py
| |-- test_dice.py
|-- pyproject.toml
|-- README.rst
(after adding our dice.py
and writing some appropriate unit tests, of course).
By nesting code as needed in directories (each with an __init__.py
file), we indicate that the directory structure should be interpreted as a module path - that is, we can import our functions from the roll_the_dice.dice
module.
Everything we need to manage our project’s environment and build is in pyproject.toml
:
[tool.poetry]
name = "roll-the-dice"
version = "0.0.1"
description = "a roll the dice CLI"
authors = [
"John Walk <email@example.com>"
]
[tool.poetry.dependencies]
python = "^3.7"
typer = "^0.0.8"
[tool.poetry.dev-dependencies]
pytest = "^5.2"
[build-system]
requires = ["poetry>=0.12"]
build-backend = "poetry.masonry.api"
(More configuration options for poetry
can be found here.)
With this set up, simply running poetry install
will create a virtual environment with our specified dependencies installed, ready to work with the package code.
Running poetry shell
will launch a shell in that virtual environment for interactive testing, and we can run shell commands in the environment with poetry run
(e.g., poetry run pytest
will run our unit tests at the project root, automatically discovering our test files in the project structure).
Ultimately, we can use poetry build -f wheel
and poetry publish
to assemble our package into a wheel
file and push it to PyPI (or any other package index we desire).
In essence, writing a simple command-line script is straightforward in Python - to get behavior equivalent to running individual commands in a Python shell, we just need a file (let’s call it cli.py
) structured like this:
def do_something():
# functions that do some things...
...
if __name__ == "__main__":
do_something()
for which running python cli.py
from the command line will execute any commands specified in the final if
block (i.e., do_something()
in this case).
This looks a little arcane, but generally we won’t need to worry about it - the short explanation is, Python automatically sets a __name__
attribute to the module level for any code (e.g., import foo
will set __name__
to "foo"
for the code therein), with "__main__"
as the reserved name for top-level execution from the command line.
Suffice to say, that block gets executed when called directly from the command line, and ignored in all other instances (like if a user imported cli.py
), so it gives us a convenient hook for our scripts.
Next, we’ll want to think about managing the command-line interface, since just calling static code is a bit silly.
In the standard library, Python provides a package called argparse
that can build command-line arguments for a function, but it requires a somewhat opaque (but highly flexible) structure to interface between command-line arguments and the arguments passed to Python functions internally.
The new typer
library (built on top of the also-excellent click
) makes this much simpler, by baking command-line arguments directly into Python function calls using Python’s new type annotation system.
Typer
is built by the same designer as FastAPI (another favorite of mine), and leverages many of the same design decisions to achieve this cleverness.
First, let’s look at creating a very simple script:
import typer
def hello_world():
"""our first CLI with typer!
"""
typer.echo("Opening blog post...")
typer.launch(
"https://pluralsight.com/tech-blog/python-cli-utilities-with-poetry-and-typer"
)
if __name__ == "__main__":
typer.run(hello_world)
For simple commands, simply calling typer.run
on a function is sufficient - calling it from the command line like python cli.py
will trigger the script, printing to your command line and launching this blog post in your browser.
The typer
call adds a number of niceties as well, like syntax highlighting & coloring in the terminal via typer.echo
.
It even starts building documentation and call options - running python cli.py --help
will show a help message with the contents of the function’s docstring and any arguments or options (including the auto-generated --help
flag).
We can add even more control by instantiating a typer
app instead - rather than directly calling run
, we could write the above as
import typer
app = typer.Typer()
@app.command("hello")
def hello_world()
... # contents from the function above
if __name__ == "__main__":
app()
creating an app and then binding commands to it.
If you’re familiar with Flask or FastAPI (by the same author as typer
) for building webapps, this pattern should look pretty familiar - but rather than creating a web app, you’re creating endpoints for a command line app.
This lets us create multiple subcommands for the script trivially simply by binding multiple functions to the app (just like endpoints on a webapp), and lets us add some additional features like specifying the name for the command (in this case, python cli.py hello
) to override the function name for cleaner calls.
Next, let’s write our first useful command (that is, one that’s using our roll_the_dice
tools) - to start, we’ll write a command that rolls the dice from an input string.
That means we’ll need to be able to intelligently handle command-line inputs to our script.
In a tool like argparse
, this would require creating a separate object just to handle arguments, calling them in a somewhat roundabout way.
With typer
, though, we just need to add arguments to the functions themselves with type annotations, and the CLI app takes care of the rest (again, this pattern will be familiar to anyone who’s done request parameter handling in FastAPI).
@app.command("roll-str")
def roll_string(dice_str: str):
"""Rolls the dice from a formatted string.
We supply a formatted string DICE_STR describing the roll, e.g. '2D6'
for two six-sided dice.
"""
try:
rolls_list, total, formatted_roll = roll_from_string(dice_str)
except ValueError:
typer.echo(f"invalid roll string: {dice_str}")
raise typer.Exit(code=1)
typer.echo(f"rolling {formatted_roll}!\n")
typer.echo(f"your roll: {total}\n")
The typer
app automatically takes this function argument & type annotation, and builds it into a positional argument for the script call - we can call this like
$ python cli.py roll-str 2D6
and typer
correctly handles the input argument (note that we access the command with the name specified in app.command
).
The app automatically handles missing or extra arguments in the call, and gives us an easy way to hook Python errors (like the ValueError
raised by badly-formatted strings) into CLI errors.
It even generates a useful help string for the command using the annotations plus the function’s docstring:
$ python cli.py roll-str --help
Usage: cli.py roll-str [OPTIONS] DICE_STR
Rolls the dice from a formatted string.
We supply a formatted string DICE_STR describing the roll, e.g. '2D6' for
two six-sided dice.
Options:
--help Show this message and exit.
We can be more fancy with our command-line options as well.
While typer
will interpret annotated keyword arguments as options or flags (in the same way positional arguments to the Python function are treated as required args for the CLI), it also provides helper functions for even greater control.
We can use the typer.Argument
and typer.Option
commands for positional and keyword inputs, letting us set things like help strings, overrides for flag names, and basic input validation.
We’ll use Option
flags for inputs to a command that rolls dice directly from numeric inputs (i.e., we explicitly pass the number and size of dice to roll), such that we could choose to omit options in favor of using defaults.
That is, we want to call the function like this to roll a pair of D20s:
$ python cli.py roll-num -n 2 -d 20 --rolls
(or we can skip either input to use the default value defined in the function).
The typer
app correctly interprets keyword arguments for these options:
@app.command("roll-num")
def roll_num(
num_dice: int = typer.Option(
1, "-n", "--num-dice", help="number of dice to roll", show_default=True, min=1
),
sides: int = typer.Option(
20, "-d", "--sides", help="number-sided dice to roll", show_default=True, min=1
),
rolls: bool = typer.Option(
False, help="set to display individual rolls", show_default=True
),
):
"""Rolls the dice from numeric inputs.
We supply the number and side-count of dice to roll with option arguments.
"""
rolls_list, total = roll(num_dice=num_dice, sides=sides)
typer.echo(f"rolling {num_dice}D{sides}!\n")
typer.echo(f"your roll: {total}\n")
if rolls:
typer.echo(f"made up of {rolls_list}\n")
The Option
flags let us specify flag names (including short- and long-form versions, like -d
for --sides
), set a help string for the flags, and even enforce minimum or maximum values for the inputs.
All these extra settings are automatically included in the helpstrings for the commands:
$ python cli.py roll-num --help
Usage: cli.py roll-num [OPTIONS]
Rolls the dice from numeric inputs.
We supply the number and side-count of dice to roll with option arguments.
Options:
-n, --num-dice INTEGER RANGE number of dice to roll [default: 1]
-d, --sides INTEGER RANGE number-sided dice to roll [default: 20]
--rolls / --no-rolls set to display individual rolls [default:
False]
--help Show this message and exit.
Typer even adds some cleverness for boolean flags, automatically generating flag
and no-flag
options rather than requiring the user to pass true/false values.
(A number of other clever parameter handling techniques can be found here for things like dates, filepaths, and enumerated options for commands.)
For multi-command scripts like this, typer
will also autogenerate top-level help:
$ python cli.py --help
Usage: cli.py [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
hello our first CLI with typer!
roll-num Rolls the dice from numeric inputs.
roll-str Rolls the dice from a formatted string.
(note that the command descriptions have a little cleverness - it automatically uses the first sentence of the docstring, so that should be a “one-liner” description of the command).
All told, this gives us a great way to handle command-line scripts. We can easily create multiple subcommands as needed, each with automatic documentation and any data validation we need, with minimal additional code over what we’d need for the bare Python function. Next, let’s look at how we can integrate this into our package into a standalone tool, rather than just a script.
The script above is perfectly functional for the command-line - that is, we can call it with python cli.py
and it will do the thing (provided we have our roll-the-dice
package installed).
However, this isn’t really ideal for building a truly reusable standalone utility.
For one, we don’t really have a good way to distribute the script - users can install roll-the-dice
(once we’ve pushed it to a package index), but they’d need to separately manage (and version control!) installs of the script.
Rather, we want a way to include the script with the package itself, such that the script is installed and version-controlled along with the package as a standalone command (much like how Python tools like flake8
or mypy
can be imported or called from the command line).
That is, in place of our awkward python cli.py
calls, we want to make our script into a command (let’s name it rtd
) that we can call like
$ rtd COMMAND [OPTIONS] ARGS
Historically, Python’s packaging utilities have supported a scripts setting, where we could include script files like the one we wrote above in a bin/
top-level directory (parallel to the package source and test directories).
The setuptools
build process would package these files in with the package source or wheel file, and copy the scripts to the Python environment’s bin/
directory at install to create a command accessible when the environment is active.
However, this runs into some awkwardness with integrating script code with the rest of the package (e.g., for testing, or handling multi-file source code for complex tools) and is difficult to get working on both Windows and POSIX systems.
Instead, we’ll use the more modern console_scripts
entrypoint to create our command.
This lets us include the command-line tooling directly into the function, and avoids any mucking around with namespace hacks (i.e., the __name__ == "__main__"
check) by instead directly referencing a function (not a script!) within the package itself.
At package install, Python will automatically create scripts in its environment’s bin/
directory that simply import the referenced functions and runs from there (as opposed to copying the entire source code referenced by the older scripts
keyword).
Let’s get started - first, we simply copy our cli.py
file into the package, where we’ll treat it like any other submodule:
roll-the-dice/
|-- roll_the_dice/
| |-- __init__.py
| |-- cli.py
| |-- dice.py
|-- tests/
|-- pyproject.toml
|-- README.rst
In our CLI file, we need to replace the __main__
invocation with an ordinary function: in place of
if __name__ == "__main__":
app()
we simply need
def main():
app()
Since this is just another submodule in our package, we can even write unit tests for it just like any other function - pytest
provides a built-in capsys
fixture to capture the standard out and error logs so we can easily test on the CLI commands outputs, like below:
def test_roll_num(capsys):
roller = roll_num(num_dice=1, sides=20)
stdout = capsys.readouterr().out
regex = re.compile(r"rolling (\d+D\d+)!\n\nyour roll: (\d+)")
roll_str, total = re.search(regex, stdout).groups()
assert roll_str == "1D20"
assert int(total) in range(1, 21)
(note, however, that in cases using typer.Argument
or typer.Option
we’ll likely need to explicitly provide values, since the Python interpreter will otherwise not correctly parse the typer
fields to their underlying values.)
Lastly, we need to define this function as the entrypoint in our pyproject.toml
file:
[tool.poetry.scripts]
rtd = "roll_the_dice.cli:main"
(we could actually just directly reference the app
function here, making main()
unnecessary - but this pattern is more explicit, allows for any additional setup calls needed around the app instantiation, and minimizes objects imported into the autogenerated script.)
We can access this command during development with poetry run rtd
, as poetry
will run any script defined in the TOML file within its included virtual environment.
At package build, poetry
will automatically convert this into a standard package entrypoint, and on install that entrypoint will create a CLI command rtd
in our Python environment that we can call perfectly normally.
With that done, we’re ready to roll our dice, e.g., rtd roll-str 2D6
to get to our first rolling function - we have a fully functional CLI tool at our disposal!
Though Python doesn’t have a fantastic story for building completely standalone executables, it’s nevertheless a great language for scripting - and modern tools make it easy to package our scripts into pip
-installable command line utilities, letting us validate, test, and version-control scripts.
In this walkthrough, we:
poetry
typer
libraryUsing these next-gen tools gives us a lot of benefits, like integrated tooling & dependency resolution with poetry
or clean, type-annotated CLI layout with typer
, but if we wanted we could just as easily use other tools for this layout.
For example, the console_scripts
entrypoint is agnostic of the actual behavior of the CLI - it just needs access to an importable function that takes no arguments (from within Python - instead arguments are handled by our CLI parser of choice).
If we were using argparse
, for example, we’d just need to include handling for the ArgumentParser
object in our main()
function.
Similarly, we could build our package (including scripts) with setuptools
rather than poetry
- this loses us the integrated environment, package, and build control, but will in some cases be necessary (for example, building a package with bindings to compiled non-Python code).
We can still gain a lot of the benefit of more modern build configuration by leaning on a setup.cfg
file rather than piling configuration in the executable setup.py
file.
To include our script as an entrypoint for a package with setuptools
, we just need to add to our setup.cfg
file:
[options.entry_points]
console_scripts =
rtd = roll_the_dice.cli:main
which will build the package’s console_scripts
entrypoint identical to the tool.poetry.scripts
option with poetry
.
Whatever tooling we choose, building pip
-installable CLI tools with Python is a great way to distribute ad-hoc scripts in a tested, version-controlled way with little additional overhead.