Python CLI Utilities with Poetry and Typer

John Walk

February 2020

This was originally published on the Pluralsight tech blog.

So you’ve got an idea for something to build - awesome! - but maybe that code doesn’t really make sense to build into something servable, like a webapp. Instead, you want to make a shareable utility, so other users can run your tool on the fly. Building your code into a proper Python package is great, but you might not really need other users to integrate your code, at a granular level, into their own - rather, they might just need to be able to trigger tools in an ad-hoc fashion with relatively few options. At that point, you’re really talking about building an executable utility, rather than a package.

Unfortunately, Python doesn’t have a fantastic story for building standalone executables (though tools like PyInstaller have come a long way) due to the way the Python environment handles dependencies. There isn’t really a standardized way to bundle executable code, its dependencies, and (potentially) runtime information into a single distributable block in the way that, say, a fat Java .jar file or a Go executable would.

However, there is a middle ground! Python does have a good way to pack scripts along with associated code in its packages to create (relatively) enclosed command-line utilities. While this does still have those environment dependencies (that is, it is installed to a particular Python environment and needs external dependencies installed there), such tools are pip-installable and, from a user’s standpoint, can be treated as standalone. This pattern is extremely frequent in common Python utilities, from core management tools like pip and virtualenv to code-quality tools like black, flake8, and mypy.

Despite seeming daunting, building this type of command-line utility is actually pretty easy in Python! In this post, we’ll discuss how to build a CLI utility with some exciting new Python tools, including:

  • building a package with poetry: this gets us a number of benefits, like integrated tooling across environment and build control, and fully-specified, deterministic dependency resolution
  • writing a command-line script: we’ll use the new typer library for its clean design and clever use of Python type annotations
  • integrating the CLI: baking the script into the package’s entrypoints to make an installable utility

Complete dockerized code for this post, with additional configuration, build tools, and functionality, may be found here. Things we won’t be covering:

  • unit testing design (though the linked example code includes package tests)
  • CI or deployment, since this is highly organization-specific - however, poetry plays nicely with any standard Python package index (e.g., public PyPI, or a private artifact store)

Let’s get going!

building a package

First, let’s consider an idea for what our tool will do. I’ve been playing a lot of Dungeons & Dragons recently (and we get fancy dice for writing blog posts), so a dice-rolling app sounds nice - let’s make something for that!

First, we’ll want the ability to specify a number of dice to roll, and their size (i.e., number of sides) - let’s have this return a list of the individual rolls (sorted in descending order in case we want to pick off larger rolls) and their total. Next, we’ll want to be able to specify rolls in our common shorthand - e.g., writing “2D6” to specify rolling two six-sided dice - so we want a function that can parse a string like that into the numeric inputs for our first function. To start with, we can write these functions in a file… let’s call it dice.py:

import random
import re
from typing import Tuple, List


def roll(num_dice: int = 1, sides: int = 20) -> Tuple[List[int], int]:
    rolls = sorted(
        [random.choice(range(1, sides + 1)) for _ in range(num_dice)], reverse=True
    )
    return (rolls, sum(rolls))


def parse_dice_string(dice_string: str) -> Tuple[int, int]:
    # extract digits from dice-roll strings like "2D6" with regex witchcraft
    hit = re.search(r"(\d*)[dD](\d+)", dice_string)
    if not hit:
        raise ValueError("bad string")

    count, sides = hit.groups()
    count_int = int(count or 1)  # regex hits on "" for 1st digit, munge to 1
    sides_int = int(sides)
    return (count_int, sides_int)


def roll_from_string(dice_string: str) -> Tuple[List[int], int, str]:
    count, sides = parse_dice_string(dice_string)
    rolls, total = roll(num_dice=count, sides=sides)
    return (rolls, total, f"{count}D{sides}")

(We’ll skip writing docstrings for these for now, and rely on useful variable names and Python’s type hinting to guide people reading our code - the example code is more thoroughly documented).
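
One caveat with the parser above: re.search matches anywhere in the string, so an input like "roll 2D6 now" would still parse. If we wanted stricter input checking, a minimal sketch using re.fullmatch might look like the following (parse_dice_string_strict is a hypothetical name used only for illustration, not a function from this post):

```python
import re
from typing import Tuple


def parse_dice_string_strict(dice_string: str) -> Tuple[int, int]:
    # hypothetical stricter variant: the whole string must be a dice expression
    hit = re.fullmatch(r"(\d*)[dD](\d+)", dice_string.strip())
    if not hit:
        raise ValueError(f"bad dice string: {dice_string!r}")

    count, sides = hit.groups()
    # an omitted count (e.g. "d20") is treated as a single die
    return (int(count or 1), int(sides))


if __name__ == "__main__":
    print(parse_dice_string_strict("2D6"))  # (2, 6)
    print(parse_dice_string_strict("d20"))  # (1, 20)
```

Whether to be this strict is a design choice - the looser re.search version is friendlier to sloppy input, while this one fails loudly on anything unexpected.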

Next, let’s try building this into a package - we’ll use the poetry tool since it gives us quite a few niceties, like integrated tooling for environment & build control, and deterministic dependency resolution. Simply running poetry new roll-the-dice will create a project directory for us ready to go:

roll-the-dice/
|-- roll_the_dice/
|   |-- __init__.py
|   |-- dice.py
|-- tests/
|   |-- __init__.py
|   |-- test_dice.py
|-- pyproject.toml
|-- README.rst

(after adding our dice.py and writing some appropriate unit tests, of course). By nesting code as needed in directories (each with an __init__.py file), we indicate that the directory structure should be interpreted as a module path - that is, we can import our functions from the roll_the_dice.dice module. Everything we need to manage our project’s environment and build is in pyproject.toml:

[tool.poetry]
name = "roll-the-dice"
version = "0.0.1"
description = "a roll the dice CLI"
authors = [
    "John Walk <email@example.com>"
]

[tool.poetry.dependencies]
python = "^3.7"
typer = "^0.0.8"

[tool.poetry.dev-dependencies]
pytest = "^5.2"

[build-system]
requires = ["poetry>=0.12"]
build-backend = "poetry.masonry.api"

(More configuration options for poetry can be found here.) With this set up, simply running poetry install will create a virtual environment with our specified dependencies installed, ready to work with the package code. Running poetry shell will launch a shell in that virtual environment for interactive testing, and we can run shell commands in the environment with poetry run (e.g., poetry run pytest will run our unit tests at the project root, automatically discovering our test files in the project structure). Ultimately, we can use poetry build -f wheel and poetry publish to assemble our package into a wheel file and push it to PyPI (or any other package index we desire).

writing a command-line script

In essence, writing a simple command-line script is straightforward in Python - to get behavior equivalent to running individual commands in a Python shell, we just need a file (let’s call it cli.py) structured like this:

def do_something():
    # functions that do some things...
    ...


if __name__ == "__main__":
    do_something()

for which running python cli.py from the command line will execute any commands specified in the final if block (i.e., do_something() in this case). This looks a little arcane, but generally we won’t need to worry about it - the short explanation is that Python automatically sets a __name__ attribute at the module level for any code (e.g., import foo will set __name__ to "foo" for the code therein), with "__main__" as the reserved name for top-level execution from the command line. Suffice it to say, that block gets executed when the file is called directly from the command line, and ignored in all other instances (like if a user imported cli.py), so it gives us a convenient hook for our scripts.
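
To see the mechanism in action, here is a small sketch that reports which context a file is running in (describe_context is a hypothetical helper, made up purely for this illustration):

```python
def describe_context(module_name: str) -> str:
    # "__main__" is the name Python assigns to the file being run directly
    if module_name == "__main__":
        return "run directly from the command line"
    return f"imported as module {module_name!r}"


# runs on both import and direct execution, with a different message for each
print(describe_context(__name__))
```

Saved as cli.py, running python cli.py should report direct execution, while python -c "import cli" should report the import instead.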

Next, we’ll want to think about managing the command-line interface, since just calling static code is a bit silly. In the standard library, Python provides a module called argparse that can build command-line arguments for a function, but it requires a somewhat opaque (but highly flexible) structure to interface between command-line arguments and the arguments passed to Python functions internally. The new typer library (built on top of the also-excellent click) makes this much simpler, by baking command-line arguments directly into Python function calls using Python’s type annotation system. Typer is built by the same designer as FastAPI (another favorite of mine), and leverages many of the same design decisions to achieve this cleverness.

First, let’s look at creating a very simple script:

import typer


def hello_world():
    """our first CLI with typer!
    """
    typer.echo("Opening blog post...")
    typer.launch(
        "https://pluralsight.com/tech-blog/python-cli-utilities-with-poetry-and-typer"
    )


if __name__ == "__main__":
    typer.run(hello_world)

For simple commands, calling typer.run on a function is sufficient - calling it from the command line like python cli.py will trigger the script, printing to your command line and launching this blog post in your browser. Typer adds a number of niceties as well, like consistent terminal printing via typer.echo (with colored and styled output available through typer.style and typer.secho). It even starts building documentation and call options - running python cli.py --help will show a help message with the contents of the function’s docstring and any arguments or options (including the auto-generated --help flag).

We can add even more control by instantiating a typer app instead - rather than directly calling run, we could write the above as

import typer

app = typer.Typer()


@app.command("hello")
def hello_world():
    ...  # contents from the function above


if __name__ == "__main__":
    app()

creating an app and then binding commands to it. If you’re familiar with Flask or FastAPI (by the same author as typer) for building webapps, this pattern should look pretty familiar - but rather than creating a web app, you’re creating endpoints for a command-line app. This lets us create multiple subcommands for the script simply by binding multiple functions to the app (just like endpoints on a webapp), and lets us add some additional features like specifying the name for the command (in this case, python cli.py hello) to override the function name for cleaner calls.
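
If the decorator pattern itself is unfamiliar, a toy version may help show what "binding a command to an app" means. This is only a standard-library sketch of the general idea - ToyApp and its methods are hypothetical and are not how typer is actually implemented:

```python
from typing import Callable, Dict, List, Optional


class ToyApp:
    """A toy stand-in for a CLI app object; NOT typer's implementation."""

    def __init__(self) -> None:
        self.commands: Dict[str, Callable[[], str]] = {}

    def command(self, name: Optional[str] = None):
        # returns a decorator that registers the function under `name`,
        # falling back to the function's own name
        def register(func: Callable[[], str]) -> Callable[[], str]:
            self.commands[name or func.__name__] = func
            return func

        return register

    def __call__(self, argv: List[str]) -> str:
        # dispatch on the first argument, like `python cli.py hello`
        if not argv or argv[0] not in self.commands:
            raise SystemExit(f"unknown command; choose from {sorted(self.commands)}")
        return self.commands[argv[0]]()


app = ToyApp()


@app.command("hello")
def hello_world() -> str:
    return "Opening blog post..."


@app.command()
def goodbye() -> str:
    return "See you next time!"


if __name__ == "__main__":
    print(app(["hello"]))  # Opening blog post...
```

The real library layers argument parsing, help generation, and error handling on top, but the registration step follows the same shape.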

Next, let’s write our first useful command (that is, one that’s using our roll_the_dice tools) - to start, we’ll write a command that rolls the dice from an input string. That means we’ll need to be able to intelligently handle command-line inputs to our script. In a tool like argparse, this would require creating a separate object just to handle arguments, calling them in a somewhat roundabout way. With typer, though, we just need to add arguments to the functions themselves with type annotations, and the CLI app takes care of the rest (again, this pattern will be familiar to anyone who’s done request parameter handling in FastAPI).

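# at the top of cli.py, alongside `import typer`
from roll_the_dice.dice import roll, roll_from_string


@app.command("roll-str")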
def roll_string(dice_str: str):
    """Rolls the dice from a formatted string.

    We supply a formatted string DICE_STR describing the roll, e.g. '2D6'
    for two six-sided dice.
    """
    try:
        rolls_list, total, formatted_roll = roll_from_string(dice_str)
    except ValueError:
        typer.echo(f"invalid roll string: {dice_str}")
        raise typer.Exit(code=1)

    typer.echo(f"rolling {formatted_roll}!\n")
    typer.echo(f"your roll: {total}\n")

The typer app automatically takes this function argument & type annotation, and builds it into a positional argument for the script call - we can call this like

$ python cli.py roll-str 2D6

and typer correctly handles the input argument (note that we access the command with the name specified in app.command). The app automatically handles missing or extra arguments in the call, and gives us an easy way to hook Python errors (like the ValueError raised by badly-formatted strings) into CLI errors. It even generates a useful help string for the command using the annotations plus the function’s docstring:

$ python cli.py roll-str --help
Usage: cli.py roll-str [OPTIONS] DICE_STR

  Rolls the dice from a formatted string.

  We supply a formatted string DICE_STR describing the roll, e.g. '2D6' for
  two six-sided dice.

Options:
  --help                Show this message and exit.
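
To get a feel for how a function signature can turn into a CLI description like the one above, here is a toy sketch using the standard library's inspect module. It only illustrates the general idea (no default means a required argument, a default means an option) - describe_cli and example_command are hypothetical names, and typer's real logic is considerably more involved:

```python
import inspect
from typing import Callable, List


def describe_cli(func: Callable) -> List[str]:
    """Toy illustration: map a function signature onto CLI-style inputs."""
    lines = []
    for name, param in inspect.signature(func).parameters.items():
        type_name = getattr(param.annotation, "__name__", str(param.annotation))
        if param.default is inspect.Parameter.empty:
            # no default -> required positional argument, e.g. DICE_STR
            lines.append(f"argument {name.upper()} ({type_name})")
        else:
            # default present -> optional flag, e.g. --repeat
            flag = "--" + name.replace("_", "-")
            lines.append(f"option {flag} ({type_name}, default {param.default!r})")
    return lines


def example_command(dice_str: str, repeat: int = 1):
    # hypothetical command signature, used only to exercise describe_cli
    ...


if __name__ == "__main__":
    for line in describe_cli(example_command):
        print(line)
```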

We can get fancier with our command-line options as well. While typer will interpret annotated keyword arguments as options or flags (in the same way positional arguments to the Python function are treated as required args for the CLI), it also provides helper functions for even greater control. We can use the typer.Argument and typer.Option functions for positional and keyword inputs, letting us set things like help strings, overrides for flag names, and basic input validation.

We’ll use Option flags for inputs to a command that rolls dice directly from numeric inputs (i.e., we explicitly pass the number and size of dice to roll), such that we could choose to omit options in favor of using defaults. That is, we want to call the function like this to roll a pair of D20s:

$ python cli.py roll-num -n 2 -d 20 --rolls

(or we can skip either input to use the default value defined in the function). The typer app correctly interprets keyword arguments for these options:

@app.command("roll-num")
def roll_num(
    num_dice: int = typer.Option(
        1, "-n", "--num-dice", help="number of dice to roll", show_default=True, min=1
    ),
    sides: int = typer.Option(
        20, "-d", "--sides", help="number-sided dice to roll", show_default=True, min=1
    ),
    rolls: bool = typer.Option(
        False, help="set to display individual rolls", show_default=True
    ),
):
    """Rolls the dice from numeric inputs.

    We supply the number and side-count of dice to roll with option arguments.
    """
    rolls_list, total = roll(num_dice=num_dice, sides=sides)

    typer.echo(f"rolling {num_dice}D{sides}!\n")
    typer.echo(f"your roll: {total}\n")
    if rolls:
        typer.echo(f"made up of {rolls_list}\n")

The Option flags let us specify flag names (including short- and long-form versions, like -d for --sides), set a help string for the flags, and even enforce minimum or maximum values for the inputs. All these extra settings are automatically included in the help strings for the commands:

$ python cli.py roll-num --help
Usage: cli.py roll-num [OPTIONS]

  Rolls the dice from numeric inputs.

  We supply the number and side-count of dice to roll with option arguments.

Options:
  -n, --num-dice INTEGER RANGE  number of dice to roll  [default: 1]
  -d, --sides INTEGER RANGE     number-sided dice to roll  [default: 20]
  --rolls / --no-rolls          set to display individual rolls  [default:
                                False]
  --help                        Show this message and exit.

Typer even adds some cleverness for boolean flags, automatically generating flag and no-flag options rather than requiring the user to pass true/false values. (A number of other clever parameter handling techniques can be found here for things like dates, filepaths, and enumerated options for commands.) For multi-command scripts like this, typer will also autogenerate top-level help:

$ python cli.py --help
Usage: cli.py [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  hello     our first CLI with typer!
  roll-num  Rolls the dice from numeric inputs.
  roll-str  Rolls the dice from a formatted string.

(note that the command descriptions have a little cleverness - it automatically uses the first sentence of the docstring, so that should be a “one-liner” description of the command).

All told, this gives us a great way to handle command-line scripts. We can easily create multiple subcommands as needed, each with automatic documentation and any data validation we need, with minimal additional code over what we’d need for the bare Python function. Next, let’s look at how we can integrate this into our package as a standalone tool, rather than just a script.

integrating the CLI

The script above is perfectly functional for the command-line - that is, we can call it with python cli.py and it will do the thing (provided we have our roll-the-dice package installed). However, this isn’t really ideal for building a truly reusable standalone utility. For one, we don’t really have a good way to distribute the script - users can install roll-the-dice (once we’ve pushed it to a package index), but they’d need to separately manage (and version control!) installs of the script. Rather, we want a way to include the script with the package itself, such that the script is installed and version-controlled along with the package as a standalone command (much like how Python tools like flake8 or mypy can be imported or called from the command line). That is, in place of our awkward python cli.py calls, we want to make our script into a command (let’s name it rtd) that we can call like

$ rtd COMMAND [OPTIONS] ARGS

Historically, Python’s packaging utilities have supported a scripts setting, where we could include script files like the one we wrote above in a bin/ top-level directory (parallel to the package source and test directories). The setuptools build process would package these files in with the package source or wheel file, and copy the scripts to the Python environment’s bin/ directory at install to create a command accessible when the environment is active. However, this runs into some awkwardness with integrating script code with the rest of the package (e.g., for testing, or handling multi-file source code for complex tools) and is difficult to get working on both Windows and POSIX systems.

Instead, we’ll use the more modern console_scripts entrypoint to create our command. This lets us include the command-line tooling directly in the package, and avoids any mucking around with namespace hacks (i.e., the __name__ == "__main__" check) by instead directly referencing a function (not a script!) within the package itself. At package install, Python will automatically create scripts in its environment’s bin/ directory that simply import the referenced functions and run them from there (as opposed to copying over entire script files, as the older scripts keyword does).
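
Entrypoints are declared as references of the form package.module:function. As a rough sketch of how such a reference maps to a callable (resolve_entrypoint is a hypothetical helper for illustration - real installers generate their own wrapper code), the lookup boils down to an import plus an attribute access:

```python
import importlib
from typing import Callable


def resolve_entrypoint(reference: str) -> Callable:
    """Resolve a 'package.module:function' reference to the callable it names."""
    module_path, _, attr = reference.partition(":")
    module = importlib.import_module(module_path)
    return getattr(module, attr)


if __name__ == "__main__":
    # a stdlib target is used here so the sketch runs anywhere; a package
    # would point at its own function instead
    basename = resolve_entrypoint("os.path:basename")
    print(basename("/usr/local/bin/rtd"))  # rtd
```

Because the reference points at an importable function, the same code path is available to tests and to other modules, not just to the generated command.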

Let’s get started - first, we simply copy our cli.py file into the package, where we’ll treat it like any other submodule:

roll-the-dice/
|-- roll_the_dice/
|   |-- __init__.py
|   |-- cli.py
|   |-- dice.py
|-- tests/
|-- pyproject.toml
|-- README.rst

In our CLI file, we need to replace the __main__ invocation with an ordinary function: in place of

if __name__ == "__main__":
    app()

we simply need

def main():
    app()

Since this is just another submodule in our package, we can even write unit tests for it just like any other function - pytest provides a built-in capsys fixture to capture standard output and standard error, so we can easily test the CLI commands’ output, like below:

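import re

from roll_the_dice.cli import roll_num


def test_roll_num(capsys):
    # pass every parameter explicitly: outside the CLI, the typer.Option
    # defaults are placeholder objects rather than plain values
    roll_num(num_dice=1, sides=20, rolls=False)
    stdout = capsys.readouterr().out

    regex = re.compile(r"rolling (\d+D\d+)!\n\nyour roll: (\d+)")
    roll_str, total = re.search(regex, stdout).groups()
    assert roll_str == "1D20"
    assert int(total) in range(1, 21)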

(note, however, that in cases using typer.Argument or typer.Option we’ll likely need to explicitly provide values, since the Python interpreter will otherwise not correctly parse the typer fields to their underlying values.) Lastly, we need to define this function as the entrypoint in our pyproject.toml file:

[tool.poetry.scripts]
rtd = "roll_the_dice.cli:main"

(we could actually just directly reference the app object here, making main() unnecessary - but this pattern is more explicit, allows for any additional setup calls needed around the app instantiation, and minimizes objects imported into the autogenerated script.) We can access this command during development with poetry run rtd, as poetry will run any script defined in the TOML file within its included virtual environment.

At package build, poetry will automatically convert this into a standard package entrypoint, and on install that entrypoint will create a CLI command rtd in our Python environment that we can call perfectly normally. With that done, we’re ready to roll our dice, e.g., rtd roll-str 2D6 to get to our first rolling function - we have a fully functional CLI tool at our disposal!

wrapping up

Though Python doesn’t have a fantastic story for building completely standalone executables, it’s nevertheless a great language for scripting - and modern tools make it easy to package our scripts into pip-installable command line utilities, letting us validate, test, and version-control scripts. In this walkthrough, we:

  • built a quick dice-rolling tool into a Python package with poetry
  • laid out a command-line script using that package with the new typer library
  • integrated that script into a command-line tool using the package’s entrypoints

Using these next-gen tools gives us a lot of benefits, like integrated tooling & dependency resolution with poetry or clean, type-annotated CLI layout with typer, but if we wanted we could just as easily use other tools for this layout. For example, the console_scripts entrypoint is agnostic of the actual behavior of the CLI - it just needs access to an importable function that takes no arguments (from within Python - instead arguments are handled by our CLI parser of choice). If we were using argparse, for example, we’d just need to include handling for the ArgumentParser object in our main() function.
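
As a rough illustration of that last point, an argparse-based main() might look something like the sketch below. The flag names mirror our roll-num command, but build_parser and the inlined rolling logic are assumptions made so the example runs on its own with only the standard library:

```python
import argparse
import random
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="rtd", description="a roll the dice CLI")
    parser.add_argument(
        "-n", "--num-dice", type=int, default=1, help="number of dice to roll"
    )
    parser.add_argument(
        "-d", "--sides", type=int, default=20, help="number-sided dice to roll"
    )
    return parser


def main(argv: Optional[List[str]] = None) -> int:
    # argv=None makes argparse read sys.argv, which is what happens when the
    # console_scripts entrypoint calls main() with no arguments
    args = build_parser().parse_args(argv)
    rolls = sorted(
        (random.randint(1, args.sides) for _ in range(args.num_dice)), reverse=True
    )
    print(f"rolling {args.num_dice}D{args.sides}!")
    print(f"your roll: {sum(rolls)}")
    return 0


if __name__ == "__main__":
    # explicit arguments here just to demonstrate; the installed command
    # would call main() and pick up the real command line
    main(["-n", "2", "-d", "6"])
```

Accepting an optional argv list keeps main() callable with no arguments (as the entrypoint requires) while still letting tests pass in arguments directly.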

Similarly, we could build our package (including scripts) with setuptools rather than poetry - this loses us the integrated environment, package, and build control, but will in some cases be necessary (for example, building a package with bindings to compiled non-Python code). We can still gain a lot of the benefit of more modern build configuration by leaning on a setup.cfg file rather than piling configuration in the executable setup.py file. To include our script as an entrypoint for a package with setuptools, we just need to add to our setup.cfg file:

[options.entry_points]
console_scripts =
    rtd = roll_the_dice.cli:main

which will build the package’s console_scripts entrypoint identically to the tool.poetry.scripts option with poetry. Whatever tooling we choose, building pip-installable CLI tools with Python is a great way to distribute ad-hoc scripts in a tested, version-controlled way with little additional overhead.