Functional Options in Python

Rob Pike has made a habit of pushing through the boundaries and barriers of human-created software. He sees through the artificial foundations of our field and recognizes that they were built not by gods, but by people. And by daring to challenge the giants our modern world was built upon, he has grown into one himself. This design pattern is far from his most impressive work (compare it with Go or his contributions to Unix), but it's the process through which the pattern came to exist that piqued my interest.

So what are functional options? Well, here's a sneak peek that looks very awkward without more context! By the end of this post, you should understand what functional options help us achieve and how we can implement them in Python.

# Python normal args
my_func(arg1, arg2, named_arg1=val1, named_arg2=val2)


// Go functional options
MyFunc(arg1, arg2, WithNamedArg1(val1), WithNamedArg2(val2))

Usage of functional options in Golang

My first reaction to functional options was: why even bother with such complexity when there are simpler ways of setting options, such as several arguments, option structs, or variadic constructors?

Going through these common approaches one by one makes their downsides apparent. I think most of us wouldn't think twice about building an API with any of them despite these downsides, which makes it all the more impressive that someone spotted the need for a better way and came up with one.

As an example for the following sections, we'll take a function called 'get_text()' which is supposed to extract text from a file-like object using OCR/document parsing. You can imagine the file is a PDF or an email or any other document with text.

Many Arguments

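from io import BufferedReader
from typing import Optional


def get_text(
    self,  # get_text is a method on the API client
    file: BufferedReader,
    with_ocr: bool = False,
    timeout: Optional[int] = None,  # None means no timeout
    with_bounding_boxes: bool = False,
    as_plain_text: bool = True,
    skip_ocr: bool = True,
    inline_ocr: bool = False,
    ocr_with_text: bool = False,
):
    ...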

This is the most straightforward way of doing things: many arguments. With type hints, you can offer additional information, but it's still unclear what any of these arguments do in practice.

I'll also have to write a lengthy docstring that's hard to read in order to document these arguments.

This obviously grows out of control very quickly the more arguments we want to add, and is hard to change without breaking the API for users.
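To make the problem concrete, here is a small sketch of what can go wrong at the call site. The stub below only echoes the flags it received (the real body isn't shown in this post), it uses a trimmed-down argument list, and the idea that `with_ocr` and `skip_ocr` are meant to be mutually exclusive is my assumption based on their names.

```python
from io import BytesIO
from typing import BinaryIO, Optional


def get_text(
    file: BinaryIO,
    with_ocr: bool = False,
    timeout: Optional[int] = None,
    skip_ocr: bool = True,
) -> dict:
    """Stub: return the flags as received instead of parsing the file."""
    return {"with_ocr": with_ocr, "timeout": timeout, "skip_ocr": skip_ocr}


# The caller asks for OCR but forgets that skip_ocr defaults to True.
flags = get_text(BytesIO(b"%PDF-"), with_ocr=True)
print(flags)
```

Nothing in the signature stops this contradictory combination, so the function body has to check for it, and every new flag adds more combinations to check.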

Option Structs

A natural next step is option or config structs. In Python, this can be achieved with a dataclass. With this approach, the dataclass can grow over time as new options are added without breaking the API.

We can now pass GetTextOptions instances around, sharing options across function calls if needed.

This approach can lead to better documentation in languages like Go, where each field can be documented separately. In Python, though, we're still bound to one long docstring in the class definition (attribute-level docstrings are unfortunately still not well supported in Python beyond generated docs).

It is also still hard to validate or clean each of those attributes individually. Python doesn't enforce type hints at runtime, which makes validation even more necessary, and sometimes we also want more complex checks (for example, a float that must be greater than 5.3).

from dataclasses import dataclass
from io import BufferedReader


@dataclass
class GetTextOptions:
    """
    Options for the `get_text` method.
    Attributes:
        file: The file to extract text from.
        timeout: The timeout for the request.
        with_ocr: Whether to include OCR results in the response.
        with_bounding_boxes: Whether to include bounding boxes in the response.
        as_plain_text: Whether to return the text as plain text or as a list of lines.
        skip_ocr: Whether to skip OCR.
        inline_ocr: Whether to perform OCR on every image in the document (much slower than whole page ocr which is the default).
        ocr_with_text: Whether to perform OCR with text.
    """

    file: BufferedReader
    timeout: int
    with_ocr: bool = False
    with_bounding_boxes: bool = False
    as_plain_text: bool = True
    skip_ocr: bool = True
    inline_ocr: bool = False
    ocr_with_text: bool = False
   


def get_text(options: GetTextOptions):
    ...
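For comparison, here is a rough sketch of where validation tends to end up with the dataclass approach: a single `__post_init__` method that holds the rules for every attribute. This is a trimmed-down version of the class above, and the rule that `inline_ocr` requires `with_ocr` is a hypothetical example, not something the real API is known to enforce.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class GetTextOptions:
    timeout: Optional[int] = None
    with_ocr: bool = False
    inline_ocr: bool = False

    def __post_init__(self) -> None:
        # Every rule for every attribute ends up in this one method.
        if self.timeout is not None and self.timeout < 0:
            raise ValueError("timeout must be a non-negative integer")
        # Hypothetical cross-field rule, for illustration only.
        if self.inline_ocr and not self.with_ocr:
            raise ValueError("inline_ocr requires with_ocr")


# Sharing options across calls: build once, then copy with a tweak.
base = GetTextOptions(timeout=30, with_ocr=True)
slow_but_thorough = replace(base, inline_ocr=True)
```

`dataclasses.replace` creates a new instance through `__init__`, so the checks in `__post_init__` run again on the copy.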

Functional Options

Now that we've seen a few alternatives, let's look at what functional options look like before diving deeper.

from config import GetTextOptionsBuilder as opt

get_text(file, opt.WithOCR(), opt.InlineOCR())

From a usage perspective, we now have access to autocomplete with detailed documentation, since each option is now a function with its own docstring.

VSCode autocomplete with functional options

Now it's a bit awkward to implement, but using it is pretty straightforward. Let's go over the first part of the implementation: How are options applied?

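def get_text(
    self,
    file: BufferedReader,
    *opts: Callable[..., GetTextOptions],  # annotate the element type, not a List
) -> bytes:
    """Get text from file-like object."""
    api_headers = {}

    # Defaults go first, so options passed by the caller are applied after them
    defaults = [opt.OnlyOCR()]
    defaults.extend(opts)

    # Apply options
    for option in defaults:  # not `opt`: that name would shadow the imported builder
        api_headers = option(api_headers)  # This populates the dictionary

    # Do stuff with api
    ...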
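One consequence of applying the defaults first is worth spelling out: an option supplied by the caller that writes the same key overrides the default. The sketch below isolates just that loop. `with_header` and `build_headers` are hypothetical helpers made up for this illustration, not part of the real client.

```python
from typing import Callable, Dict

Headers = Dict[str, str]
Option = Callable[[Headers], Headers]


def with_header(key: str, value: str) -> Option:
    """Hypothetical generic option: set one header to a fixed value."""

    def _apply(headers: Headers) -> Headers:
        headers[key] = value
        return headers

    return _apply


def build_headers(*opts: Option) -> Headers:
    headers: Headers = {}
    defaults = [with_header("ocr_only", "true"), with_header("timeout", "30")]
    # Defaults run first, so a caller's option for the same key wins.
    for option in [*defaults, *opts]:
        headers = option(headers)
    return headers


print(build_headers())                            # {'ocr_only': 'true', 'timeout': '30'}
print(build_headers(with_header("timeout", "5")))  # {'ocr_only': 'true', 'timeout': '5'}
```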

Each option is a callable, invoked in turn as we loop over them. So where does the 'functional' part of 'functional options' come from? From how these callable options are defined:

from typing import Callable, Dict

# The options end up in a plain dict of API headers (api_headers in get_text)
GetTextOptions = Dict[str, str]


class GetTextOptionsBuilder:
    @staticmethod
    def OnlyOCR() -> Callable[..., GetTextOptions]:
        """Returns only the OCR text, no other metadata."""

        def _only_ocr(opts: GetTextOptions) -> GetTextOptions:
            opts["ocr_only"] = "true"
            return opts

        return _only_ocr

    @staticmethod
    def WithTimeout(timeout: int) -> Callable[..., GetTextOptions]:
        """Process document with a timeout in seconds."""
        # Validate timeout when the option is created, not when it is applied
        try:
            t = int(timeout)
        except (TypeError, ValueError):  # int(None) raises TypeError
            raise ValueError("Timeout must be a non-negative integer") from None

        if t < 0:
            raise ValueError("Timeout must be a non-negative integer")

        def _with_timeout(opts: GetTextOptions) -> GetTextOptions:
            opts["timeout"] = str(t)
            return opts

        return _with_timeout

    ...

Notice how several improvements are achieved with this pattern:

  • Documentation is now clearer and provided per attribute
  • Validation of each option value is now possible and simple
  • Extension is easy - just add more methods for each new option
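To see these points working together, here is a minimal, self-contained sketch that can be run as-is. It mirrors the two options shown above, but `Opt`, `get_text_headers`, and `FAST_OCR` are hypothetical names for this illustration, and the function only returns the headers it would send rather than calling a real API.

```python
from typing import Callable, Dict

Headers = Dict[str, str]
Option = Callable[[Headers], Headers]


class Opt:
    """Hypothetical option builder mirroring the methods shown above."""

    @staticmethod
    def OnlyOCR() -> Option:
        """Return only the OCR text, no other metadata."""

        def _only_ocr(headers: Headers) -> Headers:
            headers["ocr_only"] = "true"
            return headers

        return _only_ocr

    @staticmethod
    def WithTimeout(timeout: int) -> Option:
        """Process the document with a timeout in seconds."""
        if not isinstance(timeout, int) or timeout < 0:
            # Raised when the option is created, before any request is built.
            raise ValueError("timeout must be a non-negative integer")

        def _with_timeout(headers: Headers) -> Headers:
            headers["timeout"] = str(timeout)
            return headers

        return _with_timeout


def get_text_headers(*opts: Option) -> Headers:
    """Stand-in for get_text: return the headers it would send."""
    headers: Headers = {}
    for option in opts:
        headers = option(headers)
    return headers


# Options are plain values, so a reusable preset is just a tuple to unpack.
FAST_OCR = (Opt.OnlyOCR(), Opt.WithTimeout(10))
headers = get_text_headers(*FAST_OCR)
print(headers)  # {'ocr_only': 'true', 'timeout': '10'}
```

Because validation happens inside `WithTimeout` itself, a bad value such as `Opt.WithTimeout(-1)` fails at the line where the option is written, which tends to make the error easier to trace than a failure deep inside the request code.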

It's not every day that we see interesting new design patterns in software, and seeing one for something as mundane as configuring options is all the more impressive. I can really see this being used in other languages and not just in Go, although I had never seen a Python implementation before mine.

I hope this shows the benefits of functional options clearly and inspires more Python implementations! I've implemented a working API client using this pattern and using it and extending it have been as pleasant as expected.

I'll end with a reference to Rob Pike's original blogpost detailing the design of functional options (or self-referential functions for option setting).

Self-referential functions and the design of options

Rami Awar

Delft, Netherlands