Parsers

Sybil parsers extract examples from source files and turn them into parsed examples, each with an evaluator that checks whether the example behaves as expected. A number of parsers are included, and it is simple to write your own. The included parsers are as follows:

doctest

This parser extracts classic doctest examples and evaluates them in the document’s namespace. The parser can optionally be instantiated with doctest option flags.
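To illustrate what an option flag changes, the sketch below uses only the standard library's doctest module rather than Sybil itself. The SOURCE text and the run() helper are names made up for this illustration.

```python
import doctest

# A doctest whose expected output abbreviates the real output with "...".
SOURCE = '''
>>> list(range(20))
[0, 1, ..., 19]
'''


def run(optionflags=0):
    """Run SOURCE as a doctest and return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(SOURCE, {}, 'example', None, 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    # Discard the failure report; only the counts matter here.
    results = runner.run(test, out=lambda text: None)
    return results.failed, results.attempted


without_flag = run()                # the literal "..." does not match
with_flag = run(doctest.ELLIPSIS)   # "..." now matches any substring
```

With no flags the example fails because the output is compared literally; with doctest.ELLIPSIS the same example passes.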

An additional option flag, sybil.parsers.doctest.FIX_BYTE_UNICODE_REPR, is provided. When used, this flag causes byte and unicode literals in doctest expected output to be rewritten such that they are compatible with the version of Python with which the tests are executed. If your example output includes either b'...' or u'...' and your code is expected to run under both Python 2 and Python 3, then you will likely need this option.

The parser is used by instantiating sybil.parsers.doctest.DocTestParser with the required options and passing it as an element in the list passed as the parsers parameter to Sybil.

Warning

FIX_BYTE_UNICODE_REPR is quite simplistic. It will catch most examples, but you may hit problems: for example, ['b', ''] in expected output will be rewritten as ['', ''] on Python 2, and ['u', ''] as ['', ''] on Python 3. To work around this, either run Sybil only on Python 3 without this option, or pick different example output.
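The kind of pitfall described in this warning can be reproduced with a deliberately naive rewrite. The pattern below is a hypothetical illustration and is not claimed to be Sybil's actual implementation; it shows how a prefix-stripping regular expression can also hit a one-character string that happens to be 'b'.

```python
import re

# Hypothetical pattern, NOT taken from Sybil: drop a b/B prefix that follows
# a non-word character (or the start of the text) and precedes a quote.
NAIVE_BYTES_PREFIX = re.compile(r"(\W|^)[bB]([rR]?['\"])")


def strip_bytes_prefix(expected):
    """Remove what looks like a bytes-literal prefix from expected output."""
    return NAIVE_BYTES_PREFIX.sub(r'\1\2', expected)


intended = strip_bytes_prefix("b'foo'")        # the case the rewrite is for
unintended = strip_bytes_prefix("['b', '']")   # the string 'b' is hit too
```

Here intended becomes 'foo' in quotes, as hoped, while the list loses its 'b' element's content, which matches the behaviour the warning describes.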

codeblock

The parsers in sybil.parsers.codeblock extract examples from Sphinx code-block directives and evaluate them in the document’s namespace.

Including all the boilerplate necessary for examples to successfully evaluate and be checked can hinder writing documentation. To help with this, these parsers also evaluate “invisible” code blocks such as this one:


.. invisible-code-block: python

  remember_me = b'see how namespaces work?'

These take advantage of Sphinx comment syntax so that the code block will not be rendered in your documentation but can be used to set up the document’s namespace or make assertions about what the evaluation of other examples has put in that namespace.
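The shared-namespace idea can be sketched with plain exec() calls against one dictionary. This illustrates the concept only; it is not a description of how Sybil evaluates blocks internally.

```python
# One dictionary stands in for the document's namespace.
namespace = {}

# An "invisible" block runs first and leaves a name behind...
exec("remember_me = b'see how namespaces work?'", namespace)

# ...which a later example in the same document can then rely on.
exec("decoded = remember_me.decode('ascii')", namespace)
```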

Python

Python code blocks can be checked by instantiating sybil.parsers.codeblock.PythonCodeBlockParser and passing it as an element in the list passed as the parsers parameter to Sybil.

For example, this Python code block would be evaluated successfully and will define the prefix_and_print() function in the document’s namespace:


.. code-block:: python

  import sys

  def prefix_and_print(message):
      print('prefix:', message.decode('ascii'))

PythonCodeBlockParser takes an optional future_imports parameter that can be used to prefix all example Python code found by this parser with one or more from __future__ import ... statements. For example, to prefix all code block examples with from __future__ import print_function, so that they can use Python 3 style print() calls even when the documentation is tested under Python 2, you would instantiate the parser as follows:

from sybil.parsers.codeblock import PythonCodeBlockParser

PythonCodeBlockParser(future_imports=['print_function'])
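The effect of such a prefix can be sketched without Sybil. The prefix_future_imports() helper below is hypothetical and only shows what "prefixing example code" amounts to; how the parser does this internally is not covered here.

```python
def prefix_future_imports(source, future_imports):
    """Return source with a single `from __future__ import ...` line prepended."""
    if not future_imports:
        return source
    prefix = 'from __future__ import {}\n'.format(', '.join(future_imports))
    return prefix + source


example_source = "print('prefix:', 'message')\n"
prefixed = prefix_future_imports(example_source, ['print_function'])

# The prefixed source compiles and runs like any other module-level code.
exec(compile(prefixed, '<example>', 'exec'), {})
```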

Other Languages

Note

If your code-block examples define content rather than executable code, you may find the capture parser more useful.

sybil.parsers.codeblock.CodeBlockParser can be used to check examples in any language you require, either by instantiating with a specified language and evaluator, or by subclassing to create your own parser.

As an example, let’s look at evaluating bash commands in a subprocess and checking the output is as expected:

.. code-block:: bash

   $ echo hi there
   hi there

We can do this using CodeBlockParser as follows:

from subprocess import check_output
from textwrap import dedent

from sybil import Sybil
from sybil.parsers.codeblock import CodeBlockParser

def evaluate_bash(example):
    command, expected = dedent(example.parsed).strip().split('\n')
    actual = check_output(command[2:].split()).strip().decode('ascii')
    assert actual == expected, repr(actual) + ' != ' + repr(expected)

parser = CodeBlockParser(language='bash', evaluator=evaluate_bash)
sybil = Sybil(parsers=[parser], pattern='*.rst')

Alternatively, we can create our own parser class and use it as follows:

from subprocess import check_output
from textwrap import dedent

from sybil import Sybil
from sybil.parsers.codeblock import CodeBlockParser

class BashCodeBlockParser(CodeBlockParser):

    language = 'bash'

    def evaluate(self, example):
        command, expected = dedent(example.parsed).strip().split('\n')
        actual = check_output(command[2:].split()).strip().decode('ascii')
        assert actual == expected, repr(actual) + ' != ' + repr(expected)

sybil = Sybil([BashCodeBlockParser()], pattern='*.rst')
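A possible variation on the evaluator, offered as a sketch rather than as part of Sybil: it uses shlex.split so that quoted arguments stay together, and str.partition so that the expected output may span several lines. The SimpleNamespace object is a stand-in for the Example that Sybil would pass, and the sketch assumes a POSIX system where echo is available as a program.

```python
import shlex
from subprocess import check_output
from textwrap import dedent
from types import SimpleNamespace


def evaluate_bash_quoted(example):
    # The first line is the "$ command"; everything after it is expected output.
    command, _, expected = dedent(example.parsed).strip().partition('\n')
    assert command.startswith('$ '), 'expected a "$ " prompt: ' + repr(command)
    # shlex.split keeps quoted arguments together, unlike str.split.
    actual = check_output(shlex.split(command[2:])).strip().decode('ascii')
    assert actual == expected, repr(actual) + ' != ' + repr(expected)


# Stand-in for a Sybil Example; only the .parsed attribute is used here.
fake_example = SimpleNamespace(parsed='''
   $ echo "hi   there"
   hi   there
''')
evaluate_bash_quoted(fake_example)  # passes silently
```

Calling the evaluator directly like this is a quick way to check its logic before wiring it into a parser.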

capture

This parser takes advantage of Sphinx comment syntax to introduce a special comment that takes the preceding ReST block and inserts its raw content into the document’s namespace using the name specified.

It is used by including sybil.parsers.capture.parse_captures() as an element in the list passed as the parsers parameter to Sybil.

For example:

A simple example::

  root.txt
  subdir/
  subdir/file.txt
  subdir/logs/

.. -> expected_listing

The above documentation source, when parsed by this parser and then evaluated, would mean that expected_listing could be used in other examples in the document:

>>> expected_listing.split()
['root.txt', 'subdir/', 'subdir/file.txt', 'subdir/logs/']
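Conceptually, a capture takes the indented block before the comment and stores its text under the given name. The sketch below mimics that with a hypothetical regular expression and helper; it handles only an unindented directive and is not Sybil's implementation.

```python
import re
from textwrap import dedent

# Hypothetical pattern for an unindented ".. -> name" comment.
CAPTURE = re.compile(r'^\.\. -> (?P<name>\w+)$', re.MULTILINE)


def capture_blocks(text, namespace):
    """Store the indented block preceding each capture comment in namespace."""
    for match in CAPTURE.finditer(text):
        lines = text[:match.start()].rstrip().split('\n')
        block = []
        # Walk backwards while lines are indented or blank.
        for line in reversed(lines):
            if line.strip() and not line.startswith(' '):
                break
            block.append(line)
        content = dedent('\n'.join(reversed(block))).strip() + '\n'
        namespace[match.group('name')] = content


DOC = (
    'A simple example::\n'
    '\n'
    '  root.txt\n'
    '  subdir/\n'
    '\n'
    '.. -> expected_listing\n'
)
namespace = {}
capture_blocks(DOC, namespace)
```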

It can also be used with code-block examples that define content rather than executable code, for example:

.. code-block:: json

    {
        "a key": "value",
        "b key": 42
    }

.. -> json_source

The JSON source can now be used as follows:

>>> import json
>>> json.loads(json_source)
{'a key': 'value', 'b key': 42}

Note

It’s important that the capture directive, .. -> json_source in this case, has identical indentation to the code block above it for this to work.

skip

This parser takes advantage of Sphinx comment syntax to introduce special comments that allow other examples in the document to be skipped. This can be useful if they include pseudo code or examples that can only be evaluated on a particular version of Python.

For example:

.. skip: next

This would be wrong:

>>> 1 == 2
True

If you need to skip a collection of examples, this can be done as follows:

This is pseudo-code:

.. skip: start

>>> foo = ...
>>> foo(..)

.. skip: end

You can also add conditions to either next or start as shown below:

.. invisible-code-block: python

  import sys

This will only work on Python 3:

.. skip: next if(sys.version_info < (3, 0), reason="python 3 only")

>>> repr(b'foo')
"b'foo'"

As you can see, any names used in the expression passed to if must be present in the document’s namespace. Invisible code blocks, setup methods or fixtures are good ways to provide these.
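Why the names must already be in the namespace can be seen from a small sketch. The condition_holds() helper is hypothetical and simply evaluates the expression against a dictionary; Sybil's own handling may differ in detail.

```python
import sys


def condition_holds(expression, namespace):
    """Evaluate a skip condition against a namespace (illustrative helper)."""
    # Copy the namespace so eval() cannot add names to the caller's dict.
    return bool(eval(expression, dict(namespace)))


# With sys available, the condition evaluates normally.
with_sys = condition_holds('sys.version_info < (3, 0)', {'sys': sys})

# Without it, the expression cannot be evaluated at all.
try:
    condition_holds('sys.version_info < (3, 0)', {})
    outcome = 'evaluated'
except NameError:
    outcome = 'NameError'
```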

The parser is used by including sybil.parsers.skip.skip() as an element in the list passed as the parsers parameter to Sybil.

Developing your own parsers

Sybil parsers are callables that take a sybil.Document and yield a sequence of regions. A Region contains the character position of the start and end of the example in the document’s text, along with a parsed version of the example and a callable evaluator. That evaluator will be called with an Example constructed from the Document and the Region and should either raise an exception or return a textual description in the event of the example not being as expected. Evaluators may also modify the document’s namespace or evaluator.

As an example, let’s look at a parser suitable for evaluating bash commands in a subprocess and checking the output is as expected:

.. code-block:: bash

   $ echo hi there
   hi there

Note

This specific case can more easily be dealt with using the code-block support for other languages.

Writing parsers quite often involves using regular expressions to extract the text for examples from the document. There’s no hard requirement for this, but if you find you need to, then find_region_sources() may be of help. Parsers are free to access any documented attribute of the Document, although they will most likely only need to work with text. The namespace attribute should not be modified.

For the above example, the parser could be implemented as follows, with the parsed version consisting of a tuple of the command to run and the expected output:

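import re
import textwrap

from sybil import Region

# re.MULTILINE lets ^ match at the start of any line, not only at the
# very start of the document text.
BASHBLOCK_START = re.compile(r'^\.\.\s*code-block::\s*bash', re.MULTILINE)
BASHBLOCK_END = re.compile(r'(\n\Z|\n(?=\S))')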

def parse_bash_blocks(document):
    for start_match, end_match, source in document.find_region_sources(
        BASHBLOCK_START, BASHBLOCK_END
    ):
        command, output = textwrap.dedent(source).strip().split('\n')
        assert command.startswith('$ ')
        parsed = command[2:].split(), output
        yield Region(start_match.start(), end_match.end(),
                     parsed, evaluate_bash_block)
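A parser like this can be tried out in isolation. The sketch below replaces sybil.Region and sybil.Document with minimal stand-ins so that it runs without Sybil installed; FakeDocument.find_region_sources() is an assumption about the real method's behaviour, not a copy of it. The start pattern here is compiled with re.MULTILINE so that ^ matches at the start of any line.

```python
import re
import textwrap
from collections import namedtuple

# Stand-in for sybil.Region, so the sketch has no third-party dependency.
Region = namedtuple('Region', 'start end parsed evaluator')


class FakeDocument:
    """Stand-in for sybil.Document exposing only what the parser uses."""

    def __init__(self, text):
        self.text = text

    def find_region_sources(self, start_pattern, end_pattern):
        # Assumed behaviour: pair each start match with the next end match
        # and yield both matches plus the text between them.
        for start_match in start_pattern.finditer(self.text):
            end_match = end_pattern.search(self.text, start_match.end())
            source = self.text[start_match.end():end_match.start()]
            yield start_match, end_match, source


BASHBLOCK_START = re.compile(r'^\.\.\s*code-block::\s*bash', re.MULTILINE)
BASHBLOCK_END = re.compile(r'(\n\Z|\n(?=\S))')


def parse_bash_blocks(document):
    for start_match, end_match, source in document.find_region_sources(
        BASHBLOCK_START, BASHBLOCK_END
    ):
        command, output = textwrap.dedent(source).strip().split('\n')
        assert command.startswith('$ ')
        parsed = command[2:].split(), output
        # No evaluator is attached; only the parsing step is exercised.
        yield Region(start_match.start(), end_match.end(), parsed, None)


TEXT = (
    'Some prose.\n'
    '\n'
    '.. code-block:: bash\n'
    '\n'
    '   $ echo hi there\n'
    '   hi there\n'
    '\n'
    'More prose.\n'
)
regions = list(parse_bash_blocks(FakeDocument(TEXT)))
```

The single region found covers the directive and its indented body, and its parsed value is the command list paired with the expected output.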

Evaluators are generally much simpler than parsers and are called with an Example. Instances of this class are used to wrap up all the attributes you’re likely to need when writing an evaluator and all documented attributes are fine to use. In particular, parsed is the parsed value provided by the parser when instantiating the Region and namespace is a reference to the document’s namespace. Evaluators are free to modify the namespace if they need to.

For the above example, the evaluator could be implemented as follows:

from subprocess import check_output

def evaluate_bash_block(example):
    command, expected = example.parsed
    actual = check_output(command).strip().decode('ascii')
    assert actual == expected, repr(actual) + ' != ' + repr(expected)
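As described earlier, an evaluator may return a textual description instead of raising, and it may modify the namespace. The variation below sketches both; the function name and the last_output key are made up for illustration, SimpleNamespace stands in for a real Example, and a POSIX system with an echo program is assumed.

```python
from subprocess import check_output
from types import SimpleNamespace


def evaluate_bash_block_reporting(example):
    command, expected = example.parsed
    actual = check_output(command).strip().decode('ascii')
    if actual != expected:
        # Returning a string reports the failure without raising.
        return repr(actual) + ' != ' + repr(expected)
    # Hypothetical extra: remember the output for later examples.
    example.namespace['last_output'] = actual


passing = SimpleNamespace(parsed=(['echo', 'hi', 'there'], 'hi there'),
                          namespace={})
failing = SimpleNamespace(parsed=(['echo', 'hi', 'there'], 'bye'),
                          namespace={})

ok_result = evaluate_bash_block_reporting(passing)
bad_result = evaluate_bash_block_reporting(failing)
```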

The parser can now be used when instantiating a Sybil, which can then be used to integrate with your test runner:

from sybil import Sybil

sybil = Sybil(parsers=[parse_bash_blocks], pattern='*.rst')