Parsers#
Sybil parsers extract examples from source files and turn them into parsed examples with evaluators that can check whether they are as expected. A number of parsers are included, and it’s simple enough to write your own. The included parsers are as follows:
doctest#
This parser extracts classic doctest examples and evaluates them in the document’s namespace.
The parser can optionally be instantiated with
doctest option flags.
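The option flags referred to here are those defined by the standard library's doctest module. As a rough illustration of what such a flag changes, the sketch below uses only the standard library's doctest.OutputChecker, not Sybil itself, and compares the same expected and actual output with and without ELLIPSIS:

```python
import doctest

checker = doctest.OutputChecker()

# Expected output as written in the documentation, and what the code produced.
want = "[0, 1, ..., 9]\n"
got = "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\n"

# With no flags the comparison is exact, so the "..." is taken literally.
strict = checker.check_output(want, got, 0)

# With ELLIPSIS, "..." in the expected output matches any substring.
relaxed = checker.check_output(want, got, doctest.ELLIPSIS)

print(strict, relaxed)
```

Flags can be combined with the | operator before being passed on, in the same way as with the standard doctest module.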
An additional option flag, sybil.parsers.doctest.FIX_BYTE_UNICODE_REPR, is provided. When used, this flag causes byte and unicode literals in doctest expected output to be rewritten such that they are compatible with the version of Python with which the tests are executed. If your example output includes either b'...' or u'...' and your code is expected to run under both Python 2 and Python 3, then you will likely need this option.
The parser is used by instantiating sybil.parsers.doctest.DocTestParser with the required options and passing it as an element in the list passed as the parsers parameter to Sybil.
Warning
FIX_BYTE_UNICODE_REPR is quite simplistic. It will catch examples, but you may hit problems where, for example, ['b', ''] in expected output will be rewritten as ['', ''] on Python 2 and ['u', ''] as ['', ''] on Python 3. To work around this, either only run Sybil on Python 3 and do not use this option, or pick different example output.
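To see why a simple textual rewrite can misfire in this way, consider the following sketch. The pattern and the strip_prefixes() helper are hypothetical and deliberately naive; they are not Sybil's actual implementation, and unlike the real flag they strip both prefixes regardless of the Python version in use:

```python
import re

# Hypothetical, deliberately naive pattern (not Sybil's actual implementation):
# a "b" or "u" at a word boundary that is immediately followed by a quote.
NAIVE_PREFIX = re.compile(r"\b[bu](?=')")


def strip_prefixes(expected):
    """Remove anything that looks like a b'' or u'' literal prefix."""
    return NAIVE_PREFIX.sub('', expected)


# The intended case: a bytes literal loses its prefix.
intended = strip_prefixes("b'foo'")

# The false positive: the one-character string 'b' looks like a prefix
# followed by a quote, so its content is removed as well.
false_positive = strip_prefixes("['b', '']")

print(intended, false_positive)
```

Any rewrite that works on the text of the expected output, rather than on parsed values, has to guess where literals begin, which is why choosing different example output is the simplest workaround.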
codeblock#
The parsers in sybil.parsers.codeblock extract examples from Sphinx code-block directives and evaluate them in the document’s namespace.
Including all the boilerplate necessary for examples to successfully evaluate and be checked can hinder writing documentation. To help with this, these parsers also evaluate “invisible” code blocks such as this one:
.. invisible-code-block: python

  remember_me = b'see how namespaces work?'
These take advantage of Sphinx comment syntax so that the code block will not be rendered in your documentation but can be used to set up the document’s namespace or make assertions about what the evaluation of other examples has put in that namespace.
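As a simplified model of how a shared namespace lets one block set things up for another, the sketch below evaluates two pieces of source against the same dictionary. Treating the namespace as a plain dict passed to exec() is an assumption made for illustration; Sybil manages the real namespace for you:

```python
# A plain dict stands in for the document's namespace (an assumption for
# illustration; Sybil manages the real namespace for you).
namespace = {}

# Source of an invisible code block, evaluated first.
invisible_block = "remember_me = b'see how namespaces work?'"

# Source of a later, visible example that relies on the name defined above.
later_example = "decoded = remember_me.decode('ascii')"

# Each block is evaluated in document order against the same namespace.
for source in (invisible_block, later_example):
    exec(source, namespace)

print(namespace['decoded'])
```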
Python#
Python code blocks can be checked by instantiating sybil.parsers.codeblock.PythonCodeBlockParser and passing it as an element in the list passed as the parsers parameter to Sybil.
For example, this Python code block would be evaluated successfully and will define the prefix_and_print() function in the document’s namespace:
.. code-block:: python

  import sys

  def prefix_and_print(message):
      print('prefix:', message.decode('ascii'))
PythonCodeBlockParser takes an optional future_imports parameter that can be used to prefix all example Python code found by this parser with one or more from __future__ import ... statements. For example, to prefix all code block examples with from __future__ import print_function, such that they can use Python 3 style print() calls even when testing the documentation under Python 2, you would instantiate the parser as follows:
from sybil.parsers.codeblock import PythonCodeBlockParser
PythonCodeBlockParser(future_imports=['print_function'])
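To make the effect of the prefix concrete, here is a minimal sketch of the idea. The add_future_imports() helper is hypothetical and only illustrates what "prefixing" means; it is not how Sybil is necessarily implemented:

```python
def add_future_imports(source, future_imports):
    """Prefix example source with a single from __future__ import line.

    Hypothetical helper for illustration only.
    """
    if not future_imports:
        return source
    prefix = 'from __future__ import ' + ', '.join(future_imports) + '\n'
    return prefix + source


example = "print('prefix:', 'hello')"
prefixed = add_future_imports(example, ['print_function'])

# The prefixed source is still valid on Python 3, where print_function
# is already the default behaviour.
exec(compile(prefixed, '<example>', 'exec'), {})
```

A __future__ import has to be the first statement in the code being compiled, which is why it is added at the front of each example rather than evaluated once in the namespace.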
Other Languages#
Note
If your code-block examples define content rather than executable code, you may find the capture parser is more useful.
sybil.parsers.codeblock.CodeBlockParser can be used to check examples in any language you require, either by instantiating with a specified language and evaluator, or by subclassing to create your own parser.
As an example, let’s look at evaluating bash commands in a subprocess and checking the output is as expected:
.. code-block:: bash

  $ echo hi there
  hi there
We can do this using CodeBlockParser as follows:
from subprocess import check_output
from textwrap import dedent

from sybil import Sybil
from sybil.parsers.codeblock import CodeBlockParser

def evaluate_bash(example):
    # The first line is the command, prefixed with "$ "; the second is the output.
    command, expected = dedent(example.parsed).strip().split('\n')
    actual = check_output(command[2:].split()).strip().decode('ascii')
    assert actual == expected, repr(actual) + ' != ' + repr(expected)

parser = CodeBlockParser(language='bash', evaluator=evaluate_bash)
sybil = Sybil(parsers=[parser], pattern='*.rst')
Alternatively, we can create our own parser class and use it as follows:
from subprocess import check_output
from textwrap import dedent

from sybil import Sybil
from sybil.parsers.codeblock import CodeBlockParser

class BashCodeBlockParser(CodeBlockParser):

    language = 'bash'

    def evaluate(self, example):
        # The first line is the command, prefixed with "$ "; the second is the output.
        command, expected = dedent(example.parsed).strip().split('\n')
        actual = check_output(command[2:].split()).strip().decode('ascii')
        assert actual == expected, repr(actual) + ' != ' + repr(expected)

sybil = Sybil([BashCodeBlockParser()], pattern='*.rst')
capture#
This parser takes advantage of Sphinx comment syntax to introduce a special comment that takes the preceding ReST block and inserts its raw content into the document’s namespace using the name specified.
It is used by including sybil.parsers.capture.parse_captures() as an element in the list passed as the parsers parameter to Sybil.
For example:
A simple example::

  root.txt
  subdir/
  subdir/file.txt
  subdir/logs/

.. -> expected_listing
The above documentation source, when parsed by this parser and then evaluated, would mean that expected_listing could be used in other examples in the document:
>>> expected_listing.split()
['root.txt', 'subdir/', 'subdir/file.txt', 'subdir/logs/']
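The effect of a capture can be modelled with nothing more than a dictionary and a string. In this sketch the plain dict standing in for the document's namespace, and the use of dedent() to remove the block's indentation, are assumptions made for illustration rather than a description of Sybil's internals:

```python
from textwrap import dedent

# The literal block as it appears, indented, in the documentation source.
raw_block = (
    "  root.txt\n"
    "  subdir/\n"
    "  subdir/file.txt\n"
    "  subdir/logs/\n"
)

# Stand-in for the document's namespace; "expected_listing" is the name
# given after "->" in the capture comment.
namespace = {}
namespace['expected_listing'] = dedent(raw_block)

print(namespace['expected_listing'].split())
```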
It can also be used with code-block examples that define content rather than executable code, for example:
.. code-block:: json

    {
        "a key": "value",
        "b key": 42
    }

.. -> json_source
The JSON source can now be used as follows:
>>> import json
>>> json.loads(json_source)
{'a key': 'value', 'b key': 42}
Note
It’s important that the capture directive, .. -> json_source in this case, has identical indentation to the code block above it for this to work.
skip#
This parser takes advantage of Sphinx comment syntax to introduce special comments that allow other examples in the document to be skipped. This can be useful if they include pseudo code or examples that can only be evaluated on a particular version of Python.
For example:
.. skip: next
This would be wrong:
>>> 1 == 2
True
If you need to skip a collection of examples, this can be done as follows:
This is pseudo-code:
.. skip: start
>>> foo = ...
>>> foo(..)
.. skip: end
You can also add conditions to either next or start as shown below:
.. invisible-code-block: python

  import sys
This will only work on Python 3:
.. skip: next if(sys.version_info < (3, 0), reason="python 3 only")
>>> repr(b'foo')
"b'foo'"
As you can see, any names used in the expression passed to if must be present in the document’s namespace. Invisible code blocks, setup methods or fixtures are good ways to provide these.
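The requirement that names be present in the namespace can be illustrated with a small sketch. Modelling the condition as a string passed to eval() with a dict as the namespace is an assumption made for illustration, not a description of how the skip parser is implemented:

```python
import sys

# Stand-in for the document's namespace after the invisible code block
# above has imported sys.
namespace = {'sys': sys}

condition = "sys.version_info < (3, 0)"

# The condition can only be evaluated if every name it uses is available.
should_skip = bool(eval(condition, namespace))

# With an empty namespace the same expression fails with a NameError.
try:
    eval(condition, {})
    missing_name = None
except NameError as error:
    missing_name = str(error)

print(should_skip, missing_name)
```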
The parser is used by including sybil.parsers.skip.skip() as an element in the list passed as the parsers parameter to Sybil.
Developing your own parsers#
Sybil parsers are callables that take a sybil.Document and yield a sequence of regions. A Region contains the character position of the start and end of the example in the document’s text, along with a parsed version of the example and a callable evaluator. That evaluator will be called with an Example constructed from the Document and the Region and should either raise an exception or return a textual description in the event of the example not being as expected. Evaluators may also modify the document’s namespace or evaluator.
As an example, let’s look at a parser suitable for evaluating bash commands in a subprocess and checking the output is as expected:
.. code-block:: bash

  $ echo hi there
  hi there
Note
This specific case can more easily be dealt with using the code-block support for other languages.
Writing parsers quite often involves using regular expressions to extract the text for examples from the document. There’s no hard requirement for this, but if you find you need to, then find_region_sources() may be of help. Parsers are free to access any documented attribute of the Document, although they will most likely only need to work with text. The namespace attribute should not be modified.
For the above example, the parser could be implemented as follows, with the parsed version consisting of a tuple of the command to run and the expected output:
import re
import textwrap

from sybil import Region

BASHBLOCK_START = re.compile(r'^\.\.\s*code-block::\s*bash')
BASHBLOCK_END = re.compile(r'(\n\Z|\n(?=\S))')

def parse_bash_blocks(document):
    for start_match, end_match, source in document.find_region_sources(
        BASHBLOCK_START, BASHBLOCK_END
    ):
        # The first line is the command, prefixed with "$ "; the second is the output.
        command, output = textwrap.dedent(source).strip().split('\n')
        assert command.startswith('$ ')
        parsed = command[2:].split(), output
        yield Region(start_match.start(), end_match.end(),
                     parsed, evaluate_bash_block)
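When developing a parser like this, it can help to try the regular expressions on a sample string before involving Sybil at all. The sketch below does that using only the standard library. The find_sources() function is a hypothetical stand-in that assumes find_region_sources() yields the start match, the end match and the text between them; check the API documentation for the exact behaviour:

```python
import re
import textwrap

BASHBLOCK_START = re.compile(r'^\.\.\s*code-block::\s*bash')
BASHBLOCK_END = re.compile(r'(\n\Z|\n(?=\S))')


def find_sources(text, start_pattern, end_pattern):
    """Hypothetical stand-in for document.find_region_sources()."""
    for start_match in start_pattern.finditer(text):
        # Look for the end of the block from where the start match finished.
        end_match = end_pattern.search(text, start_match.end())
        source = text[start_match.end():end_match.start()]
        yield start_match, end_match, source


# A sample document that begins with the directive being parsed.
text = (
    ".. code-block:: bash\n"
    "\n"
    "  $ echo hi there\n"
    "  hi there\n"
    "\n"
    "Some following prose.\n"
)

regions = []
for start_match, end_match, source in find_sources(
    text, BASHBLOCK_START, BASHBLOCK_END
):
    command, output = textwrap.dedent(source).strip().split('\n')
    parsed = command[2:].split(), output
    regions.append((start_match.start(), end_match.end(), parsed))

print(regions)
```

The end pattern matches the first newline that is followed by a non-whitespace character, or the final newline of the text, so the region stops where the indented block does.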
Evaluators are generally much simpler than parsers and are called with an Example. Instances of this class are used to wrap up all the attributes you’re likely to need when writing an evaluator and all documented attributes are fine to use. In particular, parsed is the parsed value provided by the parser when instantiating the Region and namespace is a reference to the document’s namespace. Evaluators are free to modify the namespace if they need to.
For the above example, the evaluator could be implemented as follows:
from subprocess import check_output

def evaluate_bash_block(example):
    # parsed is the (command, expected output) tuple built by the parser.
    command, expected = example.parsed
    actual = check_output(command).strip().decode('ascii')
    assert actual == expected, repr(actual) + ' != ' + repr(expected)
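Because an evaluator only needs an object with the right attributes, it can be exercised on its own. The sketch below uses types.SimpleNamespace as a hypothetical stand-in for an Example, and runs the current Python interpreter instead of echo so that it does not depend on the platform. It also shows a variant, describe_mismatch(), that returns a textual description rather than raising; both the stand-in and the variant are assumptions made for illustration:

```python
import sys
from subprocess import check_output
from types import SimpleNamespace


def evaluate_bash_block(example):
    # parsed is the (command, expected output) tuple built by the parser.
    command, expected = example.parsed
    actual = check_output(command).strip().decode('ascii')
    assert actual == expected, repr(actual) + ' != ' + repr(expected)


def describe_mismatch(example):
    """Variant that returns a description instead of raising."""
    command, expected = example.parsed
    actual = check_output(command).strip().decode('ascii')
    if actual != expected:
        return repr(actual) + ' != ' + repr(expected)


# A command that prints a known string, run with the current interpreter.
command = [sys.executable, '-c', "print('hi there')"]

# Hypothetical stand-ins for Example: only the parsed attribute is provided.
passing = SimpleNamespace(parsed=(command, 'hi there'))
failing = SimpleNamespace(parsed=(command, 'bye'))

print(evaluate_bash_block(passing))
print(describe_mismatch(failing))
```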
The parser can now be used when instantiating a Sybil, which can then be used to integrate with your test runner:
from sybil import Sybil
sybil = Sybil(parsers=[parse_bash_blocks], pattern='*.rst')