astdocs: Extract and format Markdown documentation from Python code.

Module astdocs

Extract and format Markdown documentation from Python code.

According to my standards.

In a few more words, parse the underlying Abstract Syntax Tree (AST) description. (See the documentation of the standard library module of the same name.) It expects relatively clean input (demonstrated in this very script), which forces me to keep my code somewhat correctly documented and free of fancy syntax.

My only requirement was to use the Python standard library exclusively (even for the templating) as it is quite [overly] complete these days, and to keep it as lean as possible. Support for corner cases is scarce...

The simplest way to check the output of this script is to run it on itself:

$ python astdocs.py astdocs.py  # pipe it to your favourite markdown linter

or even:

$ python astdocs.py .  # recursively look for *.py files in the current directory

The behaviour of this little stunt can be modified via environment variables:

  • ASTDOCS_BOUND_OBJECTS taking the 1, on, true or yes values (anything else will be ignored/counted as negative) to add %%%START ... and %%%END ... markers to indicate the beginning/end of an object (useful for further styling when rendering in HTML for example). Not to be mixed up with the %%%BEGIN markers (see below).
  • ASTDOCS_FOLD_ARGS_AFTER to fold long object (function/method) definitions (many parameters). Defaults to 88 characters, the Black-recommended default.
  • ASTDOCS_SHOW_PRIVATE taking the 1, on, true or yes values (anything else will be ignored) to show Python private objects (which names start with an underscore).
  • ASTDOCS_SPLIT_BY taking the m, mc, mfc or an empty value (default, all rendered content in one output): split each module, function and/or class (by adding %%%BEGIN ... markers). Classes will always keep their methods. In case mfc is provided, the module will only keep its docstring, and each function/class/method will be marked.
  • ASTDOCS_WITH_LINENOS taking the 1, on, true or yes values (anything else will be ignored) to show the line numbers of the object in the code source (to be processed later on by your favourite Markdown renderer). Look for the %%%SOURCE ... markers.

$ ASTDOCS_WITH_LINENOS=on python astdocs.py astdocs.py

or to split marked sections into separate files:

$ ASTDOCS_SPLIT_BY=mc python astdocs.py module.py | csplit -qz - '/^%%%BEGIN/' '{*}'
$ sed '1d' xx00 > module.md
$ rm xx00
$ for f in xx??; do
>   path=$(grep -m1 '^%%%BEGIN' $f | sed -r 's|%%%.* (.*)|\1|g;s|\.|/|g')
>   mkdir -p $(dirname $path)
>   sed '1d' $f > "$path.md"  # double quotes are needed
>   rm $f
> done

(See also the Python example in the docstring of the astdocs.render_recursively() function.)

Each of these environment variables translates into a configuration option stored in the config dictionary of the present module. The key name is lowercased and stripped of the ASTDOCS_ prefix.
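
For instance, instead of exporting environment variables one can copy and tweak this dictionary before handing it to the rendering functions (a small sketch, mirroring the postrender example near the end of this document):

import astdocs

config = astdocs.config.copy()
config.update({"with_linenos": True, "fold_args_after": 100})

md = astdocs.render(filepath="astdocs.py", config=config)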

When handling rendering programmatically, one can use the helper [private] functions if necessary. See the code and/or tests for details.

All encountered objects are stored as they are parsed. The content of the corresponding attribute can be used by external scripts to generate a dependency graph, or simply a Table of Contents:

import astdocs

def toc(objects: dict[str, dict[str, dict[str, str]]]) -> str:
    md = ""

    for m in objects:  # each module
        anchor = m.replace(".", "")  # github
        md += f"\n- [`{m}`](#module-{anchor})"
        for t in ["functions", "classes"]:  # relevant object types
            for o in objects[m][t]:
                anchor = (m + o).replace(".", "")  # github
                md += f"\n    - [`{m}.{o}`](#{anchor})"

    return md

md = astdocs.render_recursively(".")
toc = toc(astdocs.objects)

print(f"{toc}\n\n{md}")

Attributes

  • TPL [string.Template]: Template to render the overall page (only governs order of objects in the output).
  • TPL_CLASSDEF [string.Template]: Template to render class objects.
  • TPL_FUNCTIONDEF [string.Template]: Template to render def objects (async or not).
  • TPL_MODULE [string.Template]: Template to render the module summary.
  • objects [dict[str, typing.Any]]: Nested dictionary of all relevant objects encountered while parsing the source code.

Functions

astdocs.format_docstring

format_docstring(
    node: ast.AsyncFunctionDef | ast.ClassDef | ast.FunctionDef | ast.Module,
) -> str:

Format the object docstring.

Expect some stiff NumPy-ish formatting (see the numpydoc example at https://numpydoc.readthedocs.io/en/latest/example.html or the napoleon example at https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_numpy.html). Do try to type all your input parameters/returned objects. And use a linter on the output?

Parameters

  • node [ast.AsyncFunctionDef | ast.ClassDef | ast.FunctionDef | ast.Module]: Source node to extract/parse docstring from.

Returns

  • [str]: The formatted docstring.

Example

Below is the raw docstring example of what this very function expects as input (very inceptional):


Parameters
----------
node : ast.AsyncFunctionDef | ast.ClassDef | ast.FunctionDef | ast.Module
    Source node to extract/parse docstring from.

Returns
-------
: str
    The formatted docstring.

The code blocks are extracted and replaced by placeholders before performing the substitutions (then rolled back in). The regular expressions are then applied:

  • Leading hashtags (#) are removed from any lines starting with them as we do not want to conflict with the Markdown output.
  • Any series of words followed by a line with 3 or more hyphens is assumed to be a section marker (such as Parameters, Returns, Example, etc.).
  • Lines with parameter : type (: type optional) followed by a description, itself preceded by four spaces are formatted as input parameters.
  • Lines with : type (providing a type is here mandatory) followed by a description, itself preceded by four spaces are formatted as returned values.

Keep in mind that returning the full path to a returned object is always preferable. And indeed some of it could be inferred from the function call itself, or the return statement. BUT this whole thing is to force myself to structure my docstrings correctly.
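
To illustrate these rules on a tiny, self-contained case (a sketch; the expected output in the comments is inferred from the regular expressions listed in the source below and may differ in whitespace details):

import ast
import astdocs

code = '''
def f(x):
    """Do something.

    Parameters
    ----------
    x : int
        Some value.

    Returns
    -------
    : int
        The same value.
    """
    return x
'''

node = ast.parse(code).body[0]
print(astdocs.format_docstring(node))
# Do something.
#
# **Parameters**
#
# * `x` [`int`]: Some value.
#
# **Returns**
#
# * [`int`]: The same value.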

Notes

If the regular expression solution used here (which works for my needs) does not fulfill your standards, it is pretty easy to clobber it:

import ast
import astdocs

def my_docstring_parser(docstring: str) -> str:
    # process docstring
    return docstring

def format_docstring(
    node: ast.AsyncFunctionDef | ast.ClassDef | ast.FunctionDef | ast.Module,
) -> str:  # simple wrapper function
    return my_docstring_parser(ast.get_docstring(node) or "")

astdocs.format_docstring = format_docstring

print(astdocs.render(...))

Known problem

Overall naive, stiff and very opinionated (again, for my use).

source
def format_docstring(
    node: ast.AsyncFunctionDef | ast.ClassDef | ast.FunctionDef | ast.Module,
) -> str:
    r"""Format the object docstring.

    Expect some stiff `NumPy`-ish formatting (see
    [this](https://numpydoc.readthedocs.io/en/latest/example.html#example) or
    [that](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_numpy.html)).
    Do try to **type** all your input parameters/returned objects. And use a linter on
    the output?

    Parameters
    ----------
    node : ast.AsyncFunctionDef | ast.ClassDef | ast.FunctionDef | ast.Module
        Source node to extract/parse docstring from.

    Returns
    -------
    : str
        The formatted docstring.

    Example
    -------
    Below the raw docstring example of what this very function is expecting as an input
    (very inceptional):

    ```text

    Parameters
    ----------
    node : ast.AsyncFunctionDef | ast.ClassDef | ast.FunctionDef | ast.Module
        Source node to extract/parse docstring from.

    Returns
    -------
    : str
        The formatted docstring.
    ```

    The code blocks are extracted and replaced by placeholders before performing the
    substitutions (then rolled back in). The regular expressions are then applied:

    * Leading hashtags (`#`) are removed from any lines starting with them as we do not
      want to conflict with the `Markdown` output.
    * Any series of words followed by a line with 3 or more hyphens is assumed to be a
      section marker (such as `Parameters`, `Returns`, `Example`, *etc.*).
    * Lines with `parameter : type` (`: type` optional) followed by a description,
      itself preceded by four spaces are formatted as input parameters.
    * Lines with `: type` (providing a type is here *mandatory*) followed by a
      description, itself preceded by four spaces are formatted as returned values.

    Keep in mind that returning **the full path** to a returned object is always
    preferable. And indeed **some of it could be inferred** from the function call
    itself, or the `return` statement. BUT this whole thing is to force *myself* to
    structure *my* docstrings correctly.

    Notes
    -----
    If the regular expression solution used here (which works for *my* needs) does not
    fulfill your standards, it is pretty easy to clobber it:

    ```python
    import ast
    import astdocs

    def my_docstring_parser(docstring: str) -> str:
        # process docstring
        return docstring

    def format_docstring(
        node: ast.AsyncFunctionDef | ast.ClassDef | ast.FunctionDef | ast.Module,
    ) -> str:  # simple wrapper function
        return my_docstring_parser(ast.get_docstring(node) or "")

    astdocs.format_docstring = format_docstring

    print(astdocs.render(...))
    ```

    Known problem
    -------------
    Overall naive, stiff and *very* opinionated (again, for *my* use).

    """
    s = ast.get_docstring(node) or ""

    # extract code blocks, replace them by a placeholder
    blocks = []
    patterns = [f"([`]{{{i}}}.*?[`]{{{i}}})" for i in range(7, 2, -1)]
    i = 0
    for p in patterns:
        for m in re.finditer(p, s, flags=re.DOTALL):
            blocks.append(m.group(1))
            s = s.replace(m.group(1), f"%%%BLOCK{i}", 1)
            i += 1

    # remove trailing spaces
    s = re.sub(r" {1,}\n", r"\n", s)

    # rework any word preceded by one or more hashtag
    s = re.sub(r"\n#+\s*(.*)", r"\n**\1**", s)

    # rework any word followed by a line with 3 or more dashes
    s = re.sub(r"\n([A-Za-z ]+)\n-{3,}", r"\n**\1**\n", s)

    # rework list of arguments/descriptions (no types)
    s = re.sub(r"\n([A-Za-z0-9_]+)\n {2,}(.*)", r"\n* `\1`: \2", s)

    # rework list of arguments/types/descriptions
    s = re.sub(
        r"\n([A-Za-z0-9_]+) : ([A-Za-z0-9_\[\],\.| ]+)\n {2,}(.*)",
        r"\n* `\1` [`\2`]: \3",
        s,
    )

    # rework list of types/descriptions (return values)
    s = re.sub(r"\n: ([A-Za-z0-9_\[\],\.| ]+)\n {2,}(.*)", r"\n* [`\1`]: \2", s)

    # put the code blocks back in
    for i, b in enumerate(blocks):
        s = s.replace(f"%%%BLOCK{i}", b)

    return s.strip()

astdocs.parse_annotation

parse_annotation(a: typing.Any) -> str:

Format an annotation (object type or decorator).

Dive as deep as necessary within the children nodes until reaching the name of the module/attribute objects are annotated after; save the import path on the way. Recursively repeat for complicated objects.

See the code itself for some line-by-line documentation.

Parameters

  • a [typing.Any]: The starting node to extract annotation information from.

Returns

  • [str]: The formatted annotation.

Known problems

  • The implementation only supports nodes I encountered in my projects.
  • Does not support lambda constructs.
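
To give a concrete feel for the output (a minimal sketch using only node types handled above):

import ast
import astdocs

# parse a single annotation expression and format it back into a string
a = ast.parse("dict[str, list[int]] | None", mode="eval").body
print(astdocs.parse_annotation(a))  # dict[str, list[int]] | None
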
source
def parse_annotation(a: typing.Any) -> str:  # noqa: C901 (ignoring complexity warning)
    """Format an annotation (object type or decorator).

    Dive as deep as necessary within the children nodes until reaching the name of the
    module/attribute objects are annotated after; save the import path on the way.
    Recursively repeat for complicated objects.

    See the code itself for some line-by-line documentation.

    Parameters
    ----------
    a : typing.Any
        The starting node to extract annotation information from.

    Returns
    -------
    : str
        The formatted annotation.

    Known problems
    --------------
    * The implementation only supports nodes I encountered in my projects.
    * Does not support `lambda` constructs.

    """
    s = ""

    # dig deeper: module.object
    if isinstance(a, ast.Attribute):
        s = f"{parse_annotation(a.value)}.{a.attr}"

    # dig deeper: | operator
    elif isinstance(a, ast.BinOp):
        s = parse_annotation(a.left)
        s += " | "
        s += parse_annotation(a.right)

    # dig deeper: @decorator(including=parameter)
    elif isinstance(a, ast.Call):
        s = parse_annotation(a.func)
        s += "("
        s += ", ".join([f"{a_.arg}={parse_annotation(a_.value)}" for a_ in a.keywords])
        s += ")"

    # we dug deep enough and unravelled a value
    elif isinstance(a, ast.Constant):
        s = f'"{a.value}"' if isinstance(a.value, str) else str(a.value)

    # dig deeper: content within a dictionary
    elif isinstance(a, ast.Dict):
        s = "{"
        s += ", ".join(
            [
                f"{parse_annotation(k)}: {parse_annotation(v)}"
                for k, v in zip(a.keys, a.values, strict=True)
            ],
        )
        s += "}"

    # dig deeper: content within a list
    elif isinstance(a, ast.List):
        s = "["
        s += ", ".join([parse_annotation(a_) for a_ in a.elts])
        s += "]"

    # we dug deep enough and unravelled a canonical object
    elif isinstance(a, ast.Name):
        s = a.id

    # dig deeper: complex object, tuple[dict[int, float], bool, str] for instance
    elif isinstance(a, ast.Subscript):
        v = parse_annotation(a.slice)
        s = parse_annotation(a.value)
        s += "["
        s += v[1:-1] if v.startswith("(") and v.endswith(")") else v
        s += "]"

    # dig deeper: content within a set
    elif isinstance(a, ast.Set):
        s = "{"
        s += ", ".join([parse_annotation(a_) for a_ in a.elts])
        s += "}"

    # dig deeper: content within a tuple
    elif isinstance(a, ast.Tuple):
        s = "("
        s += ", ".join([parse_annotation(a_) for a_ in a.elts])
        s += ")"

    return s

astdocs.parse_class

parse_class(
    node: ast.ClassDef,
    module: str,
    ancestry: str,
    classes: dict[str, dict[str, str]],
) -> dict[str, dict[str, str]]:

Parse a class statement.

Parameters

  • node [ast.ClassDef]: The node to extract information from.
  • module [str]: Name of the current module.
  • ancestry [str]: Complete path to the object, used to identify ownership of children objects (functions and methods for instance).
  • classes [dict[str, dict[str, str]]]: Dictionary of all encountered class definitions.

Returns

  • [dict[str, dict[str, str]]]: Dictionary of all encountered class definitions.
source
def parse_class(
    node: ast.ClassDef,
    module: str,
    ancestry: str,
    classes: dict[str, dict[str, str]],
) -> dict[str, dict[str, str]]:
    """Parse a `class` statement.

    Parameters
    ----------
    node : ast.ClassDef
        The node to extract information from.
    module : str
        Name of the current module.
    ancestry : str
        Complete path to the object, used to identify ownership of children objects
        (functions and methods for instance).
    classes : dict[str, dict[str, str]]
        Dictionary of all encountered class definitions.

    Returns
    -------
    : dict[str, dict[str, str]]
        Dictionary of all encountered class definitions.

    """
    ap = f"{ancestry}.{node.name}"  # absolute path to the object
    lp = ap.replace(module, "", 1).lstrip(".")  # local path to the object
    objects[module]["classes"][lp] = ap  # save the object path

    # parse decorator objects
    dc = [f"`@{parse_annotation(d)}`" for d in node.decorator_list]

    # save the object details
    classes[ap] = {
        "ancestry": ancestry,
        "classname": node.name,
        "classdocs": format_docstring(node),
        "decoration": "**Decoration** via " + ", ".join(dc) + "." if dc else "",
        "endlineno": str(node.end_lineno),
        "hashtags": "#" if "c" in config["split_by"] else "###",
        "lineno": str(node.lineno),
    }

    return classes

astdocs.parse_function

parse_function(
    node: ast.AsyncFunctionDef | ast.FunctionDef,
    module: str,
    ancestry: str,
    functions: dict[str, dict[str, str]],
) -> dict[str, dict]:

Parse a def statement.

Parameters

  • node [ast.AsyncFunctionDef | ast.FunctionDef]: The node to extract information from.
  • module [str]: Name of the current module.
  • ancestry [str]: Complete path to the object, used to identify ownership of children objects (functions and methods for instance).
  • functions [dict[str, dict[str, str]]]: Dictionary of all encountered function definitions.

Returns

  • [dict[str, dict]]: Dictionary of all encountered function definitions.

Notes

If *args and keyword-only (kwargs) arguments are present, args.vararg will not be None and the node.args.kwonlyargs / node.args.kw_defaults attributes need to be parsed. Otherwise all arguments should be available in the args / defaults attributes.
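
For instance, the following standard-library snippet (independent of astdocs itself) shows which attributes end up populated:

import ast

node = ast.parse("def f(a, b=1, *args, c, d=2, **kwargs): ...").body[0]
print([a.arg for a in node.args.args])        # ['a', 'b']
print(node.args.vararg.arg)                   # 'args'
print([a.arg for a in node.args.kwonlyargs])  # ['c', 'd']
print(node.args.kwarg.arg)                    # 'kwargs'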

source
def parse_function(
    node: ast.AsyncFunctionDef | ast.FunctionDef,
    module: str,
    ancestry: str,
    functions: dict[str, dict[str, str]],
) -> dict[str, dict]:
    """Parse a `def` statement.

    Parameters
    ----------
    node : ast.AsyncFunctionDef | ast.FunctionDef
        The node to extract information from.
    module : str
        Name of the current module.
    ancestry : str
        Complete path to the object, used to identify ownership of children objects
        (functions and methods for instance).
    functions : dict[str, dict[str, str]]
        Dictionary of all encountered function definitions.

    Returns
    -------
    : dict[str, dict]
        Dictionary of all encountered function definitions.

    Notes
    -----
    If `*args` and some `kwargs` arguments are present, `args.vararg` will not be `None`
    and the `node.args.kwonlyargs` / `node.args.kw_defaults` attributes need to be
    parsed. Otherwise all should be available in the `args` / `defaults` attributes.

    """
    ap = f"{ancestry}.{node.name}"  # absolute path to the object
    lp = ap.replace(module, "", 1).lstrip(".")  # local path to the object
    objects[module]["functions"][lp] = ap  # save the object path

    params = []  # formatted function/method parameters

    # parse decorator objects
    dc = [f"`@{parse_annotation(d)}`" for d in node.decorator_list]

    # parse/format arguments and annotations; with default values if present
    def _parse_format_argument(ann: typing.Any, val: typing.Any = None) -> str:
        """Parse and format an annotation.

        Parameters
        ----------
        ann : typing.Any
            Any type of annotation node.
        val : typing.Any
            Default value for this parameter.

        Returns
        -------
        : str
            Formatted annotation with potential default value.

        """
        s = ann.arg

        if ann.annotation is not None:
            s += ": "
            s += parse_annotation(ann.annotation)

        if val is not None:
            s += f" = {parse_annotation(val)}"

        return s

    # args; to associate default values we need to reverse the argument list
    for ann, val in list(
        itertools.zip_longest(node.args.args[::-1], node.args.defaults[::-1]),
    )[::-1]:
        params.append(_parse_format_argument(ann, val))

    # *args
    if node.args.vararg is not None:
        params.append(f"*{node.args.vararg.arg}")

    # kwargs, only populated if args.vararg is; same association trick as above
    for ann, val in list(
        itertools.zip_longest(node.args.kwonlyargs[::-1], node.args.kw_defaults[::-1]),
    )[::-1]:
        params.append(_parse_format_argument(ann, val))

    # **kwargs
    if node.args.kwarg is not None:
        params.append(f"**{node.args.kwarg.arg}")

    # output
    output = f" -> {parse_annotation(node.returns)}" if node.returns is not None else ""

    # add line breaks if the function call is long (pre-render this latter first, no way
    # around it)
    if len(f'{node.name}({", ".join(params)}){output}') > config["fold_args_after"]:
        params = [f"\n    {p}" for p in params]
        suffix = ",\n"
    else:
        suffix = ""

    # save the object details
    functions[ap] = {
        "ancestry": ancestry,
        "params": ("," if len(suffix) else ", ").join(params) + suffix,
        "decoration": ("**Decoration** via " + ", ".join(dc) + ".") if dc else "",
        "endlineno": str(node.end_lineno),
        "funcdocs": format_docstring(node),
        "funcname": node.name,
        "hashtags": "#" if "f" in config["split_by"] else "###",
        "lineno": str(node.lineno),
        "output": output,
    }

    return functions

astdocs.parse_import

parse_import(
    node: ast.Import | ast.ImportFrom,
    module: str,
    ancestry: str,
    imports: dict[str, str],
) -> dict[str, str]:

Parse import ... [as ...] and from ... import ... [as ...] statements.

The content built by this function is currently not rendered. It is kept around in case the full list of objects (and aliases) accessible within a module is required for post-processing or some later [smart and exciting] implementation.

Parameters

  • node [ast.Import | ast.ImportFrom]: The node to extract information from.
  • module [str]: Name of the current module.
  • ancestry [str]: Complete path to the object, used to identify ownership of children objects (functions and methods for instance).
  • imports [dict[str, str]]: Dictionary of parsed imports.

Returns

  • [dict[str, str]]: Dictionary of all encountered imports. Untouched for now; always the empty dictionary {}.
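
Even though imports are not rendered, they do end up in the objects attribute; a small sketch of what gets stored (keys and values traced from the code below):

import astdocs

astdocs.render(code="import pathlib\nfrom os import path as osp\n", module="demo")
print(astdocs.objects["demo"]["imports"])
# roughly: {'pathlib': 'demo.pathlib', 'osp': 'demo.os.path'}
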
source
def parse_import(
    node: ast.Import | ast.ImportFrom,
    module: str,
    ancestry: str,
    imports: dict[str, str],
) -> dict[str, str]:
    """Parse `import ... [as ...]` and `from ... import ... [as ...]` statements.

    The content built by this function is currently *not* rendered. It is kept around
    in case the full list of objects (and aliases) accessible within a module is
    required for post-processing or some later [smart and exciting] implementation.

    Parameters
    ----------
    node : ast.Import | ast.ImportFrom
        The node to extract information from.
    module : str
        Name of the current module.
    ancestry : str
        Complete path to the object, used to identify ownership of children objects
        (functions and methods for instance).
    imports : dict[str, str]
        Dictionary of parsed imports.

    Returns
    -------
    : dict[str, str]
        Dictionary of all encountered imports. Untouched for now; always the empty
        dictionary `{}`.

    """
    if isinstance(node, ast.Import):
        for n in node.names:
            abspath = f"{ancestry}.{n.name}"
            locpath = n.asname or n.name

            # save the object
            objects[module]["imports"][locpath] = abspath

    if isinstance(node, ast.ImportFrom):
        m = f"{node.module}." if node.module is not None else ""
        v = node.level + 1 if node.level > 0 else 0
        for n in node.names:
            abspath = f'{ancestry}.{"." * v}{m}{n.name}'
            locpath = n.asname or n.name

            # save the object; with support for heresy like "from .. import *" (who does
            # that seriously)
            objects[module]["imports"][locpath] = abspath

    return imports

astdocs.parse

parse(
    node: typing.Any,
    module: str,
    ancestry: str = "",
    classes: dict[str, dict[str, str]] | None = None,
    functions: dict[str, dict[str, str]] | None = None,
    imports: dict[str, str] | None = None,
) -> tuple[dict[str, dict[str, str]], dict[str, dict[str, str]], dict[str, str]]:

Recursively traverse the nodes of the abstract syntax tree.

The present function calls the formatting function corresponding to the node name (if supported) to parse/format it.

Parameters

  • node [typing.Any]: Any type of node to extract information from.
  • module [str]: Name of the current module.
  • ancestry [str]: Complete path to the object, used to identify ownership of children objects (functions and methods for instance).
  • classes [dict[str, dict[str, str]] | None]: Dictionary of parsed class definitions. Defaults to None.
  • functions [dict[str, dict[str, str]] | None]: Dictionary of parsed function definitions. Defaults to None.
  • imports [dict[str, str] | None]: Dictionary of parsed imports. Defaults to None.

Returns

  • [dict[str, dict[str, str]]]: Dictionary of all encountered class definitions.
  • [dict[str, dict[str, str]]]: Dictionary of all encountered function definitions.
  • [dict[str, str]]: Dictionary of all encountered imports.
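
A direct call looks roughly like this (a sketch; seeding the objects attribute mimics what render() does before calling parse()):

import ast
import astdocs

astdocs.objects["demo"] = {"classes": {}, "functions": {}, "imports": {}}

tree = ast.parse("class A:\n    def m(self):\n        ...")
classes, functions, imports = astdocs.parse(tree, "demo", "demo")
print(list(classes))    # ['demo.A']
print(list(functions))  # ['demo.A.m']
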
source
def parse(
    node: typing.Any,
    module: str,
    ancestry: str = "",
    classes: dict[str, dict[str, str]] | None = None,
    functions: dict[str, dict[str, str]] | None = None,
    imports: dict[str, str] | None = None,
) -> tuple[dict[str, dict[str, str]], dict[str, dict[str, str]], dict[str, str]]:
    """Recursively traverse the nodes of the abstract syntax tree.

    The present function calls the formatting function corresponding to the node name
    (if supported) to parse/format it.

    Parameters
    ----------
    node : typing.Any
        Any type of node to extract information from.
    module : str
        Name of the current module.
    ancestry : str
        Complete path to the object, used to identify ownership of children objects
        (functions and methods for instance).
    classes : dict[str, dict[str, str]] | None
        Dictionary of parsed class definitions. Defaults to `None`.
    functions : dict[str, dict[str, str]] | None
        Dictionary of parsed function definitions. Defaults to `None`.
    imports : dict[str, str] | None
        Dictionary of parsed imports. Defaults to `None`.

    Returns
    -------
    : dict[str, dict[str, str]]
        Dictionary of all encountered class definitions.
    : dict[str, dict[str, str]]
        Dictionary of all encountered function definitions.
    : dict[str, str]
        Dictionary of all encountered imports.

    """
    classes = {} if classes is None else classes
    functions = {} if functions is None else functions
    imports = {} if imports is None else imports

    for n in node.body:
        # call the parser for each supported node type
        if n.__class__.__name__ == "ClassDef":
            classes = parse_class(n, module, ancestry, classes)

        elif n.__class__.__name__ in ("AsyncFunctionDef", "FunctionDef"):
            functions = parse_function(n, module, ancestry, functions)

        elif n.__class__.__name__ in ("Import", "ImportFrom"):
            imports = parse_import(n, module, ancestry, imports)

        # not interested
        else:
            pass

        # recursively traverse the ast
        try:
            parse(
                n,
                module,
                f"{ancestry}.{n.name}",
                classes,
                functions,
                imports,
            )
        except AttributeError:
            continue

    return classes, functions, imports

astdocs.render_class

render_class(
    filepath: str,
    name: str,
    classes: dict[str, dict[str, str]],
    functions: dict[str, dict[str, str]],
    config: dict[str, typing.Any] = config,
) -> str:

Render a class object, according to the defined TPL_CLASSDEF template.

Parameters

  • filepath [str]: Path to the module (file) defining the object.
  • name [str]: The name (full path including all ancestors) of the object to render.
  • classes [dict[str, dict[str, str]]]: Dictionary of all encountered class definitions.
  • functions [dict[str, dict[str, str]]]: Dictionary of all encountered function definitions.
  • config [dict[str, typing.Any]]: Configuration options used to render attributes.

Returns

  • [str]: Markdown-formatted description of the class object.
source
def render_class(
    filepath: str,
    name: str,
    classes: dict[str, dict[str, str]],
    functions: dict[str, dict[str, str]],
    config: dict[str, typing.Any] = config,
) -> str:
    """Render a `class` object, according to the defined `TPL_CLASSDEF` template.

    Parameters
    ----------
    filepath : str
        Path to the module (file) defining the object.
    name : str
        The name (full path including all ancestors) of the object to render.
    classes : dict[str, dict[str, str]]
        Dictionary of all encountered class definitions.
    functions : dict[str, dict[str, str]]
        Dictionary of all encountered function definitions.
    config : dict[str, typing.Any]
        Configuration options used to render attributes.

    Returns
    -------
    : str
        `Markdown`-formatted description of the class object.

    """
    ht = classes[name]["hashtags"]

    # select related methods
    fs = [f for f in functions if f.startswith(f"{name}.")]

    # fetch the content of __init__
    n = f"{name}.__init__"
    if n in fs:
        fs.remove(n)
        details = functions.pop(n)
        params = re.sub(r"self[\s,]*", "", details["params"], count=1)
        docstring = details["funcdocs"]
        beglineno = details["lineno"]
        endlineno = details["endlineno"]
        if config["with_linenos"]:
            docstring += f"\n\n%%%SOURCE {filepath}:{beglineno}:{endlineno}"
    else:
        params = ""
        docstring = ""

    # methods rendered
    fsr = []
    for f in fs:
        n = f.split(".")[-1]
        if not n.startswith("_") or config["show_private"]:
            functions[f].update(
                {
                    "hashtags": f"{ht}##",
                    "params": re.sub(
                        r"self[\s,]*", "", functions[f]["params"], count=1
                    ),
                },
            )
            fsr.append(render_function(filepath, f, functions))

    # methods bullet list
    fsl = []
    for _i, f in enumerate(fs):
        n = f.split(".")[-1]
        if not n.startswith("_") or config["show_private"]:
            link = f.replace(".", "").lower()  # github syntax
            desc = functions[f]["funcdocs"].split("\n")[0]
            desc = f": {desc}" if len(desc) else ""
            fsl.append(f"* [`{n}()`](#{link}){desc}")

    # update the description of the object
    classes[name].update(
        {
            "params": params,
            "constdocs": docstring,
            "functions": (ht + "# Methods\n\n" + "\n\n".join(fsr)) if fsr else "",
            "funcnames": ("**Methods**\n\n" + "\n".join(fsl)) if fsl else "",
            "path": filepath,
        },
    )

    return TPL_CLASSDEF.substitute(classes[name]).strip()

astdocs.render_function

render_function(filepath: str, name: str, functions: dict[str, dict[str, str]]) -> str:

Render a def object (function or method).

Follow the defined TPL_FUNCTIONDEF template.

Parameters

  • filepath [str]: Path to the module (file) defining the object.
  • name [str]: The name (full path including all ancestors) of the object to render.
  • functions [dict[str, dict[str, str]]]: Dictionary of all encountered function definitions.

Returns

  • [str]: Markdown-formatted description of the function/method object.
source
def render_function(
    filepath: str,
    name: str,
    functions: dict[str, dict[str, str]],
) -> str:
    """Render a `def` object (function or method).

    Follow the defined `TPL_FUNCTIONDEF` template.

    Parameters
    ----------
    filepath : str
        Path to the module (file) defining the object.
    name : str
        The name (full path including all ancestors) of the object to render.
    functions : dict[str, dict[str, str]]
        Dictionary of all encountered function definitions.

    Returns
    -------
    : str
        `Markdown`-formatted description of the function/method object.

    """
    # update the description of the object
    functions[name].update({"path": filepath})

    return TPL_FUNCTIONDEF.substitute(functions[name]).strip()

astdocs.render_module

render_module(
    name: str,
    docstring: str,
    classes: dict[str, dict[str, str]],
    functions: dict[str, dict[str, str]],
    config: dict[str, typing.Any] = config,
) -> str:

Render a module summary as a Markdown file.

Follow the defined TPL_MODULE template.

Parameters

  • name [str]: Name of the module being parsed.
  • docstring [str]: The docstring of the module itself, if present (defaults to an empty string).
  • classes [dict[str, dict[str, str]]]: Dictionary of all encountered class definitions.
  • functions [dict[str, dict[str, str]]]: Dictionary of all encountered function definitions.
  • config [dict[str, typing.Any]]: Configuration options used to render attributes.

Returns

  • [str]: Markdown-formatted description of the whole module.
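
A minimal call (sketch): with no classes nor functions provided, the corresponding bullet lists are simply left out.

import astdocs

print(astdocs.render_module("demo", "A tiny demo module.", classes={}, functions={}))
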
source
def render_module(
    name: str,
    docstring: str,
    classes: dict[str, dict[str, str]],
    functions: dict[str, dict[str, str]],
    config: dict[str, typing.Any] = config,
) -> str:
    """Render a module summary as a `Markdown` file.

    Follow the defined `TPL_MODULE` template.

    Parameters
    ----------
    name : str
        Name of the module being parsed.
    docstring : str
        The docstring of the module itself, if present (defaults to an empty string).
    classes : dict[str, dict[str, str]]
        Dictionary of all encountered class definitions.
    functions : dict[str, dict[str, str]]
        Dictionary of all encountered function definitions.
    config : dict[str, typing.Any]
        Configuration options used to render attributes.

    Returns
    -------
    : str
        `Markdown`-formatted description of the whole module.

    """
    # self-standing functions bullet list
    fs = []
    for f in functions:
        if f.count(".") == name.count(".") + 1:
            n = f.split(".")[-1]
            if not n.startswith("_") or config["show_private"]:
                link = f.replace(".", "").lower()  # github syntax
                desc = functions[f]["funcdocs"].split("\n")[0]
                desc = f": {desc}" if len(desc) else ""
                fs.append(f"* [`{n}()`](#{link}){desc}")

    # classes bullet list
    cs = []
    for c in classes:
        if c.count(".") == name.count(".") + 1:
            n = c.split(".")[-1]
            if not n.startswith("_") or config["show_private"]:
                link = c.replace(".", "").lower()  # github syntax
                desc = classes[c]["classdocs"].split("\n")[0]
                desc = f": {desc}" if len(desc) else ""
                cs.append(f"* [`{n}`](#{link}){desc}")

    sub = {
        "classnames": "**Classes**\n\n" + "\n".join(cs) if cs else "",
        "docstring": docstring,
        "funcnames": "**Functions**\n\n" + "\n".join(fs) if fs else "",
        "module": name,
    }

    # clean up the unwanted
    if "c" in config["split_by"]:
        sub["classnames"] = ""
    if "f" in config["split_by"]:
        sub["funcnames"] = ""

    return TPL_MODULE.substitute(sub).strip()

astdocs.render

render(
    filepath: str = "",
    remove_from_path: str = "",
    code: str = "",
    module: str = "",
    config: dict[str, typing.Any] = config,
) -> str:

Run the whole pipeline (useful wrapper function when this gets used as a module).

Parameters

  • filepath [str]: The path to the module to process. Defaults to empty string.
  • remove_from_path [str]: Part of the path to be removed, useful if one is rendering the content of a file buried deep down in a complicated folder tree but does not want this to appear in the ancestry of the module. Defaults to empty string.
  • code [str]: Code to process; useful when used as a module. If both filepath and code are provided the latter will be ignored. Defaults to empty string.
  • module [str]: Name of the current module. Defaults to empty string.
  • config [dict[str, typing.Any]]: Configuration options used to render attributes.

Returns

  • [str]: Markdown-formatted content.
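
A minimal programmatic call (sketch), feeding code and module instead of a file path:

import astdocs

code = '''
"""A tiny demo module."""

def hello(name: str) -> str:
    """Greet someone."""
    return f"Hello {name}"
'''

print(astdocs.render(code=code, module="demo"))
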
source
def render(
    filepath: str = "",
    remove_from_path: str = "",
    code: str = "",
    module: str = "",
    config: dict[str, typing.Any] = config,
) -> str:
    """Run the whole pipeline (useful wrapper function when this gets used as a module).

    Parameters
    ----------
    filepath : str
        The path to the module to process. Defaults to empty string.
    remove_from_path : str
        Part of the path to be removed. If one is rendering the content of a file buried
        deep down in a complicated folder tree *but* does not want this to appear in the
        ancestry of the module. Defaults to empty string.
    code : str
        Code to process; useful when used as a module. If both `filepath` and `code` are
        provided the latter will be ignored. Defaults to empty string.
    module : str
        Name of the current module. Defaults to empty string.
    config : dict[str, typing.Any]
        Configuration options used to render attributes.

    Returns
    -------
    : str
        `Markdown`-formatted content.

    """
    _update_templates(config)

    if len(filepath):
        # clean up module name
        if remove_from_path:
            filepath = filepath.replace(remove_from_path, "")

        module = re.sub(r"\.py$", "", filepath.replace("/", ".")).lstrip(".")
        module = module.replace(".__init__", "")
        module = module if len(module) else str(pathlib.Path.cwd()).rsplit("/", 1)[-1]

        # traverse and parse the ast
        with pathlib.Path(filepath).open() as fp:
            n = ast.parse(fp.read())

    elif len(code) and len(module):
        filepath = f"{module}.py"
        n = ast.parse(code)

    else:
        return "Nothing to do."  # user provided nOthINg

    # all objects encountered over a whole run are kept track of
    global objects  # noqa: PLW0602
    objects[module] = {"classes": {}, "functions": {}, "imports": {}}

    # parse it all
    classes, functions, imports = parse(n, module, module, {}, {}, {})

    # render the functions at the root of the module
    fs = []
    for f in functions:
        if f.count(".") == module.count(".") + 1:
            name = f.split(".")[-1]
            if not name.startswith("_") or config["show_private"]:
                fs.append(render_function(filepath, f, functions))

    # render the classes at the root of the module
    cs = []
    for c in classes:
        if c.count(".") == module.count(".") + 1:
            name = c.split(".")[-1]
            if not name.startswith("_") or config["show_private"]:
                cs.append(render_class(filepath, c, classes, functions, config))

    # render each section according to provided options
    sub = {
        "classes": "\n\n".join(
            [
                "## Classes" if "c" not in config["split_by"] and cs else "",
                "\n\n".join(cs) if cs else "",
            ],
        ),
        "functions": "\n\n".join(
            [
                "## Functions" if "f" not in config["split_by"] and fs else "",
                "\n\n".join(fs) if fs else "",
            ],
        ),
        "module": render_module(
            module,
            format_docstring(n),
            classes,
            functions,
            config,
        ),
    }

    s = TPL.substitute(sub).strip()

    # cleanup (extra line breaks)
    s = re.sub(r"\n{3,}", "\n\n", s)
    return re.sub(r"\n{2,}%%%(^SOURCE[A-Z]*)", r"\n%%%\1", s)

astdocs.render_recursively

render_recursively(
    path: str,
    remove_from_path: str = "",
    config: dict[str, typing.Any] = config,
) -> str:

Run pipeline on each Python module found in a folder and its subfolders.

Parameters

  • path [str]: The path to the folder to process.
  • remove_from_path [str]: Part of the path to be removed.
  • config [dict[str, typing.Any]]: Configuration options used to render attributes.

Returns

  • [str]: Markdown-formatted content for all Python modules within the path.

Example

import astdocs
import os
import re

outdir = "docs"

for line in astdocs.render_recursively(...).split("\n"):
    if line.startswith("%%%BEGIN"):
        try:
            output.close()
        except NameError:
            pass
        path = re.sub(
            r"\.py$",
            ".md",
            "/".join([outdir.rstrip("/")] + line.split()[2].split(".")),
        )
        os.makedirs(os.path.dirname(path), exist_ok=True)
        output = open(path, "w")
    else:
        output.write(f"{line}\n")
try:
    output.close()
except NameError:
    pass
source
def render_recursively(
    path: str,
    remove_from_path: str = "",
    config: dict[str, typing.Any] = config,
) -> str:
    r"""Run pipeline on each `Python` module found in a folder and its subfolders.

    Parameters
    ----------
    path : str
        The path to the folder to process.
    remove_from_path : str
        Part of the path to be removed.
    config : dict[str, typing.Any]
        Configuration options used to render attributes.

    Returns
    -------
    : str
        `Markdown`-formatted content for all `Python` modules within the path.

    Example
    -------

    ```python
    import astdocs
    import os
    import re

    outdir = "docs"

    for line in astdocs.render_recursively(...).split("\n"):
        if line.startswith("%%%BEGIN"):
            try:
                output.close()
            except NameError:
                pass
            path = re.sub(
                r"\.py$",
                ".md",
                "/".join([outdir.rstrip("/")] + line.split()[2].split(".")),
            )
            os.makedirs(os.path.dirname(path), exist_ok=True)
            output = open(path, "w")
        else:
            output.write(f"{line}\n")
    try:
        output.close()
    except NameError:
        pass
    ```

    """
    ms = []

    # render each module
    for filepath in sorted(pathlib.Path(path).glob("**/*.py")):
        name = str(filepath).split("/")[-1]
        if not name.startswith("_") or config["show_private"] or name == "__init__.py":
            ms.append(
                render(
                    filepath=str(filepath),
                    remove_from_path=remove_from_path,
                    config=config,
                ),
            )

    s = "\n\n".join(ms)

    # cleanup (extra line breaks)
    return re.sub(r"\n{2,}%%%(^SOURCE[A-Z]*)", r"\n%%%\1", s)

astdocs.postrender

postrender(func: typing.Callable) -> typing.Callable:

Apply a post-rendering function on the output of the decorated function.

This can be used to streamline the linting of the output, or immediately convert to HTML for instance.

Parameters

  • func [typing.Callable]: The function to apply; should take a str as lone input, the Markdown to process.

Returns

  • [str]: Markdown-formatted content.

Example

Some general usage:

import astdocs

def extend_that(md: str) -> str:
    # process markdown
    return md

def apply_this(md: str) -> str:
    # process markdown
    return md

@astdocs.postrender(extend_that)
@astdocs.postrender(apply_this)
def render(filepath: str) -> str:  # simple wrapper function
    return astdocs.render(filepath)

print(render(...))

or more concrete snippets, for instance lint the output immediately:

import astdocs
import mdformat

def lint(md: str) -> str:
    return mdformat.text(md)

@astdocs.postrender(lint)
def render(filepath: str) -> str:
    return astdocs.render(filepath)

print(render(...))

and replace the %%%SOURCE ... markers by <details> HTML tags including the code of each object:

import astdocs
import re

def extract_snippet(md: str) -> str:
    for m in re.finditer(r"^%%%SOURCE (.*):([0-9]+):([0-9]+)$", md, flags=re.M):
        ms = m.group(0)  # matched string
        fp, cs, ce = m.groups()  # path to module, first and last line of snippet
        with open(fp) as f:
            snippet = "".join(f.readlines()[int(cs) - 1 : int(ce)])
        md = md.replace(
            ms, f"<details><summary>Source</summary>\n\n{snippet}\n\n</details>"
        )
    return md

@astdocs.postrender(extract_snippet)
def render(filepath: str) -> str:
    config = astdocs.config.copy()
    config.update({"with_linenos": True})
    return astdocs.render(filepath, config=config)

print(render(...))
source
def postrender(func: typing.Callable) -> typing.Callable:
    r"""Apply a post-rendering function on the output of the decorated function.

    This can be used to streamline the linting of the output, or immediately convert to
    `HTML` for instance.

    Parameters
    ----------
    func : typing.Callable
        The function to apply; should take a `str` as lone input, the `Markdown` to
        process.

    Returns
    -------
    : str
        `Markdown`-formatted content.

    Example
    -------
    Some general usage:

    ```python
    import astdocs

    def extend_that(md: str) -> str:
        # process markdown
        return md

    def apply_this(md: str) -> str:
        # process markdown
        return md

    @astdocs.postrender(extend_that)
    @astdocs.postrender(apply_this)
    def render(filepath: str) -> str:  # simple wrapper function
        return astdocs.render(filepath)

    print(render(...))
    ```

    or more concrete snippets, for instance lint the output immediately:

    ```python
    import astdocs
    import mdformat

    def lint(md: str) -> str:
        return mdformat.text(md)

    @astdocs.postrender(lint)
    def render(filepath: str) -> str:
        return astdocs.render(filepath)

    print(render(...))
    ```

    and replace the `%%%SOURCE ...` markers by `<details>` HTML tags including the code
    of each object:

    ```python
    import astdocs
    import re

    def extract_snippet(md: str) -> str:
        for m in re.finditer(r"^%%%SOURCE (.*):([0-9]+):([0-9]+)$", md, flags=re.M):
            ms = m.group(0)  # matched string
            fp, cs, ce = m.groups()  # path to module, first and last line of snippet
            with open(fp) as f:
                snippet = "".join(f.readlines()[int(cs) - 1 : int(ce)])
            md = md.replace(
                ms, f"<details><summary>Source</summary>\n\n{snippet}\n\n</details>"
            )
        return md

    @astdocs.postrender(extract_snippet)
    def render(filepath: str) -> str:
        config = astdocs.config.copy()
        config.update({"with_linenos": True})
        return astdocs.render(filepath, config=config)

    print(render(...))
    ```

    """

    def decorator(f: typing.Callable) -> typing.Callable:
        def wrapper(*args: list, **kwargs: dict) -> typing.Callable:
            return func(f(*args, **kwargs))

        return wrapper

    return decorator

astdocs.cli

cli() -> None:

Process CLI calls.

source
def cli() -> None:
    """Process CLI calls."""
    config = _update_configuration()

    if len(sys.argv) != 2:
        sys.exit("Wrong number of arguments! Accepting *one* only.")

    try:
        md = render(filepath=sys.argv[1], config=config)
    except IsADirectoryError:
        md = render_recursively(sys.argv[1], config=config)

    sys.stdout.write(f"{md}\n")