
Conversion

rst_in_md.rst_to_soup(rst)

Convert reStructuredText to HTML in a manner that is compatible with Markdown.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| rst | str | Raw restructured text to convert to html. | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| BeautifulSoup | BeautifulSoup | HTML converted from reStructuredText, returned as a BeautifulSoup object. |

Source code in rst_in_md/conversion.py
def rst_to_soup(rst: str) -> BeautifulSoup:
    """Convert restructured text to html in a manner that is compatible with markdown.

    Args:
        rst (str): Raw restructured text to convert to html.

    Returns:
        BeautifulSoup: Html converted from restructured text.
    """
    soup = _rst_to_soup(rst)
    return _strip_attributes(soup)

rst_in_md.conversion._rst_to_soup(rst)

Convert reStructuredText to a BeautifulSoup object.

This will convert the reStructuredText to HTML using docutils. The HTML is then converted to a BeautifulSoup object and returned.

Errors and warnings are captured gracefully and raised at the end.

This function is heavily inspired by this rst2html implementation.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| rst | str | The reStructuredText to convert. | required |

Raises:

| Type | Description |
| --- | --- |
| ValueError | If there are any errors or warnings during the conversion. |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| BeautifulSoup | BeautifulSoup | The converted reStructuredText. |

Source code in rst_in_md/conversion.py
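# Imports this excerpt relies on (the listing starts below the module header).
import io
from contextlib import redirect_stderr

import docutils.core
from bs4 import BeautifulSoup


def _rst_to_soup(rst: str) -> BeautifulSoup: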
    """Convert reStructuredText to a BeautifulSoup object.

    This will convert the reStructuredText to HTML using docutils. The HTML is then
    converted to a BeautifulSoup object and returned.

    Errors and warnings are captured gracefully and raised at the end.

    This function is heavily inspired by this [rst2html](https://github.com/andrewpetrochenkov/rst2html.py/blob/b66942f16e93d7260748ecc90867c55a4bb3236d/rst2html/__init__.py)
    implementation.

    Args:
        rst (str): The reStructuredText to convert.

    Raises:
        ValueError: If there are any errors or warnings during the conversion.

    Returns:
        BeautifulSoup: The converted reStructuredText.
    """
    kwargs = {
        "writer_name": "html",
        "settings_overrides": {
            "_disable_config": True,
            "report_level": 2,
        },
    }

    with io.StringIO() as target, redirect_stderr(target):
        parts = docutils.core.publish_parts(rst, **kwargs)
        warning = target.getvalue().strip()

    if warning:
        msg = f"Failed to convert restructured text:\n\n{warning}"
        raise ValueError(msg)

    return BeautifulSoup(parts.get("body"), features="html.parser")
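
The capture-then-raise pattern in the listing above can be tried in isolation with the standard library alone. The following is a minimal sketch, not the library's code: `noisy_convert` is a hypothetical stand-in for `docutils.core.publish_parts`, and `convert_or_raise` is an illustrative name that does not exist in `rst_in_md`.

```python
import io
import sys
from contextlib import redirect_stderr


def noisy_convert(text: str) -> str:
    """Hypothetical converter that reports problems on stderr instead of raising."""
    if "??" in text:
        print("<string>:1: (WARNING/2) suspicious input", file=sys.stderr)
    return text.upper()


def convert_or_raise(text: str) -> str:
    """Run the converter and turn anything it wrote to stderr into a ValueError."""
    # While the block is active, writes to sys.stderr land in `target`.
    with io.StringIO() as target, redirect_stderr(target):
        result = noisy_convert(text)
        # Read the buffer before the StringIO is closed on exit.
        warning = target.getvalue().strip()

    if warning:
        msg = f"Failed to convert:\n\n{warning}"
        raise ValueError(msg)
    return result
```

Calling `convert_or_raise("fine")` returns the converted text, while `convert_or_raise("bad ??")` raises a `ValueError` whose message contains the captured warning. Nothing is printed to the real stderr in either case.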

rst_in_md.conversion._strip_attributes(soup)

Remove specific attributes from the soup.

This removes every attribute from the top-level tags, and removes only the attributes listed in ATTRIBUTES_TO_STRIP from all descendant tags. This took heavy inspiration from this StackOverflow answer.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| soup | BeautifulSoup | Input soup to remove attributes from. | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| BeautifulSoup | BeautifulSoup | Same soup with attributes removed. |

Source code in rst_in_md/conversion.py
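# Imports this excerpt relies on. ATTRIBUTES_TO_STRIP is a module-level
# name that is not shown in this listing.
from bs4 import BeautifulSoup, Tag


def _strip_attributes(soup: BeautifulSoup) -> BeautifulSoup: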
    """Remove specific attributes from the soup.

    This will remove all attributes from the top level tags, and will also remove
    some attributes from the descendants. This took heavy inspiration from this
    [StackOverflow answer](https://stackoverflow.com/a/9045719).

    Args:
        soup (BeautifulSoup): Input soup to remove attributes from.

    Returns:
        BeautifulSoup: Same soup with attributes removed.
    """
    # Remove attributes from the top level tags
    for tag in soup.contents:
        if isinstance(tag, Tag):
            tag.attrs = {}

    # Remove specific attributes from the descendants
    for tag in soup.descendants:
        if isinstance(tag, Tag):
            tag.attrs = {
                key: value
                for key, value in tag.attrs.items()
                if key not in ATTRIBUTES_TO_STRIP
            }
    return soup
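
To see the effect of the two passes, here is a small sketch that repeats the same logic on a hand-written fragment. It assumes `beautifulsoup4` is installed. The value of `ATTRIBUTES_TO_STRIP` below is a hypothetical example, since the real constant is not shown on this page, and `strip_attributes` is an illustrative copy rather than the library function itself.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical value for illustration only; the real constant may differ.
ATTRIBUTES_TO_STRIP = {"class", "id"}


def strip_attributes(soup: BeautifulSoup) -> BeautifulSoup:
    # Pass 1: top-level tags lose every attribute.
    for tag in soup.contents:
        if isinstance(tag, Tag):
            tag.attrs = {}

    # Pass 2: all descendants lose only the listed attributes.
    for tag in soup.descendants:
        if isinstance(tag, Tag):
            tag.attrs = {
                key: value
                for key, value in tag.attrs.items()
                if key not in ATTRIBUTES_TO_STRIP
            }
    return soup


html = (
    '<table class="docutils" border="1">'
    '<tr class="row-odd"><td colspan="2">x</td></tr>'
    "</table>"
)
soup = strip_attributes(BeautifulSoup(html, features="html.parser"))
print(soup)
```

Under these assumptions, the top-level `table` ends up with no attributes at all (including `border`, which is not in the set), the nested `tr` loses its `class`, and the `td` keeps `colspan` because that attribute is not listed.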