Rust-based parsers and tooling used by django-components. Exposed as a Python package with maturin.
pip install djc-core
Transform HTML in a single pass. This is a simple implementation.
This implementation was found to be 40-50x faster than our Python implementation, taking ~90ms to parse 5 MB of HTML.
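To reproduce a measurement like this on your own machine, a small timing harness is enough. The following is a minimal sketch: make_html and best_time_ms are hypothetical helpers, the synthetic document is just a small snippet repeated, and the numbers it yields depend on hardware and input, so they need not match the figures quoted above.

```python
import time


def make_html(target_bytes: int) -> str:
    """Build a synthetic HTML document of at least target_bytes bytes."""
    chunk = "<div><p>Hello</p></div>\n"  # ASCII only, so characters == bytes
    repeats = target_bytes // len(chunk) + 1
    return chunk * repeats


def best_time_ms(transform, html: str, runs: int = 5) -> float:
    """Return the fastest wall-clock time in milliseconds over several runs."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        transform(html)
        timings.append((time.perf_counter() - start) * 1000)
    return min(timings)


# With djc-core installed, the transformer could be timed like this:
# from djc_core import set_html_attributes
# html = make_html(5 * 1024 * 1024)
# print(best_time_ms(
#     lambda h: set_html_attributes(
#         h, root_attributes=["data-root-id"], all_attributes=["data-v-123"]
#     ),
#     html,
# ))
```

Taking the minimum over several runs reduces noise from other processes; a release build (rather than a debug build) is likely needed for representative numbers.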
Usage
from djc_core import set_html_attributes
html = '<div><p>Hello</p></div>'
result, _ = set_html_attributes(
html,
# Add attributes to the root elements
root_attributes=['data-root-id'],
# Add attributes to all elements
all_attributes=['data-v-123'],
)
To avoid re-parsing the HTML, set_html_attributes returns not just the transformed HTML, but also a dictionary as the second item.
This dictionary records which HTML attributes were written to which elements.
To populate this dictionary, you need to set watch_on_attribute to an attribute name.
Then, during the HTML transformation, each element is checked for this attribute. If the element has this attribute, we:
- Get the value of said attribute
- Record the attributes that were added to the element, using the value of the watched attribute as the key.
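The two steps above can be modelled in a few lines of plain Python. This is only an illustrative sketch of the bookkeeping, not the Rust implementation: the helper name record_added_attributes and the (attributes, is_root) input shape are hypothetical, and the actual HTML parsing is left out entirely.

```python
def record_added_attributes(elements, root_attributes, all_attributes, watch_on_attribute):
    """Hypothetical model of the bookkeeping; not part of djc_core.

    `elements` is a list of (existing_attributes, is_root) pairs standing in
    for parsed HTML elements.
    """
    captured = {}
    for existing_attributes, is_root in elements:
        # Root elements receive the root attributes plus the shared ones.
        added = list(root_attributes) if is_root else []
        added += list(all_attributes)
        # Step 1: get the value of the watched attribute, if present.
        watch_value = existing_attributes.get(watch_on_attribute)
        if watch_value is not None:
            # Step 2: record the added attributes under that value.
            captured[watch_value] = added
    return captured


elements = [
    ({"data-watch-id": "123"}, True),   # stands in for a root <div>
    ({"data-watch-id": "456"}, False),  # stands in for a nested <p>
]
captured = record_added_attributes(
    elements,
    root_attributes=["data-root-id"],
    all_attributes=["data-djc-tag"],
    watch_on_attribute="data-watch-id",
)
print(captured)
# {'123': ['data-root-id', 'data-djc-tag'], '456': ['data-djc-tag']}
```

Elements without the watched attribute still receive the new attributes in this model, but nothing is recorded for them.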
from djc_core import set_html_attributes
html = """
<div data-watch-id="123">
<p data-watch-id="456">
Hello
</p>
</div>
"""
result, captured = set_html_attributes(
html,
# Add attributes to the root elements
root_attributes=['data-root-id'],
# Add attributes to all elements
all_attributes=['data-djc-tag'],
# Watch for this attribute on elements
watch_on_attribute='data-watch-id',
)
print(captured)
# {
# '123': ['data-root-id', 'data-djc-tag'],
# '456': ['data-djc-tag'],
# }
This project uses a multi-crate Rust workspace structure to maintain clean separation of concerns:
- djc-html-transformer: Pure Rust library for HTML transformation
- djc-template-parser: Pure Rust library for Django template parsing
- djc-core: Python bindings that combine all other libraries
To make sense of the code, note that the Python API and the Rust logic are defined separately:
- Each crate (AKA Rust package) has lib.rs (which is like Python's __init__.py). These files do not define the main logic, only the public API of the crate, that is, the API to be used by other crates.
- The djc-core crate imports the other crates.
- Only in djc-core do we define the Python API, using PyO3.
- Set up the Python environment
  python -m venv .venv
- Install dependencies
  pip install -r requirements-dev.txt
  The dev requirements also include maturin, which is used for packaging a Rust project as a Python package.
- Install Rust
- Run Rust tests
  cargo test
- Build the Python package
  maturin develop
  To build the production-optimized package, use maturin develop --release.
- Run Python tests
  pytest
  NOTE: When running Python tests, you need to run maturin develop first.
Deployment is done automatically via GitHub Actions.
To publish a new version of the package, you need to:
- Bump the version in pyproject.toml and Cargo.toml.
- Open a PR and merge it to main.
- Create a new tag on the main branch with the new version number (e.g. 1.0.0), or create a new release in the GitHub UI.