Contributing to crypto-condor#
Be it issues, pull requests, or suggestions, contributions are welcome!
Requirements for development#
The external dependencies are (not including Python packages):
To configure the Python dependencies and the repo:
Run make install to install the development dependencies.
Use a virtual environment with poetry shell.
Run make init to configure the repo.
When coding and committing, run make all to run the test suite.
make install installs all the dependencies, including the development and
documentation dependencies, inside a virtual environment[1]. It uses a lock
file (poetry.lock) to ensure that package versions are consistent between
developers’ environments.
It also installs the project as an editable package inside that virtual
environment, meaning that changes to the source code are immediately reflected
in the package. The CLI is installed as crypto-condor-cli.
poetry shell activates the virtual environment inside a sub-shell that can be
exited with ctrl+D or by typing exit. This is the recommended way, as
otherwise all commands requiring the venv’s python would have to be preceded by
poetry run.
make all is the go-to command for testing changes. It runs the linter, tests,
and coverage. It also builds the docs, checking for errors and running the
doctest examples, ensuring they stay up-to-date.
You can run commands normally inside this sub-shell:
# Display the CLI's help.
crypto-condor-cli --help
# Run the unit tests.
make test
Design#
The source code is inside the crypto_condor directory. It has three main
modules: cli, primitives, and vectors.
The cli is divided into commands, some of which have their own module. The main
app can be found in main.py and contains some generic commands (i.e. commands
that do not depend on a given primitive) such as method and get-wrapper.
The primitives are separated into modules, each with its own functions to test
implementations, protocols to describe the expected function signatures, and
enums to define the parameters (mode of operation, elliptic curve, etc.).
The vectors contain subdirectories where the source files for test vectors are
stored. These are parsed with the primitive’s _import.py script and then
serialised with protobuf, making it easy to load them at runtime.
Finally, there is a fourth directory, resources. It contains the version of
the method guides that is used by the method commands, as well as the wrapper
templates and examples for each primitive.
Documentation#
The documentation is generated with Sphinx. Most documents are written in
Markdown, thanks to the MyST parser. The exceptions are the documents that make
use of autodoc directives like autofunction, as the sphinx.ext.autodoc
extension doesn’t support Markdown files.[2]
Protobuf#
We use protobuf to store test vectors. Protobuf uses .proto files that describe
the message (in our case the vectors). These are then compiled with protoc to
Python classes. For type-checking and adding docstrings to these classes, we
use mypy-protobuf, which creates .pyi files when compiling with protoc.
You can use the Makefile target compile-proto to compile the protobufs. It
finds the corresponding files, and only updates those that require it. It also
shows the protoc version, which should preferably be included in the commit
message.
Testing#
Testing is done with pytest and pytest-cov for code coverage. The structure
of tests reflects that of crypto_condor: tests under primitives/ test the
functions and implementations directly, as a library user would use them, and
tests under cli/ test the CLI commands. This includes running the wrapper
examples bundled with crypto-condor, which is especially useful as these
examples cover a lot of code, from the CLI to the primitives and test vectors.
Adding new primitives#
Here are some guidelines on how to add a new primitive. To get started, the
handy utils/add_primitive.py script creates templates of most of the necessary
files:
python utils/add_primitive.py <primitive name>
From here on out, we’ll use ML-KEM as an example: since it’s a recently updated module, it is a good reference for new ones.
Test vectors#
Use the script to create:
A protobuf descriptor:
Add a parameter to Vectors that characterises a set of tests.
Add the necessary fields to Test so any source of vectors is supported.
A parsing script.
First, there are the test vectors. The script creates a directory named _mlkem
to store the source files, protobuf descriptors, parsing script, and the
serialized vectors. We mainly use test vectors from NIST CAVP and Project
Wycheproof, though any reference source (RFC, official implementation, etc.)
is welcome.
All test vectors should be serialized. To serialize test vectors we use Protocol Buffers or protobufs for short. You will need two files: a protobuf descriptor and a parsing script.
The protobuf descriptor is a .proto file that defines the messages and their
attributes, similar to Python dataclasses. add_primitive.py creates it with
the two types of message needed by crypto-condor: Vectors and Test. These
are already filled with a common set of values requiring minimal changes.
Vectors should define a characterising parameter (mode of operation, elliptic
curve, etc.), while Test should define fields that allow any source of vectors
to be added easily. This is pretty vague, but the idea is to stop having
separate types of vectors for each source, even if that requires a bit of
additional logic in the parsing script to conform to a “standard format”.
The descriptor is compiled using protoc to a Python module that provides the
messages as classes, which can be imported and used by the primitive module.
The parsing script will use these classes, creating a new instance for each
group of vectors, and parsing the text file to extract the values of each
vector. It also includes a generate_json function that is used to generate the
JSON file that declares the list of protobufs which are available for each
parameter.
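To make the parsing step more concrete, here is a minimal sketch of what such a script does. It is not the generated template: the Vectors and Test dataclasses below are stand-ins for the compiled protobuf classes, and the text format ("[parameter]" headers followed by "key = hex" lines, with "count" starting a new test) is invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Test:
    """Stand-in for the protobuf Test message (hypothetical fields)."""

    id: int
    fields: dict[str, bytes] = field(default_factory=dict)


@dataclass
class Vectors:
    """Stand-in for the protobuf Vectors message (hypothetical fields)."""

    parameter: str
    tests: list[Test] = field(default_factory=list)


def parse_text(text: str) -> list[Vectors]:
    """Create one Vectors instance per group and one Test per vector."""
    groups: list[Vectors] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # Skip blank lines and comments.
        if line.startswith("[") and line.endswith("]"):
            # A header starts a new group for the characterising parameter.
            groups.append(Vectors(parameter=line[1:-1]))
            current = None
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key == "count":
            # "count" marks the beginning of a new test vector.
            current = Test(id=int(value))
            groups[-1].tests.append(current)
        elif current is not None:
            current.fields[key] = bytes.fromhex(value)
    return groups


SAMPLE = """
[ML-KEM-512]
count = 0
ek = 00ff
c = a1b2

[ML-KEM-768]
count = 0
ek = 0102
"""

groups = parse_text(SAMPLE)
print([(g.parameter, len(g.tests)) for g in groups])
```

In the real script, each group would then be serialized with protobuf instead of being kept in memory.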
Primitive#
Second, it creates the primitive module, primitives/MLKEM.py in this case,
where the code to test implementations will live.
As a rule of thumb, this module includes:
An enum which defines a parameter for the primitive (e.g. mode of operation, elliptic curve). This makes it easy to document (with the autoenum directive) and makes it clear which options are implemented. Also, Typer uses enums to provide auto-completion.
A function that loads test vectors, usually based on that enum and the JSON file mentioned above.
One or more Protocol classes that describe the function signature that the implementation must have in order to be tested.
A test function for each operation that is supported, which runs with test vectors or user input files.
A function that runs a Python wrapper (or more if other languages are supported).
Additionally, internal implementations or wrappers of third-party
implementations are considered private. The convention in Python is that the
function name should start with an underscore. To improve its privacy, we do not
include this function in the module’s __dir__() (see below). Python does not
have a way of enforcing this “privacy”: users can still access these functions
if they know they exist. The idea is to convey the message that these are
not meant to be used anywhere else, and that no guarantees are made.
A side-note on imports#
Currently the primitive modules are structured to be imported and used “directly”. For example:
from crypto_condor.primitives import AES
AES.test(...)
We use __dir__ to declare the public API, as it limits what is returned when
using an IDE’s or interpreter’s auto-completion. This lets us remove names
such as logging and avoid exposing functions meant to be used only
internally, like our wrapper of the primitives.
__dir__ returns a list of strings. Objects like type aliases have to be
referenced by name directly (e.g. "CiphertextAndTag"), while most other
objects can be referenced by their __name__ attribute (e.g.
verify.__name__). The advantage of the latter is that renaming the
function/class/etc. using an IDE will change this reference automatically.
CLI commands#
Once this work on the primitive is done, add the integration to the CLI. This
should mostly consist of adding a function for the primitive under the
corresponding command, which parses the inputs with typer.Argument and
typer.Option, and passes them to the corresponding function, e.g.
MLKEM.test_encaps(...).
When the corresponding functions are implemented, add a new entry to the
SUPPORTED_MODES dictionary in constants.py and the necessary tests.
A few aspects to consider:
When adding wrappers, the tool checks that the get-wrapper command is supported for the given primitive, and then looks for a directory under resources/wrappers. This directory must be named as the primitive, in lower-case. Inside it the wrappers are organized by language, each with their own subdirectory named in lower-case. Examples are in subdirectories named <language>-example. Each example has its own sub-subdirectory inside it. These sub-subdirectories are numbered by an increasing counter that starts at 1.
Guides are first written for the documentation then copied with the utils/copy_guides.py script. The name matches the one for the documentation, namely the primitive name in upper-case.
Adding a new harness#
crypto-condor can test functions exposed by a shared library, similar to a
fuzzing hook. To do so, the functions must follow the conventions described by
the harness API. Internally, this means adding a test_lib function to the
corresponding primitive. This function has a particular signature:
def test_lib(ffi: cffi.FFI, lib, functions: list[str]) -> ResultsDict:
    ...
Where:
ffi is the cffi.FFI instance.
lib is the library dlopen’d with ffi.
functions is a list of function names to test, which should correspond to the primitive called.
Each primitive is in charge of calling ffi.cdef() to define the signature of
the exposed function, and of wrapping and testing it. The
crypto_condor.harness.test_harness function is in charge of determining the
available functions, importing the corresponding primitives, and passing the
list of functions that each primitive should test.
The documentation for this mode can be found in docs/source/harness-api.
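As a rough illustration of the dispatch step that test_harness performs, the sketch below groups exported function names by primitive, producing the list that each primitive would receive. The CC_<primitive>_<operation> naming scheme and the group_by_primitive helper are assumptions made for this example; the actual convention is the one described by the harness API.

```python
from collections import defaultdict


def group_by_primitive(names: list[str], prefix: str = "CC_") -> dict[str, list[str]]:
    """Group function names by the primitive encoded in their name.

    Assumes the hypothetical scheme <prefix><primitive>_<operation>.
    Names without the prefix are ignored.
    """
    groups: defaultdict[str, list[str]] = defaultdict(list)
    for name in names:
        if not name.startswith(prefix):
            continue  # Not a harness function, e.g. an internal helper.
        primitive = name[len(prefix):].split("_", 1)[0]
        groups[primitive].append(name)
    return dict(groups)


exported = ["CC_AES_encrypt", "CC_AES_decrypt", "CC_MLKEM_encaps", "helper"]
grouped = group_by_primitive(exported)
print(grouped)
```

Each value of the resulting dictionary plays the role of the functions argument passed to the corresponding primitive's test_lib.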
Documenting a new primitive#
The documentation can be found under docs/source. There, it is divided into
several directories which correspond to different pages in the HTML render. As
indicated above, most documents can be written in Markdown, but those that make
use of autodoc must be written in rST as autodoc doesn’t support Markdown.
Building the documentation#
The packages required to build the documentation can be installed with
poetry install --with=docs. Then you can either use make docs, which builds the
docs to docs/build/html, or use make livedocs, which uses sphinx-autobuild to
build the docs, watch for changes, and reload open tabs after rebuilding.
Both options ensure that the dependencies are installed before building.
For publishing, the docs are automatically built by the CI. It uses the
pages-ci target, which calls the all-versions target of docs/Makefile. This
target uses a hard-coded list of Git refs (tags or branches), checks out each
ref, and builds its corresponding documentation under docs/build/public/[ref].
Then the pages-ci target moves the resulting docs to the correct directory
used by GitLab Pages.
Versioning#
As indicated in the README, this project currently adheres to CalVer. This
version is shown in various parts of the project (--version option, the
documentation, the git tags, etc.). For each release, the version must be
updated in both the git tag and pyproject.toml, otherwise the CI pipeline will
fail the publish step.
To avoid pushing a tagged version with an out-of-date pyproject.toml or vice
versa, you can add a pre-push hook that runs the
utils/check_tag_and_version.py script. Create .git/hooks/pre-push with the
following content:
#!/bin/sh
# Only check the tag and version when pushing from the main branch.
current=$(git branch --show-current)
if test "$current" = "main"
then
    .venv/bin/python utils/check_tag_and_version.py
fi
The hook runs the check only on the main branch, as other branches should not
be tagged. It also assumes that a virtual environment is used to run and
test the tool, and that this venv is inside the .venv directory.
Note: when using poetry, it might be necessary to run poetry install to
refresh the package version, otherwise the hook will fail.
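For readers curious about what such a check involves, here is a minimal sketch of comparing a tag with the version declared in pyproject.toml. It is not the content of utils/check_tag_and_version.py: the function names are hypothetical, the optional "v" tag prefix is an assumption, and the regular expression is a simplification that only handles a top-of-line version = "..." entry.

```python
import re


def pyproject_version(text: str) -> str:
    """Extract the first top-of-line version = "..." entry from the file text."""
    match = re.search(r'^version\s*=\s*"([^"]+)"', text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no version field found")
    return match.group(1)


def tag_matches_version(tag: str, pyproject_text: str) -> bool:
    """Compare a git tag with the declared version (assumes an optional "v" prefix)."""
    return tag.removeprefix("v") == pyproject_version(pyproject_text)


PYPROJECT = """
[tool.poetry]
name = "example"
version = "2024.08.19"
"""

up_to_date = tag_matches_version("2024.08.19", PYPROJECT)
outdated = tag_matches_version("2024.09.01", PYPROJECT)
print(up_to_date, outdated)
```

A hook built on this idea would exit with a non-zero status when the two values differ, which makes git abort the push.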
Contributing to CONTRIBUTING#
Modifications to CONTRIBUTING must be done to the version found in
docs/source/development/CONTRIBUTING, as the one found in the root of the repo
is a copy of that version (see the root Makefile’s copy-contributing target).