This needs to be created manually, as there currently isn't automated support for it: https://snakemake.readthedocs.io/en/stable/snakefiles/testing.html

It could be fairly simple, e.g. compare the median AUROC of the final models and the feature importances on the micro dataset against expected values.
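
A rough sketch of what such a check might look like, in plain Python. The function names, the tolerance of 0.05, and the input shapes (a list of per-seed AUROC values, a dict of feature name to importance) are all assumptions for illustration, not anything the pipeline currently defines.

```python
import math
from statistics import median


def median_auroc_matches(observed, expected, tol=0.05):
    """Return True if the medians of two AUROC lists differ by at most tol.

    `observed` and `expected` are hypothetical lists of AUROC values,
    e.g. one value per random seed of the final model.
    """
    return abs(median(observed) - median(expected)) <= tol


def importances_match(observed, expected, tol=0.05):
    """Return True if both dicts cover the same features and every
    importance value agrees within tol.

    `observed` and `expected` are hypothetical {feature_name: importance}
    mappings; the tolerance is an assumed default.
    """
    if observed.keys() != expected.keys():
        return False
    return all(
        math.isclose(observed[name], expected[name], abs_tol=tol)
        for name in expected
    )
```

The expected values would be stored alongside the micro dataset and regenerated whenever the results are intentionally changed. Comparing within a tolerance rather than exactly is a design choice to absorb small differences across platforms and package versions.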