Skip to content

Feature/workflow #18

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 13 commits into
base: master
Choose a base branch
from
Open

Feature/workflow #18

wants to merge 13 commits into from

Conversation

jakevc
Copy link

@jakevc jakevc commented Mar 13, 2025

Summary by CodeRabbit

  • New Features

    • Enhanced automated workflows now include a robust Docker-based build process with integrated quality checks.
    • Launched an expanded research pipeline for bioinformatics, now delivering comprehensive data processing and quality plot visualizations.
    • Introduced a new script for generating quality score histograms.
  • Chores

    • Refined repository management by updating file tracking rules to maintain a cleaner project environment.
    • Updated environment variable declaration for clarity in the Dockerfile.
    • Updated workflow configuration to improve build and testing processes.

Copy link

coderabbitai bot commented Mar 13, 2025

📝 Walkthrough

Walkthrough

The changes update the CI/CD workflow and introduce a new bioinformatics pipeline. The GitHub Actions workflow now includes a job for building a Docker image and linting both the Dockerfile and Snakefile, with the test job updated to depend on the Docker build. A new .gitignore file prevents tracking of specific directories and files. Additionally, a new Snakefile defines several pipeline rules for mapping, processing, and plotting data, and a Python script is added to create a quality histogram plot.

Changes

File(s) Change Summary
.github/workflows/main.yml Added a docker-build job to checkout code, lint the Dockerfile with hadolint, and build the Docker image tagged as snakemake-tutorial. Introduced a lint job (dependent on docker-build) for linting the Snakefile and performing a dry run using the built image. Updated the test job to depend on docker-build, changed checkout action to actions/checkout@v4, and corrected branches_ignore to branches-ignore.
.gitignore New file specifying directories and files (calls, plots, sorted_reads, mapped_reads, and .snakemake) to be ignored by Git.
workflow/Snakefile New workflow file defining rules: all (final target), bwa_map (mapping), samtools_sort (sorting), samtools_index (indexing), bcftools_call (variant calling), and plot_quals (plotting).
workflow/scripts/plot-quals.py New Python script using matplotlib (with the "Agg" backend) and pysam to read variant quality scores, generate a histogram, and save it as an SVG file.

Sequence Diagram(s)

sequenceDiagram
    participant Repo as Repository
    participant Docker as Docker-Build Job
    participant Lint as Lint Job
    participant Test as Test Job

    Repo->>Docker: Checkout code, lint Dockerfile with hadolint, build Docker image
    Docker->>Lint: Notify completion for Docker build
    Repo->>Lint: Checkout code, lint Snakefile, run dry-run with Docker image
    Docker->>Test: Signal dependency met for test job
    Repo->>Test: Checkout code (v4) and run tests using Docker image
Loading
sequenceDiagram
    participant A as Rule All
    participant B as Rule bwa_map
    participant C as Rule samtools_sort
    participant D as Rule samtools_index
    participant E as Rule bcftools_call
    participant F as Rule plot_quals
    participant S as plot-quals.py

    A->>B: Trigger mapping step (generate BAM files)
    B->>C: Pass BAM files for sorting
    C->>D: Provide sorted BAM for indexing
    D->>E: Supply inputs for variant calling
    E->>F: Deliver VCF file for plotting
    F->>S: Invoke Python script to generate quality plot (SVG)
Loading

Possibly related PRs


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9da520d and 9db203d.

📒 Files selected for processing (1)
  • Dockerfile (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • Dockerfile
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: docker-build

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (10)
workflow/scripts/plot-quals.py (3)

6-7: Add error handling for the variant file input

The script assumes the variant file exists and has quality scores, but doesn't include error handling.

-quals = [record.qual for record in VariantFile(snakemake.input[0])]
-plt.hist(quals)
+try:
+    vcf = VariantFile(snakemake.input[0])
+    quals = [record.qual for record in vcf]
+    if not quals:
+        print("Warning: No quality scores found in the VCF file")
+    plt.hist(quals)
+except Exception as e:
+    print(f"Error processing variant file: {e}")
+    raise
🧰 Tools
🪛 Ruff (0.8.2)

6-6: Undefined name snakemake

(F821)


9-9: Add figure formatting before saving

The plot could benefit from labels and title for better interpretation.

-plt.savefig(snakemake.output[0])
+plt.xlabel('Quality Score')
+plt.ylabel('Frequency')
+plt.title('Distribution of Variant Quality Scores')
+plt.tight_layout()
+plt.savefig(snakemake.output[0])
🧰 Tools
🪛 Ruff (0.8.2)

9-9: Undefined name snakemake

(F821)


1-9: Add a main guard for better script reusability

The script directly uses the snakemake variable which will cause errors if run outside of Snakemake.

import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
from pysam import VariantFile

-quals = [record.qual for record in VariantFile(snakemake.input[0])]
-plt.hist(quals)
-
-plt.savefig(snakemake.output[0])
+def plot_qualities(input_file, output_file):
+    try:
+        vcf = VariantFile(input_file)
+        quals = [record.qual for record in vcf]
+        if not quals:
+            print("Warning: No quality scores found in the VCF file")
+        plt.hist(quals)
+        plt.xlabel('Quality Score')
+        plt.ylabel('Frequency')
+        plt.title('Distribution of Variant Quality Scores')
+        plt.tight_layout()
+        plt.savefig(output_file)
+    except Exception as e:
+        print(f"Error processing variant file: {e}")
+        raise
+
+if __name__ == "__main__":
+    # This allows the script to be run both as a standalone script and through Snakemake
+    if 'snakemake' in globals():
+        plot_qualities(snakemake.input[0], snakemake.output[0])
+    else:
+        import sys
+        if len(sys.argv) != 3:
+            print("Usage: python plot-quals.py input.vcf output.svg")
+            sys.exit(1)
+        plot_qualities(sys.argv[1], sys.argv[2])
🧰 Tools
🪛 Ruff (0.8.2)

6-6: Undefined name snakemake

(F821)


9-9: Undefined name snakemake

(F821)

workflow/Snakefile (4)

9-17: Consider adding threads parameter to bwa_map rule

BWA mem can benefit from parallelization, especially for larger datasets.

rule bwa_map:
    input:
        "data/genome.fa",
        "data/samples/{sample}.fastq",
    output:
        "mapped_reads/{sample}.bam",
+   threads: 4
    shell:
-       "bwa mem {input} | samtools view -Sb - > {output}"
+       "bwa mem -t {threads} {input} | samtools view -Sb - > {output}"

19-27: Consider adding threads parameter to samtools_sort rule

Samtools sort can benefit from parallelization for improved performance.

rule samtools_sort:
    input:
        "mapped_reads/{sample}.bam",
    output:
        "sorted_reads/{sample}.bam",
+   threads: 4
    shell:
        "samtools sort -T sorted_reads/{wildcards.sample} "
-       "-O bam {input} > {output}"
+       "-O bam -@ {threads} {input} > {output}"

38-48: Add parallel processing to bcftools commands

Both mpileup and call can benefit from using multiple threads for larger datasets.

rule bcftools_call:
    input:
        fa="data/genome.fa",
        bam=expand("sorted_reads/{sample}.bam", sample=SAMPLES),
        bai=expand("sorted_reads/{sample}.bam.bai", sample=SAMPLES),
    output:
        "calls/all.vcf",
+   threads: 4
    shell:
-       "bcftools mpileup -f {input.fa} {input.bam} | "
-       "bcftools call -mv - > {output}"
+       "bcftools mpileup --threads {threads} -f {input.fa} {input.bam} | "
+       "bcftools call --threads {threads} -mv - > {output}"

1-57: Consider adding resource directives and log files

Adding resource directives and log files can help with workflow management and debugging.

For each rule, consider adding:

  1. Log file output
  2. Resources directives (memory, time)

Example for the bwa_map rule:

rule bwa_map:
    input:
        "data/genome.fa",
        "data/samples/{sample}.fastq",
    output:
        "mapped_reads/{sample}.bam",
    threads: 4
+   resources:
+       mem_mb=8000,
+       runtime=60
+   log:
+       "logs/bwa_map/{sample}.log"
    shell:
-       "bwa mem -t {threads} {input} | samtools view -Sb - > {output}"
+       "bwa mem -t {threads} {input} | samtools view -Sb - > {output} 2> {log}"

Similar changes could be made to the other rules as well.

Would you like me to provide examples for the other rules as well?

.github/workflows/main.yml (3)

8-8: Avoid using empty branches-ignore configuration

The empty branches-ignore list will have no effect. Consider removing it or specifying branches to ignore.

  pull_request:
-    branches-ignore: []
🧰 Tools
🪛 actionlint (1.7.4)

8-8: "branches-ignore" section should not be empty

(syntax-check)


21-21: Remove trailing whitespace

There's unnecessary trailing whitespace on this line.

-        dockerfile: Dockerfile
-      
+        dockerfile: Dockerfile
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 21-21: trailing spaces

(trailing-spaces)


46-54: Upgrade checkout action and add missing newline

The checkout action is using an older version, and the file is missing a newline at the end.

  test:
    runs-on: ubuntu-latest
    needs: docker-build
    steps:
      - name: Checkout Code
-        uses: actions/checkout@v2
+        uses: actions/checkout@v4

      - name: Run Tests
-        run: docker run snakemake-tutorial snakemake --cores 1 
+        run: docker run snakemake-tutorial snakemake --cores 1
🧰 Tools
🪛 actionlint (1.7.4)

51-51: the runner of "actions/checkout@v2" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

🪛 YAMLlint (1.35.1)

[error] 54-54: no new line character at the end of file

(new-line-at-end-of-file)


[error] 54-54: trailing spaces

(trailing-spaces)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e2676d6 and 9a3ba8f.

📒 Files selected for processing (4)
  • .github/workflows/main.yml (1 hunks)
  • .gitignore (1 hunks)
  • workflow/Snakefile (1 hunks)
  • workflow/scripts/plot-quals.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest ...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

  • workflow/scripts/plot-quals.py
🪛 Ruff (0.8.2)
workflow/scripts/plot-quals.py

6-6: Undefined name snakemake

(F821)


9-9: Undefined name snakemake

(F821)

🪛 actionlint (1.7.4)
.github/workflows/main.yml

8-8: "branches-ignore" section should not be empty

(syntax-check)


51-51: the runner of "actions/checkout@v2" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

🪛 YAMLlint (1.35.1)
.github/workflows/main.yml

[warning] 14-14: wrong indentation: expected 6 but found 4

(indentation)


[error] 21-21: trailing spaces

(trailing-spaces)


[error] 54-54: no new line character at the end of file

(new-line-at-end-of-file)


[error] 54-54: trailing spaces

(trailing-spaces)

🔇 Additional comments (8)
.gitignore (1)

1-5: Appropriate configuration for Snakemake workflow outputs

The .gitignore entries correctly exclude directories that will contain generated files from the Snakemake workflow (calls, plots, sorted_reads, mapped_reads) as well as the Snakemake working directory (.snakemake).

workflow/scripts/plot-quals.py (1)

1-5: Configuration setup looks good

The script properly sets the non-interactive Agg backend for matplotlib and imports the necessary libraries.

workflow/Snakefile (4)

1-2: LGTM: Sample definition looks good

The SAMPLES list is correctly defined at the top of the file.


4-7: All rule correctly defines the workflow output

The rule all properly specifies the final output file that the workflow will generate.


29-36: LGTM: Samtools index rule looks good

The samtools_index rule is properly implemented.


50-57: LGTM: Plot quals rule is well implemented

The plot_quals rule correctly uses the script to generate a quality score plot.

.github/workflows/main.yml (2)

11-24: Docker build job looks good

The docker-build job correctly sets up checkout, linting, and image building steps.

🧰 Tools
🪛 YAMLlint (1.35.1)

[warning] 14-14: wrong indentation: expected 6 but found 4

(indentation)


[error] 21-21: trailing spaces

(trailing-spaces)


25-45: Lint job is well configured

The lint job is set up correctly with proper dependencies and steps.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
.github/workflows/main.yml (4)

7-9: Empty 'branches-ignore' Field.
The "branches-ignore" field is set to an empty list, which actionlint flags as an issue. If you do not intend to ignore any branches, consider removing this field altogether, or otherwise specify the branches you want to ignore.

🧰 Tools
🪛 actionlint (1.7.4)

8-8: "branches-ignore" section should not be empty

(syntax-check)


14-14: Incorrect Indentation for Step Items.
YAMLLint expects step items to be indented with 2 additional spaces relative to their parent "steps:" key (i.e. 6 spaces total), but line 14 currently shows 4 spaces. Please adjust the indentation for all step items accordingly.

🧰 Tools
🪛 YAMLlint (1.35.1)

[warning] 14-14: wrong indentation: expected 6 but found 4

(indentation)


22-22: Remove Trailing Spaces.
There are trailing spaces on this line (line 22) that violate YAML lint rules. Please remove the extra whitespace to ensure consistent formatting.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 22-22: trailing spaces

(trailing-spaces)


55-55: Ensure Proper File Ending.
Line 55 contains trailing spaces and is missing a newline at the end of the file. Please remove the trailing spaces and add a newline to adhere to YAML best practices.

-      - name: Run Tests
-        run: docker run snakemake-tutorial snakemake --cores 1 
+      - name: Run Tests
+        run: docker run snakemake-tutorial snakemake --cores 1
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 55-55: no new line character at the end of file

(new-line-at-end-of-file)


[error] 55-55: trailing spaces

(trailing-spaces)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9a3ba8f and 93cf633.

📒 Files selected for processing (1)
  • .github/workflows/main.yml (1 hunks)
🧰 Additional context used
🪛 actionlint (1.7.4)
.github/workflows/main.yml

8-8: "branches-ignore" section should not be empty

(syntax-check)


52-52: the runner of "actions/checkout@v2" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

🪛 YAMLlint (1.35.1)
.github/workflows/main.yml

[warning] 14-14: wrong indentation: expected 6 but found 4

(indentation)


[error] 22-22: trailing spaces

(trailing-spaces)


[error] 55-55: no new line character at the end of file

(new-line-at-end-of-file)


[error] 55-55: trailing spaces

(trailing-spaces)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: docker-build

@jakevc jakevc self-assigned this Mar 13, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
.github/workflows/main.yml (4)

8-8: Revisit the 'branches-ignore' Setting
The branches-ignore field on line 8 is currently set to an empty list. If there is no branch to ignore, consider removing this field to avoid confusion; otherwise, populate it with the appropriate branch names.

🧰 Tools
🪛 actionlint (1.7.4)

8-8: "branches-ignore" section should not be empty

(syntax-check)


14-14: Correct YAML List Item Indentation
Line 14 (the beginning of the "Checkout Code" step) is indented with 4 spaces, but YAML best practices would typically expect 6 spaces for list entries under the steps key. Please update the indentation as shown below:

-    - name: Checkout Code
+      - name: Checkout Code
🧰 Tools
🪛 YAMLlint (1.35.1)

[warning] 14-14: wrong indentation: expected 6 but found 4

(indentation)


22-22: Remove Trailing Spaces
There are trailing spaces detected on line 22. Removing these unnecessary spaces will help keep the YAML file clean and prevent potential linting issues.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 22-22: trailing spaces

(trailing-spaces)


55-55: Fix File Termination Formatting
Line 55 ends with trailing spaces and is missing the newline character at the end of the file. Please update the file so that it ends with a clean newline and without extra spaces.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 55-55: no new line character at the end of file

(new-line-at-end-of-file)


[error] 55-55: trailing spaces

(trailing-spaces)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 93cf633 and d1f0481.

📒 Files selected for processing (1)
  • .github/workflows/main.yml (1 hunks)
🧰 Additional context used
🪛 actionlint (1.7.4)
.github/workflows/main.yml

8-8: "branches-ignore" section should not be empty

(syntax-check)

🪛 YAMLlint (1.35.1)
.github/workflows/main.yml

[warning] 14-14: wrong indentation: expected 6 but found 4

(indentation)


[error] 22-22: trailing spaces

(trailing-spaces)


[error] 55-55: no new line character at the end of file

(new-line-at-end-of-file)


[error] 55-55: trailing spaces

(trailing-spaces)

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (4)
.github/workflows/main.yml (4)

7-8: Review the branches-ignore Setting.
The pull_request trigger currently specifies branches-ignore: [], which is effectively redundant. Consider removing this key or populating it with the intended branch patterns to ignore.

🧰 Tools
🪛 actionlint (1.7.4)

8-8: "branches-ignore" section should not be empty

(syntax-check)


25-25: Remove Trailing Spaces on Line 25.
Trailing whitespace detected on this line. Removing such extraneous spaces will improve readability and prevent potential YAML lint warnings.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 25-25: trailing spaces

(trailing-spaces)


36-36: Remove Trailing Spaces on Line 36.
There are unnecessary trailing spaces on this line. Please remove them to adhere to clean YAML formatting practices.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 36-36: trailing spaces

(trailing-spaces)


91-91: Remove Trailing Spaces on Line 91.
Trailing spaces are present at the end of the Run Tests command. Removing them will ensure consistency in formatting.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 91-91: trailing spaces

(trailing-spaces)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d1f0481 and 9e97cb0.

📒 Files selected for processing (1)
  • .github/workflows/main.yml (1 hunks)
🧰 Additional context used
🪛 actionlint (1.7.4)
.github/workflows/main.yml

8-8: "branches-ignore" section should not be empty

(syntax-check)

🪛 YAMLlint (1.35.1)
.github/workflows/main.yml

[warning] 14-14: wrong indentation: expected 6 but found 4

(indentation)


[error] 25-25: trailing spaces

(trailing-spaces)


[error] 36-36: trailing spaces

(trailing-spaces)


[error] 91-91: trailing spaces

(trailing-spaces)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: docker-build

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (5)
.github/workflows/main.yml (5)

1-9: Workflow Naming and Trigger Configuration
The workflow name change to "Build, Lint, and Test" improves clarity. However, note that the branches-ignore field under pull_request (line 8) is set to an empty array. This can trigger warnings (as flagged by actionlint) and may lead to unintended behavior. Consider either removing this key if it's not needed or specifying the branches that should be ignored.

🧰 Tools
🪛 actionlint (1.7.4)

8-8: "branches-ignore" section should not be empty

(syntax-check)


23-28: Trailing Spaces in Dockerfile Lint Step
Trailing spaces are detected at the end of line 28. Please remove these trailing spaces to ensure the YAML passes linting without issues.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 28-28: trailing spaces

(trailing-spaces)


37-42: Upload Artifact Step Review
The "Upload Artifact" step correctly uploads the Docker image tarball using the specified environment variable. However, be sure to remove any trailing spaces (for example, as flagged on line 39) to maintain clean YAML formatting.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 39-39: trailing spaces

(trailing-spaces)


76-97: Test Job Configuration and Artifact Usage
The test job now depends on the docker-build job and includes the following positive updates:

  • It checks out the code using the updated actions/checkout@v4 and sets up Buildx.
  • The steps for downloading the Docker image artifact, loading the image, and executing the tests using snakemake are correctly implemented.
    Note: There are trailing spaces at the end of line 96. Removing these will help pass YAML linting checks.
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 96-96: trailing spaces

(trailing-spaces)


96-96: Remove Trailing Spaces in Test Run Command
Trailing spaces are present at the end of line 96. Please remove these extra spaces to meet YAML formatting standards.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 96-96: trailing spaces

(trailing-spaces)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9e97cb0 and ee7d2cf.

📒 Files selected for processing (1)
  • .github/workflows/main.yml (1 hunks)
🧰 Additional context used
🪛 actionlint (1.7.4)
.github/workflows/main.yml

8-8: "branches-ignore" section should not be empty

(syntax-check)

🪛 YAMLlint (1.35.1)
.github/workflows/main.yml

[error] 28-28: trailing spaces

(trailing-spaces)


[error] 39-39: trailing spaces

(trailing-spaces)


[error] 96-96: trailing spaces

(trailing-spaces)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: docker-build
🔇 Additional comments (3)
.github/workflows/main.yml (3)

10-12: Environment Variable Configuration
The environment variable DOCKER_TAG_NAME is clearly defined as "snakemake-tutorial", which is consistently used across jobs. No changes needed here.


14-42: Docker-Build Job Configuration
The docker-build job is well structured. The following points are noted:

  • The checkout action is updated to actions/checkout@v4 as intended.
  • The steps for setting up Buildx, linting the Dockerfile (using hadolint/hadolint-action@v3.1.0), building the Docker image (with proper tagging and artifact output), and uploading the artifact are clearly defined.
    Overall, the configuration is consistent and adheres to best practices.
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 28-28: trailing spaces

(trailing-spaces)


[error] 39-39: trailing spaces

(trailing-spaces)


43-75: Lint Job Configuration
The lint job is correctly set up and depends on the successful completion of the docker-build job. Key observations:

  • It uses actions/checkout@v4 with full repository history (fetch-depth: 0), which is appropriate for linting purposes.
  • The Buildx setup and the linter for the Snakefile (using github/super-linter@v7) are well-configured with the necessary environment variables.
  • The steps for downloading the Docker image artifact, loading the image, and performing a dry run of the Snakefile are properly implemented.
    Overall, this job is well crafted.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (5)
.github/workflows/main.yml (5)

8-9: Empty 'branches-ignore' configuration.
The pull_request trigger specifies branches-ignore: [], which is effectively a no-op. If the intent is to trigger on all branches, consider either removing this key or adding a comment to clarify the intent.

🧰 Tools
🪛 actionlint (1.7.4)

8-8: "branches-ignore" section should not be empty

(syntax-check)


28-28: Remove trailing whitespace.
Trailing spaces detected on line 28 can be removed to improve YAML cleanliness.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 28-28: trailing spaces

(trailing-spaces)


39-39: Remove trailing whitespace.
Trailing spaces are found on line 39. Removing these will help maintain consistent formatting.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 39-39: trailing spaces

(trailing-spaces)


64-69: Use environment variable for artifact name consistency.
While the environment variable DOCKER_TAG_NAME is defined with the value snakemake-tutorial, the artifact download step hardcodes the name. For uniformity, consider replacing the hardcoded value with the env variable. For example:

-          name: snakemake-tutorial
+          name: ${{ env.DOCKER_TAG_NAME }}

98-98: Remove extra blank lines.
Static analysis indicates there is an extra blank line at the end. Removing unnecessary blank lines will help conform to style guidelines.

🧰 Tools
🪛 YAMLlint (1.35.1)

[warning] 98-98: too many blank lines

(1 > 0) (empty-lines)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4daa61e and 76eafb7.

📒 Files selected for processing (1)
  • .github/workflows/main.yml (1 hunks)
🧰 Additional context used
🪛 actionlint (1.7.4)
.github/workflows/main.yml

8-8: "branches-ignore" section should not be empty

(syntax-check)

🪛 YAMLlint (1.35.1)
.github/workflows/main.yml

[error] 28-28: trailing spaces

(trailing-spaces)


[error] 39-39: trailing spaces

(trailing-spaces)


[warning] 98-98: too many blank lines

(1 > 0) (empty-lines)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: docker-build

@jakevc jakevc requested a review from johanneskoester March 13, 2025 16:37
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
.github/workflows/main.yml (1)

97-101: ⚠️ Potential issue

Remove Erroneous '-tutorial' Flag in 'Run Tests' Step.
The extra -tutorial argument in the Docker run command is likely unintended and could cause command errors. Please update the command as follows:

-            conda run -n ${{env.DOCKER_TAG_NAME}} -tutorial snakemake --cores 1
+            conda run -n ${{env.DOCKER_TAG_NAME}} snakemake --cores 1
🧰 Tools
🪛 YAMLlint (1.35.1)

[warning] 101-101: too many blank lines

(1 > 0) (empty-lines)

🧹 Nitpick comments (5)
.github/workflows/main.yml (5)

8-9: Ensure Proper 'branches-ignore' Configuration.
The branches-ignore field is currently set to an empty array, which may not be intended. Consider specifying branches to ignore (if any) or remove the key if not used to avoid potential confusion.

🧰 Tools
🪛 actionlint (1.7.4)

8-8: "branches-ignore" section should not be empty

(syntax-check)


27-28: Remove Trailing Whitespace.
Trailing spaces detected on line 28 should be removed to comply with YAML formatting standards.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 28-28: trailing spaces

(trailing-spaces)


37-42: Remove Trailing Whitespace in 'Upload Artifact' Step.
Trailing spaces on line 39 should be eliminated to satisfy YAML linting requirements.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 39-39: trailing spaces

(trailing-spaces)


73-77: Clean Up 'DryRun Snakefile' Step Formatting.
Trailing whitespace is present on line 74. Please remove the extra spaces to improve maintainability and adhere to style standards.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 74-74: trailing spaces

(trailing-spaces)


101-102: Eliminate Extra Blank Line at End of File.
A warning indicates there is one extra blank line at the end of the file. Please remove it to conform with YAML formatting standards.

🧰 Tools
🪛 YAMLlint (1.35.1)

[warning] 101-101: too many blank lines

(1 > 0) (empty-lines)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 76eafb7 and 1150d82.

📒 Files selected for processing (1)
  • .github/workflows/main.yml (1 hunks)
🧰 Additional context used
🪛 actionlint (1.7.4)
.github/workflows/main.yml

8-8: "branches-ignore" section should not be empty

(syntax-check)

🪛 YAMLlint (1.35.1)
.github/workflows/main.yml

[error] 28-28: trailing spaces

(trailing-spaces)


[error] 39-39: trailing spaces

(trailing-spaces)


[error] 74-74: trailing spaces

(trailing-spaces)


[warning] 101-101: too many blank lines

(1 > 0) (empty-lines)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: docker-build
🔇 Additional comments (1)
.github/workflows/main.yml (1)

43-61: Positive Addition: 'lint' Job.
The new lint job that checks out code, sets up Buildx, and lints the Snakefile is a valuable addition to improve code quality.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
.github/workflows/main.yml (4)

7-9: Clarify the use of an empty branches-ignore list.
The branches-ignore field under pull_request is currently set to an empty array. If no branches need to be ignored, consider removing this key to avoid confusion. Otherwise, update it with the intended branch names to ignore.

🧰 Tools
🪛 actionlint (1.7.4)

8-8: "branches-ignore" section should not be empty

(syntax-check)


28-28: Remove Trailing Whitespace.
A static analysis tool flagged trailing spaces on this line. Removing them will help maintain clean YAML formatting.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 28-28: trailing spaces

(trailing-spaces)


73-77: Clean Up Trailing Whitespace in Multi-line Command.
The DryRun Snakefile step has a multi-line command where static analysis noted trailing spaces (e.g., line 74). Removing these trailing spaces will improve YAML cleanliness.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 74-74: trailing spaces

(trailing-spaces)


97-101: Run Tests Command Verification & Cleanup.
The Run Tests step appropriately runs the tests using Docker with the built image. Additionally, note that a static analysis tool flagged an extra blank line (line 101). Please remove any unnecessary newlines at the end of the command block.

🧰 Tools
🪛 YAMLlint (1.35.1)

[warning] 101-101: too many blank lines

(1 > 0) (empty-lines)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1150d82 and 9da520d.

📒 Files selected for processing (1)
  • .github/workflows/main.yml (1 hunks)
🧰 Additional context used
🪛 actionlint (1.7.4)
.github/workflows/main.yml

8-8: "branches-ignore" section should not be empty

(syntax-check)

🪛 YAMLlint (1.35.1)
.github/workflows/main.yml

[error] 28-28: trailing spaces

(trailing-spaces)


[error] 39-39: trailing spaces

(trailing-spaces)


[error] 74-74: trailing spaces

(trailing-spaces)


[warning] 101-101: too many blank lines

(1 > 0) (empty-lines)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: docker-build
🔇 Additional comments (9)
.github/workflows/main.yml (9)

1-2: Workflow Name Update Confirmation.
The workflow has been renamed to "Build, Lint, and Test," which clearly reflects its new responsibilities.


10-11: Environment Variable Consistency.
The environment variable DOCKER_TAG_NAME is clearly defined and used consistently across jobs. No issues found here.


14-42: Review of the docker-build Job.
The docker-build job is well-structured:

  • It checks out the code using actions/checkout@v4.
  • It sets up Buildx and uses hadolint to lint the Dockerfile.
  • The Docker image is built with the tag from env.DOCKER_TAG_NAME and exported as an artifact.
    Please verify that the output artifact (/tmp/${{ env.DOCKER_TAG_NAME }}.tar) is correctly consumed in downstream jobs.
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 28-28: trailing spaces

(trailing-spaces)


[error] 39-39: trailing spaces

(trailing-spaces)


43-51: lint Job: Checkout Step Validated.
The checkout step in the lint job uses actions/checkout@v4 with fetch-depth: 0, which is appropriate for full history if needed.


55-63: Linting Configuration for Snakefile.
The Lint Snakefile step uses github/super-linter@v7 with relevant environment variables. The configuration appears correct to ensure Snakefile standards.


64-69: Artifact Download Consistency.
The artifact downloaded in the lint job uses the same name (snakemake-tutorial) as set by the environment variable. This consistency is good; please ensure the artifact path is correctly utilized in subsequent steps.


70-72: Docker Image Loading Step.
The Load Docker Image step in the lint job correctly loads the artifact. No issues detected here.


78-93: Test Job Download and Setup Verification.
The test job sets up by:

  • Checking out code with actions/checkout@v4.
  • Setting up Buildx.
  • Downloading the Docker image artifact using the environment variable to ensure consistency.
    The step order is logical and adheres to dependency requirements.

94-96: Docker Image Loading in Test Job.
The Load Docker Image step in the test job mirrors the process in the lint job. It is correctly implemented.

@cmeesters
Copy link
Member

@jakevc this only requires review? I noticed it on the Hackathon project site. @johanneskoester?

Copy link
Contributor

@johanneskoester johanneskoester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice and a good idea in principle! One thing worries me though: we now have two sources of the workflow. One in the tutorial docs and one here. I think it would be cool to rather auto test the code snippets in the sphinx docs? One could use this one: https://www.sphinx-doc.org/en/master/usage/extensions/doctest.html

@jakevc
Copy link
Author

jakevc commented Jun 11, 2025

Yeah I see what you mean. I guess the main goal of this PR is to make the whole tutorial maximally portable. This can be useful for individual users, as well as for teaching purposes. I have definitely rebuilt the tutorial from the documentation multiple time for this reason, could be nice to have an official location that is runnable end to end.

What do you think about going the opposite direction: dynamically including snippets from this repo into the sphinx documentation?

@jakevc
Copy link
Author

jakevc commented Jun 11, 2025

The benefit of this approach is that it offloads the testing of the workflow to a more realistic/portable environment, the downside is I don't think there is a suitable sphinx extension that does this.... yet

@johanneskoester
Copy link
Contributor

Yes, I agree, that is better!

@johanneskoester
Copy link
Contributor

I think literalinclude should do the job: https://www.sphinx-doc.org/en/master/usage/restructuredtext/directives.html#directive-literalinclude

When we align snakefiles here with the snippets that show up in the tutorial, one should be able to fully test that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: In review
Development

Successfully merging this pull request may close these issues.

3 participants