Kubernetes running Nextflow test #2149
Replies: 4 comments 1 reply
-
Currently I am actively looking for a solution. I have considered creating a batch of PVCs, with each PVC serving only one sample, which would spread the bandwidth and IOPS pressure across volumes. I am still looking for a way to realize this idea; if you have a better approach, I would be very grateful to discuss it with you.
-
Hi Yang Xiao, based on your testing, does this suggest that Nextflow running on Kubernetes can only use one PVC and persistent volume for the whole pipeline run? My organization is exploring whether to use AWS Batch or Kubernetes as a backend for Nextflow. We have a working POC with AWS Batch, but we are currently struggling to get Nextflow running on an AWS EKS cluster. If you have some time, would you mind writing up a small example of how you set up your cluster and how you were able to deploy and run your pipelines?
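(For anyone exploring an EKS setup: a minimal `k8s` scope in `nextflow.config` might look roughly like the sketch below. The namespace, service account, and claim name are hypothetical placeholders, not values from this thread.)

```groovy
// Sketch of a Nextflow Kubernetes configuration.
// All names below (namespace, service account, PVC) are assumptions.
k8s {
    namespace        = 'nextflow'      // namespace the task pods run in
    serviceAccount   = 'nextflow-sa'   // must have permission to create pods
    storageClaimName = 'nf-work-pvc'   // ReadWriteMany PVC shared by all tasks
    storageMountPath = '/workspace'    // where the PVC is mounted in each pod
}
```

With a config like this, the pipeline can be launched with `nextflow kuberun <pipeline>`, which submits a driver pod that in turn creates one pod per task, all mounting the same claim.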
-
Hi Yang Xiao, thank you for running this test and sharing your results. Nextflow uses a single shared work directory for the entire run, which is why the PVC becomes a bottleneck. I have proposed a solution (#2527) that I think may be the only way to address this issue, short of a complete restructuring of Nextflow itself. If we can direct certain sub-workflows -- for example, a sequence of tasks that operate on the same piece of data -- to run on the same node, then that will eliminate a great deal of traffic, as the intermediate inputs/outputs will not be staged back and forth.
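(As an aside: Nextflow's `pod` directive can already influence node placement per process via a node selector, though that pins a process to a node type rather than co-locating a chain of tasks on one node, which is what #2527 proposes. A sketch, where the `workload=align` label is purely illustrative:)

```groovy
// Hypothetical example: schedule this process's pods only onto nodes
// carrying a given label. The label 'workload=align' is an assumption.
process ALIGN {
    pod nodeSelector: 'workload=align'

    input:
    path reads

    output:
    path 'aligned.bam'

    script:
    """
    align_tool ${reads} > aligned.bam
    """
}
```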
-
Hello, excellent Nextflow team, and thank you very much for the framework you have developed. I am an operations and maintenance engineer working in China, and I am currently testing Nextflow on Kubernetes. I would like to share the results of my testing on Huawei Cloud Kubernetes, along with some observations and optimization suggestions for Nextflow.
Following the documentation at https://www.nextflow.io/docs/latest/kubernetes.html, I first created a PVC (accessModes: ReadWriteMany) and then tested in three stages. The results are as follows:
Stage 1:
Scenario: Huawei Cloud CCE, SFS (capacity type), 23 samples executed concurrently
Runtime: sample number 1770 ran for 17 hours and 41 minutes
Stage 2:
Scenario: Huawei Cloud CCE, SFS (capacity type), single sample, to test with no other instances competing for bandwidth and IOPS
Runtime: sample number 1770 ran for 13 hours and 30 minutes
Stage 3:
Scenario: Huawei Cloud CCE, SFS (standard enhanced), single sample, again with no other instances competing for bandwidth and IOPS
Runtime: sample number 1770 ran for 12 hours and 44 minutes
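(For reference, the shared ReadWriteMany claim used in these tests would look roughly like the manifest below. The claim name, storage class, and size are assumptions for illustration, not the values used in the test; verify the SFS storage class name available in your CCE cluster.)

```yaml
# Hypothetical PVC for the shared Nextflow work directory.
# Name, storageClassName, and size are illustrative only.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nf-work-pvc
spec:
  accessModes:
    - ReadWriteMany          # shared by every task pod in the run
  storageClassName: csi-nas  # assumed Huawei Cloud SFS class; check your cluster
  resources:
    requests:
      storage: 500Gi
```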
The results above show that the demand placed on the PVC is extremely high when multiple samples run at the same time. I therefore suggest that Nextflow make better use of Kubernetes' powerful resource scheduling and controller model: use the local disk of each Kubernetes node for task execution, and copy the data to the shared SFS (or a better target) only after the task completes. This would greatly improve Nextflow's efficiency on Kubernetes. If the Nextflow team is willing, I can continue to track and test an implementation of this approach.
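(Nextflow already has a directive in this spirit: `scratch` runs each task in local scratch space and copies only the declared outputs back to the work directory. It is documented mainly for grid executors, so whether it relieves PVC pressure under the Kubernetes executor would need to be tested; the names and paths below are assumptions.)

```groovy
// Sketch for nextflow.config: run tasks in node-local scratch, copying
// declared outputs back to the shared claim. Names are illustrative.
process {
    scratch = true                     // execute each task on local disk
}
k8s {
    storageClaimName = 'nf-work-pvc'   // hypothetical shared RWX claim
    storageMountPath = '/workspace'
}
```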