Suggestion - Incorporate checkpointing #49

Open
Steven-Kemp opened this issue Feb 22, 2022 · 4 comments
Labels
enhancement New feature or request

Comments

@Steven-Kemp

Hi there,

I've been using TORMES for a little while now and have tried to scale it up for use on an HPC.

Sadly, due to constraints, I am only allowed 12 hours of wall-time, which just about gets me through the Quality Filtering process for around 980 fastqs.

Would it be possible to incorporate checkpoints? For example, if TORMES detects that QC has been completed and all of the files are in the correct place, it could resume from a later stage when I resubmit the job.
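To make the request concrete, here is a minimal sketch of the kind of step-level checkpointing described above. It is not how TORMES works internally: the marker-file convention (`.<step>.done`), the function names, and the idea of listing expected output files per step are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def outputs_present(workdir: Path, expected: Iterable[str]) -> bool:
    """Return True if every expected output exists as a non-empty file."""
    return all(
        (workdir / rel).is_file() and (workdir / rel).stat().st_size > 0
        for rel in expected
    )


def run_step(
    workdir: Path,
    step: str,
    expected: Iterable[str],
    action: Callable[[], None],
) -> bool:
    """Run `action` unless the step already finished. Returns True if it ran.

    A step counts as finished only when its (hypothetical) marker file
    exists AND all expected outputs are still in place.
    """
    expected_list: List[str] = list(expected)
    marker = workdir / f".{step}.done"

    if marker.exists() and outputs_present(workdir, expected_list):
        return False  # checkpoint hit: skip this step on resubmission

    action()

    if not outputs_present(workdir, expected_list):
        raise RuntimeError(f"step '{step}' finished without producing all outputs")

    # Write the marker last, so an interrupted step is never marked as done.
    marker.touch()
    return True
```

On resubmission, a job wrapped like this would skip any step whose marker and outputs are both present, and rerun a step that was cut off by the wall-time limit, because the marker is only written after the outputs have been verified.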

Many thanks!
Steve

@nmquijada
Owner

Hi Steve,

That's definitely a great suggestion. We are preparing a major release of TORMES, and we are planning to include such checkpoints at the end of each step. Hope we can release it soon...

In the meantime, a 12-hour wall-time limit can be a bit tight, especially if you are planning to work with such a high number of samples. I guess you cannot leave TORMES running in the background or something similar, right?

Have you considered dividing your dataset across different TORMES runs (i.e. 98 samples/run instead of 980)? This would give the tool enough time to finish the entire analysis for some subsets within your schedule. The time will depend on the power of your system and on whether you require steps such as assembly, pangenome comparison, etc., but you will get all this information after your first complete run, and you can then adjust the number of samples per run to optimize your time. Let me know if you need some code to automate this.
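As a rough starting point for automating the split, the sketch below divides one sample sheet into smaller ones. It assumes the metadata file has exactly one header row followed by one sample per line; the function name, the output file naming, and the default batch size of 98 are illustrative choices, not part of TORMES.

```python
from pathlib import Path
from typing import List


def split_metadata(metadata: Path, outdir: Path, batch_size: int = 98) -> List[Path]:
    """Split a sample sheet into batch files, repeating the header in each.

    Assumes one header line followed by one sample per line.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    lines = metadata.read_text().splitlines()
    if not lines:
        raise ValueError(f"{metadata} is empty")

    header = lines[0]
    rows = [line for line in lines[1:] if line.strip()]  # ignore blank lines

    outdir.mkdir(parents=True, exist_ok=True)
    batches: List[Path] = []
    for start in range(0, len(rows), batch_size):
        number = start // batch_size + 1
        batch_file = outdir / f"batch_{number:03d}.txt"
        chunk = rows[start:start + batch_size]
        batch_file.write_text("\n".join([header, *chunk]) + "\n")
        batches.append(batch_file)
    return batches
```

Each resulting file can then be submitted as its own job, so that every batch fits inside the wall-time limit.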

If you would like to have all your samples in a single report, I can help you with code to merge those runs afterwards (we are also implementing this in the major release).

Best,
Narciso

@nmquijada nmquijada added the enhancement New feature or request label Feb 23, 2022
@Steven-Kemp
Author

Steven-Kemp commented Feb 23, 2022 via email

@nmquijada
Owner

Hi Steve,

I am really glad to read that. It definitely encourages us to keep moving forward!

All the best,
Narciso

@d-kk

d-kk commented Mar 26, 2024

What about potentially using the Nextflow framework for TORMES workflow execution? Nextflow would automatically enable checkpointing and would also add a host of other great features...
