Open
Description
I ran autoexperiment, but my jobs crashed due to an error, so I stopped the autoexperiment process. When I tried to restart the experiments with autoexperiment the next day, I got these messages for each one:
Resume <exp> from job id: <job id>
Current job id for <exp>: <job id>
slurm_load_jobs error: Invalid job id specified
Command 'squeue -j <job id>' returned non-zero exit status 1.
Retrying again in 10 mins for <exp>...
It seems that Slurm no longer tracks these job IDs, so `squeue -j <job id>` fails and autoexperiment keeps retrying. Do I always have to manually delete all the files under logs/ before restarting?
Is this behaviour intended?
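For illustration, here is a minimal sketch of the behaviour I would have expected: treat the "Invalid job id specified" failure as "this job no longer exists" instead of retrying. This is not autoexperiment's actual code. The function names `job_is_gone` and `job_still_tracked` are hypothetical, and the only things taken from the log above are the `squeue -j <job id>` command and the error text.

```python
import subprocess

# Error text copied from the squeue output shown in the log above.
INVALID_JOB_MSG = "Invalid job id specified"


def job_is_gone(returncode: int, stderr: str) -> bool:
    """Return True when squeue failed because Slurm no longer knows the job id."""
    return returncode != 0 and INVALID_JOB_MSG in stderr


def job_still_tracked(job_id: str) -> bool:
    """Hypothetical helper: ask squeue whether it still knows this job id."""
    result = subprocess.run(
        ["squeue", "-j", job_id],
        capture_output=True,
        text=True,
    )
    if job_is_gone(result.returncode, result.stderr):
        # Stale id left over in the logs: the caller could resubmit
        # the experiment instead of waiting and retrying.
        return False
    # Any other non-zero exit is still a real error and should surface.
    result.check_returncode()
    return True
```

With a check like this, a stale job id found in logs/ could trigger a fresh submission rather than the 10-minute retry loop.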
Thank you!