I am working with paired-end sequences, so I ran:
qiime tools import for EMPPairedEndSequences to generate my qza file.
I was trying to demux, but my command timed out (I had allotted 10 hours on the university cluster, with 10 cores, for the code to run). This does not seem right, so now I want to subsample to test whether the code is working at all...
I need help figuring out what is going wrong with this code, and I would also like help figuring out how to demux a subsample of raw-sequences.qza.
Hello @AdventureDavid, allocating multiple cores to qiime demux emp-paired is unfortunately not going to speed things up, because it is a single-threaded action.
How large is your data? Since it sounds like you timed out and didn't otherwise see any errors, it is possible that this is just going to take a long time.
I agree, the dataset might need more time (approximately 220 samples), but I'm also reluctant to run it again until I can confirm the code is running correctly. Based on your comment, the code looks fine, though?
Is there any way I can tell the code to demux only specific barcodes, as a subsample, so I get output faster and can test whether the code is running correctly?
I believe you would have to subsample your raw data before importing it into QIIME 2, which would be difficult to do, but it is possible.
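If you do want to try that, here is a minimal sketch of one way to do it outside QIIME 2, not an official QIIME 2 tool. It assumes your raw directory holds the three files the EMP paired-end import expects (forward.fastq.gz, reverse.fastq.gz, barcodes.fastq.gz), that each FASTQ record is exactly four lines, and that the three files list reads in the same order. It copies the first N records from each file, so the reads and barcodes stay in sync. The function names and the directory names in the usage note are hypothetical.

```python
import gzip
from itertools import islice
from pathlib import Path

# File names expected in an EMP paired-end input directory.
EMP_FILES = ("forward.fastq.gz", "reverse.fastq.gz", "barcodes.fastq.gz")


def head_fastq(src, dst, n_records):
    """Copy the first n_records records from src to dst.

    Assumes plain 4-line FASTQ records (no wrapped sequence lines).
    Returns the number of complete records written.
    """
    n_lines = 0
    with gzip.open(src, "rt") as fin, gzip.open(dst, "wt") as fout:
        for line in islice(fin, 4 * n_records):
            fout.write(line)
            n_lines += 1
    return n_lines // 4


def subsample_emp_dir(in_dir, out_dir, n_records=100_000):
    """Write a truncated copy of each EMP file from in_dir into out_dir."""
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    counts = {
        name: head_fastq(in_dir / name, out_dir / name, n_records)
        for name in EMP_FILES
    }
    # All three files must contain the same number of records,
    # otherwise reads and barcodes would no longer line up.
    if len(set(counts.values())) != 1:
        raise ValueError(f"Record counts differ between files: {counts}")
    return counts
```

For example, subsample_emp_dir("raw-sequences", "raw-sequences-subset", n_records=100_000) would write a smaller copy of the three files, which you could then import and demux exactly as you did with the full data. Keep in mind that the first N reads are not guaranteed to cover every barcode, so this is only a quick check that the commands run, not a representative subsample.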
I talked to some other members of the QIIME 2 team, and the general consensus is that it is not unusual for the action to take this long on a substantial amount of data. Probably give it 24 hours or so, and if it times out after that there may be a problem, but currently there doesn't seem to be any indication that anything has actually gone wrong.