Dear QIIME community,
I'm quite new to the forum, so my apologies if I missed an existing topic on this in my search.
I'm running an analysis of 18S Illumina data. I did barcode extraction (extract_barcodes.py), demultiplexing (split_libraries_fastq.py) and splitting of the sequences into per-sample fastq files (split_sequence_file_on_sample_ids.py) in QIIME 1. Then I imported the (unjoined!) data into QIIME 2 using the fastq manifest, with the intention of using DADA2 for denoising.
The problem comes when I use the cutadapt plugin, where I get this message:
Plugin error from cutadapt:
/var/folders/vh/n2zk66mn4l5glnh89srkr10w0000gq/T/q2-CasavaOneEightSingleLanePerSampleDirFmt-casxa8ke/18SLC_2_L001_R1_001.fastq.gz is not a(n) FastqGzFormat file:
Missing sequence for record beginning on line 5
I have already tried importing both uncompressed fastq and compressed fastq.gz files into QIIME 2. I also tried saving the .csv manifest file in the Windows .csv format, since I previously had problems with the Mac OS X .csv format in a different package (QGIS). I always ended up with the same error.
Hi @jakub.zarsky! In the future, it would be really helpful if you provided the full command you are running, since this helps contextualize the error message you are reporting! With that in mind, I suspect this is a bug related to how cutadapt saves data vs. how QIIME 2 expects to see it. Would you be able to send me your data and the exact command you ran? A link to download the files from something like Dropbox or Google Drive would be perfect (you can send it in a direct message to me if privacy is a concern). This will help us reproduce the issue and come up with a game plan for fixing it. Thanks!
Hi @jakub.zarsky, I think we identified the issue; please check out this thread for more detail. In the short term, you can use cutadapt directly to filter. Just remember to use the -m 1 flag (minimum length of 1), which discards reads that end up with zero length after trimming. We will update this thread when q2-cutadapt gets an update (open issue). Thanks!
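For anyone who wants to check their own files in the meantime: the error above ("Missing sequence for record beginning on line 5") looks consistent with a fastq record whose sequence line is empty, which is what the -m 1 workaround is meant to prevent. Below is a rough Python sketch, not part of QIIME 2 or cutadapt, that reports such records and writes a copy without them. The function names (read_fastq, find_empty_records, drop_short_reads) are made up for illustration, and it assumes plain four-line fastq records (no wrapped sequences).

```python
import gzip
from typing import Iterator, List, Tuple

# (header, sequence, separator, quality); hypothetical helper type for this sketch
Record = Tuple[str, str, str, str]


def _open(path: str, mode: str):
    """Open plain or gzip-compressed text files based on the file extension."""
    return gzip.open(path, mode) if path.endswith(".gz") else open(path, mode)


def read_fastq(path: str) -> Iterator[Record]:
    """Yield records from a fastq file, assuming exactly four lines per record."""
    with _open(path, "rt") as handle:
        while True:
            header = handle.readline()
            if not header.strip():
                break  # end of file (or trailing blank line)
            seq = handle.readline()
            sep = handle.readline()
            qual = handle.readline()
            yield (header.rstrip("\n"), seq.rstrip("\n"),
                   sep.rstrip("\n"), qual.rstrip("\n"))


def find_empty_records(path: str) -> List[int]:
    """Return the 1-based line number where each empty-sequence record begins."""
    return [1 + 4 * i
            for i, (_, seq, _, _) in enumerate(read_fastq(path))
            if len(seq) == 0]


def drop_short_reads(src: str, dst: str, min_length: int = 1) -> int:
    """Copy src to dst, keeping only reads of at least min_length bases.

    Returns the number of records kept. With min_length=1 this mimics the
    effect of a minimum-length filter on zero-length reads.
    """
    kept = 0
    with _open(dst, "wt") as out:
        for record in read_fastq(src):
            if len(record[1]) >= min_length:
                out.write("\n".join(record) + "\n")
                kept += 1
    return kept
```

If find_empty_records returns a non-empty list for one of the trimmed files, that file contains zero-length reads, and drop_short_reads (or re-running cutadapt with -m 1) should give a file without them.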