Hi everyone, I'm new to QIIME2, and to bioinformatics in general, so I've been following this protocol and these instructions for demultiplexing. We're working with paired-end Illumina sequences. After I imported and demultiplexed the data, qiime tools validate reported that the resulting demultiplexed-seqs.qza contains empty sequences. I'm not sure where or how the sequences were lost, or how serious the underlying problem is.
Here's the error message:
Result demultiplexed-seqs.qza does not appear to be valid at level=max: /tmp/qiime2-archive-eqvu81tf/b28f664d-8b9e-4767-aa3f-45a6579f691c/data/PB_375_ATATCG_L001_R1_001.fastq.gz is not a(n) FastqGzFormat file: Missing sequence for record beginning on line 49
Here are the exact commands I ran:
I started with the raw files from the sequencing center (the only change I made was renaming them to forward and reverse). The metadata file is attached: 4246A_Metadata.txt (386 Bytes)
source activate qiime2-2018.11
qiime tools import --type MultiplexedPairedEndBarcodeInSequence --input-path /bioinf/home/acastill/Bioinf_16S_Oct2019/import_4246A --output-path /bioinf/home/acastill/Bioinf_16S_Oct2019/Edits_4246A_Feb2020/multiplexed-seqs.qza
The resulting multiplexed-seqs.qza is a valid file. Then I demultiplexed:
qiime cutadapt demux-paired --i-seqs /bioinf/home/acastill/Bioinf_16S_Oct2019/Edits_4246A_Feb2020/multiplexed-seqs.qza --m-forward-barcodes-file /bioinf/home/acastill/Bioinf_16S_Oct2019/Edits_4246A_Feb2020/metadata_4246A.tsv --m-forward-barcodes-column Barcode --p-error-rate 0 --o-per-sample-sequences /bioinf/home/acastill/Bioinf_16S_Oct2019/Edits_4246A_Feb2020/demultiplexed-seqs.qza --o-untrimmed-sequences /bioinf/home/acastill/Bioinf_16S_Oct2019/Edits_4246A_Feb2020/untrimmed.qza
Later in the script I use standalone cutadapt to remove the empty sequences. They generally make up only a small percentage of all sequences (between 0% and 1-2% are removed), but I'm worried that ignoring the cause of the problem will affect how the data can be interpreted.
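In case it helps with diagnosing this, here is a minimal Python sketch of how the empty records in one of the extracted fastq.gz files could be counted and located. It is not part of the protocol I followed. It assumes the standard four-line FASTQ layout (header, sequence, "+", quality), and the function name find_empty_records is just a hypothetical name chosen for illustration.

```python
import gzip
import sys


def find_empty_records(path):
    """Scan a gzipped FASTQ file for records whose sequence line is empty.

    Assumes the standard 4-line layout: header, sequence, '+', quality.
    Returns (total_records, start_lines), where start_lines holds the
    1-based line number of the header of each empty record.
    """
    total = 0
    empty_starts = []
    line_no = 1  # 1-based line number of the current record's header
    with gzip.open(path, "rt") as handle:
        while True:
            header = handle.readline()
            if not header.strip():
                break  # end of file (a blank header line also stops the scan)
            sequence = handle.readline().rstrip("\r\n")
            handle.readline()  # '+' separator line, not inspected
            handle.readline()  # quality line, not inspected
            total += 1
            if not sequence:
                empty_starts.append(line_no)
            line_no += 4
    return total, empty_starts


if __name__ == "__main__":
    # Usage: python count_empty.py sample_R1.fastq.gz [more files ...]
    for fastq_path in sys.argv[1:]:
        total, empty_starts = find_empty_records(fastq_path)
        percent = 100.0 * len(empty_starts) / total if total else 0.0
        print(f"{fastq_path}: {len(empty_starts)} of {total} records empty "
              f"({percent:.2f}%), first at lines {empty_starts[:5]}")
```

With the standard layout, a record "beginning on line 49" like the one in the error message above would be the 13th record in the file, so a scan like this should report the same line number that the validator complains about.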
Any help would be greatly appreciated!