Plugin error from dada2, 16S paired-end fastq files can't generate feature table through dada2

Hi! Thank you for pointing out the issue.
I found that this may be caused by the way I downloaded and converted the files.
These are the steps I used before:

  1. Download the NCBI sample files with prefetch.
    Generally, the download command looks like this:
prefetch SRA_ID --location NCBI
  2. After downloading the .sralite file (an SRA archive in SRA Lite format, not an ordinary zip file), I used fastq-dump to convert it to FASTQ:
fastq-dump --split-3 SRA.sralite

  3. Finally, two paired-end files or one single-end file are generated. The file content looks like this:

@SRR19603331.1 1 length=251
ACTCCTACGGGAGGCAGCAGTAGGGAATCTTCCACAATGGACGCAAGTCTGATGGAGCAACGCCGCGTGAGTGAAGAAGGTTTTCGGATCGTAAAGCTCTGTTGTTGGTGAAGAAGGATAGAGGTAGTAACTGGCCTTTATTTGACGGTAATCAACCAGAAAGTCACGGCTAACTACGTGCCAGCAGCCGCGGTAATACGTAGGTGGCAAGCGTTGTCCGGATTTATTGGGCGTCACGTGAGAGCAGGCGG
+SRR19603331.1 1 length=251
???????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
@SRR19603331.2 2 length=251
ACTCCTACGGGAGGCAGCAGTAGGGAATCTTCCACAATGGACGCAAGTCTGATGGAGCAACGCCGCGTGAGTGAAGAAGGTTTTCGGATCGTAAAGCTCTGTTGTTGGTGAAGAAGGATAGAGGTAGTAACTGGCCTTTATTTGACGGTAATCAACCAGAAAGTCACGGCTAACTACGTGCCAGCAGCCGCGGTAATACGTAGGTGGCAAGCGTTGTCCGGATTTATTGGGCGTAAAGTGAGCGCAGGCGG
+SRR19603331.2 2 length=251
???????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
@SRR19603331.3 3 length=251
ACTCCTACGGGAGGCAGCAGTAGGGAATCTTCCACAATGGACGCAAGTCTGATGGAGCAACGCCGCGTGAGTGAATAAGGTTTTCGGATCGTAAAGCTCTGTTGTTGGTGAAGAAGGATAGAGGTAGTAAATGGCCTTTATTTGAAGGTAATCAACCAGAAAGTCACGGCTAACTACGTGCCAGCAGCCGCGGTAATACGTAGGTGGCAAGCGTTGTCCGGATGTATTGGGCGTAAAGCGAGCGCAGGCGG
+SRR19603331.3 3 length=251
???????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????

We can conclude that this file is not right, because every quality line contains only '?' characters.
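
To check for this pattern without reading the file by eye, something like the following could be used. It is only a sketch: the function names are made up for illustration, and it assumes the plain four-line FASTQ layout shown above and the Phred+33 quality encoding.

```shell
# Convert one quality character to its Phred score.
# Assumption: Phred+33 encoding (ASCII code minus 33).
phred_score() {
    ascii=$(printf '%d' "'$1")
    echo $((ascii - 33))
}

# Exit 0 when every quality character in the file is the same one,
# which is the pattern seen in the '?'-only output above.
# Assumption: exactly four lines per record, so every 4th line is quality.
fastq_quality_is_flat() {
    awk 'NR % 4 == 0 {
             for (i = 1; i <= length($0); i++) seen[substr($0, i, 1)] = 1
         }
         END {
             n = 0
             for (c in seen) n++
             exit (n == 1 ? 0 : 1)
         }' "$1"
}
```

For example, `phred_score '?'` prints 30, so a '?'-only quality line means every base is reported with the same score of 30. `fastq_quality_is_flat reads_1.fastq && echo "flat quality"` (with a hypothetical file name) would flag a file like the one above.
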
However, when I use wget to download the SRA file, the result is very different.

  1. I open the web page of one of the runs.

    Then I copy the AWS URL and use wget to download the file:
wget -b -c https://sra-pub-run-odp.s3.amazonaws.com/sra/SRR19603331/SRR19603331
  2. I use parallel-fastq-dump to generate FASTQ files from the downloaded file:
parallel-fastq-dump -t 20 -O ./ --split-3 -s SRR19603331
  3. I check the FASTQ files. The result is shown below:
@SRR19603331.1 1 length=251
ACTCCTACGGGAGGCAGCAGTAGGGAATCTTCCACAATGGACGCAAGTCTGATGGAGCAACGCCGCGTGAGTGAAGAAGGTTTTCGGATCGTAAAGCTCTGTTGTTGGTGAAGAAGGATAGAGGTAGTAACTGGCCTTTATTTGACGGTAATCAACCAGAAAGTCACGGCTAACTACGTGCCAGCAGCCGCGGTAATACGTAGGTGGCAAGCGTTGTCCGGATTTATTGGGCGTCACGTGAGAGCAGGCGG
+SRR19603331.1 1 length=251
FFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FF,FFFFFFFFFF:FFFFFFFFFF:FFFFFFF:FF:FFFFFFFFFFFFF:FFFFFFF,F::FF:F:FFF,FF:FFFFFFF:F:,FFFFFFF::F:FF:F:FFFFF:FFF,FFFFFFFFFFFFFFFFF:FFF::FFF::FFFFFF:FFFF:FFF:FFF::FFFFF:,F,FFFFFFF::F,F:,FFFF,FF:F,FFFFF,FF,FF:F:
@SRR19603331.2 2 length=251
ACTCCTACGGGAGGCAGCAGTAGGGAATCTTCCACAATGGACGCAAGTCTGATGGAGCAACGCCGCGTGAGTGAAGAAGGTTTTCGGATCGTAAAGCTCTGTTGTTGGTGAAGAAGGATAGAGGTAGTAACTGGCCTTTATTTGACGGTAATCAACCAGAAAGTCACGGCTAACTACGTGCCAGCAGCCGCGGTAATACGTAGGTGGCAAGCGTTGTCCGGATTTATTGGGCGTAAAGTGAGCGCAGGCGG
+SRR19603331.2 2 length=251
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFF:FFFFFFFFFFF:F:FFF:FFFFF:FFFF
@SRR19603331.3 3 length=251
ACTCCTACGGGAGGCAGCAGTAGGGAATCTTCCACAATGGACGCAAGTCTGATGGAGCAACGCCGCGTGAGTGAATAAGGTTTTCGGATCGTAAAGCTCTGTTGTTGGTGAAGAAGGATAGAGGTAGTAAATGGCCTTTATTTGAAGGTAATCAACCAGAAAGTCACGGCTAACTACGTGCCAGCAGCCGCGGTAATACGTAGGTGGCAAGCGTTGTCCGGATGTATTGGGCGTAAAGCGAGCGCAGGCGG
+SRR19603331.3 3 length=251
FFFFF,FF::F:::FFFFFFFFFF:FF:FFFF,:FFFFFFFFFFFF,,FFFFFFFFFF,FFFFF,FF,:FF:FFF,,::FFFFFF,,FFFFFF:FFF,:,FF,FF,FFFFFFF:F,FFFFFFFFFF:F:F,,F:F::F,F,F:,F,FF:,FF:FFF,FFF,F,FFF:,,F,FFFF,F,FF:FFF:FF:FFFF:FF,F,,F,F,FF:::F,FFFF,FFFF:,,F,F,F:F:,F::F:,,:,,F:,FFF,FF,

This file is right! In a nutshell, it seems the problem is caused by prefetch: it fetched the .sralite version of the run, which does not keep the original per-base quality scores.
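
The wget route could be wrapped in a small helper like the one below. This is a rough sketch, not a tested pipeline: the function names are hypothetical, and it assumes that other runs follow the same URL pattern as SRR19603331, so the URL shown on the run's own page should be treated as the authoritative one.

```shell
# Build the AWS URL for a run accession.
# Assumption: the pattern seen for SRR19603331 also holds for other runs.
sra_aws_url() {
    printf 'https://sra-pub-run-odp.s3.amazonaws.com/sra/%s/%s\n' "$1" "$1"
}

# Download one run and convert it with the same commands used above.
# wget runs in the foreground here (no -b), so the conversion only
# starts after the download has finished.
fetch_and_convert() {
    wget -c "$(sra_aws_url "$1")" &&
    parallel-fastq-dump -t 20 -O ./ --split-3 -s "$1"
}
```

Calling `fetch_and_convert SRR19603331` would then repeat the two commands from the steps above for that run.
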

Thank you for your patience and kindness! I will try this approach to generate a feature table and do the species annotation.
