PEAR to qiime2 import

Hi,
I’ve been using PEAR to merge together paired-end fastq files, generating 49 samples overall (all are now .fastq).
I want to import those into qiime2 for the rest of the analysis (16S), but I’m not sure whether to import them as a single-end artifact, for instance using the good old:
qiime tools import \
 --type EMPSingleEndSequences \
 --input-path emp-single-end-sequences \
 --output-path emp-single-end-sequences.qza

Would appreciate your help!

Hi @barakdror,

It sounds like your data may already be demultiplexed (if you have 49 files overall?). The smoothest (if not always the easiest) way to import is probably a manifest file, which for merged reads follows the single-end manifest format. I think, though, that you can use --type "SampleData[JoinedSequencesWithQuality]" in the command. So, for a manifest file saved as manifest.tsv, you could use the command:

qiime tools import \
 --input-path manifest.tsv \
 --output-path joined_seqs.qza \
 --input-format SingleEndFastqManifestPhred33V2 \
 --type "SampleData[JoinedSequencesWithQuality]"

But, modify as needed.
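
In case it helps, here is a minimal sketch of one way to generate that manifest.tsv rather than typing 49 rows by hand. It assumes the merged files sit in one directory and are named <sample-id>.assembled.fastq (PEAR's usual output suffix); the function name and the directory name in the example are hypothetical, so adjust them to your layout.

```shell
# Sketch: build a single-end (V2) manifest from a directory of PEAR-merged reads.
# Assumption: files are named <sample-id>.assembled.fastq; pass another suffix
# as the second argument if yours differ.
make_joined_manifest() {
    local dir="$1"                          # directory holding the merged FASTQ files
    local suffix="${2:-.assembled.fastq}"   # assumed file suffix
    local abs_dir
    abs_dir="$(cd "$dir" && pwd)" || return 1   # the manifest needs absolute paths

    # Header expected by SingleEndFastqManifestPhred33V2
    printf 'sample-id\tabsolute-filepath\n'

    local f
    for f in "$abs_dir"/*"$suffix"; do
        [ -e "$f" ] || continue             # skip if the glob matched nothing
        # Sample ID = file name with the suffix stripped
        printf '%s\t%s\n' "$(basename "$f" "$suffix")" "$f"
    done
}

# Example (hypothetical directory name):
#   make_joined_manifest pear_output > manifest.tsv
```

Check the resulting file before importing: the sample IDs come straight from the file names, so any extra text in the names ends up in the IDs.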

As a quick reminder, pre-joined data isn’t appropriate for DADA2, so if you want to denoise, I suggest Deblur.

Best,
Justine

Thank you @jwdebelius!
Regarding DADA2: I understood that the problem arises only when the F and R reads are joined without any overlap, which forces the insertion of N’s. In my case there is a 14-base overlap between the reads, so I understand this shouldn’t be a problem for DADA2.

Hi @barakdror,

No, the DADA2 algorithm relies on having raw, unfiltered, and unjoined reads. So it’s not about overlap length; it’s about the way the algorithm learns error rates.

Best,
Justine

Hmmm, thanks, maybe I misunderstood our sequencing facility people.
So if I do want to use DADA2 and the overlap is higher than 12 bp (which is the threshold in qiime2, from what I saw in previous posts), then I guess importing the F and R reads into qiime2 and then doing the pairing there is probably wiser.

Thank you for your time!
Barak

Hi @barakdror,

For DADA2, yes, you want to import the reads as paired-end (again, check the manifest format!) and then process in DADA2 from there. This works as long as there has been no quality filtering beyond things like adapter trimming.
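
A rough sketch of what that paired-end route could look like, under some assumptions: the raw reads are named <sample-id>_R1.fastq.gz and <sample-id>_R2.fastq.gz (adjust the suffixes to your files), and the function names and output file names are hypothetical. The truncation lengths are deliberately left as arguments, since they depend on your own quality profiles.

```shell
# Sketch: build a paired-end (V2) manifest from a directory of raw reads.
# Assumption: files are named <sample-id>_R1.fastq.gz and <sample-id>_R2.fastq.gz.
make_paired_manifest() {
    local dir="$1"
    local fwd_suffix="${2:-_R1.fastq.gz}"   # assumed forward-read suffix
    local rev_suffix="${3:-_R2.fastq.gz}"   # assumed reverse-read suffix
    local abs_dir
    abs_dir="$(cd "$dir" && pwd)" || return 1   # the manifest needs absolute paths

    # Header expected by PairedEndFastqManifestPhred33V2
    printf 'sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n'

    local fwd sample rev
    for fwd in "$abs_dir"/*"$fwd_suffix"; do
        [ -e "$fwd" ] || continue           # skip if the glob matched nothing
        sample="$(basename "$fwd" "$fwd_suffix")"
        rev="$abs_dir/$sample$rev_suffix"
        if [ ! -e "$rev" ]; then
            echo "warning: no reverse read for $sample, skipping" >&2
            continue
        fi
        printf '%s\t%s\t%s\n' "$sample" "$fwd" "$rev"
    done
}

# Sketch: import the unjoined reads and let DADA2 do the denoising and merging.
# Arguments: manifest path, forward truncation length, reverse truncation length.
import_and_denoise() {
    local manifest="$1" trunc_f="$2" trunc_r="$3"

    qiime tools import \
        --type 'SampleData[PairedEndSequencesWithQuality]' \
        --input-path "$manifest" \
        --output-path paired_seqs.qza \
        --input-format PairedEndFastqManifestPhred33V2 || return 1

    qiime dada2 denoise-paired \
        --i-demultiplexed-seqs paired_seqs.qza \
        --p-trunc-len-f "$trunc_f" \
        --p-trunc-len-r "$trunc_r" \
        --o-table table.qza \
        --o-representative-sequences rep_seqs.qza \
        --o-denoising-stats denoising_stats.qza
}

# Example (hypothetical directory name; choose truncation lengths from your data):
#   make_paired_manifest raw_reads > paired_manifest.tsv
#   import_and_denoise paired_manifest.tsv <trunc-f> <trunc-r>
```

With only a short overlap between the reads, be careful with the truncation lengths: truncating too aggressively can remove the overlap DADA2 needs to merge the pairs.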

Best,
Justine
