import sequence to qiime2

gaudy93 · September 9, 2019, 12:18pm

Good morning,
I have a question about how to import my sequences. We have single FASTQ files that are already demultiplexed, and the adapters have been removed. What is the correct protocol for importing the sequences and removing the primers when five primers were used during the PCR?
Thank you,
Sergio

colinbrislawn · September 9, 2019, 2:02pm

Good morning Sergio,

You can import your sequences using the Fastq Manifest Format. After importing, you can then run cutadapt to do more trimming.
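As a rough illustration of the manifest step, the Python sketch below writes a single-end manifest in the tab-separated layout (columns `sample-id` and `absolute-filepath`) used by the Fastq Manifest Format. The function name `write_manifest`, the one-file-per-sample layout, and the rule of taking the sample ID from the file name are assumptions for illustration only.

```python
from pathlib import Path


def write_manifest(fastq_dir, manifest_path):
    """Write a single-end manifest (TSV) listing every FASTQ file in fastq_dir.

    Assumption: one file per sample, named <sample-id>.fastq or
    <sample-id>.fastq.gz. Returns the number of samples written.
    """
    fastq_dir = Path(fastq_dir).resolve()
    files = sorted(
        p for p in fastq_dir.iterdir()
        if p.name.endswith((".fastq", ".fastq.gz"))
    )
    lines = ["sample-id\tabsolute-filepath"]
    for path in files:
        # Everything before the first ".fastq" is treated as the sample ID.
        sample_id = path.name.split(".fastq")[0]
        lines.append(f"{sample_id}\t{path}")
    Path(manifest_path).write_text("\n".join(lines) + "\n")
    return len(files)
```

The resulting file could then be imported with something like `qiime tools import --type 'SampleData[SequencesWithQuality]' --input-path manifest.tsv --output-path demux.qza --input-format SingleEndFastqManifestPhred33V2`. The file names here are placeholders, and the format name should be checked against the importing tutorial for your QIIME 2 release.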

That's a lot of primers! How do you plan to process all of them?

Colin

gaudy93 · September 16, 2019, 4:14pm

I don't have any idea. Do you have any suggestions?

colinbrislawn · September 17, 2019, 1:03pm

Working with multiple marker genes is hard, which is why I usually use different primers to target different kingdoms of organisms (bacteria vs fungi), but try not to mix and match primers when working with a single group of microbes.

The most direct approach is to use a closed-reference database search to match reads against known microbes. This 'old school' method is imperfect... but fast and easy. I would start there.
https://docs.qiime2.org/2019.7/plugins/available/vsearch/cluster-features-closed-reference/
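A minimal sketch of how the closed-reference route might be scripted: the Python helper below only assembles the two `qiime vsearch` command lines (dereplication, then closed-reference clustering) as argument lists, so they can be inspected before anything is run. The helper name `closed_reference_commands`, all `.qza` file names, and the 0.97 identity threshold are assumptions; confirm the flags with `--help` on your own release.

```python
import shlex


def closed_reference_commands(demux="demux-trimmed.qza",
                              reference="reference-seqs.qza",
                              identity=0.97):
    """Return two QIIME 2 commands (as argument lists) for closed-reference clustering.

    All artifact names are placeholders chosen for this example.
    """
    dereplicate = [
        "qiime", "vsearch", "dereplicate-sequences",
        "--i-sequences", demux,
        "--o-dereplicated-table", "table.qza",
        "--o-dereplicated-sequences", "rep-seqs.qza",
    ]
    cluster = [
        "qiime", "vsearch", "cluster-features-closed-reference",
        "--i-table", "table.qza",
        "--i-sequences", "rep-seqs.qza",
        "--i-reference-sequences", reference,
        "--p-perc-identity", str(identity),
        "--o-clustered-table", "table-cr.qza",
        "--o-clustered-sequences", "rep-seqs-cr.qza",
        "--o-unmatched-sequences", "unmatched-cr.qza",
    ]
    return [dereplicate, cluster]


if __name__ == "__main__":
    # Print the commands so they can be reviewed or pasted into a shell.
    for cmd in closed_reference_commands():
        print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the file names match your own artifacts.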

Let's see if there is a good 'new school' method for dealing with multiple regions. @Mehrbod_Estaki @jwdebelius

Colin

jwdebelius · September 17, 2019, 1:21pm

There's a technique called Smurf for scaffolding reads using k-mer-based alignment, in which they do an iterative demultiplexing.

The pro is that it is (in theory) a really cool method and is probably better than closed-reference picking. (Insert obligatory skeptical comment about species-level resolution, known databases, and 16S rRNA sequencing in general.) In theory, you should be able to run it on any database with any set of primers you're interested in. In practice, it only runs in MATLAB, which is proprietary, and it has proven difficult to run on anything other than their example data.

So, I think @colinbrislawn's recommendation is probably a good one. Or, I might look at a single hypervariable region.

Best,
Justine

Nicholas_Bokulich · September 17, 2019, 2:25pm

Since you still have the primers attached, it would be straightforward to use q2-cutadapt to trim primers and split your data by primer set all in one go, as @colinbrislawn suggested. Use qiime cutadapt trim-paired (or trim-single if your reads are not paired). See the --help documentation for more details.
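One way the "trim and split by primer set" idea could be sketched: the Python helper below builds one `qiime cutadapt trim-single` command per primer, discarding reads in which that primer is not found, so each output artifact holds a single primer set. The helper name `trim_commands`, the primer labels, and the output file names are hypothetical. The `--p-front` and `--p-discard-untrimmed` options are taken to be available in your q2-cutadapt release; please confirm with `--help`.

```python
def trim_commands(primers, demux="demux.qza"):
    """Build one `qiime cutadapt trim-single` command per primer.

    primers: dict mapping a label (hypothetical, chosen by you) to the
             5' primer sequence expected at the start of the reads.
    Returns a dict mapping each label to its command as an argument list.
    """
    commands = {}
    for label, sequence in primers.items():
        commands[label] = [
            "qiime", "cutadapt", "trim-single",
            "--i-demultiplexed-sequences", demux,
            "--p-front", sequence,
            # Drop reads where this primer was not found, so the output
            # contains only reads amplified with this primer.
            "--p-discard-untrimmed",
            "--o-trimmed-sequences", f"trimmed-{label}.qza",
        ]
    return commands
```

With five primers, this gives five commands and five trimmed artifacts, each of which can then be processed separately. For paired reads, `trim-paired` would be used instead, with its own forward and reverse primer options.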
