Creating a Taxa Barplot for Two Separate Sets of Rep-Seqs?


(Ellen) #1

Having trouble finding an answer out there about this but I imagine there is a way to do this that is probably relatively easy and that I’m just not thinking of!

I have two separate sets of sequences from two separate MiSeq runs. I created taxa barplots for each when I originally got the sequences in (so did all the demux, dada2, etc.). However, now we want to be able to compare the taxonomic composition across both sets. Would I have to re-run everything, or is there any way to just combine them?

I had previously been told that a separate dada2 run needs to be performed for each sequencing run. Is that true? And if so, how would I get one rep-seqs file from two different runs?


(Justine) #2

Hi @Ellenphant,

You can use qiime feature-table merge to combine the two tables. Check out the Fecal Microbiome tutorial, which does some meta analysis.

My three caveats

  1. The way chimera slaying is handled in DADA2 could mean that you get different chimeras for each run. It’s fine, it’s just worth noting.
  2. You will have to re-calculate beta diversity, since this depends on the full set of samples being compared. I would check qiime diversity beta-diversity to see if there’s a way you can calculate it over only the intersection between the merged samples, and then merge the tables, but I’m not sure.
  3. If you used MD5, there’s a small but non-zero possibility that two different sequences might have gotten the same identifier. It may be worth checking your rep-seqs file if you’re getting super weird results.
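
For that third caveat, here is a rough sketch of the kind of sanity check I mean. It is plain Python, not a QIIME 2 command, and it assumes you have already pulled (feature ID, sequence) pairs out of each run's rep-seqs (for example from an exported FASTA file) and that the IDs are the MD5 hex digest of the sequence exactly as written. The function names are made up for illustration.

```python
import hashlib
from collections import defaultdict


def md5_id(seq):
    """Return the MD5 hex digest of a sequence string (assumed ID scheme)."""
    return hashlib.md5(seq.encode("utf-8")).hexdigest()


def find_id_problems(records):
    """Check (feature_id, sequence) pairs pooled from all runs.

    Returns two things:
      collisions: IDs that map to more than one distinct sequence
      mismatched: IDs that are not the MD5 digest of their sequence
    """
    seqs_by_id = defaultdict(set)
    for feature_id, seq in records:
        seqs_by_id[feature_id].add(seq)

    collisions = {fid: seqs for fid, seqs in seqs_by_id.items() if len(seqs) > 1}
    mismatched = [
        fid
        for fid, seqs in seqs_by_id.items()
        if any(md5_id(s) != fid for s in seqs)
    ]
    return collisions, mismatched
```

If `collisions` comes back empty, identical IDs across the two runs really do mean identical sequences, which is what the merge relies on.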

Best,
Justine

PS, I love your user name :elephant:.


(Nicholas Bokulich) #3

Hi @Ellenphant,
Just to add my 2 cents to @jwdebelius’s exquisite advice:

Make sure that the runs you are merging really do overlap entirely. By this I mean that both runs used the exact same primer set and that:

  1. Single-end reads are trimmed to the same length in both runs
  2. Paired-end reads overlap enough to successfully join at approximately the same rate in both runs (check your dada2 stats to make sure). You do not need to trim at the same length, since the reads are joined to yield full amplicons anyway, but I like to trim at the same length if I can, out of paranoia that different trimming lengths could cryptically impact joining between runs.
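
As a rough illustration of the "check your dada2 stats" step, something like the sketch below compares the overall joining rate between two runs. It assumes you have read each run's denoising stats into a list of per-sample dicts with `filtered` and `merged` read counts (I am assuming those column names here, so check them against your own stats file). The 5% tolerance is an arbitrary placeholder, not a recommended cutoff.

```python
def merge_rate(stats_rows):
    """Fraction of filtered reads that were successfully joined, pooled over samples."""
    total_filtered = sum(row["filtered"] for row in stats_rows)
    total_merged = sum(row["merged"] for row in stats_rows)
    return total_merged / total_filtered if total_filtered else 0.0


def rates_comparable(rate_a, rate_b, tolerance=0.05):
    """True if the two joining rates differ by no more than `tolerance` (arbitrary)."""
    return abs(rate_a - rate_b) <= tolerance


# Made-up numbers purely for illustration
run1_stats = [
    {"filtered": 1000, "merged": 900},
    {"filtered": 2000, "merged": 1800},
]
run2_stats = [
    {"filtered": 1500, "merged": 1300},
    {"filtered": 500, "merged": 440},
]

rate1 = merge_rate(run1_stats)
rate2 = merge_rate(run2_stats)
print(f"run1: {rate1:.2f}, run2: {rate2:.2f}, comparable: {rates_comparable(rate1, rate2)}")
```

If the two rates are far apart, I would look at the truncation settings before merging the runs.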

(Ellen) #4

Hi Justine,

Thanks so much, this worked great!

Just for future reference, when I have more separate runs to combine (sad for the workflow but safer for the budget!), does the qiime feature-table merge command work to combine more than two tables? Or would I have to run the command repeatedly until I am left with only one table?


(Nicholas Bokulich) #5

Yes! You can merge multiple tables and multiple sets of sequences in a single command. Just pass the --i-tables or --i-data input multiple times, like so:

  qiime feature-table merge \
    --i-tables run1/table.qza \
    --i-tables run2/table.qza \
    --i-tables run3/table.qza \
    --p-overlap-method sum \
    --o-merged-table giant-merged-table.qza

  qiime feature-table merge-seqs \
    --i-data run1/rep-seqs.qza \
    --i-data run2/rep-seqs.qza \
    --i-data run3/rep-seqs.qza \
    --o-merged-data giant-merged-rep-seqs.qza
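
In case it helps to see what summing on overlap means, here is a toy sketch in plain Python. It is only a conceptual illustration and not how QIIME 2 implements the merge. The tables are represented as nested dicts (sample ID to feature ID to count), and the sample and feature names are invented.

```python
from collections import defaultdict


def merge_tables_sum(*tables):
    """Merge any number of {sample: {feature: count}} tables.

    Counts are summed wherever the same sample/feature pair
    appears in more than one table.
    """
    merged = defaultdict(lambda: defaultdict(int))
    for table in tables:
        for sample, counts in table.items():
            for feature, count in counts.items():
                merged[sample][feature] += count
    # Convert back to plain dicts for easier printing and comparison
    return {sample: dict(counts) for sample, counts in merged.items()}


# Invented toy data: sample "s1" appears in both run1 and run2
run1 = {"s1": {"f1": 5, "f2": 1}}
run2 = {"s1": {"f1": 2}, "s2": {"f1": 3}}
run3 = {"s3": {"f3": 4}}

giant = merge_tables_sum(run1, run2, run3)
print(giant)
```

Note that a sample ID shared between runs gets its counts added together, so if the same ID refers to different samples in different runs you would want to rename them before merging.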