Same barcodes to draw the bar-plot

Lu_Yang · November 13, 2017, 3:13pm

Hi, I have three batches of runs, and I ran them in QIIME 2 separately.
Then I got table-1.qza, table-2.qza, table-3.qza, rep-seqs-1.qza, rep-seqs-2.qza, and rep-seqs-3.qza, and I used the code below to merge them.
But after these steps, how do I draw a bar plot of them? The barcodes are the same in each run, and drawing the final barplot needs the sample metadata. Is there a way around this?
Thanks.

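# Merge pairwise: each command takes exactly two inputs (table1/table2, data1/data2).
# table-12.qza and rep-seqs-12.qza are intermediate files.
qiime feature-table merge \
  --i-table1 table-1.qza \
  --i-table2 table-2.qza \
  --o-merged-table table-12.qza
qiime feature-table merge \
  --i-table1 table-12.qza \
  --i-table2 table-3.qza \
  --o-merged-table table.qza
qiime feature-table merge-seq-data \
  --i-data1 rep-seqs-1.qza \
  --i-data2 rep-seqs-2.qza \
  --o-merged-data rep-seqs-12.qza
qiime feature-table merge-seq-data \
  --i-data1 rep-seqs-12.qza \
  --i-data2 rep-seqs-3.qza \
  --o-merged-data rep-seqs.qza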

qiime feature-table tabulate-seqs --i-data rep-seqs.qza --o-visualization rep-seqs.qzv
qiime feature-classifier classify-sklearn --i-classifier gg-13-8-99-515-806-nb-classifier.qza --i-reads rep-seqs.qza --o-classification taxonomy.qza
qiime metadata tabulate --m-input-file taxonomy.qza --o-visualization taxonomy.qzv

Nicholas_Bokulich · November 13, 2017, 6:25pm

Hi @Lu_Yang,
Thanks for posting. Is there a particular error that you are getting when trying to generate barplots with the taxa barplot command? Perhaps I misunderstand what your question is, but barcodes and barplots are entirely unrelated. After demultiplexing samples in each individual run, barcodes do not impact any part of the downstream analysis. As long as you provide unique sample IDs for each of your samples, you can generate barplots and other analyses that separately display these samples. You can combine the metadata files from your three sequencing runs and as long as the sample IDs are unique you should not have any issue running taxa barplot. Please let me know if I have misunderstood or if you still get an error.
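
As a rough illustration of the "unique sample IDs" requirement, the sketch below lists any sample ID that appears more than once across the per-run metadata files. It assumes tab-separated files with a header on the first line and the sample ID in the first column; the function name and the file names in the usage comment are hypothetical.

```shell
# Sketch: report sample IDs that occur more than once across metadata files.
# Assumes tab-separated files, header on line 1, sample ID in column 1.
duplicate_sample_ids() {
  # FNR > 1 skips the header of every file; blank and comment rows are ignored.
  awk -F'\t' 'FNR > 1 && $1 != "" && $1 !~ /^#/ { print $1 }' "$@" | sort | uniq -d
}

# Usage (hypothetical file names):
#   duplicate_sample_ids run1-metadata.tsv run2-metadata.tsv run3-metadata.tsv
# No output means every sample ID is unique across the runs.
```

If this prints nothing, the merged table, the taxonomy, and the combined metadata file (passed with `--m-metadata-file`) should be usable together in `qiime taxa barplot`, even when barcodes repeat between runs.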

Good luck!

Lu_Yang · November 13, 2017, 7:43pm

Hi,
Thanks so much for your reply. Luckily, I got the result without an error.
I asked this question because I did not clearly understand the function of the metadata here. My original thought was that this step uses the barcodes to differentiate the samples. I was worried because some of my barcodes are duplicated.
However, in the end I generated the barplot successfully!
I have another question: is this barplot the final result? I am new to QIIME 2 and do not know which data file is the OTU picking result. Or, better to say, DADA2 does not pick OTUs; it produces the feature table. But I can only see the taxonomy in the barplot. I did not see the OTU table or the so-called DADA2 variants table. I converted the feature table into a BIOM file and imported it into R, and I got the table below. The first row is my sample names, but I do not know what the first column means. May I know that? Thanks.

Lu_Yang · November 13, 2017, 7:49pm

@Nicholas_Bokulich
Sorry to @ you in the former reply. Now to the forum.
Thanks.

Nicholas_Bokulich · November 13, 2017, 8:15pm

Hi @Lu_Yang,

The sample metadata is used for two things: first, to demultiplex samples based on their barcodes, and second to differentiate sample types in downstream analyses (e.g., statistical tests where you want to distinguish two different sample types). If you have multiple sequencing runs that re-use the same barcodes, you will want to have multiple metadata files: first, you will want one metadata file for each run so that you can demultiplex the samples matching those barcodes in each run; second, you will want to combine those metadata files into a metadata file containing ALL samples (with unique sample IDs) for analysis after sequencing runs are merged.
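
One possible way to build that combined file is sketched below. It assumes the per-run metadata files are tab-separated and share an identical header line; the function name and the file names in the usage comment are hypothetical, and sample IDs must already be unique across runs.

```shell
# Sketch: concatenate per-run metadata files into one file for merged analyses.
# Assumes every input file starts with the same header line.
combine_metadata() {
  awk '
    FNR == 1 {
      if (NR == 1) { header = $0; print }     # keep the header of the first file
      else if ($0 != header) {                # stop if the columns do not match
        print "header mismatch in " FILENAME > "/dev/stderr"
        exit 1
      }
      next                                    # drop the repeated headers
    }
    { print }                                 # copy every data row unchanged
  ' "$@"
}

# Usage (hypothetical file names):
#   combine_metadata run1-metadata.tsv run2-metadata.tsv run3-metadata.tsv > sample-metadata.tsv
```

The per-run files stay as they are for demultiplexing; only the combined file is used once the runs are merged. Barcodes may repeat between rows of the combined file, since only the sample IDs need to be unique at that stage.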

The barplot is a final result, but you can use the feature table as input to many more actions, such as alpha and beta diversity analyses, differential abundance testing with ANCOM, etc. Check out the tutorials for many examples.

The feature table is the OTU table: abundances of each feature (sequence variant) in each sample. It sounds like you may be looking for a visualizer like heatmap to view these feature abundances per sample in a figure.

The first column contains the list of feature names (i.e., the name of each sequence variant). So this table you generated is the abundance of each feature in each sample. To see the sequence associated with each feature, use tabulate-seqs.
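
To make the table layout concrete, here is a small sketch that sums the counts of each feature across all samples. It assumes the feature table has been written out as tab-separated text in the layout produced by `biom convert --to-tsv` (comment and header lines start with `#`, feature ID in the first column, one column per sample); the function name is hypothetical.

```shell
# Sketch: print "feature ID <TAB> total count across samples" for a TSV feature table.
# Assumes lines starting with "#" are comments or the header row.
feature_totals() {
  awk -F'\t' '
    /^#/ { next }                               # skip comment and header lines
    {
      total = 0
      for (i = 2; i <= NF; i++) total += $i     # columns 2..NF hold per-sample counts
      print $1 "\t" total                       # column 1 is the feature ID
    }
  ' "$1"
}

# Usage (hypothetical file name):
#   feature_totals feature-table.tsv
```

Each ID in the first column of the output should match one of the sequences listed by tabulate-seqs.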

I hope that helps answer your questions!

Lu_Yang · November 13, 2017, 8:39pm

Hi, @Nicholas_Bokulich,
Thanks so much! It really helps a lot!
So this means that for QIIME 2 with DADA2, the lowest taxonomic level is species, and there are no longer several OTUs belonging to one species. If I am more interested in strains, a lower level, and want to pick OTUs, what should I do in QIIME 2? I have used USEARCH, where UNOISE3 produces ZOTUs, similar to DADA2, and UPARSE gives me OTUs. Is there a parallel approach in QIIME 2?

Nicholas_Bokulich · November 13, 2017, 9:01pm

Several sequence variants could still classify to the same taxonomy, but this issue is largely reduced compared to OTU picking methods (as many of the "novel" OTUs are actually noisy sequences).

The sequence variants are as resolved as you can get. It is possible (but unlikely with short sequence reads) that these may distinguish unique strains, e.g., if you wanted to classify these against a strain-level sequence reference database. It all depends on the quality of your taxonomy classifier and reference sequence database and length of sequence variants.

If you want to perform OTU picking on your sequence variants after DADA2, you can use the vsearch plugin to cluster the denoised sequence variants into OTUs.
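
A minimal sketch of that clustering step is below. It assumes a QIIME 2 release whose q2-vsearch plugin provides `cluster-features-de-novo`, and that `table.qza` and `rep-seqs.qza` are the merged DADA2 outputs from earlier in this thread; the wrapper function and the output file names are hypothetical.

```shell
# Sketch: de novo clustering of denoised sequence variants into OTUs with q2-vsearch.
# The wrapper function and output names are illustrative, not part of QIIME 2.
cluster_otus_de_novo() {
  identity="${1:-0.97}"   # identity threshold, e.g. 0.97 for 97% OTUs
  qiime vsearch cluster-features-de-novo \
    --i-table table.qza \
    --i-sequences rep-seqs.qza \
    --p-perc-identity "$identity" \
    --o-clustered-table "table-dn-${identity}.qza" \
    --o-clustered-sequences "rep-seqs-dn-${identity}.qza"
}

# Usage:
#   cluster_otus_de_novo 0.97
```

The clustered table and sequences can then stand in for the original feature table and representative sequences in downstream steps such as taxonomy classification.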

Lu_Yang · November 13, 2017, 9:05pm

Hi, @Nicholas_Bokulich
Thanks for your detailed explanation. I understand it now. Thanks so much for your help!!
Many thanks!
