Using data from qiime taxa barplot .CSV

Hi there Q2 Druids:

After running the following command, I do generate a barplot of my samples with the relative frequencies:

qiime taxa barplot --i-table table.qza --i-taxonomy taxonomy.qza --m-metadata-file Metadata_HBC.tsv --o-visualization taxa-bar-plots.qzv

From here I can download the .CSV, which gives the following:
$ head taxa_level-7.csv

index T1A T1B T1C
D_0__Archaea;D_1__Asgardaeota;D_2__Heimdallarchaeia;D_3__uncultured archaeon;D_4__uncultured archaeon;D_5__uncultured archaeon;D_6__uncultured archaeon 0 1 0
D_0__Archaea;D_1__Asgardaeota;D_2__Lokiarchaeia;D_3__uncultured archaeon;D_4__uncultured archaeon;D_5__uncultured archaeon;D_6__uncultured archaeon 6 1 2
D_0__Archaea;D_1__Asgardaeota;D_2__Lokiarchaeia;D_3__uncultured crenarchaeote;D_4__uncultured crenarchaeote;D_5__uncultured crenarchaeote;D_6__uncultured crenarchaeote 1 0 0

As you can see, these include the counts for each sample at the end of each taxonomy string.

This might be a naive inquiry, but I assume these are the non-normalized counts for each sample. Since DESeq2 normalization methods are not yet available in QIIME 2, according to this thread:

I am wondering whether loading these results directly into the DESeq2 package in R would be the way to go to normalize by that method?

Many thanks!

Hi @Purrsia_Felidae,

To export your feature table into R for use with DESeq2, you want to take your actual feature table, not the .csv from the bar plots; so in your case, table.qza. Check out the handy qiime2R package, which might help with the process of getting this file into R. Keep in mind that the feature table by itself does not carry taxonomic assignments; the rows will be hashed feature IDs instead. If you want taxonomic assignments, you'll want to use the collapse function first, before exporting. Lastly, keep in mind that if you plan on using something like ANCOM or gneiss in QIIME 2, no prior normalization is needed.
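For reference, here's a minimal sketch of that collapse-then-export route (file names taken from your command above; the qiime tools export syntax assumes a recent QIIME 2 release):

# Collapse the feature table to level 7 so rows carry taxonomy strings instead of hashed IDs
qiime taxa collapse --i-table table.qza --i-taxonomy taxonomy.qza --p-level 7 --o-collapsed-table table-l7.qza

# Export the collapsed table; this writes feature-table.biom into exported-table/
qiime tools export --input-path table-l7.qza --output-path exported-table

From there, feature-table.biom is the file you'd read into R (e.g., with qiime2R).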


Fantastic! Thank you. I will check out qiime2R. Seems super handy.

I ran the collapse function command like you suggested and ended up with a .biom file, which I converted to a .txt with the QIIME 1 biom convert utility, and got this, which looks identical to the .csv file from the taxa barplot:

Constructed from biom file

#OTU ID T1A T1B T1C
D_0__Archaea;D_1__Asgardaeota;D_2__Heimdallarchaeia;D_3__uncultured archaeon;D_4__uncultured archaeon;D_5__uncultured archaeon;D_6__uncultured archaeon 0 1 0
D_0__Archaea;D_1__Asgardaeota;D_2__Lokiarchaeia;D_3__uncultured archaeon;D_4__uncultured archaeon;D_5__uncultured archaeon;D_6__uncultured archaeon 6 1 2
D_0__Archaea;D_1__Asgardaeota;D_2__Lokiarchaeia;D_3__uncultured crenarchaeote;D_4__uncultured crenarchaeote;D_5__uncultured crenarchaeote;D_6__uncultured crenarchaeote 1 0 0

Are these then the raw counts to use as input?

And according to this link: Statistical methods using ANCOM
ANCOM isn't being supported anymore, in favor of gneiss.

Also, I have been using gneiss in QIIME 2 with my data, but it keeps giving me a
"Detected zero variance balances - double check your table for unobserved features" error.

According to this link: Gneiss zero balance error
It's probably because I have a lot of singletons and doubletons in my data, so using QIIME 1, I ran the following command:
filter_otus_from_otu_table.py -i feature-table.biom -o feature-table.N3.biom -n 3 (removing features that are observed fewer than 3 times, i.e., singletons and doubletons)

Then I imported this filtered table back into QIIME 2 and re-ran the gneiss commands, only to get the same error. So I kept filtering with -n 5 and then -n 10, with each step getting rid of an exorbitant number of features. By -n 10, I only have 109 features left out of 23369 and am still getting the zero variance balances error; so I think DESeq2 normalization would be the better choice.

I just wanted to make sure I’m using the correct input.

Many thanks for your reply!! It was very helpful.

Hi @Purrsia_Felidae,

Hmm, my apologies if you had to take the long way around for this. For some reason I had it in my head that the .csv file downloaded from taxa-bar-plots.qzv would contain the relative abundances used to generate the plots, not raw counts. In any case, the .csv file would also attach your metadata columns at the end of the table, which is not what you want for a raw OTU/feature table. But perhaps you didn't have any extra columns in your metadata in the first place. Either way, it sounds like both ways work!

The biom utility actually comes with your QIIME 2 install, so you can do this conversion from within QIIME 2!
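For example, a quick sketch using your file names, run inside an activated QIIME 2 environment:

# Convert the exported BIOM table to tab-separated text with the biom tool bundled with QIIME 2
biom convert -i feature-table.biom -o feature-table.txt --to-tsv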

I'm not sure about the future of ANCOM to be honest, but there is an ANCOM2, which is meant to make its way to QIIME 2 at some point; see this thread for a link and more detail. ANCOM and gneiss use a similar approach to dealing with compositionality, but gneiss' use of balance trees is a bit different from ANCOM's. I'm no expert on this matter, but I think there's room for both analyses since they do differ a bit fundamentally.

You can actually do all sorts of filtering right in QIIME 2, so you don't need to keep switching between QIIME 1 and 2. See this filtering tutorial for details on how.
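For instance, here's a sketch of the QIIME 2 counterpart of your filter_otus_from_otu_table.py call (file names assumed to match your artifacts):

# Keep only features with a total count of at least 3, i.e., drop singletons and doubletons
qiime feature-table filter-features --i-table table.qza --p-min-frequency 3 --o-filtered-table table-min3.qza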

As for your error with gneiss, I think your approach of filtering low-abundance features is right, as this removes a lot of the noise that can mess up ANCOM/gneiss. If the overwhelming majority of your features are present fewer than 10 times, it might be worth double-checking that these are true features and not a product of improper denoising, for example if you forgot to remove your primers/barcodes before denoising. You could try BLASTing some of these features from your rep-seqs to see if they are indeed showing up as true taxa.
Another point to mention: in your little snippet of data I see only 3 samples. Are there only 3 samples in your data? If so, then I imagine this might be to blame as well; you can't really do proper stats with n=3.

DESeq2 normalization wouldn't help with the error you are receiving, nor with the approach you are using to filter. DESeq2 normalization deals with uneven sampling depth (instead of, let's say, rarefying), but you would still want to filter low-abundance features. ANCOM/gneiss use relative abundance data, so that kind of normalization is not necessary.

Look into those points I mentioned, and if you are still having problems with gneiss, please start a new thread and provide the exact commands you are using along with your data (if you can share it), and we will dive into that in more detail there.
I'm also pinging @mortonjt, the creator of gneiss, in case there's anything else I missed here.

P.S. Thanks for looking up and linking other discussions on the forum. Very helpful!


This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.