Dada2 back to qiime 2

Biancabrown · September 26, 2017, 5:30pm

Hi All,

Due to slow running time, I ran dada2 independently using R. However, I have an issue: How do I create a phylogenetic tree in Qiime2?

Procedure:

I saved the output of the dada2 taxa file, transposed it, and created a taxa table:

taxat <- t(taxa)  # transpose the table
taxat <- cbind('#OTUID' = rownames(taxat), taxat)  # add '#OTUID' to the header (required by biom)
write.table(taxat, "dada2_taxa.txt", sep='\t', row.names=FALSE, quote=FALSE)  # write the transposed table

I converted the dada2_taxa.txt into a biom table and imported it into qiime2.

Currently stuck at this point. I want to do beta diversity analyses, but I have no tree.

Please help.

thermokarst · September 28, 2017, 2:18pm

Hi @Biancabrown, sorry it took us longer than usual to reply!

We have been actively researching the dada2 slowness when installed in a conda environment, but unfortunately we don't have an ETA on when these changes will get rolled into QIIME 2. In the meantime, it does seem like the approach you proposed here could work just fine!

Looking at your code snippet, now that you have your feature table, you can use that to produce your representative sequences. We have some Python code in q2-dada2 that does exactly this: it reads in the BIOM table, does some ID cleanup in the table, then extracts the representative sequences. I am not an R aficionado, so I don't think I can be of much help when it comes to writing comparable R code, but hopefully this can get you moving in the right direction. Please keep us posted if you get stuck! Thanks!
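
To give a flavor of that idea, here is a minimal Python sketch. It is not the actual q2-dada2 code: the function name `table_to_fasta` is hypothetical, and it assumes the table has been exported as tab-separated text with one row per feature and the DNA sequence itself in the first column.

```python
import csv
import io

# Characters accepted in a feature ID that is itself a DNA sequence.
DNA = set("ACGTN")


def table_to_fasta(table_tsv, fasta_out):
    """Write one FASTA record per feature row and return the number written.

    Assumes a tab-separated table whose first column is the DNA sequence
    used as the feature ID. Header and comment rows are skipped because
    their first cell is not a plain DNA string.
    """
    written = 0
    for row in csv.reader(table_tsv, delimiter="\t"):
        seq = row[0].strip().upper() if row else ""
        if not seq or not set(seq) <= DNA:
            continue
        # The sequence doubles as its own record ID.
        fasta_out.write(f">{seq}\n{seq}\n")
        written += 1
    return written


if __name__ == "__main__":
    demo = io.StringIO("#OTU Table\tsampleA\tsampleB\nACGT\t10\t0\nGGCC\t3\t7\n")
    out = io.StringIO()
    print(table_to_fasta(demo, out), "sequences written")
    print(out.getvalue(), end="")
```

With real data you would pass open file handles instead of the in-memory demo strings.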

Biancabrown · October 4, 2017, 6:39pm

Hi,

Thanks for the quick response. Apologies, I'm not familiar with Python. Should I just save the code you sent me as a Python script and then run it?

For example, if I save it as q2-dada2.py, can I run the following?
q2-dada2.py -i table.biom -o rep.table

ebolyen · October 5, 2017, 7:50pm

Hey all!

I did a bit of research on this, and it turns out there's an easier way!

I'm going to use the DADA2 1.4 Tutorial as a reference. To do things like beta diversity analysis, we're going to need the sequence table (we call it a feature table) and the sequences. In the tutorial, both can be made from seqtab.nochim, which I will reference in a few places.

I'm also going to imagine we have a directory dada2-analysis/ which I'll use in place of an actual filepath.

In your R session, you'll want to run the following (replacing the particular filepaths):

write.table(t(seqtab.nochim), "dada2-analysis/seqtab-nochim.txt", sep="\t", row.names=TRUE, col.names=NA, quote=FALSE)

uniquesToFasta(seqtab.nochim, fout='dada2-analysis/rep-seqs.fna', ids=colnames(seqtab.nochim))

We can import that fasta file easily with:

qiime tools import \
  --input-path dada2-analysis/rep-seqs.fna \
  --type 'FeatureData[Sequence]' \
  --output-path rep-seqs.qza

For the feature-table, there are two steps we have to do first:

Add a special header for BIOM:

echo -n "#OTU Table" | cat - dada2-analysis/seqtab-nochim.txt > dada2-analysis/biom-table.txt

Convert to BIOM v2.1:

biom convert -i dada2-analysis/biom-table.txt -o dada2-analysis/table.biom --table-type="OTU table" --to-hdf5

Now we can import that as well:

qiime tools import \
  --input-path dada2-analysis/table.biom \
  --type 'FeatureTable[Frequency]' \
  --source-format BIOMV210Format \
  --output-path table.qza

This should leave you with a rep-seqs.qza and table.qza that you can use (following along with the moving pictures tutorial).
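
If something looks off later, one quick check is whether the table and the FASTA file agree on their feature IDs. The following is a minimal Python sketch, not part of QIIME 2 or DADA2. The function names are hypothetical, and it assumes the layout produced by the commands above: a header line followed by one row per sequence in the table, and FASTA record IDs that are the sequences themselves.

```python
def table_ids(lines):
    """Return the row IDs of a tab-separated table, skipping the header line."""
    rows = [ln.rstrip("\n") for ln in lines if ln.strip()]
    return {ln.split("\t", 1)[0] for ln in rows[1:]}


def fasta_ids(lines):
    """Return the record IDs (the text after '>') of a FASTA file."""
    return {ln[1:].strip() for ln in lines if ln.startswith(">")}


def compare_ids(table_lines, fasta_lines):
    """Return (only_in_table, only_in_fasta); two empty sets mean the files agree."""
    t, f = table_ids(table_lines), fasta_ids(fasta_lines)
    return t - f, f - t


if __name__ == "__main__":
    # Tiny in-memory stand-ins for biom-table.txt and rep-seqs.fna.
    table = ["#OTU Table\tS1\tS2\n", "ACGT\t4\t0\n", "GGCC\t1\t9\n"]
    fasta = [">ACGT\n", "ACGT\n", ">GGCA\n", "GGCA\n"]
    only_table, only_fasta = compare_ids(table, fasta)
    print("only in table:", sorted(only_table))
    print("only in FASTA:", sorted(only_fasta))
```

For real files, pass the open file objects for dada2-analysis/biom-table.txt and dada2-analysis/rep-seqs.fna in place of the lists.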

ebolyen · October 5, 2017, 8:06pm

Also for reference:
The above steps will have feature IDs which are the same as your sequences.
This is the same situation as if you had run dada2 denoise-* with the --p-no-hashed-feature-ids flag.
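
For comparison, here is a small Python sketch of what a hashed feature ID looks like. It assumes the hashed ID is the MD5 hex digest of the sequence, which is how q2-dada2's default IDs are usually described; the helper name `hashed_feature_id` is hypothetical.

```python
import hashlib


def hashed_feature_id(sequence):
    """Return a fixed-length ID for a sequence (assumed scheme: MD5 hex digest)."""
    return hashlib.md5(sequence.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    seqs = ["ACGTACGT", "GGCCGGCC"]
    # Map each sequence to its hashed ID.
    id_map = {seq: hashed_feature_id(seq) for seq in seqs}
    for seq, fid in id_map.items():
        print(fid, seq)
```

If you rename features this way, apply the same mapping to both the table and the FASTA file so their IDs stay in sync.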

Biancabrown · October 9, 2017, 2:00pm

This was really helpful. However, I have a follow-up question. I was able to obtain both the biom and rep-seqs files in the qiime2 format, but I keep getting an error when I try to make the tree.

Sequence of events:

qiime alignment mafft --i-sequences rep-seqs.qza --o-alignment aligned-rep-seqs.qza

qiime alignment mask --i-alignment aligned-rep-seqs.qza --o-masked-alignment masked-aligned-rep-seqs.qza

qiime phylogeny fasttree --i-alignment masked-aligned-rep-seqs.qza --o-tree unrooted-tree.qza

When I get to the final step I get the following error message:

Command '['FastTree', '-quote', '-nt', '/tmp/qiime2-archive-4dezq9v8/00d76262-d31c-4a55-b6f8-471182bbd1e4/data/aligned-dna-sequences.fasta']' returned non-zero exit status 1

jairideout · October 9, 2017, 10:59pm

Hi @Biancabrown! Which release of QIIME 2 do you have installed? You can find that info by running qiime info. My guess is that you're running into a bug with MAFFT that we fixed in the 2017.8 release:

Biancabrown · October 10, 2017, 12:45pm

Hello,

I downloaded the latest version to my laptop and ran it again, but I'm still getting a similar message. Is there a way to streamline the process of going from dada2 (R version) to Qiime2? I think the problem is that something is going wrong with my conversions.

ebolyen · October 10, 2017, 7:18pm

I'm afraid q2-dada2 is really the only well-supported way of doing this at the moment.

It's possible that something is going wrong with the conversions. Would you be able to send me your imported artifacts in a direct message on the forum? I'd like to double-check the MAFFT situation to make sure there isn't something going on that we don't know about.

Thanks!

ebolyen · December 22, 2017, 5:52pm

QIIME 2 2017.12 has been released and uses DADA2 1.6, which has explicit SSE vectorization for much better performance!