Merging seqs.fna from multiple projects

wasade · June 13, 2018, 5:38pm

That sounds right, except that you may want to include assigning taxonomy as well. The outputs from Qiita do not automatically have taxonomy associated, but running the merged reference-hit.seqs.fa data through qiime feature-classifier classify-sklearn should do the trick.

Best,
Daniel

emescioglu · June 13, 2018, 5:40pm

Great! Last question: does Qiita automatically merge and provide the reference-hit.seqs.fa file when I merge deblurred files, or do I do this myself? Thanks!

emescioglu · June 14, 2018, 6:58pm

reference-hit.seqs.fa is not provided. However, this file just contains the distinct sequences in my biom table, right? So I could just generate a new file in which each of my sequences is preceded by a header line beginning with ">" and use that?

wasade · June 15, 2018, 12:09am

Ya, that would work! One way to do this:


    $ biom table-ids -i <your-table.biom> --observations | awk '{ print ">" $1 "\n" $1}' > output_sequences.fa
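
For anyone who wants to try the awk step without a biom table at hand, here is a minimal sketch. It assumes the observation IDs of a Deblur table are the sequences themselves, which is why each ID is written twice. The file observation_ids.txt and the three short sequences in it are made up for illustration; they stand in for whatever `biom table-ids --observations` prints for a real table.

```shell
#!/bin/sh
# Hypothetical stand-in for the output of `biom table-ids --observations`:
# one Deblur sequence (which doubles as the feature ID) per line.
printf 'TACGGAGG\nTACGTAGG\nGACGGAGG\n' > observation_ids.txt

# Write each ID twice: once as the FASTA header, once as the sequence.
# The NF guard skips any blank lines in the input.
awk 'NF { print ">" $1 "\n" $1 }' observation_ids.txt > output_sequences.fa

# Sanity check: the number of FASTA records should equal the number of IDs.
n_ids=$(grep -c . observation_ids.txt)
n_records=$(grep -c '^>' output_sequences.fa)
echo "ids=$n_ids records=$n_records"
```

If the two counts differ, the ID list probably contains something other than one sequence per line, and it is worth checking before importing the FASTA file.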

Best,
Daniel

emescioglu · June 15, 2018, 6:58pm

Everything worked! You guys rock @ebolyen @wasade

To summarize for others interested in SourceTracker, I:

1. Downloaded the Deblur output file called reference-hit.biom from each of the studies I was interested in to my computer, instead of merging them all via Qiita and downloading the result. I think the merged file from Qiita was too big for my computer to process.

2. Converted the biom files to txt format, took a subset of the samples I was interested in (there were thousands of samples in some of the studies), and then converted these back to biom format. I did it this way because I didn't know how to edit biom files directly.

        $ biom convert -i HumanSkinSamples.txt -o HumanSkinSamples.biom --table-type="OTU table" --to-hdf5

3. Merged all the biom tables together using QIIME 1's merge_otu_tables.py.

        $ merge_otu_tables.py -i HumanSkinSamples.biom,HumanFecalSamples.biom -o merged.biom

4. Imported the merged biom table as a QIIME 2 artifact.

        $ qiime tools import --input-path merged.biom --type 'FeatureTable[Frequency]' --source-format BIOMV210Format --output-path merged-feature-table.qza

5. Created a reference-hit.biom.fa file using Excel tricks (I saw your code after I had already done this) and imported it as a QIIME 2 artifact.

        $ qiime tools import --input-path rep-seqs-merged.fa --output-path rep-seqs-merged.qza --type 'FeatureData[Sequence]'

6. Assigned taxonomy.

        $ qiime feature-classifier classify-sklearn \
            --i-classifier 16S_classifier.qza \
            --i-reads rep-seqs-merged.qza \
            --o-classification taxonomy.qza

7. Made taxa barplots and downloaded the output as a CSV file through QIIME 2 View at the desired level (Species, Genus, etc.).

        $ qiime taxa barplot \
            --i-table merged-feature-table.qza \
            --i-taxonomy taxonomy.qza \
            --m-metadata-file MetadataFile.tsv \
            --o-visualization taxa-bar-plots.qzv

8. Formatted the table to use as input to SourceTracker.
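
The sample-subsetting step in the summary above (convert to text, keep only some samples, convert back) can also be done without a spreadsheet. The following is a rough sketch using awk, assuming the text table has the tab-separated layout with a `#OTU ID` header row. The file names table.txt, keep_samples.txt and subset.txt, as well as the sample names and counts, are made up for illustration.

```shell
#!/bin/sh
# Hypothetical example table in a tab-separated "#OTU ID" layout;
# sample names and counts are invented for this sketch.
printf '# Constructed from biom file\n#OTU ID\tskin1\tgut1\tskin2\nTACGGAGG\t5\t0\t2\nTACGTAGG\t0\t7\t1\n' > table.txt

# Samples to keep, one name per line (hypothetical names).
printf 'skin1\nskin2\n' > keep_samples.txt

awk -F'\t' -v OFS='\t' '
    # First file: load the sample names to keep.
    NR == FNR { keep[$1] = 1; next }
    # Pass comment lines other than the header through unchanged.
    /^#/ && !/^#OTU ID/ { print; next }
    # Header row: remember which column numbers belong to kept samples.
    /^#OTU ID/ {
        n = 0
        for (i = 2; i <= NF; i++) if ($i in keep) cols[++n] = i
    }
    # Header and data rows: print the ID column plus the kept columns.
    {
        line = $1
        for (j = 1; j <= n; j++) line = line OFS $(cols[j])
        print line
    }
' keep_samples.txt table.txt > subset.txt

cat subset.txt
```

The resulting subset.txt can then be converted back to biom format with the same `biom convert` command shown in the summary. Note that this sketch does not drop sequences whose counts become all zero after subsetting.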

Thank you @wasade, @ebolyen, and the rest of the QIIME team!!

wasade · June 15, 2018, 7:05pm

Great!!! And thank you so much for sharing the commands you used!

Best,
Daniel
