Merging seqs.fna from multiple projects

Hi @emescioglu,

That sounds right, except that you may want to include assigning taxonomy as well :wink: . The outputs from Qiita do not automatically have taxonomy associated, but running the merged reference-hit.seqs.fa data through qiime feature-classifier classify-sklearn should do the trick.

Best,
Daniel

Great! Last question: does Qiita automatically merge and provide the reference-hit.seqs.fa when I merge deblurred files, or do I do this myself? Thanks!

reference-hit.seqs.fa is not provided. However, this file just contains the different sequences in my biom table, right? So I could just generate a new file that has all of my sequences, each preceded by a header line beginning with ">", and use this? :slight_smile:

Ya, that would work! One way to do this:

$ biom table-ids -i <your-table.biom> --observations | awk '{ print ">" $1 "\n" $1 }' > output_sequences.fa
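
If you want to check the awk part on its own before pointing it at a real table, here is a minimal sketch. It assumes the feature IDs are the sequences themselves (as with Deblur output) and arrive one per line; `ids_to_fasta` is a made-up helper name, not part of biom or QIIME, and the two IDs are toy values.

```shell
# Hypothetical helper: read feature IDs (one per line) from stdin and write
# FASTA records whose header and sequence are the same string.
# Assumption: the IDs are the sequences themselves, as in Deblur tables.
ids_to_fasta() {
    # Skip blank lines and any comment lines starting with '#'.
    awk 'NF && $1 !~ /^#/ { print ">" $1 "\n" $1 }'
}

# Toy input instead of a real table:
printf 'ACGT\nTTGA\n' | ids_to_fasta
```

With a real table, the output of `biom table-ids ... --observations` would be piped into the function in place of the `printf`.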

Best,
Daniel

Everything worked! You guys rock @ebolyen @wasade

To summarize for others interested in SourceTracker, I

  1. downloaded the Deblur output files called reference-hit.biom from each of the studies I was interested in to my computer, instead of merging them all via Qiita and downloading the result. I think the merged file from Qiita was too big for my computer to process

  2. converted the biom files to txt format, took a subset of the samples I was interested in (there were thousands of samples in some of the studies), and then converted these back to biom format. I did it this way because I didn't know how to edit biom files directly
    $ biom convert -i HumanSkinSamples.txt -o HumanSkinSamples.biom --table-type="OTU table" --to-hdf5

  3. merged all the biom tables together using qiime1 merge_otu_tables.py
    $ merge_otu_tables.py -i HumanSkinSamples.biom,HumanFecalSamples.biom -o merged.biom

  4. imported the merged biom table as a qiime artifact
    $ qiime tools import --input-path merged.biom --type 'FeatureTable[Frequency]' --source-format BIOMV210Format --output-path merged-feature-table.qza

  5. created a reference-hit.biom.fa file using Excel tricks (saw your code after I already did this) and imported it as a qiime artifact
    $ qiime tools import --input-path rep-seqs-merged.fa --output-path rep-seqs-merged.qza --type 'FeatureData[Sequence]'

  6. assigned taxonomy
    $ qiime feature-classifier classify-sklearn \
        --i-classifier 16S_classifier.qza \
        --i-reads rep-seqs-merged.qza \
        --o-classification taxonomy.qza

  7. made taxa barplots and downloaded the output as a CSV file through Qiime View at the desired level (Species, Genus, etc.)
    $ qiime taxa barplot \
        --i-table merged-feature-table.qza \
        --i-taxonomy taxonomy.qza \
        --m-metadata-file MetadataFile.tsv \
        --o-visualization taxa-bar-plots.qzv

  8. Formatted table to use as input to SourceTracker :smiley:
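
For anyone who would rather not do the subsetting in step 2 by hand, here is a rough sketch of one way to keep only chosen sample columns in the text version of the table. It assumes the txt file is tab-separated with a `#OTU ID` header line (optionally preceded by a `# Constructed from biom file` comment), which is what I would expect from `biom convert`; `subset_samples` is a made-up function name, and the keep-list is a plain file with one sample ID per line.

```shell
# Hypothetical helper: keep only selected sample columns of a tab-separated
# feature table.
#   $1: file listing the sample IDs to keep, one per line
#   $2: tab-separated table whose header line starts with "#OTU ID" (assumed)
subset_samples() {
    awk -F'\t' -v OFS='\t' '
        # First file: remember the sample IDs to keep.
        NR == FNR { if ($1 != "") keep[$1] = 1; next }
        # Pass the leading comment line through unchanged.
        /^# Constructed/ { print; next }
        # Header line: record which column numbers belong to wanted samples.
        /^#OTU ID/ {
            n = 0
            for (i = 2; i <= NF; i++) if (($i) in keep) cols[++n] = i
        }
        # Header and data lines: print the feature ID plus the wanted columns.
        {
            line = $1
            for (j = 1; j <= n; j++) line = line OFS $(cols[j])
            print line
        }
    ' "$1" "$2"
}
```

It could be called as `subset_samples keep_ids.txt HumanSkinSamples_full.txt > HumanSkinSamples.txt` (both input file names are placeholders), and the result converted back with the `biom convert` command from step 2.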

Thank you @wasade, @ebolyen, and the rest of the Qiime team!!

Great!!! And thank you so much for sharing the commands you used!

Best,
Daniel
