The goal of my study is to determine which vertebrate or mussel species are present in various eDNA samples. We extracted DNA and sequenced four genes separately, and are analyzing the results in QIIME 2. For 48 samples, each sequenced at one gene, I ran the data through the following steps: demultiplexing -> DADA2 denoising (after adjusting the parameters so that I did not lose all the data!) -> downloading reference data from NCBI (bacteria, fungi, and either vertebrates or mussels, depending on the gene) -> filtering the reference data by length and with my primers -> fitting a Bayesian classifier to the reference sequences -> and FINALLY running the classifier on my real data.
In case it wasn't clear, the output of the denoising step is a rep-seqs.qza that contains sequences of one gene from 48 different environmental localities.
The results are attached. How can I figure out which sample these feature IDs came from? I assume each feature ID is one (or a few) read(s), but I really need to know which sample those reads came from so I can determine where that species might have been detected.
If it helps, here is the code I ran:
qiime tools import \
  --type 'SampleData[SequencesWithQuality]' \
  --input-path manifest-file.txt \
  --input-format SingleEndFastqManifestPhred33V2 \
  --output-path demux-co1-r1.qza
qiime dada2 denoise-single \
  --i-demultiplexed-seqs demux-co1-r2.qza \
  --p-trim-left 20 \
  --p-trunc-len 115 \
  --o-representative-sequences rep-seqs-r2-115.qza \
  --o-table table-r2-115.qza \
  --o-denoising-stats denoise-stats-r2-115.qza
qiime rescript evaluate-fit-classifier \
  --i-sequences coi-mussels-filtered-seqs.qza \
  --i-taxonomy coi-mussels-taxonomy-unfiltered.qza \
  --o-classifier coi-mussels-classifier.qza \
  --o-evaluation coi-mussels-classifier-evaluation.qzv \
  --o-observed-taxonomy coi-mussels-classifier-predicted-taxonomy.qza \
  --verbose
qiime feature-classifier classify-sklearn \
  --i-classifier ../coi-mussels-classifier.qza \
  --i-reads rep-seqs-r1-115.qza \
  --o-classification coi-r1-mussels-classified-taxonomy.qza
qiime metadata tabulate \
  --m-input-file coi-r1-mussels-classified-taxonomy.qza \
  --o-visualization coi-r1-mussels-classified-taxonomy.qzv
Is there a way to search individual samples (e.g. fastq files) to see whether they contain a given feature ID? Or do I have to run each of the above commands on one sample (i.e. one fastq file) at a time to get a taxonomic identification for that sample?
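To make the question concrete, here is a minimal sketch of the lookup I am after. It assumes the feature table written by DADA2 (the --o-table output above) holds per-sample counts for each feature ID, and that both the table and the taxonomy can be turned into tab-separated text. The toy tables, sample names, feature IDs, taxon labels, and the helper name taxa_by_sample are all made up for illustration and are not real output:

```python
import csv
import io

# Hypothetical stand-in for a feature table exported to TSV.
# All sample names, feature IDs, and counts are invented.
FEATURE_TABLE_TSV = """\
# Constructed from biom file
#OTU ID\tsite01\tsite02\tsite03
featA\t12.0\t0.0\t3.0
featB\t0.0\t7.0\t0.0
"""

# Hypothetical stand-in for an exported taxonomy table (invented labels).
TAXONOMY_TSV = """\
Feature ID\tTaxon\tConfidence
featA\tExample mussel A\t0.98
featB\tUnassigned\t0.41
"""


def taxa_by_sample(table_tsv, taxonomy_tsv):
    """Return {sample: [(feature_id, taxon, count), ...]} for counts > 0."""
    # Map each feature ID to its assigned taxon.
    taxonomy = {
        row["Feature ID"]: row["Taxon"]
        for row in csv.DictReader(io.StringIO(taxonomy_tsv), delimiter="\t")
    }

    # Drop free-text comment lines ("# ..."), keep the "#OTU ID" header row.
    lines = [ln for ln in table_tsv.splitlines() if ln and not ln.startswith("# ")]
    reader = csv.reader(lines, delimiter="\t")
    samples = next(reader)[1:]  # first column is the feature ID

    result = {}
    for row in reader:
        feature_id, counts = row[0], row[1:]
        for sample, count in zip(samples, counts):
            n = int(float(count))  # counts may be written as floats
            if n > 0:
                taxon = taxonomy.get(feature_id, "unknown")
                result.setdefault(sample, []).append((feature_id, taxon, n))
    return result


hits = taxa_by_sample(FEATURE_TABLE_TSV, TAXONOMY_TSV)
for sample in sorted(hits):
    for feature_id, taxon, count in hits[sample]:
        print(sample, feature_id, taxon, count)
```

I believe the real inputs could come from running qiime tools export on the table and taxonomy artifacts and then biom convert with --to-tsv on the exported table, but the exact file layout is an assumption on my part.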
Thanks for your help!