Hello everyone,
I have a problem with my ITS data. When I analyzed it, more than 50% of the sequences came out unassigned, and I don't know where the problem is. This is my current code; I have changed it many times while looking for a solution, without success.
qiime tools import \
  --type 'SampleData[SequencesWithQuality]' \
  --input-path se-33-manifest-mine-H \
  --output-path single-end-demux-H.qza \
  --input-format SingleEndFastqManifestPhred33

qiime quality-filter q-score \
  --i-demux single-end-demux-H.qza \
  --p-min-quality 19 \
  --o-filtered-sequences demux-filtered-trim.qza \
  --o-filter-stats demux-filter-stats-trim.qza
I think the main problem is in this step: when I concatenate (cat) the forward and reverse reads, I can only specify one primer...
Adapters: Forward GTGAATCATCGAATCTTTGAA, Reverse TCCTCCGCTTATTGATATGC
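A quick way to check how many reads actually begin with a given primer is to scan the FASTQ file directly. The sketch below is only an illustration: `count_primer_starts` is a hypothetical helper, and it assumes an uncompressed FASTQ with four lines per record and exact matching (comparable to an error rate of 0).

```shell
# Hypothetical helper: count reads whose sequence starts with an exact primer.
# $1 = primer sequence, $2 = uncompressed FASTQ file (4 lines per record).
count_primer_starts() {
  awk -v primer="$1" '
    NR % 4 == 2 {                          # line 2 of each record is the sequence
      total++
      if (index($0, primer) == 1) hits++   # primer found at position 1
    }
    END { printf "%d/%d\n", hits, total }  # reads starting with primer / all reads
  ' "$2"
}
```

For example, `count_primer_starts GTGAATCATCGAATCTTTGAA reads.fastq` (where `reads.fastq` is a placeholder name) prints the count for the forward primer. Running it again with the reverse primer shows how many of the concatenated reads begin with that one instead.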
qiime cutadapt trim-single \
  --i-demultiplexed-sequences demux-filtered-trim.qza \
  --p-front GTGAATCATCGAATCTTTGAA \
  --p-minimum-length 250 \
  --p-error-rate 0 \
  --o-trimmed-sequences noadapters-all.qza
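If a forward read extends through the whole amplicon, the reverse primer would appear at its 3' end as a reverse complement rather than in the orientation listed above. A minimal sketch of computing that reverse complement with standard `awk` and `tr` follows; `revcomp` is a hypothetical helper name, and it assumes the primer contains only A, C, G and T (no degenerate bases).

```shell
# Hypothetical helper: reverse-complement a DNA sequence.
# Assumes only A/C/G/T (upper or lower case); degenerate bases are not handled.
revcomp() {
  printf '%s\n' "$1" |
    awk '{ for (i = length($0); i > 0; i--) printf "%s", substr($0, i, 1); print "" }' |
    tr 'ACGTacgt' 'TGCAtgca'
}

REV_PRIMER="TCCTCCGCTTATTGATATGC"
REV_PRIMER_RC="$(revcomp "$REV_PRIMER")"
echo "$REV_PRIMER_RC"   # prints GCATATCAATAAGCGGAGGA
```

The resulting sequence is what a 3' adapter option such as `--p-adapter` of `qiime cutadapt trim-single` would need. Whether that step is required here depends on the read length relative to the amplicon length.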
qiime dada2 denoise-single \
  --i-demultiplexed-seqs noadapters-all.qza \
  --p-trunc-len 0 \
  --p-max-ee 2 \
  --p-n-threads 0 \
  --o-table table-demux-trim.qza \
  --o-representative-sequences rep-seq-trim.qza \
  --o-denoising-stats denoising-demux-ytom.qza \
  --p-chimera-method pooled \
  --verbose
I know that with closed-reference clustering I would not have the problem with unassigned sequences, but I need to understand why it happens.
qiime vsearch cluster-features-open-reference \
  --i-sequences rep-seq.qza \
  --i-table table-demux.qza \
  --i-reference-sequences heimer-seq.qza \
  --p-perc-identity 0.97 \
  --o-clustered-table table-open.qza \
  --o-clustered-sequences rep-open.qza \
  --o-new-reference-sequences new-open.qza \
  --verbose

qiime feature-classifier classify-consensus-blast \
  --i-query rep-open.qza \
  --i-reference-reads heimer-seq.qza \
  --i-reference-taxonomy heimer-taxonomy.qza \
  --p-perc-identity 0.97 \
  --o-classification taxonomy-full-open-trim.qza \
  --verbose
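To put a number on the unassigned fraction at the feature level, the classification could be exported with `qiime tools export` and counted from the resulting table. The sketch below rests on assumptions: `count_unassigned` is a hypothetical helper, the exported file is taken to be tab-separated with one header row and the taxon label in the second column, and unclassified features are taken to be labelled exactly `Unassigned`.

```shell
# Hypothetical helper: count features labelled "Unassigned" in an exported taxonomy table.
# Assumes a tab-separated file with one header row and the taxon in column 2.
count_unassigned() {
  awk -F '\t' '
    NR == 1 { next }                    # skip the header row
    { total++ }
    $2 == "Unassigned" { unassigned++ }
    END { printf "%d of %d features unassigned\n", unassigned, total }
  ' "$1"
}
```

This counts features, not reads. The proportion shown in a bar plot is weighted by read abundance and can therefore differ from this figure.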
qiime taxa barplot \
  --i-table table-open.qza \
  --i-taxonomy taxonomy-full-open.qza \
  --m-metadata-file metadata.tab \
  --o-visualization taxa-bar-plots.qzv
Some file names may not match those of the previous step... I have tried so many variations that it is hard to keep track, but I do correct the names before running each command.
I have tried paired-end reads, forward reads only, and concatenated (cat) forward and reverse reads (I did this with QIIME 1 and had no problem). I have also tried trimming and not trimming the adapters, using and not using the quality filter, and many other things. Maybe the problem is that I am using so many plugins...
I also have a question. Suppose I have in fact done everything correctly (I doubt it) and still have many unassigned sequences. For calculating biodiversity this is not a problem, because I use all the sequences. But I keep thinking that if the taxonomy does not reflect the biodiversity, the interpretation of the data will be wrong, won't it?
Sorry for the very long post. I have read a lot about this topic (in this forum and elsewhere), but I am not making progress...
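One way to catch file name mismatches before a step runs is to check that every input artifact exists. The following is a minimal sketch; `require_files` is a hypothetical helper, not part of QIIME 2.

```shell
# Hypothetical helper: report every listed file that does not exist.
# Returns 0 if all files are present, 1 otherwise.
require_files() {
  missing=0
  for f in "$@"; do
    if [ ! -f "$f" ]; then
      echo "missing input: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example use before the clustering step (commented out so the sketch has no side effects):
# require_files rep-seq.qza table-demux.qza heimer-seq.qza || exit 1
```

Calling it with the exact names passed to the `--i-...` options of a command makes a mismatch with the previous step's `--o-...` names visible immediately.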