Hello everyone,
I am trying to do a metagenomic analysis of fish samples using qiime2-2022.2.
The analysis was already done by the sequencing company, and I wanted to re-analyze the raw data myself using qiime2.
I have two types of sample (egg and alevin), and for each sample we have paired-end FASTQ files with the barcode included in the header.
So I imported the data using a manifest file as follows:
If the company processed the data differently, say using different settings for dada2 denoise-paired, we would expect the resulting counts to change.
Did they also use Qiime2 to process these files? If so, we can look at all the settings they used and compare these to your settings to see what changed. If they didn't use Qiime2, did they provide details about their analysis workflow?
Thanks a lot for your answer.
The company didn't provide the detailed settings for their analysis; they provided only this:
So I can't know which settings they used.
For trunc-len-f and trunc-len-r, do you think these are the correct values?
The difference may be due to the denoising step: they could have chosen other positions to truncate at. Why did you choose 290/291?
Try the same with other values, e.g. 280/240.
And I have a question: does your raw data look like this?
Well, that's a start! Those are available as Qiime2 plugins, like you have discovered, so it should be possible to replicate their analysis.
Could you reach out to the company asking for more details? May I ask which company processed your data?
Looks like most of your reads do not pass the first quality filter, unlike in their analysis. This is because --p-max-ee-f and --p-max-ee-r are set to 2, and truncating at 290 keeps the low-quality tail on many reads, so those reads exceed the Expected Error threshold and are discarded.
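To make the Expected Error filter a bit more concrete, here is a rough Python sketch of the arithmetic, not DADA2's actual code. It assumes the usual definition, where a read's expected errors are the sum of 10^(-Q/10) over its quality scores after truncation. The function names and the example read are made up for illustration; real quality profiles will look different.

```python
def expected_errors(quals):
    """Sum of per-base error probabilities, 10^(-Q/10), over a read."""
    return sum(10 ** (-q / 10) for q in quals)


def passes_max_ee(quals, trunc_len, max_ee=2.0):
    """Mimic the filter: truncate to trunc_len, then compare EE to max_ee.

    Reads shorter than trunc_len are discarded, as DADA2 does.
    """
    if len(quals) < trunc_len:
        return False
    return expected_errors(quals[:trunc_len]) <= max_ee


# Hypothetical 300 bp read: 200 good bases (Q35), then a poor tail (Q15).
read = [35] * 200 + [15] * 100

for trunc_len in (290, 250):
    ee = expected_errors(read[:trunc_len])
    print(trunc_len, round(ee, 2), passes_max_ee(read, trunc_len))
```

With this made-up read, truncating at 290 keeps 90 tail bases and pushes the expected errors above 2, so the read is dropped; truncating at 250 keeps it under the threshold.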
Try truncating shorter:
--p-trunc-len-f 250
--p-trunc-len-r 200
The best option is probably to truncate as short as possible (to remove errors near the ends of the reads) while keeping the reads long enough that the forward and reverse reads can still overlap and join.
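A quick way to sanity-check a pair of truncation lengths is to compute how much overlap they leave. The sketch below is only back-of-the-envelope arithmetic: the amplicon length of 420 is a hypothetical placeholder (the thread does not say which region was sequenced), and 12 is DADA2's default minimum overlap for merging. Amplicon lengths vary between taxa, so leave some margin.

```python
# Hypothetical value: replace with the real amplicon length for your primers.
ASSUMED_AMPLICON_LEN = 420


def overlap_after_truncation(trunc_len_f, trunc_len_r, amplicon_len):
    """Number of bases covered by both reads after truncation."""
    return trunc_len_f + trunc_len_r - amplicon_len


def can_merge(trunc_len_f, trunc_len_r, amplicon_len, min_overlap=12):
    """True if the truncated reads still overlap by at least min_overlap."""
    overlap = overlap_after_truncation(trunc_len_f, trunc_len_r, amplicon_len)
    return overlap >= min_overlap


# Truncation pairs mentioned in this thread.
for f, r in [(290, 291), (280, 240), (250, 200)]:
    overlap = overlap_after_truncation(f, r, ASSUMED_AMPLICON_LEN)
    print(f, r, overlap, can_merge(f, r, ASSUMED_AMPLICON_LEN))
```

If the overlap comes out close to or below the minimum for your real amplicon length, lengthen one of the truncation values rather than relaxing the quality filter.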