Hi everyone,
I received two independent HiSeq runs from my sequencing facility. Both runs use the same sample IDs and share one metadata file. After running DADA2 on each run separately, I got table1 from run 1 and table2 from run 2.
I ran a regression analysis between the two runs on 24 randomly picked samples, after first filtering out chloroplast and Candidatus features, which were abundant when I checked the taxonomy with the vsearch classifier.
Below is the command I used:
qiime quality-control evaluate-composition \
  --i-expected-features table-relative-freq-24samples_nocondi-nochloro-run1.qza \
  --i-observed-features table-relative-freq-24samples_nocondi-nochloro-run2.qza \
  --p-depth 1 \
  --o-visualization qc-lane1-and2-comparison.qzv
When I checked the slope and p-values generated by this command, they suggested about 50% dissimilarity between the runs.
Is it safe to merge the runs at this stage, or do I need further normalisation, such as percentile normalisation, to reduce any batch effect? I am confident there are no contaminants between the two runs, as all sequences were assigned prior to this evaluation, and I feel that these differences are biological rather than technical.
I should point out that the main objective of these two technical replicates was to capture as much of the diversity as possible.
I have one Escherichia positive control in each run.
Your help is highly appreciated.
Thanks a lot.
That command is really intended for comparing samples to their expected composition, but can definitely be used to compare technical replicates. It would not be useful, however, if you are comparing unrelated samples. In that case, the differences would be due to biological variation and you cannot discern batch effects. Since you are picking samples randomly, they would not really have any relation to one another (it sounds like you randomly subsampled each table but maybe you picked the same samples from each table?).
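To make the point about sample matching concrete, here is a minimal sketch of an expected-versus-observed regression in plain Python. It is not the plugin's code. The tables, sample IDs (S1, S2), feature IDs (f1 to f3) and the helper names paired_points and least_squares are all made up for illustration, and I am assuming relative-frequency tables keyed by sample ID.

```python
# Toy relative-frequency tables: {sample_id: {feature_id: relative frequency}}.
# All IDs and values are hypothetical, for illustration only.
run1 = {
    "S1": {"f1": 0.50, "f2": 0.30, "f3": 0.20},
    "S2": {"f1": 0.10, "f2": 0.60, "f3": 0.30},
}
run2 = {
    "S1": {"f1": 0.45, "f2": 0.35, "f3": 0.20},
    "S2": {"f1": 0.15, "f2": 0.55, "f3": 0.30},
}


def paired_points(expected, observed):
    """Collect (expected, observed) frequency pairs for samples found in both tables."""
    points = []
    for sample_id in sorted(set(expected) & set(observed)):
        features = set(expected[sample_id]) | set(observed[sample_id])
        for feature_id in sorted(features):
            points.append((expected[sample_id].get(feature_id, 0.0),
                           observed[sample_id].get(feature_id, 0.0)))
    return points


def least_squares(points):
    """Return (slope, intercept, r) of an ordinary least-squares fit."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy / (sxx * syy) ** 0.5


slope, intercept, r = least_squares(paired_points(run1, run2))
print(f"slope={slope:.3f} intercept={intercept:.3f} r={r:.3f}")
```

Only samples whose IDs appear in both tables contribute points, which is why the comparison is only meaningful when the same samples were picked from each run.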
If all samples are technical replicates, you should use PCoA and PERMANOVA to see if you can discern run differences.
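As a rough illustration of what the PERMANOVA test measures, here is a minimal pure-Python sketch that uses Bray-Curtis dissimilarities with sequencing run as the grouping variable. The sample IDs, the "_run1"/"_run2" suffixes and the abundance values are invented for the example, and this simple version ignores the pairing between technical replicates, so treat it as a sketch of the idea rather than a recommended analysis.

```python
import random
from itertools import combinations

# Toy relative frequencies; sample IDs, run labels and values are hypothetical.
samples = {
    "S1_run1": [0.50, 0.30, 0.20],
    "S2_run1": [0.10, 0.60, 0.30],
    "S3_run1": [0.30, 0.30, 0.40],
    "S1_run2": [0.45, 0.35, 0.20],
    "S2_run2": [0.15, 0.55, 0.30],
    "S3_run2": [0.25, 0.35, 0.40],
}
run_of = {sid: sid.split("_")[1] for sid in samples}


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    return sum(abs(a - b) for a, b in zip(u, v)) / sum(a + b for a, b in zip(u, v))


# Pairwise distances, keyed by the unordered pair of sample IDs.
dist = {frozenset(pair): bray_curtis(samples[pair[0]], samples[pair[1]])
        for pair in combinations(sorted(samples), 2)}


def pseudo_f(labels):
    """PERMANOVA pseudo-F for a mapping of sample ID -> group label."""
    groups = {}
    for sid, label in labels.items():
        groups.setdefault(label, []).append(sid)
    n, a = len(labels), len(groups)
    ss_total = sum(d ** 2 for d in dist.values()) / n
    ss_within = sum(
        sum(dist[frozenset(p)] ** 2 for p in combinations(members, 2)) / len(members)
        for members in groups.values()
    )
    return ((ss_total - ss_within) / (a - 1)) / (ss_within / (n - a))


def permanova(labels, permutations=999, seed=0):
    """Return (observed pseudo-F, permutation p-value) by shuffling group labels."""
    rng = random.Random(seed)
    observed = pseudo_f(labels)
    values = list(labels.values())
    hits = 0
    for _ in range(permutations):
        rng.shuffle(values)
        if pseudo_f(dict(zip(labels, values))) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


f_stat, p_value = permanova(run_of)
print(f"pseudo-F={f_stat:.3f} p={p_value:.3f}")
```

A small pseudo-F with a large p-value would mean that run membership explains little of the between-sample distance. In QIIME 2 itself, the equivalent test is available through qiime diversity beta-group-significance, with the run recorded as a metadata column and distinct sample IDs for each run.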
And if these are all technical replicates, I would not worry too much about batch effects anyway. Batch effects are much more concerning when you are merging runs to compare different samples. If all samples are present in both runs, the same noise is introduced to all samples, so you are not going to distort a particular group.