I am using QIIME 2 to calculate a UniFrac distance matrix from 16S data.
I have 16 samples from a design with two factors, time x body site. Each factor has only two levels, so the 16 samples fall into 4 groups (2 x 2) of 4 samples each: before-oral, after-oral, before-fecal, and after-fecal. I calculated the weighted UniFrac distances for this setup as Matrix 1.
For some reason, I then changed the groups: before-oral and after-oral became oral, and before-fecal and after-fecal became fecal. The OTU table did not change and the sequences did not change; only the metadata file for the samples changed. Then I calculated the weighted UniFrac distances again as Matrix 2.
Finally, I found that the distance between the same pair of samples differed between Matrix 1 and Matrix 2, and I do not know why. I thought this was impossible. As far as I know, the UniFrac distance between two samples depends only on the phylogenetic tree and the OTUs found in those two samples, and neither the tree nor the OTU table changed. Did I make a mistake? Or do the shared OTUs depend on the groups rather than on the two samples?
Please provide more data. I don't think it's a failure of Q2, but if it is, there's always a chance that it might be a bug.
Yes, distance should not change in that case.
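To illustrate why, here is a minimal Python sketch of an unnormalized weighted UniFrac calculation on a toy tree. It is not the QIIME 2 implementation, and the tree, OTU names, and counts are all made up for the example. The point is only that the inputs are the tree and the two samples' counts; the metadata never enters the calculation, while a change to the tree does change the result.

```python
# Toy illustration of (unnormalized) weighted UniFrac. This is NOT the QIIME 2
# implementation; the tree, OTU names, and counts below are made up.

# Each node maps to (parent, branch length to parent); "root" has no entry.
TREE = {
    "otu1": ("n1", 0.1),
    "otu2": ("n1", 0.2),
    "otu3": ("root", 0.5),
    "n1": ("root", 0.3),
}

def branch_proportions(counts, tree):
    """Fraction of a sample's reads that descend from each branch."""
    total = sum(counts.values())
    props = {node: 0.0 for node in tree}
    for otu, count in counts.items():
        node = otu
        while node in tree:  # walk from the tip up to the root
            props[node] += count / total
            node = tree[node][0]
    return props

def weighted_unifrac(counts_a, counts_b, tree):
    """Sum over branches of length * |proportion in A - proportion in B|."""
    pa = branch_proportions(counts_a, tree)
    pb = branch_proportions(counts_b, tree)
    return sum(length * abs(pa[node] - pb[node])
               for node, (_parent, length) in tree.items())

sample_a = {"otu1": 10, "otu2": 0, "otu3": 10}
sample_b = {"otu1": 0, "otu2": 10, "otu3": 10}

# No metadata is passed anywhere, so relabeling groups cannot change this value.
same_tree = weighted_unifrac(sample_a, sample_b, TREE)

# A slightly different tree (one branch length changed) gives a different value.
other_tree = dict(TREE, otu2=("n1", 0.25))
changed = weighted_unifrac(sample_a, sample_b, other_tree)
print(same_tree, changed)
```

So if the same pair of samples gives two different distances, the tree or the feature table must have differed between the two runs.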
Hi, thank you for your reply.
Sorry for the lack of information. I have found where the problem is, and it is not a bug. I use a script that generates all the QIIME 2 commands at once. Because I need to change the OTU table and the metadata file every time I test a hypothesis, the script starts from converting the data to biom and qza and then generates the phylogenetic tree, so the tree was rebuilt for each run. As I remember the FastTree algorithm, if more than 1 thread is used (or is it the same with 1 thread?), the placement of closely related OTUs can be swapped between runs. Usually this does not change the conclusions drawn from the phylogenetic tree. But the UniFrac distance calculation is very sensitive to the location of each OTU in the tree, so any small change affects the values, especially with a large number of OTUs, many of which are similar.
When I used the same tree and changed only the metadata file, I got the same distance matrix. I think this problem is solved, so there is no need to show the data and commands. If you want, I can try to upload a small data set.
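For anyone who wants to run the same check, here is a minimal Python sketch for comparing two distance matrices. It assumes both matrices have been exported to tab-separated text with sample IDs in the header row and in the first column; the function names and the two tiny example matrices are hypothetical and only for illustration.

```python
import csv
import io

def read_distance_matrix(text):
    """Parse a tab-separated square matrix into {row_id: {col_id: distance}}."""
    rows = [r for r in csv.reader(io.StringIO(text), delimiter="\t") if r]
    ids = rows[0][1:]  # header row: empty corner cell, then sample IDs
    return {row[0]: dict(zip(ids, map(float, row[1:]))) for row in rows[1:]}

def max_abs_difference(m1, m2):
    """Largest absolute difference over all sample pairs present in both matrices."""
    shared = sorted(set(m1) & set(m2))
    return max(abs(m1[a][b] - m2[a][b]) for a in shared for b in shared)

# Made-up example data standing in for two exported matrices.
matrix_1 = "\tS1\tS2\nS1\t0.0\t0.30\nS2\t0.30\t0.0\n"
matrix_2 = "\tS1\tS2\nS1\t0.0\t0.25\nS2\t0.25\t0.0\n"

m1 = read_distance_matrix(matrix_1)
m2 = read_distance_matrix(matrix_2)
print(max_abs_difference(m1, m1))  # identical inputs
print(max_abs_difference(m1, m2))  # inputs that differ
```

A nonzero result for runs that differ only in the metadata file suggests that something else, such as the tree, was regenerated between the runs.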
Perfect, this technique is called "rubber duck debugging": you explain your code or calculation to an imaginary rubber duck, which helps you verbalize your ideas and find problems and pitfalls.
I'm glad it's solved!