Understanding pcoa exported file

Hi!

I have exported a pcoa.qza file, hoping to find the axis values of my ordination.

The output has a line with Eigvals for the top 15 axes, the proportion explained for the top 15 axes, and a "matrix" with the sample as the first column (labeled Site) and 15 additional columns, all filled with values. My guess would be that the second column (the first after the Site column) holds the PC1 values for each sample, column 3 holds PC2, etc.

A couple of questions:

Thanks!

Hi @kam ,
The issue is that QIIME 2 stores the pcoa results on disk as scikit-bio OrdinationResults format. So the file contains various information, not only the ordination values.

It is easier to view and interact with these data in Python:

>>> import qiime2 as q2, pandas as pd
>>> from skbio import OrdinationResults
>>> pcs = q2.Artifact.load('jaccard_pcoa_results.qza')
>>> pcs = pcs.view(OrdinationResults)
>>> pcs.samples.iloc[:2,:3]
                                           0         1         2
1928.SRS015121.SRX020555.SRR045717  0.153000 -0.149544  0.078728
1928.SRS064354.SRX020689.SRR048775 -0.202465  0.024489 -0.120242

Yes, see the last line of the python example above. I am selecting the first 2 rows and first 3 columns of the dataframe. You could instead write pcs.samples.iloc[:,:3] to select all rows and only the first 3 axes.
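To make the indexing a bit more concrete, here is a minimal sketch that does not need QIIME 2 or scikit-bio. It builds a small pandas DataFrame as a stand-in for pcs.samples, using only the two rows printed above, so the names samples, first_two_axes and labelled are illustrative and not part of any API. It assumes, as in the output above, that the columns are numbered from 0, so column 0 is PC1.

```python
import pandas as pd

# Stand-in for pcs.samples: rows are samples, columns 0, 1, 2 are the axes.
# The values are copied from the output printed above.
samples = pd.DataFrame(
    [[0.153000, -0.149544, 0.078728],
     [-0.202465, 0.024489, -0.120242]],
    index=["1928.SRS015121.SRX020555.SRR045717",
           "1928.SRS064354.SRX020689.SRR048775"],
)

# All rows, first two axes only (PC1 and PC2).
first_two_axes = samples.iloc[:, :2]

# Relabel the 0-based column numbers as PC1, PC2, ... so that values
# can be looked up by sample ID and axis name.
labelled = samples.copy()
labelled.columns = [f"PC{i + 1}" for i in range(labelled.shape[1])]
pc1_value = labelled.loc["1928.SRS015121.SRX020555.SRR045717", "PC1"]

print(first_two_axes)
print(pc1_value)
```

The same relabelling should work on the real pcs.samples table, assuming its columns are numbered as shown above.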

Good luck!

Thanks @Nicholas_Bokulich!

Yes, see the last line of the python example above. I am selecting the first 2 rows and first 3 columns of the dataframe. You could instead write pcs.samples.iloc[:,:3] to select all rows and only the first 3 axes.

Yes, this displays the selected rows based on pandas indexing. I was referring to the data itself: is it possible to get values for more than the 15 lines I received from my data?

The issue is that QIIME 2 stores the pcoa results on disk as scikit-bio OrdinationResults format. So the file contains various information, not only the ordination values.

Unfortunately I am having an issue with the scikit-bio package at the moment, so I will post a photo of the top left of an Excel file (:neutral_face:) of the exported Jaccard pcoa results from the Moving Pictures tutorial to make sure I got your point:

If I am understanding correctly, the green cell is the PC1 value of the L1S105 sample, and the blue cell is the PC3 value of the L1S208 sample?

Yes, you can have as many rows as you have samples.

Yes
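If scikit-bio is not available, the exported text file can also be read with plain Python instead of Excel. The sketch below is only that, a sketch: it assumes the layout described earlier in this thread (an Eigvals line followed by its values, a Proportion explained line followed by its values, and a Site line followed by one tab-separated row per sample), and that the Site line also carries the number of rows. The function name parse_ordination and the numbers in EXAMPLE are made-up placeholders, not real results from the Moving Pictures tutorial.

```python
def parse_ordination(text):
    """Pull eigenvalues, proportion explained and sample coordinates
    out of the text of an exported ordination file (assumed layout)."""
    lines = text.splitlines()
    result = {"eigvals": [], "proportion_explained": [], "samples": {}}
    i = 0
    while i < len(lines):
        fields = lines[i].split("\t")
        header = fields[0].strip()
        if header == "Eigvals":
            # The values sit on the line right after the header.
            result["eigvals"] = [float(v) for v in lines[i + 1].split("\t")]
            i += 2
        elif header == "Proportion explained":
            result["proportion_explained"] = [
                float(v) for v in lines[i + 1].split("\t")
            ]
            i += 2
        elif header == "Site":
            # Assumed header form: Site <n_samples> <n_axes>
            n_rows = int(fields[1])
            for row in lines[i + 1 : i + 1 + n_rows]:
                sample_id, *values = row.split("\t")
                result["samples"][sample_id] = [float(v) for v in values]
            i += 1 + n_rows
        else:
            i += 1
    return result


# Placeholder file content with two samples and three axes (made-up numbers).
EXAMPLE = "\n".join([
    "Eigvals\t3",
    "0.5\t0.3\t0.2",
    "",
    "Proportion explained\t3",
    "0.5\t0.3\t0.2",
    "",
    "Species\t0\t0",
    "",
    "Site\t2\t3",
    "L1S105\t0.1\t-0.2\t0.3",
    "L1S208\t-0.4\t0.5\t-0.6",
    "",
    "Biplot\t0\t0",
    "",
    "Site constraints\t0\t0",
])

parsed = parse_ordination(EXAMPLE)
pc1_l1s105 = parsed["samples"]["L1S105"][0]  # first value after the ID = PC1
pc3_l1s208 = parsed["samples"]["L1S208"][2]  # third value after the ID = PC3
print(pc1_l1s105, pc3_l1s208)
```

To use it on a real export, read the file contents into a string and pass that to parse_ordination; the list index is the axis number minus one, so index 0 is PC1.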
