Comparability of PICRUSt predictions across different sequencing lengths and regions

Hello! Community,

I want to predict the functional profiles of my 16S samples, which come from different studies that sequenced different regions. Are PICRUSt predictions comparable between sequences of varying lengths and regions?

Hello @songying!

I'm not exactly an expert on this specific topic, but here's my opinion:

To me, it doesn't seem that much worse than comparing taxonomy between different amplicon regions. You aren't going to really be talking about the same thing in every situation, but it's also the only possible way to compare different sequencing regions at all. You can also collapse the taxonomy up to some approximately common denominator of resolution between your regions, which still isn't perfect, but is better than nothing.
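To make the "collapse to a common denominator" idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the feature IDs, sample IDs, counts, and the seven-rank `k__; p__; ...` lineage format are made up, and `collapse_taxonomy` is a hypothetical helper, not part of PICRUSt or QIIME 2.

```python
from collections import defaultdict

def collapse_taxonomy(counts, taxonomy, level):
    """Sum per-sample counts of features that share a lineage down to `level` ranks.

    counts:   {feature_id: {sample_id: count}}
    taxonomy: {feature_id: "k__...; p__...; ..."}  (hypothetical lineage format)
    level:    number of ranks to keep (6 = genus in a 7-rank scheme)
    """
    collapsed = defaultdict(lambda: defaultdict(int))
    for feature_id, sample_counts in counts.items():
        ranks = [r.strip() for r in taxonomy[feature_id].split(";")]
        # Pad lineages shallower than `level` so they keep their own group
        # instead of merging with fully resolved ones, then truncate.
        ranks = (ranks + ["__"] * level)[:level]
        key = "; ".join(ranks)
        for sample_id, n in sample_counts.items():
            collapsed[key][sample_id] += n
    return {k: dict(v) for k, v in collapsed.items()}

# Made-up example: two studies, different amplicon regions, different resolution.
LINEAGE = ("k__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales; "
           "f__Lactobacillaceae; g__Lactobacillus")
counts = {
    "v4_asv1": {"studyA_s1": 10},
    "v4_asv2": {"studyA_s1": 5},
    "v34_asv1": {"studyB_s1": 7},
}
taxonomy = {
    "v4_asv1": LINEAGE + "; s__reuteri",  # resolved to species
    "v4_asv2": LINEAGE + "; s__",         # species unassigned
    "v34_asv1": LINEAGE,                  # resolved to genus only
}

genus_table = collapse_taxonomy(counts, taxonomy, level=6)
for lineage, per_sample in genus_table.items():
    print(lineage, per_sample)
```

After collapsing, all three features land in a single genus-level row, so the two studies can at least be compared at that resolution. If you are working in QIIME 2, `qiime taxa collapse` does this kind of collapsing on real feature tables.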

So I would just be extra mindful of the limits of taxonomic resolution, as these are now compounded by the inherent limits of functional profile predictions from 16S, which already require a pretty large grain of salt. Check out some of the key limitations of PICRUSt here.

If PICRUSt supports a similar collapse of gene family information, you might look into that as well, although I'm not sure whether that's as common a practice in these situations as it is for handling taxonomy in meta-analyses. I also wouldn't have any idea how far you would need to collapse, or whether that could even be uniform across gene families (in fact, I would be surprised if it were).

So perhaps this can be useful for hypothesis generation and for designing follow-up studies that use actual shotgun metagenomics, but not really for saying anything definitive (be that within a study or across studies).