I have a very large merged dataset, and it is too large to run typical phylogenetic inference on with align-to-tree-mafft-iqtree. The dataset is not 16S-based, so I cannot use the fragment-insertion alternative. Is there a way to subsample my data after consulting the alpha rarefaction curves and the summary visualisations of the feature table and sequences?
Ideally, I would then like to use the subsampled dataset for all analyses going forward.
Maybe I do not mean subsampling, then. When I look at the sampling depth chosen from rarefaction in the feature table summary visualisation, it shows that x% of features would be lost. Would that not result in a smaller tree that still conveys similar information, as per the theory of rarefaction?
Ah! Thank you for clarifying. In that case, yes, those features would be lost at that subsampling depth and would be dropped from the tree. (You may have to drop them using an additional command, but it's possible.)
In my example, ASV3 in Sample3 had a count of zero after subsampling. If it had a count of zero in all samples, the feature could be dropped from the table and the tree. This sounds like what you want.
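To make the idea concrete, here is a minimal pure-Python sketch of the logic, not the actual QIIME 2 implementation. The toy table, the sampling depth of 20, and the helper names `rarefy` and `drop_empty_features` are all hypothetical and chosen only for illustration. It subsamples each sample to a fixed depth without replacement, discards samples that fall below the depth, and then drops any feature whose count is zero in every remaining sample. The surviving feature IDs are the tips you would keep in the tree.

```python
import random


def rarefy(counts, depth, rng):
    """Subsample one sample's counts to `depth` reads without replacement.

    Returns None when the sample has fewer than `depth` reads, which
    mirrors a sample being discarded at that sampling depth.
    """
    if sum(counts.values()) < depth:
        return None
    # Expand the counts into a pool of individual reads labelled by feature.
    pool = [feat for feat, n in counts.items() for _ in range(n)]
    picked = rng.sample(pool, depth)
    return {feat: picked.count(feat) for feat in counts}


def drop_empty_features(table):
    """Remove features whose count is zero in every sample of `table`."""
    kept = {feat for s in table.values() for feat, n in s.items() if n > 0}
    filtered = {
        sid: {feat: n for feat, n in s.items() if feat in kept}
        for sid, s in table.items()
    }
    return filtered, kept


# Hypothetical toy feature table: sample ID -> {feature ID: read count}.
table = {
    "Sample1": {"ASV1": 50, "ASV2": 30, "ASV3": 0, "ASV4": 0},
    "Sample2": {"ASV1": 40, "ASV2": 0, "ASV3": 0, "ASV4": 45},
    "Sample3": {"ASV1": 5, "ASV2": 2, "ASV3": 1, "ASV4": 0},
}

rng = random.Random(0)  # fixed seed so the subsample is reproducible
depth = 20  # assumed sampling depth, as if read off the rarefaction curves

rarefied = {}
for sid, counts in table.items():
    sub = rarefy(counts, depth, rng)
    if sub is not None:  # samples below the depth are dropped entirely
        rarefied[sid] = sub

filtered, kept = drop_empty_features(rarefied)

# `kept` is the set of feature IDs to retain as tips when pruning the tree.
print("samples kept:", sorted(filtered))
print("features kept:", sorted(kept))
```

In this toy case Sample3 has only 8 reads, so it is discarded at depth 20. ASV3 occurred only in Sample3, so it ends up with a count of zero everywhere and is removed from the table, and it would be removed from the tree as well.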