Hi everyone,
I have looked through the QIIME 2 tutorials on filtering data, but I couldn’t find the solution I am looking for. I have roughly 16,000 unique representative sequences, and I would like to randomly select only 20 of them from rep-seqs.qza. Are there any suggestions for how I could achieve this?
Reason behind my question:
I ran qiime phylogeny align-to-tree-mafft-fasttree on rep-seqs.qza in order to generate rooted-tree.qza. Running this command also created the aligned-rep-seqs.qza and masked-aligned-rep-seqs.qza files.
I was curious to see how the representative sequence, the aligned representative sequence, and the masked aligned representative sequence differ for a particular feature ID. I found the following:
The representative sequences have different lengths to begin with. The aligned representative sequences all have the same length (5420) for every feature ID, and the masked aligned representative sequences all have the same length (5222) for every feature ID.
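For anyone who wants to reproduce this kind of length comparison, here is a minimal sketch. It assumes each artifact has already been exported to a plain FASTA file (for example with qiime tools export); the helper names read_fasta and length_counts are hypothetical and not part of QIIME 2.

```python
from collections import Counter


def read_fasta(lines):
    """Yield (feature_id, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            parts = line[1:].split()
            # Keep only the ID, dropping any description after whitespace.
            header, chunks = (parts[0] if parts else ""), []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def length_counts(lines):
    """Count how many sequences have each length.

    Gap characters ('-') in aligned sequences count toward the length.
    """
    return Counter(len(seq) for _, seq in read_fasta(lines))


# Tiny in-memory example; with real data, pass an open file handle instead,
# e.g. `with open(path) as fh: length_counts(fh)` for an exported FASTA file.
unaligned = [">f1", "ACGT", ">f2", "ACTTGT"]
aligned = [">f1", "AC--GT", ">f2", "ACTTGT"]
print(length_counts(unaligned))
print(length_counts(aligned))
```

A single key in the resulting counter means every sequence has the same length, which is what I see for the aligned and masked files.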
Now I would like to run qiime phylogeny align-to-tree-mafft-fasttree on a rep-seqs.qza file that contains far fewer unique feature IDs (around 20) and see how the lengths of the representative sequences, aligned representative sequences, and masked aligned representative sequences differ after running the command.
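One possible way to do the random selection outside QIIME 2 is sketched below. It assumes rep-seqs.qza has been exported to a FASTA file first, and the function names (fasta_ids, sample_ids, to_metadata_tsv) are hypothetical. The output is a one-column list of feature IDs with a "feature-id" header. As far as I can tell, qiime feature-table filter-seqs can take such a file as metadata to keep only the listed sequences, but that is an assumption that should be checked against the installed QIIME 2 version.

```python
import random


def fasta_ids(lines):
    """Collect feature IDs from the header lines of a FASTA file."""
    ids = []
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:  # ignore headers that carry no ID
                ids.append(parts[0])
    return ids


def sample_ids(ids, n, seed=None):
    """Pick n distinct IDs at random; a fixed seed makes the pick repeatable."""
    ids = list(ids)
    if n > len(ids):
        raise ValueError(f"asked for {n} IDs but only {len(ids)} are available")
    return random.Random(seed).sample(ids, n)


def to_metadata_tsv(ids):
    """Format IDs as a one-column table with a 'feature-id' header."""
    return "feature-id\n" + "".join(f"{feature_id}\n" for feature_id in ids)


# Tiny in-memory example; with real data, read the exported FASTA file and
# request 20 IDs instead of 2.
demo = [">f1", "ACGT", ">f2", "ACTTGT", ">f3", "GGCA"]
print(to_metadata_tsv(sample_ids(fasta_ids(demo), 2, seed=42)))
```

Sampling IDs rather than whole records keeps the script small and leaves the actual filtering to QIIME 2, so the subset stays a regular artifact that can go straight into align-to-tree-mafft-fasttree.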