I have two reactors, labelled A and B. Each was run in triplicate (labelled α, β and γ) and sampled at three time points (labelled 1, 2 and 3). So I have 18 paired-end sequencing samples (A_α_1, A_β_1, A_γ_1 and so on), with an average of 50,000 reads per sample.
I want to compare the microbial composition between A_1 and B_1 with edgeR. So I imported the sequences A_α_1, A_β_1 and A_γ_1 into A_1.qza, and the sequences B_α_1, B_β_1 and B_γ_1 into B_1.qza. The two .qza files were then denoised separately with DADA2, so I got two FeatureData[Sequence] artifacts and two FeatureTable[Frequency] artifacts, and then used these files to make the comparison.
However, the description of the --p-n-reads-learn parameter in DADA2 suggests that training on more reads gives a more reliable error model. So would it be better to import all 18 samples into one .qza file and denoise them together?
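If so, how can I subset the resulting FeatureTable[Frequency] so that it contains only the A_1 and B_1 samples, and make the comparison between just those two groups? As far as I can tell, qiime feature-table subsample would pick samples at random, which is not what I want.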
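To make clear what I mean by selecting samples by ID rather than at random, here is a minimal sketch in plain Python. It is not a QIIME 2 command. The toy table, the read counts and the helper name select_timepoint are all made up for illustration; only the sample naming follows my data.

```python
from itertools import product

# Sample IDs follow the naming used in this post: reactor_replicate_timepoint.
reactors = ["A", "B"]
replicates = ["α", "β", "γ"]
timepoints = ["1", "2", "3"]
all_samples = [f"{r}_{rep}_{t}" for r, rep, t in product(reactors, replicates, timepoints)]

# Toy feature table: sample ID -> {feature ID: read count}.
# The feature IDs and counts are invented purely for illustration.
table = {s: {"ASV1": 10 + i, "ASV2": 2 * i} for i, s in enumerate(all_samples)}


def select_timepoint(table, timepoint):
    """Keep only the samples whose ID ends with the given time point label.

    Hypothetical helper: a deterministic selection by ID, not a random draw.
    """
    return {s: counts for s, counts in table.items() if s.split("_")[-1] == timepoint}


subset = select_timepoint(table, "1")
print(sorted(subset))
```

So out of the 18 samples in the combined table, I would like to keep exactly the six samples from time point 1 (three from A and three from B) and pass only those to edgeR.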
Thanks in advance.