Is the number of features retained after merging and filtering typical for my data?

Hello all,

I have 2 NGS runs with 50 and 43 samples respectively.
I did the following:

  1. Trimmed primers from my reads with cutadapt, separately for each run
  2. Denoised the data with dada2, separately for each run (each run yielded slightly more than 2000 features)
  3. Merged the 2 feature tables obtained from dada2 (yielding ~3700 features in total)
  4. Filtered sequences and features (removed features that occurred in fewer than 3 samples)
  5. After filtering I was left with 1165 features, and I am now heading towards taxonomic classification. Before I classify these sequences, I want to ask: is 1165 features from 93 samples a usual number, or is it too many or too few?
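
To make steps 3 and 4 concrete, here is a minimal sketch in plain Python on toy data. It is not what the pipeline actually runs; the table layout (feature ID mapped to per-sample counts) and the helper names `merge_tables` and `filter_min_samples` are assumptions made only for illustration.

```python
from collections import defaultdict


def merge_tables(*tables):
    """Combine feature tables; counts for a shared feature/sample pair are summed."""
    merged = defaultdict(dict)
    for table in tables:
        for feature, counts in table.items():
            for sample, count in counts.items():
                merged[feature][sample] = merged[feature].get(sample, 0) + count
    return dict(merged)


def filter_min_samples(table, min_samples):
    """Keep features with a non-zero count in at least `min_samples` samples."""
    return {
        feature: counts
        for feature, counts in table.items()
        if sum(1 for c in counts.values() if c > 0) >= min_samples
    }


# Toy tables standing in for the two runs (hypothetical feature and sample IDs).
run1 = {"ASV_A": {"s1": 10, "s2": 5}, "ASV_B": {"s1": 3}}
run2 = {"ASV_A": {"s3": 7}, "ASV_C": {"s3": 1, "s4": 2}}

merged = merge_tables(run1, run2)
kept = filter_min_samples(merged, min_samples=3)
print(len(merged), "features before,", len(kept), "after")
```

Only features that both runs resolve to the same ID collapse into one row when merging, which is why the merged total (~3700) is lower than the sum of the two runs.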

My samples are stool samples from cancer patients, and I intend to study their microbiome composition.


Did you denoise both runs with identical parameters? It is recommended to keep the same settings when different runs are going to be merged.

Is that the number of unique features? It looks reasonable to me.
Are the total frequencies per sample sufficient?

You can get an even lower number of unique ASVs/features if you filter features based on total frequency (for example, remove all features with a total frequency of less than 10).
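
As a rough illustration of both checks (a total-frequency filter and the per-sample totals), here is a small plain-Python sketch on made-up counts. The table layout and the helper names `filter_min_frequency` and `sample_totals` are assumptions for the example, not part of any tool.

```python
def filter_min_frequency(table, min_frequency):
    """Keep features whose count summed over all samples is >= min_frequency."""
    return {
        feature: counts
        for feature, counts in table.items()
        if sum(counts.values()) >= min_frequency
    }


def sample_totals(table):
    """Sum the counts of all remaining features for each sample."""
    totals = {}
    for counts in table.values():
        for sample, count in counts.items():
            totals[sample] = totals.get(sample, 0) + count
    return totals


# Made-up feature table: feature ID -> {sample ID: count}.
table = {
    "ASV_A": {"s1": 120, "s2": 80, "s3": 40},
    "ASV_B": {"s1": 2, "s2": 3, "s3": 1},   # total 6, dropped at threshold 10
    "ASV_C": {"s1": 0, "s2": 9, "s3": 5},   # total 14, kept
}

filtered = filter_min_frequency(table, min_frequency=10)
totals = sample_totals(filtered)
print(sorted(filtered), totals)
```

The per-sample totals after filtering are what you would compare against the sampling depth you plan to use downstream.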
