Sampling depth to include all samples

Hi all, my sample counts are all very low, ranging from 161 to 982. I would like to include all the samples and see what is going on. In this case, what value should I use for the sampling depth? Should I put “0” or “1”? And what about the alpha rarefaction depth: should I use the same value, or can I choose a different one?

I’m a bit confused here. Do the sampling depth and the alpha rarefaction depth always have to be the same?

Thank you for your time.

Hi! If you want all samples for alpha and beta diversity, choose 161.

For alpha rarefaction you can choose any value between your min and max counts. You can use the mean, for example.
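To make the effect of the sampling depth concrete, here is a minimal Python sketch of which samples survive at a given depth. It uses the eight per-sample counts reported further down this thread; the sample IDs (`s1` through `s8`) and the helper `retained_samples` are made up for illustration. The sketch only mimics the rule that samples with fewer reads than the sampling depth are dropped; it does not call QIIME 2.

```python
# Per-sample read counts as reported later in this thread.
# The sample IDs are hypothetical placeholders.
counts = {
    "s1": 161, "s2": 263, "s3": 232, "s4": 487,
    "s5": 732, "s6": 521, "s7": 811, "s8": 982,
}


def retained_samples(counts, depth):
    """Return the IDs of samples that have at least `depth` reads.

    Samples with fewer reads than the sampling depth are dropped
    when a table is rarefied to an even depth.
    """
    return [sample_id for sample_id, n in counts.items() if n >= depth]


for depth in (1, 161, 500, 1000):
    kept = retained_samples(counts, depth)
    print(f"depth={depth:>4}: keeps {len(kept)} of {len(counts)} samples")
```

Any depth at or below 161 keeps all eight samples, a depth of 500 keeps only the four deepest samples, and a depth of 1000 keeps none, since no sample has that many reads.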

Hi @timanix, could I just put 1 or 0 in this case? I assume it will be the same?

What if I put a value beyond the min and max range, in this case 1000?

Hi @alexp,

While I agree with @timanix’s recommendation to use a sampling depth of 161 to retain all samples, I would recommend taking a step back and seeing where you’re losing sequences. Do you not have a lot of sequences to import? Do they drop off during quality filtering? Denoising? Paired-end joining? Clustering? Did you filter in some way?

I’m an evangelist for more, shallower samples, and even I wouldn’t trust observations at 160 sequences/sample for anything more than a rhetorical point.

Best,
Justine

Hi @jwdebelius

I still haven’t gotten an answer: would a sampling depth of “0” or “1” work if I want to retain all samples? Or could I go as low as “10”?

Another question: do I need to use the same value for alpha rarefaction, that is, the same value I use for the sampling depth?

Many thanks!

Hi @alexp,

It depends on which command you’re using. Alpha rarefaction and rarefy will both fail with a value of 0, because you can’t generate a feature table with 0 counts. A value of 1 will give you a useless table: with 1 count per sample, each sample can only have 1 feature, so observed features will be 1 and Shannon diversity will be 0.
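The depth-1 case can be checked with a small Python sketch. It subsamples one read without replacement from a single sample and computes Shannon entropy with the standard formula H = -sum(p * log2(p)), which is 0 when only one feature is present. The feature names and counts are hypothetical (chosen to sum to 161), and `rarefy` and `shannon` are illustrative helpers, not QIIME 2 functions.

```python
import math
import random
from collections import Counter


def shannon(counts, base=2):
    """Shannon entropy H = -sum(p * log(p)) over non-zero feature counts."""
    total = sum(counts)
    return sum(-(c / total) * math.log(c / total, base) for c in counts if c > 0)


def rarefy(feature_counts, depth, rng):
    """Subsample `depth` reads without replacement from one sample."""
    # Expand the counts into one entry per read, then draw `depth` of them.
    reads = [feat for feat, n in feature_counts.items() for _ in range(n)]
    return Counter(rng.sample(reads, depth))


# Hypothetical sample: three features, 161 reads in total.
sample = {"featA": 80, "featB": 50, "featC": 31}

rng = random.Random(0)
sub = rarefy(sample, 1, rng)
print("observed features:", len(sub))
print("Shannon entropy:", shannon(list(sub.values())))
print("Shannon entropy at full depth:", shannon(list(sample.values())))
```

Whichever read is drawn, the depth-1 table holds exactly one feature with one count, so every sample ends up with the same richness and the same entropy, and nothing can be compared between samples.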

As I said before, and I will reiterate: at 160ish sequences/sample, your data is junk. You should not be worrying about rarefaction until you’ve figured out why you don’t have a reasonable sequencing depth.

Best,
Justine

Hi @jwdebelius,

Sorry, I am a newbie in bioinformatics and QIIME 2, and now I’m getting super confused.

I’m sequencing the microbiome from a sample with very few microbes. The read counts are 161, 263, 232, 487, 732, 521, 811, and 982. I tried using a sampling depth of 150 (not sure if this is okay?) and 150 for the alpha rarefaction max depth. The Faith PD, Shannon, and observed OTUs rarefaction plots are shown below. They don’t reach a plateau at all. I think I’m not including all the counts? Any advice for me? A long reply with details and explanation (if possible) would be greatly appreciated.

[Images: faith_pd, shannon, and observed_otus alpha rarefaction plots]

Many thanks!

Hi @alexp,

The fact that you’re new is exactly why I bring up your sequencing depth. Your sample size and sequencing depth are insufficient for analysis. You should not be worrying about rarefaction until you figure out why your sequencing depths are so low.

The way to do that is to go back to earlier in the process. How many sequences did you import? What command did you use to generate your feature table? We can work through issues with sequencing depth together, but I need your help to figure out what went wrong where.

You’re not going to plateau at 150 sequences/sample in a richness metric, like it’s just not going to happen. So, figure out why you only have 150 sequences/sample and then worry about your rarefaction curves.
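Why a richness curve cannot flatten at such a shallow depth can be illustrated with the classic analytical rarefaction formula, E[S_n] = sum_i (1 - C(N - N_i, n) / C(N, n)), which gives the expected number of features seen in a random subsample of n reads. The community below is entirely hypothetical (a few dominant features plus a long tail of rare ones) and is not the poster's data. QIIME 2's alpha rarefaction works by repeated random subsampling rather than this closed form, so treat this as a sketch of the idea only.

```python
from math import comb


def expected_richness(abundances, n):
    """Expected number of features observed in a subsample of n reads.

    Analytical rarefaction for sampling without replacement:
    E[S_n] = sum_i (1 - C(N - N_i, n) / C(N, n))
    """
    total = sum(abundances)
    denom = comb(total, n)
    return sum(1 - comb(total - a, n) / denom for a in abundances)


# Hypothetical community: 4 dominant, 20 moderate, and 100 rare features
# (10,000 reads and 124 features in total).
community = [5000, 2000, 1000, 500] + [50] * 20 + [5] * 100

for depth in (50, 150, 1000, 5000, 10000):
    seen = expected_richness(community, depth)
    print(f"depth={depth:>5}: expected features = {seen:6.1f} of {len(community)}")
```

In this made-up community, a subsample of 150 reads is expected to recover only a small fraction of the 124 features, and the curve is still climbing well past 1000 reads. A curve that has not flattened at 150 reads is therefore what you would expect from the depth alone.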

Best,
Justine

Hi @jwdebelius

Thank you for your prompt reply.

I actually don’t expect my samples to contain any microbes, as I am working with a sterile organ, yet the sequencing results show some counts. So I’m looking at what the microbes are and whether they come from environmental contamination or something else.

Here are the commands that I used:
qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path manifest.tsv --input-format PairedEndFastqManifestPhred33V2 --output-path imported_seqs.qza

qiime demux summarize --i-data imported_seqs.qza --o-visualization imported_seqs.qzv

qiime dada2 denoise-paired --i-demultiplexed-seqs imported_seqs.qza --p-trim-left-f 0 --p-trim-left-r 19 --p-trunc-len-f 225 --p-trunc-len-r 183 --o-table table.qza --o-representative-sequences rep-seqs.qza --o-denoising-stats denoising-stats.qza

qiime metadata tabulate --m-input-file denoising-stats.qza --o-visualization denoising-stats.qzv

qiime phylogeny align-to-tree-mafft-fasttree --i-sequences rep-seqs.qza --o-alignment aligned-rep-seqs.qza --o-masked-alignment masked-aligned-rep-seqs.qza --o-tree unrooted-tree.qza --o-rooted-tree rooted-tree.qza

qiime diversity core-metrics-phylogenetic --i-phylogeny rooted-tree.qza --i-table table.qza --p-sampling-depth 150 --m-metadata-file metadata.tsv --output-dir core-metrics-results

Many Thanks!

Hi @alexp,

I would double-check your DADA2 denoising stats to see what your read counts look like going in and coming out. I would also focus more on your taxonomic assignments than on rarefaction or diversity analysis. It doesn’t matter how your samples relate to each other: you care about what’s there and whether or not it’s reasonable.
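One way to read the denoising stats is to look for the step where each sample loses the largest share of its reads. The Python sketch below does this on a small made-up table. The column names are assumed to follow the per-step read counts in the DADA2 stats (input, filtered, denoised, merged, non-chimeric), and every number is hypothetical. A real export may include extra percentage columns or a type-directive row that would need to be skipped.

```python
import csv
import io

# Hypothetical denoising stats; all values are made up for illustration.
STATS_TSV = (
    "sample-id\tinput\tfiltered\tdenoised\tmerged\tnon-chimeric\n"
    "sA\t10000\t9000\t8800\t400\t380\n"
    "sB\t10000\t1200\t1150\t1100\t1000\n"
)

# Read-count columns in pipeline order (assumed layout).
STEPS = ["input", "filtered", "denoised", "merged", "non-chimeric"]


def largest_drop(row):
    """Return (step, fraction_lost) for the step that loses the most reads."""
    worst_step, worst_loss = None, 0.0
    for prev, cur in zip(STEPS, STEPS[1:]):
        before, after = int(row[prev]), int(row[cur])
        loss = (before - after) / before if before else 0.0
        if loss > worst_loss:
            worst_step, worst_loss = cur, loss
    return worst_step, worst_loss


for row in csv.DictReader(io.StringIO(STATS_TSV), delimiter="\t"):
    step, loss = largest_drop(row)
    print(f"{row['sample-id']}: biggest loss at '{step}' "
          f"({loss:.0%} of the reads entering that step)")
```

In this toy table, sample sA loses most of its reads at merging and sB at quality filtering. Knowing which step is responsible tells you which parameters to revisit (for example truncation lengths or trimming) before thinking about diversity analysis at all.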

You may also want to look at some of the literature around kit contamination. The Salter paper is a classic; you may also like the KatharoSeq paper.

Best,
Justine
