Hi, hopefully this is the appropriate section. I've recently used Qiime2 to analyse a series of data collected elsewhere to prepare for my own research, and I'm just curious whether I could get some advice on the study system.
We have soil microbiome data (16S and ITS) for each sample at day 0 (start of study) and day 14, as well as fresh weight measurements taken at day 16. There are significant growth differences, but our chosen treatment only peaks at around 1500 mg fresh weight, which, despite being at least double that of any other treatment, is still very low.
I am pretty curious about the limitations of such a short dataset, as while there are microbiome changes I can calculate as significant... I have a feeling the scale of changes won't be captured in 14 days.
The Nyquist-Shannon sampling theorem states that the sample rate must be at least twice the bandwidth (a.k.a. maximum frequency) of the signal to avoid aliasing.
...in the sense that if samples are taken at a slower rate than twice the band limit, then there are some signals that will not be correctly reconstructed.
To guarantee detection of a daily change, you need to sample 2x per day.
To guarantee detection of an hourly change, you need to sample 2x per hour.
If you sample twice a week, you can detect any change that plays out over a week or longer.
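To make the rule of thumb above concrete, here is a minimal sketch in plain Python. The signal is a made-up cosine with a 1-day period (an assumption for illustration, not anything measured in this study), and `sample_daily_cycle` and `observed_range` are hypothetical helper names. It compares sampling once a day with sampling four times a day.

```python
import math

def sample_daily_cycle(samples_per_day, days, phase=0.0):
    """Sample a hypothetical signal with a 1-day period at evenly spaced times."""
    n = samples_per_day * days
    # Time is in days, so sample i is taken at i / samples_per_day.
    return [math.cos(2 * math.pi * (i / samples_per_day) + phase) for i in range(n)]

def observed_range(values):
    """Spread between the largest and smallest sampled value."""
    return max(values) - min(values)

once_a_day = sample_daily_cycle(samples_per_day=1, days=14)
four_a_day = sample_daily_cycle(samples_per_day=4, days=14)

# Sampling slower than 2x per cycle: every sample lands on the same point
# of the cycle, so the signal looks flat.
print(round(observed_range(once_a_day), 6))

# Sampling faster than 2x per cycle: the daily swing becomes visible.
print(round(observed_range(four_a_day), 6))
```

With one sample a day the daily swing disappears entirely, even though the underlying signal is changing the whole time.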
I would describe this as categorical sampling, which happens to be over time.
You have sampled before, during, and after, which is fine!
For example, we state: "Analysis tools developed for spatial studies can sometimes be applied to temporal studies, or vice versa. For example, in [58], variograms were used to identify the temporal scales at which E. coli concentrations increased within a watershed."
A little dated for sure... but I thought it was a cool way to experimentally determine how often one should sample.
For the sake of example, the sampling would be something like this: sample X times a day for x days, then every Y days y times, then every Z week(s) or so z times, etc...
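A schedule like that can be sketched in a few lines of Python. The helper name `tiered_schedule` and the tier values below (4 times a day for 2 days, then every 2 days, then weekly) are placeholders chosen for illustration, not a recommendation for any particular system.

```python
def tiered_schedule(tiers, start=0.0):
    """Return sampling times (in days) for a list of (interval_days, n_samples) tiers.

    Each tier adds n_samples sampling events spaced interval_days apart,
    continuing from the last time point of the previous tier.
    """
    times = [start]
    for interval_days, n_samples in tiers:
        for _ in range(n_samples):
            times.append(times[-1] + interval_days)
    return times

# Hypothetical tiers: dense sampling early, sparser sampling later.
tiers = [
    (0.25, 8),  # 4 times a day for 2 days
    (2, 5),     # then every 2 days, 5 times
    (7, 4),     # then weekly, 4 times
]

schedule = tiered_schedule(tiers)
print(schedule)
```

The point of the design is that the total number of sampling events stays manageable while the schedule still spans hours, days, and weeks.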
Thank you both for the responses! This should help me a lot in planning my future experiments.
I have one quick question about the X samples at X days. My dataset only has samples from two timepoints (0 and 14), but uses multiple replicates. If I had two replicates with identical treatments sampled once per day, would this be equivalent to twice per day?
My comment about sampling was more to make the point that you do not need to sample every day or so for several weeks or months: the further out in time you go, the less frequently you need to sample. This will allow you to cover different scales of time.
For example, some geographic studies sample on a log scale...
samples at 1 m apart
samples at 10 m apart
samples at 100 m apart
samples at 1000 m apart.
Then they can generate pairwise comparisons between these scales, trying to have a similar number of data points and pairwise comparisons (if needed) at each spatial scale. You can do this similarly for temporal data. From this they can determine what spatial scale is relevant to the study. Yeah, I know... beyond the scope of the question.
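As a rough sketch of the bookkeeping involved, the snippet below counts how many pairwise comparisons fall into each order-of-magnitude distance bin. The positions are invented for illustration and `pairs_per_scale` is a hypothetical helper. The same idea works for time points if you swap metres for days.

```python
import math
from collections import Counter
from itertools import combinations

def pairs_per_scale(positions):
    """Count sample pairs per order-of-magnitude separation.

    Bin 0 holds separations from 1 up to (but not including) 10,
    bin 1 holds 10 up to 100, and so on.
    """
    counts = Counter()
    for a, b in combinations(positions, 2):
        separation = abs(a - b)
        if separation > 0:
            counts[math.floor(math.log10(separation))] += 1
    return dict(sorted(counts.items()))

# Hypothetical sample positions along a transect, in metres.
positions = [0, 2, 30, 400, 5000]

print(pairs_per_scale(positions))
```

If one bin ends up with far fewer pairs than the others, that scale is under-sampled, and adding a few samples at that spacing evens out the comparisons.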
Thanks, so it's just a nice way of figuring out where samples start to differ!
On another note, I realised that I hadn't removed low-count samples after I filtered out non fungal/bacterial taxa (Chloroplasts, Mitochondria, Brassica). Now the treatments actually look reasonably distinct between the two timepoints.