Sample-32 and sample-35 are the two samples I want to compare in one taxonomic bar plot. Sample-35 has read lengths of about 220 bp, while the rest are 250-300 bp.
Is there a way for me to trim and analyse them separately and only combine the results in one taxonomic bar plot?
Many thanks once again for your kind and generous help
It looks like your mixed run is failing to merge. (DADA2 doesn't particularly like mixed runs either, just FYI). You might need to consider forward sequences only to be able to combine them. You may also find deblur easier, since this provides a fixed sequence length.
You could create the feature tables separately, e.g. two tables in which each sequence is assigned an OTU label. You can then merge the two feature tables, because the merge only cares about the OTU labels and not the sequences themselves. Maybe this will work? If you want to use the entire sequence, though, I don't think this method will work.
First we’ll merge the two FeatureTable[Frequency] artifacts, and then we’ll merge the two FeatureData[Sequence] artifacts. This is possible because the feature ids generated in each run of denoise-single are directly comparable (in this case, the feature id is the md5 hash of the sequence defining the feature).
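To illustrate why hash-based feature ids make separately denoised tables mergeable, here is a minimal sketch in plain Python. It does not use the QIIME 2 API: the toy sequences, the dict-based table layout, and the helper names `feature_id`, `make_table` and `merge_tables` are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def feature_id(sequence):
    """Return the md5 hex digest of a sequence, used here as its feature id."""
    return hashlib.md5(sequence.encode("ascii")).hexdigest()


def make_table(sample_id, seq_counts):
    """Build {feature_id: {sample_id: count}} from {sequence: count}."""
    return {feature_id(seq): {sample_id: n} for seq, n in seq_counts.items()}


def merge_tables(*tables):
    """Merge tables by feature id, summing counts if a sample repeats."""
    merged = defaultdict(dict)
    for table in tables:
        for fid, counts in table.items():
            for sample, n in counts.items():
                merged[fid][sample] = merged[fid].get(sample, 0) + n
    return dict(merged)


# Toy data (hypothetical): one sequence is shared by both samples.
shared = "ACGTACGTAC"
table_32 = make_table("sample-32", {shared: 10, "TTGACCGTAA": 4})
table_35 = make_table("sample-35", {shared: 7, "GGCATTAGCA": 2})

merged = merge_tables(table_32, table_35)
for fid, counts in merged.items():
    print(fid, counts)
```

The shared sequence ends up as a single row with counts for both samples, because identical sequences always hash to the same id. The flip side is that two sequences differing by even one base, or by length, get unrelated ids.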
Wow, creating md5 hash from the actual sequences is such a brilliant idea!
I will try merging the feature tables.
@jwdebelius thank you for sharing your insights - could you kindly explain more about the "length signal" that you mentioned please? Is it a bad thing / will that affect the downstream statistical analysis?
Technical effects (sequence length, primers, etc.) can sometimes outweigh the biological signal you're looking for, or can confound it. At best, a length difference increases the noise in your data. At worst, it can actually confound biological signals. One easy way to solve this is to trim all your sequences to the same length.
So, I strongly encourage you to think about trimming to the same length.
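As a rough illustration of how length alone can split one organism into two features, the sketch below hashes a toy read at two lengths and then truncates both to a common length. This is plain Python, not a QIIME 2 command: the `truncate` helper, the toy read, and the 220 bp cutoff (taken from the shorter sample in this thread) are assumptions for illustration.

```python
import hashlib


def feature_id(sequence):
    """Return the md5 hex digest of a sequence, used here as its feature id."""
    return hashlib.md5(sequence.encode("ascii")).hexdigest()


def truncate(sequences, length):
    """Keep the first `length` bases of each read; drop reads that are shorter."""
    return [seq[:length] for seq in sequences if len(seq) >= length]


# Hypothetical reads from the same organism, sequenced to different lengths.
long_read = "ACGT" * 65        # 260 bp
short_read = long_read[:220]   # 220 bp

# Untrimmed, the same biology yields two unrelated feature ids.
ids_before = {feature_id(long_read), feature_id(short_read)}

# Truncated to a common length, both reads collapse into one feature.
trimmed = truncate([long_read, short_read], 220)
ids_after = {feature_id(seq) for seq in trimmed}

print(len(ids_before), "feature ids before trimming")
print(len(ids_after), "feature id after trimming")
```

In a real analysis the truncation would be done by the denoising step itself rather than by hand; the point here is only that a common length has to be chosen before the feature ids from different runs become comparable.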