That's really low. Unless you have very low sequence counts to start with, chances are you are losing too many sequences due to sub-optimal parameter settings in your denoising protocol.
- How many starting sequences do you have? When you run dada2, use the `--verbose` flag so that filtering summaries are reported (then you will know where the sequences are being dropped).
- What marker gene/primers are you using? What is the total length of the amplicon? Since you are using paired reads, you want to make especially sure that the trimmed reads still overlap by at least 20 nt, or you will lose reads that fail to join.
- What does the quality of the demultiplexed sequences look like? Please share your demux quality QZV if you want us to take a look.
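To make the overlap question concrete, here is a rough sketch of the arithmetic in Python. The function names and the example lengths are made up for illustration, and it assumes the amplicon length is measured over the same region the reads cover (so count or exclude primers consistently). The 20 nt threshold is the one mentioned above.

```python
def read_overlap(amplicon_len, trunc_len_f, trunc_len_r):
    """Expected overlap (nt) between truncated forward and reverse reads.

    The two reads start at opposite ends of the amplicon, so whatever
    their combined length exceeds the amplicon by is the overlap.
    A negative value means there is a gap between the reads.
    """
    return trunc_len_f + trunc_len_r - amplicon_len


def can_join(amplicon_len, trunc_len_f, trunc_len_r, min_overlap=20):
    """True if the truncated reads overlap by at least min_overlap nt."""
    return read_overlap(amplicon_len, trunc_len_f, trunc_len_r) >= min_overlap


# Hypothetical values, not taken from your data:
# a 460 nt amplicon with reads truncated to 250 (forward) and 200 (reverse)
print(read_overlap(460, 250, 200))  # -10, i.e. the reads cannot join
print(can_join(460, 250, 200))      # False
```

If this comes out below 20 for the truncation lengths you used, that alone could explain most of the lost reads.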
Chances are sequences are being dropped because:
- you are trimming too much, so sequences fail to overlap and join.
- you are trimming too little, so error-filled sequences (probably the reverse reads) are being dropped for being too noisy.
- both. If reads are noisy, you want to trim appropriately, but if so much is trimmed that the reads can no longer be joined, you are stuck. This sad state of affairs has happened to me many times. In that case, the best course of action is to discard the reverse reads (or whichever direction is noisier) and analyze the less-noisy reads as single-end data.
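Once you have the filtering summaries, a quick way to tell these cases apart is to see which step loses the most reads. Below is a minimal sketch in Python. The step names are my assumption of what the dada2 denoising stats report (input, filtered, denoised, merged, non-chimeric), the counts are invented purely for illustration, and the hints are only a rule of thumb.

```python
# Assumed order of the steps reported in the denoising stats
STEPS = ["input", "filtered", "denoised", "merged", "non-chimeric"]


def largest_loss(counts):
    """Return (step, reads_lost) for the step with the biggest drop
    relative to the step before it."""
    losses = {
        cur: counts[prev] - counts[cur]
        for prev, cur in zip(STEPS, STEPS[1:])
    }
    step = max(losses, key=losses.get)
    return step, losses[step]


def hint(step):
    """Map the worst step to the likely cause discussed above."""
    if step == "filtered":
        return "reads too noisy: probably trimming too little"
    if step == "merged":
        return "reads fail to join: probably trimming too much"
    return "not one of the trimming cases above; look more closely"


# Invented counts for a single sample, for illustration only
example = {
    "input": 20000,
    "filtered": 18000,
    "denoised": 17500,
    "merged": 900,
    "non-chimeric": 850,
}

step, lost = largest_loss(example)
print(step, lost, "->", hint(step))
```

If both "filtered" and "merged" show large losses, you are in the third case, and single-end analysis of the cleaner read direction is worth trying.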
Definitely not — your most highly sequenced sample still contains < 1000 sequences, which is just way too low. But the good news is that this is probably a problem with your workflow/parameters, and more sequences should be recoverable if you check out the steps I've outlined above.
Good luck!