That is indeed much more variation than usual. There is a related discussion somewhere on the forum (I can't find it right now) where it was mentioned that this kind of length variation is usually caused by non-target DNA, e.g., chloroplast and mitochondrial 16S is shorter than bacterial 16S, and sometimes eukaryotic 18S can be hit by the primers. So unless this is something that interests you, you should just filter out anything too long or too short. I'd say filter out anything < 240 nt or > 255 nt.
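In case it helps to see the length cutoff spelled out, here is a rough sketch in plain Python. It is not a QIIME 2 command; `parse_fasta` and `filter_by_length` are hypothetical helper names I made up for illustration, and it assumes your representative sequences are in a FASTA-formatted text file. The 240 and 255 nt bounds are just the ones suggested above, so adjust them for your amplicon.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def parse_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequences may wrap over several lines
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(
    records: Iterable[Record], min_len: int = 240, max_len: int = 255
) -> Iterator[Record]:
    """Keep records with min_len <= length <= max_len (bounds inclusive)."""
    for header, seq in records:
        if min_len <= len(seq) <= max_len:
            yield header, seq


if __name__ == "__main__":
    import sys

    # Assumed usage: python filter_len.py rep-seqs.fasta > filtered.fasta
    with open(sys.argv[1]) as handle:
        for header, seq in filter_by_length(parse_fasta(handle)):
            print(f">{header}\n{seq}")
```

The bounds are inclusive, so a read of exactly 240 or 255 nt is kept and anything shorter or longer is dropped.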
You are correct @MirjamBloem, it depends on what this is, but if it is non-target DNA (e.g., eukaryotic) it will probably not classify, so just removing it now will save you time.
Not sure! It could be that truncating the last 20 nt causes the unusually long seqs to be filtered out at that stage (since they no longer join), but I can't explain the shorter seqs.