We don't recommend this because it is less accurate, but it will probably work fine as long as individual samples are reasonably deep (>10k reads per sample).
I don't really understand what this means. It might be best to restate this a bit more precisely, perhaps by listing the exact commands that take you from "sequence data" to "feature table", and by explaining how you are tallying the number of "features ... in the sequence data" vs. the number of "features ... in the feature table".
One potential explanation is that if you ran 97% OTU clustering on the output of DADA2, you will of course end up with fewer features, because sequence variants that are at least 97% similar get lumped into one OTU.
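
To make the lumping effect concrete, here is a toy sketch in plain Python. It is not what QIIME 2 or vsearch actually runs: the `identity`, `mutate`, and `greedy_cluster` helpers are hypothetical, identity is simplified to the fraction of matching positions between equal-length sequences (no alignment), and the sequences are made up. It only illustrates why the feature count drops after clustering at 97%.

```python
# Toy illustration only: hypothetical helpers, made-up sequences,
# and a simplified identity measure (no alignment, equal lengths assumed).

def identity(a, b):
    """Fraction of positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("toy identity assumes equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def mutate(seq, positions):
    """Return seq with the base at each given position swapped for a different one."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    chars = list(seq)
    for p in positions:
        chars[p] = swap[chars[p]]
    return "".join(chars)

def greedy_cluster(variants, threshold=0.97):
    """Assign each variant to the first centroid it matches at >= threshold,
    otherwise make it a new centroid. Returns {centroid: [members]}."""
    clusters = {}
    for seq in variants:
        for centroid in clusters:
            if identity(seq, centroid) >= threshold:
                clusters[centroid].append(seq)
                break
        else:
            clusters[seq] = [seq]
    return clusters

base = "ACGT" * 25                       # 100 nt made-up sequence
v1 = base
v2 = mutate(base, [0])                   # 99% identical to v1
v3 = mutate(base, [10, 20])              # 98% identical to v1
v4 = mutate(base, range(0, 100, 10))     # 90% identical to v1
v5 = mutate(v4, [5])                     # 99% identical to v4

variants = [v1, v2, v3, v4, v5]
clusters = greedy_cluster(variants, threshold=0.97)

print("sequence variants before clustering:", len(variants))
print("OTUs after 97% clustering:", len(clusters))
```

In this made-up case five distinct sequence variants collapse into two OTUs, which is the same kind of drop you would see when comparing a DADA2 feature table to a 97%-clustered one.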