This is most likely because closed-reference OTU picking at 97% discards quite a few sequences (anything that is not at least 97% similar to some reference sequence!). That may or may not be desirable depending on your experimental design. If you used de novo OTU picking in QIIME 1, you would probably end up with as many or more OTUs than DADA2...
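To make the read loss concrete, here is a toy sketch in Python. It is not how QIIME 1 actually does it (real closed-reference picking uses an aligner-based search), and the function names, the position-by-position identity measure, and the example sequences are all made up for illustration. It only shows the rule that a read with no reference hit at or above the threshold is thrown away:

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences.

    Toy measure for illustration only; real tools compute identity
    from an alignment.
    """
    if len(a) != len(b):
        raise ValueError("this toy example only compares equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def closed_reference_filter(reads, references, threshold=0.97):
    """Split reads into those that hit a reference at >= threshold and those that do not."""
    kept, dropped = [], []
    for read in reads:
        best = max(identity(read, ref) for ref in references)
        (kept if best >= threshold else dropped).append(read)
    return kept, dropped


# Hypothetical 100 nt reference and three hypothetical reads.
reference = "ACGT" * 25
reads = [
    reference,                   # identical to the reference
    "T" + reference[1:],         # 1 mismatch  -> 99% identity
    "TTTTTTTT" + reference[8:],  # 6 mismatches -> 94% identity
]

kept, dropped = closed_reference_filter(reads, [reference])
print(f"kept {len(kept)} reads, dropped {len(dropped)} reads")
```

The read at 94% identity never makes it into the feature table, whereas de novo OTU picking or DADA2 would keep it.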
Yes, use feature-table summarize; it reports the number of features and the sequence counts per sample.
I feel your pain! I expect DADA2 and OTU picking will not handle this type of contamination any differently (i.e., it will impact both similarly).
We have an ongoing discussion about contamination control, in case you have any thoughts to share or just want to check out some of the ideas folks have been posting: Discussion: methods for removing contaminants and cross-talk