I have been analysing my 16S rRNA paired-end sequences from four different datasets of cucumber microbiome, and I am now writing my report. I want to give a summary of the total raw paired reads, and the sequences after the denoising step and after filtering of chloroplasts and mitochondria. Is there a way I can obtain these numbers?
Yes, just use the command qiime feature-table summarize with your final filtered dataset.
The "total frequency" is the total number of counts/reads of your dataset. The "number of features" is the number of different ASVs that were found in your dataset. In other words, your dataset has 691,800 reads spanning 645 different sequences.
There may be a cleverer way to do it, but I would just manually (in Excel or R) take the data from the "Interactive Sample Detail" tab of the qiime feature-table summarize output mentioned above and append it to the table you are showing.
(Matthew Ryan Dillon)
Congratulations on finishing your analysis, @lilycrook! @vheidrich's answer covers your questions pretty well - I just have a couple more breadcrumbs for you. Because most people only use the DADA2 denoising stats for diagnostic work (rather than publications), there aren't links at this time to export most of the tables in that visualization.
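To get the denoising stats data without a lot of copy-pasting, you can run qiime tools export on your denoising stats artifact. By default, this writes a .tsv into the output directory you specify. E.g. (the file names here are placeholders, substitute your own):
qiime tools export --input-path dada2-stats.qza --output-path exported-dada2-stats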
There are a couple ways you could tackle getting the per-sample frequencies from your table. If you need a programmatic solution, there are directions in other forum posts on how to export your feature table and make it into a .tsv. You'll still have to do some work with the table to sum the frequencies across features. Python, R, whatever will do this for you in a reproducible way.
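As a rough sketch of the "sum the frequencies across features" step, here is one way to do it in plain Python. It assumes the table has already been exported and converted to a .tsv laid out the way `biom convert --to-tsv` typically writes it (an optional comment line, then a header row of feature-ID label plus sample IDs, then one row per feature). The function name `per_sample_totals` and the tiny example table are made up for illustration; check the layout of your own file before relying on it.

```python
import csv
import io


def per_sample_totals(tsv_text):
    """Sum feature counts per sample from a feature table in TSV form.

    Assumed layout (not guaranteed for every export):
      - optional comment line such as '# Constructed from biom file'
      - header row: feature-ID label, then one column per sample ID
      - one row per feature: feature ID, then counts
    Sample IDs are kept as text, so leading zeros survive.
    """
    header = None
    totals = {}
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row or all(not cell.strip() for cell in row):
            continue  # skip blank lines
        if header is None:
            if len(row) < 2:
                continue  # comment line with no sample columns
            header = row[1:]
            totals = {sample_id: 0.0 for sample_id in header}
            continue
        for sample_id, value in zip(header, row[1:]):
            totals[sample_id] += float(value)
    return totals


# Made-up table, purely to show the expected shape of the input.
example_table = (
    "# Constructed from biom file\n"
    "#OTU ID\t001\t002\n"
    "ASV_a\t10.0\t0.0\n"
    "ASV_b\t5.0\t7.0\n"
)
totals = per_sample_totals(example_table)
```

For a real file you would pass in the file's contents, e.g. `per_sample_totals(open(path).read())`, where `path` points at your own exported table.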
If you don't need a programmatic approach, you can just copy-paste the Sample and Feature Count columns from the interactive sample detail page into a spreadsheet. That spreadsheet could be your already-exported dada2-stats, or a new .tsv. If you like Excel, you can use an IF formula to match sample IDs. If you go this route, make sure your sample IDs are always stored as plain text. If you paste them into a number-formatted column in Excel, leading zeros will be dropped, and your sample IDs may no longer match each other or the rest of your data.
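If you want a quick sanity check that nothing happened to your sample IDs along the way, something like the following would flag mismatches between two ID columns. This is only a sketch: `compare_sample_ids` is a hypothetical helper, and the ID lists are invented to mimic what a number-formatted column could do to zero-padded IDs.

```python
def compare_sample_ids(ids_a, ids_b):
    """Return (only_in_a, only_in_b) as sorted lists, comparing IDs as text."""
    a = {str(i).strip() for i in ids_a}
    b = {str(i).strip() for i in ids_b}
    return sorted(a - b), sorted(b - a)


# Invented IDs: the second list is what the first might look like
# after passing through a number-formatted spreadsheet column.
stats_ids = ["001", "002", "010"]
pasted_ids = ["1", "2", "10"]

only_in_stats, only_in_pasted = compare_sample_ids(stats_ids, pasted_ids)
ids_match = not only_in_stats and not only_in_pasted
```

If both returned lists are empty, the two columns contain the same IDs; anything left over is worth a look before merging.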
1. Export your DADA2 denoising stats (as @ChrisKeefe showed you, above)
2. Export the "frequency per sample detail" CSV from your unfiltered feature table (as I showed above)
3. Filter chloroplasts and mitochondria (see link above)
4. Export the "frequency per sample detail" CSV from your filtered feature table (as I showed above)
5. Combine the results, either:
a. Merge all of these tables manually (using a spreadsheet tool, for example)
b. If you want to generate a visualization of these results, format them as TSVs and run qiime metadata tabulate, specifying all of the TSV files (the merging will automatically handle matching the IDs)
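If you would rather script the manual merge in option (a), a minimal sketch in plain Python might look like this. It assumes you have already read each exported file into a `{sample_id: count}` mapping; the helper names (`merge_by_sample_id`, `to_tsv`), the column names, and all of the numbers are invented for illustration. The output uses a `sample-id` header so that it could also be used as an input for option (b).

```python
import csv
import io


def merge_by_sample_id(columns):
    """Combine several {sample_id: value} mappings into rows keyed by sample ID.

    `columns` maps an output column name to a per-sample mapping. A sample
    missing from a mapping gets an empty cell, so samples that were dropped
    at some step remain visible in the summary.
    """
    sample_ids = sorted(set().union(*(m.keys() for m in columns.values())))
    header = ["sample-id"] + list(columns)
    rows = [
        [sid] + [str(columns[name].get(sid, "")) for name in columns]
        for sid in sample_ids
    ]
    return header, rows


def to_tsv(header, rows):
    """Serialise the merged table as tab-separated text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


# Made-up counts; sample IDs are strings so leading zeros are preserved.
summary_columns = {
    "raw-input": {"001": 1000, "002": 1200},
    "non-chimeric": {"001": 800, "002": 950},
    "after-organelle-filter": {"001": 750},
}
header, rows = merge_by_sample_id(summary_columns)
merged_tsv = to_tsv(header, rows)
```

Leaving a blank cell for a missing sample is a deliberate choice here: it makes it obvious when a sample disappeared entirely at the filtering step, rather than silently reporting zero.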