Hi All,
I am trying to reproduce a published study using QIIME 2: https://www.ncbi.nlm.nih.gov/pubmed/30171008
I have been working on this for weeks with very little progress, and I would really appreciate it if someone could kindly guide me.
The paper states: "Raw forward and reverse reads were aligned using fastq-join (32) and combined into a single fastq file using the split_libraries tool, which truncates reads with three consecutive base calls that exhibit a Phred score below 19. In total, 8,839,451 sequences (61.57% of total) were assembled and deemed passable following quality filtering."
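To make sure I am reading that sentence correctly, here is a minimal Python sketch of how I interpret the truncation rule: scan the quality scores and cut the read at the start of the first run of three consecutive bases with a Phred score below 19. This is only my assumption about what the sentence means, not the actual split_libraries implementation, and the function names `truncate_at_low_quality_run` and `phred33_to_scores` are hypothetical.

```python
def phred33_to_scores(quality_string):
    """Convert a FASTQ quality string (assumed Phred+33 encoding) to integer scores."""
    return [ord(ch) - 33 for ch in quality_string]


def truncate_at_low_quality_run(quals, threshold=19, run_length=3):
    """Return how many leading bases to keep.

    Assumption: the read is cut at the first base of the first run of
    `run_length` consecutive scores that are below `threshold`.
    """
    run = 0
    for i, q in enumerate(quals):
        if q < threshold:
            run += 1
            if run == run_length:
                # Cut at the position where the low-quality run started.
                return i - run_length + 1
        else:
            run = 0
    # No qualifying run was found, so the whole read is kept.
    return len(quals)


# Toy read, made up for illustration only.
seq = "ACGTACGTAC"
quals = [38, 37, 36, 12, 35, 30, 10, 15, 8, 33]
keep = truncate_at_low_quality_run(quals)
print(seq[:keep])  # prints "ACGTAC"
```

In this toy read the single low score at position 4 does not trigger truncation, but the run of three low scores starting at position 7 does.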
The commands I have used are:
qiime tools import \
--type 'SampleData[PairedEndSequencesWithQuality]' \
--input-path original-data-set \
--input-format CasavaOneEightSingleLanePerSampleDirFmt \
--output-path demux.qza
qiime demux summarize \
--i-data demux.qza \
--o-visualization demux.qzv
qiime dada2 denoise-paired \
--i-demultiplexed-seqs demux.qza \
--p-trunc-len-f 0 \
--p-trunc-len-r 0 \
--p-trunc-q 19 \
--o-table table.qza \
--o-representative-sequences rep-seqs.qza \
--o-denoising-stats denoising-stats.qza
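My guess is that a percentage comparable to the paper's 61.57% could be computed from the denoising stats by dividing the reads that survive by the input reads, summed over all samples. Below is a minimal Python sketch of that arithmetic. It assumes the stats have been exported to a tab-separated table with columns named `input` and `merged` plus a `#q2:types` line under the header. Those column names are my assumption about the file layout, the helper `percent_retained` is hypothetical, and the numbers are made up purely for illustration.

```python
import csv
import io


def percent_retained(stats_tsv, kept_column="merged", input_column="input"):
    """Percentage of input reads that remain in `kept_column`, summed over samples."""
    reader = csv.DictReader(io.StringIO(stats_tsv), delimiter="\t")
    # Skip metadata directive lines such as "#q2:types".
    rows = [r for r in reader if not r["sample-id"].startswith("#")]
    total_in = sum(int(float(r[input_column])) for r in rows)
    total_kept = sum(int(float(r[kept_column])) for r in rows)
    return 100.0 * total_kept / total_in


# Toy table with invented read counts, not real results.
toy_stats = (
    "sample-id\tinput\tfiltered\tdenoised\tmerged\tnon-chimeric\n"
    "#q2:types\tnumeric\tnumeric\tnumeric\tnumeric\tnumeric\n"
    "S1\t1000\t800\t780\t700\t650\n"
    "S2\t2000\t1500\t1450\t1300\t1200\n"
)
print(round(percent_retained(toy_stats), 2))  # prints 66.67
```

Whether `merged` or `non-chimeric` is the right column to compare against the paper's figure is part of what I am unsure about.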
My questions are:
- How can I reproduce the step that "truncates reads with three consecutive base calls" below a given Phred score?
- How can I see what percentage of sequences were assembled and passed the DADA2 quality filtering, comparable to the 61.57% reported in the paper?
- DADA2 also has a --p-max-ee parameter. What does it do? I have read the explanation in the user documentation but still cannot understand it. Could someone give an example to illustrate?
- I have run the DADA2 denoising step several times and the results are not consistent: sometimes it completes successfully, and other times it takes much longer and then fails. Is there a known reason for this, and is there a better approach?
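On the --p-max-ee question, my current understanding (which may be wrong) is that DADA2 computes a read's "expected errors" as the sum of 10^(-Q/10) over its quality scores and discards reads whose value exceeds the threshold. Here is a minimal Python sketch of that calculation so someone can correct me if I have misread it. The function names are hypothetical and the threshold of 2.0 is only an illustrative value.

```python
def expected_errors(quals):
    """Sum of per-base error probabilities, where P(error) = 10 ** (-Q / 10)."""
    return sum(10 ** (-q / 10) for q in quals)


def passes_max_ee(quals, max_ee=2.0):
    """True if the read's expected errors do not exceed `max_ee` (illustrative threshold)."""
    return expected_errors(quals) <= max_ee


# Two toy reads of 100 bases each, made up for illustration.
good_read = [20] * 100  # every base has a 1% error probability
poor_read = [10] * 100  # every base has a 10% error probability

print(round(expected_errors(good_read), 3), passes_max_ee(good_read))  # 1.0 True
print(round(expected_errors(poor_read), 3), passes_max_ee(poor_read))  # 10.0 False
```

If this reading is right, a lower threshold is stricter, and longer reads accumulate more expected errors even at the same per-base quality.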
Thank you for your help!
Many thanks again!
Shuqi