Hi
I have already completed my 18S amplicon sequencing.
Here are the primers I used for PCR:
FR-1 = ANCCATTCAATCGGTANT
Fun18s = CCATGCATGTCTAAGTWTAA
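Both primers contain degenerate IUPAC bases (N and W), so it can be handy to have their reverse complements available when checking primer orientation against reads or reference sequences. The following is only a minimal sketch; `revcomp` is a hypothetical helper name, not part of any tool mentioned in this post.

```shell
# Reverse-complement a primer, including IUPAC degenerate codes.
# revcomp is a hypothetical helper written for this sketch.
# N, W and S are their own complements, so they are left unchanged.
revcomp() {
  printf '%s\n' "$1" | rev | tr 'ACGTRYKMBDHVacgtrykmbdhv' 'TGCAYRMKVHDBtgcayrmkvhdb'
}

revcomp ANCCATTCAATCGGTANT      # FR-1   -> ANTACCGATTGAATGGNT
revcomp CCATGCATGTCTAAGTWTAA    # Fun18s -> TTAWACTTAGACATGCATGG
```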
I used a MinION Mk1B for sequencing; the library was prepared with the Oxford Nanopore Rapid Barcoding Kit (SQK-RBK114.24).
MinKNOW was used for basecalling, adapter trimming, and barcode trimming.
The sequencing runs were stopped after every sample had reached at least 1.5k reads.
I used this command to combine my FASTQ files on Linux:
cat *.fastq.gz > merged.fastq.gz
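Concatenated gzip files remain valid gzip, so the merged file can be read directly. As a sanity check on the 1.5k-read target, the read count per file can be tallied roughly as below. This is a sketch under assumptions: `count_reads` is a hypothetical helper, and the loop assumes the `.fastq.gz` files sit in the current directory. Note that if `merged.fastq.gz` already exists from an earlier run, the `*.fastq.gz` glob will pick it up as well.

```shell
# Count reads in a gzipped FASTQ file (4 lines per record).
# count_reads is a hypothetical helper name used only for this sketch.
count_reads() {
  gzip -cd "$1" | awk 'END { print int(NR / 4) }'
}

# Flag any file below the 1.5k-read target.
# Assumes the .fastq.gz files are in the current directory.
for f in *.fastq.gz; do
  [ -e "$f" ] || continue        # no matches: skip the literal pattern
  n=$(count_reads "$f")
  if [ "$n" -lt 1500 ]; then
    echo "$f: only $n reads"
  fi
done
```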
The FASTQ file was then fed into FastQC and MultiQC in Galaxy (https://usegalaxy.org/).
The insights from FastQC and MultiQC were used to set up the later quality control steps in fastp and Porechop, also in Galaxy (https://usegalaxy.org/).
Porechop
- Output format for the reads: fastq
Barcode binning setting
- Percent identity for binning = 75
- Minimum difference in identity = 5.0
- Require two matches for binning = No
- Discard unassigned reads = No
Adapter search setting
- Minimum identity = 90
- Number of aligned reads = 10000
- Scoring scheme = 3,-6,-5,-2
End adapter setting
- Number of bases = 150
- Minimum trim length = 4
- Extra bases trimmed = 2
- Minimum percent identity = 75
Middle adapter setting
- Skip splitting = No
- Discard reads with middle adapter = No
- Minimum percent identity = 85
- Extra trimming good side = 10
- Extra trimming bad side = 100
- Minimum length reads post-split = 1000
fastp
Single-end
Adapter trimming options
Filter options
- Qualified quality phred = 9
Length filtering options
- Length required = 200
- Maximum length = 1000
Read modification options
- PolyG tail trimming = Disable polyG tail trimming
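To see how many reads the 200 to 1000 bp length window keeps, the lengths can be tallied independently of fastp. The snippet below is a rough sketch, not a reimplementation of fastp: it only looks at read length (treating both bounds as inclusive) and ignores the quality filter. `length_window` is a hypothetical helper name.

```shell
# Count reads whose length falls inside [min, max] (inclusive).
# Prints "<kept> <total>". length_window is a hypothetical helper.
length_window() {
  gzip -cd "$1" | awk -v min="$2" -v max="$3" '
    NR % 4 == 2 {                      # sequence line of each FASTQ record
      total++
      n = length($0)
      if (n >= min && n <= max) kept++
    }
    END { printf "%d %d\n", kept, total }'
}

# Example call (file name is an assumption):
# length_window merged.fastq.gz 200 1000
```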
Here are the FastQC and MultiQC results after quality control:
Next, the FASTQ files were imported into the QIIME 2 Amplicon Distribution environment:
qiime tools import \
--type 'SampleData[SequencesWithQuality]' \
--input-path manifest_18s_dr.tsv \
--output-path single-end-demux.qza \
--input-format SingleEndFastqManifestPhred33V2
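For context, the SingleEndFastqManifestPhred33V2 format expects a tab-separated file with the columns `sample-id` and `absolute-filepath`. A manifest of that shape could be generated roughly as follows. This is a sketch under assumptions: `make_manifest` is a hypothetical helper, and it assumes one quality-filtered file per sample named `<sample-id>.fastq.gz`, which may not match the actual file layout used here.

```shell
# Write a single-end V2 manifest for every *.fastq.gz in a directory.
# make_manifest is a hypothetical helper; it assumes one file per sample,
# named <sample-id>.fastq.gz.
make_manifest() {
  dir=$(cd "$1" && pwd)                       # resolve to an absolute path
  printf 'sample-id\tabsolute-filepath\n'     # header required by the format
  for f in "$dir"/*.fastq.gz; do
    [ -e "$f" ] || continue                   # no matches: skip the pattern
    printf '%s\t%s\n' "$(basename "$f" .fastq.gz)" "$f"
  done
}

# Example call (directory name is an assumption):
# make_manifest qc_reads > manifest_18s_dr.tsv
```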
qiime demux summarize \
--i-data single-end-demux.qza \
--o-visualization demux.qzv
qiime vsearch dereplicate-sequences \
--i-sequences single-end-demux.qza \
--o-dereplicated-table table.qza \
--o-dereplicated-sequences rep-seqs.qza
qiime feature-table summarize \
--i-table table.qza \
--o-visualization table.qzv \
--m-sample-metadata-file metadata_18s.tsv
qiime feature-table tabulate-seqs \
--i-data rep-seqs.qza \
--o-visualization rep-seqs.qzv
Then I used q2-feature-classifier.
I downloaded and imported the pre-formatted SILVA reference sequence and taxonomy files available from the QIIME 2 Data Resources page:
Silva 138 SSURef NR99 full-length sequences (MD5: de8886bb2c059b1e8752255d271f3010)
Silva 138 SSURef NR99 full-length taxonomy (MD5: f12d5b78bf4b1519721fe52803581c3d)
qiime feature-classifier extract-reads \
--i-sequences silva-138-99-seqs.qza \
--p-f-primer ANCCATTCAATCGGTANT \
--p-r-primer CCATGCATGTCTAAGTWTAA \
--o-reads silva-138-99-ref-seqs-region.qza
qiime feature-classifier fit-classifier-naive-bayes \
--i-reference-reads silva-138-99-ref-seqs-region.qza \
--i-reference-taxonomy silva-138-99-tax.qza \
--o-classifier silva-138-99-classifier.qza
qiime feature-classifier classify-sklearn \
--i-classifier silva-138-99-classifier.qza \
--i-reads rep-seqs.qza \
--o-classification taxonomy.qza
qiime metadata tabulate \
--m-input-file taxonomy.qza \
--o-visualization taxonomy.qzv
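Besides the taxonomy.qzv view, the assignments can be tallied on the command line once taxonomy.qza has been exported with `qiime tools export`, which writes a `taxonomy.tsv` with the columns Feature ID, Taxon and Confidence. The sketch below counts features per second-level label, which makes it easy to see how many features stop at the domain level. `tally_level2` is a hypothetical helper, and the "(no level 2)" label is my own placeholder.

```shell
# Tally second-level labels from an exported taxonomy.tsv
# (columns: Feature ID, Taxon, Confidence). tally_level2 is hypothetical.
tally_level2() {
  awk -F'\t' '
    NR > 1 {                                  # skip the header row
      split($2, t, ";")
      lab = t[2]
      gsub(/^ +| +$/, "", lab)                # strip padding around the label
      if (lab == "") lab = "(no level 2)"     # taxon string stops at level 1
      c[lab]++
    }
    END { for (k in c) printf "%d\t%s\n", c[k], k }' "$1" | sort -rn
}

# Example call (export directory name is an assumption):
# qiime tools export --input-path taxonomy.qza --output-path exported-taxonomy
# tally_level2 exported-taxonomy/taxonomy.tsv
```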
Next, I proceeded with closed-reference clustering at 90% identity:
qiime vsearch cluster-features-closed-reference \
--i-table table.qza \
--i-sequences rep-seqs.qza \
--i-reference-sequences silva-138-99-seqs.qza \
--p-perc-identity 0.90 \
--o-clustered-table table-cr-90.qza \
--o-clustered-sequences rep-seqs-cr-90.qza \
--o-unmatched-sequences unmatched-cr-90.qza
qiime feature-table summarize \
--i-table table-cr-90.qza \
--o-visualization table-cr-90.qzv \
--m-sample-metadata-file metadata_18s.tsv
qiime feature-table tabulate-seqs \
--i-data rep-seqs-cr-90.qza \
--o-visualization rep-seqs-cr-90.qzv
Lastly, I constructed a taxa bar plot:
qiime taxa barplot \
--i-table table-cr-90.qza \
--i-taxonomy taxonomy.qza \
--m-metadata-file metadata_18s.tsv \
--o-visualization taxa-bar-plots-cr-90.qzv
Here is the taxa bar plot
Here are my questions:
- How can I determine the phylum of the features labeled d__Eukaryota;__?
- What is the correct way to remove the phyla that are not Fungi (and also d__Bacteria;__) from the taxa bar plot?
- Why is Fungi itself listed at the phylum level? Can I assign d__Eukaryota;p__Fungi to a finer taxonomic level?