Importing metagenomics files after sortmerna

You_y_Choi · October 17, 2023, 4:24am

Hello, experts.
I am trying to import SortMeRNA-aligned FASTQ files, which were generated after running Trimmomatic and Bowtie2 for host removal.

The command was:
qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-format PairedEndFastqManifestPhred33V2 \
  --input-path manifest.txt \
  --output-path demux.qza

The manifest file looks like this:
sample-id forward-absolute-filepath reverse-absolute-filepath
CON /Users/aligned_rRNA/aligned_11.31p2_fwd.fq /Users/aligned_rRNA/aligned_11.31p2_rev.fq
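For reference, a minimal sketch of writing and sanity-checking a manifest like this one. The PairedEndFastqManifestPhred33V2 format expects tab-separated columns, which is easy to lose when a manifest is pasted into a post. The helper names `write_manifest` and `check_manifest` are hypothetical, and the temporary file stands in for manifest.txt.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: write a paired-end manifest with tab-separated columns.
write_manifest() {
  local out="$1" sample="$2" fwd="$3" rev="$4"
  printf 'sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n' > "$out"
  printf '%s\t%s\t%s\n' "$sample" "$fwd" "$rev" >> "$out"
}

# Hypothetical check: succeed only if every line has exactly three tab-separated fields.
check_manifest() {
  awk -F'\t' 'NF != 3 { bad = 1 } END { exit bad }' "$1"
}

# A temporary file is used here in place of manifest.txt.
manifest="$(mktemp)"
write_manifest "$manifest" CON \
  /Users/aligned_rRNA/aligned_11.31p2_fwd.fq \
  /Users/aligned_rRNA/aligned_11.31p2_rev.fq

check_manifest "$manifest" && echo "manifest OK"
```

The check only looks at the column layout; it does not verify that the FASTQ paths exist or are absolute.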

I have also attached the demux.qzv file.

All quality-control steps were already done in the metagenomics pipeline, so I don't need to use DADA2, right?

qiime feature-classifier classify-sklearn \
  --i-classifier silva-138-99-nb-classifier.qza \
  --i-reads demux.qza \
  --o-classification taxonomy.qza

(1/1) Invalid value for '--i-reads': Expected an artifact of at least type FeatureData[Sequence]. An artifact of type SampleData[PairedEndSequencesWithQuality] was provided.

How should I import my files to proceed with taxonomy analysis? Or, if I made a mistake somewhere, please provide guidance.
demux.qzv (310.1 KB)

Many thanks

jphagen · October 17, 2023, 8:52pm

Hi @You_y_Choi,

You are correct that you do not need DADA2. I believe it would be more useful in your case to use the QIIME 2 Shotgun Distribution. This workflow lets you skip DADA2 and use Kraken to classify your metagenomic reads. If I am misinterpreting your question, feel free to give me more detail so I can better understand your goal. I hope this helps and that you are able to get your data classified.

-Hannah

You_y_Choi · October 17, 2023, 9:17pm

Thank you, Hannah.

I am trying to perform 16S and 18S analysis using the Silva 138 database. However, the imported files did not work for taxonomy analysis. This issue seems very similar to the one discussed in this forum post: https://forum.qiime2.org/t/working-with-extracted-16s-fastq-files/26098

How should I proceed with the import?
I also uploaded the fq files, metadata, and manifest files.
manifest.txt (152 Bytes)
metadata.txt (26 Bytes)
aligned_11.31p2_fwd.fastq (4.7 MB)
aligned_11.31p2_rev.fastq (4.6 MB)

jphagen · October 17, 2023, 10:43pm

Hi @You_y_Choi,
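Your import command is correct.

But your downstream analysis is failing because demux.qza is not the right artifact type: classify-sklearn expects FeatureData[Sequence], and you provided SampleData[PairedEndSequencesWithQuality].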

To clarify, are you working with metatranscriptomic data or metagenomic data? I am trying to figure out your goal for this analysis so I can help you with relevant downstream analysis.

If it is metagenomic data, you can extract 16S and 18S reads from your sequences, but the results will underperform compared to running a metagenomic analysis, or to having 16S data and running an amplicon analysis from there. If you want to proceed in this direction, the command would be as follows:

qiime feature-classifier extract-reads

But I would again recommend our shotgun distribution for metagenomic data as it is more applicable to your data.

If you have metatranscriptomic data, we do not currently have an implementation for this type of sequencing.

-Hannah

You_y_Choi · October 18, 2023, 2:27am

Hello, @jphagen
Yes, these are metagenomic reads.
I followed your suggestion and ran the following commands:

qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-format PairedEndFastqManifestPhred33V2 \
  --input-path manifest.txt \
  --output-path demux.qza

The import command worked fine, and it imported manifest.txt as PairedEndFastqManifestPhred33V2 to demux.qza.

However, when I ran the qiime feature-classifier extract-reads command, I encountered some issues:

qiime feature-classifier extract-reads \
  --i-sequences demux.qza \
  --o-reads feature_data_sequence

I received the following errors:

(1/3) Invalid value for '--i-sequences': Expected an artifact of at least type FeatureData[Sequence]. An artifact of type SampleData[PairedEndSequencesWithQuality] was provided.
(2/3) Missing option '--p-f-primer'.
(3/3) Missing option '--p-r-primer'.

It seems that there are some problems with the command you provided. I will also explore the "Qiime2 Shotgun" approach, but I'd like to resolve this issue since many research papers also use this method.

Thanks for your help.

jphagen · October 18, 2023, 9:41pm

Hi @You_y_Choi,

I see.
To fix the first error, you need to get from your demultiplexed reads to FeatureData[Sequence] and FeatureTable[Frequency]. Since you have paired-end data, you can try this workflow:

qiime vsearch merge-pairs ...
qiime vsearch dereplicate-sequences
qiime feature-classifier extract-reads 
qiime feature-classifier classify-sklearn

The last two errors are asking you to provide the forward and reverse primers as parameters (--p-f-primer and --p-r-primer).

What variable region are you targeting? That will help us figure out which primers to use.

-Hannah

You_y_Choi · October 18, 2023, 10:17pm

Hello, @jphagen !

qiime vsearch merge-pairs \
  --i-demultiplexed-seqs demux.qza \
  --o-merged-sequences demux-joined.qza
Saved SampleData[JoinedSequencesWithQuality] to: demux-joined.qza

qiime vsearch dereplicate-sequences \
  --i-sequences demux-joined.qza \
  --o-dereplicated-table table.qza \
  --o-dereplicated-sequences req-seqs.qza
Saved FeatureTable[Frequency] to: table.qza
Saved FeatureData[Sequence] to: req-seqs.qza

I followed your suggestion. Do metagenome-derived sequences also require primers?
If so, I am planning to use protozoa-specific primers targeting the 18S rRNA gene:

qiime feature-classifier extract-reads \
  --i-sequences req-seqs.qza \
  --p-f-primer GACTAGGGATTGGAGTGG \
  --p-r-primer AATTGCAAAGATCTATCCC \
  --o-reads 18S_reads.qza
Plugin error from feature-classifier:

No matches found

Debug info has been saved to /var/folders/bd/kdj7l_yx1zvf2hfp96yc8s4w0000gn/T/qiime2-q2cli-err-wm9501j0.log

Thank you for your help

jphagen · October 18, 2023, 10:28pm

Hi @You_y_Choi,

No, but if you want to extract a specific 16S or 18S region, then you will need the primers for that region.

It looks like those primer sequences were not found in your reads. If you can't extract a region, I would give Silva a try and see if you can make any progress there.
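One rough way to confirm the "No matches found" result is to count how many reads contain a primer at all. The sketch below is an assumption-laden check, not what extract-reads does internally: it looks for exact matches of the primer or its reverse complement in a FASTQ file, whereas extract-reads tolerates some mismatches. The function names `revcomp` and `count_primer_hits` and the three-read demo file are made up for illustration.

```shell
#!/usr/bin/env bash
set -eu

# Reverse-complement a DNA string (plain A/C/G/T only, no IUPAC codes).
revcomp() {
  printf '%s\n' "$1" | rev | tr 'ACGTacgt' 'TGCAtgca'
}

# Count reads whose sequence line contains the primer or its reverse
# complement. Exact matching only, so this is stricter than extract-reads.
count_primer_hits() {
  local fastq="$1" primer="$2" rc
  rc="$(revcomp "$primer")"
  # In a four-line FASTQ record, the sequence is line 2 of every record.
  # grep -c exits non-zero when the count is 0, hence the "|| true".
  awk 'NR % 4 == 2' "$fastq" | grep -c -e "$primer" -e "$rc" || true
}

# Synthetic demo data (not from the thread): read1 carries the forward
# primer, read2 carries its reverse complement, read3 carries neither.
demo="$(mktemp)"
cat > "$demo" <<'EOF'
@read1
AAAAGACTAGGGATTGGAGTGGTTTT
+
IIIIIIIIIIIIIIIIIIIIIIIIII
@read2
TTTTCCACTCCAATCCCTAGTCAAAA
+
IIIIIIIIIIIIIIIIIIIIIIIIII
@read3
ACGTACGTACGTACGTACGT
+
IIIIIIIIIIIIIIIIIIII
EOF

count_primer_hits "$demo" GACTAGGGATTGGAGTGG
count_primer_hits "$demo" AATTGCAAAGATCTATCCC
```

On the real data this would be pointed at the FASTQ files listed in the manifest instead of the demo file. A count of zero or near zero for either primer would be consistent with the extract-reads error.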

I would expect to see a number of unassigned sequences with this workflow. Just a heads up!

You_y_Choi · October 20, 2023, 2:00am

Thanks for your suggestion, @jphagen!

I just used the Silva 138 database and it worked.
As you suggested, I will also try the "Qiime2 Shotgun" approach.
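For anyone landing here later, the sequence of steps in this thread can be sketched as one script. This is an assumption about what the final working run looked like: the thread does not show the last command, so the classify-sklearn call on req-seqs.qza is inferred from the earlier posts. Option names and file names are copied from the commands above and may differ in other QIIME 2 releases. The `run` and `pipeline` helpers are hypothetical, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
set -eu

# Print each command; execute it only when DRY_RUN=0 (requires QIIME 2).
# The printed line is for reading, it is not re-quoted for the shell.
DRY_RUN="${DRY_RUN:-1}"
run() {
  printf '%s\n' "$*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

pipeline() {
  # 1. Import the paired-end FASTQ files listed in the manifest.
  run qiime tools import \
    --type 'SampleData[PairedEndSequencesWithQuality]' \
    --input-format PairedEndFastqManifestPhred33V2 \
    --input-path manifest.txt \
    --output-path demux.qza

  # 2. Merge forward and reverse reads.
  run qiime vsearch merge-pairs \
    --i-demultiplexed-seqs demux.qza \
    --o-merged-sequences demux-joined.qza

  # 3. Dereplicate to get FeatureTable[Frequency] and FeatureData[Sequence].
  run qiime vsearch dereplicate-sequences \
    --i-sequences demux-joined.qza \
    --o-dereplicated-table table.qza \
    --o-dereplicated-sequences req-seqs.qza

  # 4. Classify the dereplicated sequences against Silva 138
  #    (assumed final step; primer-based extract-reads is skipped).
  run qiime feature-classifier classify-sklearn \
    --i-classifier silva-138-99-nb-classifier.qza \
    --i-reads req-seqs.qza \
    --o-classification taxonomy.qza
}

pipeline
```

The key point is that classify-sklearn receives the FeatureData[Sequence] artifact from step 3 rather than demux.qza, which is what caused the original type error.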
