Can I construct an OTU table using Qiime 2

Hello,
My name is Brigitta. I want to know whether it is possible to construct an OTU table using Qiime2. If it is possible, is there a tutorial that I can follow to get my work done, as I am new to Qiime 2? Something like the scripts page in Qiime1 would be very helpful. Also, I came across something called artifacts in Qiime 2. How do I convert my fastq files to artifacts? Thank you in advance.
Best Regards,
Brigitta

Yes, we have all of these!

Tutorials: Tutorials — QIIME 2 2022.2.0 documentation
Qiime1 used scripts, and Qiime2 uses plugins: Available plugins — QIIME 2 2022.2.0 documentation

Each tutorial has a section on importing. All the ways to import data are listed here: Importing data — QIIME 2 2022.2.0 documentation

Here's a full overview of how Qiime2 works, including how and why we use Artifacts for everything in the new platform. Overview of QIIME 2 Plugin Workflows — QIIME 2 2022.2.0 documentation

And of course, let us know if you have any questions! :qiime2:

Thank you so much Colin!

Hi @Brigitta1,
The following thread discusses how to import IonTorrent data and may be helpful to you:

Luca

Thank you so much Luca!

Hello Colin,
I read through the document Importing data — QIIME 2 2022.2.0 documentation several times. I want to get my data imported in the Casava 1.8 single-end demultiplexed fastq format.

Should my command be:
qiime tools import \
  --type 'SampleData[SequencesWithQuality]' \
  --input-path casava-18-single-end-demultiplexed \
  --input-format CasavaOneEightSingleLanePerSampleDirFmt \
  --output-path demux-single-end.qza
Should "--input-path" be followed by the path to my .fastq.gz file?

Hi @Brigitta1,

Can you clarify what type of data you are trying to import? Is it the Ion Torrent data you mentioned in the other thread, a dataset from the tutorial, or some other Illumina sequences?

The importing step is one of the most difficult, so please do not worry if it takes a little bit of time.
The import command may differ depending on the type of data, hence the previous question.

For example, the command you asked about is very specific to an Illumina dataset. In fact, it requires that the file names are kept exactly as the Illumina sequencer produces them.
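To make the file-name requirement concrete, here is a minimal Python sketch that checks whether a file name follows the Casava 1.8 layout described in the importing tutorial (sample identifier, barcode identifier, lane number, read direction, set number, e.g. L2S357_15_L001_R1_001.fastq.gz). The pattern and the function name `parse_casava_name` are my own illustration, not the check QIIME 2 itself runs, so treat it as a rough pre-flight test only.

```python
import re

# Illustrative pattern for names such as L2S357_15_L001_R1_001.fastq.gz.
# This is an assumption-based sketch, not QIIME 2's internal validation.
CASAVA_PATTERN = re.compile(
    r"^(?P<sample_id>.+)_(?P<barcode>[^_]+)_L(?P<lane>\d{3})"
    r"_R(?P<read>[12])_(?P<set_number>\d{3})\.fastq\.gz$"
)


def parse_casava_name(filename):
    """Return the name fields as a dict, or None if the name does not fit."""
    match = CASAVA_PATTERN.match(filename)
    return match.groupdict() if match else None


# The second name is the style used in the manifest example of this thread.
for name in ["L2S357_15_L001_R1_001.fastq.gz", "S8-.fastq"]:
    fields = parse_casava_name(name)
    print(name, "->", fields if fields else "not a Casava 1.8 style name")
```

Files that do not fit this layout, such as FASTQ files converted from Ion Torrent BAM output, are better imported through a manifest file.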

Another option is the command in the link I sent, which allows IonTorrent data to be imported. However, it requires a bit more work from you. The command is:

qiime tools import \
  --type 'SampleData[SequencesWithQuality]' \
  --input-path se-33-manifest.csv \
  --output-path single-end-demux.qza \
  --input-format SingleEndFastqManifestPhred33

The 'type' option specifies what kind of thing you are going to import (in the case above, sequences with quality, i.e. fastq files, as in your case).

The 'output path' is the name of the final qiime2 artefact, which will contain the sequences.
The other two options are related: the 'input path' is the location of a file (the so-called manifest file) which tells qiime2 where to find the sequence files, and the 'input format' describes what kind of information the manifest file contains. How these two options fit together is described in the importing tutorial.

An example of a manifest file is described by @colinbrislawn in the linked thread:

sample-id,absolute-filepath,direction
# Lines starting with '#' are ignored and can be used to create
# "comments" or even "comment out" entries
sample-8,$PWD/some/filepath/S8-.fastq,forward
sample-9,$PWD/some/filepath/S9-.fastq,forward
sample-10,$PWD/some/filepath/S10-.fastq,forward
...
more samples
...
sample-46,$PWD/some/filepath/S46-.fastq,forward

The 'SingleEndFastqManifestPhred33' option tells qiime2 that the manifest file contains 3 columns, which have to be the columns shown in the example above. It also tells qiime2 that only one sequence file is used for each sample, and the direction of that file is specified in the manifest.
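If you have many samples, writing the manifest by hand gets tedious. Below is a minimal Python sketch that lists the FASTQ files in one folder and writes the three columns shown above. The function name `build_manifest`, the folder name, and the rule that the sample ID is the file name up to the first dot are all assumptions of mine, so adapt them to how your files are actually named.

```python
import csv
from pathlib import Path


def build_manifest(fastq_dir, manifest_path):
    """Write a single-end manifest CSV for every FASTQ file in fastq_dir."""
    fastq_dir = Path(fastq_dir).resolve()  # the manifest needs absolute paths
    # Accept both plain and gzip-compressed FASTQ files.
    files = sorted(
        p for p in fastq_dir.iterdir()
        if p.name.endswith((".fastq", ".fastq.gz"))
    )
    rows = []
    for path in files:
        # Assumption: the sample ID is the file name up to the first dot.
        sample_id = path.name.split(".")[0]
        rows.append((sample_id, str(path), "forward"))
    with open(manifest_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["sample-id", "absolute-filepath", "direction"])
        writer.writerows(rows)
    return rows


if __name__ == "__main__":
    # "my_fastq_files" is a hypothetical folder name.
    if Path("my_fastq_files").is_dir():
        for row in build_manifest("my_fastq_files", "se-33-manifest.csv"):
            print(row)
```

The resulting se-33-manifest.csv can then be passed to --input-path in the import command above. It is worth opening the file once to check that every sample ID is unique.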

Now, if you look at the importing tutorial, you will see that there are many 'input-format' values available for the qiime2 import command. These should cover most (if not all) of the data layouts in use to date!

For the artefacts and qiime2 objects, I suggest working a bit with the datasets provided with the tutorials. Even if it is not your final analysis, it would be good for you to become familiar with the commands and outputs. The datasets are small, so working with them should not take long to process!

Hope it helps
Luca

Thank you for your reply Luca.
The data I am trying to import are Ion Torrent data which I have already converted from BAM format to FASTQ.

Hello Brigitta,

What commands have you tried running? Did they give errors or output files? You can post any errors that you got and we can take a look.

If you got output files, try going to the next step in the process!

Hello,
I am trying to get my sequence files denoised.
The “Moving Pictures” tutorial — QIIME 2 2022.2.0 documentation says the following:
In the demux.qzv quality plots, we see that the quality of the initial bases seems to be high, so we won’t trim any bases from the beginning of the sequences. The quality seems to drop off around position 120, so we’ll truncate our sequences at 120 bases.
Can I please know what is the easiest way for me to analyze my interactive quality plot?

Thank you in advance,
Brigitta

Hi @Brigitta1,

Just to confirm, were you able to import the sequences into a qiime2 artefact?
If so, fantastic job!

Did you import the sequences already demultiplexed per sample, or were the sequences for all samples in the same fastq file? In the latter case, you need to demultiplex them as the next step.
After that, you need to use the cutadapt plugin to remove the sequencing adapters (if any) and the PCR primer sequences.
At that point, we can look at the sequence quality plot to figure out the trimming parameters.
For that, it would be really useful if you could post the command you used and share the artefact (of course, feel free to send the file privately as a direct message if you don't want it to be available to everybody in the forum).

Hope it helps
Luca

Hello @llenzi

Yes, I was able to import my files. Btw, you were right. Qiime2 is really user-friendly!
I used the following code to denoise my sample:
qiime dada2 denoise-single \
  --i-demultiplexed-seqs single-end-demux.qza \
  --p-trim-left 0 \
  --p-trunc-len 170 \
  --o-representative-sequences rep-seqs-dada2.qza \
  --o-table table-dada2.qza \
  --o-denoising-stats stats-dada2.qza
My code is getting processed atm. I used the value 170 because the quality (score showed under median) started decreasing (below 28). I am a little worried because my sequence base length went up to 450, so a huge portion of my sequence was trimmed
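The rule of thumb described here (cut where the median quality first drops below a chosen value) can be written down as a small Python sketch. The function name `suggest_trunc_len` and the list of median scores are made up for illustration; the real medians would be read off the interactive quality plot in the .qzv file, and the threshold of 28 is only the value used in this post, not an official QIIME 2 recommendation.

```python
def suggest_trunc_len(median_quality, threshold):
    """Count the leading positions whose median quality is >= threshold."""
    for index, quality in enumerate(median_quality):
        if quality < threshold:
            # index equals the number of positions before the first drop.
            return index
    # The quality never drops below the threshold: keep the full length.
    return len(median_quality)


# Hypothetical per-position median scores, NOT taken from real data.
medians = [32, 32, 31, 30, 29, 27, 28, 25]
print(suggest_trunc_len(medians, threshold=28))
```

A single dip can make this heuristic cut very early, so it is a starting point for looking at the plot rather than a replacement for it.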

Hi @Brigitta1,
nice work!

A couple of points. First, dada2 denoise-single is not suitable for IonTorrent data; you should use denoise-pyro instead.

Second, I agree that your truncation length looks short. I am not familiar with the quality profile of IonTorrent data, and the quality thresholds in the tutorial are aimed at Illumina sequences, so you may need to use other values in your case. Because you are using single-end sequences, the read length will influence the taxonomic assignment later, so I guess you can relax the threshold and keep longer reads.

Hope it helps
Luca

Hello @llenzi

I am sorry, I didn't quite get you. Are you saying that my command should have looked something like this:
qiime dada2 denoise-pyro \
  --i-demultiplexed-seqs single-end-demux.qza \
  --p-trim-left 0 \
  --p-trunc-len 170 \
  --o-representative-sequences rep-seqs-dada2.qza \
  --o-table table-dada2.qza \
  --o-denoising-stats stats-dada2.qza

Also, I read this in an Ion Torrent manual:
The Phred-like Q score measures accuracy on a logarithmic scale: Q10 = 90%, Q20 = 99%, Q30 = 99.9%, Q40 = 99.99%, and Q50 = 99.999%
So do you think that trimming the reads where the quality score is below 30 is a good choice?
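The numbers quoted from the manual follow from the standard Phred definition, where the error probability is 10^(-Q/10). A minimal Python sketch of that conversion is given below; the function name `phred_to_accuracy` is just an illustrative choice.

```python
def phred_to_accuracy(q):
    """Return the base-call accuracy (between 0 and 1) for Phred score q."""
    error_probability = 10 ** (-q / 10)
    return 1 - error_probability


# Print the accuracy for the scores listed in the manual.
for q in (10, 20, 30, 40, 50):
    print(f"Q{q}: {phred_to_accuracy(q):.3%} accuracy")
```

Read this way, a Q28 base has an error probability of about 0.16%, so a threshold of 28 or 30 is a judgment call rather than a hard boundary.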

Thank you in advance
Brigitta

Hi @Brigitta1,
Yes, your command should be something like that!
For the trimming quality, I think we should see the quality profiles for your sequences, as shown in the qzv artefact.

Luca

Hello,
I successfully got through the taxonomy analysis step. I had five samples, to begin with. I want to know how I can know the bacterial composition in each sample separately.
Best Regards,
Brigitta

Hi @Brigitta1 ,
well done!
Can I ask you to start a different topic for this question?
That would keep the forum tidy, because this part would go off topic!
Cheers
Luca

Hi @llenzi
Can I know what happens when I run the following line?
--p-trunc-len 170
Does it cut off the last 170 bases, or does it cut off the bases starting from the 170th position?
Thank you in advance,
Brigitta

Hi @Brigitta1
It does the second thing you said: each read is cut at position 170, so all the representative sequences produced will be 170 bp long!
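To illustrate, here is a small Python sketch of the truncation behaviour as the DADA2 plugin documents it: reads are cut at position trunc_len, reads shorter than trunc_len are discarded, and a value of 0 means no truncation or length filtering. This is my own toy model on plain strings, not the code DADA2 runs, and `truncate_reads` is a hypothetical name.

```python
def truncate_reads(reads, trunc_len, trim_left=0):
    """Mimic --p-trunc-len / --p-trim-left on a list of sequence strings."""
    kept = []
    for read in reads:
        if trunc_len == 0:
            # 0 means: do not truncate and do not filter by length.
            kept.append(read[trim_left:])
            continue
        if len(read) < trunc_len:
            # Reads shorter than the truncation length are dropped entirely.
            continue
        # Keep positions trim_left+1 .. trunc_len (1-based).
        kept.append(read[trim_left:trunc_len])
    return kept


# Toy reads: the second one is shorter than the truncation length.
print(truncate_reads(["ACGTACGT", "ACG"], trunc_len=5))
```

One consequence worth keeping in mind with reads of very variable length: every read shorter than 170 bases is lost, not just shortened.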

Cheers
Luca

Thank you so much @llenzi