What type of sequencing data is used in this tutorial?

[Edited by moderator to show Google translation into English]:
Thanks to the developers and translators, the quality of the tutorial translation is great! At the start of the tutorial, I hope the developers or translators can also explain the structure of the reads in the sequencing data used: whether the data are single-end or paired-end, the read length, the position of the barcode in the read, and which region of the 16S rRNA gene the library covers. The tutorial does not explain the structure of the sequencing data. Even if learners can run through the whole process, the choice of filtering methods and parameters remains unclear, and it is not clear what genomic region the data cover.

Welcome to the forum, @sensan!

Thank you for your kind words! All credit goes to @Yong-Xin_Liu for translating so many of our tutorials :grin:

Note that this translation is of a rather old document (from 2017), so the most up-to-date information and tutorials will be found at docs.qiime2.org, though these documents are in English. The latest version of this tutorial explains many of these details earlier on.


The data used here are single-end. That does not necessarily mean the original study lacked paired-end reads; only single-end reads are used here for simplicity, and other tutorials describe how to analyze paired-end reads.

The reads are 150 nt long; this is mentioned lower down in the tutorial.

This is explained in the latest version of this tutorial, but perhaps not in the older version that was translated. This tutorial uses data generated with the EMP protocols; you can find more details here: https://earthmicrobiome.org/protocols-and-standards/

With the EMP protocol, the primer is not found in the sequence read. This protocol uses the V4 primers 515f and 806r.

These sorts of decisions are described in more detail in the current tutorials. The genomic region, etc., is actually of little consequence; this general protocol is quite applicable to different marker-gene sequence targets.
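
If you want to confirm the read length of your own data before choosing filtering or truncation parameters, here is a minimal sketch in plain Python. It is not part of the tutorial or of QIIME 2. It assumes standard four-line FASTQ records (no wrapped sequence lines), and the function name `read_length_counts` and the file name in the usage comment are only placeholders.

```python
import gzip
from collections import Counter


def read_length_counts(fastq_path, max_reads=10000):
    """Tally read lengths over the first `max_reads` records of a FASTQ file.

    Assumes each record is exactly four lines: header, sequence, '+', quality.
    Files ending in '.gz' are opened with gzip; anything else as plain text.
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    counts = Counter()
    with opener(fastq_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # Stop once the requested number of records has been read.
            if line_number // 4 >= max_reads:
                break
            # The sequence is the second line of every four-line record.
            if line_number % 4 == 1:
                counts[len(line.rstrip("\r\n"))] += 1
    return counts


# Example usage (placeholder path, replace with your own file):
# print(read_length_counts("sequences.fastq.gz").most_common(3))
```

The most common length in the result should match the read length stated in the tutorial if you run it on the tutorial data. Shorter lengths in the tally would suggest that some reads were already trimmed.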

Good luck!