Getting started with paired end, no barcodes

I have been using QIIME 1 off and on for a long time, and I’m now trying to make the switch to QIIME 2. I have completed the QIIME 2 Moving Pictures tutorial.

I generally work with paired-end Illumina reads, without barcodes, i.e. one pair of R1 and R2 reads per sample, so I’m trying to adapt the atacama-soils tutorial for my needs.

I don’t see how to create an artifact from my non-barcoded, paired-end reads that includes my metadata and is ready for the DADA2 step.

My manifest looks like this:

sample-id,absolute-filepath,direction
7,5101-Cow108-w5d7-Feces-MS515F-926R_R1.fastq,forward
7,5101-Cow108-w5d7-Feces-MS515F-926R_R2.fastq,reverse
8,5101-Cow108-w5d7-Fluid-MS515F-926R_R1.fastq,forward
8,5101-Cow108-w5d7-Fluid-MS515F-926R_R2.fastq,reverse

I think I’m supposed to import like this (which does create the artifact):

qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path manifest.csv \
  --output-path sequences.qza \
  --source-format PairedEndFastqManifestPhred33

And I have modified a QIIME 1 mapping file to get this metadata file:

#SampleID	LinkerPrimerSequence	ReverseLinkerPrimerSequence	Type	Description
7	GTGCCAGCMGCCGCGGTAA	CCGTCAATTCMTTTRAGTTT	Feces	7
8	GTGCCAGCMGCCGCGGTAA	CCGTCAATTCMTTTRAGTTT	Rumen Fluid	8

Hi,
I had the same issue. The importing tutorial mentions the Casava 1.8 paired-end demultiplexed fastq format, which I believe is the type of data I have as well.
I used the following to create the artifact:

qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path emp-paired-end-sequences \
  --source-format CasavaOneEightSingleLanePerSampleDirFmt \
  --output-path demux-paired-end.qza

Hope it helps!

-Alba

Thanks @mamillerpa for posting! Your detailed post is really great, thanks for that!

The file called sequences.qza that you created in your import step is ready for DADA2! You should be able to run denoise-paired on this artifact (or denoise-single, which uses only the forward reads). Since your samples are already demultiplexed, you can skip the demultiplexing section of the analysis. Before jumping into DADA2, though, I would run qiime demux summarize so you can get a feel for your sequence quality!

As far as metadata goes, the general pattern in QIIME 2 is that metadata is not stored inside the artifact. Instead, individual methods accept Metadata as an input (sometimes optionally). A step like denoise-single has no need for metadata, so the command doesn't ask for it. When you get to something like feature-table summarize, that command will optionally accept metadata and use it to provide some nice interactive visualizations.
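
The steps described above can be sketched roughly as follows. This is a minimal sketch, not a tested pipeline: metadata.tsv is a hypothetical name for the metadata file from the original post, the truncation lengths are placeholders (0 disables truncation) that should be chosen after looking at the demux summary, and --output-dir is used for DADA2 because the set of named outputs has changed between QIIME 2 releases.

```shell
#!/bin/sh
# Sketch only: file names follow the thread; METADATA and the
# truncation lengths are assumptions, not values from the original post.
SEQS=sequences.qza        # artifact produced by the import step
METADATA=metadata.tsv     # hypothetical name for the metadata file
TRUNC_F=0                 # placeholder; 0 means "do not truncate"
TRUNC_R=0                 # placeholder; 0 means "do not truncate"

run_next_steps() {
  # 1. Summarize read counts and quality for the imported reads.
  qiime demux summarize \
    --i-data "$SEQS" \
    --o-visualization sequences.qzv || return 1

  # 2. Denoise the paired-end reads. No metadata is passed here.
  qiime dada2 denoise-paired \
    --i-demultiplexed-seqs "$SEQS" \
    --p-trunc-len-f "$TRUNC_F" \
    --p-trunc-len-r "$TRUNC_R" \
    --output-dir dada2-out || return 1

  # 3. Metadata enters only at a step that can use it.
  qiime feature-table summarize \
    --i-table dada2-out/table.qza \
    --m-sample-metadata-file "$METADATA" \
    --o-visualization table.qzv
}

# Run only when QIIME 2 and the input artifact are actually available.
if command -v qiime >/dev/null 2>&1 && [ -f "$SEQS" ]; then
  run_next_steps
else
  echo "qiime or $SEQS not found; nothing was run" >&2
fi
```

The point of the sketch is the shape of the workflow: the sequences artifact and the metadata file stay separate, and the metadata file is handed only to the command that accepts it.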

If you haven't had a chance to work through the tutorials, I strongly suggest you start here before diving into your own analysis.

Keep us posted on your progress, we would love to hear how it is going! :t_rex:


@acmayta - thanks for helping out @mamillerpa! I don't think that @mamillerpa will be able to use the CasavaOneEightSingleLanePerSampleDirFmt here, because the filenames indicated in the post above don't follow the Casava 1.8 naming convention.
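
A quick way to see which import route applies is to test the filenames before importing. The sketch below is a rough check, assuming the Casava 1.8 convention has the underscore-separated form sample_barcode_L###_R1_001.fastq.gz; the pattern is my reading of that convention rather than QIIME 2's own validation code, is_casava_name is a hypothetical helper, and the second filename is only an illustrative example of a conforming name.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a filename looks like Casava 1.8 output.
# Assumed pattern: <sample>_<barcode>_L<3 digits>_R<1|2>_001.fastq.gz
is_casava_name() {
  name=${1##*/}   # strip any leading directory
  printf '%s\n' "$name" |
    grep -Eq '^.+_.+_L[0-9]{3}_R[12]_001\.fastq\.gz$'
}

for f in \
  "5101-Cow108-w5d7-Feces-MS515F-926R_R1.fastq" \
  "L2S357_15_L001_R1_001.fastq.gz"
do
  if is_casava_name "$f"; then
    echo "$f: looks like Casava 1.8"
  else
    echo "$f: does not match; import with a manifest instead"
  fi
done
```

Files that fail this check, like the ones in the original post, need the manifest-based import shown earlier in the thread.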

Thanks, and keep on QIIMEing!! :tada:
