Importing undetermined reads after demultiplexing

Hi

After demultiplexing, I have read files per sample plus some undetermined reads, e.g.:

lane1-s160-index-GTCCGAAACACT-160_S160_L001_R1_001.fastq.gz
lane1-s160-index-GTCCGAAACACT-160_S160_L001_R2_001.fastq.gz

lane1-s161-index-TAAACCGCGTGT-161_S161_L001_R1_001.fastq.gz
lane1-s161-index-TAAACCGCGTGT-161_S161_L001_R2_001.fastq.gz

Undetermined_S0_L001_I1_001.fastq.gz
Undetermined_S0_L001_R1_001.fastq.gz
Undetermined_S0_L001_R2_001.fastq.gz

  1. Since DADA2 uses the information from a single (full?) MiSeq run, should I import only the reads assigned to samples, or should I also import the undetermined reads?

  2. When importing “Casava 1.8 paired-end demultiplexed fastq”, does it import ALL the fastq.gz files in the --input-path dir? If so, do I need to keep only the files that I plan to import (as per your answer to Q1) in that folder?

qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path casava-18-paired-end-demultiplexed \
  --input-format CasavaOneEightSingleLanePerSampleDirFmt \
  --output-path demux-paired-end.qza

Thanks.

-Rich

DADA2 will still work with a subset of samples, so you could skip importing the undetermined reads. If you did want to import them, you could lump them into an “unassigned” sample, which you could then filter out once you have a feature table in hand.
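
If you go the "unassigned" route, here is a rough sketch of the filtering step. It assumes the lumped sample ended up with the ID `unassigned` and that your feature table is called `table.qza`; both names are placeholders, as is the `write_exclude_ids` helper, which only writes a one-column metadata file listing the IDs to drop. The `qiime feature-table filter-samples` call is left commented out so you can adjust the file names first.

```shell
# Hypothetical helper: write a minimal metadata file (ID column only)
# listing the sample IDs that should be removed from the feature table.
write_exclude_ids() {
    out="$1"
    shift
    printf 'sample-id\n' > "$out"
    for id in "$@"; do
        printf '%s\n' "$id" >> "$out"
    done
}

# Example usage (file names and the sample ID are assumptions):
# write_exclude_ids samples-to-remove.tsv unassigned
# qiime feature-table filter-samples \
#   --i-table table.qza \
#   --m-metadata-file samples-to-remove.tsv \
#   --p-exclude-ids \
#   --o-filtered-table table-no-unassigned.qza
```

With `--p-exclude-ids`, the IDs listed in the metadata file are dropped and everything else is kept.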

Yes, this will import all fastq.gz files that match the CASAVA 1.8 naming convention. If there are samples in that dir you don’t want imported, remove them first.
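
One way to do that without touching the original run folder is to copy only the per-sample files into a clean directory and point `--input-path` at it. A minimal sketch, assuming the undetermined files all start with `Undetermined_` as in your listing; `stage_for_import` and the `raw-fastq` directory name are made up for illustration:

```shell
# Hypothetical helper: copy every R1/R2 fastq.gz file except the
# Undetermined_* ones from a source directory into a clean import directory.
stage_for_import() {
    src="$1"
    dest="$2"
    mkdir -p "$dest"
    for f in "$src"/*_R[12]_001.fastq.gz; do
        # Skip the literal pattern if nothing matched.
        [ -e "$f" ] || continue
        case "$(basename "$f")" in
            Undetermined_*) continue ;;
        esac
        cp -- "$f" "$dest/"
    done
}

# Example usage (directory names are assumptions):
# stage_for_import raw-fastq casava-18-paired-end-demultiplexed
# qiime tools import \
#   --type 'SampleData[PairedEndSequencesWithQuality]' \
#   --input-path casava-18-paired-end-demultiplexed \
#   --input-format CasavaOneEightSingleLanePerSampleDirFmt \
#   --output-path demux-paired-end.qza
```

Because the glob only picks up `R1`/`R2` files, the `I1` index file is left behind as well.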

Hope that helps! :t_rex:
