Plugin error Feature-table summarize

shaista_karim · February 16, 2022, 4:31pm

Hi-
I am using feature-table summarize to generate a .qzv file from my .qza (artifact). I am aware of previously discussed errors about this plugin, and I double-checked everything accordingly. Do I need an extra column in my metadata with the same names as my demultiplexed fastq.gz files? (I am confused: I have two fastq.gz files for each sample since I have paired-end data.)

I am attaching the code, error file, and metadata file. Any feedback will be highly helpful.

qiime feature-table summarize \
  --i-table table.qza \
  --o-visualization table.qzv \
  --m-sample-metadata-file sample-metadata.tsv

Thank you-

sample-metadata.qzv (1.2 MB)

I also added another column to the metadata (Check-meta.qzv) with the same names as my demux samples (demultiplexed-sequences-summ.qzv), but the error still persists.

demultiplexed-sequences-summ.qzv (336.4 KB)

Check-meta.qzv (1.2 MB)

colinbrislawn · February 17, 2022, 3:01am

Hello @shaista_karim,

Looks like there's a mismatch between your table.qza and sample-metadata.tsv

The following IDs are not present in the metadata: 'lane1-s001-indexN702-A-S502-A-CGTACTAG-CTCTCTAT-Pre-101', ...

When I viewed the sample-metadata.qzv file and looked in the SampleID column, I did not see
lane1-s001-indexN702-A-S502-A-CGTACTAG-CTCTCTAT-Pre-101,
but I did find a sample with the ID OR_Pre_101

Is that the same sample?

I'm not 100% sure what's going on there, but here's my best guess. Those sample IDs in your table.qza are really long, and it looks like they contain sequencing info, like the lane, index ID, and two barcodes.

During import or demultiplexing, the short SampleIDs are associated with the full file names for the paired-end reads. For example:

sample-id     forward-absolute-filepath       reverse-absolute-filepath
sample-1      /filepath/lane1-s001-R1-sample-1.fastq.gz  /filepath/lane1-s001-R2-sample-1.fastq.gz

I'm guessing that instead of short SampleIDs like sample-1, longer SampleIDs based on the full file names were used. This means the long SampleIDs in the table do not match the short SampleIDs in the metadata file.
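
One way to check a mismatch like this by hand is to compare the two ID lists directly. The sketch below assumes the table's sample IDs have already been written to a plain text file, one per line (how you get that list is up to you), and that the metadata is tab-separated with the sample ID in the first column. The function name and file names are hypothetical and not part of QIIME 2.

```shell
# Hypothetical helper, not a QIIME 2 command.
# Prints every ID from ids_file that does not appear in the first
# column of the tab-separated metadata file.
missing_from_metadata() (
  ids_file="$1"
  metadata_file="$2"
  awk -F'\t' '
    NR == FNR {
      # First file (metadata): skip the header row and "#" comment rows,
      # remember every sample ID in column 1.
      if (FNR > 1 && $1 !~ /^#/) seen[$1] = 1
      next
    }
    # Second file (table IDs): print non-empty IDs missing from metadata.
    $0 != "" && !($0 in seen)
  ' "$metadata_file" "$ids_file"
)

# Usage (file names are assumptions):
#   missing_from_metadata table-ids.txt sample-metadata.tsv
```

If this prints the long lane1-... IDs, the table and the metadata are using different names for the same samples.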

See if you can import / demultiplex this data again using the short SampleIDs from your sample-metadata.tsv file.

Let me know if that helps, or if you discover any other clues.

shaista_karim · February 17, 2022, 4:44am

Hi @colinbrislawn

Thank you so much for taking some time to help me figure it out.
I tried using the long IDs (e.g. lane1-s001-indexN702-A-S502-A-CGTACTAG-CTCTCTAT-Pre-101) as the sample IDs in my metadata file. They then matched the IDs in my table.qza, which solved the problem. Yay, thank you!

You mentioned that I can associate my short IDs while importing my sequence data (I have paired-end demux data). I used this command to import the data:
qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-format CasavaOneEightSingleLanePerSampleDirFmt --input-path reads --output-path demultiplexed-sequences.qza

Do you mean adding something to the above-mentioned qiime tools import command? Please correct me if I am wrong.

colinbrislawn · February 17, 2022, 1:38pm

OK, we are on the right path!

I'm glad you mentioned that your data is in the CasavaOneEightSingleLanePerSampleDirFmt. As mentioned here, that format gets SampleIDs from the file names, while other formats let you list your sample IDs in a separate file.

So to get short sampleIDs using this import format, you will have to rename your files so the short sampleID is at the front, for example:
sampleID1_15_L001_R1_001.fastq.gz and sampleID1_15_L001_R2_001.fastq.gz
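
If there are many files, the renaming could be scripted. This is only a sketch: it assumes you write a two-column, tab-separated mapping file (current file-name prefix, then the short sampleID you want) and that every file is named `<prefix>_<rest>.fastq.gz`. The function name and the mapping file are hypothetical, so try it on a copy of your reads first.

```shell
# Hypothetical helper: swap the leading part of each fastq.gz file name
# for a short sample ID, keeping everything after it unchanged.
# Mapping file: <current-prefix><TAB><short-sample-id>, one pair per line.
rename_by_prefix() (
  dir="$1"
  map="$2"
  tab="$(printf '\t')"
  while IFS="$tab" read -r old new || [ -n "$old" ]; do
    # Skip blank or incomplete lines.
    [ -n "$old" ] && [ -n "$new" ] || continue
    for f in "$dir/$old"_*.fastq.gz; do
      [ -e "$f" ] || continue      # no file matched this prefix
      name=${f##*/}                # file name without the directory
      rest=${name#"$old"}          # e.g. _15_L001_R1_001.fastq.gz
      mv -- "$f" "$dir/$new$rest"
    done
  done < "$map"
)

# Usage (directory and mapping file are assumptions):
#   rename_by_prefix reads rename-map.tsv
```

After renaming, the same qiime tools import command with CasavaOneEightSingleLanePerSampleDirFmt should pick up the short IDs from the new file names.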

Another option is to use the fastq manifest format, which lets you pass a manifest.tsv file that lists the file paths and whatever sampleIDs you want.
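
A manifest can be written by hand, but here is a rough sketch of generating one. It assumes a tab-separated mapping file (short sampleID, then file-name prefix), Casava-style names ending in _R1_001.fastq.gz and _R2_001.fastq.gz, and the same column headers as the example table earlier in this thread. The function name and mapping file are hypothetical.

```shell
# Hypothetical helper: print a paired-end manifest to stdout.
# Mapping file: <short-sample-id><TAB><file-name-prefix>, one pair per line.
make_manifest() (
  dir="$(cd "$1" && pwd)" || exit 1   # manifest paths must be absolute
  map="$2"
  tab="$(printf '\t')"
  printf 'sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n'
  while IFS="$tab" read -r id prefix || [ -n "$id" ]; do
    [ -n "$id" ] && [ -n "$prefix" ] || continue
    for fwd in "$dir/$prefix"_*_R1_001.fastq.gz; do
      rev="${fwd%_R1_001.fastq.gz}_R2_001.fastq.gz"
      if [ -e "$fwd" ] && [ -e "$rev" ]; then
        printf '%s\t%s\t%s\n' "$id" "$fwd" "$rev"
      else
        printf 'warning: no read pair found for %s\n' "$id" >&2
      fi
      break                            # use the first match only
    done
  done < "$map"
)

# Usage (names are assumptions; Phred33 quality scores are assumed too):
#   make_manifest reads id-map.tsv > manifest.tsv
#   qiime tools import \
#     --type 'SampleData[PairedEndSequencesWithQuality]' \
#     --input-format PairedEndFastqManifestPhred33V2 \
#     --input-path manifest.tsv \
#     --output-path demultiplexed-sequences.qza
```

With a manifest the files themselves keep their long names, and only the sample-id column decides what ends up in the table.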

Let me know what you try next!

shaista_karim · February 17, 2022, 4:43pm

Hi @colinbrislawn

I made a manifest.tsv with absolute paths listed and imported the data that way (attached). Now I have my short names listed. Thank you, it worked.

Screenshot 2022-02-17 084025

system · March 20, 2022, 10:44pm

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.