I am using feature-table summarize to generate a .qzv file from my .qza artifact. I am aware of previously discussed errors with this plugin, and I double-checked everything accordingly. Do I need an extra column in my metadata with the same names as my demultiplexed fastq.gz files? (I am confused: I have two fastq.gz files for each sample, since I have paired-end data.)
I am attaching the code, error file, and metadata file. Any feedback will be highly helpful.
Looks like there's a mismatch between your table.qza and sample-metadata.tsv:
The following IDs are not present in the metadata: 'lane1-s001-indexN702-A-S502-A-CGTACTAG-CTCTCTAT-Pre-101', ...
When I viewed the sample-metadata.qzv file, I looked in the SampleID column and did not see lane1-s001-indexN702-A-S502-A-CGTACTAG-CTCTCTAT-Pre-101,
but I did find a sample with the ID OR_Pre_101.
Is that the same sample?
I'm not 100% sure what's going on there, but here's my best guess. Those sample IDs in your table.qza are really long, and it looks like they contain sequencing info, like the lane, index ID, and two barcodes.
During import or demultiplexing, the short SampleIDs are associated with the full file names for the paired-end reads.
I'm guessing that instead of short SampleIDs like sample-1, the longer SampleIDs based on the full file names were used. But this means that these long SampleIDs do not match the short SampleIDs inside the metadata file.
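If it helps to confirm the mismatch, here is a rough Python sketch that lists which table IDs are missing from the metadata. It assumes you already have the table's sample IDs as a list (for example, copied from the error message) and that the metadata is a tab-separated file whose first column holds the IDs. The function names and the timepoint column are made up for illustration.

```python
import csv
import io


def metadata_ids(tsv_text):
    """Return the set of IDs in the first column of a metadata TSV.

    The first non-empty row is treated as the header; later rows that
    start with '#' (comments or directives such as '#q2:types') are skipped.
    """
    ids = set()
    header_seen = False
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row or not row[0].strip():
            continue  # blank line
        if not header_seen:
            header_seen = True  # skip the header row
            continue
        if row[0].startswith("#"):
            continue  # comment or directive row
        ids.add(row[0].strip())
    return ids


def missing_from_metadata(table_ids, meta_ids):
    """Return the table IDs that have no matching row in the metadata."""
    return sorted(set(table_ids) - set(meta_ids))


# Hypothetical metadata content; only the ID OR_Pre_101 comes from this thread.
example_metadata = "SampleID\ttimepoint\n#q2:types\tcategorical\nOR_Pre_101\tpre\n"
table_ids = ["lane1-s001-indexN702-A-S502-A-CGTACTAG-CTCTCTAT-Pre-101"]

print(missing_from_metadata(table_ids, metadata_ids(example_metadata)))
```

Any ID this prints is one that the table has but the metadata does not, which is the situation the error message describes.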
See if you can import / demultiplex this data again using the short SampleIDs from your sample-metadata.tsv file.
Let me know if that helps, or if you discover any other clues.
Thank you so much for taking some time to help me figure it out.
I tried using the long IDs (e.g. lane1-s001-indexN702-A-S502-A-CGTACTAG-CTCTCTAT-Pre-101) as the sample IDs in my metadata file, so that they matched the IDs in my table.qza, and that solved the problem. Yay, thank you!
You mentioned that I can associate my short IDs while importing my sequence data. I have paired-end demultiplexed data, and I used this command to import it:
qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-format CasavaOneEightSingleLanePerSampleDirFmt --input-path reads --output-path demultiplexed-sequences.qza
Do you mean adding the above-mentioned line of code to qiime tools import? Please correct me if I am wrong.
I'm glad you mentioned that you have data in the CasavaOneEightSingleLanePerSampleDirFmt. As mentioned here, that format gets SampleIDs from the file names, while other formats let you list your SampleIDs in a separate file.
So to get short SampleIDs using this import format, you will have to rename your files so the short SampleID is at the front, for example sampleID1_15_L001_R1_001.fastq.gz and sampleID1_15_L001_R2_001.fastq.gz.
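To make the renaming idea concrete, here is a small Python sketch. As far as I understand it, the Casava importer takes everything before the last four underscore-separated fields of the file name as the SampleID, and the sketch relies on that assumption. The long file name below is hypothetical (I don't know your exact file names), and I'm assuming OR_Pre_101 is the short ID that belongs to it.

```python
def casava_sample_id(filename):
    """Return the part of a Casava-style file name before its last four
    underscore-separated fields (barcode, lane, read, set number)."""
    return filename.rsplit("_", maxsplit=4)[0]


def renamed(filename, short_id):
    """Return a new file name with the sample-ID prefix replaced by
    short_id, keeping the Casava suffix unchanged."""
    parts = filename.rsplit("_", maxsplit=4)
    return "_".join([short_id] + parts[1:])


# File name from the post above: the prefix before the suffix is the SampleID.
print(casava_sample_id("sampleID1_15_L001_R1_001.fastq.gz"))

# Hypothetical long file name, shown only to illustrate the rename.
long_name = "lane1-s001-indexN702-A-S502-A-CGTACTAG-CTCTCTAT-Pre-101_S1_L001_R1_001.fastq.gz"
print(renamed(long_name, "OR_Pre_101"))
```

Because the split works from the right, a short ID that itself contains underscores (like OR_Pre_101) should still come back whole. The sketch only computes the new names; you would still apply them yourself, for both the R1 and the R2 file of each sample.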
Another option is to use the fastq manifest format, which lets you pass a manifest.tsv file that lists the file paths and whatever SampleIDs you want.
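Here is a rough sketch of how such a manifest could be generated in Python. It assumes the paired-end V2 manifest layout (PairedEndFastqManifestPhred33V2), which uses the tab-separated columns sample-id, forward-absolute-filepath and reverse-absolute-filepath. Whether Phred 33 is the right offset for your data is something you would need to check. The file paths below are hypothetical placeholders.

```python
import csv
import io

# Column names of the paired-end V2 manifest layout assumed above.
HEADER = ["sample-id", "forward-absolute-filepath", "reverse-absolute-filepath"]


def build_manifest(pairs):
    """Return manifest text for a mapping of
    sample_id -> (forward_path, reverse_path), sorted by sample ID."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(HEADER)
    for sample_id, (forward, reverse) in sorted(pairs.items()):
        writer.writerow([sample_id, forward, reverse])
    return buf.getvalue()


# Hypothetical absolute paths; replace them with the real locations of your reads.
pairs = {
    "OR_Pre_101": (
        "/data/reads/Pre-101_R1.fastq.gz",
        "/data/reads/Pre-101_R2.fastq.gz",
    ),
}

print(build_manifest(pairs), end="")
```

You would save that text as manifest.tsv and pass it to qiime tools import as the --input-path, with the manifest format as the --input-format instead of the Casava one. The SampleIDs in the first column are then the ones that end up in table.qza, so they can match your metadata directly.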