Okay, so that represents one FASTQ record. The first line is the FASTQ header/ID. The second is the sequence itself. The third is a delimiter, and the fourth holds the quality scores, one for each nt in the sequence.
So, let's count how many characters (nts) are in the sequence:
So, there is the problem: lines 2 and 4 differ in length by one value (one more or one fewer, depending on perspective). My money is on a weird preprocessing step - perhaps you can track down what was done to these reads before dropping them into QIIME 2? FWIW, you are most likely going to have issues with other tools too, since most (well-behaved) tools will expect exactly one quality score per nucleotide.
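If it helps, the same check can be run over every record in a file rather than counting by hand. This is only a minimal sketch, assuming plain four-line FASTQ records (no wrapped sequence lines); the function names and the example reads are made up for illustration and are not part of QIIME 2. It strips both "\n" and "\r" before counting, since files moved between Windows and Mac can pick up different line endings.

```python
from typing import Iterator, List, Tuple


def iter_fastq_records(lines: List[str]) -> Iterator[Tuple[str, ...]]:
    """Yield (header, sequence, separator, quality) from 4-line FASTQ records."""
    # Remove trailing newline and carriage-return characters from every line.
    cleaned = [line.rstrip("\r\n") for line in lines]
    # Ignore blank lines at the very end of the file.
    while cleaned and cleaned[-1] == "":
        cleaned.pop()
    if len(cleaned) % 4 != 0:
        raise ValueError("line count is not a multiple of 4")
    for i in range(0, len(cleaned), 4):
        yield tuple(cleaned[i:i + 4])


def find_length_mismatches(lines: List[str]) -> List[Tuple[str, int, int]]:
    """Return (header, n_bases, n_quality_scores) for each inconsistent record."""
    mismatches = []
    for header, seq, _sep, qual in iter_fastq_records(lines):
        if len(seq) != len(qual):
            mismatches.append((header, len(seq), len(qual)))
    return mismatches


if __name__ == "__main__":
    # Hypothetical reads, not taken from the files discussed in this thread.
    example = [
        "@read_ok\n", "ACGT\n", "+\n", "IIII\n",
        "@read_bad\n", "ACGTA\n", "+\n", "IIII\n",
    ]
    for header, n_seq, n_qual in find_length_mismatches(example):
        print(f"{header}: {n_seq} bases but {n_qual} quality scores")
```

For a real file, you could pass in open(path).readlines() (the path being whichever .fastq you want to check) instead of the example list.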
Well, I know exactly what the steps were. I imported the .ab1 files from the sequencing facility into Geneious for trimming. I trimmed off the strands that came before & after the primers, then exported them as .fastq files to my Google Drive. From there I have been working between a Windows platform & a Mac platform. I had to dereplicate each sequence into its own .fastq file (as this tutorial states), then create a manifest file pointing to each sequence file in that folder & (still trying to) import into q2.
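For the manifest step, here is a rough sketch of generating the file instead of typing it by hand. It assumes single-end reads, one .fastq file per sample, and the tab-separated layout with "sample-id" and "absolute-filepath" columns, which is my understanding of the single-end V2 manifest format; please check the QIIME 2 importing documentation for the format your version expects. The function names, the folder name, and the use of each file name (minus extension) as the sample ID are all assumptions for illustration.

```python
import csv
from pathlib import Path
from typing import List, Tuple


def build_manifest_rows(fastq_dir: str) -> List[Tuple[str, str]]:
    """Return (sample_id, absolute_path) for every .fastq file in a folder."""
    rows = []
    for path in sorted(Path(fastq_dir).glob("*.fastq")):
        # Assumption: the file name without its extension is the sample ID.
        rows.append((path.stem, str(path.resolve())))
    return rows


def write_manifest(rows: List[Tuple[str, str]], manifest_path: str) -> None:
    """Write a tab-separated manifest with a two-column header."""
    with open(manifest_path, "w", newline="") as handle:
        # Force "\n" line endings so the file looks the same on Windows and Mac.
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["sample-id", "absolute-filepath"])
        writer.writerows(rows)


if __name__ == "__main__":
    # "reads" is a hypothetical folder holding one .fastq file per sample.
    write_manifest(build_manifest_rows("reads"), "manifest.tsv")
```

Note that the absolute paths are only valid on the machine where the script runs, so the manifest would need regenerating after moving the files between the Windows and Mac machines.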
Perhaps it is user error (I'm still pretty new to this entire world), but if I'm understanding everything correctly, this should work just fine.
As best I can...it's been a while since I did that.
I deleted the nt's before & after the beginning of each primer sequence (f & r) to clean them up. I did this with Geneious v7.1.8. From there, it was pretty straightforward. Just exporting to Google Drive.
Edit for clarity: I didn't trim off the sequence, leaving just the primers. I deleted the bits before the 5' start & after the 3' ends of the primers.
I'm super late to this thread, but I have a small suggestion.
Start from the very beginning and do it all in QIIME 2. Switching between .ab1 and FASTQ, and between Mac and Windows, sounds really hard, especially if this is your first time using bioinformatics tools.
If you do this entirely in QIIME 2, you will have a unified platform for primer trimming and processing, and I think we will get more consistent results.
I know this means repeating your work, but this might be the best way to get a result you can trust.
What are the very first, most raw, unedited files that you have?
Colin
I've thought about that being an issue & I agree that it would probably be best if I could just import everything directly into q2 & go from there. I DM'd @thermokarst last week with news that I have the same sequences at a Next Gen facility that are due back in a couple of weeks. It should be a lot easier to work with those data instead of my Sanger data. That said, once I get those files back I'll be importing them directly & going from there.
Thanks for the suggestion & the insight! I'll definitely be back soon with a fresh new set of problems.