Demux summarize error: Plugin error from demux: 'NoneType' object has no attribute 'strip'

Hello.

I finally imported my demultiplexed paired-end fastq files via the manifest method. I did have to run a Perl script to modify the fastq sequence headers, since they had been altered by an in-house script at our sequencing facility. I then moved on to demux summarize but got a plugin error: Plugin error from demux: 'NoneType' object has no attribute 'strip'
The following is from the error log:

    Traceback (most recent call last):
      File "/home/qiime2/miniconda/envs/qiime2-2017.8/lib/python3.5/site-packages/q2cli/commands.py", line 222, in __call__
        results = action(**arguments)
      File "<decorator-gen-207>", line 2, in summarize
      File "/home/qiime2/miniconda/envs/qiime2-2017.8/lib/python3.5/site-packages/qiime2/sdk/action.py", line 201, in callable_wrapper
        output_types, provenance)
      File "/home/qiime2/miniconda/envs/qiime2-2017.8/lib/python3.5/site-packages/qiime2/sdk/action.py", line 392, in _callable_executor_
        ret_val = callable(output_dir=temp_dir, **view_args)
      File "/home/qiime2/miniconda/envs/qiime2-2017.8/lib/python3.5/site-packages/q2_demux/_summarize/_visualizer.py", line 114, in summarize
        for seq in _read_fastq_seqs(file):
      File "/home/qiime2/miniconda/envs/qiime2-2017.8/lib/python3.5/site-packages/q2_demux/_demux.py", line 36, in _read_fastq_seqs
        qual.strip())
    AttributeError: 'NoneType' object has no attribute 'strip'

Let me know if you need additional information to help me troubleshoot this error.
Many thanks,

Trina

Hi @taknotts! Bummer! It looks like that perl script maybe didn’t quite work as expected — that error you posted above seems to indicate that the imported data no longer has quality scores. I feel like the manifest format shouldn’t have allowed you to import quality-less sequences, so I am opening an issue to track that.

Can you provide a short snippet of one of your files, but after running the perl script on it? Something similar to what you provided in another topic would be perfect! Thanks!

Hi @thermokarst

Here is the output of “head” for one of the converted fastq files:

@M01533:390:000000000-B4NPH:1:1101:15472:1372 1:N:0:0
CCTGTTTGCTCCCCACGCTTTCGTGCCTCAGTGTCAGTTACAGTCCAGTAAGCCGCCTTCGCCACTGGTGTTCCTCCTAATATCTACGCATTTCACCGCTACACTAGGAATTCCGCTTACCTCTCCTGCACTCCAGCTGGACAGTTTCCAATGCAGTACCGGGGTTGAGCCTCGGGCTCTCACATCTTACTTGCCCTACCACCTACGCACCCTTTACACCCAGTACATACGGACAACGCTTGCCCCCACC
+
AAAAAFFFFF1CFFCE1EAFGFABBFFGFHHFBGGHHFHHHH1GBF1GAGGHFFEGGGGHGGG/A1BEBFGGD2GFHH1GFGGFDFBFEGG?FHHHF?>>>>BGG0BFFHHEGHHF/EEHHG<1CGHHGFGGFGHFHB100F1FHHH22>?GCG1<F<GBC/--><-<CBGC<.<-AC.:GHHHH0000:0G0G/09/;[email protected];@.9FBFB0;FBE-9///9/9---;----9A9;-B/;--A-
@M01533:390:000000000-B4NPH:1:1101:15242:1543 1:N:0:0
CCTGTTCGATACCCACGCTTTCGTGCCTGAGCGTCAGTTGTGCACCGGTATGCTGCCTTCGCAATCGGGGTTCTGCGTGATATCTACGCATTTCACCGCTACACCACGCATTCCGCATACCTCTCGCGCACTCTAGATCCCCAGCTTCAACGGCTGGATGGGGTTGAGCCCCACGATTTGACCGCTGACTTAAAAATCCGCCGTAGCACCCTTTAAACCCAATCACACCTGATTACGCTCCCATCCCCAC
+
1>>AAFFAAFAD111EECEFFAAAABE1BAFGEGG?CH2G1B1D10EEEE/BGGGHGHHHGEE//GF?//[email protected]<<<@H0<GHGG/CEDFHCBC?DCDGBG/A/<>A<GDHH1<11/<C..<0=00<[email protected]?:-;:0/C..:[email protected]/C;C9--...09;;->@=--9:/9/;/;/9-A---999//;A-9B/////-9--/;B/;999///;/9----9-;B9----
@M01533:390:000000000-B4NPH:1:1101:15225:1550 1:N:0:0
CCTGTTCGATACCCACGCTTTCGTGCCTGAGCGTCAGTTGTGCACCGGTATGCTGCCTTCGCAATCGGGGTTCTGCGTGATATCTACGCATTTCACCGCTACACCACGCATTCCGCATACCTCTCGCACACTCTAGATCCCCAGTTTCAACGGCTGGATGGGGTTGAGCCCCACGATTTGCCCGCTGACTTAAAAATCCGCCTGCGCACCCTTTAAACCCAATAAATCCGGATAAACCTCCCCTCCTCCC

Thanks,

Trina

Thanks for the sample! So I think my initial hunch was wrong, it definitely looks like you have quality scores hanging out in there. The error you are seeing ('NoneType' object has no attribute 'strip') does imply that you might have an extra line somewhere in the file, because QIIME 2 isn’t seeing a cleanly divisible-by-4 number of lines in the file. Perhaps there is an extra line at the very end being tacked on by this script?

Can you run the following and provide the output:

  1. tail $FILENAME (where $FILENAME is one of your converted fastq files)
  2. wc -l $FILENAME (where $FILENAME is one of your converted fastq files)

Thanks! :t_rex:
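
For anyone wondering why a stray line shows up as this particular error, here is a minimal sketch. It assumes the reader walks the file four lines at a time with `itertools.zip_longest`, which fits the traceback above (the failure is on `qual.strip()`) but is not copied from the q2-demux source. The function name `read_fastq_records` and the toy records are made up for illustration.

```python
import itertools


def read_fastq_records(lines):
    """Yield (header, seq, plus, qual) tuples, taking four lines at a time."""
    it = iter(lines)
    # zip_longest pads the final group with None when the lines run out early.
    for header, seq, plus, qual in itertools.zip_longest(*[it] * 4):
        yield header.strip(), seq.strip(), plus.strip(), qual.strip()


# One complete record (hypothetical data).
good = ["@r1\n", "ACGT\n", "+\n", "FFFF\n"]
# The same record followed by three stray lines: 7 lines, remainder 3.
truncated = good + ["@r2\n", "\n", "\n"]

print(list(read_fastq_records(good)))
try:
    list(read_fastq_records(truncated))
except AttributeError as err:
    # The fourth slot of the last group is None, so .strip() fails on it.
    print(err)  # 'NoneType' object has no attribute 'strip'
```

With a line count that is not a multiple of 4, the last group is padded with `None`, and the first `.strip()` call on a padded slot raises the error from the log.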

  1. Here is the output of tail:
+
1>AAA3BBBB1BFABEGGGFF10BBAFGFC10AAAC0FHBD22BFGFCFDBABDFFCGFGFEGE?GFCE>/>[email protected]>E>E1F0F1BCGHGF2BAEE/BF<////B1BB<FCB>G1C?//<CFHHDG1111<<.01<>D0.<@F0D0/<GGF.CC-.;CGHH0:.:/;C;00:00..9A-9-A....-9;9-:B////;A-A-///9/99--;-9/--9AE-//9/B9;
@M01533:390:000000000-B4NPH:1:1114:6693:4688 1:N:0:0
CCTGTTTGATACCCACGCCTTCGTGCTTCAGCGTCAGTTGGATGCCGGTATGCTGCCTTCGCAATCGGAGTTCTGCGTGATATCTATGCATTTCACCGCTACACCACGCATTCCGCATACTTCTCATCCACTCAAGAAAACCAGTTTCAACTGCCGTATGTGGTTTACGCCACAGAATTTCACGGTGGCCTTGCGTCCACACCACGCCTCCCTTTAAACCAATACAACCCGGTACACACCCGCCTCCCCC
+
AAAAADFBF555B44A22FG?B4FB225AAFFEBA2EGF5EBG323AEAAEE5FGHEH5FAEEE?GE1?>[email protected]@BE?E13FGGGFGFB4GHHBBBDE/>>>DGCE//EEGFF4CBCD??GFGEGHF1<@<DCCHFF0FFGAEHHFDG11F1<10->.0<111<.01<.<A.<<.//<:::00...:...0///-.-9.0/;.....;..9C/09;0//B.9/:////.9;--;.9/;...---.9.;..
@M01533:390:000000000-B4NPH:1:1114:6710:4695 1:N:0:0
CCTGTTTGATACCCACGCCTTCGTGCTTCAGCGTCAGTTGGATGCCGGTATGCTGCCTTCGCAATCTGAGTTCTGCGTGATATCTATGCATTTCACCGCTACACCACGCATTCCGCATACTTCTCATCCACTCAAGAAAACCAGTTTCACCGACGGATAGCGGTCGGCCTCCCCCGTTATTCCACTTCACTTGACTCTCGCCCTTGCCGCCCCTTAACCTCATTAATCCACAGTACACCCCCCACCCCCC
+
[email protected]?/ABFD1GGF1B/EEE/[email protected][email protected]/>>//10BGG/EEEEH2B//E/[email protected]@/CGF1<<111//<-<..>10<-<----..;..:.-.99.0;;0900090000000909..;.-/-//--;----///9/;;////9/9//;;//;/9/--;@-9;----
  2. Here is the output of wc -l:

132316 /media/sf_qiime/MMPC/Berryman/new_data/TKnotts0517_G691_1.fastq

Indeed divisible by 4.

Do I need to go through each file?

Thanks,

Trina


Hi @taknotts! I think we should check the line count in each of your files. @ebolyen & I just whipped up this one liner that should assist:

$ cd /media/sf_qiime/MMPC/Berryman/
$ for f in *.fastq; do r=$(( $(wc -l < "$f" | tr -d '[:space:]') % 4 )); echo "$r $f"; done

So change directories into the place that has your converted fastq files, then copy-and-paste (and run) the second line in the above code block. Can you copy-and-paste those results here? They will look like this (but with your filenames):

0 s1.fastq
0 s2.fastq
1 s3.fastq
0 s4.fastq
0 s5.fastq
0 s6.fastq

Any file that has a 0 in front of it is fine (according to this check, at least - the number of lines in the file is divisible by 4). If the value is something other than 0, then there is a problem in that file. In my example above, the file s3.fastq has too many or too few lines (not divisible by 4).
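
The line-count check only says that something is off, not where. As a rough sketch of a slightly stricter check, the function below reports the first record that looks malformed. It assumes uncompressed fastq files with exactly four lines per record (no wrapped sequences); the name `first_bad_record` and the reason strings are hypothetical, not part of QIIME 2.

```python
def first_bad_record(path):
    """Return (line_number, reason) for the first malformed record, or None.

    line_number is the 1-based line of the record's header.
    """
    with open(path) as fh:
        lines = [line.rstrip("\n") for line in fh]

    for start in range(0, len(lines), 4):
        record = lines[start:start + 4]
        line_no = start + 1
        if len(record) < 4:
            return line_no, "incomplete record (%d of 4 lines)" % len(record)
        header, seq, plus, qual = record
        if not header.startswith("@"):
            return line_no, "header does not start with '@'"
        if not plus.startswith("+"):
            return line_no, "separator line does not start with '+'"
        if len(seq) != len(qual):
            return line_no, "sequence and quality lengths differ"
    return None
```

Calling `first_bad_record("s3.fastq")` on the example file above would return the line number to jump to in an editor, or `None` if every record passes these basic checks.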

Good luck!!!

Thanks @ebolyen for the code. SO much faster than me trying to write the code or do them one by one.
@thermokarst We have found our culprit. One of the files is not a 0.

> 0 TKnotts0517_G779_2.fastq
> 0 TKnotts0517_G780_1.fastq
> 0 TKnotts0517_G780_2.fastq
> 0 TKnotts0517_G790_1.fastq
> 0 TKnotts0517_G790_2.fastq
> 0 TKnotts0517_G843_1.fastq
> 0 TKnotts0517_G843_2.fastq
> 0 TKnotts0517_G844_1.fastq
> 0 TKnotts0517_G844_2.fastq
> 0 TKnotts0517_G845_1.fastq
> 0 TKnotts0517_G845_2.fastq
> 0 TKnotts0517_G846_1.fastq
> 0 TKnotts0517_G846_2.fastq
> 0 TKnotts0517_G851_1.fastq
> 0 TKnotts0517_G851_2.fastq
> **3 TKnotts0517_G855_1.fastq**
> 0 TKnotts0517_G855_2.fastq
> 0 TKnotts0517_G856_1.fastq
> 0 TKnotts0517_G856_2.fastq
> 0 TKnotts0517_G859_1.fastq
> 0 TKnotts0517_G859_2.fastq
> 0 TKnotts0517_G860_1.fastq
> 0 TKnotts0517_G860_2.fastq
> 0 TKnotts0517_G883_1.fastq
> 0 TKnotts0517_G883_2.fastq
wc -l /media/sf_qiime/MMPC/Berryman/new_data/TKnotts0517_G855_1.fastq

> 128251 /media/sf_qiime/MMPC/Berryman/new_data/TKnotts0517_G855_1.fastq
tail /media/sf_qiime/MMPC/Berryman/new_data/TKnotts0517_G855_1.fastq

> CCTGTTTGCTCCCCACGCTTTCGAGCCTCAACGTCAGTTACGGTCCAGTAAGCCGCCTTCGCCACCGGTGTTCCTCCTAATATCTACGCATTTCACCGCTACACTAGGAATTCCGCTTACCCCTCCCGCACTCTAGCCCGCCAGTTTCCAAAGCAGTTCCGCAGTTAAGCTGCGGCATTTCACTTCAGACTTGCCGTACCGTCTACGCTCCCTTTACACCCAGTCAATCCGGATAACGCTTGCTCCCTAC
> +
> BBABBFFFFBDF2A4AEGGGGFEDBBEGF3GFF2FFGGGHHGCGEEHHHHFHGFGGGGGFFEGGGHGGF?E1BDHHHHHGF4EFGHGGCGDEHHHHHGGGEE3F3FGHHHHHHHHEGFGHFFGGFEFGGCGFGGHHHHHGGGGFHDGHHHH0GFCHHHHH?D-?D00:0:0:;:[email protected];90BFFFGG00CBB990/ACADFFFFFEB.;ED.A:FFBFFFFFF.F///B/FA-.;DFFFFDEEF/;BFF/9
> @M01533:390:000000000-B4NPH:1:1114:10632:4694 1:N:0:0
> CCTGTTCGATCCCCACGCTTTCGTGCCTCAGCGTCAGTATGGAGCCGGTATGCTGCCTTCGCAATCGGAGTTCTTCGCGATATCTAAGCATTTCACCGCTACACCACGCATTCCGCTTACTTCTCGCTCACTCAAGGCACCCAGTTTCAACGGCTCTCCGCGGTTGAGCCCCGCCATTTTCCCGCTTCGCTAACCGTCCGCCTTCGCCCCCCTTTAACCCAATTAATCCCGTTAACGCCCGCCTCCTCCC
> +
> 3AAAAFFAABFBGGFEEGGGGFAFBA2A33FGEEE2EFFD5A5ABFEEEF1GH5GGFHHBFE111BFD>EE3BGGH11//EEEDGH4FFFHHBBHHFGEEEEGG3?/EFCEGF2BE/<[email protected]@GFGH///<CFHH0?G1/>F..F.GHBGHHF.<@AA/0<..:A.:-;/09CD----.=0;00;.--..-.9.A/.-;.;--;../...A;@-9./;/99A.9.///;;9/..9../99---;-.9;BFF?
> @M01533:390:000000000-B4NPH:1:1114:10632:4694 1:N:0:0

There appear to be 2 extra lines at the bottom of the file. Is there an easy way to edit a .fastq file?

Thanks,

Trina

Actually, there were 3 extra lines (see what I posted earlier). I still need to check whether the file was truncated at some point during processing (that is, whether its number of sequences is on par with the rest). I deleted the 2 empty lines and the header line of the incomplete sequence, checked the file with the code you provided, and then reran the qiime import and demux summarize. Success: no error. Going forward, I will first do some QC on the script-edited sequence files, using the handy code provided here, before trying to import the sequences. Thanks @ebolyen and @thermokarst for your help. Hopefully this simple file check will be handy to others as well.
Looking forward to continuing to work through the tutorials with my own data. :grinning:
Thanks,

Trina
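
For readers who would rather not hand-edit a large fastq file, the manual fix described above can be sketched roughly as follows. This assumes the file is uncompressed and that the only damage is a partial record at the very end; the function name `drop_incomplete_tail` is made up for this example. A partial last record can also mean the file was cut short, so it is still worth comparing read counts against the other samples.

```python
def drop_incomplete_tail(src, dst):
    """Copy src to dst without a trailing partial record.

    Returns the number of lines left out (0 to 3).
    """
    with open(src) as fh:
        lines = fh.readlines()

    extra = len(lines) % 4                # lines beyond the last full record
    keep = lines[:len(lines) - extra]     # only complete 4-line records

    with open(dst, "w") as out:
        out.writelines(keep)
    return extra
```

Writing to a new file rather than editing in place keeps the original around in case the trailing lines turn out to matter.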


Thanks for the update @taknotts! Sounds like script QC would be a really good plan; you will want to make sure that the script didn’t do anything spooky (:ghost:) to your data. Happy QIIMEing! :sunny:


The newly released QIIME 2 2017.10 now includes a qiime tools validate command. This should help avoid, or at least diagnose, these issues in the future!