Dear all,

I would like to use the `q2-breakaway` plugin (qiime2-2020.8).

Here is what I understand so far:

- I need to retain singletons for breakaway to work its magic
- Results of breakaway depend on the number of singletons
- I can use the output of DADA2 run with `pool=TRUE`
- I can use the output of Deblur
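
To make the singleton dependence concrete: as far as I understand, breakaway works from a frequency count table, i.e. for each abundance j, the number of features seen exactly j times in a sample, so the j = 1 row is the singleton count. Below is a minimal sketch of building that table with awk. It assumes the feature table has been exported to TSV (for example via `qiime tools export` and `biom convert --to-tsv`); the file `toy_table.tsv`, its contents, and the function name `frequency_counts` are made up for illustration.

```shell
# Hypothetical toy table in the layout of a biom TSV export:
# rows are features, columns are samples, fields are tab-separated.
printf '# Constructed from biom file\n'  > toy_table.tsv
printf '#OTU ID\tsampleA\tsampleB\n'    >> toy_table.tsv
printf 'seq1\t1\t0\n'                   >> toy_table.tsv
printf 'seq2\t1\t5\n'                   >> toy_table.tsv
printf 'seq3\t2\t1\n'                   >> toy_table.tsv
printf 'seq4\t10\t0\n'                  >> toy_table.tsv
printf 'seq5\t1\t2\n'                   >> toy_table.tsv

# Print "abundance <TAB> number of features with that abundance"
# for one sample. $1 = TSV file, $2 = sample name.
frequency_counts() {
  awk -F'\t' -v sample="$2" '
    /^#OTU ID/ { for (i = 2; i <= NF; i++) if ($i == sample) col = i; next }
    /^#/       { next }                      # skip other comment lines
    col && $col > 0 { freq[int($col)]++ }    # ignore features absent from the sample
    END { for (j in freq) print j "\t" freq[j] }
  ' "$1" | sort -n
}

frequency_counts toy_table.tsv sampleA
# abundance 1 -> 3 features (the singletons), 2 -> 1 feature, 10 -> 1 feature
```

The first row of the output is the singleton count for that sample, which is the number that changes most between the denoising options discussed below.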

I have some ideas on how I would like to perform this analysis, but I have doubts:

a) First, I am considering `q2-dada2` with pseudo-pooling (`--p-pooling-method 'pseudo'`). It retains singletons but, I would assume, not as many as the full `pool=TRUE` option. **How big of an issue is this for breakaway?**

b) Second, I am considering Deblur. With default options Deblur removes singletons, so I assume I need to run it with non-default settings. I figured I could either retain all possible singletons using the **--p-min-reads 1 and --p-min-size 1** arguments…

```
# Keep all singletons: relax both the dataset-wide (--p-min-reads)
# and the per-sample (--p-min-size) abundance filters.
# Paths are quoted because they contain square brackets.
qiime deblur denoise-16S \
  --i-demultiplexed-seqs 'qiime_analysis/qa_joined_noadapter_for_deblur_SampleData[SequencesWithQuality].qza' \
  --p-min-reads 1 \
  --p-min-size 1 \
  --p-trim-length 401 \
  --p-left-trim-len 0 \
  --p-sample-stats \
  --p-jobs-to-start 4 \
  --p-no-hashed-feature-ids \
  --o-table 'qiime_analysis/deblur_singletons_FeatureTable[Frequency].qza' \
  --o-representative-sequences 'qiime_analysis/deblur_singletons_FeatureData[Sequence].qza' \
  --o-stats 'qiime_analysis/deblur_singletons_SampleData[DeblurStats].qza'
```

… OR keep only “biological” singletons, i.e. features that occur once in a given sample but more than once across the entire dataset (**--p-min-size 1**, with `--p-min-reads` kept at 10):

```
# Keep only per-sample singletons: features still need at least 10 reads
# across the whole dataset (--p-min-reads 10).
# Note: the output paths match the previous command, so running both
# overwrites the earlier results.
qiime deblur denoise-16S \
  --i-demultiplexed-seqs 'qiime_analysis/qa_joined_noadapter_for_deblur_SampleData[SequencesWithQuality].qza' \
  --p-min-reads 10 \
  --p-min-size 1 \
  --p-trim-length 401 \
  --p-left-trim-len 0 \
  --p-sample-stats \
  --p-jobs-to-start 4 \
  --p-no-hashed-feature-ids \
  --o-table 'qiime_analysis/deblur_singletons_FeatureTable[Frequency].qza' \
  --o-representative-sequences 'qiime_analysis/deblur_singletons_FeatureData[Sequence].qza' \
  --o-stats 'qiime_analysis/deblur_singletons_SampleData[DeblurStats].qza'
```

**I cannot find whether breakaway expects all singletons in its input, or only the biological ones. Which is it?**

c) Finally: options a) and b) return very different numbers of singletons. As far as I understand the method, this may influence the results. **I am unsure how to proceed with the analysis given this problem.** I believe I could use `breakaway_nof1` to check whether the number of singletons is too big, but even if it is, what then? And what if the number of singletons is too small?
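
One way to put numbers on the difference between the two routes is to count singletons per sample in each table. The following is only a rough sketch, assuming both feature tables were exported to TSV (for example via `qiime tools export` and `biom convert --to-tsv`); the two toy files and the function name `singletons_per_sample` are hypothetical stand-ins, not my real results.

```shell
# Hypothetical toy exports standing in for the DADA2 and Deblur tables
# (rows are features, columns are samples, fields are tab-separated).
printf '#OTU ID\tS1\tS2\na\t1\t3\nb\t1\t0\nc\t4\t1\n'          > dada2_toy.tsv
printf '#OTU ID\tS1\tS2\na\t1\t1\nb\t1\t1\nc\t1\t0\nd\t7\t1\n' > deblur_toy.tsv

# Print "sample <TAB> number of features observed exactly once".
# $1 = TSV file.
singletons_per_sample() {
  awk -F'\t' '
    /^#OTU ID/ { n = NF; for (i = 2; i <= NF; i++) name[i] = $i; next }
    /^#/       { next }                                  # skip other comment lines
    { for (i = 2; i <= NF; i++) if ($i + 0 == 1) f1[i]++ }
    END { for (i = 2; i <= n; i++) print name[i] "\t" (f1[i] + 0) }
  ' "$1"
}

echo "DADA2 (toy):";  singletons_per_sample dada2_toy.tsv
echo "Deblur (toy):"; singletons_per_sample deblur_toy.tsv
```

Comparing the two outputs sample by sample would at least show how far apart the singleton counts are before anything is handed to breakaway.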

Help me, smart people, you’re my only hope! Perhaps @Pauline_Trinh could be my Obi-Wan Kenobi…?