MOCK sample and PCR blanks

thermokarst · August 31, 2017, 12:20am

I'm not sure I follow your question, can you provide some additional detail about what it is that you are trying to do? Thanks!

Cool! I think I can help with this, although it is a bit of a roundabout solution, because we haven't yet implemented filtering of FeatureData[Sequence] . What we can do today is filter your feature table to remove the features found in the "blanks":

Step 0) I am going to use the Moving Pictures tutorial dataset for this example, and for the sake of illustration, will consider Samples L6S93 and L6S68 as "blanks". The general strategy is to filter our feature table to just those samples, then we will grab the features that are found in those samples, and finally filter the original table using those feature IDs. Let's go!!

Step 1) Filter our feature table to grab just the "blanks"

$ qiime feature-table filter-samples \
  --i-table table.qza \
  --m-metadata-file sample-metadata.tsv \
  --p-where '"#SampleID" IN ("L6S68", "L6S93")' \
  --o-filtered-table table-blanks.qza
# Note, the single vs double quotes matter above!

$ qiime feature-table summarize \
  --i-table table-blanks.qza \
  --o-visualization table-blanks.qzv

Step 2) Prep our list of feature IDs to filter

$ qiime tools view table-blanks.qzv
# Note, you could also take a look at this file at view.qiime2.org, your call!

You should see something that looks like this (note though, this is the Moving Pictures dataset):

A quick scan tells us that there are two samples in this filtered feature table (which makes sense, we asked it to filter our table to only two samples), and in those two samples there are 60 unique features. Let's scroll all the way down and grab the "Frequency per feature detail" CSV file.

So the first few lines of this file look like this:

$ head feature-frequency-detail.csv
fe30ff0f71a38a39cf1717ec2be3a2fc,3109.0
1d2e5f3444ca750c85302ceee2473331,1756.0
997056ba80681bbbdd5d09aa591eadc0,1413.0
d29fe3c70564fc0f69f2c03e0d1e5561,965.0
3c9c437f27aca05f8db167cd080ff1ec,945.0
f35ce9c514e1398308f5f84ed50b260f,566.0
6edca9464612efff71d8f97299f01663,428.0
699071a4a7ce68918e4cd93a61275ecd,314.0
fba3d9b3872ff9021d66434323fb9565,288.0
9f718b8f66b79c3de80546ca9a54539c,288.0

The first column is the feature ID, the second observed frequency (we don't need this column for this procedure, just wanted to note it). So, what we need to do to make this file useful is to convert it to TSV, and add a header row, that way we can use this file as QIIME 2 metadata (specifically as feature metadata!). You can do this in Excel, or use some command line magic, your call. I will use a command line one-liner to set this up:

$  echo 'Feature ID\tFrequency' | cat - feature-frequency-detail.csv | tr "," "\\t" > features-to-filter.tsv

The new features-to-filter.tsv file should look like this:

$ head features-to-filter.tsv
Feature ID      Frequency
fe30ff0f71a38a39cf1717ec2be3a2fc        3109.0
1d2e5f3444ca750c85302ceee2473331        1756.0
997056ba80681bbbdd5d09aa591eadc0        1413.0
d29fe3c70564fc0f69f2c03e0d1e5561        965.0
3c9c437f27aca05f8db167cd080ff1ec        945.0
f35ce9c514e1398308f5f84ed50b260f        566.0
6edca9464612efff71d8f97299f01663        428.0
699071a4a7ce68918e4cd93a61275ecd        314.0
fba3d9b3872ff9021d66434323fb9565        288.0

Step 3) Filter out the features found in the "blanks"

# First, remove the blanks from the feature table
$ qiime feature-table filter-samples \
  --i-table table.qza \
  --m-metadata-file sample-metadata.tsv \
  --p-where '"#SampleID" IN ("L6S68", "L6S93")' \
  --p-exclude-ids \
  --o-filtered-table table-sans-blanks.qza

# Then, remove the features that are identified in `features-to-filter.tsv`
$ qiime feature-table filter-features \
  --i-table table-sans-blanks.qza \
  --m-metadata-file features-to-filter.tsv \
  --p-exclude-ids \
  --o-filtered-table filtered-table.qza

# This isn't necessary, just want to view the stats about our `filtered-table.qza`
$ qiime feature-table summarize \
  --i-table filtered-table.qza \
  --o-visualization filtered-table.qzv

$ qiime tools view filtered-table.qzv
# Note, you could also take a look at this file at view.qiime2.org, your call!

The stats look like this:

And the original feature table:

So it looks like we filtered out the samples that had the blanks (32 vs 34 samples), and we have removed the 60 features that were found in those blanks *phew*!

Let us know if you have any questions or get stuck!