LIBRARY
Michigan State
University

 

 

 

This is to certify that the
dissertation entitled

MICROBIAL COMMUNITY ANALYSIS ASSESSED BY
PYROSEQUENCING OF rRNA GENE: COMMUNITY
COMPARISONS, ORGANISM IDENTIFICATION, AND ITS
ENHANCEMENT

presented by
WOO JUN SUL

has been accepted towards fulfillment
of the requirements for the

Doctor of Philosophy degree in Crop and Soil Sciences - Environmental Toxicology

Major Professor's Signature

Date

 

 

MSU is an Affirmative Action/Equal Opportunity Employer

 

 

 


 

MICROBIAL COMMUNITY ANALYSIS ASSESSED BY PYROSEQUENCING OF
rRNA GENE: COMMUNITY COMPARISONS, ORGANISM IDENTIFICATION,
AND ITS ENHANCEMENT
By

Woo Jun Sul

A DISSERTATION
Submitted to
Michigan State University
in partial fulﬁllment of the requirements
for the degree of
DOCTOR OF PHILOSOPHY

Crop and Soil Sciences — Environmental Toxicology

2009

ABSTRACT
MICROBIAL COMMUNITY ANALYSIS ASSESSED BY PYROSEQUENCING OF
rRNA GENE: COMMUNITY COMPARISONS, ORGANISM IDENTIFICATION,
AND ITS ENHANCEMENT
By
Woo Jun Sul

There are more than 10^30 bacteria on Earth, with their members embodying 3.8
billion years of evolutionary history and having evolved to take advantage of virtually every
energy-yielding niche hospitable to life. This makes the microbial world extremely
diverse, ubiquitous and essential to Earth’s habitability. Hence, determining which
microbes make up these communities is an initial goal for understanding microbial
communities. Recently, pyrosequencing of ribosomal RNA genes has become a popular
tool for in-depth analyses of microbial communities. I used pyrosequencing of rRNA’s
hypervariable V4-region to characterize a wide variety of microbial communities.

Soil microbial communities in the tropics are potentially more dynamic than
temperate ones due to longer and more favorable temperature, moisture and energy
resources from primary productivity. I studied the effect on soil Bacteria of different soil-
crop management systems in Eastern Ghana, one of which lost 50% of its stored soil
organic carbon (SOC) within 4 years. Canonical correspondence analysis and stepwise
multiple regression of the 290,000 V4-rRNA sequences showed that SOC was the most
important factor that explained differences in microbial community structure among
managements. The data indicate that the use of a pigeon-pea crop (a legume) during the
winter season (normally fallow) promotes a higher microbial diversity and sequesters

more soil organic carbon, which is important for soil structure, nutrient retention and

recycling, and general soil health. I also evaluated analysis methods for 211 rRNA-
determined bacterial assemblages, comprising 1.3 million rRNA sequences from seven
habitat types. A taxonomy-supervised method, using taxonomy-bins, was advantageous
in its ability to compare non-overlapping sequences, and requiring minimal computation
capacity compared to the non-taxonomy-supervised (clustering-determined) method. The
taxonomy-supervised method produced results that were significantly correlated to the
clustering method, which is the current standard, and as taxonomy improves should
provide even better resolution. Because of the much greater depth and replication
provided by pyrosequencing, more robust determination of microbial species distribution,
diversity, organism identiﬁcation, community comparisons and dynamics is possible.

As a result of microbes’ long history, they harbor considerable genetic diversity

and some of their genes likely have more desirable properties than those known. I used
stable isotope probing (SIP) with [13C]-biphenyl as substrate to retrieve novel biphenyl
dioxygenase subunits bphAE which showed PCB oxidative activity in the 31.8 kb cosmid

clone made from the [13C]-DNA. The discrepancy of G+C content near the bphAE genes

implies their recent acquisition, possibly by horizontal transfer, and suggests dispersed

dioxygenase gene organization in nature. I also used V4-16S rRNA gene pyrosequencing

of the [13C]-biphenyl-derived DNA from three PCB-contaminated environmental

matrices: rhizosphere, industrial soil, and river sediment to more speciﬁcally identify the
PCB- and biphenyl-utilizing populations of the three sites. I found little commonality in
the abundant members of the three sites, but I identified new candidate groups that may be
involved in PCB degradation.

TABLE OF CONTENTS

 

LIST OF TABLES ....................................................................................................... vi
LIST OF FIGURES ..................................................................................................... vii
CHAPTER ONE
WHAT DO WE LEARN FROM MICROBIAL COMMUNITY PROFILING BY RRNA
GENE PYROSEQUENCING: AN INTRODUCTION .................................................... 1
Blossom of 16S rRNA gene sequences ......................................................................... 1
Deep sequencing .......................................................................................................... 1
General Procedure for 16S rRNA gene pyrosequencing ................................................ 3
Considerations in procedure for 16S rRNA pyrosequencing ......................................... 5
Quantification of community structure by 16S rRNA gene pyrosequencing ................. 9
Phylogeny by 16S rRNA gene pyrosequencing .......................................................... 11
Diversity and species distributions in bacterial communities ....................................... 14
Community proﬁling and comparisons ...................................................................... 17
Measuring bacterial community dynamics .................................................................. 19
Bacterial groups that correlate to habitat characteristics .............................................. 20
Method validation ...................................................................................................... 21
Amplicon pyrosequencing of protein encoding genes ................................................ 21
Conclusion and future directions ................................................................................ 22
References ................................................................................................................. 24
CHAPTER TWO
COMMUNITY RESPONSES TO AGRICULTURAL PRACTICES IN TROPICAL
AFRICA ANALYZED BY PYROSEQUENCING ....................................................... 28
Abstract ...................................................................................................................... 29
Introduction ................................................................................................................ 30
Results ........................................................................................................................ 32
Characterization of microbial and phylogenetic structures ....................................... 32
Microbial members in Ghana soils .......................................................................... 34
Structural differences in microbial communities among agricultural plots ................ 34
Characterization of new clades of sequences unafﬁliated to known sequences ........ 39
Discussion .................................................................................................................. 41
Materials and methods ............................................................................................... 45
Experimental design and sampling .......................................................................... 45
SSU rRNA gene amplicon pyrosequencing .............................................................. 46
Pyrosequencing data ................................................................................................ 47
Statistical analyses and implementation ................................................................... 47
Acknowledgements ..................................................................................................... 48
References .................................................................................................................. 48
Supporting information 1 text ..................................................................................... 48
Supporting information 2 text ..................................................................................... 49
Supporting information material and method .............................................................. 50

Initial processing and ﬁltering ................................................................................. 50

Sequence alignment ................................................................................................ 50

Neo plot .................................................................................................................. 52

References .................................................................................................................. 55
CHAPTER THREE

DNA-STABLE ISOTOPE PROBING INTEGRATED WITH METAGENOMICS:
RETRIEVAL OF BIPHENYL DIOXYGENASE GENES FROM
PCB-CONTAMINATED RIVER SEDIMENT ..................................................................... 57
Abstract ..................................................................................................................... 58
Introduction ............................................................................................................... 59
Materials and methods ............................................................................................... 60

Sample description and SIP microcosms .................................................................. 60
DNA extraction and [13C]-DNA separation ............................................................. 61
16S rRNA and aromatic ring hydroxylating dioxygenase (ARHD) gene clone
libraries ................................................................................................................... 62
Cosmid library construction and screening library with ARHDs primers .................. 63
Sequencing cosmid clone and genomic analysis ....................................................... 63
PCB transformation by expression in E. coli ............................................................ 64
Nucleotide sequence accession numbers .................................................................. 66
Results ........................................................................................................................ 66
Disappearance of biphenyl during the incubation ..................................................... 66
DNA extraction and isopycnic centrifugation ........................................................... 66
Analysis of 16S rRNA and ARHD genes in clone libraries ..................................... 67
Screening for and analysis of biphenyl dioxygenases ............................................... 70
Functional analysis of biphenyl dioxygenases .......................................................... 70
Discussion .................................................................................................................. 72
Acknowledgements .................................................................................................... 77
References .................................................................................................................. 78

CHAPTER FOUR

UNIQUE PCB- AND BIPHENYL-UTILIZING POPULATIONS IN THREE
DIFFERENT ENVIRONMENTAL MATRICES .......................................................... 82
Abstract ..................................................................................................................... 83
Introduction ................................................................................................................ 83
Materials and methods ............................................................................................... 86

Site description. ....................................................................................................... 86
V4-16S rRNA gene pyrosequencing ....................................................................... 86
Estimates of bacterial richness ................................................................................. 87
Results ........................................................................................................................ 87
Bacterial communities in PCB-contaminated sites and their biphenyl-utilizing
populations .............................................................................................................. 87
PCB- and Biphenyl- Population Shifts During Incubation ........................................ 92
Shared OTUs of Three Biphenyl-Utilizing Populations After 14 Days Incubation... 94
Different Incubation Methods Altered Biphenyl-Utilizing Populations .................... 94
Discussion ................................................................................................................ 103

Acknowledgements .................................................................................................. 109
References ................................................................................................................ 109

CHAPTER FIVE
MICROBIAL COMMUNITY (ASSEMBLAGES) COMPARISONS BY BACTERIAL
TAXONOMY-SUPERVISED METHOD BYPASSING SEQUENCE ALIGNMENT

AND CLUSTERING .................................................................................................. 113
Abstract ................................................................................................................... 114
Introduction .............................................................................................................. 115
Materials and methods ............................................................................................. 116
Results ...................................................................................................................... 118
Discussion ................................................................................................................ 120
References ................................................................................................................ 124

APPENDIX
Appendix A .............................................................................................................. 129
Appendix B1 Habitat-Lite two level scheme and its terms definition ....................... 140
Appendix B2 Priori groups described by Habitat-Lite ............................................... 141
Appendix B3 List of samples and their priori groups ................................................ 142
Appendix B4 Confusion table of priori groups and bacterial assemblage clusters by
average distance clustering ....................................................................................... 151
Appendix B5 Bacterial Assemblages Clustering ....................................................... 152
Appendix B6 Indicator Species Of Selected Priori Groups ....................................... 154
Appendix B7 Functional Diversity Measures ............................................................ 156

LIST OF TABLES

Table 1.1. Comparisons of relative abundances in a high nitrate wastewater treatment
system in Uruguay ......................................................................................................... 12

Table 2.1. Summary of soil characteristics of agricultural plots and pyrosequencing
results ............................................................................................................................ 33

Table 3.1 Phylogenetic classification of 16S rRNA genes in clone libraries at zero (D0)
and 14 (D14H) days. ...................................................................................................... 69

Table 3.2 Phylogenetic classiﬁcation of 16S rRNA genes in clone libraries at zero (D0)

and 14 (D14H) days. ...................................................................................................... 73
Table 4.1. Bacterial richness estimations at 90% OTUs. ................................................ 95
Table 4.2. Bacterial richness estimations at 97% OTUs. ................................................ 96
Table 4.3. Bacterial richness estimations at 99% OTUs. ................................................ 97
Table 5.1. Similarity index measures and morphology of points in principal coordinate
analysis (PCoA) .......................................................................................................... 121
Table B1.1. Definition of terms in Habitat-Lite ............................................................ 140
Table B2.1. Priori groups described using Habitat-Lite ................................................ 141
Table B3.1. List of samples ......................................................................................... 142
Table B4.1. Confusion table of priori groups and bacterial assemblage clusters by
average distance clustering .......................................................................................... 151
Table B6.1. Indicator species ....................................................................................... 154

Table B7.1. A cumulated relative abundance of species included in the calculations 157

LIST OF FIGURES

Images in this dissertation are presented in color.

Figure 1.1. The numbers of 16S rRNA gene sequences per study .................................... 2
Figure 1.2. Comparison of rarefaction curves ................................................................. 4
Figure 1.3. Suggested procedure for 16S rRNA gene pyrosequencing .............................. 6

Figure 1.4. Schematic diagram of 16S rRNA gene pyrosequencing with barcode (tag,
key) primers ................................................................................................................... 8

Figure 1.5. Comparison of relative abundance by quantitative PCR and 16S
pyrosequencing ............................................................................................................. 10

Figure 1.6. Relative abundance (%) at the Phylum level by V4-16S pyrosequencing and

by Sanger sequencing of the clone library of a PCB-contaminated rhizosphere soil ....... 13
Figure 1.7. Examples of the phylogenetic analysis ......................................................... 15
Figure 1.8. The dominant Phyla in 6 different soils analyzed by V4-16S rRNA

pyrosequencing ............................................................................................................ 18
Figure 2.1. Microbial community structure and composition ......................................... 35
Figure 2.2. Microbial community comparison ............................................................... 36

Figure 2.3. Neighbor-joining phylogenetic tree displaying 287 clusters with a signiﬁcant
stepwise multiple regressions to any of the environmental parameters ........................... 38

Figure 2.4. Frequency distributions of the uncorrected distances of sequences afﬁliated to
selected taxonomic groups ............................................................................................. 40

Figure 2.4.B NEO plots for a representative microbial community under bare fallow
treatment (BF3) ............................................................................................................. 42

Figure 2.S1. Coverage of 16S rRNA sequences in RDP by V4 primers .......................... 51

Figure 3.1. Separation of [12C]- and [13C]-DNA by small-scaled secondary isopycnic
centrifugation and quantiﬁed by Q-PCR of 16S rRNA genes on triplicate samples ........ 68

Figure 3.2. Schematic diagram of gene order in clone L11E10. ..................................... 72

Figure 3.3. Amino acid sequence alignment of large subunit of LB400, L11E10, and
KF707 biphenyl dioxygenases ...................................................................................... 76

Figure 4.1A. Bacterial phylum composition in three PCB-contaminated sites initially (0d)

and after 4 and 14 days of incubation with biphenyl .................................................... 102
Figure 4.1B. River Raisin sediment ............................................................................. 103
Figure 4.1C. Sandy soil .............................................................................................. 104
Figure 4.2. Principal Coordinate Analysis (PCoA) plot ............................................... 105
Figure 4.3A Shared OTUs in C2 4d and C2 14d .......................................................... 106
Figure 4.3C Shared OTUs in Rr 3d and Rr 14d ........................................................... 107
Figure 4.4. Shared OTUs among three PCB- and biphenyl-utilizing populations after 14
days incubation with 13C-biphenyl (Pi). ...................................................................... 108
Figure 4.5. Relative abundances of Rr 14d's OTUs ....................................................... 109

Figure 4.6. Schematic summary of biphenyl-utilizing bacteria and cross-feeders in three
PCB-contaminated sites . ............................................................................................. 110

Figure 5.1. Sequence classiﬁcation percentages at different conﬁdence thresholds
determined by RDP-II Classiﬁer for different taxonomic levels ................................... 125

Figure 5.2. Rank comparison of distances calculated using non taxonomy-supervised

OTUs and taxonomy-bins. ........................................................................................... 126
Figure 5.3A. PCoA plot comparison based on abundance-based distance ..................... 128
Figure 5.3B. PCoA plot comparison based on occurrence-based distance ..................... 129
Figure B5.1. Bacterial assemblage clustering ............................................................... 152
Figure B7.1. COG categories with CWM values by priori groups ................................ 159
Figure B7.2 COG categories with CWM values by priori groups ................................. 160
Figure B7.3 Constant CWM value in all groups ........................................................... 161

CHAPTER I

WHAT WE LEARN FROM MICROBIAL COMMUNITY PROFILING BY
rRNA GENE PYROSEQUENCING: AN INTRODUCTION

BLOSSOMING OF 16S rRNA GENE SEQUENCES

Since the complete 16S ribosomal RNA gene of Escherichia coli was sequenced
in 1978 (Brosius et al., 1978), the demand for and use of 16S rRNA gene sequencing
have increased in the fields of microbiology and microbial ecology. Ribosomal RNA
gene sequences have been used not only as marker genes to shape bacterial phylogeny
but also as surrogates to reveal microbial community composition. Over the past 30
years, the number of rRNA gene sequences per study has greatly increased (Figure 1.1)
because of lower sequencing costs and, recently, because of massively parallel sequencing
capacity, such as that of 454's pyrosequencing technology. This technology has been
successfully used as a rapid and efficient tool for in-depth analysis of microbial
communities, including comparisons of microbial communities and the pre-diagnosis of
microbial communities prior to metagenomic analysis (Tringe and Hugenholtz, 2008).

Figure 1.1. The numbers of 16S rRNA gene sequences per study.

DEEP SEQUENCING
Microbial community fingerprinting methods such as denaturing
gradient gel electrophoresis (DGGE), temperature gradient gel electrophoresis (TGGE),
single-strand conformation polymorphism (SSCP), terminal restriction fragment length
polymorphism (T-RFLP), amplified rDNA restriction analysis (ARDRA), and automated
ribosomal intergenic spacer analysis (ARISA) were widely applied in the previous
decade because of their reasonable cost and the convenience of performing them and
interpreting their data (Anderson and Cairney, 2004). However, these methods detect only
the most dominant community members, and thus they have limited resolution for describing
microbial community structure. Conventional 16S rRNA clone libraries determined by Sanger
sequencing are more informative about those members sampled, but the cost is too great
to reach the numbers of sequences needed for sufficient community resolution, although a
few studies attained more than ten thousand sequences with very large sequencing
investments (Ley et al., 2008). Hence, the large number of sequences produced by 16S
rRNA gene pyrosequencing is a major breakthrough in overcoming the obstacles of cost
and low resolution of the previous methods. For a similar cost, pyrosequencing can
produce hundreds of times more sequences than conventional 16S rRNA clone
libraries, thus providing better community descriptions (Figure 1.2). These huge numbers
of sequences, along with phylogenetic information, provide more precise signatures of
microbial communities.
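
The effect of sequencing depth on community description can be illustrated with a rarefaction calculation. The following is a minimal sketch, not the procedure used in this dissertation: it assumes OTU read counts are already available and uses the standard analytic (hypergeometric) expectation of the number of OTUs seen in a random subsample. The function names and the example counts are hypothetical.

```python
from math import comb


def rarefied_richness(otu_counts, depth):
    """Expected number of OTUs observed in a random subsample of `depth` reads."""
    total = sum(otu_counts)
    if not 0 <= depth <= total:
        raise ValueError("depth must be between 0 and the total read count")
    all_draws = comb(total, depth)
    # An OTU with c reads is missed when all `depth` reads are drawn
    # from the remaining (total - c) reads.
    return sum(1 - comb(total - c, depth) / all_draws for c in otu_counts)


def rarefaction_curve(otu_counts, step):
    """(depth, expected richness) pairs from 0 reads up to the full sample."""
    total = sum(otu_counts)
    depths = list(range(0, total + 1, step))
    if depths[-1] != total:
        depths.append(total)
    return [(d, rarefied_richness(otu_counts, d)) for d in depths]


if __name__ == "__main__":
    # Hypothetical community: two abundant OTUs and two singletons.
    for depth, richness in rarefaction_curve([5, 3, 1, 1], step=2):
        print(depth, round(richness, 3))
```

A curve that is still rising at the full sample depth suggests that additional sequencing would reveal more OTUs, which is the situation deeper sequencing is meant to address.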

Figure 1.2. Comparison of rarefaction curves from pyrosequencing and Sanger sequencing of
clone libraries.

GENERAL PROCEDURE FOR 16S rRNA GENE PYROSEQUENCING

The procedure for 16S rRNA gene pyrosequencing consists of four parts: 1) pre-sequencing
steps, including sampling design, DNA extraction, 16S rRNA primer design or
selection, barcode selection, PCR to produce the amplicons for pyrosequencing, and
mixing of the amplicons for the sequencing plate or section; 2) pyrosequencing itself; 3)
initial processing of sequence data, including trimming of barcodes, filtering out bad
sequences, alignment of sequences, sequence clustering to generate OTUs, and assignment
of taxonomy to sequences; and 4) data analysis of processed sequences, including
calculation of microbial community richness, evenness, and diversity as well as
community comparison indices to facilitate the interpretation of sequence data, and the
necessary analyses, e.g., statistical, to derive scientific conclusions (Figure 1.3).
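
As an illustration of the data-analysis step, the sketch below computes richness, Shannon diversity, Pielou evenness, and a Bray-Curtis dissimilarity from OTU count vectors. These particular indices are my choice for illustration; the text above does not prescribe specific ones, and the function names are hypothetical.

```python
import math


def diversity_summary(otu_counts):
    """Richness, Shannon diversity (natural log), and Pielou evenness."""
    counts = [c for c in otu_counts if c > 0]
    total = sum(counts)
    richness = len(counts)
    shannon = -sum((c / total) * math.log(c / total) for c in counts)
    # Evenness is undefined for fewer than two OTUs; report 0.0 in that case.
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    return {"richness": richness, "shannon": shannon, "evenness": evenness}


def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two samples over the same OTU order."""
    if len(counts_a) != len(counts_b):
        raise ValueError("samples must list the same OTUs in the same order")
    total = sum(counts_a) + sum(counts_b)
    if total == 0:
        return 0.0
    shared = sum(min(a, b) for a, b in zip(counts_a, counts_b))
    return 1 - 2 * shared / total


if __name__ == "__main__":
    print(diversity_summary([10, 10, 10, 10]))
    print(bray_curtis([5, 0, 3], [0, 5, 3]))
```

In practice, samples of different sequencing depth are usually subsampled or converted to relative abundances before such indices are compared.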

CONSIDERATIONS IN PROCEDURES FOR 16S rRNA PYROSEQUENCING

When selecting or designing 16S rRNA primer sets, several aspects need to be
considered: 1) an appropriate PCR product length for currently available pyrosequencing
reads (~240 bp for GS FLX and ~400 bp for GS Titanium), 2) adequacy of the 16S
primers for coverage of Bacterial or Archaeal groups, 3) high resolution and accuracy of
the selected regions for organism identification, and 4) a low frequency of insertions and
deletions to simplify sequence alignment and to retain more comparable sequences. The
choice of 16S primers strongly influences the coverage of 16S rRNA genes in microbial
communities and, therefore, can lead to a biased representation of microbial
communities. 16S primers preferentially select or exclude specific taxa (reviewed in
Hamady and Knight, 2009) and lead to over- or underestimation of microbial richness
(Youssef et al., 2009). Different primer selections may lead to different research
conclusions (Hamp et al., 2009). Because current high-throughput sequencing methods
cannot cover the entire 16S rRNA gene sequence, selection of a "good" region is
important for good taxonomy assignment (Liu et al., 2007; Wang et al., 2007).
Furthermore, a low frequency of insertions and deletions in the sequenced region is
also important to simplify sequence alignment and to retain more comparable sequences
(Cole et al., 2009). Currently the more popular regions used in 16S rRNA
pyrosequencing include the regions surrounding V2, V4, and V6 (Sogin et al., 2006;
Huse et al., 2007; Roesch et al., 2007; Andersson et al., 2008; Chapters 2, 4, and 5).
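
Primer coverage of a reference collection can be estimated by checking what fraction of sequences contain a site matching the (possibly degenerate) primer. The following is a minimal sketch under simplifying assumptions: exact matching with IUPAC ambiguity codes, no mismatches allowed, and forward-strand sequences only. The primer and sequences in the example are toy values, not real 16S primers.

```python
# IUPAC nucleotide ambiguity codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_matches(primer, site):
    """True if every base of `site` is allowed by the primer at that position."""
    return len(primer) == len(site) and all(
        base in IUPAC[code] for code, base in zip(primer, site)
    )


def primer_coverage(primer, sequences):
    """Fraction of sequences containing at least one perfect primer site."""
    if not sequences:
        return 0.0
    k = len(primer)
    hits = sum(
        any(primer_matches(primer, seq[i:i + k]) for i in range(len(seq) - k + 1))
        for seq in sequences
    )
    return hits / len(sequences)


if __name__ == "__main__":
    toy_primer = "ACGY"  # hypothetical primer; Y allows C or T
    toy_sequences = ["TTACGCTT", "TTACGTTT", "TTACGATT"]
    print(primer_coverage(toy_primer, toy_sequences))
```

A real coverage analysis would also allow a small number of mismatches and would report coverage per taxonomic group, since overall coverage can hide the exclusion of specific taxa.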

 

 

Figure 1.3 flowchart, summarized:

Pre-sequencing steps: sampling and DNA extraction; primer selection or design; PCR with
barcode primers
Sequencing
Post-sequencing steps: initial processing (barcode sorting and quality control), followed by
    Taxonomy-supervised: RDP Classifier, RDP SeqMatch, RDP Library Compare
    Non-supervised: Alignment, Alignment Merger, FASTA Sequence Selection, Dereplicate,
    Complete Linkage Clustering
Data analysis: finding indicator species; measuring microbial assemblage comparisons;
microbial diversity measurement

 

 

 

 

 

Figure 1.3. Suggested procedure for 16S rRNA gene pyrosequencing. The RDP
Pyrosequencing Pipeline provides an aligner trained on a small, hand-curated set of high-
quality, full-length rRNA gene sequences. These aligned sequences can be clustered by
Complete Linkage Clustering, a method of calculating the distance between clusters in
hierarchical cluster analysis. For identifying the bacterial taxonomy of clusters, Dereplicate
Request allows users to select a representative sequence from each cluster. The sequence
with the minimum sum of squared distances to the other sequences within a cluster is
assigned as the representative sequence for that cluster. Representative sequences can
easily be retrieved from the original sample's sequences using FASTA Sequence Selection.
Alignment Merger helps sequence retrieval from multiple alignment files.
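
The representative-sequence rule in the caption above (minimum sum of squared distances within a cluster) can be sketched as follows. This is an illustration only, not the RDP implementation: it assumes the sequences are already aligned to equal length and uses a simple uncorrected distance (proportion of differing positions). All names and example sequences are hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected distance: proportion of aligned positions that differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def representative(cluster):
    """Id of the sequence with the minimum sum of squared distances to the
    other members. `cluster` maps sequence id -> aligned sequence."""
    def sum_of_squares(seq_id):
        return sum(
            p_distance(cluster[seq_id], other) ** 2 for other in cluster.values()
        )
    return min(cluster, key=sum_of_squares)


if __name__ == "__main__":
    toy_cluster = {"s1": "ACGT", "s2": "ACGA", "s3": "TCGA"}
    print(representative(toy_cluster))
```

Here s2 differs from each of the other two sequences at one position, whereas s1 and s3 differ from each other at two, so s2 has the smallest sum of squared distances.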

Placing barcode (key or tag) nucleotides, such as those calculated by
Parameswaran et al. (2007) and Hamady et al. (2009), between the
adaptor sequences for the pyrosequencing beads and the 16S rRNA gene primers allows one to
mix multiple samples in one pyrosequencing run (Figure 1.4). Also, RDP's Pyrosequencing
Pipeline lists 72 barcodes of 8-base length (V4-adaptor A primer specific) that have a
minimum difference of 2 bases from all other barcodes (Cole et al., 2009), avoid a
problematic order of nucleotide addition in the pyrosequencing flow, and do not include
homopolymers, which increase the possibility of sequencing error (Quinlan et al., 2008).
Whether the barcode sequence biases which sequences are amplified, by extending the
match to the target sequence, has not yet been established.
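
Two of the barcode criteria above, a minimum difference of 2 bases between any pair and the absence of homopolymers, are easy to check programmatically. The sketch below is a hypothetical validator, not part of the RDP pipeline; it treats any run of two or more identical adjacent bases as a homopolymer (an assumption) and does not check the order of nucleotide addition in the pyrosequencing flow. The example barcodes are invented.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def has_homopolymer(barcode, min_run=2):
    """True if the barcode contains a run of >= min_run identical bases."""
    run = 1
    for prev, cur in zip(barcode, barcode[1:]):
        run = run + 1 if cur == prev else 1
        if run >= min_run:
            return True
    return False


def validate_barcodes(barcodes, min_distance=2):
    """Return a list of (offender, reason) tuples; empty if the set is valid."""
    problems = [(b, "homopolymer") for b in barcodes if has_homopolymer(b)]
    for a, b in combinations(barcodes, 2):
        if hamming(a, b) < min_distance:
            problems.append(((a, b), "too similar"))
    return problems


if __name__ == "__main__":
    toy_barcodes = ["ACGTACGT", "TGCATGCA", "ACGTACGA", "AACGTACG"]
    for offender, reason in validate_barcodes(toy_barcodes):
        print(offender, reason)
```

A minimum distance of 2 means that a single substitution error in a barcode can be detected, although not corrected.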

To minimize potential errors during PCR amplification, all primers should be
synthesized and purified at least once by HPLC to remove incorrectly synthesized
oligonucleotides. For each sample, more than three replicate PCR reactions using a DNA
polymerase with proofreading capability are run in parallel, and bands of the expected
size are extracted from a gel after electrophoretic separation to remove primer dimers
and primer residues. When multiple samples are analyzed in the same run, barcoded
primers are used for amplification, and the amplicons are carefully quantified and
mixed in equimolar amounts before being applied to the sequencing plate.
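
As a rough illustration of equimolar mixing, the sketch below converts a mass concentration into a molar concentration and derives the volume of each amplicon needed for an equal molar amount. It assumes double-stranded DNA with an average mass of 660 g/mol per base pair, a common approximation; the sample names, concentrations, and target amount are hypothetical.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair (approximation for double-stranded DNA)


def amplicon_nanomolar(conc_ng_per_ul, length_bp):
    """Convert ng/uL to nM for an amplicon of the given length."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * length_bp)


def pooling_volumes(samples, fmol_each):
    """samples maps name -> (ng/uL, length in bp).
    Returns name -> volume in uL giving fmol_each of every amplicon
    (1 nM equals 1 fmol/uL)."""
    return {
        name: fmol_each / amplicon_nanomolar(conc, length)
        for name, (conc, length) in samples.items()
    }


# Hypothetical quantification results for two barcoded amplicons.
samples = {"soil_1": (6.6, 100), "soil_2": (13.2, 100)}
print(pooling_volumes(samples, fmol_each=50))
```

The more concentrated amplicon contributes a proportionally smaller volume, so each sample is represented by the same number of molecules.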

[Figure 1.4: barcoded primer design, with the barcode placed between the adaptor
sequence and the 16S rRNA gene primer.]

Processing raw sequence data includes filtering out low-quality reads, although
the error rate for pyrosequencing was only 0.4% with the GS 20 instrument (Huse et al.,
2007). The suggested procedure is to discard reads with any errors in the 16S primers or
barcodes, or with an average quality score below 25 (Huse et al., 2007). In addition,
checking for errors in the reverse primer (3' end, the primer opposite the sequencing
start) is a further option (allowing a maximum reverse primer edit distance) for filtering
out low-quality reads, because errors are more likely to occur as sequence reads reach
the 3' end.
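
The filtering rules described above can be sketched as follows. This is a simplified illustration, not the pipeline's implementation: the barcode and primer sequences are placeholders, degenerate primer bases are ignored, and the reverse primer is assumed to appear (as its reverse complement) at the very end of the read.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def keep_read(seq, quals, barcode, fwd_primer, rev_primer_rc,
              min_mean_q=25, max_rev_edits=2):
    """Apply the three filters: exact barcode and forward primer,
    mean quality threshold, and reverse-primer edit distance."""
    if not seq.startswith(barcode + fwd_primer):
        return False
    if not quals or sum(quals) / len(quals) < min_mean_q:
        return False
    tail = seq[-len(rev_primer_rc):]
    return edit_distance(tail, rev_primer_rc) <= max_rev_edits


# Placeholder sequences, far shorter than real primers.
read = "ACGTACGT" + "GGTT" + "TTTAGC" + "CCAT"
print(keep_read(read, [30] * len(read), "ACGTACGT", "GGTT", "CCAA",
                max_rev_edits=1))
```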

Filtered sequences can be assigned to bacterial taxonomy by several applications:
RDP Classifier, a naive Bayesian rRNA classifier (Wang et al., 2007); SeqMatch, a
nearest-neighbor search (Wang et al., 2007); SILVA (Pruesse et al., 2007); and Greengenes
(DeSantis et al., 2006). To cluster the sequences into OTUs (operational taxonomic
units), sequences can be aligned with the Infernal aligner, an SCFG-based,
secondary-structure-aware aligner (Nawrocki et al., 2009) as adapted in the RDP
Pyrosequencing Pipeline, or with NAST, Nearest Alignment Space Termination
(DeSantis et al., 2006b).
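
Once aligned sequences have been converted to pairwise distances, OTUs can be formed by complete linkage clustering. The following is a naive sketch of that idea under an assumed distance cutoff; the distance matrix is hypothetical, and production tools use far more efficient algorithms.

```python
def complete_linkage(dist, cutoff):
    """Merge clusters greedily while the largest pairwise distance
    between the two closest clusters does not exceed the cutoff.
    dist is a symmetric matrix; returns lists of sequence indices."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: cluster distance is the maximum
                # distance over all cross-cluster pairs.
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > cutoff:
            break
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical distances among four reads.
dist = [
    [0.00, 0.01, 0.02, 0.10],
    [0.01, 0.00, 0.04, 0.12],
    [0.02, 0.04, 0.00, 0.11],
    [0.10, 0.12, 0.11, 0.00],
]
print(complete_linkage(dist, cutoff=0.03))
```

At a cutoff of 0.03, read 2 stays outside the first OTU even though it is within 0.02 of read 0, because its distance to read 1 (0.04) exceeds the cutoff; this is the behavior that distinguishes complete linkage from single linkage.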

QUANTIFICATION OF COMMUNITY STRUCTURE BY 16S rRNA GENE
PYROSEQUENCING

The quantification of bacterial species abundance by rRNA gene pyrosequencing
has been compared with other abundance measures such as FISH, quantitative PCR, and
16S rRNA gene clone libraries. For example, relative abundances of Exiguobacterium
and Psychrobacter measured by V4-rRNA gene pyrosequencing correlated with
relative abundances from Q-PCR of the same organisms' 16S rRNA genes, after
correction for copy number (Figure 5). The relative abundances of certain bacterial
groups in wastewater treatment systems measured by V4-rRNA pyrosequencing were
found to correlate with results from other methods, with some exceptions:
Chloroflexi and Nitrospira abundances were overestimated by FISH (or underestimated
by rRNA pyrosequencing), while the reverse was true for Betaproteobacteria by clone
libraries (Table 1). Potential primer bias in PCR-based measurements makes the
reliability of quantification by 16S rRNA pyrosequencing uncertain (Figure 6).
Measurements of Acetobacterium abundance by V6-rRNA pyrosequencing did not
correspond with FISH measurements using an Acetobacterium-specific probe;
Acetobacterium might be over-represented in the rRNA pyrosequencing result due to a
combination of DNA extraction and PCR bias (Gaidos et al., 2009).
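
The copy-number correction mentioned above can be sketched as dividing each taxon's read count by its number of 16S rRNA gene copies per genome before normalizing. The taxon names, read counts, and copy numbers below are hypothetical; the procedure actually used for Figure 5 may differ.

```python
def copy_corrected_abundance(read_counts, copies_per_genome):
    """Return relative abundances after dividing each taxon's read
    count by its assumed rRNA gene copy number."""
    genomes = {taxon: read_counts[taxon] / copies_per_genome[taxon]
               for taxon in read_counts}
    total = sum(genomes.values())
    return {taxon: value / total for taxon, value in genomes.items()}


# Hypothetical values: taxon_A carries three times as many gene copies.
reads = {"taxon_A": 600, "taxon_B": 400}
copies = {"taxon_A": 6, "taxon_B": 2}
print(copy_corrected_abundance(reads, copies))
```

In this example taxon_A accounts for 60% of the reads but only one third of the estimated genomes, showing how uncorrected read fractions can overstate high-copy-number organisms.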

Detection of bacterial species by rRNA pyrosequencing can also be compared with
culture-based methods. When these methods were compared, “culture-
negative/pyrosequencing-positive discordant pairs” (found only in the pyrosequencing
data set) were found, but “culture-positive/pyrosequencing-negative discordant pairs”
(found only by culturing) were rare (Price et al., 2009). The genus Rhodococcus was
dominant among isolates but was not detected in a clone library from a Czech
PCB-contaminated soil (Leigh et al., 2006; Leigh et al., 2007). However, rRNA
pyrosequencing showed that Rhodococcus was present at low abundance and was
preferentially cultured in this case (Chapter 4).

PHYLOGENY BY 16S rRNA GENE PYROSEQUENCING
Phylogenetic analyses using pyrosequencing data have proven useful (Andersson et
al., 2008; Chapter 2). However, the study of bacterial phylogeny with pyrosequencing
reads is strictly limited by the degree of polymorphism among bacterial groups within the
sequenced 16S rRNA gene region. Short read lengths make phylogenetic analysis
less robust because of decreased resolution, which is certainly the case at the species
level for FLX sequencing (Armougom et al., 2009). If two bacteria have the same 16S
rRNA gene sequence within the sequenced region, it is impossible to differentiate their
phylogeny. In addition, phylogenetic relationships inferred from full-length 16S rRNA
genes and from short pyrosequencing reads can disagree (Figure 7): trees built from
short pyrosequencing reads and from full-length 16S rRNA gene sequences sometimes
conflict with each other in the position of major phyla. Moreover, the actual
phylogenetic signal could be overwhelmed by the inherent pyrosequencing error rate.
Phylogenetic analysis with pyrosequencing reads should therefore be done carefully.

[Figure 1.5: relative abundances of Exiguobacterium and Psychrobacter measured by
Q-PCR and by V4-rRNA gene pyrosequencing.]

[Table 1.1: relative abundances of bacterial groups measured by rRNA gene
pyrosequencing compared with FISH and clone library results.]

[Figure 1.6: relative abundance (%) of bacterial groups measured by pyrosequencing and
by clone library.]

DIVERSITY AND SPECIES DISTRIBUTIONS IN BACTERIAL
COMMUNITIES

16S rRNA gene pyrosequencing reveals a “rare biosphere” in that thousands of
low-abundance populations are now detected (Sogin et al., 2006). The large sample sizes
obtained by such exhaustive sequencing allow more valid richness estimates to be made
by assuming a species/taxa-abundance distribution (TAD). Previously, diversity was
estimated by fitting data from 16S rRNA gene clone libraries (Hong et al., 2006) or
T-RFLP (Doroghazi and Buckley, 2008) to taxa-abundance curves and extrapolating from
these to estimate richness (Curtis et al., 2006). Having large numbers of sequences
circumvents the limitations of previous sampling methods and makes it possible to apply
rigorous statistical methods to fit TADs to rRNA pyrosequencing data, resulting in
better predictions of microbial diversity (Quince et al., 2008). Although the true
bacterial taxa-abundance distribution, and hence the best statistical model to fit,
remains unclear, richness estimates can be used in pre-metagenomic analysis to decide
the depth of sequencing effort required to cover the microbial genetic component, and
they are the basis for the systematic exploration of microbial diversity on the planet.

Figure 1.7. Examples of the phylogenetic analysis: Distribution of Burkholderia species
in California grassland.

Non-parametric estimators are used to measure bacterial diversity for practical
reasons, although they tend to underestimate diversity when sample sizes are small.
Pyrosequencing overcomes the limit of small sample size and has been used to
measure and compare diversity. For instance, a non-parametric estimate of the diversity
of commensal human oral microflora obtained by pyrosequencing was at least one order
of magnitude higher (>19,000 species) than previous estimates (Keijser et al., 2008). It
is worth noting that the short sequence fragments produced by pyrosequencing give
different species richness estimates depending on which variable regions the fragments
span. Compared with richness estimates from complete 16S rRNA gene fragments,
richness values were overestimated by the V1+V2 and V6 regions, underestimated by the
V3, V7, and V7+V8 regions, and nearly comparable for the V4, V5+V6, and V6+V7
regions (Youssef et al., 2009).
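
As an illustration of a non-parametric richness estimator, the sketch below computes the bias-corrected Chao1 estimate from OTU counts. Chao1 is used here only as a familiar example; the text above does not state which estimator each cited study applied, and the counts are hypothetical.

```python
def chao1(counts):
    """Bias-corrected Chao1 richness estimate:
    S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)),
    where F1 and F2 are the numbers of singleton and doubleton OTUs."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2.0 * (doubletons + 1))


# Hypothetical OTU table for one sample: 7 observed OTUs,
# 3 singletons and 2 doubletons.
print(chao1([10, 5, 1, 1, 1, 2, 2]))  # 8.0
```

Because the estimate depends on the numbers of singletons and doubletons, sequencing errors that create spurious rare OTUs, or short regions that merge distinct taxa, shift it directly.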

In bacterial communities with fewer taxa at the phylum level but high numbers at
the species and strain level, taxonomic richness would likely be underestimated because
short variable regions of the 16S rRNA gene would have insufficient resolution. An
example is the architecture of the highly speciated but phylum-impoverished human gut
microbiota. This is not the case for the soil environment, which has a more uniform
distribution across its phylogenetic architecture.

COMMUNITY PROFILING AND COMPARISONS
The main purpose of rRNA gene pyrosequencing is the profiling of various
bacterial communities, for instance the deep marine biosphere (Sogin et al., 2006; Huber
et al., 2007), soils (Roesch et al., 2007), oral microflora (Keijser et al., 2008),
oligarchic microbial assemblages in the anoxic bottom waters of a volcanic lake (Gaidos
et al., 2009), bacterial and archaeal communities in tidal flat sediments (Kim et al.,
2008), active PCB-degrading populations (Chapter 4), airborne microbial communities
(Bowers et al., 2009), and rhizosphere soils (Figure 8).

Barcoding in 16S rRNA gene pyrosequencing also allows the analysis of a larger
number of replicates than was previously possible with the clone library approach.
Hence, microbial communities can be compared reliably, and changes can be related to
their ecological causes. We used this strategy to compare different soil management
systems, one of which rapidly altered stored soil carbon, in agricultural plots in Africa.
Soil organic carbon (SOC) was the most important factor explaining differences in
microbial community structure among treatments. Most notably, members of the
Acidobacteria subdivisions GP4 and GP6 and of the Alphaproteobacteria were more
abundant in soils with relatively high SOC, whereas Acidobacteria subdivisions GP7 and
GP1, Actinobacteria, and Gemmatimonadetes were more prevalent in soil with lower SOC
(Chapter 2).

Bacterial communities in stools from bio-breeding diabetes-prone and bio-
breeding diabetes-resistant rats were compared, and different species were found to be
dominant. However, the relationship of these species to diabetes could not be determined
(Roesch et al., 2009a). V6-rRNA pyrosequencing was applied to human microbiomes in
throat, stomach, and fecal samples in a study focused on the effects of the presence of
Helicobacter pylori in the stomach. Hierarchical clustering based on UniFrac distance
showed that H. pylori-positive stomach samples have a different bacterial community
signature from H. pylori-negative samples (Andersson et al., 2008). A study of the impact
of diabetes and antibiotics on chronic wound microbiota, characterized by V3-16S rRNA
pyrosequencing, showed that wound microbiota from antibiotic-treated patients differed
significantly from that of untreated patients. Also, antibiotic use among diabetics
decreased the abundance of Streptococcaceae, which were more abundant among diabetics
than among non-diabetics. The authors concluded that some bacteria might be involved in
the non-healing state of some chronic wounds (Price et al., 2009). Fecal bacterial
populations of hamsters, determined by pyrosequencing of 16S rRNA tags, were analyzed
to understand the influence of dietary grain sorghum lipid extract (GSL). Pyrosequencing
revealed that the families Coriobacteriaceae and Erysipelotrichaceae were negatively
correlated with GSL intake and that Allobaculum was positively correlated with GSL
intake, while phylum-level composition did not differ. Hence, alterations of taxa at
deeper levels (small groups) were linked to diet (Martinez et al., 2009). These findings
suggest that rRNA gene pyrosequencing can be used to detect and quantify community
differences and to analyze disease-associated microbial gut ecology.

[Figure 1.8: relative abundance (%) of bacterial groups in rhizosphere soils.]

MEASURING BACTERIAL COMMUNITY DYNAMICS

Bacterial community dynamics can also be measured by 16S rRNA gene
pyrosequencing. A study of population dynamics in fermented foods, e.g. pearl millet
slurries, revealed that Firmicutes and lactic acid bacteria were detected throughout 24 h
of fermentation, whereas other bacteria were detected only at the beginning of
fermentation (Humblot and Guyot, 2009). Dethlefsen and colleagues (2008) analyzed the
disturbance of the human gut microbiota associated with an antibiotic (ciprofloxacin).
Ciprofloxacin treatment influenced the abundance of about a third of the bacterial taxa
in the gut and decreased the taxonomic richness, diversity, and evenness of the
community; however, the bacterial community returned to the pretreatment state,
indicating this community's resilience. Also, rRNA pyrosequencing may be used to
measure the outcome of managing microbial community composition to aid functional
stability in bioreactors (unpublished) and wastewater treatment systems (Appendix B3).
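
Richness, diversity, and evenness, the three quantities tracked in such time-series comparisons, can be computed from taxon counts as in the sketch below. The Shannon index and Pielou's evenness are used as common choices; the cited study may have used other indices, and the counts are hypothetical.

```python
import math


def shannon_metrics(counts):
    """Return (richness, Shannon diversity H, Pielou evenness J)
    for a list of per-taxon read counts."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    richness = len(counts)
    h = -sum((c / total) * math.log(c / total) for c in counts)
    # Evenness is undefined for a single taxon; 0.0 is returned by convention here.
    evenness = h / math.log(richness) if richness > 1 else 0.0
    return richness, h, evenness


# Hypothetical communities: an even one and one dominated by a single taxon.
print(shannon_metrics([25, 25, 25, 25]))
print(shannon_metrics([97, 1, 1, 1]))
```

Both communities have the same richness, but the second has much lower diversity and evenness, which is the kind of shift a disturbance can produce without any taxon being lost.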

BACTERIAL GROUPS THAT CORRELATE TO HABITAT
CHARACTERISTICS

Several studies have sought correlations between habitat characteristics and the
presence or relative abundance of certain bacterial groups. Bacterial community
composition across 87 different soils was significantly correlated with differences in
soil pH, largely driven by changes in the relative abundances of Acidobacteria,
Actinobacteria, and Bacteroidetes across the range of soil pH. Phylogenetic diversity of
the bacterial communities was also correlated with soil pH (Lauber et al., 2009). The
relative abundance, diversity, and composition of the phylum Acidobacteria were
strongly correlated with soil pH (Jones et al., 2009), suggesting the ecological
relevance of this poorly cultivated, little-known group. Also, a comparison of four
geographically distant microbial communities showed few shared members, indicating
that environmental characteristics strongly determine microbial community
composition (Fulthorpe et al., 2008).
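
A correlation between a habitat variable and the relative abundance of a bacterial group can be computed as sketched below. Pearson's coefficient is assumed here for simplicity; the cited studies may have used other statistics, and the pH and abundance values are invented.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical soils: relative abundance of one phylum falls as pH rises.
soil_ph = [4.0, 5.0, 6.0, 7.0]
abundance = [0.40, 0.30, 0.20, 0.10]
print(pearson_r(soil_ph, abundance))
```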


METHOD VALIDATION

The bias that can be caused by sample handling and experimental procedures, such
as sample storage and DNA extraction, can also be investigated by rRNA pyrosequencing.
Changes in bacterial community composition and diversity were studied in fecal samples
from healthy children, analyzed immediately after sampling and after storage at room
temperature for up to 72 h. In the stored samples, members of Bacteroides and
Clostridium decreased and members of the Enterobacteriaceae increased (Roesch et al.,
2009b). The bias of DNA extraction was studied by comparing the bacterial composition
of the DNA recovered after the first extraction and after the 6th serial extraction
(Feinstein et al., 2009). Rarely cultivated groups such as Acidobacteria,
Gemmatimonadetes, and Verrucomicrobia were extracted more efficiently in the first
extraction, while proportionally more Proteobacteria and Actinobacteria were recovered
in DNA from the 6th extraction.

AMPLICON PYROSEQUENCING OF PROTEIN ENCODING GENES
As described earlier, short read lengths offer limited phylogenetic information for
more conserved genes, such as the ribosomal genes. This limitation may be addressed by
targeting other, faster-evolving, phylogenetically informative genes. Pyrosequencing of a
protein-encoding gene, e.g. the chaperonin-60 universal target (cpn60 UT), provided
better resolution at the species level than 16S rRNA genes when describing the vaginal
microbial community (Schellenberg et al., 2009).


CONCLUSIONS AND FUTURE DIRECTIONS

Pyrosequencing of rRNA genes has opened a new path to assessing microbial
communities with respect to species distribution, diversity, organism identification,
community comparisons, and dynamics. Although rRNA pyrosequencing is currently
(arguably) the most effective method of bacterial community analysis, we have often
faced the problem of linking these 16S rRNA sequences to biological functions in the
microbial community, especially when the sequences reflect dominant species whose
functions are unknown (Fulthorpe et al., 2008). Also, rare members, which usually
comprise more than half the species in natural environments, are the outcome of
evolutionary history and represent a seemingly infinite source of genomic inventory
(Sogin et al., 2006). Gathering the genomic information and physiology of unknown
groups and rare members is beginning to be addressed by the GEBA Project (the Genomic
Encyclopedia of Bacteria and Archaea), which aims to systematically fill the gaps in
genome sequences for the major branches of the Bacteria and Archaea in the Tree of Life.

Microbial ecologists would benefit from consensus on a standard operating
procedure for rRNA pyrosequencing, mostly because the short read length of current
pyrosequencing techniques has led to the use of different universal primers and the
targeting of different regions of the SSU rRNA, resulting in non-comparable datasets
generated by numerous laboratories (discussed in Chapter 5). Even though rRNA
pyrosequencing is powerful, it still provides a rather sketchy view of microbial
communities, since the resolution of an already conserved gene is much lower than that
of whole metagenomic analysis. Community-level MLST/A (Multi Locus Sequence
Typing/Analysis) may become possible in the near future and, if so, should provide
better insight into microbial community diversity and perhaps membership, and a good
bridge to metagenomic data.


REFERENCES

Anderson IC, Cairney JW (2004) Diversity and ecology of soil fungal communities:
increased understanding through the application of molecular techniques. Environ
Microbiol 6:769-779

Andersson AF, Lindberg M, Jakobsson H, Backhed F, Nyrén P, Engstrand L (2008)
Comparative analysis of human gut microbiota by barcoded pyrosequencing.
PLoS One 3:e2836

Armougom F, Bittar F, Stremler N, Rolain JM, Robert C, Dubus JC, Sarles J, Raoult D,
La Scola B (2009) Microbial diversity in the sputum of a cystic fibrosis patient
studied with 16S rDNA pyrosequencing. Eur J Clin Microbiol Infect Dis (in
process)

Bowers RM, Lauber CL, Wiedinmyer C, Hamady M, Hallar AG, Fall R, Knight R, Fierer
N (2009) Characterization of airborne microbial communities at a high-elevation
site and their potential to act as atmospheric ice nuclei. Appl Environ Microbiol
75:5121-5130

Brosius J, Palmer ML, Kennedy PJ, Noller HF (1978) Complete nucleotide sequence of a
16S ribosomal RNA gene from Escherichia coli. Proc Natl Acad Sci U S A
75:4801-4805

Cole JR, Wang Q, Cardenas E, Fish J, Chai B, Farris RJ, Kulam-Syed-Mohideen AS,
McGarrell DM, Marsh T, Garrity GM, Tiedje JM (2009) The Ribosomal Database
Project: improved alignments and new tools for rRNA analysis. Nucleic Acids Res
37:D141-145

Curtis TP, Head IM, Lunn M, Woodcock S, Schloss PD, Sloan WT (2006) What is the
extent of prokaryotic diversity? Philos Trans R Soc Lond B Biol Sci 361:2023-
2037

DeSantis TZ, Hugenholtz P, Larsen N, Rojas M, Brodie EL, Keller K, Huber T, Dalevi
D, Hu P, Andersen GL (2006) Greengenes, a chimera-checked 16S rRNA gene
database and workbench compatible with ARB. Appl Environ Microbiol 72:5069-
5072

Dethlefsen L, Huse S, Sogin ML, Relman DA (2008) The pervasive effects of an
antibiotic on the human gut microbiota, as revealed by deep 16S rRNA
sequencing. PLoS Biol 6:e280

Doroghazi JR, Buckley DH (2008) Evidence from GC-TRFLP that bacterial communities
in soil are lognormally distributed. PLoS One 3:e2910

Feinstein LM, Sul WJ, Blackwood CB (2009) Assessment of bias associated with
incomplete extraction of microbial DNA from soil. Appl Environ Microbiol (in
print)

Fulthorpe RR, Roesch LF, Riva A, Triplett EW (2008) Distantly sampled soils carry few
species in common. ISME J 2:901-910

Gaidos E, Marteinsson V, Thorsteinsson T, Johannesson T, Runarsson AR, Stefansson A,
Glazer B, Lanoil B, Skidmore M, Han S, Miller M, Rusch A, Foo W (2009) An
oligarchic microbial assemblage in the anoxic bottom waters of a volcanic
subglacial lake. ISME J 3:486-497

Hamady M, Knight R (2009) Microbial community profiling for human microbiome
projects: Tools, techniques, and challenges. Genome Res 19:1141-1152

Hamp TJ, Jones WJ, Fodor AA (2009) Effects of experimental choices and analysis noise
on surveys of the "rare biosphere". Appl Environ Microbiol 75:3263-3270

Hong SH, Bunge J, Jeon SO, Epstein SS (2006) Predicting microbial species richness.
Proc Natl Acad Sci U S A 103:117-122

Huber JA, Mark Welch DB, Morrison HG, Huse SM, Neal PR, Butterfield DA, Sogin
ML (2007) Microbial population structures in the deep marine biosphere. Science
318:97-100

Jones RT, Robeson MS, Lauber CL, Hamady M, Knight R, Fierer N (2009) A
comprehensive survey of soil acidobacterial diversity using pyrosequencing and
clone library analyses. ISME J 3:442-453

Lauber CL, Hamady M, Knight R, Fierer N (2009) Pyrosequencing-based assessment of
soil pH as a predictor of soil bacterial community structure at the continental scale.
Appl Environ Microbiol 75:5111-5120

Leigh MB, Pellizari VH, Uhlik O, Sutka R, Rodrigues J, Ostrom NE, Zhou J, Tiedje JM
(2007) Biphenyl-utilizing bacteria and their functional genes in a pine root zone
contaminated with polychlorinated biphenyls (PCBs). ISME J 1:134-148

Leigh MB, Prouzova P, Mackova M, Macek T, Nagle DP, Fletcher JS (2006)
Polychlorinated biphenyl (PCB)-degrading bacteria associated with trees in a
PCB-contaminated site. Appl Environ Microbiol 72:2331-2342

Ley RE, Hamady M, Lozupone C, Turnbaugh PJ, Ramey RR, Bircher JS, Schlegel ML,
Tucker TA, Schrenzel MD, Knight R, Gordon JI (2008) Evolution of mammals
and their gut microbes. Science 320:1647-1651

Liu Z, Lozupone C, Hamady M, Bushman FD, Knight R (2007) Short pyrosequencing
reads suffice for accurate microbial community analysis. Nucleic Acids Res
35:e120

Martinez I, Wallace G, Zhang C, Legge R, Benson AK, Carr TP, Moriyama EN, Walter J
(2009) Diet-induced metabolic improvements in a hamster model of
hypercholesterolemia are strongly linked to alterations of the gut microbiota. Appl
Environ Microbiol 75:4175-4184

Nawrocki EP, Kolbe DL, Eddy SR (2009) Infernal 1.0: inference of RNA alignments.
Bioinformatics 25:1335-1337

Parameswaran P, Jalili R, Tao L, Shokralla S, Gharizadeh B, Ronaghi M, Fire AZ (2007)
A pyrosequencing-tailored nucleotide barcode design unveils opportunities for
large-scale sample multiplexing. Nucleic Acids Res 35:e130

Price LB, Liu CM, Melendez JH, Frankel YM, Engelthaler D, Aziz M, Bowers J, Rattray
R, Ravel J, Kingsley C, Keim PS, Lazarus GS, Zenilman JM (2009) Community
analysis of chronic wound bacteria using 16S rRNA gene-based pyrosequencing:
impact of diabetes and antibiotics on chronic wound microbiota. PLoS One
4:e6462

Pruesse E, Quast C, Knittel K, Fuchs BM, Ludwig W, Peplies J, Glockner FO (2007)
SILVA: a comprehensive online resource for quality checked and aligned
ribosomal RNA sequence data compatible with ARB. Nucleic Acids Res 35:7188-
7196

Quince C, Curtis TP, Sloan WT (2008) The rational exploration of microbial diversity.
ISME J 2:997-1006

Quinlan AR, Stewart DA, Stromberg MP, Marth GT (2008) Pyrobayes: an improved base
caller for SNP discovery in pyrosequences. Nat Methods 5:179-181

Rodrigues DF, da C Jesus E, Ayala-Del-Rio HL, Pellizari VH, Gilichinsky D, Sepulveda-
Torres L, Tiedje JM (2009) Biogeography of two cold-adapted genera:
Psychrobacter and Exiguobacterium. ISME J 3:658-665

Roesch LF, Casella G, Simell O, Krischer J, Wasserfall CH, Schatz D, Atkinson MA,
Neu J, Triplett EW (2009b) Influence of fecal sample storage on bacterial
community diversity. Open Microbiol J 3:40-46

Roesch LF, Fulthorpe RR, Riva A, Casella G, Hadwin AK, Kent AD, Daroub SH,
Camargo FA, Farmerie WG, Triplett EW (2007) Pyrosequencing enumerates and
contrasts soil microbial diversity. ISME J 1:283-290

Roesch LF, Lorca GL, Casella G, Giongo A, Naranjo A, Pionzio AM, Li N, Mai V,
Wasserfall CH, Schatz D, Atkinson MA, Neu J, Triplett EW (2009a) Culture-
independent identification of gut bacteria correlated with the onset of diabetes in a
rat model. ISME J 3:536-548

Schellenberg J, Links MG, Hill JE, Dumonceaux TJ, Peters GA, Tyler S, Ball TB,
Severini A, Plummer FA (2009) Pyrosequencing of the chaperonin-60 universal
target as a tool for determining microbial community composition. Appl Environ
Microbiol 75:2889-2898

Sogin ML, Morrison HG, Huber JA, Mark Welch D, Huse SM, Neal PR, Arrieta JM,
Herndl GJ (2006) Microbial diversity in the deep sea and the underexplored "rare
biosphere". Proc Natl Acad Sci U S A 103:12115-12120

Sun Y, Cai Y, Liu L, Yu F, Farrell ML, McKendree W, Farmerie W (2009) ESPRIT:
estimating species richness using large collections of 16S rRNA pyrosequences.
Nucleic Acids Res 37:e76

Tringe SG, Hugenholtz P (2008) A renaissance for the pioneering 16S rRNA gene. Curr
Opin Microbiol 11:442-446

Wang Q, Garrity GM, Tiedje JM, Cole JR (2007) Naive Bayesian classifier for rapid
assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ
Microbiol 73:5261-5267

Youssef N, Sheik CS, Krumholz LR, Najar FZ, Roe BA, Elshahed MS (2009) A
comparative study of species richness estimates obtained using nearly complete
fragments and simulated pyrosequencing-generated fragments in 16S rRNA gene-
based environmental surveys. Appl Environ Microbiol (in process)

CHAPTER II

COMMUNITY RESPONSES TO AGRICULTURAL PRACTICES IN TROPICAL
AFRICA ANALYZED BY PYROSEQUENCING


ABSTRACT

We analyzed the microbial community that developed after four years of testing
different soil-crop management systems in the savannah-forest transition zone of Eastern
Ghana, where management systems can rapidly alter stored soil carbon as well as soil
fertility and structure. The treatments were: (i) the native practice of winter regrowth of
native elephant grass (Pennisetum purpureum) followed by burning of that biomass before
planting maize in the spring; (ii) the same practice but without burning and with mineral
nitrogen fertilizer applied to the maize; (iii) a winter crop of a legume, pigeon pea
(Cajanus cajan), followed by maize; (iv) a treatment kept vegetation-free in the winter
(bare fallow) followed by maize; and (v) unmanaged elephant grass-shrub vegetation. The
mean soil carbon contents of the sampled soils were 1.29, 1.67, 1.54, 0.80, and 1.34,
respectively, differences that could be expected to affect the microbial communities.

From the more than 290,000 sequences obtained by pyrosequencing of the SSU
rRNA gene, 80% belonged to seven bacterial phyla common to most soils: Acidobacteria,
Proteobacteria, Firmicutes, Actinobacteria, Verrucomicrobia, Gemmatimonadetes, and
Bacteroidetes. Less than 5% of all sequences were identical to SSU rRNA gene
sequences previously recovered from cultivated bacteria; most were 90% or more similar
to sequences in public databases, but 1.2% (2330 sequences) had lower than 85%
similarity to any environmental or isolate sequence, suggesting potentially novel phyla.
Canonical correspondence analysis and stepwise multiple regression showed that soil
organic carbon (SOC) was the most important factor explaining differences in
microbial community structure among treatments. Most notably, members of the
Acidobacteria subdivisions GP4 and GP6 and of the Alphaproteobacteria were more
abundant in soils with relatively high SOC, whereas Acidobacteria subdivisions GP7 and
GP1, Actinobacteria, and Gemmatimonadetes were more prevalent in soil with lower SOC.
While community structure was most affected by SOC, diversity appeared to be
influenced by a combination of factors. The data suggest that the use of a pigeon-pea
fallow in tropical agriculture promotes higher microbial diversity and sequesters more
soil organic carbon, thus improving soil structure, function, and resiliency.

ABBREVIATIONS
ID: Nucleotide identity
SOC: Soil organic carbon
MD: Mean uncorrected nucleotide distance
EbM: Maize-elephant grass (Pennisetum sp) rotation with fallow residue burning
EfM: Fertilized maize-elephant grass rotation with minimum tillage of fallow residue by
hand slashing
PM: Maize-pigeon pea (Cajanus cajan) rotation with minimum tillage of fallow residue
by hand slashing
BF: Maize-bare fallow rotation with complete residue removal during fallow period
Eu: Unmanaged elephant grass
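
The abbreviations ID and MD refer to quantities computed from aligned sequences. As a hedged sketch, assuming that the uncorrected distance is the fraction of mismatched positions among aligned, non-gap positions and that identity is its complement, they could be computed as follows. The gap handling, function names, and toy sequences are assumptions for illustration, not the procedure used in this chapter.

```python
from itertools import combinations


def uncorrected_distance(a, b, gap="-"):
    """Fraction of mismatches over positions where neither aligned
    sequence has a gap (the uncorrected, or p-, distance)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    compared = mismatches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue
        compared += 1
        mismatches += x != y
    if compared == 0:
        raise ValueError("no comparable positions")
    return mismatches / compared


def nucleotide_identity(a, b):
    """Identity as the complement of the uncorrected distance."""
    return 1.0 - uncorrected_distance(a, b)


def mean_distance(seqs):
    """Mean uncorrected distance over all pairs of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(uncorrected_distance(a, b) for a, b in pairs) / len(pairs)


# Toy alignment of three short sequences.
aligned = ["ACGT-A", "ACGA-A", "ACGTTA"]
print(nucleotide_identity(aligned[0], aligned[1]), mean_distance(aligned))
```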

INTRODUCTION

Conversion of natural ecosystems to agriculture results in soil organic carbon
(SOC) losses due to increased organic matter oxidation, leaching, and erosion (1).
Globally, deforestation rates are greater in the tropics than rates of current or historical
change in any other region (2). SOC retention increases soil cation exchange capacity,
improves structure, and conserves nitrogen, phosphorus, potassium, and sulfur.
Cultivation, in concert with fertilizer application, tillage, and residue removal, results in
rapid SOC depletion followed by a slower decrease, typically spanning several decades,
before a new steady state is reached (3). These losses can range from 20% to 70% of
the original SOC content (4), but can be remediated with the use of cover crops and
minimum tillage when the residue is not removed (5, 6). Reduced tillage increases SOC
retention through macroaggregate preservation (7) and has been proposed as a primary
method for optimizing SOC in fine-textured soils (8). Current agricultural practices in
tropical regions typically involve fallow residue removal, either by grazing or burning.
This practice has in recent years been re-evaluated with the goal of gaining benefits from
developing a winter crop that would provide food, sequester more carbon in the soil,
improve soil fertility and structure, and provide the potential for earning cash from the
developing carbon markets (9).

While much study has focused on the chemical and physical changes to soil from
different cropping systems, the associated shifts in soil microbial community structure
and function remain largely unknown. Soil harbors the largest reservoir of microbial
diversity due to an enormous number of niches, small-scale spatial isolation (10, 11), and
3.8 billion years of evolution. Soil microbial communities are responsible for carbon and
nutrient cycling and are thus an integral component of soil productivity and the global
element cycles. Therefore, their response needs to be understood when developing new
agricultural practices. Recent studies assessing soil microbial community changes due to
cropping systems used methods such as clone libraries (12) and denaturing gradient gel
electrophoresis (DGGE) (13), which often lack the coverage and resolution necessary to
reveal changes among treatments. Pyrosequencing (14) now allows us to define diversity
and complexity by targeted SSU rRNA gene sequencing (15-17) at such depth that
community responses may be quantified in contrasting soil management schemes.

In this study, we used SSU rRNA gene pyrosequencing to determine the effect
of different maize-fallow rotations on soil microbial communities in the savannah-forest
transition zone of Ghana (18). Soils were sampled from four replicated plots after maize
harvest and after 4 years of the following annual rotations: 1) EbM: growth of elephant
grass (Pennisetum sp.) in the winter with its residue burned, followed by maize
cultivation (native practice); 2) PM: winter pigeon pea (Cajanus cajan) crop with minimal
tillage of fallow residues, followed by maize cultivation; 3) EfM: growth of elephant
grass with no burn, followed by fertilized maize; 4) BF: bare fallow, i.e., no
fallow-season plant, followed by maize cultivation; and 5) Eu: re-growth of the native
elephant grass-shrub vegetation left unmanaged for 4 years (native condition).

RESULTS

Characterization of Microbial Communities and Phylogenetic Structures.
After trimming of the forward and reverse primers and passage of trimmed reads through
quality filters to minimize the effects of embedded pyrosequencing errors, more than
290,000 sequencing reads with an average length of 207 bp were obtained. The number
of high-quality sequences per sample was evenly distributed, ranging from 7,519 to 12,204
(Table 1). When operational taxonomic units (OTUs) were defined at 95% identity (ID),
rarefaction curves indicated that sampling was, as expected, not fully exhaustive. The
phylogenetic architecture (19, 20) of these soil communities showed extensive
deep-lineage variation, with a phyla-rich pattern typical of soil habitats. In contrast,
the microbial community from a carbon-amended aquifer exhibited shallow-lineage variation
with a pattern rich in lower-level taxa (species), which drastically increased at 98% ID
(Fig. 1A). Based on the non-parametric estimator Chao1, the fallow plant rotations PM and
EbM led to higher bacterial richness compared to BF and Eu (P<0.007) (Table 1).

[Table 2.1: contents not recoverable from the scanned source.]

Microbial Members in Ghana Soils. Taxonomic classification of the sequences
was performed using the RDP Classifier trained on species type-strain sequences from the
Taxonomic Outline of the Bacteria (TOBA; http://www.taxonomicoutline.org/) along
with additional sequences for regions of bacterial diversity not well covered by TOBA.
The classifier was set to a bootstrap confidence threshold of 50%. Sequences covered by
our newly designed primer set were assigned to 23 phyla, 57 orders, 149 families, and
490 genera. Irrespective of the rotation practice, seven phyla accounted for 73% to 86%
of total sequences in a given sample: Acidobacteria, Proteobacteria, Firmicutes,
Actinobacteria, Verrucomicrobia, Gemmatimonadetes, and Bacteroidetes (Fig. 1B).
These phyla appear to be ubiquitous, as has been observed in SSU rRNA clone libraries
from most soil environments (21, 22). Notably, BF contained significantly more
Actinobacteria sequences (15.4%, SD 9.7%) than the other treatment samples (ANOVA,
P=0.037) (Fig. 1B).

Structural Differences in Microbial Communities among Agricultural Plots.
In order to identify differences among the microbial communities, all 192,835 sequences
were clustered at 95% ID, yielding 26,287 clusters. All nineteen microbial communities
(including the Iowa soil, IM) were compared by calculating the pairwise abundance-based
adjusted Sorensen similarity index (23). This index was used in Principal Coordinate
Analysis (PCoA), which showed that EfM, EbM, and PM generally grouped together whereas
BF was distinct. The IM was clearly distinct from the Ghanaian soils, most likely
due to a different soil origin and history (Fig. 2A).
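The abundance-based Sorensen index used above can be illustrated with a short sketch. The version below is the unadjusted form, 2UV / (U + V), where U and V are the fractions of sequences in each sample that fall in clusters shared by both samples. The adjusted estimator of Chao et al. (23) applied in this study additionally corrects U and V for unseen shared clusters and is not reproduced here. The function name and the example counts are hypothetical.

```python
def abundance_sorensen(sample_a, sample_b):
    """Unadjusted abundance-based Sorensen similarity, 2UV / (U + V).

    sample_a, sample_b: dicts mapping cluster id -> sequence count.
    U (V) is the fraction of sequences in sample A (B) that belong to
    clusters present in both samples.
    """
    total_a = sum(sample_a.values())
    total_b = sum(sample_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("both samples must contain at least one sequence")
    shared = [c for c, n in sample_a.items() if n > 0 and sample_b.get(c, 0) > 0]
    u = sum(sample_a[c] for c in shared) / total_a
    v = sum(sample_b[c] for c in shared) / total_b
    if u + v == 0:
        return 0.0  # the samples share no clusters
    return 2 * u * v / (u + v)


# Hypothetical cluster counts for two samples (not data from this study).
plot_1 = {"cluster_1": 6, "cluster_2": 3, "cluster_3": 1}
plot_2 = {"cluster_1": 2, "cluster_2": 2, "cluster_4": 4}
print(round(abundance_sorensen(plot_1, plot_2), 3))
```

A matrix of such pairwise values (or the corresponding dissimilarities, 1 minus the index) is the kind of input an ordination such as PCoA operates on.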

[Figure 2.1 image. Panel A: x-axis, Distance (% ID), from 100 to 80; series: Maize soil (Ghana), River sediment, Rhizosphere, Carbon amended aquifer. Panel B: y-axis, Abundance (%); series: EbM, EfM, PM, BF, Eu.]

Figure 2.1. Microbial community structure and composition. (A) Phylogenetic
architecture of microbial communities among habitats. Richness was estimated by
rarefaction curves with 10,000 randomly selected sequences from each habitat. Sequence
data sets for the rhizosphere, river sediment, and carbon-amended aquifer samples
(unpublished) were obtained as described. (B) Phylum-level composition of the microbial
communities in Ghanaian soil.
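As an illustration of the rarefaction behind panel A, the sketch below computes the expected number of OTUs in a random subsample of n sequences using the standard analytic (hypergeometric) rarefaction formula. This is an assumption made for illustration only: the curves in this study were generated with the RDP pipeline tools from 10,000 randomly selected sequences, and the function name and counts below are hypothetical.

```python
from math import comb


def expected_richness(otu_counts, n):
    """Expected number of OTUs observed in a random subsample of n sequences.

    otu_counts: iterable of sequence counts per OTU.
    Uses E[S_n] = sum_i (1 - C(N - N_i, n) / C(N, n)), where N is the
    total number of sequences and N_i the count of OTU i.
    """
    counts = [c for c in otu_counts if c > 0]
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the total number of sequences")
    denominator = comb(total, n)
    # comb(total - c, n) is 0 when the subsample cannot avoid OTU i.
    return sum(1 - comb(total - c, n) / denominator for c in counts)


# Hypothetical OTU counts; evaluate the curve at a few subsample sizes.
counts = [2, 1, 1]
for size in (1, 2, 4):
    print(size, expected_richness(counts, size))
```

Evaluating the function over a range of n values gives one rarefaction curve; repeating this for OTUs defined at different identity thresholds would give the kind of comparison across % ID shown in the panel.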

[Figure 2.2 image. Ordination plots of the sample replicates (IM, BF, EfM, EbM, PM); environmental vectors: available P, TN, biomass, SOC; x-axis of panel B: CCA1.]

Figure 2.2. Microbial community comparison. (A) PCoA based on abundance-based
adjusted Sorensen similarity indices. (B) Two-dimensional CCA ordination plot. The
magnitudes of the environmental vectors, microbial biomass (biomass), total nitrogen
(TN), available phosphorus (available P), and soil organic carbon (SOC), are
represented by arrows. Cluster positions are indicated by grey symbols.


Canonical Correspondence Analysis (CCA) was implemented in order to establish
the linkage between cluster abundance and the environment by implicitly embedding the
environmental soil data (Table 1) with the cluster abundance. This method explained
36% of the cluster variability at the whole-community level. Model significance was
confirmed using anovasim (number of permutations=10000, Pseudo-P<0.005) and
permutation (number of permutations=10000, Pseudo-P<0.02) tests. The first ordination
axis was positively correlated with both TN and SOC, while the second ordination axis
correlated with available P. The BF was negatively correlated with all environmental
variables and was clearly distinguishable from the others. Two independent methods,
PCoA and CCA, thus both illustrated the distinct structure of the BF community.

To identify taxonomic groups that were most responsive to fallow practice,
clusters were selected that exhibited at least a three-fold abundance difference in BF
compared to the other agricultural treatments. Using this approach [Supporting
information (SI) text 1], 620 clusters were identified that accounted for approximately
25% of total sequences. Clusters more predominant under fallow rotation were classified
as Acidobacteria GP6 and GP4 (EbM and PM), class Bacilli (EbM), and
Alphaproteobacteria (EfM). In contrast, clusters more abundant in BF were mostly
affiliated with Actinobacteria, Acidobacteria GP1, and Gemmatimonadetes (Fig. 3).

In order to investigate the environmental factors that influenced cluster
abundance, stepwise multiple regressions of the 620 clusters were performed. Significant
stepwise multiple regressions were identified for 287 clusters (46%) (adjusted P<0.05).
SOC was the most consistent significant predictor of relative cluster abundance among
the sites, followed by TN and available P. Among those clusters, 182 (63%) included
SOC in the regression, 132 included total nitrogen, 130 included available P, 90 included
pH, and 35 included microbial biomass. Regression slopes for SOC were positive mostly
for clusters affiliated with Proteobacteria and Acidobacteria GP4, GP5, and GP6,
reflecting increasing cluster abundance with larger SOC values. In contrast,
Actinobacteria and Verrucomicrobia clusters were negatively correlated with SOC.

[Figure 2.3 image. Tree labels: Actinobacteria, Gemmatimonadetes, Unc. Bacteria, GP4, Proteobacteria, Acidobacteria GP6, Bacteroidetes; scale bar: 0.01.]

Figure 2.3. Neighbor-joining phylogenetic tree displaying the 287 clusters with
significant stepwise multiple regressions on any of the environmental parameters. From
inner to outer rings, red-colored rings represent the fold-difference in relative cluster
abundance in BF compared to EbM, EfM, and PM. Blue-colored rings represent the
fold-difference in relative cluster abundance in EbM, EfM, and PM compared to BF. The
outer ring is color-coded according to the taxonomic placement of the clusters.

Characterization of New Clades of Sequences Unaffiliated with Known
Sequences. In order to determine the relatedness between our sequences and those in the
public database, the uncorrected nucleotide distance to the closest public isolate and
environmental sequence was calculated. The mean uncorrected nucleotide distances (MD)
when sequences were compared to the environmental plus isolate database (MDENV) or the
isolate database (MDISO) were 96.8% and 88.6% ID, respectively. Interestingly, each
bacterial phylum or class exhibited a distinct MDISO. For Bacilli and Alphaproteobacteria
the MDISO values were 98.2% and 95.4%, respectively, whereas for Acidobacteria,
Gemmatimonadetes, Verrucomicrobia, Chloroflexi, Planctomycetes, and Nitrospira the
MDISO values were below 90% ID (Fig. 4A).

In addition, 2,330 sequences had a similarity below 85% ID to any
environmental or isolate sequence in public databases. When clustered at 95% ID, 286
OTUs (941 sequences) contained at least two sequences, whereas 1,389 OTUs contained a
single sequence. Notably, 144 OTUs (including a large OTU containing 27 sequences)
originated from multiple samples. This suggests that novel, yet-to-be-sequenced bacterial
clades exist that are poorly classified by both the modified TOBA (24) and Hugenholtz
schemes (http://greengenes.lbl.gov) (Fig. 4B).

[Figure 2.4 (A) image; caption and axis labels not recoverable from the scanned source.]

DISCUSSION

Pyrosequencing of the SSU rRNA gene was used to contrast microbial
community structures at a greater depth and with more replication than typically
attainable through previous methods. These analyses benefit from proper selection of the
sequencing region in SSU rRNA genes (SI text 2 and Fig. S1) (25) and from the
availability of high-throughput analysis tools (26).

We identified changes in bacterial community diversity responsive to agricultural
fallow treatments. Bacterial richness was higher in all agricultural plots when compared
to an elephant grass-shrub dominated, unmanaged plot (Eu). This illustrates the influence
of agricultural management on microbial community structure. After four years, bacterial
groups responsive to particular treatments were opportunistic additions to the endogenous
community represented by Eu. These groups are likely part of the “rare biosphere” in the
Eu community and serve as a genetic or functional reservoir. Physical disturbance of the
soil due to plowing, planting, and burning of fallow plants may increase spatial resource
competition. Among the treatments, low diversity occurred in the bare fallow treatment
and was likely due to overall resource limitation from low exogenous organic matter
input. However, fertilizer application also restricted diversity, perhaps due to the higher
nutrient availability driving a less metabolically diverse r-selected community. While
fertilizer application exhibited the highest organic matter deposition, the PM treatment
served to sequester carbon in woody biomass. Microbial diversity was highest in this
treatment, possibly due to slower nutrient release from the more recalcitrant pigeon pea
organic matter structure, steady N addition from N2 fixation, and P solubilization. Based
on these results, pigeon pea appears to be the most appropriate cover crop in a tropical
ecosystem such as this, sustaining a diverse bacterial community while sequestering
SOC and thus improving overall soil health.

[Figure 2.4 (B) image; axis and taxon labels not recoverable from the scanned source.]

Figure 2.4. (B) NEO plots for a representative microbial community under bare fallow
treatment (BF3). Open circles 1, 2, and 3 are examples of OTUs with a similarity below
85% ID.

Overall soil microbial community structure and specific taxa distribution were
found to be most affected by SOC abundance. Sequestered carbon appears to largely
influence Actinobacteria abundance in soil. The lowest-SOC treatment (bare fallow)
consistently exhibited the highest abundance of Actinobacteria, largely of the subclass
Rubrobacteridae. Previously isolated bacteria within this subclass, Rubrobacter (27) and
Thermoleophilum (28), are resistant to radiation and are found primarily in arid soil,
which is consistent with the harsher conditions of this soil, since the summer maize crop
was meager in years 3 and 4 (Table 1). Though not selected by regression as temperature
was not a variable, the high Bacilli abundance in the burned treatment (EbM) was notable
and may be due to the heat resistance of these spore-forming bacteria. The traditional
burn of the fallow season vegetation has resulted in measured soil temperatures as high as
C (29), which could inﬂuence survivors in surface soil communities. In contrast to the
dogma that all Acidobacteria are oligotrophs (20), we found that certain groups were
positively correlated to SOC and were present in high abundance in the nutrient enriched
plots. However, overall it does not appear that Acidobacteria respond uniformly to
environmental variables, which could be expected for this very large and diverse phylum.

Our observations support the view that SSU rRNA gene pyrosequencing can be used to
assess microbial abundances in soils among different environments and to test widely
held inferences that were perhaps based on insufficient data. First, our data show that
4.1% of sequences were identical to SSU rRNA gene sequences previously recovered from
cultivated bacteria, in comparison to the common notion that “less than 1% of bacteria
are cultivated”. However, since our reads covered only a small portion of the total SSU
rRNA gene and our sampling was not exhaustive, this estimate may be artificially
inflated. In order to extrapolate to the full diversity coverage of the samples, we
calculated the ratio of Chao1 with sequences identical to isolates at 100% ID to Chao1
with all sequences at 100% ID. This estimate was 0.13%, the adjusted value expected with
exhaustive sequencing. Second, our data indicate that most members of the Acidobacteria,
Verrucomicrobia, Gemmatimonadetes, Nitrospira, and Planctomycetes are poorly cultivated,
whereas many Proteobacteria and Firmicutes, and most of the Bacilli, have been isolated
(30, 31). Within the Proteobacteria, however, the Gammaproteobacteria have a large number
of highly divergent, uncultivated members (Fig. 4A). This is particularly interesting
since it has been generally assumed that the Gammaproteobacteria are easily cultured and
most of their diversity is known. Third, the massive compilation of SSU rRNA gene
sequences yielded highly divergent sequences from groups that were not previously
sequenced. Based on our evidence, we suggest that the 2,330 sequences with less than 85%
ID to SSU rRNA gene sequences in public databases represent deeply divergent taxa that
have yet to be isolated or characterized. As such, this method is useful in discovering
novel bacterial clades and in providing potential probes to aid in their recovery or for
studying their ecology.

In conclusion, this study illustrates the usefulness of pyrosequencing for the
comparison of microbial community structures. Land use change, including the
expansion of agriculture in the tropics, is having major effects on ecosystems and on our
climate. These changes will most likely alter the supporting microbial communities
and perhaps the soil processes and ecosystem services they provide. The new sequencing
methodologies now provide the depth and replication needed to assess microbial change
as a part of evaluating management and land use impacts. In this case, our data suggest
that the use of a pigeon pea winter crop in tropical agriculture not only promotes a higher
microbial diversity but also serves to sequester soil organic carbon, thus improving soil
structure, function, and resiliency.

MATERIALS AND METHODS

Experimental Design and Sampling. The research site was located at the Kpeve
Agricultural Experimental Station (KAES) in Volta Region, Ghana (coordinates 6°
43.15′N, 0° 20.45′E). Classified as a savanna to forest transitional zone, the area is
dominated by Haplic Lixisols (sandy clay loams), Haplic Acrisols, and Leptic Haplic
Acrisols. Soil samples were taken from each of four replicate plots (50 m by 80 m) in a
randomized complete block design with a 2.5 cm x 18.5 cm corer on September 10, 2006,
after the maize harvest and after 4 years of the same annual rotations (32). Each replicate
sample was a homogenized composite of ten random sub-samples (18), with the
exception of Eu, for which composites were made of two sub-samples separated by 0.7 m.
The soils were immediately placed on ice and then stored at -20 °C until DNA extraction.
The soil was cultivated at the time of plot establishment but not after. The Iowa soil
(IM), classified as Tama silty clay loam, was collected on Dec. 1, 2006, following a maize
crop that was preceded by soybean, and was under no-till management for over 5 years.


SSU rRNA Gene Amplicon Pyrosequencing. Soil DNA was extracted with the
Mobio PowerSoil DNA Isolation Kit (Mobio, Carlsbad, CA) according to the
manufacturer’s instructions. Primers were designed with barcodes for pyrosequencing to
accommodate multiple samples in a single PicoTiterPlate (Roche Applied Science,
Indianapolis, IN). The forward key-tagged primers were composed of sequencing adaptor
A, sample-specific 4- or 6-bp keys, and a eubacterial 563F primer (the portion following
the keys in the sequence below). The reverse fusion primer consisted of sequencing
adaptor B and a eubacterial 802R primer. All primers were passed through dual HPLC
purification (Integrated DNA Technologies, Coralville, IA) in order to increase primer
specificity and minimize the mis-sorting of samples due to primer synthesis errors. The
forward primer sequence is 5’-GCCTCCCTCGCGCCATCAG(keys)AYTGGGYDTAAAGVG-3’ and the
reverse primer is 5’-GCCTTGCCAGCCCGCTCAGTACNVGGGTATCTAATCC-3’. PCR
mixtures contained 1 µM of each primer (IDT, Coralville, IA), 1.8 mM MgCl2, 0.2 mM
dNTPs, 1.5X BSA (New England Biolabs, Beverly, MA), 1 unit of FastStart High
Fidelity PCR System enzyme blend (Roche Applied Science, Indianapolis, IN), and 10
ng of DNA template. Amplification conditions were as follows: initial incubation for 3
min at 95 °C; 30 cycles of 95 °C for 45 sec, 57 °C for 45 sec, and 72 °C for 1 min; and a
final 4 min incubation at 72 °C. For each sample, three replicate PCR reactions were run
in parallel, PCR products were purified by agarose gel electrophoresis, and excised bands
of 270-300 bp were combined. Amplicon recovery was performed with Qiagen Gel
Extraction (Qiagen, Valencia, CA) followed by an extra Qiagen PCR Purification step.
DNA was quantified spectrophotometrically using the NanoDrop ND-1000
spectrophotometer (NanoDrop Technologies, Wilmington, DE), and equimolar amounts
of each sample were subsequently combined and subjected to pyrosequencing using the
Genome Sequencer FLX System (454 Life Sciences, a Roche company, Branford, CT).
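The 563F and 802R primer sequences above contain IUPAC degenerate bases (Y, D, V, N). As a small sketch of what that implies, the code below counts how many distinct sequences each degenerate primer represents and converts a primer into a regular expression that can be matched against reads. The helper names are hypothetical, and this is an illustration rather than part of the pipeline used in this study.

```python
import re

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct non-degenerate sequences a primer represents."""
    count = 1
    for code in primer.upper():
        count *= len(IUPAC[code])
    return count


def primer_regex(primer):
    """Compile a regular expression equivalent to a degenerate primer."""
    parts = []
    for code in primer.upper():
        bases = IUPAC[code]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return re.compile("".join(parts))


# Template-specific portions of the fusion primers given in the text.
FORWARD_563F = "AYTGGGYDTAAAGVG"
REVERSE_802R = "TACNVGGGTATCTAATCC"

print(degeneracy(FORWARD_563F), degeneracy(REVERSE_802R))
print(primer_regex(FORWARD_563F).pattern)
```

Exact regular-expression matching tolerates no sequencing errors, so it is stricter than the edit-distance filter described in the SI methods.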

Pyrosequencing Data. Raw reads were processed, filtered, aligned, and clustered,
and bias-corrected Chao1 species richness estimates were obtained, using programs from
the RDP Pyrosequencing Pipeline (26). Sequences were assigned to bacterial taxa using the
RDP Classifier version 2 with the RDP release 9.53 training set (25). Chao’s
abundance-based adjusted Sorensen similarity indices (23) were calculated for each pair
of samples using EstimateS (purl.oclc.org/estimates) after first clustering each sample
pair together.
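For reference, the bias-corrected Chao1 estimator named above has a simple closed form, S_obs + F1(F1 - 1) / (2(F2 + 1)), where S_obs is the number of observed clusters and F1 and F2 are the numbers of clusters seen exactly once and exactly twice. The following is a minimal sketch of that formula, not the RDP pipeline implementation; the function name and the example counts are hypothetical.

```python
def chao1_bias_corrected(cluster_counts):
    """Bias-corrected Chao1 richness estimate.

    cluster_counts: iterable of sequence counts per cluster (OTU).
    Returns S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).
    """
    counts = [c for c in cluster_counts if c > 0]
    observed = len(counts)
    singletons = sum(1 for c in counts if c == 1)  # F1
    doubletons = sum(1 for c in counts if c == 2)  # F2
    return observed + singletons * (singletons - 1) / (2.0 * (doubletons + 1))


# Hypothetical counts: 7 observed clusters, 3 singletons, 2 doubletons.
example_counts = [10, 4, 2, 2, 1, 1, 1]
print(chao1_bias_corrected(example_counts))  # 8.0
```

The bias-corrected form stays defined when no doubletons are present, which is one reason it is often preferred over the classic F1^2 / (2 F2) term.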

For the phylogenetic tree, aligned representative sequences of 287 selected
clusters were exported and a neighbor-joining phylogenetic tree was constructed (35).
Tree and fold-difference color codes were visualized by iTOL (36) and resorted based on
phylum-level classification. For the NEO plots, sequences were first ordered by
classification results either at the phylum or class level. Each symbol indicates the
uncorrected distance of a given sequence read to its closest match within the isolates
database (ISO) and its closest match within the environmental plus isolates (ENV) SSU
rRNA database. Each sequence was used as a query to the RDP’s SequenceMatch tool
(37, 38) to identify the sequence in the RDP’s database with the largest number of
matching words. The uncorrected pairwise distance was calculated between the aligned
query and the SequenceMatch result sequence.
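The uncorrected pairwise distance described above is the fraction of compared alignment positions at which two sequences differ, with no substitution-model correction. A minimal sketch follows. How gapped columns were treated in this study is not stated, so skipping any column with a gap in either sequence is an assumption, and the function names and example sequences are hypothetical.

```python
GAP_CHARS = "-."


def uncorrected_distance(aligned_query, aligned_match):
    """Fraction of mismatched positions between two aligned sequences.

    Assumption: columns where either sequence has a gap are ignored.
    """
    if len(aligned_query) != len(aligned_match):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    mismatches = 0
    for q, m in zip(aligned_query.upper(), aligned_match.upper()):
        if q in GAP_CHARS or m in GAP_CHARS:
            continue
        compared += 1
        if q != m:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return mismatches / compared


def percent_identity(aligned_query, aligned_match):
    """Identity (% ID) corresponding to the uncorrected distance."""
    return 100.0 * (1.0 - uncorrected_distance(aligned_query, aligned_match))


# Hypothetical aligned fragments: 9 comparable columns, 1 mismatch.
query = "ACGT-ACGTA"
match = "ACGTTACGAA"
print(uncorrected_distance(query, match), percent_identity(query, match))
```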

Statistical Analyses and Implementation. ANOVA and canonical
correspondence analysis (CCA) were performed using the R statistical program (R
Development Core Team) running the vegan package. Clusters were assigned at 95% ID
using the complete linkage clustering method, and soil environmental data from each
replicate were used. The joint effect or “significance” of constraints in the CCA model
was tested using both an anova permutation test (anova.cca, α=0.05, n=10000) and a
CCA permutation test (permutest.cca, n=10000). Except where otherwise indicated,
processing software was written in Java (API v1.5.0) and executed on the Macintosh (OS
10.4) or Linux (2.4.23) operating systems running Java virtual machines from Apple or
Sun, respectively.

ACKNOWLEDGEMENTS.
This work was supported by grants from The Office of Science (BER), U.S. Department
of Energy (DE-FG02-99ER62848, DE-FG02-04ER63933); National Science Foundation
(DBI-0328255); and the U.S. Department of Agriculture (NRI).

AUTHOR CONTRIBUTIONS

Stella Asuming-Brempong, Samuel Adiku, and James Jones designed and managed
agricultural plots in Ghana for four years. Jorge Rodrigues managed DNA samples. Jim
Cole and Qiong Wang set up computational sequence analysis: quality controls,
alignment, clustering, NEO plots, etc. Dieter Tourlousse and Ryan Penton performed
statistical analyses: CCA and multiple regression.

Designed research: S.A-B., S.G.K.A., and J.W.J.

Performed research: W.J.S., S.A-B., and J.L.M.R.

Analyzed data: W.J.S., Q.W., D.M.T., C.R.P., and J.R.C.

Wrote the paper: W.J.S., C.R.P., D.M.T., Q.W., J.R.C., and J.M.T.
SUPPORTING INFORMATION 1 TEXT

Selection of Clusters Contrasting with BF. These clusters were identified by
pairwise comparison of each practice to BF, with the filtering criteria that 1) sequences
in each cluster were found in all replicates, and 2) clusters exhibited at least
a three-fold prevalence, as a replicate average, in either the agricultural plot or
BF. For example, when identifying clusters that are more prevalent in PM compared to
BF, only those clusters with non-zero sequence counts in all PM replicates, irrespective
of BF, EbM, and EfM, were included. The average number of sequences in those clusters
among the four replicates was required to be at least 3x higher than in BF.
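The two selection criteria can be sketched as a simple filter. The code below is an illustrative reading of the rules above, not the script used in this study: it assumes abundances have already been made comparable across replicates, treats a cluster absent from BF as passing the fold threshold, and uses hypothetical names and example values.

```python
def enriched_clusters(treatment, bare_fallow, fold=3.0):
    """Clusters at least `fold` times more abundant in a treatment than in BF.

    treatment, bare_fallow: dicts mapping cluster id -> list of
    per-replicate abundances.
    Criterion 1: the cluster is present (non-zero) in every treatment replicate.
    Criterion 2: the replicate mean is at least `fold` times the BF mean.
    """
    selected = []
    for cluster, replicates in treatment.items():
        if not replicates or any(value == 0 for value in replicates):
            continue  # fails criterion 1
        mean_treatment = sum(replicates) / len(replicates)
        bf_values = bare_fallow.get(cluster, [])
        mean_bf = sum(bf_values) / len(bf_values) if bf_values else 0.0
        if mean_treatment >= fold * mean_bf:
            selected.append(cluster)
    return sorted(selected)


# Hypothetical abundances for four replicates per treatment.
pm = {"c1": [9, 12, 6, 9], "c2": [5, 0, 7, 8], "c3": [2, 2, 2, 2]}
bf = {"c1": [3, 2, 3, 4], "c2": [1, 1, 1, 1], "c3": [1, 1, 1, 1]}
print(enriched_clusters(pm, bf))
```

Swapping the two arguments gives the complementary set, clusters more prevalent in BF than in the treatment.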

SUPPORTING INFORMATION 2 TEXT

Bacterial Primer Design for Pyrosequencing of SSU rRNA Genes. Regions in
the SSU rRNA gene suitable for pyrosequencing were identified that exhibited 1) an
appropriate amplicon length for pyrosequencing reads, 2) high coverage by bacterial
universal primers, 3) high resolution and accuracy for bacterial classification and
identification, and 4) a low frequency of insertions and deletions to simplify sequence
alignment. A new set of bacterial universal primers, designed to encompass the
hypervariable V4 region (corresponding to Escherichia coli SSU rRNA gene positions
563 to 802), allows for accurate bacterial taxa identification with the RDP Classifier (1).
Its applicability for pyrosequencing was further supported by in silico UniFrac analysis
(2). The universality of the primers was determined by internal alignment of perfect
matches against SSU rRNA gene sequences in the Ribosomal Database Project II (RDP)
(94.6% coverage) and from the metagenomic database of the Sorcerer II Global Ocean
Sampling Expedition (94.7% coverage) (3). Specifically, the primers designed in this
study targeted an overwhelming majority of known SSU rRNA gene sequences
throughout all phyla while providing deep taxa classification useful for community
comparisons (SI Figure 5).

SI MATERIALS AND METHODS

Initial Processing and Filtering. Raw reads were sorted into individual samples
using the assigned tag sequence. Forward and reverse primers were then removed from
the sequences. Trimmed sequences less than 150 bases in length were discarded. Also
discarded were sequences with a simple edit distance of greater than two in the forward
primer sequence. The read length was not always sufficient to cover the entire reverse
primer. Depending on the end point in the reverse primer, a maximum edit distance of 0
to 2 to the covered portion of the reverse primer was allowed. After this work was
completed, additional control experiments indicated that sequences with incomplete or
imperfect reverse primer sequences had an above-average sequence error rate (not shown).
We would suggest that a perfect reverse primer sequence filter be included in future work.
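The forward-primer and length filters can be sketched as follows. This is a simplified illustration rather than the RDP pipeline code: it assumes the sample key has already been removed so that each read starts at the forward primer, compares a fixed-length read prefix against the degenerate primer using a Levenshtein distance in which any base allowed by the IUPAC code counts as a match, and omits the reverse-primer check. All function names and example reads are hypothetical.

```python
# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_edit_distance(read_prefix, primer):
    """Levenshtein distance between a read prefix and a degenerate primer."""
    previous = list(range(len(primer) + 1))
    for i, read_base in enumerate(read_prefix.upper(), start=1):
        current = [i]
        for j, code in enumerate(primer.upper(), start=1):
            # Substitution costs nothing if the base is allowed by the code.
            substitution = previous[j - 1] + (read_base not in IUPAC[code])
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def keep_read(read, forward_primer, min_length=150, max_edits=2):
    """Apply the forward-primer edit-distance filter and the length filter."""
    prefix = read[:len(forward_primer)]
    trimmed = read[len(forward_primer):]
    if primer_edit_distance(prefix, forward_primer) > max_edits:
        return False
    return len(trimmed) >= min_length


FORWARD_563F = "AYTGGGYDTAAAGVG"
good_read = "ACTGGGTATAAAGCG" + "A" * 150   # perfect primer, long enough
short_read = "ACTGGGTATAAAGCG" + "A" * 149  # perfect primer, too short
bad_primer = "TCTGGGTCTAAAGTG" + "A" * 150  # three primer mismatches
print(keep_read(good_read, FORWARD_563F),
      keep_read(short_read, FORWARD_563F),
      keep_read(bad_primer, FORWARD_563F))
```

A stricter variant in the spirit of the recommendation above would additionally require an exact match over the full reverse primer before a read is kept.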

Sequence alignment. Sequences were aligned using INFERNAL version 8.1,
a stochastic context-free grammar based aligner (http://infernal.janelia.org/). The rRNA
gene region corresponding to the region between primers (E. coli positions 578 to 784) was
extracted from the RDP version 9 alignment for the 5,341 representative sequences used
to train the RDP Classifier (1). The INFERNAL aligner was trained using this
subalignment along with the bacterial 16S rRNA secondary-structure model of Gutell
and colleagues (4). The 205 residues estimated to be present in greater than 95% of all
bacterial 16S rRNA sequences were selected as model positions for training. Sequences
were aligned using this model and the options “--hbanded” and “--full”. With this short
model, Infernal aligns approximately 2,200 reads per minute.

[Figure 2.S1 image. Legend: Primer Set 1, Primer Set 2; taxon labels not recoverable from the scanned source.]

Figure 2.S1. Coverage of 16S rRNA sequences in RDP by V4 primers.

NEO plots. Sequences were first ordered by classification results at the phylum
level, and for Firmicutes and Proteobacteria at the class level. Sequences assigned to each
taxon were then ordered by successive complete linkage clustering at distances between
0.5 and 0.0 with a step size of 0.01. Each sequence was used as a query to the RDP’s
SeqMatch tool trained on the RDP release 9.56 data set (6, 7) to find the sequence in the
RDP’s database with the largest number of matching words. The program options were
set to search among all high-quality sequences greater than 1200 bases in length or only
high-quality sequences from cultured isolates of length greater than 1200 bases.

REFERENCES

1. Lal R (2007) Carbon sequestration. Phil Trans R Soc B doi:10.1098/rstb.2007.2185.

2. Houghton RA (1994) The worldwide extent of land use change. Bioscience 44:305-
313.

3. Scholes MC, Powlson D, Tian G (1997) Input control of organic matter dynamics.
Geoderma 79:25-47.

4. Mann LK (1986) Changes in soil carbon storage after cultivation. Soil Sci 142:279-
288.

5. Mann L, Tolbert V, Cushman J (2002) Potential environmental effects of corn (Zea
mays L.) stover removal with emphasis on soil organic matter and erosion. Agric Ecosyst
Environ 89:149-166.

6. Lal R et al. (2004) Managing Soil Carbon. Science 304:393.

7. Grandy AS, Robertson GP (2007) Land use intensity effects on soil organic carbon
accumulation rates and mechanisms. Ecosystems 10:58-73.

8. Chivenge PP, Murwira HK, Giller KE, Mapfumo P, Six J (2007) Long-term impact of
reduced tillage and residue management on soil carbon stabilization: Implications for
conservation agriculture on contrasting soils. Soil Till Res 94:328-337.

9. Sandor R, Walsh M, Marques R (2002) Greenhouse-gas-trading markets. Philos
Transact A Math Phys Eng Sci 360:1889—1900.

10. Zhou J et al. (2002) Spatial and resource factors influencing high microbial diversity
in soil. Appl Environ Microbiol 68:326-334.

11. Treves DS, Xia B, Zhou J, Tiedje JM (2003) A two-species test of the hypothesis
that spatial isolation influences microbial diversity in soil. Microbial Ecol 45:20-28.

12. Ndour NYB et al. (2008) Characteristics of microbial habitats in a tropical soil
subject to different fallow management. Appl Soil Ecol 38:51-61.

13. Muyzer G, De Waal EC, Uitterlinden AG (1993) Profiling of complex microbial
populations by denaturing gradient gel electrophoresis analysis of polymerase chain
reaction-amplified genes coding for 16S rRNA. Appl Environ Microbiol 59:695-700.

14. Margulies M et al. (2005) Genome sequencing in microfabricated high-density
picolitre reactors. Nature 437:376-380.

15. Sogin ML et al. (2006) Microbial diversity in the deep sea and the underexplored
"rare biosphere". Proc Natl Acad Sci USA 103:12115-12120.


16. Huber JA et al. (2007) Microbial population structures in the deep marine biosphere.
Science 318:97-100.

17. Roesch LF et al. (2007) Pyrosequencing enumerates and contrasts soil microbial
diversity. ISME J 1:283-290.

18. Asuming-Brempong S et al. (2008) Changes in the biodiversity of microbial
populations in tropical soils under different fallow treatments. Soil Biol Biochem
40:2811-2818.

19. Acinas SG et al. (2004) Fine-scale phylogenetic architecture of a complex bacterial
community. Nature 430:551-554.

20. Ley R, Peterson DA, Gordon JI (2006) Ecological and evolutionary forces that shape
microbial diversity and genome content in the human intestine. Cell 124:837-848.

21. Fierer N, Bradford MA, Jackson RB (2007) Toward an ecological classification of
soil bacteria. Ecology 88:1354-1364.

22. Janssen PH (2006) Identifying the dominant soil bacterial taxa in libraries of 16S
rRNA and 16S rRNA genes. Appl Environ Microbiol 72:1719-1728.

23. Chao A, Chazdon RL, Colwell RK, Shen TJ (2006) Abundance-based similarity
indices and their estimation when there are unseen species in samples. Biometrics
62:361-371.

24. Garrity GM, Bell JA, Lilburn TG (2004) Taxonomic outline of the prokaryotes.
Bergey's manual of systematic bacteriology. second edition. Release 5.0. Springer-Verlag
New York.

25. Wang Q, Garrity GM, Tiedje JM, Cole JR (2007) Naive Bayesian classiﬁer for rapid
assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ Microbiol
73:5261-5267.

26. Cole JR et al. (2009) The Ribosomal Database Project: improved alignments and new
tools for rRNA analysis. Nucleic Acids Res 37:D141-D145.

27. Chen MY (2004) Rubrobacter taiwanensis sp. nov., a novel thermophilic, radiation-
resistant species isolated from hot springs. Int J Syst Evol Microbiol 54:1849-1855.

28. Yakimov MM, Lünsdorf H, Golyshin PN (2003) Thermoleophilum album and
Thermoleophilum minutum are culturable representatives of group 2 of the
Rubrobacteridae (Actinobacteria). Int J Syst Evol Microbiol 53:377-380.

29. Giardina CP, Sanford RL, Døckersmith IC, Jaramillo VJ (2000) The effects of slash
burning on ecosystem nutrients during the land preparation phase of shifting cultivation.
Plant Soil 220:247-260.


30. Rappé MS, Giovannoni SJ (2003) The uncultured microbial majority. Annu Rev
Microbiol 57:369-394.

31. Hugenholtz P, Goebel BM, Pace NR (1998) Impact of culture-independent studies on
the emerging phylogenetic view of bacterial diversity. J Bacteriol 180:4765-4774.

32. Adiku SGK, Narh S, Jones JW, Laryea KB, Dowuona GN (2008) Short-term effects
of crop rotation, residue management, and soil water on carbon mineralization in a
tropical cropping system. Plant Soil 311:29-38.

33. Nawrocki EP, Eddy SR (2007) Query-dependent banding (QDB) for faster RNA
similarity searches. PLoS Comput Biol 3:e56.

34. Cannone JJ et al. (2002) The Comparative RNA Web (CRW) Site: an online database
of comparative sequence and structure information for ribosomal, intron, and other
RNAs. BMC Bioinformatics 3:2.

35. Saitou N, Nei M (1987) The neighbor-joining method: a new method for
reconstructing phylogenetic trees. Mol Biol Evol 4:406-425.

36. Letunic I, Bork P (2007) Interactive Tree Of Life (iTOL): an online tool for
phylogenetic tree display and annotation. Bioinformatics 23:127-128.

37. Cole JR et al. (2005) The Ribosomal Database Project (RDP-II): sequences and tools
for high-throughput rRNA analysis. Nucleic Acids Res 33:D294-296.

38. Cole JR et al. (2007) The Ribosomal Database Project (RDP-II): introducing myRDP
space and quality controlled public data. Nucleic Acids Res 35:D169-172.

SUPPORTING INFORMATION REFERENCE

1. Wang Q, Garrity GM, Tiedje JM, Cole JR (2007) Naive Bayesian classiﬁer for rapid
assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ Microbiol
73:5261-5267.

2. Liu Z et al. (2007) Short pyrosequencing reads sufﬁce for accurate microbial
community analysis. Nucleic Acids Res 35:e120.

3. Rusch DB et al. (2007) The Sorcerer II Global Ocean Sampling expedition: northwest
Atlantic through eastern tropical Pacific. PLoS Biol 5:e77.

4. Cannone JJ et al. (2002) The comparative RNA web (CRW) site: an online database of
comparative sequence and structure information for ribosomal, intron, and other RNAs.
BMC Bioinformatics 3:2.

5. Cole JR et al. (2009) The Ribosomal Database Project: improved alignments and new
tools for rRNA analysis. Nucleic Acids Res 37:D141-D145.

6. Cole JR et al. (2005) The Ribosomal Database Project (RDP-II): sequences and tools
for high-throughput rRNA analysis. Nucleic Acids Res 33:D294-296.

7. Cole JR et al. (2007) The ribosomal database project (RDP-II): introducing myRDP
space and quality controlled public data. Nucleic Acids Res 35:D169-D172.

8. Chao A, Chazdon RL, Colwell RK, Shen TJ (2006) Abundance-based similarity
indices and their estimation when there are unseen species in samples. Biometrics
62:361-371.


CHAPTER III
DNA-STABLE ISOTOPE PROBING INTEGRATED WITH METAGENOMICS:
RETRIEVAL OF BIPHENYL DIOXYGENASE GENES FROM
PCB-CONTAMINATED RIVER SEDIMENT

The work presented in this chapter has been published:
Woo Jun Sul, Joonhong Park, John F. Quensen III, Jorge L. M. Rodrigues, Laurie
Seliger, Tamara V. Tsoi, Gerben J. Zylstra, and James M. Tiedje (2009) Appl Environ
Microbiol 75:5501-5506

Author contributions:

Laurie Seliger and Gerben Zylstra performed sequencing and analysis of two cosmid
clones. John Quensen measured PCB transformation and biphenyl disappearance.
Joonhong Park and Tamara Tsoi were involved in experimental design and project
development. Jorge Rodrigues helped with the phylogenetic analysis of the 16S rRNA gene.


ABSTRACT

Stable isotope probing with [13C]-biphenyl was used to explore the genetic
properties of indigenous bacteria able to grow on biphenyl in PCB-contaminated River
Raisin sediment. A bacterial 16S rRNA gene clone library generated from [13C]-DNA
after a 14-day incubation with [13C]-biphenyl revealed the dominant organisms to be
Achromobacter and Pseudomonas. A library from PCR amplification of genes for
aromatic ring hydroxylating dioxygenases from the [13C]-DNA fraction revealed two
sequence groups similar to bphA (encoding biphenyl dioxygenase) of Comamonas
testosteroni strain B-356 and of Rhodococcus sp. RHA1. A library of 1,568 cosmid
clones was produced from the [13C]-DNA fraction. A 31.8 kb cosmid clone, detected by
aromatic dioxygenase primers, contained genes of biphenyl dioxygenase subunits bphAE,
while the rest of the clone's sequence was similar to that of an unknown γ-Proteobacterium.
The discrepancy in G+C content near the bphAE genes implies their recent acquisition,
possibly by horizontal transfer. The biphenyl dioxygenase from the cosmid clone
oxidized biphenyl and unsubstituted and para-only substituted rings of polychlorinated
biphenyl (PCB) congeners. DNA-stable isotope probing based cosmid libraries enabled
the retrieval of functional genes from an uncultivated organism capable of PCB
metabolism and suggest dispersed dioxygenase gene organization in nature.

INTRODUCTION
Commercially used polychlorinated biphenyls (PCBs), which are mixtures of
more than 60 individual chlorinated biphenyl congeners, are among the most persistent
anthropogenic chemical pollutants that threaten natural ecosystems and human health (1).
Numerous biphenyl-degrading microorganisms have been isolated and studied, especially
for the range of PCB congeners degraded. Research has focused primarily on the
biodegradative pathways and the biphenyl dioxygenases responsible for initial PCB
oxidation by isolated bacteria (14, 27). Knowledge, however, is limited concerning the
indigenous microbial populations that metabolize PCBs in the environment. Stable
isotope probing (SIP) coupled with metagenomics is one approach to explore more
directly which organisms and genetic information may be involved in PCB degradation at
PCB-contaminated sites.

SIP was developed to separate and concentrate nucleic acids or fatty acids of
microbial populations that metabolize and hence assimilate isotopically labeled
substrates into new cell material (4, 5, 28). Recently, the active PCB degraders in a
biofilm community on PCB droplets were revealed as Burkholderia species using
DNA-SIP (32). In another DNA-SIP study, 75 different genera that acquired carbon from
[13C]-biphenyl were found in the PCB-contaminated root zone of a pine tree (22). In
addition, that heavy [13C]-DNA fraction revealed new dioxygenase sequences and
possible PCB degradation pathways from GeoChip (16) results and from PCR-amplified
sequences using primers targeting aromatic ring hydroxylating dioxygenase (ARHD)
genes (22).

A major hurdle in using DNA-SIP for metagenomic analyses (9) is the very small
amount of heavy DNA that is produced and hence recovered, making library construction
difficult. Two studies have shown the feasibility of DNA-SIP for metagenomic analyses
of C-1 compound utilizing communities, but they first increased the amount of the heavy
DNA fraction by multiple displacement amplification (6, 10) or enriched the community
by growth in sediment slurries (18).

In this study, we used [13C]-biphenyl to probe for potential PCB-degrading
populations in a PCB-contaminated river sediment and to recover genes potentially
involved in the critical first step of PCB degradation, the dioxygenase attack. We found a
31.8 kb cosmid clone that contained a biphenyl dioxygenase sequence (bphAE) and
demonstrated its activity on PCBs.

MATERIALS AND METHODS

Sample description and SIP microcosms. Sediment historically contaminated with
Aroclor 1248 at concentrations of 0.2 to 4.6 mg kg⁻¹ was collected in October 2000 from
River Raisin at Monroe, Michigan, USA. Sediment samples were stored at 4°C under
river water until use.

Five replicate microcosms, each containing 5 g of sediment amended with 10 mg
of uniformly labeled [13C]-biphenyl (99 atom % 13C) (Sigma-Aldrich) and 10 ml K1
minimal medium (34), were placed in 160-ml serum bottles. Sample bottles were sealed
with Teflon stoppers and aluminum crimp-caps and incubated at room temperature in the
dark on a horizontal shaker at 150 rpm. The microcosms were aerated by opening the
bottles under sterile conditions for 10 min every 3-4 days, and after 14 d, DNA was
extracted from all microcosms.

To monitor biphenyl metabolism, nine microcosms amended with 10 mg
unlabeled biphenyl and three sterile microcosms with twice-autoclaved sediment and
unlabeled biphenyl were established in parallel and incubated as described above. After
0, 7 and 14 d of incubation, triplicate microcosms were sacrificed for biphenyl extraction
by the addition of 10 ml saturated KCl and 10 ml dichloromethane. Biphenyl
concentrations were determined by gas chromatography with flame ionization detection.
Split injections (50:1) were made on a J&K Scientific [CB-PAH capillary column (15 m,
0.25 mm ID, 0.15 µm film thickness). Temperature conditions were: inlet at 220°C; oven
at 80°C for 1 min and then ramped at 40°C min⁻¹ to 220°C; detector at 325°C. Colony
counts at each time point were obtained using R2A (29) agar plates and counted after 3
weeks of incubation.

DNA extraction and [13C]-DNA separation. DNA was extracted following a
previous protocol (35) but modified as follows to recover high molecular weight DNA.
All sediment slurries were centrifuged at 3500 x g and 4 g of sediment pellet was
transferred to a disposable 50-ml polypropylene centrifuge tube where 13.5 ml extraction
buffer containing 0.1 M PIPES (pH 6.4), 100 mM EDTA, 1.5 M NaCl and 1% CTAB
was added. Tubes were amended with 1.5 ml 20% SDS (w/v) and incubated in a 65°C
water bath for 2 h with gentle inversion every 10 min. Supernatant without whitish
material was collected after centrifugation at 3000 x g for 5 min and transferred into
another 50-ml polypropylene tube and extracted with an equal volume of chloroform.
DNA was precipitated with isopropanol, washed with ethanol, and dissolved in water at
50°C. For removing humic substances, the DNA solution was adjusted to 0.3 M NaCl by
adding 1 M NaCl in TE (10 mM TrisCl, pH 8.0) and placed into 1 ml DEAE Sephacel
(Sigma-Aldrich) columns pre-equilibrated with 0.3 M NaCl in TE. The columns were
washed with 4 ml of 0.3 M NaCl in TE, and DNA was eluted with 4 ml of 0.5 M NaCl in
TE. DNA was again precipitated with isopropanol, washed with ethanol and dissolved in
water at 50°C.

Approximately 70 µg each of DNA from 0 d (D0) and 14 d (D14) was loaded
separately in 18.5 ml cesium trifluoroacetate (CsTFA) (Amersham, Piscataway, New
Jersey) solution without the addition of ethidium bromide and with a starting buoyant
density of 1.60 g ml⁻¹. The CsTFA solution with DNA was transferred to 18.5-ml
Ultracrimp tubes (Sorvall, Waltham, Mass.). The tubes were centrifuged in a TV-865B
vertical rotor (Sorvall) at 179,000 x g (43,500 rpm) for 40 h at 20°C. The gradients were
fractionated into 500 µl fractions (up to 37 fractions) by displacement with water using a
syringe pump at a flow rate of 1 ml min⁻¹. The buoyant density of each fraction was
measured at 25°C with a refractometer. DNA fractions were precipitated with 1/10
volume of 3 M sodium acetate (pH 5.2) and isopropanol. The DNA pellets were then
washed, re-suspended in EB elution buffer (Qiagen, Valencia, Calif.), and incubated at
50°C for 1 h. Fractionated DNA was quantified with an ND-1000 spectrophotometer
(NanoDrop, Wilmington, Delaware). Secondary isopycnic density gradient centrifugation
of combined DNA and quantitative PCR (Q-PCR) were conducted as described (22).
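
As a cross-check on the centrifugation settings above, the following sketch relates rotor speed to relative centrifugal force using the standard relation RCF = 1.118 × 10⁻⁵ × r × rpm², with r in cm. The function names are illustrative, and the rotor radius is back-calculated from the stated speed and force rather than taken from a rotor specification sheet, so it should be read as an assumption. The second part counts the fractions expected from an 18.5 ml gradient collected in 500 µl steps.

```python
def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) at a given speed and radius in cm."""
    return 1.118e-5 * radius_cm * rpm ** 2


def radius_from_rcf(rcf, rpm):
    """Radius in cm implied by a stated RCF and rotor speed."""
    return rcf / (1.118e-5 * rpm ** 2)


# Values stated in the text: 179,000 x g at 43,500 rpm.
implied_radius_cm = radius_from_rcf(179_000, 43_500)

# 18.5 ml gradient collected as 0.5 ml (500 µl) fractions at 1 ml per min.
gradient_ml, fraction_ml, flow_ml_per_min = 18.5, 0.5, 1.0
n_fractions = int(gradient_ml / fraction_ml)
collection_min = gradient_ml / flow_ml_per_min

print(f"implied radius: {implied_radius_cm:.2f} cm")
print(f"fractions: {n_fractions}, collection time: {collection_min} min")
```

Under these assumptions the implied radius is a little under 8.5 cm, and the fraction count agrees with the "up to 37 fractions" quoted above.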

16S rRNA and Aromatic Ring Hydroxylating Dioxygenase (ARHD) gene
clone libraries. Amplifications of 16S rRNA genes for clone libraries were conducted
using primers 27F (17) and 529R (33) on D0, and 27F and 1392R (17) on D14H
(H = heavy DNA fraction). Cycling conditions were as follows: denaturation for 5 min at
94°C, then 25 cycles of 1 min at 94°C, 1 min at 55°C, and 1 min (D0) or 1 min 40 s
(D14H) at 72°C, and an additional 7 min extension at 72°C. PCR amplification of ARHD
genes was performed using primers ARHD1F (5'-TTYRYNTGYANNTAYCAYGGNTGGG-3')
and ARHD2R (5'-AANTKYTCNGCNGSNRMYTTCCA-3') with D14H as previously
described (22). PCR amplicons of both 16S rRNA and ARHD genes were gel-purified
using a QIAquick Gel Extraction Kit (Qiagen) and cloned using a TOPO TA Cloning Kit
for Sequencing (Invitrogen, Carlsbad, Calif.). Clone libraries were sequenced using
primers T7 or T3 at the Michigan State University Research Technology Support Facility
with an ABI 3730 Genetic Analyzer (Applied Biosystems Inc., Foster City, Calif.). The
phylogenetic identification of 16S rRNA gene consensus sequences was performed using
the RDP-II Classifier (7).
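
The ARHD primers above are written with IUPAC degenerate base codes. The sketch below is an illustration and not part of the original protocol: it expands those codes to count how many distinct oligonucleotides each primer represents and builds a regular expression for locating candidate binding sites. The helper names and the example site are hypothetical.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    n = 1
    for base in primer.upper():
        n *= len(IUPAC[base])
    return n


def primer_regex(primer):
    """Compile a regex matching every sequence the primer encodes."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else f"[{options}]")
    return re.compile("".join(parts))


ARHD1F = "TTYRYNTGYANNTAYCAYGGNTGGG"
ARHD2R = "AANTKYTCNGCNGSNRMYTTCCA"

# Hypothetical site, made up only to exercise the forward-primer pattern.
example_site = "TTCACATGCAAATATCATGGATGGG"

print(degeneracy(ARHD1F), degeneracy(ARHD2R))
print(bool(primer_regex(ARHD1F).fullmatch(example_site)))
```

Because ARHD2R is a reverse primer, its pattern would have to be applied to the reverse complement of a template when scanning a forward-strand sequence.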

Cosmid library construction and library screening with ARHD primers.
Size-selected D14H (25-40 kb) was obtained by electrophoresis on a 1% (w/v) low melting
point agarose TAE gel, and the desired size DNA was recovered using Gelase (Epicentre
Inc., Madison, Wisc.) without UV irradiation, end-repaired with T4 DNA polymerase,
and then inserted into the pWEB™ cosmid (Epicentre Inc.) at the SmaI site. A cosmid
library was constructed using the pWEB™ cosmid cloning kit. All cosmid clones were
stored at -80°C. PCR amplification with ARHD primers was used for cosmid library
screening as described above. Cosmid clones were pooled in groups of 96 as templates for
PCR screening.

Sequencing cosmid clone and genomic analysis. The cosmid clone L11E10 was
sheared into approximately 4 kb fragments using a GeneMachines HydroShear device
(Genomic Solutions, Ann Arbor, Mich.). The fragments were end-repaired with T4 DNA
polymerase and phosphorylated with T4 polynucleotide kinase (Epicentre). The DNA
fragments were then ligated into the vector pCR-Blunt (Invitrogen) and transformed into
E. coli TOP10. A total of 192 colonies were picked and then grown in LB plus 50 µg ml⁻¹
kanamycin in deep well microtitre plates. Plasmid DNA was isolated using the
Invitrogen PureLink 96 well lysis technique. The two ends of the inserted DNA fragment
were sequenced using either the primer BL (5'-TCGGATCCACTAGTAACGGC-3') or
BR (5'-CCAGTGTGATGGATATCTGC-3'). Sequences were trimmed and assembled
using the Lasergene software (DNAStar, Madison, Wisc.).

PCB transformation by expression in E. coli. The bphAE of Burkholderia
xenovorans LB400 was amplified from genomic DNA using primers
(5'-CACCATGAGTTCAGCAATCAAGAA-3') (the 5'-terminal CACC is for the directional
cloning described below) for the forward sequence of bphA and
(5'-CTAGAAGAACATGCTCAGGTT-3') for the reverse sequence of bphE. PCR for
LB400-bphAE was performed with Platinum® Pfx polymerase (Invitrogen) and 30 pmol of
each primer for 25 cycles of 1 min at 94°C, 1 min at 55°C, and 4 min at 72°C. The bphAE
genes of L11E10 were amplified from the cosmid clone DNA using
(5'-CACCATGAATACTTTGATCAAAGAA-3') for the forward sequence of bphA, with
modification of the start codon GTG to ATG, and (5'-TTAGAAGAACATGCTCAGGTT-3')
for the reverse sequence of bphE. PCR for L11E10 was performed for 25 cycles of 1 min at
94°C, 1 min at 55°C, and 6 min at 68°C. Both pET101[LB400-bphAE] and
pET101[L11E10-bphAE] were generated using the Champion™ pET101 Directional TOPO
Expression Kit (Invitrogen). pET101[LB400-bphAE] or pET101[L11E10-bphAE] and
pDB31[LB400-bphFGBC] (2) were co-transformed into Escherichia coli BL21
Star(DE3).

PCB degradation capabilities of transformants were assessed using a resting cell
assay. E. coli BL21 containing pET101[LB400-bphAE] or pET101[L11E10-bphAE], plus
pDB31[LB400-bphFGBC], was grown in LB medium containing 100 µg ml⁻¹ ampicillin
and 25 µg ml⁻¹ kanamycin in addition to 0.8 mM IPTG at 37°C. Log phase cells were
washed and resuspended to an optical density of 1.75 at 600 nm in M9 medium
containing 0.8 mM IPTG and 0.1% (w/v) sodium acetate. Portions (2 ml) were pipetted
into glass vials, amended separately with one of two PCB mixtures in 10 µl of acetone,
and sealed with Teflon-lined stoppers and aluminum crimp caps. The PCB mixtures
were identical to mixtures 1B and 2B (3) except that 2,2',4,4',6,6'-CB (chlorinated
biphenyl) was used as the internal standard instead of 2,2',4,4',6-CB; the final
concentration of each congener was 1 µg ml⁻¹. The tubes were then incubated at 37°C
with shaking at 200 rpm for 18 h. Following incubation, the contents of the tubes were
acidified with three to four drops of concentrated HCl, and the PCBs were extracted three
times with 1 ml of hexane:acetone (1:1, v:v). The extracts from each sample were
combined and analyzed for PCBs using a gas chromatograph fitted with an electron
capture detector and a DB-5 capillary column (30 m length, 0.32 mm ID, 0.25 µm film
thickness). The oven temperature program was 140°C for 1 min, then increased at 2°C
min⁻¹ to 260°C. The inlet and detector temperatures were 220°C and 325°C,
respectively. PCBs were quantified using a four-point calibration curve and the internal
standard method. In a separate experiment, accumulation of 2-hydroxy-6-oxo-6-
phenylhexa-2,4-dienoate (HOPDA) by transformants was determined at 434 nm (19)
with a UV-Vis spectrophotometer (Varian Inc., Palo Alto, Calif.) after addition of
biphenyl.
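
The text does not give the formula used to turn internal-standard-normalized peak data into percent depletion. One common way to do it, shown here as a minimal sketch under that assumption, is to normalize each congener peak to the internal standard and compare the sample with a no-activity control. The function name and all peak areas are hypothetical and are not data from this study.

```python
def percent_depletion(sample_area, sample_is_area, control_area, control_is_area):
    """Percent depletion of a congener relative to a control.

    Each congener peak area is first divided by the internal standard (IS)
    peak area from the same chromatogram, which corrects for differences in
    extraction recovery and injection volume.
    """
    sample_ratio = sample_area / sample_is_area
    control_ratio = control_area / control_is_area
    return 100.0 * (1.0 - sample_ratio / control_ratio)


# Hypothetical peak areas chosen only to illustrate the arithmetic.
depletion = percent_depletion(
    sample_area=80.0, sample_is_area=1000.0,
    control_area=1000.0, control_is_area=1000.0,
)
print(f"{depletion:.1f}% depleted")
```

With a calibration curve, the same ratio would be computed from concentrations rather than raw areas; the normalization step is unchanged.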

Nucleotide sequence accession numbers. The GenBank accession numbers are: ARHD
of D14H (accession no. GQ231323-GQ231332), 16S rRNA clone libraries of D14H
(accession no. GQ231333-GQ231378) and D0 (accession no. GQ231379-GQ231433),
and cosmid clone L11E10 (accession no. GQ231434).

RESULTS
Disappearance of biphenyl during the incubation. To confirm the suitability of
this sediment for the SIP experiment, biphenyl disappearance was measured in
microcosms incubated with unlabeled biphenyl. Only 0.6% of the biphenyl remained
after a 14 d aerobic incubation, whereas none of the biphenyl disappeared in the sterile
microcosms. During this period, total culturable bacteria increased from 4.6 × 10⁵ to
1.79 × 10⁸ CFU g⁻¹ dry sediment as determined by plate counts.
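
To put the plate counts above in perspective, the short calculation below converts the increase in culturable cells into a number of doublings. It is a back-of-the-envelope sketch that assumes simple exponential growth across the whole 14 d and that plate counts track the growing population, neither of which the text claims. The variable names are illustrative.

```python
import math


def doublings(n_start, n_end):
    """Number of population doublings needed to go from n_start to n_end."""
    return math.log2(n_end / n_start)


cfu_day0 = 4.6e5    # CFU per g dry sediment at 0 d (from the text)
cfu_day14 = 1.79e8  # CFU per g dry sediment at 14 d (from the text)

n = doublings(cfu_day0, cfu_day14)
mean_doubling_time_h = 14 * 24 / n

# 0.6% of the 10 mg of biphenyl added per microcosm remained at 14 d.
remaining_biphenyl_mg = 10 * 0.006

print(f"{n:.1f} doublings, mean doubling time {mean_doubling_time_h:.0f} h")
print(f"biphenyl remaining: {remaining_biphenyl_mg:.2f} mg")
```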

DNA extraction and isopycnic centrifugation. DNA (D0: DNA from sediment
at time 0; D14: DNA from sediment in microcosms incubated with [13C]-biphenyl for 14
d) was extracted by our high molecular weight DNA extraction method. D0 and D14
were loaded separately, approximately 70 µg each, for 18.5-ml scale isopycnic
centrifugation. [13C]-DNA fractions of D14 were collected for buoyant densities from
1.634 to 1.656 g ml⁻¹, where DNA was detected in D14 but not in D0. To confirm
that this fraction contained [13C]-DNA, the collected DNA from the heavy fraction (D14H),
DNA from the unlabeled biphenyl incubated microcosms at 14 d (unlabeled D14), and D0
were applied to 2-ml scale isopycnic centrifugation, followed by quantitative PCR of
16S rRNA genes on the separated fractions (Fig. 1). These results confirmed that D14H
consisted of only [13C]-DNA, clearly separated from either D0 or unlabeled D14. The
approximately 3 µg of D14H recovered was enough to construct a 16S rRNA gene clone
library, a metagenomic library, and a PCR-based ARHD library.

Analysis of 16S rRNA and ARHD genes in clone libraries. Fifty-five 16S
rRNA gene clones from D0 and 46 clones from D14H were sequenced. The two libraries
exhibited distinct microbial community composition and diversity (∫-LIBSHUFF P
values for both ΔCxy and ΔCyx were <0.001) (30). The D14H clone library, which
should include active biphenyl-degrading microorganisms, contained members of the
genera Achromobacter, Pseudomonas, Acidovorax, Ramlibacter, Azoarcus, and
Hydrogenophaga, which were not found in the D0 clone library (Table 1).

A library of ARHD gene sequences in D14H yielded five unique ARHD
sequences from 10 clones, which could be divided into two groups based on the translated
amino acid sequences (99-106 aa). Clones 8, 13 (number of identical sequences = 3), and
17 (n=2) exhibited 92%, 94%, and 94% amino acid identity, respectively, to a biphenyl
dioxygenase large subunit of Comamonas testosteroni strain B-356 (31) (now Pandoraea

[Figure 3.1: plot of the ratio of maximum 16S rRNA copies detected in the gradient
(y-axis) against density (x-axis).]

Figure 3.1. Separation of [12C]- and [13C]-DNA by small-scale secondary isopycnic
centrifugation, quantified by Q-PCR of 16S rRNA genes on triplicate samples. Solid
circles and lines, D14H; open circles and dashed lines, D0; open triangles and dashed
lines, D14.

Phylogenetic group (a)               Genera (b) (number of clones)

Actinobacteria
  Intrasporangiaceae (c)
  Propionibacteriaceae (c)
  Unclassified Actinobacteria
Acidobacteria
Bacteroidetes
Chloroflexi
  Caldilineaceae (c)                 Levilinea (1*), Leptolinea (1*)
  Unclassified Anaerolineae
Firmicutes
Planctomycetes                       Pirellula (1*)
Proteobacteria
  α-Proteobacteria
    Rhodobacteraceae (c)             Rhodobacter (1*)
    Unclassified α-Proteobacteria
  β-Proteobacteria
    Rhodocyclaceae (c)               Azoarcus (1)
    Gallionellaceae (c)              Gallionella (1*)
    Comamonadaceae (c)               Acidovorax (6), Ramlibacter (2),
                                     Hydrogenophaga (1), Rhodoferax (1*)
    Alcaligenaceae (c)               Achromobacter (22)
    Hydrogenophilaceae (c)           Thiobacillus (2*)
    Unclassified β-Proteobacteria
  γ-Proteobacteria
    Pseudomonadaceae (c)             Pseudomonas (9)
    Xanthomonadaceae (c)
  δ-Proteobacteria                   Smithella (1*), Pelobacter (1*)
  Unclassified Proteobacteria
OP10
Unclassified bacteria

Total number of clones: 55 (D0), 46 (D14H)

a. The taxonomic assignment was based on the lowest taxonomic level that gave a greater than 80%
confidence level for assignment by the RDP-II Classifier release 9.50 (7).

b. Genera are indicated when assigned with more than 80% confidence.

c. Indicates the taxonomic unit of family.

*. Genera found in D0.

Table 3.1. Phylogenetic classification of 16S rRNA genes in clone libraries at zero (D0)
and 14 (D14H) days.

pnomenusa (15)). A second group, comprising clones 11 (n=2) and 12 (n=2), was similar
to a dioxygenase large subunit of the gram-positive Rhodococcus sp. strain RHA1 (24),
with amino acid identities of 82% and 77%, respectively.

Screening for and analysis of biphenyl dioxygenases. A library of 1568 cosmid
clones from D14H, containing DNA inserts averaging 30 to 40 kb (data not shown),
was constructed and screened for genes encoding large subunits of biphenyl
dioxygenases (bphAs) using primers to detect ARHD-encoding DNA. Five of the clones
yielded ARHD amplicons of 300-330 bp, but sequencing of the amplicons showed that
only one clone, L11E10, actually contained a bphA sequence. The bphA sequence from
L11E10 was not an exact match with any of the PCR-amplified ARHD sequences found
in D14H.

The clone L11E10 contained an insert of 31,850 bp with 67.38% G+C content.
Seventeen of 22 open reading frames (ORFs) in L11E10 gave top BlastX hits against
ORFs in the genera Xanthomonas and Stenotrophomonas. Genes for subunits of the
biphenyl dioxygenase (bphA and bphE) were found in L11E10, but L11E10 contained no
other genes directly relevant to the known biphenyl degradation pathway (Fig 2A). The
bphA was highly similar to bphA in Pseudomonas sp. strain Cam-1 (90%) and bphA1 in
Pseudomonas pseudoalcaligenes KF707 (89.5%) (13). The bphA also encoded the motif
Cys-X-His-X17-Cys-X2-His that forms the Rieske-type [2Fe-2S] cluster of iron-sulfur
proteins. The bphE in L11E10 was 93% identical to bphE (a small subunit of biphenyl
dioxygenase) in B. xenovorans LB400 and bphA2 in P. pseudoalcaligenes KF707.
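
The Rieske motif quoted above, Cys-X-His-X17-Cys-X2-His, maps directly onto a regular expression over one-letter amino acid codes. The sketch below shows one way such a motif could be located in a translated sequence. It is an illustration only: the function name is made up, and the toy sequence is synthetic, not the L11E10 BphA sequence.

```python
import re

# Cys-X-His-X17-Cys-X2-His in one-letter code; "." stands for any residue.
RIESKE_MOTIF = re.compile(r"C.H.{17}C.{2}H")


def find_rieske_motifs(protein):
    """Return (0-based start, matched text) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in RIESKE_MOTIF.finditer(protein.upper())]


# Synthetic test sequence built so that exactly one motif starts at index 2.
toy_protein = "MK" + "CAH" + "G" * 17 + "CAAH" + "LL"

hits = find_rieske_motifs(toy_protein)
print(hits)
```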

Functional analysis of biphenyl dioxygenases. To determine the activity of
bphAE encoded in L11E10 (bphAE-L11E10) toward biphenyl and PCBs, bphAE-L11E10

Figure 3.2. (A) Gene organization of cosmid clone L11E10. (B) G+C content along the
L11E10 sequence.

was expressed in E. coli BL21 along with bphFGBC from B. xenovorans LB400
(bphFGBC-LB400). The bphFGBC-LB400 encodes ferredoxin (BphF), ferredoxin
reductase (BphG), biphenyl-2,3-dihydrodiol 2,3-dehydrogenase (BphB), and 2,3-
dihydroxybiphenyl 1,2-dioxygenase (BphC), involved in the upper pathway of biphenyl
catabolism. In this pathway, biphenyl is transformed to HOPDA, producing a yellow color
(23). When E. coli BL21 transformants containing bphAE-L11E10 were induced with
IPTG and incubated with biphenyl, they produced the yellow color indicative of HOPDA
within 2 h. In resting cell assays with PCB mixtures, the same transformants metabolized
2,3-CB, 2,4'-CB, 4,4'-CB, 2,4,4'-CB, and 2,4',5-CB, and they metabolized 4,4'-CB and
2,4,4'-CB to a greater extent than similar transformants containing bphAEFG genes from
LB400. These results are consistent with activities in resting cell assays of
P. pseudoalcaligenes KF707 (11), with the exceptions that KF707 also exhibited some
transformation of 2,2',3,3'-CB and 2,3',4,4'-CB (Table 2).

DISCUSSION

A major hurdle in DNA-SIP based metagenomics is the recovery of [13C]-DNA
in sufficient quantity for cosmid library construction and the production of a target
number of clones. Because of these constraints, we used sediment slurries, which
increased biphenyl consumption compared to our SIP study using [13C]-biphenyl in
rhizosphere soil (22), thus enhancing the incorporation of labeled carbon into cell
material and yielding sufficient [13C]-DNA to produce a cosmid library. The resulting
community, D14H, seems to have less bacterial diversity than the heavy fraction from the
study

                          % Depletion

Congener        L11E10    LB400    LB400 (a)    KF707 (a)
2,2'            <10       100      100          5
2,3             100       100      100          100
2,4'            100       100      100          100
4,4'            100       <10      15           100
2,2',5          0         100      100          0
2,4,4'          92        22       45           93
2,5,4'          89        99       94           83
2,2',3,3'       <10       96       94           60
2,2',3,5'       0         96       96           0
2,2',4,4'       0         16       38           0
2,2',5,5'       0         99       95           0
2,3',4,4'       0         0        16           24
2,3',4',5       <10       94       83           0
3,3',4,4'       0         0        0            0
2,2',3',4,5     0         <10      38           0
2,2',3,4,5'     0         29       58           0
2,2',4,5,5'     0         64       73           0

a. Resting-cell assay data were obtained from a previous study (11).

Table 3.2. Depletion of PCB congeners by the biphenyl dioxygenases of L11E10 and
LB400.

using [13C]-biphenyl in rhizosphere soil (22), as would be expected from the addition of
the larger amount of biphenyl. This approach is useful for recovering functional genes
from potentially unculturable populations and for analyzing their natural genetic context,
but would not be useful for recovering genes from populations that might be specialists
for low substrate concentrations. The dioxygenase clone we recovered did not overlap
with the sequences amplified by the ARHD primers. The most likely explanation is that
PCR bias favored genes not recovered in the cosmid.

The D14H community analysis showed that the dominant bacterial groups were
closely related to previously known PCB- and biphenyl-utilizing bacteria. The most
dominant group, the genus Achromobacter, includes Achromobacter xylosoxidans KF701,
which can grow on biphenyl, 4-methylbiphenyl, 2-hydroxybiphenyl, benzoate and
salicylate (12). Seven sequences in the family Comamonadaceae, classified as Acidovorax
and Hydrogenophaga by the RDP classifier, are most similar to the PCB- and biphenyl-
degrading Acidovorax sp. (formerly Pseudomonas sp.) strain KKS102 (20, 26) and the
biphenyl-utilizing and PCB-cometabolizing psychrotrophic Hydrogenophaga
taeniospiralis IA3-A (21). Also, the genus Pseudomonas includes P. pseudoalcaligenes
KF707, a well-known biphenyl- and PCB-degrading microorganism.

It is interesting that L11E10 had only the bphAE genes of the biphenyl pathway and
that the genetic organization differs from the upper bph operons of known biphenyl-
degrading microorganisms (27). In addition, the G+C content around bphAE was lower
than average for the clone (Fig. 2B). Furthermore, the gene orders rpoE-ORF3-desA-
ORF4-ORF5-cfaA-ORF6-ORF7-ORF8 (Fig. 2A, grey arrows) and recJ-rpr-greA (black
arrows) in L11E10 were identical to those in six sequenced Xanthomonas genomes, none
of which have the upper bph operons. Therefore, bphAE in L11E10 could have been
recently acquired from another microorganism, perhaps an outcome of the at least 40-
year exposure to Aroclor 1248 in these sediments. It is possible that the gene organization
of bph operons in nature is dispersed, while the bph operons found in biphenyl-degrading
microorganisms typically isolated by enrichment culture are less common but better
arranged for rapid growth and hence isolation.
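
The G+C argument above rests on comparing local G+C content with the average for the clone. A minimal sliding-window sketch of that comparison follows. The window size, step, and threshold are arbitrary illustrative choices, the function names are hypothetical, and the toy sequence is synthetic; the text does not state how the published G+C profile was computed.

```python
def gc_content(seq):
    """Percent G+C of a DNA sequence (non-empty)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window, step):
    """List of (start, percent G+C) for each full window along the sequence."""
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def low_gc_windows(seq, window, step, drop):
    """Windows whose G+C is more than `drop` points below the whole-sequence value."""
    overall = gc_content(seq)
    return [(s, g) for s, g in sliding_gc(seq, window, step) if g < overall - drop]


# Synthetic sequence: a low-G+C island flanked by high-G+C DNA.
toy_insert = "GC" * 50 + "AT" * 50 + "GC" * 50

print(round(gc_content(toy_insert), 2))
print(low_gc_windows(toy_insert, window=100, step=100, drop=10.0))
```

A region that stands out in such a profile is only suggestive of recent acquisition; codon usage and flanking mobile elements would normally be examined as well.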

Analysis of the origin of L11E10 suggests that the insert DNA came from a
γ-Proteobacterium because the L11E10 homolog of recJ, a single-stranded DNA
specific exonuclease required for efficient recovery of DNA synthesis (8), was highly
similar to those in γ-Proteobacteria.

BphAE-L11E10 showed a PCB congener transformation spectrum similar to but
narrower than that of the KF707 biphenyl dioxygenase. It appeared to transform only PCB
congeners without chlorines at the 2,3 positions. This is consistent with the BphA protein
sequences, in which regions I, II, III and IV of L11E10, responsible for substrate
specificity (25), are identical to those of the KF707 biphenyl dioxygenase except for
Val-337 (L11E10) instead of Ile-335 (KF707) at LB400 position 336 (Fig. 3). As such,
Val-337 (L11E10) may account for the narrower specificity toward 2,2',3,3'-CB and
2,3',4,4'-CB. Even though the difference in the N-terminus (31 amino acid differences
before position 196) and C-terminus (11 amino acid differences after position 395)
between BphA-L11E10 and KF707 or LB400 is greater than that between LB400 and
KF707 (only one amino acid difference), this does not appear to affect PCB substrate
specificity (14).

Combining DNA-SIP and metagenomic analyses should increase our understanding of
genomic features of microbial populations in nature since it avoids cultivation bias and
minimizes interference from nonfunctional genes. The efficiency of the methods,
particularly the sufficient recovery of labeled nucleic acids of high molecular weight,
and their use under conditions that typify the natural environment, e.g. little
disturbance and natural substrate concentrations, need further development.

[Figure 3.3. Caption not legible in the source scan.]

ACKNOWLEDGEMENTS
WJS thanks Vivian Pellizari and Stephan Gantner for providing primers for
library screening, and Jong-Chan Chae for technical assistance with cosmid library
construction. We acknowledge Michel Sylvestre for advice and for providing plasmids
containing bphFGBC-LB400 for PCB transformation. This work was supported by
NIEHS grant P42-ES004911 under the Superfund Basic Science Program, and the Center
for Microbial Ecology.

REFERENCES

1. ATSDR. 2000. Toxicological profile for Polychlorinated Biphenyls (PCBs). In
Agency for Toxic Substances and Disease Registry. Public Health Service.

2. Barriault, D., and M. Sylvestre. 1999. A ColE1-compatible expression vector
for the production of His-tagged fusion proteins. Antonie Van Leeuwenhoek
75:293-7.

3. Bedard, D. L., R. Unterman, L. H. Bopp, M. J. Brennan, M. L. Haberl, and
C. Johnson. 1986. Rapid assay for screening and characterizing microorganisms
for the ability to degrade polychlorinated biphenyls. Appl Environ Microbiol
51:761-8.

4. Boschker, H. T. S., S. C. Nold, P. Wellsbury, D. Bos, W. de Graaf, R. Pel, R.
J. Parkes, and T. E. Cappenberg. 1998. Direct linking of microbial populations
to specific biogeochemical processes by 13C-labelling of biomarkers. Nature
392:801-805.

5. Buckley, D. H., V. Huangyutitham, S. F. Hsu, and T. A. Nelson. 2007. Stable
isotope probing with 15N achieved by disentangling the effects of genome G+C
content and isotope enrichment on DNA density. Appl Environ Microbiol
73:3189-95.

6. Chen, Y., M. G. Dumont, J. D. Neufeld, L. Bodrossy, N. Stralis-Pavese, N. P.
McNamara, N. Ostle, M. J. Briones, and J. C. Murrell. 2008. Revealing the
uncultivated majority: combining DNA stable-isotope probing, multiple
displacement amplification and metagenomic analyses of uncultivated
Methylocystis in acidic peatlands. Environ Microbiol 10:2609-22.

7. Cole, J. R., B. Chai, R. J. Farris, Q. Wang, A. S. Kulam-Syed-Mohideen, D.
M. McGarrell, A. M. Bandela, E. Cardenas, G. M. Garrity, and J. M. Tiedje.
2007. The ribosomal database project (RDP-II): introducing myRDP space and
quality controlled public data. Nucleic Acids Res 35:D169-72.

8. Courcelle, C. T., K. H. Chow, A. Casey, and J. Courcelle. 2006. Nascent DNA
processing by RecJ favors lesion repair over translesion synthesis at arrested
replication forks in Escherichia coli. Proc Natl Acad Sci U S A 103:9154-9.

9. Dumont, M. G., and J. C. Murrell. 2005. Stable isotope probing - linking
microbial identity to function. Nat Rev Microbiol 3:499-504.

10. Dumont, M. G., S. M. Radajewski, C. B. Miguez, I. R. McDonald, and J. C.
Murrell. 2006. Identification of a complete methane monooxygenase operon
from soil by combining stable isotope probing and metagenomic analysis.
Environ Microbiol 8:1240-50.

11. Erickson, B. D., and F. J. Mondello. 1993. Enhanced biodegradation of
polychlorinated biphenyls after site-directed mutagenesis of a biphenyl
dioxygenase gene. Appl Environ Microbiol 59:3858-62.

12. Furukawa, K., N. Hayase, K. Taira, and N. Tomizuka. 1989. Molecular
relationship of chromosomal genes encoding biphenyl/polychlorinated biphenyl
catabolism: some soil bacteria possess a highly conserved bph operon. J Bacteriol
171:5467-72.

13. Furukawa, K., and T. Miyazaki. 1986. Cloning of a gene cluster encoding
biphenyl and chlorobiphenyl degradation in Pseudomonas pseudoalcaligenes. J
Bacteriol 166:392-8.

14. Furukawa, K., H. Suenaga, and M. Goto. 2004. Biphenyl dioxygenases:
functional versatilities and directed evolution. J Bacteriol 186:5189-96.

15. Gomez-Gil, L., P. Kumar, D. Barriault, J. T. Bolin, M. Sylvestre, and L. D.
Eltis. 2007. Characterization of biphenyl dioxygenase of Pandoraea pnomenusa
B-356 as a potent polychlorinated biphenyl-degrading enzyme. J Bacteriol
189:5705-15.

16. He, Z., T. J. Gentry, C. W. Schadt, L. Wu, J. Liebich, S. C. Chong, Z. Huang,
W. Wu, B. Gu, P. Jardine, C. Criddle, and J. Zhou. 2007. GeoChip: a
comprehensive microarray for investigating biogeochemical, ecological and
environmental processes. ISME J 1:67-77.

17. Johnson, J. L. 1994. Similarity analyses of rRNAs. American Society for
Microbiology, Washington, DC.

18. Kalyuzhnaya, M. G., A. Lapidus, N. Ivanova, A. C. Copeland, A. C.
McHardy, E. Szeto, A. Salamov, I. V. Grigoriev, D. Suciu, S. R. Levine, V. M.
Markowitz, I. Rigoutsos, S. G. Tringe, D. C. Bruce, P. M. Richardson, M. E.
Lidstrom, and L. Chistoserdova. 2008. High-resolution metagenomics targets
specific functional types in complex microbial communities. Nat Biotechnol
26:1029-34.

19. Khan, A., and S. Walia. 1989. Cloning of bacterial genes specifying degradation
of 4-chlorobiphenyl from Pseudomonas putida OU83. Appl Environ Microbiol
55:798-805.

20. Kimbara, K., T. Hashimoto, M. Fukuda, T. Koana, M. Takagi, M. Oishi, and
K. Yano. 1989. Cloning and sequencing of two tandem genes involved in
degradation of 2,3-dihydroxybiphenyl to benzoic acid in the polychlorinated
biphenyl-degrading soil bacterium Pseudomonas sp. strain KKS102. J Bacteriol
171:2740-7.

21. Lambo, A. J., and T. R. Patel. 2006. Isolation and characterization of a
biphenyl-utilizing psychrotrophic bacterium, Hydrogenophaga taeniospiralis IA3-A,
that cometabolize dichlorobiphenyls and polychlorinated biphenyl congeners
in Aroclor 1221. J Basic Microbiol 46:94-107.

22. Leigh, M. B., V. H. Pellizari, O. Uhlik, R. Sutka, J. Rodrigues, N. E. Ostrom,
J. Zhou, and J. M. Tiedje. 2007. Biphenyl-utilizing bacteria and their functional
genes in a pine root zone contaminated with polychlorinated biphenyls (PCBs).
ISME J 1:134-48.

23. Maltseva, O. V., T. V. Tsoi, J. F. Quensen, 3rd, M. Fukuda, and J. M. Tiedje.
1999. Degradation of anaerobic reductive dechlorination products of Aroclor
1242 by four aerobic bacteria. Biodegradation 10:363-71.

24. Masai, E., A. Yamada, J. M. Healy, T. Hatta, K. Kimbara, M. Fukuda, and
K. Yano. 1995. Characterization of biphenyl catabolic genes of gram-positive
polychlorinated biphenyl degrader Rhodococcus sp. strain RHA1. Appl Environ
Microbiol 61:2079-85.

25. Mondello, F. J., M. P. Turcich, J. H. Lobos, and B. D. Erickson. 1997.
Identification and modification of biphenyl dioxygenase sequences that determine
the specificity of polychlorinated biphenyl degradation. Appl Environ Microbiol
63:3096-103.

26. Ohtsubo, Y., H. Goto, Y. Nagata, T. Kudo, and M. Tsuda. 2006. Identification
of a response regulator gene for catabolite control from a PCB-degrading
beta-proteobacteria, Acidovorax sp. KKS102. Mol Microbiol 60:1563-75.

27. Pieper, D. H. 2005. Aerobic degradation of polychlorinated biphenyls. Appl
Microbiol Biotechnol 67:170-91.

28. Radajewski, S., P. Ineson, N. R. Parekh, and J. C. Murrell. 2000.
Stable-isotope probing as a tool in microbial ecology. Nature 403:646-9.

29. Reasoner, D. J., and E. E. Geldreich. 1985. A new medium for the enumeration
and subculture of bacteria from potable water. Appl Environ Microbiol 49:1-7.

30. Schloss, P. D., B. R. Larget, and J. Handelsman. 2004. Integration of microbial
ecology and statistics: a test to compare gene libraries. Appl Environ Microbiol
70:5485-92.

31. Sylvestre, M., M. Sirois, Y. Hurtubise, J. Bergeron, D. Ahmad, F. Shareck, D.
Barriault, I. Guillemette, and J. M. Juteau. 1996. Sequencing of Comamonas
testosteroni strain B-356-biphenyl/chlorobiphenyl dioxygenase genes:
evolutionary relationships among Gram-negative bacterial biphenyl dioxygenases.
Gene 174:195-202.

32. Tillmann, S., C. Strompl, K. N. Timmis, and W. R. Abraham. 2005. Stable
isotope probing reveals the dominant role of Burkholderia species in aerobic
degradation of PCBs. FEMS Microbiol Ecol 52:207-17.

33. Weisburg, W. G., S. M. Barns, D. A. Pelletier, and D. J. Lane. 1991. 16S
ribosomal DNA amplification for phylogenetic study. J Bacteriol 173:697-703.

34. Zaitsev, G. M., and Y. N. Karasevich. 1985. Preparatory metabolism of
4-chlorobenzoic and 2,4-dichlorobenzoic acids in Corynebacterium sepedonicum.
Mikrobiologiya 54:356-359.

35. Zhou, J., M. A. Bruns, and J. M. Tiedje. 1996. DNA recovery from soils of
diverse composition. Appl Environ Microbiol 62:316-22.

CHAPTER IV

UNIQUE PCB- AND BIPHENYL-UTILIZING POPULATIONS IN THREE
DIFFERENT ENVIRONMENTAL MATRICES

ABSTRACT
PCB- and biphenyl-utilizing populations in three PCB-contaminated
environmental matrices (plant rhizosphere, sandy industrial soil, and river sediment)
were characterized using stable isotope probing with 13C-biphenyl as substrate and
subsequent pyrosequencing of the V4 region of the 16S rRNA gene. Among the sites, PCB-
and biphenyl-utilizing populations were mostly affiliated with the phyla Proteobacteria,
Actinobacteria, and Acidobacteria, as well as Firmicutes, particularly in the sediment.
However, there was little phylogenetic redundancy among these PCB- and
biphenyl-utilizing populations. Abundant members of the PCB- and biphenyl-utilizing
populations have been suggested in previous studies to possess aromatic degradation
genes or to have activity on aromatic compounds. The phylum Acidobacteria and the genus
Escherichia are new candidate groups that may be involved in PCB degradation in the
environment. Ratios of richness (biphenyl-utilizing population / original community)
suggested that 10-40% of total bacteria might utilize biphenyl carbon. Information
obtained by profiling populations active in PCB degradation in different environments
might provide clues for bioaugmentation of PCB-contaminated sites.
INTRODUCTION

Polychlorinated biphenyls (PCBs) are widely distributed, persistent,
anthropogenic pollutants (ATSDR, 2000). Removal of PCBs from the environment occurs
mostly by bacterial oxidative degradation, anaerobic dechlorination, or a
combination of both, an important mechanism for ecosystem sustainability.
Laboratory-based research shows that the introduction of bacteria, known as
bioaugmentation, can result in extensive PCB degradation in contaminated site materials
(Hickey et al., 1993; Focht and Brunner, 1985). However, in situ studies that introduce
PCB-degrading strains to PCB-contaminated environments often find that PCB degradation
is minimal. This is thought to be due to several factors, including failure of the
introduced strains to survive, grow, or propagate under natural conditions, insufficient
distribution, and poor bioavailability. It is, therefore, necessary to investigate the
composition of natural PCB-degrading populations in concert with thorough analysis of
the chemical and physical properties of contaminated matrices (Rysavy et al., 2005; Yan
et al., 2006). This will serve as a guide for improving bioaugmentation strategies based
on selected indigenous PCB-degrading organisms.

As of February 2009, hundreds of 16S rRNA gene sequences from isolated bacteria
"tagged" as "PCB/biphenyl" had been deposited in the Ribosomal Database Project
(http://rdp.cme.msu.edu/index.jsp). These sequences were mostly affiliated with known
aerobic PCB degraders (Burkholderia, Pseudomonas, and Rhodococcus) as well as the
anaerobic Dehalococcoides, known for its dechlorination abilities. Although isolation of
bacteria is necessary for evaluation of bioaugmentation strains, only a limited
range of bacterial taxa is typically cultivated from PCB-contaminated environments
(Leigh et al., 2006). Isolated bacteria likely do not represent the actual PCB-degrading
community (Leigh et al., 2007).

Thus, culture-independent methods, such as 16S rRNA gene clone libraries, have
been employed to study indigenous bacterial communities in PCB-contaminated
environments. Sequences similar to Burkholderia and Sphingomonas, well-known
PCB-degrading bacteria, were retrieved from PCB-contaminated soil. In addition, there
were a number of sequences affiliated with the phylum Acidobacteria, which is one of
the most abundant phyla in soil but whose members are not known PCB degraders (Nogales
et al., 1999; Nogales et al., 2001). Another study identified increased abundance of
Rhizobiales and Acidobacteria in rhizoremediated PCB-contaminated sites (de Cárcer et
al., 2007). These authors speculated that the identified bacteria were involved in
either direct or indirect PCB utilization since PCB was a major carbon source.

Bacterial members responsive to PCB addition have been determined by assessing
community structure before and after exposure to PCB droplets. Members of the active
PCB-degrading population were found to be closely related to the genera Aquabacterium,
Caulobacter, Imtechium, Nevskia, Parvibaculum, and Burkholderia (Macedo et al., 2007).

Alternatively, stable isotope probing (SIP) (Radajewski et al., 2000) has been
used to directly trace active bacteria involved in aerobic PCB degradation. This method
takes advantage of the incorporation of labeled substrate into DNA and RNA of cells
growing on the labeled substrate, which allows for taxonomic classification of the
organisms and identification of functional genes that have become labeled. This has been
used to target PCB-degrading bacteria in the rhizosphere of Austrian pine (Pinus nigra)
growing in a PCB-contaminated industrial site. The most frequently identified members
from the 13C-DNA fraction were Pseudonocardia, Kribbella, Nocardioides and
Sphingomonas (Leigh et al., 2007).

In this study, we investigated active PCB-degrading communities in three
PCB-contaminated environments using a combination of SIP and 16S rRNA gene
pyrosequencing of the hypervariable V4 region. This study focuses on whether common
PCB populations are selected from different soil or sediment communities.

MATERIALS AND METHODS
Site Description. Rhizosphere (Cz) soil (15 mg/kg of PCB, pH 7.7) was
collected in the root zone of an Austrian pine (Pinus nigra) in the Czech Republic (Leigh
et al., 2006). Sandy soil (Pi) (120 mg/kg of PCB, pH 7) was collected at Picatinny
Arsenal, NJ, USA. Sediment (Rr) (4.8 mg/kg of PCB, pH 7.6) was collected from the River
Raisin, Monroe, MI. DNA of Cz 0d, Cz 4d, Cz 14d, Rr 0d, and Rr 14d (d = days) was
obtained from previous studies (Leigh et al., 2007; Sul et al., 2009). Other DNA was
collected by SIP following incubation with 13C-biphenyl as follows. Microcosms for SIP
were established following previous studies (Leigh et al., 2007). Briefly, 1 mg of
uniformly 13C-labeled biphenyl was added per 5 g of environmental material. Isopycnic
density gradient centrifugation and fractionation protocols were conducted following DNA
extraction as previously described (Leigh et al., 2007). 13C-DNA fractions were
determined by real-time PCR using 16S rRNA genes (Leigh et al., 2007).

V4-16S rRNA Gene Pyrosequencing. PCR for amplicon pyrosequencing was
performed with barcoded primers, which targeted the 16S rRNA gene V4 region as
previously described (Chapter 2). Pyrosequencing was performed using the Genome
Sequencer FLX System (454 Life Sciences, Branford, CT). Raw reads were processed,
filtered, aligned, and clustered through the RDP Pyrosequencing Pipeline (Cole et al.,
2009). All 122,651 sequences were assigned to bacterial taxa with the RDP Classifier
version 2, using the Taxonomic Outline of the Bacteria and Archaea (TOBA), release 7.8
(Cole et al., 2007). Bacterial assemblages were compared using the Chao abundance-based
adjusted Sørensen similarity calculated with EstimateS (purl.oclc.org/estimates),
followed by Principal Coordinate Analysis (PCoA) in the R statistical program (R
Development Core Team) with the vegan package.
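
As an illustration of the similarity index named above, the following minimal Python sketch computes the unadjusted abundance-based Sørensen index, 2UV/(U+V), where U and V are the summed relative abundances of the shared OTUs in each sample. This is not the EstimateS implementation: the adjusted index used in this study additionally corrects U and V for unseen shared OTUs, and that correction is omitted here. The function name and the example counts are hypothetical.

```python
def abundance_sorensen(counts_a, counts_b):
    """Unadjusted abundance-based Sorensen similarity between two samples.

    counts_a, counts_b: dicts mapping OTU id -> sequence count.
    NOTE: the unseen-shared-OTU correction of the adjusted index is omitted.
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    shared = [otu for otu in counts_a
              if counts_a[otu] > 0 and counts_b.get(otu, 0) > 0]
    # U, V: fraction of each sample's sequences that fall in shared OTUs.
    u = sum(counts_a[otu] for otu in shared) / total_a
    v = sum(counts_b[otu] for otu in shared) / total_b
    if u + v == 0:
        return 0.0  # no shared OTUs
    return 2 * u * v / (u + v)


# Hypothetical counts for two samples.
sample_1 = {"otu_a": 5, "otu_b": 5}
sample_2 = {"otu_a": 2, "otu_c": 8}
similarity = abundance_sorensen(sample_1, sample_2)
```

A dissimilarity for ordination can then be taken as 1 minus this value for each pair of samples.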

Estimates of Bacterial Richness. We fitted seven parametric models (single
point mass, gamma, lognormal, inverse Gaussian, Pareto, mixture of two exponentials, and
mixture of three exponentials) to the rank-frequency matrix of each sample. Model
selection followed empirical procedures (Bunge and Barger, 2008). Briefly, we required
that both GOF5 and GOF0 > 0.01 and then sorted the results first by decreasing "tau"
(right truncation point) and second by increasing AICc. Then the minimum-AICc model
within each tau block (models evaluated at the same tau) was examined, and the one with
the largest tau such that SE ≤ est/2 was selected. This may result in competing models,
in which case expert judgment is required. In addition, eleven nonparametric estimators
were calculated using the software SPADE.
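
The selection rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the software actually used: the `Fit` record, its field names (`gof0` and `gof5` standing for the two goodness-of-fit values), and the example numbers are hypothetical, and the expert-judgment step for competing models is not represented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fit:
    """One fitted parametric model at one right-truncation point (hypothetical record)."""
    model: str
    tau: int
    estimate: float
    se: float
    aicc: float
    gof0: float
    gof5: float


def select_model(fits, gof_min=0.01):
    """Return the fit chosen by the rule in the text, or None if none qualifies."""
    # Step 1: keep fits whose two goodness-of-fit values both exceed the threshold.
    passing = [f for f in fits if f.gof0 > gof_min and f.gof5 > gof_min]
    # Step 2: within each tau block, keep the minimum-AICc fit.
    best_per_tau = {}
    for f in passing:
        current = best_per_tau.get(f.tau)
        if current is None or f.aicc < current.aicc:
            best_per_tau[f.tau] = f
    # Step 3: take the largest tau whose block winner has SE <= estimate / 2.
    for tau in sorted(best_per_tau, reverse=True):
        candidate = best_per_tau[tau]
        if candidate.se <= candidate.estimate / 2:
            return candidate
    return None


# Hypothetical fits for one sample.
example_fits = [
    Fit("inverse Gaussian", 50, 1000.0, 600.0, 100.0, 0.5, 0.5),
    Fit("2-mixed exponential", 50, 1200.0, 100.0, 120.0, 0.4, 0.4),
    Fit("3-mixed exponential", 30, 900.0, 80.0, 90.0, 0.2, 0.3),
    Fit("lognormal", 30, 950.0, 50.0, 80.0, 0.001, 0.3),
]
chosen = select_model(example_fits)
```

In this example the tau = 50 block winner is rejected because its standard error exceeds half its estimate, so the rule falls through to the tau = 30 block.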

RESULTS

Bacterial communities in PCB-contaminated sites and their biphenyl-utilizing
populations. The bacterial composition at the phylum level of three PCB-contaminated
sites (rhizosphere, river sediment, and sandy soil), which differed in soil type and PCB
concentration, was determined by V4 16S rRNA gene pyrosequencing. The rhizosphere
soil (Cz 0d) was dominated by three phyla: Proteobacteria, Acidobacteria, and
Verrucomicrobia (Figure 4.1A). River sediment (Rr 0d) exhibited a high Proteobacterial
abundance and contained more sequences affiliated with Bacteroidetes, Firmicutes, and
Chloroflexi than rhizosphere and sandy soil (Figure 4.1B). Actinobacteria dominated the
sandy soil (Pi 0d) and included the genera Streptomyces (5.2%), Nocardioides (2.8%),
and Solirubrobacter (2.6%) (Figure 4.1C).

[Figure 4.1A plot: relative abundance (%) of bacterial taxa in rhizosphere soil. Original
caption not legible in the source scan.]

[Figure 4.1B plot: relative abundance (%) by taxon; legend includes Rr 0d and Rr 14d.]

Figure 4.1B. River Raisin sediment.

Biphenyl-utilizing populations were analyzed using the heavy DNA collected
from 13C-biphenyl SIP after 4 d and 14 d incubations. Both rhizosphere time
points (Cz 4d and Cz 14d) contained sequences most closely classified as Proteobacteria,
Actinobacteria, and Acidobacteria (Figure 4.1A). Notably, these samples were dominated
by genera affiliated with Actinobacteria (Nocardioides, Pseudonocardia, and Kribbella),
with Proteobacteria (Sphingomonas, Escherichia, and Bradyrhizobium), and lastly with
Acidobacteria Gp6 (Appendix A). In river sediment (Rr 4d and Rr 14d), Firmicutes were
higher in relative abundance than in the other soils, alongside a high abundance of
Proteobacteria and Acidobacteria (Figure 4.1B). The most dominant genera were Bacillus,
Arthrobacter, Burkholderia, and Escherichia (Appendix A). There was a lower
abundance of sequences affiliated with Bacteroidetes and Chloroflexi, each of which
accounted for more than 5% of the relative abundance in the original matrix (Rr 0d)
(Figure 4.1B). In the sandy industrial area soil (Pi), Proteobacteria had grown to 80%
in relative abundance at 14 d (20% at 0 d), with less Actinobacteria compared to their
45% at 0 d (Figure 4.1C). High abundances of Phenylobacterium, Azospirillum, Lysobacter,
Wautersia, Pseudoxanthomonas, Escherichia, and Sphingomonas (ordered by relative
abundance), as well as Acidobacteria Gp6, were identified in Pi at 14 d (Appendix A).
Among all three PCB-contaminated sites, the 13C-biphenyl-utilizing populations were
mostly Proteobacteria, Actinobacteria, and Acidobacteria, as well as Firmicutes, the
latter particularly in the sediment.

[Figure 4.1C plot: relative abundance (%) by taxon for Pi 0d and Pi 14d.]

Figure 4.1C. Sandy soil.

PCB- and Biphenyl- Population Shifts During Incubation. A distance-based
(Chao's abundance-based Sørensen similarity) principal coordinate analysis (PCoA) at
97% OTU clustering illustrates the shift in bacterial community structure between
the original total community and the biphenyl-utilizing populations over the 14-day
incubation for the three PCB-contaminated sites (Figure 4.2). OTUs shared between Cz 4d
and Cz 14d contain 85% of the sequences, while OTUs shared between Rr 4d and Rr 14d
contain 75% of those sequences. Most of the lower-abundance OTUs in Cz 4d were
Actinobacteria, whereas Proteobacteria increased at Cz 14d (Figure 4.3A). This increase
was also found in the Rr incubation at 14 d, but was accompanied by a decrease in
Bacillus (Figure 4.3B).

[Figure 4.2 plot: PCoA ordination of Cz 0d, Cz 4d, Cz 14d, Rr 0d, Rr 3d, Rr 14d,
Rr 14ds, Pi 0d, and Pi 14d; horizontal axis PCo1 (40.3%).]

Figure 4.2. Principal Coordinate Analysis (PCoA) plot. Circles represent the original
PCB-contaminated matrix; squares represent the PCB- and biphenyl-utilizing community.

Richness of both the total bacterial and biphenyl-utilizing communities was
estimated by both parametric and nonparametric methods (supplemental materials).
Regardless of sample origin, estimation at the lowest OTU clustering level (90%)
selected an inverse Gaussian as the appropriate abundance model. In contrast, 2-mixed
or 3-mixed exponential models were better fits at higher OTU clustering levels. The
proportion of the biphenyl-utilizing population relative to total bacteria can be
calculated from the ratio of richness estimates (biphenyl-utilizing population / total
bacteria). Ratios at 97% OTUs are 27% (Cz 4d/Cz 0d, parametric), 27% (Cz 4d/Cz 0d,
nonparametric), 43% (Cz 14d/Cz 0d, parametric), and 36% (Cz 14d/Cz 0d, nonparametric).
The sandy soil has a lower proportion of biphenyl-utilizing populations: 16% (Pi 14d/Pi
0d, parametric) and 10% (Pi 14d/Pi 0d, nonparametric), while richness estimates of
biphenyl-utilizing populations in the sediment were larger than the total bacterial
population estimates: 218% and 153% (Rr 3d/Rr 0d, parametric and nonparametric,
respectively), and 128% and 109% (Rr 14d/Rr 0d, parametric and nonparametric,
respectively) (Tables 4.1, 4.2, and 4.3).
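
The ratios quoted above follow directly from the point estimates at 97% OTUs. A minimal Python sketch of the arithmetic is given below; the values are the parametric and nonparametric point estimates as read from the 97% OTU richness table (standard errors are ignored), and the dictionary layout and function name are illustrative choices.

```python
# (parametric estimate, nonparametric estimate) at 97% OTUs, as read from the table.
richness_97 = {
    "Cz 0d": (9060, 7451),
    "Cz 4d": (2456, 2045),
    "Cz 14d": (3938, 2647),
    "Rr 0d": (6994, 6827),
    "Rr 3d": (15225, 10429),
    "Rr 14d": (8952, 7449),
    "Pi 0d": (6440, 6375),
    "Pi 14d": (1030, 646),
}


def richness_ratio_percent(sample, baseline, parametric=True):
    """Richness of the biphenyl-utilizing population as a percentage of the 0 d community."""
    index = 0 if parametric else 1
    return 100.0 * richness_97[sample][index] / richness_97[baseline][index]


for sample, baseline in [("Cz 4d", "Cz 0d"), ("Cz 14d", "Cz 0d"),
                         ("Pi 14d", "Pi 0d"), ("Rr 3d", "Rr 0d"),
                         ("Rr 14d", "Rr 0d")]:
    p = richness_ratio_percent(sample, baseline, parametric=True)
    n = richness_ratio_percent(sample, baseline, parametric=False)
    print(f"{sample}/{baseline}: parametric {p:.0f}%, nonparametric {n:.0f}%")
```

Because these are ratios of two estimates that each carry a standard error, the percentages should be read as approximate.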

Shared OTUs of Three Biphenyl-Utilizing Populations After 14 Days
Incubation. Over the same incubation period (Cz 14d, Rr 14d, Pi 14d), only 46 of 11,951
OTUs of the biphenyl-utilizing bacterial populations were shared among all three samples.
Representative sequences of each shared OTU, defined as those with the lowest sum of
distances to the other sequences within the OTU, were mostly Acidobacteria,
Actinobacteria, and Proteobacteria. Two OTUs assigned to the genus Escherichia and to
unclassified Enterobacteriaceae were present at a relatively high abundance in all three
samples (Figure 4.4). Most of the remaining OTUs were identified at high abundances in
only one or two samples.
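
The two operations described above, finding OTUs shared by all samples and picking a representative sequence as the member with the lowest summed distance to the others, can be sketched in Python as follows. This is an illustration only: the function names, the toy OTU tables, and the simple mismatch-fraction distance on pre-aligned sequences are assumptions, not the pipeline used in this study.

```python
def shared_otus(otu_tables):
    """OTU ids with a nonzero count in every sample.

    otu_tables: dict mapping sample name -> dict of OTU id -> sequence count.
    """
    present = [{otu for otu, n in counts.items() if n > 0}
               for counts in otu_tables.values()]
    return set.intersection(*present) if present else set()


def mismatch_fraction(seq_a, seq_b):
    """Fraction of differing positions between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def representative(members, distance=mismatch_fraction):
    """Member with the lowest sum of distances to all members of the OTU."""
    return min(members, key=lambda s: sum(distance(s, t) for t in members))


# Hypothetical OTU tables for three samples.
tables = {
    "Cz 14d": {"otu1": 12, "otu2": 3, "otu3": 0},
    "Rr 14d": {"otu1": 40, "otu3": 7},
    "Pi 14d": {"otu1": 5, "otu2": 1, "otu3": 2},
}
common = shared_otus(tables)
rep = representative(["AAAA", "AAAT", "AATT"])
```

The representative chosen this way is a medoid, so it is always an observed sequence rather than a computed consensus.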

Different Incubation Methods Altered Biphenyl-Utilizing Populations.
Different biphenyl-utilizing populations were detected depending on the SIP incubation
conditions. A previously studied incubation of River Raisin sediments at 14 days (Rr
14ds) used a slurry incubation instead of the static one used in the experiments
presented so far. The dominant genera in the slurry were Pseudomonas (47.8%),
Acidovorax (6.9%), Chitinophaga (4.7%), and Achromobacter (3.6%). With the static
method, these genera comprised less than 0.3% of the community in either Rr 3d or Rr 14d.
The ten most abundant 97% OTUs of the current Rr 14d are rare members in the Rr 14d
slurry (<0.15% relative abundance) (Figure 4.2). The ten most abundant Rr 14d slurry
OTUs accounted for only 0.46% of the sequences in the static Rr 14d.

 

Sample     No. of      Observed   Parametric   Abundance             Nonparametric   Estimator
(90% OTUs) sequences   OTUs       estimate     model                 estimate
Cz 0d      11400       1390       3171±249     Inverse Gaussian      2530±54         ACE-1
Cz 4d      4089        586        1006±73      Inverse Gaussian      824±16          ACE
Cz 14d     12338       898        1270±57      Inverse Gaussian      1138±19         ACE
Rr 0d      12697       1547       3368±234     Inverse Gaussian      2737±63         ACE-1
Rr 3d      22716       2274       3535±73      2-Mixed Exponential   3006±34         ACE
Rr 14d     24217       2167       2856±39      2-Mixed Exponential   2551±25         ACE
Rr 14ds    21449       551        1249±191     2-Mixed Exponential   830±40          ACE
Pi 0d      10609       1113       2973±338     Inverse Gaussian      2108±194        ACE-1
Pi 14d     3136        255        397±37       Inverse Gaussian      338±19          ACE

Table 4.1. Bacterial richness estimations at 90% OTUs. The abundance model for the
parametric estimates and the estimator for the nonparametric estimates were selected by
empirical procedures to calculate the "best" estimation.

 

Sample     No. of      Observed   Parametric   Abundance             Nonparametric   Estimator
(97% OTUs) sequences   OTUs       estimate     model                 estimate
Cz 0d      11400       2846       9060±726     3-Mixed Exponential   7451±119        ACE-1
Cz 4d      4089        1075       2456±180     Inverse Gaussian      2045±192        ACE-1
Cz 14d     12338       1871       3938±406     3-Mixed Exponential   2647±34         ACE
Rr 0d      12697       2923       6994±241     2-Mixed Exponential   6827±114        ACE-1
Rr 3d      22716       6162       15225±1447   3-Mixed Exponential   10429±90        ACE
Rr 14d     24217       5493       8952±113     3-Mixed Exponential   7449±54         ACE
Rr 14ds    21449       926        2151±155     3-Mixed Exponential   2081±250        ACE-1
Pi 0d      10609       2324       6440±358     3-Mixed Exponential   6375±130        ACE-1
Pi 14d     3136        402        1030±159     Inverse Gaussian      646±40          ACE

Table 4.2. Bacterial richness estimations at 97% OTUs.

 

Sample     No. of      Observed   Parametric   Abundance             Nonparametric   Estimator
(99% OTUs) sequences   OTUs       estimate     model                 estimate
Cz 0d      11400       3931       18527±2527   3-Mixed Exponential   13556±193       ACE-1
Cz 4d      4089        1432       3866±283     3-Mixed Exponential   3433±81         ACE-1
Cz 14d     12338       2824       7306±681     3-Mixed Exponential   5573±96         ACE-1
Rr 0d      12697       4132       14734±950    3-Mixed Exponential   12977±182       ACE-1
Rr 3d      22716       10095      25161±374    2-Mixed Exponential   19373±152       ACE
Rr 14d     24217       9016       16864±194    3-Mixed Exponential   13425±84        ACE
Rr 14ds    21449       1428       3833±273     3-Mixed Exponential   3583±90         ACE-1
Pi 0d      10609       3224       12463±846    3-Mixed Exponential   11848±176       ACE-1
Pi 14d     3136        542        1326.2±123   2-Mixed Exponential   1272.4223       ACE-1

Table 4.3. Bacterial richness estimations at 99% OTUs.

Figure 4.3A. Increase and decrease in relative abundance of shared OTUs in Cz 4d
and Cz 14d. The solid line in the middle represents the mean ratio of the OTUs' relative
abundance between the two samples. OTUs indicated by lower-case characters have at least
two-fold higher abundance than in Cz 14d and more than 0.5% relative abundance in Cz 4d.
Representative sequences of these OTUs were classified as: a, Nocardioides; b,
unclassified bacteria; c, unclassified Nocardioidaceae; d, unclassified
Micromonosporaceae; e, Nocardioides; f, Nocardioides; g, Promicromonospora; h,
Kribbella; i, Acidobacteria Gp16; j, Acidobacteria Gp6. OTUs indicated by italic
characters have consistent abundance in both samples, with less than a two-fold
difference in either direction. OTUs indicated by numbers have at least two-fold higher
abundance than in Cz 4d and more than 0.5% relative abundance in Cz 14d. Representative
sequences of these OTUs were classified as: 1, Pedomicrobium; 2, Escherichia; 3,
unclassified Rhizobiales; 4, unclassified Comamonadaceae; 5, unclassified
Comamonadaceae; 6, Sphingomonas; 7, unclassified bacteria; 8, Verrucomicrobia
Subdivision 3; 9, unclassified Rhizobiales; 10, unclassified Sphingomonadaceae.
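
The labeling rule in this caption can be expressed as a small Python function. The sketch below is an illustrative reading of the caption, not the script used to draw the figure: the category names are hypothetical, and it assumes both relative abundances are given in percent and are greater than zero (the OTUs are shared by both samples).

```python
def classify_shift(early, late, fold=2.0, min_abundance=0.5):
    """Classify a shared OTU by its relative abundance (%) at two time points.

    early: relative abundance (%) in the earlier sample (e.g. Cz 4d).
    late:  relative abundance (%) in the later sample (e.g. Cz 14d).
    """
    if early >= fold * late and early > min_abundance:
        return "higher_early"   # lower-case letters in the figure
    if late >= fold * early and late > min_abundance:
        return "higher_late"    # numbers in the figure
    if early < fold * late and late < fold * early:
        return "consistent"     # italic characters in the figure
    return "unlabeled"          # two-fold change but below the abundance cutoff


# Hypothetical relative abundances (%) for four shared OTUs.
examples = {
    "otu_a": (2.0, 0.5),
    "otu_b": (0.4, 1.2),
    "otu_c": (1.0, 1.5),
    "otu_d": (0.3, 0.1),
}
labels = {otu: classify_shift(e, l) for otu, (e, l) in examples.items()}
```

The last category covers OTUs that change at least two-fold but stay at or below the 0.5% cutoff, which the caption does not label.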

[Figure 4.3A plot. Panels: ratio of relative abundance (Cz 4d/Cz 14d and Cz 14d/Cz 4d);
relative abundance (%).]

Figure 4.3B. Increase and decrease in relative abundance of shared OTUs in Rr 3d
and Rr 14d. The solid line in the middle represents the ratio of the OTUs' relative
abundance between the two samples. OTUs indicated by small-cap characters have at least
two-fold higher abundance than in Rr 14d. Representative sequences of notable OTUs were
classified as: a, Acidobacteria Gp7; b, Acidobacteria Gp4; c, Burkholderia; d, Bacillus;
e, Bradyrhizobium; f, Sporosarcina; g, Acidobacteria Gp5. OTUs indicated by italic
characters have consistent abundance in both samples, with less than a two-fold
difference in either direction. Notable OTUs are: a, b, and c, Bacillus; d,
Arthrobacter; e and f, Bacillus; g, Acidobacteria Gp4; h, unclassified Proteobacteria;
i, Bacillus; j, Acidobacteria Gp4; k, Methylobacterium; l, unclassified bacteria; m and
n, Acidobacteria Gp6; o, Verrucomicrobia; p, Acidobacteria Gp4; q, Blastochloris; r,
Acidobacteria Gp6; s, Escherichia; t, Acidobacteria Gp6; u, Acidobacteria Gp4; v,
unclassified bacteria; w, Rhodoplanes; x, Gemmatimonas; y, Verrucomicrobia. OTUs
indicated by numbers have at least two-fold higher abundance than in Rr 4d.
Representative sequences of notable OTUs were classified as: 1, Clostridium; 2,
Pseudomonas; 3, unclassified Rhizobiales; 4, unclassified Sphingomonadaceae; 5,
unclassified Beijerinckiaceae; 6, unclassified Bacteria; 7, Gemmatimonas.

[Figure 4.3B plot. Panels: ratio of relative abundance (including Rr 14d/Rr 3d);
relative abundance (%); legend: Rr 3d, Rr 14d.]

 

Figure 4.4. Shared OTUs among three PCB- and biphenyl-utilizing populations after
14 days of incubation with 13C-biphenyl (Pi). P is an abbreviation of Proteobacteria.

DISCUSSION

We focused on the characterization of indigenous bacterial communities in three
different PCB-contaminated sites and their PCB- and biphenyl-utilizing populations.
Bacterial communities in these PCB-contaminated sites had very low phylogenetic
commonality. These trends were also found in a previous study that showed four
randomly chosen soils shared just a few common species, <5% at 97% OTUs (Fulthorpe
et al., 2008). Since the presence of PCBs is the only apparent common attribute in our
soils, the differences in geographical distances, soil characteristics, plant interactions, and
PCB concentrations can explain the taxonomic differences.

PCB- and biphenyl-degrading populations in PCB-contaminated sites differed by
sample origin. The dominant genera in these sites are known PCB- and
biphenyl-degrading bacteria, possess aromatic compound degradative genes, or were
previously found in PCB-contaminated sites. Among the PCB- and biphenyl-degrading
populations of rhizosphere soil were members of Nocardioides, Pseudonocardia,
Kribbella, and Sphingomonas, which were previously identified in the 16S rRNA clone
library from these soils (Leigh et al., 2007). In addition, Bradyrhizobium, which has
members known to degrade 4-chlorobenzoate (Gentry et al., 2004), was found; it was also
found previously in PCB-contaminated soil (Nogales et al., 1999; Nogales et al., 2001)
and in PCB biofilms (Tillmann et al., 2005; Macedo et al., 2007). Among the PCB- and
biphenyl-degrading populations in river sediment, Bacillus includes a known thermophilic
PCB degrader isolated from compost (Shimura et al., 1999). Arthrobacter can transform
PCB congeners (Kohler et al., 1988), can be induced to degrade PCB by plant compounds
(Gilbert and Crowley, 1997), and was also found in a chlorobenzene-contaminated aquifer
(Abraham et al., 2005) and in Antarctica (Michaud et al., 2007). Burkholderia are
well-known PCB degraders (reviewed in Pieper, 2008). Among the PCB- and
biphenyl-degrading populations in sandy soil, Phenylobacterium spp. possess chloridazon
(herbicide) catechol dioxygenase (Blecher et al., 1981), Azospirillum species show
chemotaxis to aromatic compounds such as protocatechuate, catechol, and
4-hydroxybenzoate (Lopez-de-Victoria and Lovell, 1993), Lysobacter species can degrade
naphthalene and phenanthrene (Maeda et al., 2009), and Pseudoxanthomonas species are
able to degrade BTEX compounds (Kim et al., 2008).

Most of the abundant genera are relevant to the degradation of PCBs or their
intermediates, while several dominant bacterial groups in the biphenyl-degrading
populations were not previously identified as PCB- and biphenyl-degraders. The presence
of Acidobacteria in the biphenyl-degrading populations in all three samples is of
particular interest. Acidobacteria, especially of subdivisions 4 and 6, may be members of
an initial biphenyl-degrading consortium. However, there is no proof of their biphenyl
degradation because members of this phylum are difficult to cultivate. Acidobacteria
dominated in a highly PCB-contaminated soil (Nogales et al., 1999), and aromatic ring
dioxygenases such as protocatechuate 3,4-dioxygenase, albeit part of a more common
aromatic metabolism pathway, were found in complete Acidobacteria genomes (Ward et al.,
2009). Surprisingly, sequences of the genus Escherichia were also consistently found in
the three biphenyl-degrading populations (Figure 4 and Appendix A). Escherichia can be
found outside of animal intestinal tracts, and environmental strains may harbor more
metabolic diversity (Whitman and Nevers, 2003). The biphenyl-selected OTU, whose median
(representative) sequence was classified as Shigella, seems most like clade V of
environmental E. coli based on sequence identity, although there are no polymorphisms
within the V4 region among clade V environmental E. coli, pathogenic E. coli, and
Shigella. Regardless of whether this group is environmental E. coli or not, this group of
bacteria has not yet been reported to contain any biphenyl degradation-related genes,
although little is known about the metabolic capacity of the understudied environmental
Escherichia. It is known, however, that E. coli possesses enzymes for downstream steps of
the biphenyl pathway. The consistently higher abundance of the Escherichia OTU at 14 d
rather than 3 d in sediment and rhizosphere soil is consistent with utilization of PCB
intermediates.

[Figure 4.5: plot and rotated caption; not recoverable from the scan.]

A caveat of using SIP incubations is that primary biphenyl-degraders initially
metabolize biphenyl but also produce secondary and intermediate metabolites that can be
utilized by cross-feeders or non-specific carbon substrate scavengers. Hence, it is
impossible to distinguish between primary and secondary biphenyl-C utilizing populations.
This complexity is illustrated by the difference in biphenyl-degrading populations among
our sites (Figure 6). Although there was a general lack of common biphenyl-degrading
populations among our PCB-contaminated sites, 46 OTUs were common and may
represent cosmopolitan bacteria able to degrade biphenyl or consume intermediate
biphenyl substrates regardless of environmental barriers.

The application of deep sequencing to SIP (heavy DNA) samples has advantages
in searching for and identifying less abundant possible PCB-degraders. For instance, in
both Cz 4d and Cz 14d, we found 0.1% of sequences to be of Rhodococcus, which was
previously the dominant isolate from the same sample (Leigh et al., 2006), although not
detected in the previous clone library. Another benefit is more reliable bacterial
richness estimation, which enables calculation of the portion of the community that can
derive carbon from the single source. Based on our calculation, biphenyl can be utilized
by 10-45% of the total community. The estimated ratios between Rr 0d and Rr 3d and
Rr 14d in river sediment are not reliable because we changed the environmental condition
from anaerobic to aerobic during incubation. Nonetheless, this might be the first
estimate of the effect of a single carbon source on a microbial community.

[Figure 4.6 schematic. Columns: Rhizosphere, Sediment, Sandy Soil. Rows (based on
13C-SIP): primary biphenyl-utilizing bacteria; intermediate substrate; cross-feeder or
carbon scavenger. Taxon labels legible in the scan: Phenylobacterium, order Bacillales,
Burkholderia, Ralstonia, E. coli, unclassified Enterobacteriaceae, Acidobacteria Gp4, 6,
and 16.]

Figure 4.6. Schematic summary of biphenyl-utilizing bacteria and cross-feeders in three
PCB-contaminated sites.

Our comparison of bacterial populations between two different enrichment
methods (Rr 14d slurry and Rr 14d static) indicated that the slurry addition caused rapid
growth of specific r-strategy bacterial groups. The slurry condition had greater
substrate availability due to a 10x higher biphenyl concentration and resulted in an even
carbon source distribution. The static conditions probably favored populations like those
that would naturally encounter PCBs, while the slurry favored the fast-growing soil
consortium.

Overall, these findings indicate that 13C-biphenyl-utilizing populations change as a
function of the inherent site characteristics, incubation time, and incubation method.
The lack of a common biphenyl-degrading population among sites illustrates that soil
heterogeneity plays a large role in promoting and maintaining these populations. This
suggests that successful bioaugmentation of PCB-contaminated soils requires that the
capability of the native soil to sustain an augmented population is known. An appropriate
augmented population can then be chosen to increase success rates in the remediation of
PCB-contaminated soils.

ACKNOWLEDGMENTS
W.J.S. thanks Seth Walk for advice on environmental Escherichia and the
Ribosomal Database Project group for incredible support in analyses of rRNA sequence
data.
AUTHOR CONTRIBUTIONS

Mary Beth Leigh provided the [13C]-DNA. John Bunge calculated the parametric and
nonparametric estimates. Ryan Penton was involved in statistical analysis and
project improvement.

REFERENCES

Abraham WR, Wenderoth DF, Glasser W (2005) Diversity of biphenyl degraders in a
chlorobenzene polluted aquifer. Chemosphere 58:529-533

Blecher H, Blecher R, Wegst W, Eberspaecher J, Lingens F (1981) Bacterial degradation
of aminopyrine. Xenobiotica 11:749-754

Bunge J, Barger K (2008) Parametric models for estimating the number of classes. Biom
J 50:971-982

Chao A, Chazdon RL, Colwell RK, Shen TJ (2006) Abundance-based similarity indices
and their estimation when there are unseen species in samples. Biometrics 62:361-371

Cole JR, Wang Q, Cardenas E, Fish J, Chai B, Farris RJ, Kulam-Syed-Mohideen AS,
McGarrell DM, Marsh T, Garrity GM, Tiedje JM (2009) The Ribosomal Database
Project: improved alignments and new tools for rRNA analysis. Nucleic Acids Res
37:D141-145

Focht DD, Brunner W (1985) Kinetics of Biphenyl and Polychlorinated Biphenyl
Metabolism in Soil. Appl Environ Microbiol 50:1058-1063

Fulthorpe RR, Roesch LF, Riva A, Triplett EW (2008) Distantly sampled soils carry few
species in common. ISME J 2:901-910

Gentry TJ, Wang G, Rensing C, Pepper IL (2004) Chlorobenzoate-degrading bacteria in
similar pristine soils exhibit different community structures and population
dynamics in response to anthropogenic 2-, 3-, and 4-chlorobenzoate levels.
Microb Ecol 48:90-10

Gilbert ES, Crowley DE (1997) Plant compounds that induce polychlorinated biphenyl
biodegradation by Arthrobacter sp. strain B1B. Appl Environ Microbiol 63:1933-1938

Hickey WJ, Searles DB, Focht DD (1993) Enhanced mineralization of polychlorinated
biphenyls in soil inoculated with chlorobenzoate-degrading bacteria. Appl
Environ Microbiol 59:1194-1200

Kim JM, Le NT, Chung BS, Park JH, Bae JW, Madsen EL, Jeon CO (2008) Influence of
soil components on the biodegradation of benzene, toluene, ethylbenzene, and o-,
m-, and p-xylenes by the newly isolated bacterium Pseudoxanthomonas spadix
BD-a59. Appl Environ Microbiol 74:7313-7320

Kohler HP, Kohler-Staub D, Focht DD (1988) Cometabolism of polychlorinated
biphenyls: enhanced transformation of Aroclor 1254 by growing bacterial cells.
Appl Environ Microbiol 54:1940-1945

Leigh MB, Pellizari VH, Uhlik O, Sutka R, Rodrigues J, Ostrom NE, Zhou J, Tiedje JM.
(2007) Biphenyl-utilizing bacteria and their functional genes in a pine root zone
contaminated with polychlorinated biphenyls (PCBs). ISME J 1:134-148

Lopez-de-Victoria G, Lovell CR (1993) Chemotaxis of Azospirillum Species to Aromatic
Compounds. Appl Environ Microbiol 59:2951-2955

Macedo AJ, Kuhlicke U, Neu TR, Timmis KN, Abraham WR (2005) Three stages of a
biofilm community developing at the liquid-liquid interface between
polychlorinated biphenyls and water. Appl Environ Microbiol 71:7301-7309

Maeda R, Nagashima H, Zulkharnain AB, Iwata K, Omori T (2009) Isolation and
characterization of a car gene cluster from the naphthalene, phenanthrene, and
carbazole-degrading marine isolate Lysobacter sp. strain OC7. Curr Microbiol
59:154-159

Michaud L, Di Marco G, Bruni V, Lo Giudice A (2007) Biodegradative potential and
characterization of psychrotolerant polychlorinated biphenyl-degrading marine
bacteria isolated from a coastal station in the Terra Nova Bay (Ross Sea,
Antarctica). Mar Pollut Bull 54:1754-1761

Nogales B, Moore ER, Abraham WR, Timmis KN (1999) Identification of the
metabolically active members of a bacterial community in a polychlorinated
biphenyl-polluted moorland soil. Environ Microbiol 1:199-212

Nogales B, Moore ER, Llobet-Brossa E, Rossello-Mora R, Amann R, Timmis KN (2001)
Combined use of 16S ribosomal DNA and 16S rRNA to study the bacterial
community of polychlorinated biphenyl-polluted soil. Appl Environ Microbiol
67:1874-1884

Pieper DH, Seeger M (2008) Bacterial metabolism of polychlorinated biphenyls. J Mol
Microbiol Biotechnol 15:121-138

Radajewski S, Ineson P, Parekh NR, Murrell JC (2000) Stable-isotope probing as a tool
in microbial ecology. Nature 403:646-649

Rysavy JP, Yan T, Novak PJ (2005) Enrichment of anaerobic polychlorinated biphenyl
dechlorinators from sediment with iron as a hydrogen source. Water Res 39:569-578

Shimura M, Mukerjee-Dhar G, Kimbara K, Nagato H, Kiyohara H, Hatta T (1999)
Isolation and characterization of a thermophilic Bacillus sp. JF8 capable of
degrading polychlorinated biphenyls and naphthalene. FEMS Microbiol Lett
178:87-93

Sul WJ, Park J, Quensen JF III, Rodrigues JLM, Seliger L, Tsoi TV, Zylstra GJ, Tiedje
JM (2009) DNA-Stable Isotope Probing Integrated with Metagenomics: Retrieval
of Biphenyl Dioxygenase Genes from PCB-Contaminated River Sediment. Appl
Environ Microbiol (in process)

Tillmann S, Strompl C, Timmis KN, Abraham WR (2005) Stable isotope probing reveals
the dominant role of Burkholderia species in aerobic degradation of PCBs. FEMS
Microbiol Ecol 52:207-217

Ward NL, Challacombe JF, Janssen PH, Henrissat B, Coutinho PM, Wu M, Xie G, Haft
DH, Sait M, Badger J, Barabote RD, Bradley B, Brettin TS, Brinkac LM, Bruce
D, Creasy T, Daugherty SC, Davidsen TM, DeBoy RT, Detter JC, Dodson RJ,
Durkin AS, Ganapathy A, Gwinn-Giglio M, Han CS, Khouri H, Kiss H, Kothari
SP, Madupu R, Nelson KE, Nelson WC, Paulsen I, Penn K, Ren Q, Rosovitz MJ,
Selengut JD, Shrivastava S, Sullivan SA, Tapia R, Thompson LS, Watkins KL,
Yang Q, Yu C, Zafar N, Zhou L, Kuske CR (2009) Three genomes from the
phylum Acidobacteria provide insight into the lifestyles of these microorganisms
in soils. Appl Environ Microbiol 75:2046-2056

Yan T, Lapara TM, Novak PJ (2006) The Impact of Sediment Characteristics on PCB-
dechlorinating Cultures: Implications for Bioaugmentation. Bioremediat J 10:143-151

CHAPTER V
MICROBIAL COMMUNITY (ASSEMBLAGES) COMPARISONS BY
BACTERIAL TAXONOMY-SUPERVISED METHOD BYPASSING SEQUENCE
ALIGNMENT AND CLUSTERING

Author contributions:

Ryan Farris provided computational support and bacterial classifications. Ederson Jesus
helped with statistical analysis and project improvement and provided some sets of
sequences. Other providers of DNA samples, environmental matrix or pyrosequences
were: Mary Beth Leigh, David Emerson, Chris Blackwood, Ederson Jesus, Erick
Cardenas, Stres Blaz, Stephan Gantner, Claudia Etchebehere, Thad Stanton, Debora
Rodrigues, Aviaja Hansen, Mathew Marshall, Alexandre Soares Rosado, and Dan Fisher.

ABSTRACT

Two species-sites matrices, each listing species abundances as rows and sites
(bacterial assemblages) as columns, were compared by classic Q-mode analysis to describe
interrelationships between sites and bacterial assemblages: one built from taxonomy-bins
based on the existing bacterial taxonomy, and one built from non-taxonomy-supervised
(clustering-determined) OTUs. Similarity index measures and the morphology of points in
principal coordinate analysis (PCoA) from the two matrices, based on 1.3 million 16S rRNA
gene sequences from pyrosequencing, were significantly correlated with each other. The
taxonomy-supervised method, using taxonomy-bins, can compare non-overlapping sequences,
which are often found when pyrosequencing targets various regions within 16S rRNA genes,
and is not limited by the exhaustive computation required for the alignment and
clustering used by the non-taxonomy-based method, but it does not resolve as well where
the current taxonomy is limited.

INTRODUCTION

Recently, the increasing abundance of 16S rRNA gene sequences has provided
new insight into the analysis of microbial communities (Tringe and Hugenholtz, 2008),
mostly due to the reduced sequencing cost of new sequencing technologies. Although short
read lengths make it difficult to assign sequences for the purpose of bacterial taxonomy,
deep sequencing with these new formats (e.g. 454 pyrosequencing [Margulies et al.,
2005]) is an emerging trend (Sogin et al., 2006; Huber et al., 2007; Roesch et al., 2008;
Chapter 2). More comprehensive sequencing provides better opportunities for intensive
bacterial community profiling and bacterial community comparisons. When bacterial
assemblages are compared with 16S rRNA gene sequences by classic Q-mode analysis to
describe interrelationships between sites (bacterial assemblages), each sequence is
allocated to a species or OTU (operational taxonomic unit) by alignment-based clustering
at a specified nucleotide distance, usually 97% similarity. The resulting species-site
OTU matrix, which is based exclusively on the nucleotide distances among 16S rRNA
sequences, is arranged with OTUs as rows and sites or assemblages as columns. This matrix
can be used to measure site similarities with either presence/absence or abundance data.
Site clustering and site ranking can also be performed on the resulting site-site
distance matrix by ordination or hierarchical clustering. This process is termed
"taxonomy non-supervised analysis", and is based simply on the distribution of
sequences among OTUs.
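The species-site matrix described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the site labels, and the toy OTU assignments are hypothetical and do not come from the dissertation's data or from the RDP pipeline.

```python
from collections import Counter


def species_site_matrix(assignments):
    """Build an abundance matrix with units (OTUs or bins) as rows and sites as columns.

    `assignments` is an iterable of (site, unit) pairs, one pair per sequence.
    """
    counts = Counter(assignments)
    sites = sorted({site for site, _ in counts})
    units = sorted({unit for _, unit in counts})
    # A Counter returns 0 for (site, unit) pairs that were never observed.
    matrix = [[counts[(site, unit)] for site in sites] for unit in units]
    return units, sites, matrix


def to_presence_absence(matrix):
    """Convert abundance counts to 1/0 occurrence data."""
    return [[1 if n > 0 else 0 for n in row] for row in matrix]


# Hypothetical toy input: each tuple is one sequence and the OTU it fell into.
reads = [
    ("soil_A", "OTU_1"), ("soil_A", "OTU_1"), ("soil_A", "OTU_2"),
    ("feces_B", "OTU_2"), ("feces_B", "OTU_3"),
]
units, sites, abundance = species_site_matrix(reads)
occurrence = to_presence_absence(abundance)
```

The same structure serves both analyses compared in this chapter; only the meaning of the row labels changes (clustering-determined OTUs versus taxonomy-bins).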

When applying taxonomy non-supervised analysis, the large numbers of
sequences (>10^6) generated by new sequencing technologies are an issue. Analysis
requires a large computational capacity in order to process the sequence data (Hamady
and Knight, 2009). The alignment and clustering of sequences, which requires calculation
of pair-wise nucleotide distances, is the bottleneck when this method is used. Taxonomy
non-supervised OTU analysis is advantageous in that it includes sequences which are as
yet unassignable to taxonomy. However, the current computational limitations make
comparisons among samples difficult.

Thus, we investigated an alternative method, which is to allocate sequences into
taxonomy-supervised OTUs, or 'taxonomy-bins', based on the existing bacterial
taxonomy, which is rooted in 'polyphasic taxonomy' (Colwell, 1970) reflecting
physiological, morphological, and genetic information. We define taxonomy-bins as all
taxonomic units (Genus to Phylum) provided by the Taxonomic Outline of the Bacteria
and Archaea (TOBA), release 7.8 (Cole et al., 2007), augmented with non-validated taxa
to cover sequences unassigned to the current bacterial taxonomy. Currently, several
ribosomal RNA databases (i.e. RDP [Wang et al., 2007], Greengenes [DeSantis et al.,
2006], and SILVA [Pruesse et al., 2007]) are dedicated to sequence deposition and
provide algorithm-based 16S rRNA gene classification tools.

In this study, taxonomy non-supervised OTUs and taxonomy-bins are compared
with two similarity measures on 1.3 million sequences from 211 bacterial
assemblages (Appendix B3).

MATERIALS AND METHODS
We used approximately 1.3M V4 region 16S rRNA gene sequences collected
from 211 samples previously described in Chapter 2. We chose the following a priori
groups: the habitat grouping was based on the habitat definitions suggested in
Habitat-Lite Version 0.4 (Hirschman et al., 2008; definitions of terms are listed in
Appendix B1). The categories of the a priori groups G01-G11 are listed in Appendix B2,
and the group assignments of the 211 samples are listed in Appendix B3.

For the non-supervised analysis, species-site matrices were generated as
previously described in Appendix B5. Briefly, all sequences were aligned by secondary
structure using Infernal (Nawrocki et al., 2009), clustered by complete-linkage
clustering, and then allocated into 97% OTUs through RDP's pyrosequencing pipeline
(Cole et al., 2009).

For the taxonomy-supervised analysis, all sequences were allocated into
taxonomy-bins: 1400 genera and 492 artificial 'unclassified' taxa provided by RDP
classifier-II at 80%, 50%, and 0% confidence thresholds. Each of the lowest taxonomy
units, i.e. genera and 'unclassified' taxa, was considered a taxonomy-bin. The
reliability of classification of each sequence was estimated using bootstrapping, and
sequences that could not be assigned, because they fell below a bootstrap confidence
threshold, were allocated to an artificial 'unclassified' taxon.
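The thresholding rule in this paragraph can be illustrated with a small sketch. It assumes that a classifier has already produced, for each sequence, a lineage from domain to genus with a bootstrap confidence per rank; the tuple layout, the function name, the `unclassified_<taxon>` label format, and the example confidences are all hypothetical and are not the RDP classifier's actual output format.

```python
def taxonomy_bin(lineage, threshold):
    """Return the taxonomy-bin for one sequence.

    `lineage` is a list of (rank, name, confidence) tuples ordered from domain
    to genus, with confidence between 0 and 1. The sequence goes to its genus
    bin when every rank meets the threshold; otherwise it goes to an artificial
    'unclassified' bin under the deepest taxon that did meet it.
    """
    deepest_confident = "Root"
    for _rank, name, confidence in lineage:
        if confidence < threshold:
            return f"unclassified_{deepest_confident}"
        deepest_confident = name
    return deepest_confident


# Hypothetical classifier output for a single read.
example_lineage = [
    ("domain", "Bacteria", 1.00),
    ("phylum", "Proteobacteria", 0.97),
    ("class", "Gammaproteobacteria", 0.95),
    ("order", "Enterobacteriales", 0.90),
    ("family", "Enterobacteriaceae", 0.88),
    ("genus", "Escherichia", 0.62),
]

for cutoff in (0.8, 0.5, 0.0):
    print(cutoff, taxonomy_bin(example_lineage, cutoff))
```

With a cutoff of 0, every sequence is forced into a genus bin, which matches the behaviour described for the 0% threshold.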

Similarity measures of the 211 samples (bacterial community assemblages) were
calculated pair-wise as Chao's corrected Sørensen index (a quantitative measure) (Chao et
al., 2006) and the Jaccard index (a presence/absence measure) (Jaccard, 1901) using
EstimateS (http://viceroy.eeb.uconn.edu/EstimateS). The site-by-site distance matrices
(1 - Chao's corrected Sørensen index and 1 - Jaccard index) derived from the
species-sites matrices of OTUs and of taxonomy-bins were compared by the Mantel test
(Mantel, 1967) based on Spearman's rank correlation rho. Principal Coordinate Analysis
(PCoA) based on site rank (ranks of bacterial assemblages) was visualized in the two
dimensions that represent the greatest variability. The shape of points (assemblages) in
the PCoA plots was compared by Procrustes analysis, a statistical shape analysis, using
all 211 points in 210 Principal Coordinate (PC) dimensions.
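As a rough illustration of the comparison described above, the following standard-library sketch computes a 1 - Jaccard distance and a permutation Mantel test on Spearman ranks. It is not the EstimateS implementation used in this study, it does not implement Chao's corrected Sørensen index, and the function names, the two small distance matrices, and the permutation count are assumptions made for the example.

```python
import random


def jaccard_distance(a, b):
    """1 - Jaccard similarity between two abundance columns (occurrence only)."""
    present_a = {i for i, n in enumerate(a) if n > 0}
    present_b = {i for i, n in enumerate(b) if n > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1.0 - len(present_a & present_b) / len(union)


def ranks(values):
    """Ranks starting at 1; tied values share their mean rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def lower_triangle(d, order=None):
    """Flatten the lower triangle, optionally after relabelling the sites."""
    n = len(d)
    order = order or list(range(n))
    return [d[order[i]][order[j]] for i in range(1, n) for j in range(i)]


def mantel_spearman(d1, d2, permutations=999, seed=0):
    """Mantel r on ranked distances, with a one-sided permutation P value.

    Rows and columns of the first matrix are permuted together.
    """
    y = ranks(lower_triangle(d2))
    observed = pearson(ranks(lower_triangle(d1)), y)
    rng = random.Random(seed)
    order = list(range(len(d1)))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        if pearson(ranks(lower_triangle(d1, order)), y) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical site-by-site distance matrices for four sites.
otu_distances = [[0, 1, 2, 3], [1, 0, 4, 5], [2, 4, 0, 6], [3, 5, 6, 0]]
bin_distances = [[0, 2, 1, 3], [2, 0, 4, 6], [1, 4, 0, 5], [3, 6, 5, 0]]
r, p = mantel_spearman(otu_distances, bin_distances, permutations=199)
print(f"Mantel r = {r:.3f}, P = {p:.3f}")
```

Ranking the flattened distances before correlating them is what makes the statistic Spearman-based rather than Pearson-based.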

Three different sets of full-length (>1200 bp) 16S rRNA gene sequence collections
were used: the RDP-II classifier's training set, human gut (Dethlefsen et al., 2008), and
soil (Elshahed et al., 2008). The sequences were aligned and cut into the V3, V4, and V6
hypervariable regions based on the reference positions of the Escherichia coli 16S rRNA
gene. Classifications of the full-length sequences by the RDP-II classifier were compared
with classifications of the V3, V4, and V6 hypervariable regions.
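Cutting a region out of aligned sequences by reference position can be sketched as follows. The sketch assumes that every sequence shares one alignment with a gapped reference row and that gaps are written as `-` or `.`; the function names and the short toy sequences are hypothetical, and the start and end coordinates are placeholders rather than the E. coli positions used in the study.

```python
def region_columns(aligned_reference, start, end):
    """Map 1-based, inclusive reference positions to alignment columns.

    Returns (first_column, last_column_exclusive) for slicing aligned rows.
    """
    position = 0
    first = last = None
    for column, base in enumerate(aligned_reference):
        if base in "-.":
            continue  # gap columns do not advance the reference coordinate
        position += 1
        if position == start:
            first = column
        if position == end:
            last = column + 1
            break
    if first is None or last is None:
        raise ValueError("reference does not cover the requested region")
    return first, last


def cut_region(aligned_query, first, last):
    """Slice an aligned query and drop its gap characters."""
    return aligned_query[first:last].replace("-", "").replace(".", "")


# Hypothetical toy alignment: the reference row carries the coordinate system.
reference = "AC-GT--ACGT"
query = "ACTG-TTAC-T"
first, last = region_columns(reference, start=3, end=6)
fragment = cut_region(query, first, last)
```

Because the slice is taken in alignment columns, insertions in the query relative to the reference are kept inside the fragment.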

RESULTS

Allocation of 1.3M sequences to taxonomy-bins or 97% OTUs. Each rRNA
query sequence was assigned to a set of bins, 1400 genera and 492 artificial
'unclassified' taxa, using a naive Bayesian rRNA classifier (RDP-II classifier version
10). When the Classifier cutoffs were set at the 80%, 50%, and 0% thresholds (the latter
forced all sequences into genus bins), 48%, 64%, and 100% of the sequences were
classified to the genus level (Figure 1), and the total number of taxonomy-bins (genera
and 'unclassified' taxa) covering the 1.3M sequences was 903, 1170, and 1259 bins at 80%,
50%, and 0%, respectively. The mean value of the maximum distance among the sequences
within each bin increased as the Classifier threshold was lowered. For taxonomy
non-supervised OTUs, all sequences were clustered into 112,233 OTUs at 97% 16S rRNA
sequence identity.

[Figure 5.1: percentage of sequences classified by the RDP classifier at each taxonomic
rank (domain, phylum, class, order, family, genus). The rotated caption and the plot are
not recoverable from the scan.]

A total of 22,154 pair-wise similarity index (Chao's corrected Sørensen similarity
index or Jaccard similarity index) calculations among the 211 bacterial assemblages were
performed with both the taxonomy non-supervised OTUs-sites and the taxonomy-bins-sites
matrices. We used the Mantel matrix correlation test to compare the two site-site
distance (1 - similarity) matrices (Table 1). The site-site matrix from taxonomy
non-supervised OTUs was significantly and highly correlated with the three site-site
matrices from taxonomy-bins (Table 1 and Figure 2). All ordinations of principal
coordinate analysis (PCoA) from the OTU-based dissimilarity matrix and the
taxonomy-bins-based dissimilarity matrices were also highly correlated with each other
when all ordinations (k=210) of the PCoA plots were compared by Procrustes rotation
(Table 1).

DISCUSSION

The major advantage of the taxonomy-supervised method is the possibility of
comparing any regions of the 16S rRNA gene without alignment and clustering,
in contrast to the non-taxonomy-supervised OTU method. Depending on the 16S rRNA
sequence length and the resolution of the bacterial taxonomy classification, the
taxonomy-based method can also compare bacterial assemblages of 16S rRNA
sequences spanning other hypervariable regions or bacterial assemblages with previously
deposited sequences. For example, the RDP classifier-II returns classification results
similar to those of full-length queries at the genus level, regardless of the
hypervariable region (Table 2). Therefore one can obtain compatible data regardless of
the sequenced region. However, the coverage of the eubacterial primers used must be
considered, because different primer sets preferentially cover, or fail to cover,
certain groups of bacteria, which can produce conflicting community compositions.

97% OTU-based          Comparison                Taxonomy-bins  Taxonomy-bins  Taxonomy-bins
similarity index                                 at 80%         at 50%         at 0%
1 - Chao's corrected   Mantel test r statistic a 0.7763*        0.8008*        0.8146*
Sørensen index         Procrustes analysis (r)   0.9396*        0.9406*        0.9404*
1 - Jaccard index      Mantel test r statistic a 0.7856*        0.8595*        0.7856*
                       Procrustes analysis (r)   0.6853*        0.7007*        0.6853*

Table 5.1. Similarity index measures and morphology of points in principal coordinate
analysis (PCoA). Mantel statistic based on Spearman's rank correlation rho, and
Procrustes rotation.

a. The significance of the statistic is evaluated by permuting rows and columns of the
first dissimilarity matrix. * P value < 0.001

[Figure 5.2 plot. x-axis: rank of distance from taxonomy-bins at 0% classifier threshold
(0 to 20000); y-axis: rank of distance from 97% OTUs (0 to 20000).]

Figure 5.2. Rank comparison of distances (1 - Chao's corrected Sørensen similarity)
calculated using non-taxonomy-supervised 97% OTUs and taxonomy-bins at 0% RDP
classifier threshold.

[Table 5.2: classification results for sequences spanning the V3, V4, and V6
hypervariable regions. The rotated caption and the table body are not recoverable from
the scan.]

Another advantage of the taxonomy-based method is that, due to the fixed number
of taxonomy-bins, it is simple to add and delete bacterial assemblages from a
pre-formulated bacterial assemblage comparison. With taxonomy non-supervised OTUs, the
addition and deletion of bacterial assemblages affects the species-sites matrices because
the number and composition of sequences within OTUs are changed by the re-alignment and
re-clustering that follow the addition and deletion of sequences. In addition,
taxonomy-bin allocation is computationally faster than taxonomy non-supervised OTU
clustering, which requires significantly longer processing times as sequences are added
(complete-linkage clustering over all pair-wise distances requires memory that grows with
the square of the number of sequences).
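A back-of-the-envelope calculation shows why pair-wise distances become the bottleneck. The sketch below assumes that a clustering step keeps the full lower triangle of the distance matrix in memory at 4 bytes per distance; that storage layout is an assumption made for illustration, not a description of the RDP pipeline, and the function names are hypothetical.

```python
def pairwise_distances(n_sequences):
    """Number of distinct pairs among n sequences: n(n - 1)/2."""
    return n_sequences * (n_sequences - 1) // 2


def matrix_gigabytes(n_sequences, bytes_per_distance=4):
    """Memory for a lower-triangle distance matrix, in gigabytes (10^9 bytes)."""
    return pairwise_distances(n_sequences) * bytes_per_distance / 1e9


for n in (10_000, 100_000, 1_300_000):
    print(f"{n:>9,} sequences: {pairwise_distances(n):>18,} distances, "
          f"{matrix_gigabytes(n):,.1f} GB")
```

Under these assumptions a tenfold increase in sequences costs roughly a hundredfold more distances, whereas allocating each sequence to a fixed set of bins scales with the number of sequences alone.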

We focused on defining the differences between using taxonomy non-supervised
OTUs and taxonomy-bins when comparing bacterial assemblages. Both the distance-based
matrices and the morphology of points in PCoA ordinations confirmed that the two
methods are significantly correlated, such that the conclusions would be comparable.
However, the resolution in comparing the bacterial assemblages is more limited with the
taxonomy method due to the coarser average distance among taxa. The mean distance
among the sequences inside the taxonomy-bins was 5.6%, 7.4%, and 14.6% at the 80%, 50%,
and 0% thresholds, respectively. For example, there was a decreased resolution of priori
G01 (basically soils) in taxonomy-bin based PCoA plots as compared to taxonomy
non-supervised OTUs. This is due to the more limited number of taxonomy-bins in the
phyla Acidobacteria (26 genera and 4 'unclassified' taxa), Verrucomicrobia (10 genera
and 8 'unclassified' taxa), and Gemmatimonadetes (2 genera and 5 'unclassified' taxa).
These bins hold a relatively large number of sequences in priori G01 relative to the low
number of isolated bacteria or described clusters. As such, their taxonomy is currently
incomplete. In contrast, the assemblages in priori G04 (animal feces) were mostly
composed of well-characterized groups and exhibited better separation from other groups
with the taxonomy-bin method than with the taxonomy non-supervised OTU method.
When better classification of the bacterial taxonomy is available for these phyla and the
'unclassified' taxa, the bacterial assemblage comparison should exhibit a higher
resolution and more accurately reflect microbial community composition.

[Figure 5.3A plots. Panels: 97% OTU; taxonomy-bins at 0%; taxonomy-bins at 50%;
taxonomy-bins at 80%. Axes: PC1 and PC2.]

Figure 5.3A. PCoA plot comparisons by abundance-based distances

[Figure 5.3B plots. Panels: 97% OTU; taxonomy-bins at 0%; taxonomy-bins at 50%;
taxonomy-bins at 80%. Axes: PC1 and PC2.]

Figure 5.3B. PCoA plot comparisons by occurrence-based distances

Revolutionary sequencing technologies continue to emerge, generating
tremendous numbers of 16S rRNA gene sequences. However, current clustering tools are
limited in both their flexibility and computational requirements. The taxonomy-based
method has the potential to overcome these limitations as a fast and simple bacterial
assemblage comparison method. Its value could be further improved if microbiologists
advance the taxonomy of the poorly characterized groups.

REFERENCES

Chao A, Chazdon RL, Colwell RK, Shen TJ (2006) Abundance-based similarity indices
and their estimation when there are unseen species in samples. Biometrics 62:361-371

Colwell RR (1970) Polyphasic taxonomy of the genus Vibrio: numerical taxonomy of
Vibrio cholerae, Vibrio parahaemolyticus, and related Vibrio species. J Bacteriol
104:410-433

DeSantis TZ, Hugenholtz P, Larsen N, Rojas M, Brodie EL, Keller K, Huber T, Dalevi
D, Hu P, Andersen GL (2006) Greengenes, a chimera-checked 16S rRNA gene
database and workbench compatible with ARB. Appl Environ Microbiol 72:5069-5072

Dethlefsen L, Huse S, Sogin ML, Relman DA (2008) The pervasive effects of an
antibiotic on the human gut microbiota, as revealed by deep 16S rRNA
sequencing. PLoS Biol 6:e280

Elshahed MS, Youssef NH, Spain AM, Sheik C, Najar FZ, Sukharnikov LO, Roe BA,
Davis JP, Schloss PD, Bailey VL, Krumholz LR (2008) Novelty and uniqueness
patterns of rare members of the soil biosphere. Appl Environ Microbiol 74:5422-5428

Hamady M, Knight R (2009) Microbial community profiling for human microbiome
projects: Tools, techniques, and challenges. Genome Res 19:1141-1152

Hirschman L, Clark C, Cohen KB, Mardis S, Luciano J, Kottmann R, Cole J, Markowitz
V, Kyrpides N, Morrison N, Schriml LM, Field D, Novo Project (2008) Habitat-Lite:
a GSC case study based on free text terms for environmental metadata.
OMICS 12:129-136

Huber JA, Mark Welch DB, Morrison HG, Huse SM, Neal PR, Butterﬁeld DA, Sogin
ML (2007) Microbial population structures in the deep marine biosphere. Science
318:97-100

Jaccard P (1901) Etude comparative de la distribution ﬂorale dans une portion des Alpes
et des Jura. Bulletin del la Société Vaudoise des Sciences Naturelles 37:547—579

Mantel N (1967) The detection of disease clustering and a generalized regression
approach. Cancer Res 27:209-220

Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J,
Braverman MS, Chen YJ, Chen Z, Dewell SB, Du L, Fierro JM, Gomes XV,
Godwin BC, He W, Helgesen S, Ho CH, Ho CH, Irzyk GP, Jando SC, Alenquer ML,
Jarvie TP, Jirage KB, Kim JB, Knight JR, Lanza JR, Leamon JH, Lefkowitz
SM, Lei M, Li J, Lohman KL, Lu H, Makhijani VB, McDade KE, McKenna MP,
Myers EW, Nickerson E, Nobile JR, Plant R, Puc BP, Ronan MT, Roth GT,
Sarkis GJ, Simons JF, Simpson JW, Srinivasan M, Tartaro KR, Tomasz A, Vogt
KA, Volkmer GA, Wang SH, Wang Y, Weiner MP, Yu P, Begley RF, Rothberg
JM (2005) Genome sequencing in microfabricated high-density picolitre reactors.
Nature 437:376-380

Nawrocki EP, Kolbe DL, Eddy SR (2009) Infernal 1.0: inference of RNA alignments.
Bioinformatics 25:1335-1337

Roesch LF, Fulthorpe RR, Riva A, Casella G, Hadwin AK, Kent AD, Daroub SH,
Camargo FA, Farmerie WG, Triplett EW (2007) Pyrosequencing enumerates and
contrasts soil microbial diversity. ISME J 1:283-290

Pruesse E, Quast C, Knittel K, Fuchs BM, Ludwig W, Peplies J, Glöckner FO (2007)
SILVA: a comprehensive online resource for quality checked and aligned
ribosomal RNA sequence data compatible with ARB. Nucleic Acids Res 35:7188-
7196

Sogin ML, Morrison HG, Huber JA, Mark Welch D, Huse SM, Neal PR, Arrieta JM,
Herndl GJ (2006) Microbial diversity in the deep sea and the underexplored "rare
biosphere". Proc Natl Acad Sci U S A 103:12115-12120

Tringe SG, Hugenholtz P (2008) A renaissance for the pioneering 16S rRNA gene. Curr
Opin Microbiol 11:442-446

Wang Q, Garrity GM, Tiedje JM, Cole JR (2007) Naive Bayesian classifier for rapid
assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ
Microbiol 73:5261-5267


Appendix A.

C2 C2 C2
0d 4d 14d

Pi Pi Rr Rr Rr Rr
0d 14d 0d 3d 14d 14ds

 

 

no rank Root 100 100 100 100 100 100 100 100 100
domain Bacteria 99.9 100 100 100 100 99.9 99.9 99.9 100
UC) Bacteria 22.3 12.5 13.0 11.9 4.66 25.0 15.7 17.2 2.30
P) Actinobacteria 7.25 26.2 18.4 44.4 5.71 1.87 8.50 8.75 0.15
C) Actinobacteria 7.25 26.2 18.4 44.4 5.71 1.87 8.50 8.75 0.15
UC) Actinobacteria 0.98 1.96 1.66 2.48 0.10 0.20 2.09 2.31 0.00
SC) Actinobacteridae 3.26 19.4 12.6 21.1 4.91 1.18 4.51 3.72 0.11
O) Biﬁdobacteriales 0.01 0.94

F) Biﬁdobacteriaceae 0.01 0.94

G) Biﬁdobacterium 0.01 0.94

O) Actinomycetales 3.22 19.4 12.6 21.1 4.91 0.23 4.44 3.69 0.11
UC) Actinomycetales 1.21 1.10 0.92 1.58 0.10 0.06 0.48 0.49 0.02
SO) Streptosporangineae 0.02 0.18 0.08 0.03 0.04 0.08

F) Streptosporangiaceae 0.14 0.03 0.00

G) Streptosporangium 0.11 0.00

SO) Micrococcineae 0.55 2.13 1.95 2.76 0.54 0.06 2.64 1.87 0.02
UC) Micrococcineae 0.16 0.17 0.15 0.08 0.10 0.10

F) Cellulomonadaceae 0.16 0.07 0.08 0.10 0.16 0.04 0.05

G) Cellulomonas 0.16 0.07 0.08 0.10 0.16 0.04 0.05

F) Promicromonosporaceae 0.03 0.68 0.26 0.06 0.03 0.00

G) Promicromonospora 0.64 0.24 0.04

F) Microbacteriaceae 0.07 0.42 0.53 0.17 0.13 0.02 0.02 0.03 0.01
UC) Microbacteriaceae 0.04 0.20 0.19 0.08 0.10 0.01 0.00 0.03

G) Agromyces 0.01 0.17 0.25 0.04 0.03 0.01 0.01
F) Intrasporangiaceae 0.11 0.29 0.33 0.35 0.10 0.02 0.04

UC) Intrasporangiaceae 0.03 0.02 0.16 0.12 0.03 0.00 0.01

G) Janibacter 0.09 0.27 0.11 0.17 0.01

F) Micrococcaceae 0.03 0.49 0.60 2.01 0.13 0.04 2.45 1.64 0.01
UC) Micrococcaceae 0.29 0.17

G) Renibacterium 0.03 0.02 0.47 0.31

G) Arthrobacter 0.03 0.49 0.60 1.98 0.13 0.02 1.66 1.16 0.01
SO) Frankineae 0.17 0.15 0.21 0.76 1.95 0.01 0.05 0.12 0.03
F) Kineosporiaceae 0.04 0.07 0.01 0.31 0.13 0.02 0.01

G) Kineosporia 0.04 0.07 0.01 0.25 0.13 0.01 0.00

F) Nakamurellaceae 0.03 0.06 0.17

G) Nakamurella 0.03 0.06 0.17

F) Geodermatophilaceae 0.06 0.07 0.11 0.20 1.82 0.01 0.08 0.02
G) Blastococcus 0.03 0.07 0.06 0.17 1.79 0.05 0.02
SO) Pseudonocardineae 0.08 4.55 4.46 1.87 0.10 0.03 0.09 0.06 0.00
F) Actinosynnemataceae 0.02 0.17 0.05 0.56 0.06 0.01 0.00 0.01 0.00
G) Actinosynnema 0.18 0.03

G) Lentzea 0.17 0.04 0.35 0.03

F) Pseudonocardiaceae 0.06 4.30 4.38 1.28 0.03 0.07 0.05

UC) Pseudonocardiaceae 0.01 0.05 0.37 0.07 0.01

G) Kutzneria 0.01 0.02 0.09 0.10

G) Saccharopolyspora 0.04 0.50 0.02

G) Pseudonocardia 0.04 4.23 3.83 0.57 0.03 0.04 0.01

SO) Propionibacterineae 0.77 8.58 2.92 6.20 1.56 0.05 0.39 0.47 0.00
F) Nocardioidaceae 0.76 8.54 2.92 6.20 1.56 0.05 0.38 0.45 0.00

 

 


 

Appendix A cont’d
SO) Propionibacterineae
F) Nocardioidaceae
UC) Nocardioidaceae
G) Nocardioides
G) Kribbella
G) Aeromicrobium
SO) Micromonosporineae
F) Micromonosporaceae
UC) Micromonosporaceae
G) Micromonospora
G) Actinoplanes
SO) Streptomycineae
F) Streptomycetaceae
G) Streptomyces
SO) Glycomycineae
F) Glycomycetaceae
G) Stackebrandtia
G) Glycomyces
SO) Corynebacterineae
F) Nocardiaceae
G) Rhodococcus
F) Mycobacteriaceae
G) Mycobacterium
SC) Rubrobacteridae
O) Rubrobacterales
SO) Rubrobacterineae
UC) Rubrobacterineae
F) Rubrobacteraceae
UC) Rubrobacteraceae
G) Solirubrobacter
G) Conexibacter
G) Thermoleophilum
G) Rubrobacter
SC) Acidimicrobidae
O) Acidimicrobiales
SO) Acidimicrobineae
F) Acidimicrobiaceae
G) Acidimicrobium
P) Bacteroidetes
UC) Bacteroidetes
C) Flavobacteria
O) Flavobacteriales
UC) Flavobacteriales
F) Flavobacteriaceae
G) Flavobacterium
G) Lutibacter
F) Cryomorphaceae
UC) Cryomorphaceae
G) Brumimicrobium
G) Crocinitomix
G) Fluviicola
C) Sphingobacteria

0.77
0.76
0.15
0.55
0.01
0.04
0.04
0.04
0.01

0.02
0.06
0.06
0.03

0.33
0.04
0.02
0.25
0.25
2.97
2.97
2.97
0.37
2.60
1.32
0.57
0.61
0.08
0.02
0.03
0.03
0.03
0.03
0.03
4.99
0.18
2.40
2.40
0.03
2.25
2.25

0.13
0.04

0.09
2.38

8.58
8.54
0.22
6.53
1.39
0.34
1.17
1.17
0.32
0.34
0.32
0.78
0.78
0.68
0.32
0.32
0.05
0.27
0.68
0.15
0.10
0.39
0.39
4.72
4.72
4.72
0.44
4.21
1.66
1.12
1.08
0.34

0.05
0.05
0.05
0.05
0.05
0.95
0.07
0.05
0.05
0.02
0.02
0.02

0.83

2.92
2.92
0.11
1.95
0.65
0.17
0.48
0.48
0.06
0.15
0.12
0.91
0.91
0.83
0.18
0.18
0.17
0.01
0.35
0.11
0.10
0.21
0.21
4.03
4.03
4.03
0.30
3.67
1.65
0.85
0.95
0.23

0.12
0.12
0.12
0.12
0.12
0.96
0.03
0.07
0.07
0.03
0.01
0.01

0.03
0.02

0.01
0.85

 


6.20
6.20
0.77
3.30
0.89
1.14
0.83
0.83
0.34
0.32
0.08
6.08
6.08
5.95

0.96
0.45
0.42
0.49
0.49
20.5
20.5
20.5
0.57
19.9
7.11
4.68
5.14
2.97
0.08
0.23
0.23
0.23
0.23
0.23
0.44

0.01
0.01

0.01
0.01

0.43

1.56
1.56

1.31
0.10
0.16
0.03
0.03

0.35
0.35
0.29

0.26
0.13
0.06
0.06
0.06
0.70
0.70
0.70
0.06
0.64
0.13
0.26
0.06

0.19

0.29

 

0.05
0.05
0.03
0.02

0.02
0.02
0.02

0.01

0.01
0.01
0.47
0.47
0.47
0.09
0.38
0.14
0.02
0.17
0.03
0.01
0.01
0.01
0.01
0.01
0.01
9.17
2.06
5.42
5.42
2.52
0.58
0.28
0.29
2.32
1.31
0.61
0.27
0.13
1.22

0.39
0.38
0.10
0.24
0.01

0.20
0.20
0.13
0.00

0.18
0.18
0.11

0.36
0.02
0.00
0.30
0.30
1.88
1.88
1.88
0.44
1.38
0.86
0.16
0.20
0.09
0.07
0.01
0.01
0.01
0.01
0.01
0.29
0.01
0.00
0.00

0.00

0.47
0.45
0.12
0.27
0.04
0.00
0.14
0.14
0.10
0.01
0.02
0.19
0.19
0.16

0.26
0.00

0.21
0.21
2.70
2.70
2.70
0.58
2.10
1.24
0.23
0.38
0.24
0.00
0.01
0.01
0.01
0.01
0.01
0.44
0.03
0.00
0.00

0.00

0.00

0.40

0.00
0.00

0.00

0.02
0.00
0.00

0.03
0.03
0.03

0.03

0.00
0.02

6.53
0.25
1.05
1.05
0.83
0.13
0.10
0.02
0.10
0.01
0.01
0.07
0.01
5.23

 


Appendix A cont’d
O) Sphingobacteriales
UC) Sphingobacteriales
F) Crenotrichaceae
UC) Crenotrichaceae
G) Terrimonas
G) Chitinophaga
F) Sphingobacteriaceae
G) Pedobacter
F) Saprospiraceae
G) Lewinella
F) Flexibacteraceae
UC) Flexibacteraceae
G) Niastella
C) Bacteroidetes
O) Bacteroidales
UC) Bacteroidales
F) Porphyromonadaceae
G) Paludibacter
P) Nitrospira
C) Nitrospira
O) Nitrospirales
F) Nitrospiraceae
UC) Nitrospiraceae
G) Nitrospira
G) Magnetobacterium
P) Acidobacteria
C) Acidobacteria
O) Acidobacteriales
F) Acidobacteriaceae
UC) Acidobacteriaceae
G) Gp4
G) Gp22
G) Gp16
G) Gp10
G) Gp5
G) Gp18
G) Gp6
G) Gp23
G) Gp11
G) Gp3
G) Gp1
G) Gp2
G) Gp25
G) Gp17
G) Gp7

P) Proteobacteria
UC) Proteobacteria
C) Epsilonproteobacteria
O) Campylobacterales
F) Campylobacteraceae
G) Campylobacter

2.38
0.18
1.52
0.31
1.01
0.19

0.11
0.05
0.58
0.09
0.46
0.03
0.03
0.03

0.07
0.07
0.07
0.07

0.07

14.8
14.8
14.8
14.8
0.32
2.66
0.59
0.46
0.41
0.32
0.02
7.88

0.34
0.43
0.02
0.04
0.12
0.92
0.22
24.8

5.41
0.03
0.03
0.03
0.03

0.83

0.49
0.10
0.32
0.07
0.02
0.02
0.02

0.29
0.02
0.22

0.15
0.15
0.15
0.15

0.15

17.5
17.5
17.5
17.5
0.05
2.76

2.13
0.05
0.42

9.98

0.12
0.24
0.15
0.02

0.93
0.61
29.1

3.77

0.85
0.05
0.59
0.04
0.43
0.11

0.21
0.03
0.18

12.5
12.5
12.5
12.5

2.33
0.06
1.51
0.02
0.24
0.01
6.57

0.12
0.31
0.04

0.03
0.81
0.45
41.8

5.61

 

0.43
0.01
0.30
0.01
0.23
0.07

0.06
0.04
0.07
0.01
0.06

0.06
0.06
0.06
0.06

0.06

10.1
10.1
10.1
10.1
0.06
3.38
0.02
2.18
0.05
0.12
0.03
3.15

0.04
0.34
0.16
0.01
0.07
0.20
0.35
20.7

1.88
0.18
0.18
0.18
0.18


0.29

0.10

0.03
0.06
0.19
0.19

5.36
5.36
5.36
5.36
0.06
1.24

0.22

0.03

2.42

0.64
0.54

0.10
0.10
78.5

0.32

 

1.22
0.20
0.29

0.20
0.01
0.01
0.01
0.25
0.19
0.47
0.24
0.16
0.47
0.47
0.37
0.10
0.10
0.32
0.32
0.32
0.32
0.13
0.01
0.18
8.06
8.06
8.06
8.06
0.06
0.85
0.02
0.58
0.01
0.08
0.59
4.48
0.65
0.01
0.17
0.05
0.03
0.07
0.18
0.14
35.8

5.76
0.25
0.25
0.24
0.23

0.26
0.03
0.17

0.12
0.01

0.06
0.01
0.05
0.01
0.01

0.07
0.07
0.07
0.07

0.07

26.7
26.7
26.7
26.7
0.39
7.46
0.07
0.61
0.08
1.35
0.08
11.4
0.01
0.08
0.56
1.69
0.11
1.35
0.26
1.12
22.4

2.35
0.07
0.07

0.40
0.03
0.31
0.02
0.24
0.04

0.07

0.06

0.10
0.10
0.10
0.10

0.10

26.5
26.5
26.5
26.5
0.36

0.05
0.64
0.02
0.87
0.06
11.2
0.01
0.06
0.79
2.44
0.10
1.48
0.21
0.94
25.8

2.21
0.00
0.00

5.23
0.15
4.79
0.05
0.05
4.69

0.01
0.01
0.28
0.01
0.26
0.01
0.01

0.00
0.00

1.59
1.59
1.59
1.59
0.02
0.09
0.08
0.13
0.05

0.18
0.81
0.08

0.02

0.03
0.08
83.9

0.73

 

Appendix A cont’d
C) Deltaproteobacteria
UC) Deltaproteobacteria
O) Syntrophobacterales
F) Syntrophaceae
G) Smithella
F) Syntrophobacteraceae
UC) Syntrophobacteraceae
G) Syntrophobacter
O) Desulfuromonales
F) Geobacteraceae
G) Geobacter
O) Desulfobacterales
F) Desulfobacteraceae
UC) Desulfobacteraceae
G) Desulfobacterium
G) Desulfonema
F) Desulfobulbaceae
G) Desulfobulbus
G) Desulfocapsa
O) Desulfovibrionales
F) Desulfovibrionaceae
G) Desulfovibrio
O) Myxococcales
UC) Myxococcales
SO) Cystobacterineae
UC) Cystobacterineae
F) Cystobacteraceae
G) Anaeromyxobacter
F) Myxococcaceae
SO) Nannocystineae
F) Nannocystaceae
UC) Nannocystaceae
F) Haliangiaceae
G) Haliangium
SO) Sorangineae
F) Polyangiaceae
UC) Polyangiaceae
G) Byssovorax
O) Bdellovibrionales
F) Bdellovibrionaceae
G) Bdellovibrio
C) Alphaproteobacteria
UC) Alphaproteobacteria
O) Caulobacterales
F) Caulobacteraceae
G) Caulobacter
G) Phenylobacterium
G) Brevundimonas
O) Sphingomonadales
F) Sphingomonadaceae
UC) Sphingomonadaceae

3.51
1.90
0.22
0.07

0.15
0.11

0.19
0.13
0.11
0.02

0.02
0.02

0.96
0.55
0.05

0.04
0.03
0.01
0.04
0.03

0.01
0.01
0.32
0.32
0.17
0.08
0.21
0.16
0.16
11.0
0.42
0.87
0.87
0.18

0.55

0.11
0.74
0.74
0.25

2.64
1.12
0.34

0.34
0.29

1.05
0.42
0.27
0.10
0.07
0.02
0.10
0.20
0.10

0.05
0.05
0.17
0.17
0.02
0.12
0.12
0.07
0.07
16.7
0.66
1.20
1.20
0.29

0.71

0.20
2.64
2.64
0.61

3.85
1.36
0.09
0.02

0.06
0.05

0.02
0.02
0.02

2.32
1.15
0.31
0.15
0.10
0.07
0.06
0.36
0.17
0.12
0.19
0.19
0.49
0.49
0.26
0.10
0.06
0.02
0.02
20.6
0.39
0.74
0.74
0.06

0.61

0.06
5.39
5.39
0.96

 


1.00
0.45
0.01

0.01
0.01

0.17
0.17
0.17
0.01

0.01

0.33
0.19
0.03

0.01
0.01
0.02

0.11
0.11
0.08
0.03
0.03
0.03
0.03
14.3
0.26
0.21
0.21
0.01

0.18

0.92
0.92
0.12

0.29

0.22
0.03
0.10

0.10

0.10
0.10
0.10

0.06

53.4
1.21
18.4
18.4
1.79
16.6

0.03
13.8
13.8
11.3

 

4.93
1.82
0.98
0.49
0.37
0.48
0.22
0.18
0.28
0.16
0.13
1.21
0.88
0.37
0.39
0.11
0.32
0.14
0.11
0.19
0.17
0.17
0.43
0.18
0.16
0.02
0.11
0.10
0.02

0.09
0.09
0.02
0.04
0.04
0.04
0.04
3.43
0.21
0.18
0.18
0.03

0.15

0.64
0.64
0.15

1.81
0.95
0.06
0.02
0.01
0.04
0.02

0.11
0.09
0.01
0.08
0.07
0.01
0.00
0.05
0.01

0.03
0.03
0.03
0.54
0.16
0.24
0.09
0.13
0.04
0.02
0.02

0.01
0.01
0.12
0.12
0.03
0.07
0.04
0.04
0.04
10.3
0.29
0.33
0.33
0.00

0.29

1.80
1.80
0.45

1.88
0.98
0.07
0.03
0.03
0.03
0.02

0.11
0.09

0.03
0.02
0.00
0.01
0.00
0.00
0.00

0.67
0.19
0.28
0.06
0.14
0.05
0.08
0.03
0.02

0.00
0.00
0.17
0.17
0.09
0.04
0.02
0.02
0.02
11.7
0.52
0.35
0.35
0.00

0.26

0.05
2.42
2.42
0.82

1.93
0.48
0.32
0.17
0.13
0.13
0.07
0.01
0.03
0.03
0.03
0.71
0.62
0.12
0.14
0.35
0.09
0.07
0.01
0.00
0.00
0.00
0.23
0.00
0.23

0.04
0.03
0.19

0.14
0.05
0.05
4.90
0.04
2.49
2.49
0.45

0.19

1.85
1.02
1.02
0.05

 

Appendix A cont’d
G) Novosphingobium
G) Sphingosinicella
G) Sphingomonas
O) Rhodobacterales
F) Rhodobacteraceae
UC) Rhodobacteraceae
G) Amaricoccus
G) Rhodobacter
O) Rhodospirillales
UC) Rhodospirillales
F) Acetobacteraceae
UC) Acetobacteraceae
G) Belnapia
G) Roseomonas
G) Stella
G) Rhodopila
F) Rhodospirillaceae
UC) Rhodospirillaceae
G) Skermanella
G) Inquilinus
G) Azospirillum
O) Rhizobiales
UC) Rhizobiales
F) Phyllobacteriaceae
G) Mesorhizobium
G) Phyllobacterium
F) Rhizobiaceae
G) Ensifer
G) Rhizobium
F) Bradyrhizobiaceae
UC) Bradyrhizobiaceae
G) Bosea
G) Afipia
G) Rhodopseudomonas
G) Nitrobacter
G) Bradyrhizobium
F) Hyphomicrobiaceae
UC) Hyphomicrobiaceae
G) Rhodoplanes
G) Pedomicrobium
G) Hyphomicrobium
G) Devosia
G) Blastochloris
F) Beijerinckiaceae
G) Chelatococcus
F) Methylocystaceae
G) Methylopila
F) Methylobacteriaceae
UC) Methylobacteriaceae
G) Methylobacterium
G) Microvirga

0.08
0.09
0.24
0.31
0.31
0.18
0.11
0.01
1.07
0.17
0.71
0.49

0.06
0.12
0.03
0.19
0.11
0.08
0.01

7.67
1.04
0.26
0.24

0.18
0.04
0.12
4.34
0.91
0.18
0.10
0.43
0.14
2.56
1.06
0.35
0.11
0.10
0.25
0.04
0.13

0.14
0.05

0.61
0.10
0.22
0.29

0.12
0.32
1.47
0.24
0.24
0.02
0.22

1.98
0.29
1.35
0.95
0.05
0.02
0.15
0.17
0.34
0.05
0.24
0.02
0.02
9.98
0.98
0.44
0.22
0.20
0.34
0.02
0.32
5.33
1.20
0.42
0.17
0.49
0.20
2.81
2.08
0.64

0.17
0.71
0.49
0.07

0.10
0.02

0.66
0.02
0.32
0.32

0.13
0.71
3.36
0.39
0.39
0.09
0.27
0.02
2.11
0.42
1.45
1.13
0.01

0.23
0.01
0.24
0.15
0.05
0.04

11.6
1.90
0.84
0.52
0.24
0.17
0.03
0.11
4.48
1.65
0.18
0.17
0.30
0.25
1.94
3.57
1.12
0.13
0.80
0.83
0.36
0.18
0.02
0.10
0.02
0.15
0.15
0.41
0.06
0.17
0.17

 

0.08
0.09
0.54
0.55
0.55
0.33
0.15

2.24
0.20
1.44
1.18
0.04
0.01
0.08
0.10
0.60
0.22
0.36
0.03

10.1
1.23
0.26
0.12
0.07
0.24
0.03
0.20
3.44
1.08
0.12
0.03
0.04
0.03
2.14
2.83
0.63
0.08
0.65
0.94
0.08
0.16
0.18
0.25
0.16
0.02
0.02
1.79
0.41
0.28
1.10


0.19
2.23
0.16
0.16
0.06
0.10

12.5
0.67
5.42
4.11
1.08

0.19
0.03
6.47
0.16
0.51
0.89
4.91
7.17
0.73
0.45
0.32
0.13
2.01
0.89
1.08
0.96

0.64

0.03

0.29
0.26
0.06

0.10

0.10

0.06
0.03
0.03
0.03
2.61
0.61
0.89
1.12

 

0.28
0.10
0.04
0.49
0.49
0.14

0.16
0.50
0.07
0.40
0.24

0.13

0.02
0.02

1.41
0.19
0.06
0.02
0.01
0.02

0.01
0.29
0.13
0.01
0.02
0.01
0.02
0.10
0.63
0.24
0.10
0.01
0.13
0.02
0.12

0.06
0.01
0.02

0.06
0.02
0.02
0.01

0.08
0.97
0.22
0.01
0.01
0.01

1.28
0.26
0.81
0.68

0.01
0.07
0.01
0.21
0.18
0.01
0.01
0.01
6.59
1.18
0.07
0.07

0.18
0.00
0.16
1.41
0.59
0.01
0.03

0.05
0.68
2.49
1.26
0.50
0.07
0.05
0.05
0.50
0.04
0.10
0.08

1.09
0.25
0.40
0.44

0.04
1.07
0.43
0.01
0.01
0.01

1.43
0.28
0.91
0.80

0.02
0.06
0.00
0.25
0.21
0.03
0.00
0.00
7.02
1.37
0.08
0.05
0.02
0.11

0.08
1.43
0.81
0.05

0.05
0.51
3.03
1.40
0.72
0.14
0.05
0.12
0.52
0.02
0.12
0.04

0.85
0.20
0.39
0.26

0.02
0.00
0.91
0.03
0.03
0.00

0.02
0.93
0.02
0.33
0.09

0.24

0.58
0.00

0.57
0.39
0.02

0.09

0.09
0.20
0.00
0.18
0.02

0.02
0.00
0.01

0.00

0.03
0.00

0.02
0.00
0.00
0.01

 

Appendix A cont’d
C) Gammaproteobacteria
UC) Gammaproteobacteria
O) Alteromonadales
O) Pseudomonadales
F) Moraxellaceae
G) Acinetobacter
F) Pseudomonadaceae
G) Pseudomonas
G) Cellvibrio
O) Enterobacteriales
F) Enterobacteriaceae
UC) Enterobacteriaceae
G) Klebsiella
G) Shigella
O) Chromatiales
UC) Chromatiales
F) Ectothiorhodospiraceae
UC) Ectothiorhodospiraceae
F) Chromatiaceae
UC) Chromatiaceae
G) Marichromatium
O) Methylococcales
F) Methylococcaceae
UC) Methylococcaceae
G) Methylobacter
O) Xanthomonadales
F) Xanthomonadaceae
UC) Xanthomonadaceae
G) Luteimonas
G) Stenotrophomonas
G) Lysobacter
G) Pseudoxanthomonas
O) Legionellales
F) Legionellaceae
F) Coxiellaceae
G) Rickettsiella
G) Aquicella
O) Oceanospirillales
F) Halomonadaceae
G) Halomonas
C) Betaproteobacteria
UC) Betaproteobacteria
O) Neisseriales
F) Neisseriaceae
UC) Neisseriaceae
G) Formivibrio
O) Nitrosomonadales
F) Nitrosomonadaceae
G) Nitrosomonas
O) Methylophilales
F) Methylophilaceae

3.11
1.56
0.01
0.41
0.02

0.39
0.17
0.22
0.04
0.04
0.01
0.01
0.01
0.18
0.05
0.11

0.10

0.02
0.02

0.52
0.52
0.22
0.20

0.07
0.01
0.34
0.09
0.23
0.11
0.09
0.02

1.70
0.40
0.01
0.01

0.01
0.05
0.05

0.01
0.01

2.96
1.25

0.39
0.22
0.12
0.17
0.17

0.27
0.27

0.27
0.22
0.02
0.20

0.20

0.37
0.37
0.02

0.05
0.24

0.46
0.17
0.29

0.29

3.03
0.10
0.02
0.02
0.02

0.02
0.02

6.75
3.24
0.16
0.72
0.24
0.15
0.48
0.48

1.35
1.35
0.02

1.31
0.40
0.06
0.32

0.32

0.02
0.01

0.63
0.63
0.11
0.05
0.07
0.28
0.01
0.21
0.10
0.10
0.02
0.04
0.03
0.03
0.03
4.97
0.22
0.03
0.03

0.03
0.12
0.12

 

1.74
0.70

0.28
0.02
0.02
0.26
0.25
0.01
0.17
0.17

0.02
0.14
0.20
0.05
0.15

0.15

0.33
0.33
0.10
0.03
0.03
0.11
0.01
0.06
0.01
0.02

0.01
0.01

1.67
0.39

0.07
0.07
0.01


11.2
0.19
0.19
0.99
0.26
0.26
0.73
0.73

3.06
3.06
0.10
0.06
2.81
0.03

0.03
0.03

6.60
6.60

0.06
0.03
3.86
2.65
0.03

0.03

0.03
0.19
0.16
0.16
13.2
0.03
0.03
0.03
0.03

 

11.3
8.12
0.02
0.06
0.02
0.01
0.04
0.04

0.16
0.16
0.01

0.13
1.17
0.19
0.31

0.28

0.67
0.47
0.17
0.86
0.86
0.13
0.65
0.65
0.65
0.34
0.13

0.13

0.29
0.01
0.28
0.20
0.04
0.02

10.1
3.17
0.24
0.24
0.14
0.10
0.17
0.15
0.06
0.24
0.24

3.99
1.77
0.00
0.28
0.05
0.04
0.23
0.19
0.00
0.88
0.88
0.19
0.09
0.55
0.07
0.03
0.04

0.04

0.00
0.00
0.00

0.42
0.42
0.12
0.03
0.15
0.05
0.01
0.50
0.01
0.45
0.01
0.39
0.01

3.94
0.95
0.14
0.14
0.09
0.05
0.00
0.00

4.62
1.59
0.02
0.34
0.02
0.02
0.32
0.29
0.02
1.16
1.16
0.21
0.13
0.77
0.15
0.05
0.05

0.03

0.04
0.03

1.00
1.00
0.19
0.02
0.57
0.06

0.35
0.03
0.32

0.27
0.01
0.01
0.01
5.41
1.51
0.19
0.19
0.13
0.06
0.01
0.01
0.00

50.5
0.70

48.5
0.56
0.55
48.0
47.9

0.11
0.11

0.11
0.01
0.01

0.00
0.00

1.20
1.20
0.06
0.02
0.87
0.23
0.01

0.00

25.8
0.14
0.00
0.00
0.00

0.19
0.19
0.18
0.02
0.02

 

Appendix A cont’d
G) Methylophilus
O) Rhodocyclales
F) Rhodocyclaceae
UC) Rhodocyclaceae
G) Dechloromonas
G) Azoarcus
O) Hydrogenophilales
F) Hydrogenophilaceae
G) Thiobacillus
O) Burkholderiales
UC) Burkholderiales
F) Oxalobacteraceae
UC) Oxalobacteraceae
G) Herbaspirillum
G) Duganella
G) Massilia
G) Herminiimonas
G) Janthinobacterium
G) Naxibacter
F) Comamonadaceae
UC) Comamonadaceae
G) Comamonas
G) Hydrogenophaga
G) Polaromonas
G) Acidovorax
G) Variovorax
G) Rhodoferax
G) Ottowia
G) Ramlibacter
F) Burkholderiaceae
G) Cupriavidus
G) Wautersia
G) Burkholderia
G) Ralstonia
F) Incertae sedis 5
UC) Incertae sedis 5
G) Azohydromonas
G) Aquabacterium
F) Alcaligenaceae
G) Tetrathiobacter
G) Bordetella
G) Achromobacter
P) Chloroflexi
C) Anaerolineae
O) Anaerolinaeles
F) Anaerolinaeceea
G) Anaerolinea
C) Caldilineae
O) Caldilineales
F) Caldilineacea
UC) Caldilineacea
G) Levilinea

0.01
0.23
0.23
0.11
0.03
0.06

1.00
0.25
0.11
0.02

0.03
0.04

0.02
0.47
0.25

0.06
0.13
0.01
0.01
0.01

0.17
0.11
0.01
0.04

0.32
0.28

0.28
0.28
0.28
0.12
0.10

0.02
0.05
0.05

0.05

2.84
0.15
0.83
0.07

0.15
0.15
0.29
0.15
0.02
1.17
0.17
0.02

0.37
0.05
0.49

0.07
0.54

0.02
0.12
0.37
0.07
0.05
0.02

0.07
0.02
0.02
0.02
0.39
0.27

0.27
0.27
0.27
0.05
0.15

0.74
0.74
0.62

0.04

3.84
0.20
0.25
0.06
0.01

0.02
0.11
0.03

2.39
0.59
0.01
0.02
0.22
0.06
1.36

0.11
0.02
0.69
0.05

0.58
0.28
0.22
0.02
0.02
0.03
0.02

0.35
0.27

0.27
0.27
0.27
0.12
0.14


 

0.11
0.11
0.07

0.04

1.10
0.32
0.13
0.05
0.01
0.02
0.02

0.02

0.28
0.08
0.01

0.03
0.03
0.08

0.04
0.03
0.04
0.04

0.31
0.17
0.11

0.02

0.02

0.54
0.27

0.27
0.27
0.27
0.10
0.08

0.03
0.03
0.03

13.1

2.65
0.19
0.51

0.35
0.03
0.54
1.02
1.18
0.22

0.03

0.03

0.86
4.59
0.83
2.71
0.03
0.83
3.48
2.65
0.61
0.22
1.21
0.45

0.67
0.22

 

0.24
1.17
1.17
0.79
0.17
0.05
0.55
0.55
0.55
4.59
2.28
0.08
0.03
0.01
0.01

0.01
0.02

1.43
0.24
0.17
0.02

0.81
0.11

0.06
0.15
0.01

0.02
0.01
0.46
0.23
0.03
0.19
0.18

0.17

5.09
5.04
0.67
0.67
0.67
4.30
4.30
4.30
1.07
1.10

0.12
0.12
0.06
0.02
0.02
0.03
0.03
0.03
2.70
0.53
0.53
0.15
0.02
0.12
0.14
0.05
0.02

0.26
0.12
0.01
0.00
0.00
0.07
0.01

0.00
0.02
1.21
0.02
0.04
1.04
0.09
0.15
0.07
0.05

0.02

0.02
0.26
0.11

0.11
0.11
0.11
0.04
0.04

0.17
0.17
0.08
0.03
0.00
0.10
0.10
0.10
3.43
0.66
0.50
0.16
0.04
0.07
0.17
0.02
0.03

1.37
0.45
0.03
0.49
0.02
0.22
0.04
0.02
0.02
0.08
0.68
0.02
0.01
0.45
0.14
0.19
0.11
0.02

0.03

0.02
0.25
0.11

0.10
0.10
0.10
0.05
0.01

0.02
2.61
2.61
0.03

2.57
0.17
0.17
0.17
22.7
0.28
1.27
0.87
0.25

0.03
0.01
0.06
0.05
9.05
0.67

0.22
0.03
7.39

0.74
0.14
0.13
0.01

0.05
0.02

0.01
11.9

0.07
11.8
0.17
0.17

0.14
0.14
0.14
0.09
0.02

 

Appendix A cont’d

G) Leptolinea
G) Caldilinea
C) Chloroflexi
O) Chloroflexales
UC) Chloroflexales
F) Oscillochloridaceae
G) Oscillochloris
P) TM7
G) TM7_genera_IS
P) Spirochaetes
C) Spirochaetes
O) Spirochaetales
F) Spirochaetaceae
UC) Spirochaetaceae
P) WS3
G) WS3_genera_IS
P) OD1
G) OD1_genera_IS
P) OP10
G) OP10_genera_IS
P) Verrucomicrobia
C) Verrucomicrobiae
O) Verrucomicrobiales
UC) Verrucomicrobiales
F) Sub 3
G) Sub 3_genera_IS
F) Xiphinematobacteriaceae
UC) Xiphinematobacteriaceae
G) Xiphinematobacteriaceae
F) Sub 5
G) Sub 5_genera_IS
F) Opitutaceae
G) Opitutus
F) Verrucomicrobiaceae
UC) Verrucomicrobiaceae
G) Verrucomicrobiaceae
G) Prosthecobacter
G) Verrucomicrobium
P) BRC1
G) BRC1_genera_IS
P) Cyanobacteria
C) Cyanobacteria
F) Chloroplast
G) Streptophyta

P) Firmicutes
Firmicutes
Lactobacillales

0.01
0.05
0.02
0.02
0.01
0.01
0.01
0.18
0.18
0.08
0.08
0.08
0.04
0.02
0.81
0.81
0.46
0.46
0.11
0.11
16.9
16.9
16.9
1.00
5.32
5.32

7.09

0.22

6.82

0.01
0.01
1.52
1.52
1.98
0.27
1.11
0.12
0.48
0.02
0.02
0.36
0.36
0.36
0.36
1.26
0.47
0.15

0.07

0.12
0.12
0.12

0.20
0.20

0.05
0.05
0.32
0.32

3.74
3.74
3.74
0.20
1.20
1.20

1.52
0.02

1.49

0.64
0.64
0.20
0.02
0.07
0.07
0.02

1.05
0.34
0.17

0.05

0.01
0.06
0.06
0.04
0.02
0.02
0.41
0.41

0.02
0.02
0.32
0.32
0.03
0.03
4.47
4.47
4.47
0.25
2.42
2.42

1.22

0.06

0.58
0.58
0.02
0.01

0.01

0.02
0.02
0.06
0.06
0.06
0.05
0.75
0.46
0.07
0.01
0.01

 

0.01
0.08
0.22
0.22
0.14
0.03
0.03
0.23
0.23
0.05
0.05
0.05
0.05
0.04
0.11
0.11
0.18
0.18
0.02
0.02
6.17
6.17
6.17
0.17
1.98
1.98

3.81

0.03

3.78

0.17
0.17
0.05

0.05

0.03
0.03
0.02
0.01
1.63
0.20
1.03
0.01


0.22
0.22

0.19
0.19
2.01
2.01

1.18
1.18
1.18
0.10
0.38
0.38

0.67

0.67

0.03

0.03

0.06
0.06
0.06
0.06
1.02
0.16
0.77

0.13

 

1.61
0.51
0.01
0.01
0.01

0.02
0.02
0.31
0.31
0.31
0.28
0.19
0.49
0.49
0.05
0.05
0.27
0.27
4.95
4.95
4.95
0.06
2.57
2.57

0.59

0.59

0.21
0.21
0.09
0.09
1.43
0.09
1.22
0.01
0.09
0.16
0.16
0.02
0.02

6.33
1.17
0.95
0.02
0.07

0.00
0.03
0.10
0.08
0.03

0.39
0.39
0.01
0.01
0.01
0.01

0.32
0.32

0.01
0.01
3.35
3.35
3.35
0.05
0.40
0.40

2.84

0.00

2.83

0.03
0.03
0.04

0.02
0.00
0.01
0.04
0.04
0.10
0.10
0.04
0.00
19.1
0.92
17.6
0.18
0.01

0.04
0.10
0.10
0.03
0.00
0.00
0.59
0.59

0.17
0.17
0.00
0.00
0.03
0.03
3.79
3.79
3.79
0.02
0.41
0.41

3.20
0.01

3.20

0.02
0.02
0.09
0.09
0.05
0.01
0.02
0.01

0.02
0.02
0.11
0.11
0.08
0.05
13.0
0.94
11.2
0.17
0.02

0.01
0.01

0.00
0.00
0.01
0.01
0.01
0.01
0.00
0.14
0.14

0.02
0.02
2.53
2.53
2.53
0.06
1.40
1.40

0.24

0.24

0.38
0.38
0.44
0.01
0.33

0.10
0.01
0.01
0.00
0.00
0.00

2.16
0.50
1.37
0.00

 

Appendix A cont’d
O) Bacillales
UC) Bacillales
F) Bacillaceae
UC) Bacillaceae
SF) Bacillaceae 1
UC) "Bacillaceae 1"
super-G) Bacillus
UC) Bacillus
G) Bacillus d
G) Bacillus h
G) Bacillus c
G) Bacillus k
G) Anoxybacillus
F) Listeriaceae
SF) Paenibacillaceae 2
G) Oxalophagus
F) Paenibacillaceae
SF) Paenibacillaceae 1
G) Brevibacillus
G) Paenibacillus
G) Cohnella
F) Planococcaceae
UC) Planococcaceae
G) Sporosarcina
G) Pasteuriaceae Incertae Sedis
C) Clostridia
UC) "Clostridia"
O) Clostridiales
UC) Clostridiales
F) Incertae Sedis XI
G) Sedimentibacter
F) Ruminococcaceae
UC) "Ruminococcaceae"
G) Acetivibrio
G) Ruminococcaceae IS
F) Peptococcaceae
F) Clostridiaceae
SF) Clostridiaceae 1
G) Clostridium
F) Incertae Sedis XV
UC) Incertae Sedis XV
F) Incertae Sedis XII
G) Fusibacter
P) Gemmatimonadetes
C) Gemmatimonadetes
O) Gemmatimonadales
F) Gemmatimonadaceae
G) Gemmatimonas
P) Chlamydiae
C) Chlamydiae
O) Chlamydiales

0.15
0.03
0.11

0.11
0.02
0.10
0.04

0.04

0.01
0.01

0.64
0.22
0.41
0.36
0.01
0.01

0.01
0.01
0.01
0.02
0.02

3.06
3.06
3.06
3.06
3.06
0.22
0.22
0.22

0.12

0.05

0.05
0.02
0.02
0.02

0.02
0.02

0.02

0.54
0.15
0.34
0.24
0.02
0.02

0.05
0.05
0.05

6.07
6.07
6.07
6.07
6.07
0.27
0.27
0.27

0.06
0.01
0.01

0.01

0.01

0.03
0.03

0.02
0.01

0.21
0.06
0.15
0.04
0.02
0.02

0.03

5.67
5.67
5.67
5.67
5.67
0.13
0.13
0.13

 

1.02
0.07
0.80
0.02
0.77
0.19
0.58
0.20
0.14

0.22

0.08
0.08

0.07
0.01
0.07
0.02

0.03

0.41
0.05
0.36
0.25
0.01
0.01

0.03
0.02
0.02
0.02
0.05
0.05

1.42
1.42
1.42
1.42
1.42
0.17
0.17
0.17


0.64

0.03

0.03

0.03
0.03

0.51
0.51

0.29
0.16

0.10
0.03
0.06
0.03

0.96
0.96
0.96
0.96
0.96

 

0.86
0.07
0.69
0.02
0.67
0.24
0.35
0.11
0.06
0.01
0.15
0.02
0.07
0.02
0.02
0.02
0.02
0.02
0.01

0.02
0.06
0.03
0.03

4.22
0.80
3.36
1.88
0.07
0.07
0.43
0.11
0.13
0.19
0.17
0.04
0.03
0.02
0.22
0.20
0.24
0.24
0.50
0.50
0.50
0.50
0.50
0.09
0.09
0.09

17.4
1.27
14.1
0.52
13.5
4.06
9.31
2.62
3.50
0.47
2.44
0.19
0.21
0.12
0.12
0.08
0.49
0.49
0.11
0.35
0.03
1.35
0.89
0.28

0.11

0.63
0.17
0.44
0.14
0.11
0.11
0.01

0.00
0.00
0.03
0.05
0.05
0.04

0.00
0.00
2.17
2.17
2.17
2.17
2.17

11.0
0.98
8.70
0.41
8.23
2.90
5.22
1.64
1.73
0.19
1.42
0.21
0.12
0.21
0.21
0.19
0.33
0.33
0.03
0.25
0.02
0.76
0.47
0.12

0.12

0.88
0.24
0.64
0.24
0.07
0.07
0.00

0.00

0.05
0.18
0.18
0.17

2.65
2.65
2.65
2.65
2.65
0.01
0.01
0.01

1.36
0.13
0.37

0.37

0.37
0.36

0.00

0.00
0.00

0.86
0.86

0.84

0.29
0.11
0.18
0.06
0.03
0.03
0.00

0.00

0.01
0.00
0.00
0.00

0.03
0.03
0.32
0.32
0.32
0.32
0.32
0.01
0.01
0.01

 

Appendix A cont’d

F) Parachlamydiaceae 0.12 0.22 0.06 0.11 0.01

G) Parachlamydia 0.07 0.15 0.02 0.07 0.01

P) Planctomycetes 1.76 1.39 0.93 1.57 1.13 0.30 0.30 0.05
C) Planctomycetacia 1.76 1.39 0.93 1.57 1.13 0.30 0.30 0.05
O) Planctomycetales 1.76 1.39 0.93 1.57 1.13 0.30 0.30 0.05
F) Planctomycetaceae 1.76 1.39 0.93 1.57 1.13 0.30 0.30 0.05
UC) Planctomycetaceae 0.81 0.51 0.43 0.64 0.28 0.10 0.12 0.01
G) Gemmata 0.39 0.46 0.19 0.31 0.05 0.15 0.13

G) Planctomyces 0.28 0.05 0.05 0.20 0.11 0.02 0.01

G) Blastopirellula 0.06 0.04 0.05 0.20 0.01
G) Pirellula 0.21 0.29 0.14 0.25 0.48 0.00 0.02 0.02
G) Isosphaera 0.01 0.07 0.09 0.12 0.01 0.02 0.02
Domain Archaea 0.13

P) Euryarchaeota 0.13

C) Methanomicrobia 0.13

O) Methanomicrobiales 0.11

F) Methanomicrobiaceae 0.11

 

 

 

Appendix A. Detailed classification of sequences of bacterial assemblages from
chapter 4

1. P) phylum, C) class, SC) subclass, O) order, SO) suborder, F) family, SF) subfamily,
G) genus, and UC) "unclassified" artificial taxa.
2. Classification is based on the RDP Classifier result at a 50% threshold.
3. Only taxa whose maximum value across the nine samples exceeded 0.1% are shown in this table.
4. "0.00" indicates < 0.05% and > 0.001%.
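The inclusion rule in note 3 can be read as a simple filter over the abundance table. The following is a minimal sketch under stated assumptions, not the procedure actually used to build the table: the function name, the dictionary layout, and the "Hypothetical rare genus" row are illustrative assumptions, while the other three rows are copied from the first page of this appendix.

```python
# Percent of sequences in each of the nine samples for a few taxa.
# The first three rows come from Appendix A; the last row is hypothetical
# and is included only to show a taxon that the rule would drop.
abundance = {
    "P) Actinobacteria": [7.25, 26.2, 18.4, 44.4, 5.71, 1.87, 8.50, 8.75, 0.15],
    "UC) Actinomycetales": [1.21, 1.10, 0.92, 1.58, 0.10, 0.06, 0.48, 0.49, 0.02],
    "F) Actinosynnemataceae": [0.02, 0.17, 0.05, 0.56, 0.06, 0.01, 0.00, 0.01, 0.00],
    "G) Hypothetical rare genus": [0.00, 0.02, 0.08, 0.00, 0.01, 0.00, 0.00, 0.03, 0.00],
}


def taxa_to_report(table, min_max_percent=0.1):
    """Keep taxa whose maximum percentage across samples exceeds the cutoff."""
    return {
        taxon: values
        for taxon, values in table.items()
        if max(values) > min_max_percent
    }


reported = taxa_to_report(abundance)
for taxon in reported:
    print(taxon)
```

With this cutoff the hypothetical genus (maximum 0.08%) is excluded, while Actinosynnemataceae is retained because a single sample reaches 0.56%.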


Appendix B1. The Habitat-Lite two-level scheme and definitions of its terms

Top level term | Definition
Aquatic | A habitat that is in or on water
Aquatic: Freshwater | A habitat that is in or on a body of water containing low concentrations of dissolved salts and other total dissolved solids (<0.5 grams dissolved salts per liter)
Aquatic: Marine | A habitat that is in or on a sea or ocean containing high concentrations of dissolved salts and other total dissolved solids (typically >35 grams dissolved salts per liter)
Terrestrial | A habitat that is on or at the boundary of the surface of the Earth
Air | The mixture of gases, roughly (by molar content/volume: 78% nitrogen, 20.95% oxygen, 0.93% argon, 0.038% carbon dioxide, trace amounts of other gases, and a variable amount [average around 1%] of water vapor), that surrounds the planet Earth
Fossil | The mineralized or otherwise preserved remains or traces (such as footprints) of animals, plants, and other organisms
Food | A substance, usually composed primarily of carbohydrates, fats, water and/or proteins, that can be eaten or drunk by an animal or human being for nutrition or pleasure
Organism-Associated | A habitat that is in or on a living thing
Extreme | A habitat having at least one environmental quality that tends towards either the largest or smallest element of the set. The physical or geochemical extreme conditions found in an extreme
Cultured | Cultured habitat is a controlled habitat created by humans through laboratory techniques, usually for the purposes of preparing cell, organ, tissue and plant tissue cultures
Other |

Second level term | Definition
soil | Any material within 2 m from the Earth's surface that is in contact with the atmosphere, with the exclusion of living organisms, areas with continuous ice not covered by other material, and water bodies deeper than 2 m
sediment | Sediment is an environmental substance comprised of any particulate matter that can be transported by fluid flow and which eventually is deposited as a layer of solid particles on the bed or bottom of a body of water or other liquid
sludge | The residual semi-solid material left from domestic or industrial processes, or wastewater treatment processes
waste water | A habitat that is in or on a body of water containing low concentrations of dissolved salts and other total dissolved solids (<0.5 grams dissolved salts per litre)
hot spring | A spring that is produced by the emergence of geothermally-heated groundwater from the Earth's crust
hydrothermal vent | A fissure in the Earth's surface from which geothermally heated water issues
biofilm | A complex aggregation of microorganisms marked by the excretion of a protective and adhesive matrix; usually adhering to a substratum
microbial mat |

Table 1. Definitions of terms in Habitat-Lite version 0.4 (revised May 20, 2009). A
given habitat may be described with one or more appropriate top-level terms, plus
second-level terms as appropriate (Hirschman et al., 2008).
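The two aquatic sub-terms are defined by salinity cutoffs, so the choice between them can be illustrated with a short function. This is only a sketch based on the thresholds quoted in the definitions above (<0.5 g/L for freshwater, typically >35 g/L for marine); the function name and the decision to fall back to the plain "Aquatic" term for intermediate salinities are assumptions, not part of Habitat-Lite.

```python
def aquatic_term(salinity_g_per_l):
    """Pick a Habitat-Lite aquatic term from dissolved salts in grams per liter.

    Thresholds follow the definitions in the table; the fallback to plain
    "Aquatic" for intermediate values is an assumption made for illustration.
    """
    if salinity_g_per_l < 0:
        raise ValueError("salinity cannot be negative")
    if salinity_g_per_l < 0.5:
        return "Aquatic: Freshwater"
    if salinity_g_per_l > 35:
        return "Aquatic: Marine"
    return "Aquatic"


for value in (0.2, 10.0, 36.0):
    print(value, aquatic_term(value))
```

Because the marine definition says "typically >35", a real annotation workflow would probably treat that cutoff as a guideline rather than a hard limit.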


Appendix B2. A priori groups described by Habitat-Lite

 

Group | Number of samples | Habitat-Lite description
G01 | 116 | Terrestrial¹, Soil²
G02 | 6 | Extreme¹, Soil²
G03 | 12 | Terrestrial¹, Extreme¹, Soil²
G04 | 16 | Organism-Associated¹
G05 | 6 | Freshwater¹, Waste water²
G06 | 7 | Freshwater¹, Sediment²
G07 | 2 | Fossil¹, Organism-Associated¹
G08 | 10 | Marine¹, Sediment²
G09 | 14 | Cultured¹, Soil² or Sediment²
G10 | 20 | Extreme¹, Freshwater¹, Sediment²
G11 | 2 | Extreme¹, Microbial mat²

¹ Top level terms in Habitat-Lite
² Second level terms
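The group definitions above can be held in a small lookup structure, which makes it easy to tally samples or find every group that carries a given term. The sketch below is illustrative: the variable and function names are assumptions, and the sample counts and terms are transcribed from the table in this appendix.

```python
# (group, number of samples, top-level terms, second-level terms),
# transcribed from Appendix B2.
PRIORI_GROUPS = [
    ("G01", 116, ["Terrestrial"], ["Soil"]),
    ("G02", 6, ["Extreme"], ["Soil"]),
    ("G03", 12, ["Terrestrial", "Extreme"], ["Soil"]),
    ("G04", 16, ["Organism-Associated"], []),
    ("G05", 6, ["Freshwater"], ["Waste water"]),
    ("G06", 7, ["Freshwater"], ["Sediment"]),
    ("G07", 2, ["Fossil", "Organism-Associated"], []),
    ("G08", 10, ["Marine"], ["Sediment"]),
    ("G09", 14, ["Cultured"], ["Soil", "Sediment"]),  # "Soil or Sediment"
    ("G10", 20, ["Extreme", "Freshwater"], ["Sediment"]),
    ("G11", 2, ["Extreme"], ["Microbial mat"]),
]


def total_samples(groups):
    """Sum the sample counts over all groups."""
    return sum(count for _, count, _, _ in groups)


def groups_with_term(groups, term):
    """Return IDs of groups whose top- or second-level terms include `term`."""
    return [gid for gid, _, top, second in groups if term in top or term in second]


print(total_samples(PRIORI_GROUPS))
print(groups_with_term(PRIORI_GROUPS, "Extreme"))
```

Keeping top-level and second-level terms in separate lists mirrors the superscript markers in the table.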


Appendix B3. List of samples and their a priori groups

Sample ID | Sampling description and location | Habitat-Lite description | Group
Cz_0D | PCB-contaminated soil under Austrian pine tree, Czech Republic | Terrestrial¹, Soil² | G01
Du_E22_7 | Rhizosphere | Terrestrial¹, Soil² | G01
Du_E22_8 | Rhizosphere | Terrestrial¹, Soil² | G01
Gh_BF1 | Bare fallow plots (BF), replication 1, Kpeve Agricultural Experimental Station (KAES) in Volta Region, Ghana | Terrestrial¹, Soil² | G01
Gh_BF2 | BF, rep2, KAES in Volta Region, Ghana | Terrestrial¹, Soil² | G01
Gh_BF3 | BF, rep3, KAES in Volta Region, Ghana | Terrestrial¹, Soil² | G01
Gh_BF4 | BF, rep4, KAES in Volta Region, Ghana | Terrestrial¹, Soil² | G01
Gh_BFc | BF, composite, KAES in Volta Region, Ghana | Terrestrial¹, Soil² | G01
Gh_EbM1 | Maize-elephant grass (Pennisetum sp.) rotation with fallow residue burning plot (EbM), rep1, KAES in Volta Region, Ghana | Terrestrial¹, Soil² | G01
Gh_EbM2 | EbM, rep2, KAES in Volta Region, Ghana | Terrestrial¹, Soil² | G01
Gh_EbM3 | EbM, rep3, KAES in Volta Region, Ghana | Terrestrial¹, Soil² | G01
Gh_EbM4 | EbM, rep4, KAES in Volta Region, Ghana | Terrestrial¹, Soil² | G01
Gh_EbMc | EbM, composite, KAES in Volta Region, Ghana | Terrestrial¹, Soil² | G01
Gh_EfM1 | Fertilized maize-elephant grass rotation with minimum tillage of fallow residue by hand slashing (EfM), rep1, KAES in Volta Region, Ghana | Terrestrial¹, Soil² | G01
Gh_EfM2 | EfM, rep2, KAES in Volta Region, Ghana | Terrestrial¹, Soil² | G01
Gh_EfM3 | EfM, rep3, KAES in Volta Region, Ghana | Terrestrial¹, Soil² | G01
Gh_EfM4 | EfM, rep4, KAES in Volta Region, Ghana | Terrestrial¹, Soil² | G01
Gh_EfMc | EfM, composite, KAES in Volta Region, Ghana | Terrestrial¹, Soil² | G01
Gh_Eu1 | Unmanaged elephant grass (Eu), rep1, KAES in Volta Region, Ghana | Terrestrial¹, Soil² | G01
Gh_Eu2 | Eu, rep2, KAES in Volta Region, Ghana | Terrestrial¹, Soil² | G01
Gh_Euc | Eu, composite, KAES in Volta Region, Ghana | Terrestrial¹, Soil² | G01

Appendix B3 cont'd

Sample ID | Sampling description and location | Habitat-Lite description | Group
Gh_PM1 | Maize-pigeon pea (Cajanus cajan) rotation with minimum tillage of fallow residue by hand slashing (PM), rep1, KAES in Volta Region, Ghana | Terrestrial¹, Soil² | G01
Gh_PM2 | PM, rep2, KAES in Volta Region, Ghana | Terrestrial¹, Soil² | G01
Gh_PM3 | PM, rep3, KAES in Volta Region, Ghana | Terrestrial¹, Soil² | G01
Gh_PM4 | PM, rep4, KAES in Volta Region, Ghana | Terrestrial¹, Soil² | G01
HA_Site1 | Hawaii Mauna Kea permafrost_location1 | Terrestrial¹, Soil² | G01
HA_Site2 | Hawaii Mauna Kea permafrost_location2 | Terrestrial¹, Soil² | G01
HA_Site3 | Hawaii Mauna Kea permafrost_location3 | Terrestrial¹, Soil² | G01
Hi_50H_1 | Kanchenjunga glacier (5000 m), rep1, slopes descending from Drohmo peak (6980 m) in Himalaya, Nepal (27° 48' 00" N and 88° 07' 01" E) | Terrestrial¹, Soil² | G01
Hi_50H_2 | slope at 5000 m, rep2 from Drohmo peak | Terrestrial¹, Soil² | G01
Hi_50H_4 | slope at 5000 m, rep4 from Drohmo peak | Terrestrial¹, Soil² | G01
Hi_52H_1 | slope at 5200 m, rep1 from Drohmo peak | Terrestrial¹, Soil² | G01
Hi_52H_2 | slope at 5200 m, rep2 from Drohmo peak | Terrestrial¹, Soil² | G01
Hi_52H_3 | slope at 5200 m, rep3 from Drohmo peak | Terrestrial¹, Soil² | G01
Hi_52H_4 | slope at 5200 m, rep4 from Drohmo peak | Terrestrial¹, Soil² | G01
Hi_54H_1 | slope at 5400 m, rep1 from Drohmo peak | Terrestrial¹, Soil² | G01
Hi_54H_2 | slope at 5400 m, rep2 from Drohmo peak | Terrestrial¹, Soil² | G01
Hi_54H_3 | slope at 5400 m, rep3 from Drohmo peak | Terrestrial¹, Soil² | G01
Hi_54H_4 | slope at 5400 m, rep4 from Drohmo peak | Terrestrial¹, Soil² | G01
Hi_56H_1 | slope at 5600 m, rep1 from Drohmo peak | Terrestrial¹, Soil² | G01
Hi_56H_2 | slope at 5600 m, rep2 from Drohmo peak | Terrestrial¹, Soil² | G01
Hi_56H_3 | slope at 5600 m, rep3 from Drohmo peak | Terrestrial¹, Soil² | G01
Hi_56H_4 | slope at 5600 m, rep4 from Drohmo peak | Terrestrial¹, Soil² | G01
[not legible] | slope at 5800 m, rep1 from Drohmo peak | Terrestrial¹, Soil² | G01

Appendix B3 cont'd

Sample ID | Sampling description and location | Habitat-Lite description | Group
Hi_58H_2 | slope at 5800 m, rep2 from Drohmo peak | Terrestrial¹, Soil² | G01
Hi_58H_3 | slope at 5800 m, rep3 from Drohmo peak | Terrestrial¹, Soil² | G01
Hi_58H_4 | slope at 5800 m, rep4 from Drohmo peak | Terrestrial¹, Soil² | G01
Hi_60H_1 | slope at 6000 m, rep1 from Drohmo peak | Terrestrial¹, Soil² | G01
Hi_60H_2 | slope at 6000 m, rep2 from Drohmo peak | Terrestrial¹, Soil² | G01
Hi_60H_3 | slope at 6000 m, rep3 from Drohmo peak | Terrestrial¹, Soil² | G01
Hi_60H_4 | slope at 6000 m, rep4 from Drohmo peak | Terrestrial¹, Soil² | G01
IA | Iowa farm soil after cropping, IA, USA | Terrestrial¹, Soil² | G01
Je_A72_1 | California | Terrestrial¹, Soil² | G01
Je_A72_2 | California | Terrestrial¹, Soil² | G01
Je_A74_1 | California | Terrestrial¹, Soil² | G01
Je_A74_2 | California | Terrestrial¹, Soil² | G01
Je_A74_2 | California | Terrestrial¹, Soil² | G01
Je_A82_1 | California | Terrestrial¹, Soil² | G01
Je_A82_2 | California | Terrestrial¹, Soil² | G01
Je_A84_1 | California | Terrestrial¹, Soil² | G01
Je_A84_2 | California | Terrestrial¹, Soil² | G01
Je_G72_1 | California | Terrestrial¹, Soil² | G01
Je_G74_1 | California | Terrestrial¹, Soil² | G01
Je_G74_2 | California | Terrestrial¹, Soil² | G01
Je_G74_2 | California | Terrestrial¹, Soil² | G01
Je_G82_1 | California | Terrestrial¹, Soil² | G01
Je_G84_1 | California | Terrestrial¹, Soil² | G01
Je_G84_2 | California | Terrestrial¹, Soil² | G01
Mi_Ag_C1 | MSU farm, East Lansing, corn | Terrestrial¹, Soil² | G01
Mi_Ag_C2 | MSU farm, East Lansing, corn | Terrestrial¹, Soil² | G01
Mi_Ag_C3 | MSU farm, East Lansing, corn | Terrestrial¹, Soil² | G01
Mi_Ag_FC1 | MSU farm, East Lansing, canola | Terrestrial¹, Soil² | G01
Mi_Ag_FC2 | MSU farm, East Lansing, canola | Terrestrial¹, Soil² | G01
Mi_Ag_FC3 | MSU farm, East Lansing, canola | Terrestrial¹, Soil² | G01
Mi_Ag_SB1 | MSU farm, East Lansing, soybean | Terrestrial¹, Soil² | G01
Mi_Ag_SB2 | MSU farm, East Lansing, soybean | Terrestrial¹, Soil² | G01

Appendix B3 cont'd

Sample ID | Sampling description and location | Habitat-Lite description | Group
Mi_Ag_SB3 | MSU farm, East Lansing, soybean | Terrestrial¹, Soil² | G01
Mi_Ag_SF1 | MSU farm, East Lansing, sunflower | Terrestrial¹, Soil² | G01
Mi_Ag_SF2 | MSU farm, East Lansing, sunflower | Terrestrial¹, Soil² | G01
Mi_Ag_SF3 | MSU farm, East Lansing, sunflower | Terrestrial¹, Soil² | G01
Mi_Ag_SW1 | MSU farm, East Lansing, switchgrass | Terrestrial¹, Soil² | G01
Mi_Ag_SW2 | MSU farm, East Lansing, switchgrass | Terrestrial¹, Soil² | G01
Mi_Ag_SW3 | MSU farm, East Lansing, switchgrass | Terrestrial¹, Soil² | G01
Mi_Fo_M1 | East Lansing, deciduous forest | Terrestrial¹, Soil² | G01
Mi_Fo_M2 | East Lansing, deciduous forest | Terrestrial¹, Soil² | G01
Mi_Fo_M3 | East Lansing, deciduous forest | Terrestrial¹, Soil² | G01
Mi_Fo_U1 | Chatham, Upper Peninsula, MI, pine forest | Terrestrial¹, Soil² | G01
Mi_Fo_U2 | Chatham, Upper Peninsula, MI, pine forest | Terrestrial¹, Soil² | G01
Mi_Fo_U3 | Chatham, Upper Peninsula, MI, pine forest | Terrestrial¹, Soil² | G01
Mi_Ro_C2R | Rose Township, MI, corn | Terrestrial¹, Soil² | G01
Mi_Ro_C3R | Rose Township, MI, corn | Terrestrial¹, Soil² | G01
Mi_Ro_C4R | Rose Township, MI, corn | Terrestrial¹, Soil² | G01
Mi_Ro_FC2R | Rose Township, MI, canola | Terrestrial¹, Soil² | G01
Mi_Ro_FC3R | Rose Township, MI, canola | Terrestrial¹, Soil² | G01
Mi_Ro_FC4R | Rose Township, MI, canola | Terrestrial¹, Soil² | G01
Mi_Ro_R2 | Rose Township, MI, trees | Terrestrial¹, Soil² | G01
Mi_Ro_R3 | Rose Township, MI, trees | Terrestrial¹, Soil² | G01
Mi_Ro_R4 | Rose Township, MI, trees | Terrestrial¹, Soil² | G01
Mi_Ro_SB2R | Rose Township, MI, soybean | Terrestrial¹, Soil² | G01
Mi_Ro_SB3R | Rose Township, MI, soybean | Terrestrial¹, Soil² | G01
Mi_Ro_SB4R | Rose Township, MI, soybean | Terrestrial¹, Soil² | G01
Mi_Ro_SF2R | Rose Township, MI, sunflower | Terrestrial¹, Soil² | G01
Mi_Ro_SF3R | Rose Township, MI, sunflower | Terrestrial¹, Soil² | G01
Mi_Ro_SF4R | Rose Township, MI, sunflower | Terrestrial¹, Soil² | G01
Mi_Ro_SW2R | Rose Township, MI, switchgrass | Terrestrial¹, Soil² | G01
Mi_Ro_SW3R | Rose Township, MI, switchgrass | Terrestrial¹, Soil² | G01
Mi_Ro_SW4R | Rose Township, MI, switchgrass | Terrestrial¹, Soil² | G01

Appendix B3 cont'd

Sample ID | Sampling description and location | Habitat-Lite description | Group
OH_E1x1a | Ohio | Terrestrial¹, Soil² | G01
OH_E1x1b | Ohio | Terrestrial¹, Soil² | G01
OH_E1x6a | Ohio | Terrestrial¹, Soil² | G01
OH_E1x6b | Ohio | Terrestrial¹, Soil² | G01
OH_E2x1a | Ohio | Terrestrial¹, Soil² | G01
OH_E2x1b | Ohio | Terrestrial¹, Soil² | G01
OH_E2x6a | Ohio | Terrestrial¹, Soil² | G01
OH_E2x6b | Ohio | Terrestrial¹, Soil² | G01
Pi_0D | PCB-contaminated sandy soil, Picatinny Arsenal, NJ, US | Terrestrial¹, Soil² | G01
Si_100_120_11 | Siberia | Extreme¹, Soil² | G02
Si_15_40_10 | Siberia | Extreme¹, Soil² | G02
Si_15_40_7 | Siberia | Extreme¹, Soil² | G02
Si_2_3m_21 | Siberia | Extreme¹, Soil² | G02
Si_2_3m_24 | Siberia | Extreme¹, Soil² | G02
Si_5_1or_14 | Siberia | Extreme¹, Soil² | G02
Ant_AD10 | Antarctica | Terrestrial¹, Extreme¹, Soil² | G03
Ant_AD11 | Antarctica | Terrestrial¹, Extreme¹, Soil² | G03
Ant_IC1 | Antarctica | Terrestrial¹, Extreme¹, Soil² | G03
Ant_IC2 | Antarctica | Terrestrial¹, Extreme¹, Soil² | G03
Ant_ID1 | Antarctica | Terrestrial¹, Extreme¹, Soil² | G03
Ant_ID2 | Antarctica | Terrestrial¹, Extreme¹, Soil² | G03
Ant_QC7 | Antarctica | Terrestrial¹, Extreme¹, Soil² | G03
Ant_QC8 | Antarctica | Terrestrial¹, Extreme¹, Soil² | G03
Ant_QD7 | Antarctica | Terrestrial¹, Extreme¹, Soil² | G03
Ant_QD8 | Antarctica | Terrestrial¹, Extreme¹, Soil² | G03
St_Av1 | Spitsbergen | Terrestrial¹, Extreme¹, Soil² | G03
St_Av2 | Spitsbergen | Terrestrial¹, Extreme¹, Soil² | G03
Pig_Dom | Pig feces | Organism-Associated¹ | G04
Pig_Fo26 | Pig feces | Organism-Associated¹ | G04
Pig_Fo31 | Pig feces | Organism-Associated¹ | G04
Pig_Fo32 | Pig feces | Organism-Associated¹ | G04
Pig_Fo35 | Pig feces | Organism-Associated¹ | G04
Pig_Fo37 | Pig feces | Organism-Associated¹ | G04
Pig_F1_04 | Pig feces | Organism-Associated¹ | G04
[not legible] | Pig feces | Organism-Associated¹ | G04
Pig_F3_11 | Pig feces | Organism-Associated¹ | G04
Pig_F3_12A | Pig feces | Organism-Associated¹ | G04
Pig_F3_12B | Pig feces | Organism-Associated¹ | G04
Pig_F3_13 | Pig feces | Organism-Associated¹ | G04
Pig_F6 | Pig feces | Organism-Associated¹ | G04
Pig_O01 | Pig feces | Organism-Associated¹ | G04
Pig_O02 | Pig feces | Organism-Associated¹ | G04
Pig_O03 | Pig feces | Organism-Associated¹ | G04
WWT_01 | Urgary | Freshwater¹, Waste water² | G05
WWT_02 | Urgary | Freshwater¹, Waste water² | G05
WWT_03 | Urgary | Freshwater¹, Waste water² | G05
WWT_04 | Urgary | Freshwater¹, Waste water² | G05
WWT_05 | Urgary | Freshwater¹, Waste water² | G05
WWT_06 | Urgary | Freshwater¹, Waste water² | G05
Mi_RR0D | PCB-contaminated sediment, River Raisin, MI, US | Freshwater¹, Sediment² | G06
WA_DOH | Washington | Freshwater¹, Sediment² | G06
WA_Han01 | Columbia River, Washington | Freshwater¹, Sediment² | G06
WA_Han02 | Columbia River, Washington | Freshwater¹, Sediment² | G06
WA_Han03 | Columbia River, Washington | Freshwater¹, Sediment² | G06
WA_Han04 | Columbia River, Washington | Freshwater¹, Sediment² | G06
WA_Han05 | Columbia River, Washington | Freshwater¹, Sediment² | G06
Mam_AZ | Siberia | Fossil¹, Organism-Associated¹ | G07
Mam_Ce | Siberia | Fossil¹, Organism-Associated¹ | G07
Adria | Marine sediment from the Northern Adriatic Sea, Gulf of Trieste (45°33'N 13°37'E) | Marine¹, Sediment² | G08
BC180 | Barrow Canyon (BC, 186 m depth, 71.607N 156.214W) from the Alaskan maritime in the Chukchi Sea | Marine¹, Sediment² | G08
EHS | East Hanna Shoal (EHS, 160 m depth, 72.637N 158.667W) from the Alaskan maritime in the Chukchi Sea | Marine¹, Sediment² | G08
FL_10 | Florida Bay 10 (FL10, 25.025N 80.681W) | Marine¹, Sediment² | G08
FL_11 | Florida Bay 11 (FL11, 24.913N 80.938W) | Marine¹, Sediment² | G08
FL_9 | Florida Bay 9 (FL9, 25.177N 80.490W) | Marine¹, Sediment² | G08
GM1 | (800 m depth, 26.404N 96.064W) in the Gulf of Mexico | Marine¹, Sediment² | G08
JF | West of the Juan de Fuca Ridge (JF, 3869 m depth, 46.783N 133.667W) in the Pacific Ocean | Marine¹, Sediment² | G08
ST_2 | | Marine¹, Sediment² | G08
WA Coast | Washington Margin (WM, 1138 m depth, 46.575N 124.822W) in the Pacific Ocean | Marine¹, Sediment² | G08
Cz_14D_SIP | PCB- and biphenyl-degrading population from PCB-contaminated soil under Austrian pine tree at 14 days incubation with 13C-biphenyl | Cultured¹, Soil² or Sediment² | G09
Cz_4D_SIP | PCB- and biphenyl-degrading population from PCB-contaminated soil under Austrian pine tree at 4 days incubation with 13C-biphenyl | Cultured¹, Soil² or Sediment² | G09
Mi_RR14D_SIP | PCB- and biphenyl-degrading population from PCB-contaminated River Raisin sediment at 14 days incubation with 13C-biphenyl | Cultured¹, Soil² or Sediment² | G09
Mi_RR14Ds_SIP | PCB- and biphenyl-degrading population from PCB-contaminated River Raisin sediment at 14 days incubation with 13C-biphenyl with slurry | Cultured¹, Soil² or Sediment² | G09
Mi_RR3D_SIP | PCB- and biphenyl-degrading population from PCB-contaminated River Raisin sediment at 3 days incubation with 13C-biphenyl | Cultured¹, Soil² or Sediment² | G09
Pi_14D_SIP | PCB- and biphenyl-degrading population from PCB-contaminated Picatinny sandy soil at 14 days incubation with 13C-biphenyl | Cultured¹, Soil² or Sediment² | G09
St_AN1_IN | Spitsbergen | Cultured¹, Soil² or Sediment² | G09
St_AN2_IN | Spitsbergen | Cultured¹, Soil² or Sediment² | G09
St_anN1_IN | Spitsbergen | Cultured¹, Soil² or Sediment² | G09
St_anN2_IN | Spitsbergen | Cultured¹, Soil² or Sediment² | G09
St_O1_IN | Spitsbergen | Cultured¹, Soil² or Sediment² | G09
St_O2_IN | Spitsbergen | Cultured¹, Soil² or Sediment² | G09
St_ON1_IN | Spitsbergen | Cultured¹, Soil² or Sediment² | G09
St_ON2_IN | Spitsbergen | Cultured¹, Soil² or Sediment² | G09
FRC1 | FRC | Extreme¹, Freshwater¹, Sediment² | G10
FRC10 | FRC | Extreme¹, Freshwater¹, Sediment² | G10
FRC11 | FRC | Extreme¹, Freshwater¹, Sediment² | G10
FRC12 | FRC | Extreme¹, Freshwater¹, Sediment² | G10
FRC13 | FRC | Extreme¹, Freshwater¹, Sediment² | G10
FRC14 | FRC | Extreme¹, Freshwater¹, Sediment² | G10
FRC15 | FRC | Extreme¹, Freshwater¹, Sediment² | G10
FRC16 | FRC | Extreme¹, Freshwater¹, Sediment² | G10
FRC17 | FRC | Extreme¹, Freshwater¹, Sediment² | G10
FRC18 | FRC | Extreme¹, Freshwater¹, Sediment² | G10
FRC2 | FRC | Extreme¹, Freshwater¹, Sediment² | G10
FRC20 | FRC | Extreme¹, Freshwater¹, Sediment² | G10
FRC22 | FRC | Extreme¹, Freshwater¹, Sediment² | G10
FRC23 | FRC | Extreme¹, Freshwater¹, Sediment² | G10
FRC24 | FRC | Extreme¹, Freshwater¹, Sediment² | G10
FRC25 | FRC | Extreme¹, Freshwater¹, Sediment² | G10
FRC4 | FRC | Extreme¹, Freshwater¹, Sediment² | G10
FRC5 | FRC | Extreme¹, Freshwater¹, Sediment² | G10
FRC6 | FRC | Extreme¹, Freshwater¹, Sediment² | G10
FRC9 | FRC | Extreme¹, Freshwater¹, Sediment² | G10
Du_17_1 | DUSEL | Extreme¹, Microbial mat² | G11
Du_17_2 | DUSEL | Extreme¹, Microbial mat² | G11

 

Appendix B4. Confusion table of priori groups and bacterial assemblage clusters by average distance clustering

Columns (priori groups): G01, G02, G03, G04, G05, G06, G07, G08, G09-1*, G09-2*, G10, G11, Sum.
Only the nonzero counts and the row sums are legible in the scan; the column position of each count within a row is not recoverable, so counts are listed in left-to-right order.

Cluster | Nonzero counts (left to right) | Row sum
C01 | 114, 4 | 118
C02 | 6 | 6
C03 | 12, 14, 1, 3 | 30
C04 | 5 | 5
C05 | 2, 6, 1 | 9
C06 | 1 | 1
C07 | 2, 1 | 3
C08 | 6 | 6
C09 | 1 | 1
C10 | 2, 8 | 10
C11 | 20 | 20
C12 | 2 | 2

Column sums: G01 116, G02 6, G03 12, G04 16, G05 6, G06 7, G07 2, G08 10, G09-1 6, G09-2 8, G10 20, G11 2; total 211.

* G09 was separated into two sub-groups: G09-1, PCB- and biphenyl-utilizing populations (Chapter 4), and G09-2, various enrichments of bacterial communities from Spitsbergen permafrost soil.

Table B4.1. Confusion table of priori groups and bacterial assemblage clusters by average distance clustering

Priori groups (G01-G11), assigned from Habitat-Lite descriptions (Appendix B1), were
compared with bacterial assemblage clusters (C01-C12) obtained by Q-mode average
clustering based on 1 - Chao's corrected Sørensen similarity, computed from the 97%
OTU matrix. Most priori groups corresponded to bacterial assemblage clusters, with a
few exceptions. This indicates that similar bacterial assemblages are present in the
same microbial habitats, congruent with the habitat description.
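The comparison of priori groups with clusters can be illustrated with a small cross-tabulation. The following is a minimal Python sketch, not the procedure used in this work: the function names `confusion_table` and `purity` and the toy labels are hypothetical, and the clustering itself (average linkage on 1 - Chao's corrected Sørensen similarity) is assumed to have been done beforehand.

```python
from collections import Counter

def confusion_table(groups, clusters):
    """Cross-tabulate a priori group labels (columns) against cluster labels (rows)."""
    if len(groups) != len(clusters):
        raise ValueError("groups and clusters must have the same length")
    counts = Counter(zip(clusters, groups))
    row_labels = sorted(set(clusters))
    col_labels = sorted(set(groups))
    # Counter returns 0 for pairs that never occur, so empty cells are filled with 0.
    table = {r: {c: counts[(r, c)] for c in col_labels} for r in row_labels}
    return table, row_labels, col_labels

def purity(table):
    """Fraction of samples that fall in the dominant priori group of their cluster."""
    total = sum(sum(row.values()) for row in table.values())
    agree = sum(max(row.values()) for row in table.values())
    return agree / total

# Toy labels for six samples (hypothetical, not the thesis data).
toy_groups = ["G01", "G01", "G01", "G02", "G02", "G08"]
toy_clusters = ["C01", "C01", "C01", "C01", "C02", "C08"]

toy_table, toy_rows, toy_cols = confusion_table(toy_groups, toy_clusters)
toy_purity = purity(toy_table)

for r in toy_rows:
    print(r, [toy_table[r][c] for c in toy_cols], sum(toy_table[r].values()))
print("purity:", round(toy_purity, 3))
```

A purity close to 1 corresponds to the situation described above, where most clusters are dominated by a single priori group.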


 

 

 

 

 

 

 

 

 

 

 

 

 

Appendix B5. Bacterial assemblage clusters
[Figure with a phylum-level legend; the image and caption text are not recoverable from the scan.]

 

Appendix B6. Indicator Species of Selected Priori Groups

Group 01 (Terrestrial¹, Soil²)
Bradyrhizobium
Xiphinematobacteriaceae_genera_incertae_sedis
Gemmatimonas
Acidobacteria Gp3
Acidobacteria Gp4
Acidobacteria Gp5
Acidobacteria Gp6
Acidobacteria Gp7
Unclassified Micromonosporaceae

Group 02 (Extreme¹, Soil²)
Psychrobacter
Carboxydocella
Exiguobacterium

Group 08 (Marine¹, Sediment²)
Jannaschia
Pelobacter
Desulfuromusa
Desulfosarcina
Desulfatibacillum
Desulfococcus
Desulforhopalus
Owenweeksia
Rubritalea
Acidobacteria Gp9
Acidobacteria Gp21
Acidobacteria Gp26
Caldithrix
Unclassified Myxococcales
Unclassified Desulfuromonaceae

Q-value < 0.05 (false discovery rate significance value)

To find indicator species that represent a specific habitat priori group, RDP
Classifier-based taxonomy-bins at the 50% threshold were used in the function "duleg"
(Dufrene-Legendre indicator species analysis in the R package "labdsv"), which
considers both occurrence frequency and relative abundance. Priori G01 contains
members of the phyla Acidobacteria, Verrucomicrobia, and Gemmatimonadetes, which are
often found exclusively in soil habitats. Priori G02, which contains 6 Siberian
permafrost soils, has as indicator species Psychrobacter and Exiguobacterium, which
can grow at temperatures as low as -10 and -5 °C. The abundance of Exiguobacterium
spp. and Psychrobacter spp. in these sites was also measured by Q-PCR amplification
(Rodrigues et al., 2009).
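How occurrence frequency and relative abundance combine into an indicator value can be sketched as follows. This is a minimal Python illustration of the Dufrene-Legendre indicator value for one taxon, written from the published definition (specificity times fidelity times 100). It is not the "duleg" implementation: the permutation test and Q-value correction are omitted, and the function name `indval` and the toy abundances are hypothetical.

```python
def indval(abundance, groups):
    """Dufrene-Legendre indicator value (in percent) of one taxon for each group.

    abundance: per-sample abundances of the taxon
    groups:    group label of each sample, in the same order
    """
    by_group = {}
    for a, g in zip(abundance, groups):
        by_group.setdefault(g, []).append(a)

    # Mean abundance of the taxon within each group.
    means = {g: sum(v) / len(v) for g, v in by_group.items()}
    total_mean = sum(means.values())

    result = {}
    for g, v in by_group.items():
        # Specificity: share of the summed group means that belongs to this group.
        specificity = means[g] / total_mean if total_mean > 0 else 0.0
        # Fidelity: fraction of samples in this group where the taxon occurs.
        fidelity = sum(1 for a in v if a > 0) / len(v)
        result[g] = 100.0 * specificity * fidelity
    return result

# Toy data (hypothetical): one taxon counted in six samples from three groups.
toy_abundance = [10, 8, 0, 0, 0, 2]
toy_groups = ["G02", "G02", "G01", "G01", "G08", "G08"]
toy_indval = indval(toy_abundance, toy_groups)
print(toy_indval)
```

In this toy case the taxon is abundant in every G02 sample and nearly absent elsewhere, so its indicator value is highest for G02; a significance test would still be needed before calling it an indicator species.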


Appendix B7. Functional Diversity Measures
INTRODUCTION

Functional diversity is defined by Tilman (2001) as "the value and range of the
functional traits of the organisms in a given ecosystem". The distribution of trait
values can be characterized through the average trait value, i.e. the
community-weighted mean (CWM) trait value (Violle et al., 2007), which is an indicator
of functional biodiversity and reflects the "average" trait value of the dominant
species in a community.
METHOD

Calculating community-weighted mean (CWM) trait value:

$$\mathrm{CWM} = \sum_{i=1}^{n} p_i \times \mathrm{trait}_i$$

where $p_i$ is the relative contribution of species i to the community, $\mathrm{trait}_i$
is the trait value of species i, and n is the total number of species included in the
calculation.
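The CWM calculation can be sketched in a few lines. The following Python example is illustrative only: the function name `community_weighted_mean`, the taxon abundances, and the gene copy numbers are hypothetical. It sums p_i times trait_i over the taxa that have a trait value and also reports the cumulative relative abundance of those taxa; whether p_i should be renormalized over the covered taxa is not stated here, so the sketch leaves p_i as given.

```python
def community_weighted_mean(rel_abundance, trait):
    """Return (CWM, covered) for one community.

    rel_abundance: taxon -> relative abundance p_i (fractions of the whole community)
    trait:         taxon -> trait value (e.g. gene copy number in one category)
    Only taxa present in both mappings enter the sum.
    """
    shared = [taxon for taxon in rel_abundance if taxon in trait]
    cwm = sum(rel_abundance[taxon] * trait[taxon] for taxon in shared)
    covered = sum(rel_abundance[taxon] for taxon in shared)
    return cwm, covered

# Hypothetical community: most of the abundance has no reference genome.
toy_abundance = {"Bradyrhizobium": 0.05, "Psychrobacter": 0.03, "Unclassified": 0.92}
# Hypothetical gene copy numbers for one functional category.
toy_trait = {"Bradyrhizobium": 300.0, "Psychrobacter": 100.0}

toy_cwm, toy_covered = community_weighted_mean(toy_abundance, toy_trait)
print(round(toy_cwm, 3), round(toy_covered, 3))
```

Reporting the covered fraction alongside the CWM makes it visible how much of each community actually contributes to the value.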

I measured bacterial traits using genomic information: the gene copy numbers in each
COG and KEGG category, obtained from complete (782 genomes) and draft (502 genomes)
genome projects (1284 genomes in total). I randomly selected a representative genome
for each genus and then matched the genus names to the taxonomy-bin names in the RDP
Classifier results. Thus, 236 taxonomy-bins defined by the RDP Classifier at the 50%
threshold (for COG; 226 for KEGG) were assigned traits, namely the gene numbers in the
COG or KEGG categories. This analysis rests on two assumptions: 1) a higher copy
number in a gene category possibly means more diverse functions, and 2) the variance
among species within a genus is small. Priori groups were defined according to
Habitat-Lite (Appendices B2 and B3).


Priori group | Cumulative relative abundance (%) of species included in the calculation
G01 | 10.1
G02 | 43.4
G03 | 44.7
G04 | 18.1
G05 | 38.0
G06 | 11.0
G07 | 64.7
G08 | 20.3
G09-1 | 45.1
G09-2 | 25.7
G10 | 21.0
G11 | 17.9

 

Table B8.1. Cumulative relative abundance of species included in the calculations.

RESULTS

Current genome information covers only 10-65% of the genera (by cumulative relative
abundance) in the 211 bacterial community assemblages. Although priori group G01 had
the lowest coverage, it had the highest CWM values in most COG categories. This
suggests that soil communities possess highly divergent traits, reflecting the
complexity of soil ecological niches. G03 (Antarctic rhizosphere) and G04 (animal
feces) had the lowest CWM values in the categories involved in metabolism and energy
production. Surprisingly, three COG categories (replication and repair, translation,
and transcription) showed constant CWM values regardless of priori group. This means
that the housekeeping genes essential to sustaining bacterial life are consistently
present at the same level in all environments, which also supports the validity of
this approach.


[Figures: CWM values of functional gene categories plotted by priori group. One panel shows the replication and repair, translation, and transcription categories; the other panels show metabolism-related categories. Axis values and captions are not recoverable from the scan.]

REFERENCES
Rodrigues DF, da C Jesus E, Ayala-Del-Rio HL, Pellizari VH, Gilichinsky D, Sepulveda-Torres L,
Tiedje JM (2009) Biogeography of two cold-adapted genera: Psychrobacter and
Exiguobacterium. ISME J 3:658-665

Tilman D (2001) Functional diversity. In: Levin SA (ed) Encyclopedia of
biodiversity. Academic Press, pp 109-120

Violle C, Navas ML, Vile D, Kazakou E, Fortunel C, Hummel I, Garnier E (2007) Let
the concept of trait be functional! Oikos 116:882-892
