Application of metagenome analysis to characterize the molecular diversity and saxitoxin-producing potentials of a cyanobacterial community: a case study in the North Han River, Korea

A wide variety of cyanobacterial species that inhabit freshwater systems are known to produce diverse toxins and off-flavor compounds during the development of environmentally harmful blooms. However, cyanobacterial community development and toxin production potential have not been well studied. In this study, we examined the taxonomic diversity and saxitoxin production potential of cyanobacteria in the water and sediments of a large river, the North Han River in South Korea, by metagenome analysis using next-generation sequencing (NGS) and molecular biological approaches, respectively. NGS revealed that the entire cyanobacterial community in the study area consisted of 39 genera and 47 species. The most abundant genera were Microcystis, Anabaena, Cyanobium, and Synechococcus, which accounted for more than 90% of the entire community. The saxitoxin production potential of the cyanobacterial community was assessed by detecting the sxtA and sxtG genes related to saxitoxin production. Eleven sxtA and 24 sxtG genes were identified through molecular cloning and sequencing. Phylogenic analysis revealed that three sxtA genes that grouped in one phylogenic branch with Scytonema sp. were distinctly separated from the sxtA genes of Anabaena, Aphanizomenon, Lyngbya, and Cylindrospermopsis. Sixteen of the detected sxtG genes were phylogenically similar to those of Anabaena circinalis (Dolichospermum circinale), Aphanizomenon gracile, and Aphanizomenon flos-aquae. Our study demonstrates the utility of the metagenomics approach for characterizing the natural community structure of cyanobacteria containing diverse and even rare species, and the evaluation of saxitoxin-producing potential in the cyanobacterial community.


Introduction
Globally, freshwaters are increasingly affected by harmful algal blooms, particularly those produced by cyanobacteria, as a consequence of human-induced eutrophication and climate change [1,2]. Many cyanobacterial species produce secondary metabolites, such as toxins and off-flavor compounds, which not only degrade ecosystem health but also cause socioeconomic problems in water use [3][4][5]. Among the various toxins of algal origin, saxitoxin is known as a paralytic shellfish poison [6,7]. This toxin has a lower LD50 than other cyanotoxins, such as microcystin and anatoxin [8,9], indicating its high toxicity. Cyanobacteria known to produce saxitoxin, including Anabaena (Dolichospermum), Aphanizomenon, Cylindrospermopsis, and Lyngbya, also form blooms in many freshwater systems worldwide, and the potentially adverse effects of saxitoxin are increasingly impacting both ecosystem and human health [10][11][12][13].
Nonetheless, the cyanobacteria primarily responsible for saxitoxin production have yet to be conclusively identified, because the toxin-synthesizing mechanism of cyanobacteria is not universal. Moreover, even though a particular cyanobacterial species may possess a series of toxin-synthesizing genes, production may be strain specific [14,15]. Accordingly, the same cyanobacterial species can be classified as either toxic or nontoxic [16]. Furthermore, differential biosynthesis of cyanotoxins by the same cyanobacterial species in geographically isolated waterbodies has often been observed. For example, saxitoxin is produced by an Australian strain of Anabaena circinalis, whereas microcystin is produced by a French strain [17]. Moreover, cylindrospermopsin and saxitoxin are produced by Cylindrospermopsis raciborskii isolated from New Zealand and Brazil, respectively. It has also been reported that Anabaena flos-aquae occurring in different lakes in Canada produces either microcystin or anatoxin [18]. These observations emphasize that even when a cyanotoxin is detected in a waterbody, it is difficult to pinpoint the species responsible for its production. Moreover, natural cyanobacterial communities are very diverse, and species richness is generally governed by many rare taxa [19]. Thus, it is necessary both to characterize community diversity and to identify the specific species possessing toxin production potential.
In South Korea, many freshwaters experience annual cyanobacterial blooms, including those produced by Microcystis, Anabaena, Aphanizomenon, and Oscillatoria [20][21][22]. Detection of microcystin has been reported in most of these blooms [23][24][25][26], although there have been no reports of saxitoxin detection. However, the possible threat of saxitoxin should not be underestimated because saxitoxin can be produced by diverse species of dinoflagellates and cyanobacteria, such as Alexandrium minutum and Anabaena circinalis [27][28][29]. It has also been reported that saxitoxin-producing ability is dependent upon environmental conditions such as the concentrations of salt and inorganic nutrients [16]. Thus, it is reasonable to infer that saxitoxin could be produced by the cyanobacterial strains present in Korean freshwater systems in response to changes in environmental conditions.
To assess the saxitoxin-producing potentials of cyanobacteria in Korean freshwaters, it is necessary to analyze both the entire cyanobacterial community in a given environmental system and the presence of toxin-synthesizing genes such as sxtA and sxtG, the core genes for saxitoxin biosynthesis [29,30]. Typically, cyanobacterial community analysis is performed by microscopic observation. However, it is difficult to identify the entire cyanobacterial community microscopically because some genera, particularly species of Anabaena, have highly similar cellular morphologies. Moreover, rare taxa are likely to be overlooked during such observations. Thus, in the present study, we used next-generation sequencing (NGS), which facilitates the detailed analysis of community composition, including rare taxa. The aims of this study were to characterize the molecular diversity and to evaluate the saxitoxin-producing potentials of a freshwater cyanobacterial community in order to gain an understanding of the relationship between community structure and toxin production. We accordingly examined the entire community diversity and saxitoxin production potentials of cyanobacterial samples collected from both the water and sediment of a large river system using the NGS method and PCR, respectively.

Cyanobacterial samples
Cyanobacterial samples were obtained from both the water column and sediment in marginal areas of the downstream region of the North Han River (37°34′12.2″N, 127°20′12.2″E) during August 2014, which coincided with the cyanobacterial blooming period in South Korea. The cyanobacteria in water were collected by pulling a plankton net (pore size: 20 μm; WildCo, Florida, USA) from a depth of 1.5 m (total estimated volume of 104.6 L). The water samples were concentrated and transferred to 200-mL transparent polyethylene bottles, to which Lugol's solution was added to a final concentration of 5%. Cyanobacteria in the sediment were collected from a depth of 1.7 m using a grab sampler (Peterson grab sampler, QT Technology, Seoul, Korea) in the same area where the water samples were collected. The topmost 3 cm of sediment was removed and transferred to a 100-mL dark bottle. Both water and sediment samples were stored in the dark at 4°C prior to analysis.

Genomic DNA extraction
Zooplankton were first removed by filtration through a 60-μm mesh, and the filtrate was then passed through a black polycarbonate filter (pore size: 1 μm; Whatman Co.) to retain the cyanobacteria while removing other bacteria, followed by three sequential washes with distilled water. The filter was placed directly in a lysis solution to disrupt the cells, and the genomic DNA was then isolated from the cell lysates using a FastDNA™ SPIN Kit (MP Biomedicals, USA). The genomic DNA of cyanobacteria in sediment samples was extracted using a genomic DNA extraction kit for soil (Macherey-Nagel, Düren, Germany) following previously reported procedures with minor modification [31]. Briefly, 0.5 g of well-mixed sediment sample was vortexed with the lysis solution and silica beads, and the supernatant was then applied to a column provided in the genomic DNA extraction kit. After washing three times, the DNA was eluted using distilled water. The concentration of extracted DNA from both water and sediment samples was determined using a nanodrop spectrophotometer (Thermo Science, USA; wavelength: 260 nm).
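As a minimal sketch of how a 260-nm absorbance reading is commonly converted to a DNA concentration, the snippet below applies the standard approximation that an A260 of 1.0 over a 1-cm path corresponds to roughly 50 ng/μL of double-stranded DNA. The function name, the example reading, and the dilution factor are hypothetical, and the instrument software used in the study may apply its own corrections.

```python
# Standard approximation for double-stranded DNA:
# an A260 of 1.0 over a 1-cm path corresponds to about 50 ng/uL.
DSDNA_NG_PER_UL_PER_A260 = 50.0


def dsdna_concentration(a260: float, dilution_factor: float = 1.0) -> float:
    """Estimate dsDNA concentration (ng/uL) from a 260-nm absorbance reading."""
    if a260 < 0 or dilution_factor <= 0:
        raise ValueError("absorbance must be >= 0 and dilution factor > 0")
    return a260 * DSDNA_NG_PER_UL_PER_A260 * dilution_factor


if __name__ == "__main__":
    # Hypothetical reading, not a value reported in the study.
    print(dsdna_concentration(0.5))
```

The conversion factor applies to double-stranded DNA only; single-stranded DNA and RNA use different factors.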

Construction of a genomic DNA library
A library of the extracted DNA for shotgun metagenome analysis was constructed using an Illumina library preparation kit (Illumina Co., San Diego, USA) following the manufacturer's procedure. The genomic DNA extracted from the water samples was fragmented by vigorous vortexing with glass beads. The DNA fragments were separated by agarose gel electrophoresis, and the DNA bands of less than 300 base pairs (bp) were then excised and purified to produce the library. The genomic DNA library was sequenced using the Illumina MiSeq System (Illumina Co., San Diego, USA). Raw sequence data from this study have been deposited in the NCBI BioProject database (BioProject; https://www.ncbi.nlm.nih.gov/bioproject/) under the accession number PRJNA427611.

Analysis of the cyanobacterial community
Metagenome pipeline analysis of gene fragments was conducted following procedures reported previously. The overall process comprised the following steps: quality control, assembly, gene prediction, alignment, and taxonomic annotation. During quality control, primer and adapter sequences were removed from the reads, and small fragments were then filtered out using the Trimmomatic program (version 0.31) to improve assembly efficiency [32,33]. Assembly was performed using the MetaVelvet program (version 1.2.02) to produce contigs [34,35]. Gene prediction was conducted using the FragGeneScan program (ver. 1.19) targeting the assembled complete genome [36]. Alignment of the contigs obtained from the previous steps was performed using the DIAMOND program (version 0.27) against the nr database (http://ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz) [37]. Taxonomic annotation, the final step of the metagenome analysis, was performed using a taxonomic mapping file provided by the MEGAN5 program (version 5.7.0) [38,39].
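To illustrate the taxonomic annotation step, the following is a minimal sketch of a lowest-common-ancestor (LCA) assignment, the rule that MEGAN is generally described as using to place a sequence from its set of database hits. The text does not state which settings were applied, so the simplified lineage table, the function name, and the example hits are all assumptions for illustration and do not reproduce the program's behavior or its mapping file.

```python
from typing import Dict, List

# Hypothetical, simplified lineages (root -> leaf) for taxa named in the text.
LINEAGES: Dict[str, List[str]] = {
    "Anabaena circinalis": [
        "Cyanobacteria", "Nostocales", "Anabaena", "Anabaena circinalis"],
    "Aphanizomenon flos-aquae": [
        "Cyanobacteria", "Nostocales", "Aphanizomenon", "Aphanizomenon flos-aquae"],
    "Microcystis aeruginosa": [
        "Cyanobacteria", "Chroococcales", "Microcystis", "Microcystis aeruginosa"],
}


def lowest_common_ancestor(hit_taxa: List[str]) -> str:
    """Return the deepest taxon shared by the lineages of all hits."""
    if not hit_taxa:
        return "unassigned"
    lineages = [LINEAGES[taxon] for taxon in hit_taxa]
    lca = "root"
    # Walk down the ranks in parallel until the lineages disagree.
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break
        lca = ranks[0]
    return lca


if __name__ == "__main__":
    # A contig hitting two Nostocales species is placed at the order level.
    print(lowest_common_ancestor(["Anabaena circinalis", "Aphanizomenon flos-aquae"]))
```

Under this rule, a contig whose hits span several genera is reported at a higher rank rather than being forced to species level, which is one reason some contigs remain unclassified at the genus level.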
To identify the cyanobacterial species in the water samples, alpha taxonomy analysis was performed by microscopic observation using a Carl Zeiss Axio Vert. A1 microscope under 400× magnification (Carl Zeiss, Roßdorf, Germany). Identification of cyanobacteria at the species level was performed based on various references [40][41][42][43].
To evaluate the saxitoxin-producing potentials of the cyanobacterial community, the presence of sxtA and sxtG, genes related to saxitoxin synthesis, was determined by PCR using the genomic DNA obtained from both water and sediment samples as templates. Primers specific for sxtA and sxtG were synthesized by Bioneer (Daejeon, Korea), and their sequences are listed in Table 1. The sxtA and sxtG genes were amplified by PCR, and the PCR products were cloned into a T-blunt vector (Solgent, Daejeon, Korea). PCR amplification was performed using a serial temperature program with 30 cycles (1 min at 95°C, 30 s at 95°C, 1 min at 53-55°C (primer annealing temperature), 1 min at 72°C) and a final extension at 72°C for 5 min. The cloning of sxtA and sxtG was confirmed by PCR using the same primers as used initially. Plasmids were isolated from each colony, and the DNA sequences were determined using an ABI 3730XL DNA analyzer (Perkin-Elmer, Massachusetts, USA).
The determined sequences of the saxitoxin-producing genes were subjected to a BLAST search against the GenBank database of NCBI (National Center for Biotechnology Information, Maryland, USA). Additionally, cyanobacterial strains previously isolated from the same study area [44][45][46], including both coiled and straight types of Anabaena circinalis (AG40090 and AG40092) and Aphanizomenon flos-aquae (AG40091), were analyzed to determine whether saxitoxin-synthesizing genes were present in the cells. The identified genes, including 26 sxtA and 24 sxtG genes that encode polyketide synthase and amidinotransferase, respectively, were aligned and analyzed. A phylogenic tree was constructed with MEGA 6.0 [47] using 9 sxtA and 13 sxtG genes from Alexandrium spp. as the out-group.

Molecular diversity of the cyanobacterial community
The molecular diversity of the cyanobacterial community in the samples was investigated by shotgun metagenome analysis. In total, 5,237,023 reads (751,853,027 bp), ranging in length from 100 to 301 bp, were analyzed. Because the rarefaction curve reached a plateau after 120,000 reads, the number of reads was deemed sufficient for analyzing species diversity, as shown in Fig. 1. On the basis of the nucleotide sequence information obtained from shotgun metagenome analysis, contigs, defined as sets of overlapping DNA segments that together represent a consensus region of DNA, were generated through the quality control and assembly procedures of the in silico pipeline. A total of 1,040,416 contigs, with an average length of 315 bp, were assembled for the cyanobacterial community in the study area (Fig. 1 inset). Taxonomic classification of the assembled contigs revealed that the cyanobacterial community was composed of 39 genera, although four genera remained unclassified. The dominant genera identified were Synechococcus, Cyanobium, Anabaena (also referred to as Dolichospermum), and Microcystis, which represented 25.9, 25.6, 24.7, and 16.9% of the identified genera, respectively, and collectively accounted for 93.0% of the total cyanobacterial community. At the species level, the most dominant species were Cyanobium sp. CICIAM14 (25.5%) and Anabaena circinalis (24.6%), followed by Microcystis aeruginosa (16.8%) and Synechococcus rubescens (14.6%) (Fig. 2A). In contrast with the metagenomics method, microscopic identification of cyanobacteria in the same sample revealed only three cyanobacteria, of which Anabaena spiroides (75.2%) was the most abundant, followed by Microcystis aeruginosa (23.5%) and Merismopedium glaucum (1.3%) (Fig. 2B). Neither Synechococcus nor Cyanobium was observed in our microscopic observations, nor were these genera detected in previous studies of the same river system [48][49][50][51][52].
The failure to detect these two genera could be attributable to their very small cell size (2-4 μm in diameter) and to their morphological similarity to single cells of Microcystis. Accordingly, they might be overlooked or misidentified during microscopic observation [62]. Moreover, since Anabaena spiroides is morphologically very similar to Anabaena circinalis, it would be difficult to distinguish these two species by alpha taxonomy using microscopic observation [41]. Above all, the difference between metagenome and microscopic analysis was manifest in rare species that accounted for less than 5% of the total population. Metagenome analysis revealed more than 20 rare cyanobacterial species during the sampling period, including Aphanizomenon flos-aquae, Prochlorococcus spp., Planktothrix agardhii, Synechocystis spp., Lyngbya aestuarii, Nodularia spumigena, Scytonema hofmanii, Scytonema spp., and Cylindrospermopsis raciborskii, whereas the only rare species observed in the microscopic analysis was Merismopedium glaucum (Fig. 2B). This result indicates the limitation of direct microscopic observation for rarely occurring species and emphasizes the efficiency and power of metagenomics for measuring the diversity of cyanobacterial communities.
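The two community summaries used above, a rarefaction curve and the share of rare taxa below a 5% threshold, can be sketched as follows. This is an illustrative simplification under stated assumptions: the function names and the toy assignments are hypothetical, and the curve is built from nested prefixes of a single random shuffle, whereas published rarefaction curves usually average over many subsamples.

```python
import random
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def rarefaction_curve(assignments: Sequence[str], depths: Sequence[int],
                      seed: int = 0) -> List[Tuple[int, int]]:
    """Count distinct taxa seen in random subsamples of increasing depth."""
    rng = random.Random(seed)
    shuffled = list(assignments)
    rng.shuffle(shuffled)
    curve = []
    for depth in depths:
        depth = min(depth, len(shuffled))  # cannot sample more than we have
        curve.append((depth, len(set(shuffled[:depth]))))
    return curve


def relative_abundance(assignments: Sequence[str]) -> Dict[str, float]:
    """Percentage of assignments belonging to each taxon."""
    counts = Counter(assignments)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


def rare_taxa(abundance: Dict[str, float], threshold_pct: float = 5.0) -> List[str]:
    """Taxa whose share falls below the threshold (5% in the text)."""
    return sorted(t for t, pct in abundance.items() if pct < threshold_pct)


if __name__ == "__main__":
    # Hypothetical toy assignments, not data from the study.
    toy = (["Anabaena"] * 500 + ["Microcystis"] * 350 + ["Cyanobium"] * 130
           + ["Scytonema"] * 15 + ["Lyngbya"] * 5)
    print(rarefaction_curve(toy, [10, 100, 1000]))
    print(rare_taxa(relative_abundance(toy)))
```

A curve whose richness stops increasing as depth grows suggests that additional sequencing would reveal few new taxa, which is the reasoning behind treating the plateau as evidence of adequate sequencing depth.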

Identification of sxtA and sxtG in the cyanobacterial community
Since the field samples collected from the North Han River consisted of diverse cyanobacterial species, as shown in Fig. 2, it was difficult to isolate the species responsible for producing saxitoxin. To evaluate the saxitoxin-producing potential, the presence of saxitoxin-producing genes, including sxtA and sxtG, in water and sediment samples collected from the North Han River was assessed by molecular biological methods, and the genes were subsequently identified by DNA sequencing followed by BLAST searching in the GenBank database (NCBI).
As a result, sxtA and sxtG genes were amplified from the genomic DNA of both water and sediment samples, and the sediment samples were found to be a source of relatively more saxitoxin-producing genes (Fig. 3). The PCR products showing the expected sizes were identified as sxtA and sxtG by DNA sequencing and BLAST searching. In total, 26 sxtA and 24 sxtG genes were registered in GenBank, and these were isolated from various countries, including Australia, China, Japan, and Germany. The cyanobacteria registered as [53][54][55]. From the study area in the North Han River, 11 sxtA genes (two clones in water and nine clones in sediment) and 24 sxtG genes (4 clones in water and 20 clones in sediment) were identified based on database searches (Fig. 4), and these were all clustered as genes involved in cyanobacterial saxitoxin production.

Phylogenetic analysis of sxtA and sxtG
To identify the saxitoxin-producing cyanobacterial species in the study area, the DNA sequences of the sxtA and sxtG genes were subjected to phylogenetic analysis. As shown in Fig. 4, the sxtA and sxtG genes from the study area fell into phylogenic branches distinct from the saxitoxin-synthesizing genes of the marine dinoflagellates (Alexandrium spp.) used as the out-group. Moreover, three of the 11 sxtA genes grouped in one phylogenic branch with Scytonema spp. and were distinctly separated from those of Anabaena, Aphanizomenon, Lyngbya, and Cylindrospermopsis (Fig. 4A). The remaining eight sxtA genes were classified into a single branch distinct from those of the dominant genera in the study area. We speculate that these genes originated from rare species, and such rare species should be taken into consideration when evaluating saxitoxin production potential in freshwater systems. In contrast, 16 of the 24 sxtG genes isolated from the study area were phylogenically similar to those of Anabaena circinalis (Dolichospermum circinale), Aphanizomenon gracile, and Aphanizomenon flos-aquae (Fig. 4B). However, sxtG was not detected in the isolated Anabaena and Aphanizomenon strains. The other eight sxtG genes showed relatively low homology (50-79%) to cyanobacterial sxtG genes in GenBank and clustered into a single phylogenic branch. It can therefore be inferred that the sxtA and sxtG genes identified in this study were derived from rare species. Taking together the community diversity data and the phylogenic analysis, we conclude that Scytonema hofmanii identified in the North Han River, along with other rare species generally overlooked in typical analyses, may be responsible for saxitoxin production.
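The homology values quoted above are percentages of matching positions between aligned sequences. The following minimal sketch shows one simple way to compute such a value for two pre-aligned sequences of equal length; the function name and the example sequences are hypothetical, and BLAST reports identity over its own local alignments, so its figures need not match this simplified calculation.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments, not sequences from the study.
    print(percent_identity("ACGTAC-GT", "ACGAACTGT"))
```

Whether gap columns are excluded or counted as mismatches changes the result, so the convention should be stated whenever identity values are compared across tools.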

Discussion
Freshwater cyanobacteria are known to be major producers of diverse harmful compounds such as microcystin, 2-MIB, and saxitoxin. In this study, we focused on cyanobacterial community analysis by a metagenomics approach and on the saxitoxin-producing potential of the cyanobacterial community, assessed by molecular biological methods, in the North Han River, South Korea. The metagenomics method used in this study revealed considerably more species than microscopic observation did, as well as different proportions of the dominant species. Morphological similarity is thus an obstacle to identifying the composition of freshwater algal communities, particularly cyanobacterial species. Indeed, the difficulties associated with alpha taxonomy have been reported with regard to identifying morphologically similar species, such as distinguishing between Oscillatoria agardhii and Planktothrix agardhii and between Anabaena circinalis and Anabaena crassa [40][41][42]46]. Above all, the main advantage of metagenomics analysis over microscopic analysis is its capability to detect rare species. This is a limitation of traditional approaches to analyzing cyanobacterial communities, and it again emphasizes the necessity of molecular approaches, because rare species should be included when evaluating potential adverse effects on freshwater systems.
Among the diverse harmful toxins produced by cyanobacteria, we investigated saxitoxin production potential by molecular biological methods. To date, no saxitoxin-related problems have been reported at our study sites in the North Han River, South Korea. However, saxitoxin occurrence remains possible because a wide variety of cyanobacterial species are known to produce saxitoxin, and some of them were detected at the study sites as rare species. Aphanizomenon flos-aquae, Planktothrix agardhii, Lyngbya aestuarii, and Cylindrospermopsis raciborskii, which were detected as rare species at our study sites, have been reported to cause blooms with saxitoxin production in Australia, Europe, and the USA [11,12,[56][57][58]. Additionally, Scytonema cf. crispum was recently reported to produce saxitoxin in New Zealand [59]. Because microorganisms known to produce saxitoxin have been well studied, a series of genes involved in saxitoxin synthesis has been identified in diverse species, including sxtA, sxtG, and sxtF and sxtM, which encode polyketide synthase, amidinotransferase, and saxitoxin transporters, respectively [29,60,61]. Therefore, it was rational to evaluate saxitoxin production potential by determining the presence of those genes in the cyanobacterial community.
Our results demonstrated that although the dominant cyanobacteria occurring in the study area, such as Anabaena and Aphanizomenon, are known as saxitoxin-producing cyanobacteria, the isolated strains did not carry the saxitoxin-producing sxtA and sxtG genes. Thus, it was inferred that the genes for saxitoxin production were derived from rare species rather than from the dominant ones. Indeed, BLAST searches indicated that the sxtA and sxtG genes identified from the cyanobacterial community were not derived from the dominant cyanobacteria. Moreover, the phylogenic analysis revealed that three sxtA genes clustered into one phylogenic branch with Scytonema spp., and the other eight genes were also distinct from those of the dominant genera. In this regard, the absence of reports of saxitoxin in the study area could be explained by the fact that Scytonema comprises only 0.77% of the cyanobacterial community. It is thus a reasonable inference that, even though Scytonema is able to synthesize saxitoxin, the amount of saxitoxin might be below the detection level of analytical methods such as HPLC. Thus, it can be assumed that the species composition of the cyanobacterial community is considerably more diverse than previously reported and that rare species overlooked in traditional studies could possess saxitoxin production potential. Although saxitoxin has not been a problem in this freshwater system to date, it remains imperative to characterize the cyanobacterial community and the species possessing saxitoxin-synthesizing genes. It is acknowledged that environmental conditions have been changing over time and that such changes could promote cyanobacterial blooms as well as an increase in species abundance. We therefore propose the necessity for continuous research and recommend obtaining nucleotide sequence information on saxitoxin-synthesizing genes in order to clarify the relationship between cyanobacterial blooms and saxitoxin production.
Most importantly, we emphasize that the application of metagenomics and molecular biological methods in cyanobacterial community analysis would play a critical role in gaining a better understanding of the possible links between cyanobacterial blooms and toxin production in freshwater systems.
We demonstrated the strength and capability of metagenome analysis, which is rapid and precise, in exploring the detailed structure of an entire cyanobacterial community composed of diverse and rare species before bloom development. Moreover, we propose that saxitoxin-producing potential could be assessed by detecting the presence of sxtA and sxtG genes in the community. Although further studies will be necessary to identify the species possessing saxitoxin-producing potential, the findings of the present study could provide a significant context for the management of harmful cyanobacterial blooms as early warning information. To the best of our knowledge, this is the first study to characterize the molecular diversity of a freshwater cyanobacterial community using metagenomics. We accordingly believe that the introduction of molecular biological methods along with metagenome analysis will play a critical role in this research field in the future.