SNPs in a genome region for trehalose catabolism and thiamine biosynthesis
The genome of the original CO-adapted culture of the thermophilic acetogen T. kivui was sequenced to identify single-nucleotide polymorphisms (SNPs). Putative mutations conferring CO-tolerance will be described elsewhere. The analysis revealed that over 80% of identified SNPs (263 out of 327 putative mutations) occurred in a single 15 kilobase region of the ~2.4 Megabase genome (Fig. 1A). A closer look revealed two clusters of mutations, separated by a gap of around seven kilobases that contained no mutations (Fig. 1B). The genome annotations for this region show that the mutations lie in genes annotated as involved in sugar metabolism and sugar uptake via ABC-type transporters, while the mutation-free gap constitutes a thiamine (vitamin B1) synthesis operon (Fig. 1C). The first cluster of mutations occurs near the end of a glycoside hydrolase (TKV_c08300) with 98.6% amino acid identity to a kojibiose phosphorylase from T. brockii. This gene is followed by a sugar ABC transporter encoded by TKV_c08310-c08330, the first two genes of which also contain mutations, and a β-phosphoglucomutase (TKV_c08340) with no mutations. After the thiamine operon is a second, much higher concentration of mutations in an ABC transport system (TKV_c08410-c08440), although these genes are annotated as truncated pseudo-genes in the reference genome and are therefore likely non-functional (Table S2). The high frequency of mutations continues in the glycoside hydrolase encoded by TKV_c08450, which shares 96% amino acid identity with a trehalose phosphorylase from Thermoanaerobacter brockii (Thebr_1548; α,α-trehalose phosphorylase, EC 2.4.1.64). Assuming the truncated pseudo-genes were once intact, a hypothetical “ancestral” trehalose metabolism for T. kivui can be reconstructed, in which trehalose enters the cytoplasm un-phosphorylated via the ABC transporter and is converted into glucose and β-glucose-1-phosphate by the trehalose phosphorylase.
The β-phosphoglucomutase may play a role in both kojibiose and trehalose metabolism, converting β-glucose-1-phosphate to glucose-6-P for entry into glycolysis (Fig. S1).
The mutation-free area between the two highly mutated regions contains a five-gene thiamine biosynthesis operon, thiSGHFE (TKV_c08340-c08390). The first four genes, thiSGHF, encode essential enzymes for synthesis of the thiazole moiety of thiamine (the thiI gene TKV_c15580 is located elsewhere), and the fifth is one of two copies of the thiamine phosphate synthase gene (thiE2, TKV_c08390), which is responsible for combining the thiazole and pyrimidine moieties into thiamine monophosphate. Strain X514 lacks the genes for thiazole biosynthesis, but contains homologues to T. kivui’s thiC (TKV_c18650) and thiDEM (TKV_c03000-3020) genes, which alone are adequate for synthesis of thiamine if thiazole can be salvaged from the medium (Fig. S2).
The unusual concentration of mutations in a small segment of the genome seems to indicate very strong selective pressure in this region. Interestingly, however, more than 60% of the putative SNP sites in this region (166 of the 263) are silent mutations that do not change the encoded amino acid (Fig. 1A, B). The high frequency of SNPs suggests strong selective pressure towards new functions, while the high proportion of silent mutations suggests the opposite: selective pressure to maintain current function.
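The proportions quoted above can be rechecked directly from the reported counts. The following is a minimal sketch using only numbers given in the text; the rounded genome and region sizes (2.4 Mb and 15 kb) are taken as exact for illustration, so the resulting enrichment factor is an approximation rather than a reported result.

```python
# Counts reported in the text.
total_snps = 327    # putative mutations across the whole genome
region_snps = 263   # putative mutations inside the ~15 kb region
silent_snps = 166   # silent sites among the region's putative SNPs

# Assumption: rounded sizes from the text are treated as exact.
genome_bp = 2_400_000
region_bp = 15_000


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


region_share = percent(region_snps, total_snps)   # share of SNPs in the region
silent_share = percent(silent_snps, region_snps)  # share of silent SNPs there

# SNP density (per kb) inside the region versus the rest of the genome.
density_region = region_snps / region_bp * 1000
density_rest = (total_snps - region_snps) / (genome_bp - region_bp) * 1000
enrichment = density_region / density_rest

print(f"SNPs in region: {region_share:.1f}%")
print(f"Silent SNPs in region: {silent_share:.1f}%")
print(f"Density enrichment: roughly {enrichment:.0f}-fold")
```

The density comparison makes the same point as the percentages: the region carries several hundred times more putative SNPs per kilobase than the remainder of the genome, which is hard to reconcile with independent point mutations.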
PCR and Sanger sequencing reveal genomic insertions/deletions
To confirm the results of SNP analysis, and in an attempt to reconcile these two contradictory lines of thought, portions of the mutated genome region were amplified by PCR and submitted for Sanger sequencing. The resulting PCR product lengths for the CO-strain were not consistent with the published genome (Fig. 2B) as, in addition to the SNPs detected by genome sequencing, several larger insertions and deletions were present. From the PCR, it is apparent that the CO-strain has a roughly 1 kb insertion in the second ABC-transporter cluster (primers BZ142-143) as well as a greater than 4 kb deletion in the thiamine biosynthesis operon (primers BZ144-145) (Fig. 2A).
Interestingly, Sanger sequencing of the anomalous-length PCR products revealed perfect sequence agreement with the genome of Thermoanaerobacter sp. strain X514, a related strain that differs significantly in its metabolic properties. Strain X514 produces ethanol from glucose by fermentation, as well as other alcohols from organic acids, but it does not contain the Wood-Ljungdahl pathway for CO2 reduction and fixation, and it therefore does not grow on H2 + CO2. To rule out the possibility of cross-contamination, genes specific to T. kivui (hycB3) and strain X514 (adhE) were amplified, confirming the absence of adhE in all T. kivui strains and the absence of hycB3 from strain X514 (Fig. 2C). The 16S sequence perfectly matched the T. kivui 16S rRNA gene, and contamination was also considered unlikely because the growth capabilities of the CO-strain differ significantly from those of strain X514 (see below). It is unclear exactly when contamination occurred in the original experiment. The original sample sent for SNP sequencing consisted of a CO-adapted community (generated by repeated passages, but never isolated), from which a single isolate was selected for generation of the ∆pyrE CO-strain used here. Since the HGT event is evident in the SNP sequence data, it was clearly present in at least a sub-population of the original adapted community.
An alignment of the two genomes revealed high synteny and sequence homology in this region (Fig. 2A), which would facilitate an HGT event. The most striking differences between the two genomes in this region are the absence of the entire thiamine biosynthesis operon in strain X514 and the absence of ~1 kb from parts of three ABC sugar import genes in T. kivui (Fig. 2A and Table S2). These large insertions/deletions do not show up in the SNP sequencing, which relies on mapping reads to the T. kivui reference genome, but more subtle differences between the genomes of the two species in the surrounding genes were detected and annotated as single nucleotide mutations. Therefore, rather than hundreds of independent SNP mutations in this region, a single HGT event occurred in which the CO-strain of T. kivui took up DNA of strain X514 from the laboratory environment and, via homologous recombination, replaced around 15 kilobases of its own genome (containing a thiamine biosynthesis operon) with an 11 kilobase DNA fragment containing a functional ABC sugar import system (Fig. 2). Horizontal gene transfer explains the high frequency of silent SNPs, since the proteins would have been evolving independently in both species while carrying out similar, conserved functions.
Since both strains are studied in our lab and grow under similar conditions (anaerobic, above 60 °C, in Thermoanaerobacter media), we routinely sequence the 16S rRNA gene of working stocks to exclude the possibility of cross-contamination. A more stringent screen for contamination involves PCR amplification of marker genes specific to suspected contaminants, such as the adhE of strain X514 used here (Fig. 2C), but it is important to note that only whole-genome sequencing is adequate to detect the type of “genome contamination” event we describe here. That T. kivui can take up and incorporate foreign DNA via natural competence and HGT is well known [10, 11, 12, 13, 30], but this is the first report of transformation occurring unintentionally.
Media components and growth behavior
Next, we evaluated the growth of the two WT Thermoanaerobacter species and the CO-adapted T. kivui strain on trehalose, and determined their requirement for thiamine. As stated in the original isolation report, T. kivui grows well in a mineral salts medium without addition of vitamins or yeast extract, so they are not included in the standard medium recommended by DSMZ (medium 171). Yeast extract is naturally rich in thiamine, while supplementation of defined medium with the vitamin mixture results in a final concentration of 5 µg/L thiamine. Strain X514 grew well in T. kivui complex media but not in defined media, supporting previous reports finding minimal growth (OD < 0.1) in media without yeast extract. The CO-adapted strain did not grow in media without vitamins (NV), but addition of 10 µg/L thiamine-HCl from a pure stock recovered normal growth without the need for any of the other vitamins from DSM141 (Fig. 3A). The medium used to adapt T. kivui to CO contained both yeast extract and vitamins, and the defined medium used in subsequent experiments also contained the vitamin solution, likely explaining why the thiamine requirement of the CO-strain was not noticed before.
Yeast extract is also one of the richest natural sources of trehalose, which can make up between 1 and 10% of yeast cell dry weight. WT T. kivui cells do not grow well on trehalose (Fig. 3B), likely due to the disruptions in the trehalose ABC transporter subunits. In contrast to the WT, the T. kivui CO-strain exhibits robust growth on trehalose, nearly as rapid as growth on glucose (Fig. 3B), as does strain X514. To confirm that the CO-strain also exhibited better growth on yeast extract, cells were cultured on NV medium supplemented with 20 g/L yeast extract (ten times the amount in normal complex medium) as the only substrate. WT cells reached a final OD600 of 0.201 with a doubling time of 2.2 h, while the CO-strain reached a final OD more than twice as high (0.47) with a slightly shorter doubling time of 1.9 h (Fig. 3C). Trehalose concentration in the medium was measured by HPLC before inoculation, revealing that the 20 g/L yeast extract contributed 2.3 ± 0.1 mM trehalose, which implies that the Roth yeast extract used here consists of approximately 4% trehalose by dry weight. Within 8 h the CO-strain had completely consumed the trehalose, while the WT strain reduced the trehalose concentration by less than 10% in 24 h. The limited growth observed in the WT presumably resulted mostly from peptides and other non-trehalose energy sources present in the yeast extract. When grown on trehalose in complex media (Fig. 3B), the CO-strain generated 5.7 ± 0.2 mM acetate per mM trehalose (2.9 acetate per glucose), which agrees well with literature values for homoacetogenic utilization of glucose by WT T. kivui, in the range of 2.3 to 3.0 acetate per glucose [7, 33].
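The trehalose content and acetate yield quoted above follow from simple unit conversions, sketched below. The molar mass of 342.3 g/mol is an assumption (anhydrous trehalose; the text does not state which form was used for the estimate), and the specific growth rates are derived here from the reported doubling times purely for illustration.

```python
import math

# Assumption: anhydrous trehalose, C12H22O11, about 342.3 g/mol.
TREHALOSE_MW_G_PER_MOL = 342.3

yeast_extract_g_per_l = 20.0   # yeast extract added to NV medium
trehalose_mM = 2.3             # measured by HPLC before inoculation

# Convert mM to g/L, then express as a share of yeast extract dry weight.
trehalose_g_per_l = trehalose_mM / 1000.0 * TREHALOSE_MW_G_PER_MOL
trehalose_pct = 100.0 * trehalose_g_per_l / yeast_extract_g_per_l

# Trehalose is a disaccharide of two glucose units.
acetate_per_trehalose = 5.7
acetate_per_glucose = acetate_per_trehalose / 2.0

# Specific growth rate mu = ln(2) / doubling time (per hour).
mu_wt = math.log(2) / 2.2
mu_co = math.log(2) / 1.9

print(f"Trehalose in yeast extract: {trehalose_pct:.1f}% of dry weight")
print(f"Acetate per glucose equivalent: {acetate_per_glucose:.2f}")
print(f"Growth rate WT: {mu_wt:.2f} 1/h, CO-strain: {mu_co:.2f} 1/h")
```

Under these assumptions the trehalose share comes out just under 4%, and the acetate yield falls inside the cited literature range of 2.3 to 3.0 acetate per glucose.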
These results confirm that the genotypes of the respective strains, as revealed by Sanger sequencing, lead to the expected phenotypes. The uptake and integration of a gene region encoding the intact ABC trehalose importer genes from strain X514 confer to the CO-strain the ability to utilize trehalose. The simultaneous deletion of the thiamine biosynthesis operon originally present adjacent to these genes also converts the CO-strain into a thiamine auxotroph. Presumably this gene transfer was facilitated by strong selective pressure, since once other components of the complex media had been consumed, residual trehalose would have represented a promising additional energy source for cells that initially could only poorly utilize CO. To confirm that this genotype could result from an HGT event, we attempted to transform T. kivui WT cells with genomic DNA from strain X514.
Replicating the HGT by transformation with genomic DNA
Due to the natural competence of T. kivui, strain X514 gDNA was simply added to media prior to inoculating a culture of WT T. kivui and incubating overnight. Since HGT is expected to be a rare event, glucose media supplemented with trehalose were used to encourage uptake. This was followed by plating on selective agar plates containing only trehalose. Colonies only formed from cultures incubated with added DNA. Transformation with two different PCR products was also successful: a short PCR product of 4 kb (Short PCR) containing only the ABC transporter genes from strain X514, and a longer 9 kb product (Long PCR) roughly matching the size of the full HGT region originally identified in the CO-strain (Fig. 4A). Subsequent PCR screening confirmed uptake of the intact ABC transporter genes in all transformed colonies, but colonies transformed with the Short PCR product retained the T. kivui WT thiamine biosynthesis operon (thi(+)), while those transformed with Long PCR did not (thi(-)), and transformation with gDNA resulted in a mixture of thi(+) and thi(-) colonies (Fig. 4B). Similar results were achieved by transforming the CO-strain with either T. kivui WT gDNA or a PCR product of the thiamine operon, and selecting on plates lacking thiamine. Only transformed CO-strain cells formed colonies on the selective plates, but all colonies picked from these plates showed mixed genotypes, suggesting that untransformed cells were cross-fed by thiamine produced by transformants (Fig. S3).
This is the first report of successfully transforming T. kivui with genomic DNA, although transformation with both integrating and replicating plasmids has already been demonstrated. Transformation efficiency cannot be accurately determined with this method because transformation occurs concurrently with cell growth, but a rough estimate of relative efficiencies is evident from the number of positive colonies counted on plates. Transformation with gDNA and Long PCR resulted in similar numbers of colonies per µg of the trehalose ABC gene operon, while Short PCR yielded at least two orders of magnitude fewer colonies (100-fold fewer), likely due to a short upstream flanking region of only 100 bp, while all other flanking regions were at least 1000 bp long.
Despite the low transformation efficiency, the fact that integration of the Short PCR product was sufficient to confer growth on trehalose indicates that the first ABC gene operon containing large numbers of SNPs (TKV_c08310-330, Fig. 1) is not responsible for trehalose uptake, since this region was not included in Short PCR. These genes may be responsible for growth on kojibiose (Fig. S1), since T. kivui cannot grow on any oligosaccharide tested so far, and kojibiose is an obscure sugar that is difficult to source in pure form. It is found in trace amounts in honey and some fermented products, so its biological relevance to T. kivui is unclear.
The ease with which transformed T. kivui cells containing the full ABC trehalose-import subunits could be isolated on trehalose-containing agar plates (Fig. 4) suggests that the inability to utilize trehalose acted as a strong selective pressure in trehalose-rich environments. Considering that the original HGT event leading to the CO-strain occurred in supposed mono-cultures in a laboratory setting, it is logical to assume that prior to its isolation, T. kivui’s genome regularly incorporated DNA from the many diverse species present in its natural environment of Lake Kivu. Therefore, we undertook a closer analysis of its genome to identify potential recent HGTs.
Evidence of other HGT events in the T. kivui genome
The wild-type T. kivui genome has a relatively low GC content of 35% when averaged across the entire genome. However, an analysis with GC profile reveals three GC-rich islands with greater than 50% GC content (Fig. S4). The first island contains genes encoding ribosomal RNA. Thermophiles are known to have higher GC content in their 16S rRNA genes, possibly to stabilize base-pairing in the double-stranded regions at high temperatures. The next two islands each contain a gene annotated as the aerobic B12 biosynthesis gene cobN, as well as ABC-transport components, some annotated as nickel transporters. These two islands are also identified by IslandViewer on the basis of codon usage bias. A BLAST search reveals that no other Thermoanaerobacter species contains homologues to these regions, and in fact the only results (nucleotide sequence identity 84–89%) are to genes from two moderately thermophilic sulfate-reducing bacteria, Desulfotomaculum kuznetsovii and Candidatus Desulforudis audaxviator. Both species were isolated from environments between 2000 and 3000 m below the surface. Lake Kivu is unusually deep, approaching 500 m in places, so it is not surprising that T. kivui encountered relatives of these deep subsurface bacteria, or at least their DNA, in its natural environment prior to isolation. The exact functions of the genes in these clusters remain unclear, since the presence of cobN, typically considered part of the aerobic B12 biosynthesis pathway, was surprising in the obligately anaerobic Desulfotomaculum kuznetsovii. The cobN gene was also identified as a likely instance of horizontal gene transfer from an archaeon in Candidatus Desulforudis audaxviator.
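The idea of scanning a genome for GC-rich islands can be illustrated with a plain sliding-window calculation. The sketch below is not the method used by the GC profile tool or by IslandViewer; the function names, window size, and step are hypothetical choices for illustration, and only the 50% threshold comes from the text.

```python
from typing import Iterator, List, Tuple


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq: str, window: int, step: int) -> Iterator[Tuple[int, float]]:
    """Yield (window start, GC fraction) for each full window."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_fraction(seq[start:start + window])


def gc_rich_islands(seq: str, window: int, step: int,
                    threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Merge consecutive windows whose GC fraction exceeds the threshold.

    Returns half-open (start, end) coordinates of each merged island.
    """
    islands: List[Tuple[int, int]] = []
    island_start = None
    island_end = None
    for start, gc in sliding_gc(seq, window, step):
        if gc > threshold:
            if island_start is None:
                island_start = start
            island_end = start + window
        elif island_start is not None:
            islands.append((island_start, island_end))
            island_start = None
    if island_start is not None:
        islands.append((island_start, island_end))
    return islands


# Toy sequence: AT-rich background with one 40 bp GC-rich block.
toy = "AT" * 50 + "GC" * 20 + "AT" * 50
islands = gc_rich_islands(toy, window=20, step=10)
print(islands)
```

On a real genome the window would be on the order of kilobases rather than 20 bp, and candidate islands would still need independent support, such as the codon usage bias and BLAST evidence described above.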
The importance of HGT in microbial evolution remains controversial, but in certain circumstances it leads to much more rapid development of new capabilities than gradual random mutation. In this context, it is interesting that T. kivui, despite sharing high 16S sequence similarity with other Thermoanaerobacter species, is the only acetogenic isolate of the genus so far. This is in contrast to another thermophilic acetogenic genus, Moorella, for which multiple closely related acetogenic strains have been isolated independently. Since most acetogens encode the genes of the Wood-Ljungdahl pathway (WLP) in a single large gene cluster, and T. kivui’s WLP genes are even more concentrated than most, it seems plausible that its acetogenic capabilities could result from an HGT event. The original environmental samples were stored at room temperature for 5 years before T. kivui was isolated, giving plenty of time for HGT to occur between species present in the samples.