- Research article
- Open Access
The TyrA family of aromatic-pathway dehydrogenases in phylogenetic context
© Song et al; licensee BioMed Central Ltd. 2005
- Received: 19 February 2005
- Accepted: 12 May 2005
- Published: 12 May 2005
The TyrA protein family includes members that catalyze two dehydrogenase reactions in distinct pathways leading to L-tyrosine and a third reaction that is not part of tyrosine biosynthesis. Family members share a catalytic core region of about 30 kDa, where inhibitors operate competitively by acting as substrate mimics. This protein family typifies many that are challenging for bioinformatic analysis because of relatively modest sequence conservation and small size.
Phylogenetic relationships of TyrA domains were evaluated in the context of combinatorial patterns of specificity for the two substrates, as well as the presence or absence of a variety of fusions. An interactive tool is provided for prediction of substrate specificity. Interactive alignments for a suite of catalytic-core TyrA domains of differing specificity are also provided to facilitate phylogenetic analysis. The membership of tyrA in apparent operons (or supraoperons) was examined, and patterns of conserved synteny in relationship to organismal positions on the 16S rRNA tree were ascertained for members of the domain Bacteria. A number of aromatic-pathway genes (hisHb, aroF, aroQ) have fused with tyrA, and it must be more than coincidental that the free-standing counterparts of all of the latter fused genes exhibit a distinct trace of syntenic association.
We propose that the ancestral TyrA dehydrogenase had broad specificity for both the cyclohexadienyl and pyridine nucleotide substrates. Indeed, TyrA proteins of this type persist today, but it is also common to find instances of narrowed substrate specificities, as well as of acquisition via gene fusion of additional catalytic domains or regulatory domains. In some clades a qualitative change associated with either narrowed substrate specificity or gene fusion has produced an evolutionary "jump" in the vertical genealogy of TyrA homologs. The evolutionary history of gene organizations that include tyrA can be deduced in genome assemblages of sufficiently close relatives, the most fruitful opportunities currently being in the Proteobacteria. The evolution of TyrA proteins within the broader context of how their regulation evolved and to what extent TyrA co-evolved with other genes as common members of aromatic-pathway regulons is now feasible as an emerging topic of ongoing inquiry.
- Lateral Gene Transfer
- Pyridine Nucleotide
- Congruency Group
- Genome Representation
- Profile HMMs
Table 1. Abbreviations used to designate substrate specificities of tyrA/TyrA homologs
Abbreviation: Description of specificity
tyrAx: Specificity for cyclohexadienyl substrate is unknown
tyrAc: Broad-specificity cyclohexadienyl dehydrogenase (CDH)
tyrAp: Narrow-specificity prephenate dehydrogenase (PDH)
tyrAc_Δ: Broad-specificity cyclohexadienyl dehydrogenase having catalytic-core indels in correlation with an extra-core extension
tyrAa: Narrow-specificity arogenate dehydrogenase (ADH)
NADtyrAa: TyrA homolog is AGN-specific and NAD+-specific
NADPtyrAa: TyrA homolog is AGN-specific and NADP+-specific
NAD(P)tyrAa: TyrA homolog is AGN-specific but utilizes either NAD+ or NADP+
xtyrAx: Specificity for both the cyclohexadienyl and pyridine nucleotide substrates is unknown
Key to organism acronyms
Acidithiobacillus ferrooxidans ATCC 23270
Acinetobacter sp. ADP1
Actinobacillus actinomycetemcomitans HK1651
Actinomyces naeslundii MG1
Agrobacterium tumefaciens strain C58
Anabaena sp. PCC 7120
Archaeoglobus fulgidus DSM 4304
Bacillus anthracis str. A2012
Bacillus cereus ATCC 14579
Bacillus halodurans C-125
Bacillus thuringiensis israelensis
Bifidobacterium longum NCC2705
Burkholderia cepacia J2315
Burkholderia fungorum LB400
Burkholderia mallei ATCC 23344
Burkholderia pseudomallei K96243
Chromobacterium violaceum ATCC 12472
Corynebacterium diphtheriae NCTC 13129
Corynebacterium efficiens YS-314
Corynebacterium glutamicum ATCC 13032
Desulfovibrio desulfuricans G20
Desulfovibrio vulgaris subsp. vulgaris strain Hildenborough
Enterococcus faecalis V583
Erwinia carotovora subsp. atroseptica SCRI1043
Escherichia coli K12
Geobacter metallireducens GS-15
Geobacter sulfurreducens PCA
Gloeobacter violaceus PCC 7421
Haemophilus influenzae Rd KW20
Helicobacter hepaticus ATCC 51449
Helicobacter pylori 26695
Klebsiella pneumoniae subsp. pneumoniae MGH 78578
Leifsonia xyli subsp. xyli strain CTCB07
Listeria innocua Clip 11262
Listeria monocytogenes EGD-e
Lotus corniculatus var. japonicus
Methanopyrus kandleri AV19
Methanosarcina barkeri strain Fusaro
Methanothermobacter thermoautotrophicus strain Delta H
Microbulbifer degradans 2–40
Mycobacterium avium subsp. paratuberculosis strain k10
Mycobacterium bovis TrEMBL
Mycobacterium leprae TN
Mycobacterium tuberculosis CDC1551
Myxococcus xanthus DK 1622
Neisseria gonorrhoeae FA 1090
Nitrosomonas europaea ATCC 19718
Nocardia farcinica IFM 10152
Nostoc punctiforme PCC73102
Novosphingobium aromaticivorans DSM 12444
Oceanobacillus iheyensis HTE831
Oryza sativa ssp. japonica
Pasteurella multocida subsp. multocida strain Pm70
Photorhabdus luminescens subsp. laumondii TT01
Prochlorococcus marinus subsp. pastoris strain CCMP1378
Prochlorococcus marinus MIT9313
Propionibacterium acnes KPA171202
Pseudomonas aeruginosa PAO1
Pseudomonas fluorescens PfO-1
Pseudomonas putida KT2440
Ralstonia eutropha JMP134
Ralstonia solanacearum GMI1000
Rhodobacter sphaeroides 2.4.1
Rhodopseudomonas palustris CGA009
Rubrobacter xylanophilus DSM 9941
Salmonella typhimurium LT2
Shewanella oneidensis MR-1
Staphylococcus aureus subsp. aureus MW2
Streptococcus gordonii str. Challis
Streptococcus pneumoniae R6
Streptomyces avermitilis MA-4680
Streptomyces coelicolor A3(2)
Streptomyces roseochromogenes subsp. oscitans
Streptomyces toyocaensis strain 7
Sulfolobus solfataricus P2
Sulfolobus tokodaii strain 7
Synechococcus sp. WH8102
Synechococcus sp. PCC7002
Synechocystis sp. PCC6803
Thermosynechococcus elongatus BP-1
Trichodesmium erythraeum IMS101
Tropheryma whipplei TW08/27
Vibrio cholerae O1 biovar eltor strain N16961
Vibrio parahaemolyticus RIMD 2210633
Wolinella succinogenes DSM 1740
Xanthomonas campestris pv. campestris strain ATCC 33913
Xylella fastidiosa 9a5c
Yersinia enterocolitica (type O:8)
Zymomonas mobilis subsp. mobilis ZM4
The TyrA family is typical of many protein families in that its members have a relatively small core domain that is not highly conserved. As such, substantial challenges for bioinformatic analysis are posed. Here we have not only carried out a labor-intensive manual analysis, but we have also developed tools intended to facilitate and refine follow-on studies of this protein family in the genome era. The approaches implemented in this study with the TYR segment of aromatic biosynthesis hopefully can serve as a template for forthcoming integrant analyses of other pathway segments of aromatic biosynthesis, and indeed for metabolic subsystems in general.
This manuscript contains three broad sections. First, the biochemical and enzymological complexity of the TyrA protein family is presented in terms of the diversity that exists in nature with respect to substrate specificity and the association of the core domain with other catalytic or regulatory domains. Secondly, the genomic colinear organization of tyrA genes with other genes is evaluated, i.e., tyrA is considered in its syntenic context. Thirdly, tyrA is evaluated in its context of regulation. These three sections are tied together in a framework of evolutionary perspective.
Background of TyrA diversity
As an illustration of the detailed information that follows, note that the TyrA sequences from the beta Proteobacteria at five o'clock in Fig. 2 form a cohesive cluster (termed a 'congruency group'). In this clade there exists a proposed ancestral background of broad specificity where either AGN or PPA in combination with either NAD+ or NADP+ could be used. This profile of broad substrate use (which can be denoted as NAD(P)TyrAc; see Table 1) generally persists in the beta Proteobacteria. From this background, narrowed specificity for the AGN/NADP+ couple emerged once in the lineage represented by Nitrosomonas europaea (Fig. 2; dark blue line), narrowed specificity for NAD+ emerged once in species of Neisseria (orange line), and fusion of tyrAc with aroF (which encodes enolpyruvylshikimate-3-P synthase, the sixth enzyme in the common pathway of aromatic biosynthesis; see [5, 6] for nomenclature used) occurred recently within the Burkholderia lineage. These character-state transformations appear to occur with relative ease, and independent emergence of the same character states can be seen elsewhere in the tree.
Phylogenetically congruent TyrA groupings
Multiple alignments of catalytic-core domains
A phylogenetic tree is only as good as the input alignment. An optimal multiple alignment of TyrA homologs requires a trimmed set of sequences that corresponds to the catalytic-core domain. Alignment of sequences with non-homologous N-terminal fusions (such as with chorismate mutase• (AroQ•), HisHb•, or plant transit peptides•; note the convention of using a bullet to indicate the fusion point of one domain with another domain) will make them appear to be more closely related than they actually are because residues in the non-homologous N-terminal regions find matches at random. Likewise, those TyrA sequences with C-terminal fusions (such as with •AroF, •ACT, or •REG) will appear to be anomalously close to one another. Even enzyme proteins that have much greater sequence conservation and amino-acid lengths than TyrA proteins cannot reasonably be expected to yield a protein tree that would be congruent over an extensive phylogenetic range with the overall 16S rRNA tree. However, if genome representation is sufficiently dense within a range of closely related organisms, 16S rRNA congruency with a given protein can be expected within that range of organisms provided that (i) the particular functional role has been retained and (ii) lateral gene transfer has not occurred to obscure the relationship. This expectation follows from the outcome of a detailed analysis of tryptophan-pathway proteins in Bacteria [7, 8].
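The trimming step described above can be sketched as follows. This is a minimal illustration, assuming the catalytic-core boundaries of each sequence are already known (for example, from manual curation). The function name `trim_to_core`, the coordinate table, and the toy sequences are hypothetical and are not taken from the study.

```python
from typing import Dict, Tuple


def trim_to_core(
    sequences: Dict[str, str],
    core_bounds: Dict[str, Tuple[int, int]],
) -> Dict[str, str]:
    """Return only the catalytic-core region of each sequence.

    core_bounds maps a sequence name to 1-based, inclusive (start, end)
    coordinates of the core domain within the full-length protein.
    """
    trimmed = {}
    for name, seq in sequences.items():
        start, end = core_bounds[name]
        if not (1 <= start <= end <= len(seq)):
            raise ValueError(
                f"core bounds {start}-{end} fall outside {name} (length {len(seq)})"
            )
        # Convert 1-based inclusive coordinates to a Python slice.
        trimmed[name] = seq[start - 1:end]
    return trimmed


# Toy data (invented): one protein with a 5-residue N-terminal fusion,
# one free-standing protein consisting of the core only.
toy_sequences = {
    "fused_example": "MMMMM" + "GLIGGSLA",
    "free_example": "GLIGGSIA",
}
toy_bounds = {"fused_example": (6, 13), "free_example": (1, 8)}

cores = trim_to_core(toy_sequences, toy_bounds)
```

After trimming, both toy sequences have the same length and cover only the homologous region, so residues from a fused domain can no longer contribute chance matches to the alignment.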
Congruency within major clades
TyrA sequences from higher-plant and yeast Eukarya form cohesive clusters. Genome representation among Archaea is still relatively limited. (Fig. 2 does reveal, however, that genes encoding TyrA proteins in Archaea have experienced various catalytic- and regulatory-domain fusions at least as frequently as those in Bacteria). Eventual expansion of both the tryptophan-pathway and tyrosine-pathway analyses to Archaea should be quite interesting.
The great majority of TyrA sequences available are from Bacteria, and one can see (by inspection of the major clades supported by high bootstrap values in Fig. 2) a qualitatively apparent congruence of TyrA-tree sub-sections with 16S rRNA expectations of vertical genealogy. Thus, all cyanobacteria possess a NADPTyrAa type of TyrA enzyme, and this is a very cohesive grouping. A few of the larger cyanobacterial genomes have a co-existing second enzyme of the TyrAc_Δ type (discussed in detail later). The low-GC gram-positive bacteria (Bacillus/Staphylococcus/Enterococcus/Listeria) exhibit the NADTyrAp pattern of specificity and also possess a C-terminal domain (ACT) of allosteric regulation. It is interesting that the TyrAp•ACT proteins of the Streptococcus lineage (at eight o'clock in Fig. 2) differ from the main low-GC clade in possessing broad specificity for pyridine nucleotides (as indicated with black line color). The most parsimonious evolutionary conclusion would be that in the low-GC gram-positive grouping, acquisition of the ACT domain and narrowed specificity for prephenate preceded narrowed specificity for NAD+. Thus, the latter event occurred after divergence of the Streptococcus lineage from the remainder of the low-GC clade. Members of the subclass taxon Actinobacteridae (mostly actinomycetes) possess AGN-specific TyrA enzymes (light blue fill color in Fig. 2), but they separate into two distinct groups that correlate either with broad specificity for pyridine nucleotides (Actinobacteridae_1) or a NAD+-specific pattern (Actinobacteridae_2). The Proteobacteria are discussed immediately below.
By far the greatest genomic density available is for Proteobacteria, the group of Bacteria that includes purple bacteria and their relatives. The various divisions of Proteobacteria, as currently named, lack hierarchical equivalence. For example, the epsilon and delta divisions branch from much deeper positions on the phylogenetic tree than do the alpha Proteobacteria. As genome representation expands for epsilon and delta Proteobacteria, it is probable that these will subdivide to newly named groupings of approximate hierarchical equivalence with alpha Proteobacteria. The most recently diverged Proteobacteria are the beta and gamma divisions. From the combination of our previous analysis of tryptophan biosynthesis [7, 8], TYR biosynthesis (this paper), and other segments of aromatic biosynthesis (unpublished data), we find it useful to separate "upper-gamma" Proteobacteria from "lower-gamma" Proteobacteria (an "enteric lineage" with Shewanella oneidensis as approximately the most divergent member). This separation is because the beta Proteobacteria and the upper-gamma Proteobacteria exhibit a smooth continuity of relatively few evolutionary events with respect to aromatic biosynthesis, in striking contrast to extraordinarily dynamic evolutionary events in the lower-gamma Proteobacteria. As a consequence, the lower-gamma Proteobacteria are much more distinct (in terms of aromatic biosynthesis) from the upper-gamma Proteobacteria than the upper-gamma are from the beta Proteobacteria.
Figure 2 shows that alpha, beta and epsilon divisions of Proteobacteria form phylogenetically coherent clusters with respect to their TyrA proteins. Although delta Proteobacteria fall into two well-separated groupings denoted as Delta_1 and Delta_2, this should not be surprising since these groupings diverge at a deep level on the 16S rRNA tree where genome representation is poor. In addition, the Myxococcus xanthus TyrA sequence, currently an orphan (three o'clock in Fig. 2), represents a third divergent lineage in delta Proteobacteria. In contrast to delta Proteobacteria, genomic representation for the gamma Proteobacteria is relatively good. Nevertheless their TyrA sequences separate into several well-spaced groupings, albeit for entirely different reasons. In this case, the split seen between two clades of these fairly close relatives (upper-gamma and lower-gamma) is attributed to particularly dynamic evolutionary events compressed into a relatively short time span in the lower-gamma Proteobacteria. (We refer to such a dynamic divergence as an evolutionary jump; see the next section.) Note that the allocation of upper-gamma and lower-gamma Proteobacteria to separate TyrA congruency groups is not the same as being incongruent. It is quite possible that as new genomes come on line, new and intermediate TyrA sequences may result in the merging of the foregoing two congruency groups (currently tyrosine congruency group 1 (TyrCG-1) and tyrosine congruency group 2 (TyrCG-2)).
Comparison of tryptophan and tyrosine congruency groups
Although the true extent of lateral gene transfer (LGT) at present must be described as intensely controversial, there is little doubt that any given organism is mosaic with respect to some unknown fraction of its gene repertoire. Our "accounting" system for keeping track of proteins that are faithful to the vertical genealogy is to formulate congruency groupings that are defined by congruence of given protein-tree clusters to a section of the 16S rRNA tree. Ultimately this information will reveal which organisms are "pure" with respect to the vertical inheritance of a given pathway or pathway segment. Our congruency groups are intended to be fluid, in that with the continued availability of new sequences, a previous orphan sequence may very well become the seed for a new congruency group. On the other hand, previously separate congruency groups have the potential to merge. (See Methods for more information.) The present tyrosine congruency groups are listed on the AroPath website.
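One simplified way to express the congruence criterion is to ask whether the organisms in a protein-tree cluster together make up a complete clade of the reference (16S rRNA) tree. The sketch below assumes that simplification; it is not the procedure used in the study, which is described in Methods. The nested-tuple tree and the leaf labels A to E are hypothetical.

```python
def leaves(tree):
    """Return the set of leaf labels under a nested-tuple tree node."""
    if isinstance(tree, str):
        return frozenset([tree])
    result = frozenset()
    for child in tree:
        result |= leaves(child)
    return result


def forms_clade(tree, members):
    """True if `members` is exactly the leaf set of some node in `tree`."""
    members = frozenset(members)
    if leaves(tree) == members:
        return True
    if isinstance(tree, str):
        return False
    return any(forms_clade(child, members) for child in tree)


# Hypothetical reference tree: ((A, B), C) is one section, (D, E) another.
reference_tree = ((("A", "B"), "C"), ("D", "E"))

cluster_congruent = forms_clade(reference_tree, {"A", "B", "C"})
cluster_mixed = forms_clade(reference_tree, {"A", "D"})
```

In this toy case the cluster {A, B, C} matches a section of the reference tree, whereas {A, D} mixes two sections and would be a candidate for closer inspection (for example, for lateral gene transfer).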
Seven tryptophan congruency groups in Bacteria were previously formulated based upon the correspondence of cohesive clusters in trees of Trp-protein concatenates with sections of 16S rRNA trees. The information input for formulation of tryptophan congruency groups is of greater quality than for tyrosine congruency groups because seven-protein concatenates could be used for the former. On the other hand, the broad information input supporting tyrosine congruency groups in this study is more comprehensive because of greater genome availability. Tryptophan congruency group 1 (TrpCG-1) corresponds perfectly with the organisms represented in TyrCG-1, these being the lower-gamma Proteobacteria (enteric lineage). The upper-gamma Proteobacteria (TyrCG-2) and the beta Proteobacteria (tyrosine congruency group 3; TyrCG-3) are represented by different tyrosine congruency groups. In contrast, the membership of tryptophan congruency group 2 (TrpCG-2) includes both the upper-gamma Proteobacteria and the beta Proteobacteria. The latter merging probably reflects the advantage conferred by the greater information content of the concatenated sequences used to define tryptophan congruency groups.
Species of Xylella and Xanthomonas are usually referred to as gamma Proteobacteria. They probably represent an outlying deeply branching lineage, although trees based on concatenated strings of proteins or 16S rRNA position them with beta Proteobacteria. In any event, Trp-protein concatenate trees placed Xylella and Xanthomonas within TrpCG-2, which contains both upper-gamma and beta Proteobacteria. In contrast, the TyrA domains from Xylella and Xanthomonas were well separated (at about two o'clock in Fig. 2) from those of any other organism. This might simply be due to the limited resolving power of a single protein in combination with too few close relatives. (Note that single Trp-protein trees sometimes failed to achieve the congruency-group placements that were resolved by seven-protein Trp concatenates.) An additional clue may be relevant. The TyrA proteins from the Xylella/Xanthomonas genera possess an ACT domain, which has not been observed in any other proteobacterial TyrA proteins thus far. In view of this, origin by LGT seems to be a distinct possibility, but with the important caveat that no likely genome donors are yet obvious on the criterion of sequence similarity. Perhaps more likely is the following explanation, which postulates a basis for accelerated divergence. The TyrA domains of Xanthomonas/Xylella proteins have an indel structuring (insertions and/or deletions) that places them within the TyrAc_Δ specificity subclass (see below). We suggest (see below) that such indel structuring reflects interaction of the core TyrA domain with an extra-domain extension. Thus, selection for amino acid changes accomplishing a new domain-domain interaction could account for accelerated divergence of the Xanthomonas/Xylella sequences on the TyrA tree (Fig. 2).
Cohesive tryptophan congruency groups of the alpha Proteobacteria (tryptophan congruency group 3; TrpCG-3) and the cyanobacteria (tryptophan congruency group 4; TrpCG-4) match up well with the corresponding tyrosine congruency groups (tyrosine congruency group 4 (TyrCG-4) and tyrosine congruency group 8 (TyrCG-8), respectively). The TyrA proteins of epsilon Proteobacteria define a cohesive tyrosine congruency group (tyrosine congruency group 5; TyrCG-5), whereas the Trp-protein concatenates of epsilon Proteobacteria did not exhibit a coherent congruency group, due at least in part to LGT. The delta Proteobacteria separate into two distinct tyrosine congruency groups: Delta_1 (tyrosine congruency group 6; TyrCG-6) and Delta_2 (tyrosine congruency group 7; TyrCG-7), as shown in Fig. 2. It is likely that corresponding tryptophan congruency groups exist (work in progress), but at the time of the Xie et al. study only Trp-pathway protein concatenates for Desulfovibrio vulgaris (Delta_2) and Geobacter sulfurreducens (Delta_1) were available, and they were provisionally listed as "orphans". In the present work TyrA sequences from Deinococcus radiodurans and Thermus thermophilus are the sole members of tyrosine congruency group 12 (TyrCG-12). At the time of the Trp-pathway work, the genome of Thermus was unavailable and the Deinococcus concatenate was listed as an orphan. It is expected that the Deinococcus and Thermus concatenates will now seed a new tryptophan congruency group.
Whereas tryptophan congruency group 5 (TrpCG-5) is defined by cohesive concatenates from actinomycete bacteria, the TyrA proteins from the same organisms separated into two distinct congruency groups. It is intriguing that this partitioning into two congruency groups correlates with narrowed specificity for NAD+ (indicating an evolutionary jump) in one of the groups. The latter group (tyrosine congruency group 11; TyrCG-11) is denoted Actinobacteridae_2 in Fig. 2, whereas tyrosine congruency group 10 (TyrCG-10) is displayed as Actinobacteridae_1. The opposite scenario, whereby a single tyrosine congruency group corresponds to split tryptophan congruency groups, applies in the case of low-GC gram-positive bacteria. Whereas TyrA proteins form a single congruency group in these organisms (tyrosine congruency group 9; TyrCG-9), a small cluster of Trp-pathway concatenates from Bacillus subtilis, B. stearothermophilus, and B. halodurans (tryptophan congruency group 6; TrpCG-6) separates distinctly from the remaining organisms (tryptophan congruency group 7; TrpCG-7). The latter evolutionary jump reflects a dynamic scenario of tryptophan-pathway evolutionary events that include loss of one gene from the trp operon, insertion of the trp operon into a 6-gene aro operon to produce a supraoperon, and acquisition of the TRAP (tryptophan-activated RNA-binding protein) mechanism of regulation by an RNA-binding protein.
Tyrosine congruency groups and tryptophan congruency groups are maintained and updated at the AroPath website.
Distribution in nature of TyrA specificity subclasses for the cyclohexadienyl substrate
Four qualitative classes of specificity for the cyclohexadienyl substrate populate the TyrA superfamily of homologs (Fig. 1). These include PPA-specific (TyrAp), AGN-specific (TyrAa), the broad-specificity cyclohexadienyl (TyrAc) dehydrogenases and a fourth class represented by an enzyme of antibiotic biosynthesis (PapC) that converts 4-amino-4-deoxy-prephenate to 4-amino-phenylpyruvate. Representatives of each specificity class have been studied at molecular and genetic levels. TyrA family members sharing a given substrate specificity do not necessarily cluster tightly together, and assignment of substrate specificity to experimentally uncharacterized TyrA homologs is uncertain unless they exhibit very high amino acid identities with experimentally characterized TyrA proteins. In some cases we do not accept older literature reports without more recent verification. For example, the yeast Saccharomyces cerevisiae TyrAx was characterized as a TyrAp protein long before it was recognized that PPA preparations were often contaminated with AGN (an unknown compound at that time).
Curated TyrA amino-acid sequence files at AroPath
Complete TyrA sequences
Pyridine-nucleotide discriminator segments
Cyclohexadienyl-substrate core segments
Pseudogene TyrA sequences
Many TyrA proteins (at least in the domain Bacteria) are of the TyrAc subclass. The cyclohexadienyl dehydrogenases commonly accept PPA or AGN about equally well, but various degrees of preference for one of the alternative substrates are also observed. Detailed molecular and genetic studies of TyrAc proteins from Pseudomonas aeruginosa, P. stutzeri, and Zymomonas mobilis have been carried out. The distinct variety of TyrAc mentioned above, which has been denoted TyrAc_Δ, exhibits a number of indels (mostly deletions) within the catalytic-core region when its consensus sequence is aligned with those of the other TyrA classes (Fig. 3). It is intriguing that the indel structuring of TyrAc_Δ correlates with the presence of an extra-core extension. This extension is often AroQ, but not always. For example, in the genera Nostoc and Anabaena it appears to be a degraded, catalytically inactive AroQ, whereas in Xanthomonas or Xylella it is an ACT domain. Since the one large clade of TyrAc_Δ proteins that has so far been studied prefers PPA over AGN by well over an order of magnitude, an evolutionary relationship of the indels to the narrowing of substrate preference for PPA might exist. If so, however, this cannot be the only molecular change to accomplish favored utilization of PPA over AGN, since a number of TyrAc proteins (e.g., TyrAc from Neisseria gonorrhoeae) also exhibit an overwhelming preference for PPA, even though this class lacks the indel structuring.
The TyrAa class of specificity is currently represented by higher plants and at least three widely spaced bacterial lineages: cyanobacteria, actinomycetes and Nitrosomonas europaea. This discontinuity of phylogenetic spacing is consistent with a fundamental evolutionary scenario whereby the ancestral dehydrogenase was a broad-specificity TyrAc that evolved narrowed substrate specificity (to yield either TyrAp or TyrAa) independently on multiple occasions in modern lineages. The ubiquitous presence of TyrAa in cyanobacteria has been heavily documented. Nitrosomonas europaea currently (as of March 2005) has no sufficiently close genome relatives that have been sequenced. The first BLAST hit returned from a NADPTyrAa query from N. europaea (March 2005) is the protein from Ralstonia solanacearum (48% identity), which is known to possess broad specificity for both of its substrates (i.e., NAD(P)TyrAc) [21, 22].
A similar relationship of phylogenetic separation associated with narrowed specificity for pyridine-nucleotide substrate exists for the low-GC gram-positive bacteria (eight o'clock in Fig. 2). Here the major clade is NAD+-specific, whereas species of Streptococcus have retained the ancestral breadth of specificity for NAD+/NADP+. Alignments of the pyridine-nucleotide discriminator regions of these latter two groups match up extremely well with the upper alignment of Fig. 4 where residue 32 of the Wierenga fingerprint is 'D' and with the lower alignment where residue 32 is 'N' (data not shown).
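The parsimony reasoning applied to the low-GC gram-positive clade can be illustrated with the Fitch small-parsimony procedure. This is only a sketch under stated assumptions: the four-leaf binary topology, the broad-specificity outgroup, and the state labels are hypothetical simplifications, not the tree of Fig. 2.

```python
def fitch(tree, states):
    """Bottom-up Fitch pass on a binary nested-tuple tree.

    Returns (candidate_states_at_this_node, minimum_number_of_changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no change needed here.
        return shared, left_cost + right_cost
    # Children disagree: keep both options and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical topology: an outgroup, Streptococcus, and two members of
# the main low-GC clade. Specificities are for the pyridine nucleotide.
toy_tree = ("outgroup", ("Streptococcus", ("Bacillus", "Listeria")))
toy_states = {
    "outgroup": "NAD(P)",       # assumed broad, as proposed for the ancestor
    "Streptococcus": "NAD(P)",
    "Bacillus": "NAD",
    "Listeria": "NAD",
}

root_states, changes = fitch(toy_tree, toy_states)
```

Under these assumptions the root state is broad specificity and a single change suffices, which places the narrowing to NAD+ on the branch leading to the main clade, after the Streptococcus lineage diverged.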
Recently, a plant tyrAa from Arabidopsis thaliana has been reported to consist of two near-identical domains that are fused. The gene encoding this 68-kDa protein co-exists in the genome with a single-domain paralog that encodes a predicted 37-kDa protein, somewhat larger than the catalytic-core domain of TyrAa from Synechocystis. TyrAa (known to be located in higher-plant chloroplasts) may have originated from cyanobacteria via endosymbiosis. If so, however, the plant TyrAa sequences have diverged sufficiently that they no longer share a specific phylogenetic grouping with the cyanobacterial TyrA sequences. This is in marked contrast with the phylogenetic coherence of the tryptophan synthase subunit proteins (TrpEa and TrpEb_1) from cyanobacteria and higher plants.
TyrAp is conspicuously represented by a large clade of low-GC gram-positive organisms, of which Bacillus subtilis TyrAp is the best studied. Thus far, all TyrAp proteins are fused to a C-terminal ACT domain, and therefore no "minimal" TyrAp proteins that consist only of a catalytic core are available as yet. At the level of physiological function, it should be added that those cyclohexadienyl dehydrogenases that exhibit a very substantial preference for prephenate are for all practical purposes prephenate dehydrogenases, even though they carry a formal designation of TyrAc or TyrAc_Δ. These include most, if not all, of the AroQ•TyrAc_Δ enzymes of the enteric lineage (lower-gamma in Fig. 2). The TyrAc protein from Neisseria gonorrhoeae (and by inference, the closely related N. meningitidis) is also a well-studied example of overwhelming preference for prephenate.
PapC participates in the formation of p-aminophenylalanine as a step in the synthesis of at least two antibiotics (see Fig. 1). It is so far represented by only a few sequences. The PapC specificity is strongly indicated by absence of the otherwise invariant residue H197 (E. coli numbering) that is associated with recognition of a 4-hydroxy moiety in the cyclohexadienyl substrates of the aforementioned dehydrogenases. This moiety, of course, differs in being a 4-amino substituent in the substrate used by the PapC dehydrogenase (Fig. 1). See Bonner et al. for a more detailed overview.
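Checking a query sequence for the residue equivalent to H197 requires mapping the E. coli residue number to a column of a pairwise or multiple alignment. The following is a minimal sketch, assuming gaps are written as "-" and that the residue number of the first aligned reference residue is known. The function names, the `ref_start` parameter, and the toy alignment are hypothetical.

```python
def alignment_column(aligned_ref: str, residue_number: int, ref_start: int = 1) -> int:
    """Return the 0-based alignment column holding a given reference residue.

    residue_number uses the reference protein's own numbering; ref_start is
    the number of the first reference residue present in the alignment.
    """
    current = ref_start - 1
    for column, symbol in enumerate(aligned_ref):
        if symbol != "-":
            current += 1
            if current == residue_number:
                return column
    raise ValueError(f"residue {residue_number} is not covered by the alignment")


def residue_in_query(aligned_ref: str, aligned_query: str,
                     residue_number: int, ref_start: int = 1) -> str:
    """Return the query symbol aligned to the given reference residue."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    return aligned_query[alignment_column(aligned_ref, residue_number, ref_start)]


# Toy alignment (invented): the reference fragment starts at residue 195,
# so its residues are G195, A196, H197, L198, V199.
toy_reference = "GA-HLV"
toy_dehydrogenase = "GAEHLV"   # retains the histidine
toy_papc_like = "GA-NLV"       # histidine replaced

keeps_histidine = residue_in_query(toy_reference, toy_dehydrogenase, 197, ref_start=195)
lacks_histidine = residue_in_query(toy_reference, toy_papc_like, 197, ref_start=195)
```

A query that returns anything other than "H" at this column would be a candidate for the PapC class, subject to confirmation by the other evidence discussed in the text.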
The "redundant" trp/aro supraoperon of Nostoc/Anabaena
All cyanobacteria possess a highly conserved tyrAa gene, as well as a complete suite of tryptophan-pathway genes that are dispersed (unlinked) in the genome. The large-genome cyanobacterial lineage consisting of the Nostoc and Anabaena genera possesses in addition a unique and seemingly redundant trp/aro supraoperon consisting of most of the aforementioned genes. These include a second tyrA gene (curated as tyrAc_Δ), six trp-pathway genes (all except trpC), and genes encoding the first two common-pathway steps of aromatic amino acid biosynthesis. All of these supraoperonic genes appear to be redundant in that they are represented by homologs (paralogs or xenologs) elsewhere in the Nostoc and Anabaena genomes at scattered loci. The closest BLAST hits for the Nostoc/Anabaena TyrAc_Δ proteins are not the co-existing TyrAa homologs present in their own genomes (and universally present in cyanobacteria). Rather, the closest BLAST hits are to the TyrAc_Δ domains of the AroQ•TyrAc_Δ fusions in the enteric lineage. Since the enteric proteins are NAD+-specific and strongly prefer prephenate, it is likely that the "extra" cyanobacterial proteins are also NADTyrAc_Δ proteins. Indeed, this would be consistent with enzymological evidence provided in the literature for both Nostoc and Anabaena.
Concerning the evolutionary origin of the redundant block of linked genes found in the Nostoc and Anabaena genomes, at least two possibilities await further illumination. (i) These genes might have been acquired by a common ancestor of Nostoc and Anabaena via lateral gene transfer. This is consistent with the observation that biosynthetic-pathway operons are generally absent in the cyanobacteria, and all of the linked genes could have been recruited in a single event. However, at present no candidate donor genomes are known that possess this supraoperon combination of genes. If the TyrAc_Δ proteins of Nostoc/Anabaena and the enteric lineage are possibly related by LGT, it is of interest that the N-terminal extension of TyrAc_Δ from Nostoc/Anabaena resembles a degraded AroQ domain of AroQ•TyrAc_Δ from enterics. In both cases the N-terminal residues may compensate for deletions within the catalytic-core region of TyrAc_Δ. Subsequently, AroQ function may have evolved in one lineage (or have been lost in the other). This possibility of domain-domain interaction is consistent with the established interdependence of the AroQ• and •TyrAc_Δ domains from E. coli. (ii) Alternatively, tyrAa and tyrAc_Δ (and the duplicated trp and aro genes present in the supraoperon) might be ancient paralogs within the cyanobacterial lineage. If so, at a time following divergence of heterocystous cyanobacteria from the unicellular cyanobacteria, the latter may have lost the clustered block of aromatic-pathway genes in a single event of reductive evolution. The supraoperonic genes might be related to a specialized function associated with "developmental" physiological processes that typify the filamentous, heterocyst-forming cyanobacteria. This might be reminiscent of the nature of the phenazine-pigment operon of Pseudomonas aeruginosa. Here unique phenazine-pathway genes are combined with a redundant gene of common-pathway aromatic biosynthesis and two redundant (and fused) genes of tryptophan biosynthesis. This accomplishes the linkage of specific phenazine biosynthesis with a supply of 2-amino-2-deoxy-isochorismate, the branchpoint of divergence toward phenazine and tryptophan [33, 34]. This complexity in which multiple paralogs are differentially deployed is consistent with the large genome sizes of Anabaena (7.2 Mb) and Nostoc (9.2 Mb), compared with the much smaller unicellular genomes of Prochlorococcus marinus (1.7 Mb), Synechococcus sp. WH8102 (2.4 Mb), and Synechocystis sp. PCC6803 (3.6 Mb).
Profile hidden Markov models (HMMs) to distinguish specificity subfamilies for cyclohexadienyl substrate
The limited information thus far available about specific molecular roles of particular TyrA amino acid residues has been summarized recently . The catalytic-core domains of known TyrAa, TyrAp, TyrAc, and TyrAc_Δ proteins were selected from our files of TyrA catalytic-core domains , and a new subset of sequences was prepared that lacked the pyridine nucleotide discriminator segment, a glycine-rich βαβ region at the N terminus. Although the glycine-rich βαβ region is not the only segment that contacts pyridine nucleotide substrate, it is the sole region that discriminates between NAD+ and NADP+. The resulting trimmed sequence is defined as the "cyclohexadienyl-substrate core segment". No distinctive motifs were found that, in isolation, would be a clear predictive indicator of specificity for cyclohexadienyl substrate. Similar substrate specificity profiles probably can be dictated by alternative patterns of interplay between different residue combinations.
Because of the rapid accumulation of incorrectly annotated TyrA entries in GenBank and other databases, partly due to the complications of misnaming that are associated with gene fusions and partly to a failure to assimilate published substrate specificities, BLAST does not return reliable annotations with respect to substrate specificity. Even the HMMs used in Pfam and InterPro were not helpful in this case because the HMM deployed in those databases was broadly but incorrectly defined as 'prephenate dehydrogenase (NADP+) activity' for all TyrA dehydrogenases (accession number PF02153 in Pfam and entry IPR003099 in InterPro). However, profile HMMs are known to be well suited for modeling a particular sequence family of interest and for finding additional remote homologs. They are reputed to outperform methods that rely only upon pair-wise alignment of homologous residues in predicting protein function. Therefore, profile HMMs were constructed from our multiple sequence alignments of each curated TyrA specificity subfamily, using the HMMER package.
The profile HMMs obtained are only tentatively reliable for prediction of substrate specificity. To facilitate ongoing and future functional annotations, we have made our profile HMMs available as a working resource for "specificity prediction" at AroPath . Users can match query sequences against the four profile HMMs to predict the subfamily to which a query sequence belongs. It is anticipated that future experimental data relevant to substrate specificity will facilitate refinement of the prediction program. For example, at present the program predicts that the TyrA sequences from organisms such as Helicobacter pylori and Saccharomyces cerevisiae belong to the TyrAa grouping, and it will be interesting to see whether this holds up to experimental confirmation. It is additionally fascinating that (i) the dehydrogenase from Archaeoglobus fulgidus is predicted to belong to the indel-containing TyrAc_Δ grouping and (ii) that it possesses a possible cooperatively interacting extra-core domain extension (an AroQ fusion), just as occurs for the large clade of enteric bacteria. If this is relevant, it is even more fascinating that the Archaeoglobus aroQ is fused at the C-terminal side of tyrA c_Δ, rather than at the N terminus as is the case with enteric bacteria.
Users at AroPath can enter query sequences into interactive multiple sequence alignments with any of the four sets of "cyclohexadienyl-substrate core segment" sequences that were used to train the profile HMMs. An effort is under way to extend the predictor capability to include the pyridine nucleotide substrate as well. One can also align query sequences of interest with either the complete set of curator-approved TyrA catalytic-core sequences or with any desired subset of seed sequences.
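The matching step described above, in which a query is scored against the four subfamily profile HMMs and assigned to the best-matching subfamily, can be sketched as follows. This is a minimal illustration and not the AroPath implementation: the function name `assign_subfamily`, the `min_margin` threshold, and the example bit scores are all hypothetical, and the scores are assumed to have been obtained beforehand (for example, from a HMMER search against each profile).

```python
from typing import Dict, Optional, Tuple


def assign_subfamily(
    bit_scores: Dict[str, float], min_margin: float = 10.0
) -> Tuple[Optional[str], float]:
    """Pick the best-scoring subfamily profile for one query sequence.

    bit_scores maps a subfamily name (e.g. "TyrAa") to the score of the query
    against that subfamily's profile HMM. min_margin is a hypothetical cutoff:
    if the best score does not beat the runner-up by at least this much, the
    call is reported as ambiguous (None) rather than forced.
    """
    ranked = sorted(bit_scores.items(), key=lambda kv: kv[1], reverse=True)
    best_name, best_score = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else float("-inf")
    margin = best_score - runner_up
    if margin < min_margin:
        return None, margin
    return best_name, margin


# Illustrative (made-up) scores for a single query against the four profiles.
example_scores = {"TyrAa": 250.0, "TyrAp": 120.0, "TyrAc": 180.0, "TyrAc_delta": 90.0}
prediction, margin = assign_subfamily(example_scores)
```

Reporting an ambiguous result when two profiles score similarly reflects the caution expressed above that the profile HMMs are only tentatively reliable; the margin value itself would need to be calibrated against sequences of known specificity.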
The catalytic-core domain of TyrA proteins
The simplest set of fully functional TyrA proteins consists only of the catalytic-core domain (about 180 amino acids)  and includes the well-characterized TyrA c enzymes from Neisseria gonorrhoeae  and Zymomonas mobilis , as well as TyrA a from a cyanobacterium . In addition the catalytic-core domain from Pseudomonas stutzeri has been engineered for study from a tyrA c •aroF fusion . These model core proteins are roughly as divergent from one another on the TyrA protein tree as are the organisms that contain them (Fig. 2). In view of the possibility raised in this paper about inter-domain interactions, the single-domain TyrA proteins are undoubtedly the simplest sources for study of the fundamental properties of the catalytic-core domain.
Cyclohexadienyl substrates and inhibitors of TyrA proteins possess identical sidechains
In contrast to the TyrA c proteins just described, the Z. mobilis TyrA c is totally insensitive to inhibition by either 4-hydroxyphenylpyruvate or TYR. Since both of these compounds lack a 1-carboxy moiety, it is reasonable to assume that the 1-carboxy substituent present in the two substrates accepted may be required for binding at the catalytic center. Thus, although TyrAc from Z. mobilis will accept the same two substrates as does the TyrA c from P. stutzeri, the greatly different inhibition results suggest that Z. mobilis obeys more stringent rules for binding at the catalytic site (i.e., a ring carboxylate must be present).
Synechocystis sp. and Arabidopsis thaliana TyrAa proteins accept as a substrate only AGN, which has an alanyl sidechain. The ring-carboxylate moiety is evidently not absolutely required for binding since these TyrAa proteins can recognize TYR (alanyl sidechain) as a competitive inhibitor. In contrast, since N. europaea TyrAa is not inhibited by TYR, it resembles the Z. mobilis TyrAc in the putative requirement for a 1-carboxy substituent to secure successful binding at the catalytic site.
In summary, some TyrA proteins probably exercise greater discrimination in their requirement for a 1-carboxy moiety for binding at the catalytic site, and these are insensitive to competitive inhibition by the aromatic reaction products (which lack the 1-carboxy substituent). Other TyrA proteins that require the 1-carboxy moiety for the fundamental catalytic process, but presumably do not require it for binding, will recognize product inhibitors that have the same sidechain as any substrate recognized.
Specificity for the pyridine nucleotide co-substrate within the TyrA superfamily
NAD+ differs from NADP+ only in that NADP+ has a phosphate group esterified at the 2'-position of adenosine ribose. Therefore, the ability of a dehydrogenase to discriminate between the two lies in the particular enzyme region that contacts the ribose moiety. The glycine-rich region that constitutes the ADP-binding βαβ fold is well known to be this point of contact. This Rossmann βαβ fold is invariably positioned at the extreme N terminus of TyrA proteins, and the typical GXGXXG motif is almost always observed, as illustrated in Fig. 4. This region is helpful for assessment of probable specificities for pyridine nucleotide. One can be fairly sure that TyrA proteins possessing D-32 (E. coli numbering, reference ) are NAD+-specific. A negatively charged residue (D or E) at position 32 is critical for hydrogen bonding to the diol group of the ribose near the adenine moiety in NAD+-specific enzymes. NADP+-specific dehydrogenases cannot tolerate a negatively charged residue at position 32. TyrA proteins that possess an asparagine residue in the corresponding position appear to be broadly specific for both NAD+ and NADP+ as discussed above. No clearcut motif has been identified for NADP+-specific TyrA proteins, although at least one positively charged residue is expected in the region just beyond residue 32. By elimination, those sequences lacking D-32 or N-32 are strong candidates for NADP+ specificity. As with the cyclohexadienyl co-substrate, narrowed specificity for NAD+ (or NADP+) also seems to have occurred independently on many occasions (some examples given earlier).
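The rules of thumb in the preceding paragraph can be written down as a small decision function. The sketch below is illustrative only: the function names and the example peptide strings are hypothetical, and the caller is assumed to have already identified, from a multiple sequence alignment, the residue that corresponds to position 32 in E. coli numbering. The output is a tentative expectation, not an experimentally validated assignment.

```python
import re

# Glycine-rich ADP-binding motif GXGXXG, where X is any residue.
GXGXXG = re.compile(r"G.G..G")


def has_gxgxxg(n_terminal_segment: str) -> bool:
    """Return True if the segment contains a GXGXXG motif."""
    return GXGXXG.search(n_terminal_segment.upper()) is not None


def predict_cofactor(residue_32: str) -> str:
    """Tentative pyridine nucleotide specificity from the position-32 residue.

    residue_32 is the one-letter code of the residue aligned with position 32
    (E. coli numbering); obtaining that alignment is outside this sketch.
    """
    residue = residue_32.upper()
    if residue in ("D", "E"):
        # Negative charge hydrogen-bonds the ribose diol; not tolerated by
        # NADP+-specific enzymes.
        return "NAD+-specific (likely)"
    if residue == "N":
        return "broad specificity (NAD+ or NADP+)"
    # By elimination; no clear-cut NADP+ motif is known.
    return "NADP+-specific (candidate)"


# Hypothetical N-terminal peptide used only to exercise the motif search.
toy_segment = "MKIGIIGLGLMGGS"
motif_found = has_gxgxxg(toy_segment)
```

Because narrowed cofactor specificity appears to have arisen independently many times, a function of this kind can only flag candidates for enzymological confirmation.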
The absolute specificity of TyrAp proteins for PPA tends to be accompanied by absolute specificity for NAD+, as illustrated by the large Bacillus/Staphylococcus/Listeria/Enterococcus clade at eight o'clock in Fig. 2. However, it is interesting that species of Streptococcus have retained the presumed ancestral breadth of specificity for the pyridine nucleotide substrate. The opposite relationship, whereby absolute specificity for AGN tends to be accompanied by absolute specificity for NADP+, is also observed. Here three of the four TyrAa lineages described earlier exhibit this pattern. Exceptions, though, are the aforementioned TyrAa proteins of Actinobacteridae_1 which accept either NAD+ or NADP+, as well as the TyrAa proteins of the sister Actinobacteridae_2 which are specialized for NAD+ [42, 43].
The TyrAc proteins of most complete-genome organisms thus far have happened to be NAD+-specific, and this has been the property of the most rigorously characterized ones (from Z. mobilis, P. stutzeri, and P. aeruginosa). However, it is clear from extensive enzymological surveys  that TyrAc proteins having broad specificity for NAD+/NADP+ are common, examples including species of Ralstonia and Burkholderia. The spectrum of variation that can exist, even within a clade of organisms that are of fairly close relationship, is illustrated by one striking example. In the pseudomonad clade marked by a common tyrA•aroF fusion, the Acinetobacter sp. TyrAc is NADP+-specific , whereas the sister subclade Pseudomonas/Azotobacter exhibits NAD+ specificity (Fig. 2). Here the entire clade marked by a common ancestral fusion shares approximately the same profile of cyclohexadienyl substrate preference, but cofactor specificity has been narrowed in opposite directions.
We had previously suggested that there might be a general structural relationship of substrate pairing that tends to favor interaction between PPA and NAD+, on the one hand, and, on the other hand, between the greater positive charge of AGN and the greater negative charge of NADP+. These relationships may indeed be favored, but it increasingly appears that any combination can occur.
Beyond the catalytic core: allosteric domains
Various lineages have acquired an amino acid binding domain known as the ACT domain (pfam01842), which is known to bind a variety of amino acids, thus functioning as an allosteric domain for many proteins including phosphoglycerate dehydrogenase, aspartokinase, acetolactate synthase, phenylalanine hydroxylase, prephenate dehydratase and formyltetrahydrofolate deformylase. Recruitment of this domain by fusion with tyrA p appears to have occurred in a common ancestor of the large Bacillus/Staphylococcus/Listeria/Enterococcus/Streptococcus assemblage (Fig. 2). It is interesting that B. subtilis also possesses a gene encoding a free-standing ACT domain in its genome (incorrectly annotated as pheB). An additional fusion of genes encoding an ACT domain and tyrA (that arose independently, judging from the widely spaced tree positions) occurred in the common ancestor of Xanthomonas and Xylella. Actinobacteria usually possess a C-terminal extension that probably functions as an allosteric domain. The extension possessed by the Actinobacteridae_2 assemblage, which includes Streptomyces coelicolor and its relatives, appears to be an ACT domain. On the other hand, it is not at all clear that the C-terminal extension of the Actinobacteridae_1 assemblage is an ACT domain. This difference, in addition to the differing specificities for pyridine nucleotide substrate, may have contributed to the overall TyrAa divergence observed between the two Actinobacteridae groups. There is no correlation between presence of the ACT domain and specificity for cyclohexadienyl substrate since TyrA p from the Bacillus clade is PPA-specific, Xanthomonas/Xylella TyrAc is broadly specific, and Streptomyces TyrAa is AGN-specific.
B. subtilis, which belongs to the large clade having an ACT domain as a carboxy extension, has been extensively characterized . 4-Hydroxyphenylpyruvate is an effective competitive inhibitor, as would be consistent with our proposed effects at the catalytic core for a PPA-specific enzyme. However, TYR, phenylalanine (PHE) and tryptophan were also inhibitors. The violation of the rule that the latter three amino acid inhibitors would not be expected to bind the catalytic core region (because they have alanyl sidechains even though the substrate-binding site only recognizes the pyruvyl sidechain of prephenate) and the finding that some of these were not competitive inhibitors can now be accounted for by the presence of the allosteric ACT domain. A carboxy extension shared by a number of Archaea (denoted 'REG' in Fig. 2) is presumably a regulatory domain as well. This is consistent with the recent result of Porat et al.  that not only 4-hydroxyphenylpyruvate, but also TYR, inhibited prephenate dehydrogenase activity of Methanococcus maripaludis.
The tyrA gene is a popular fusion partner
Fusion with aroQ
tyrA may be fused with a number of other catalytic domains, each of them relevant to aromatic biosynthesis (Fig. 2). aroQ (encoding chorismate mutase) is frequently fused with a number of other aromatic-pathway genes. The lower-gamma Proteobacteria (enteric lineage) located at twelve o'clock in Fig. 2 possess an aroQ•tyrA c_Δ fusion. The fusion physically links chorismate mutase (which forms PPA) with TyrAc_Δ (which utilizes PPA). The two protein domains of AroQ•TyrAc_Δ may have co-evolved to produce cooperative protein-protein interactions since physical separation of the domains yielded relatively low levels of both activities in E. coli. Substantial comparative work shows that the aroQ•tyrA c_Δ fusion has been stably maintained throughout the entire enteric lineage. Exceptions in some genomes lacking this fusion altogether can be attributed to reductive evolutionary loss in pathogens (e.g., Haemophilus ducreyi) or endosymbionts (e.g., Buchnera aphidicola). An independent aroQ•tyrA fusion was generated in the common ancestor of Sulfolobus solfataricus and S. tokodaii (Fig. 2). Since the TyrA domain of Sulfolobus species lacks the indel structure of the TyrAc_Δ class, it would be interesting to see whether physical separation of the two domains would yield evidence of independent function, in contrast to the results mentioned just above for E. coli.
Fusion with aroF
Secondly, tyrA c has been fused with aroF on at least two separate occasions in Bacteria. (The aroF gene encodes enolpyruvylshikimate-3-P synthase, the sixth enzyme in the common pathway of aromatic biosynthesis; see [5, 6] for nomenclature used.) One clade includes members of the upper-gamma Proteobacteria: P. aeruginosa, P. syringae, P. putida, P. stutzeri, P. fluorescens and Azotobacter vinelandii. It is interesting that P. syringae has experienced a deletion of about 200 residues at the N-terminal region of the AroF domain. This has been coupled with the acquisition of a stand-alone aroF gene that is absent in other members of the clade. Interestingly, the latter AroF shows high identity only with AroF from Agrobacterium tumefaciens, an alpha proteobacterium. The A. tumefaciens aroF, in turn, is unique compared to its α-subdivision relatives, both in having divergent sequence and in being unlinked to cmk and rpsA. Thus, it seems likely that the incongruence of AroF belonging to both P. syringae and A. tumefaciens reflects acquisition via LGT from some as yet unknown source. The disruption of the fused aroF domain in P. syringae is an unusual instance where the catalytic function of one fusion domain has become discarded while the function of the second domain has been retained. It is interesting to consider the possibility that the truncated remnant of the aroF fusion domain might be exploitable for use as an innovative source of a new regulatory domain. An additional fusion of tyrA with aroF has occurred independently within the beta Proteobacteria in the common ancestor of Burkholderia pseudomallei and B. mallei. This has been very recent since the closely related B. fungorum and B. cepacia organisms lack the fusion.
It has been suggested that presence of a given fusion may be useful for sorting out clades that diverged from a common ancestor, independent of other methods . Different fusions offer the power of discriminating clades at various hierarchical levels, i.e., nested clades discriminated by nested gene fusions. The tyrA•aroF fusion occurred in the common ancestor of the clade that includes the upper-gamma Proteobacteria shown in Fig. 2. One can reasonably assume that relatively close upper-gamma organisms lacking the tyrA•aroF fusion diverged from the common ancestor of the fusion clade prior to the fusion event. Such would appear to be the case, for example, with Acidithiobacillus ferrooxidans, an outlying member of the upper-gamma Proteobacteria that lacks the fusion. It is reasonable to conclude that the fusion event must have pre-dated the differential specialization for the pyridine nucleotide cosubstrate that distinguishes Acinetobacter sp. (NADP+-specific) from the large grouping of pseudomonads that are NAD+-specific.
Fusion with hisH b
Thirdly, a single organism, Rhodobacter sphaeroides, possesses a hisH b•tyrA fusion that must have occurred very recently. hisH b encodes an aromatic aminotransferase that is closely related to (or sometimes even synonymous with) imidazole acetol phosphate aminotransferase . The hisH b /tyrA/aroF linkage group is part of a supraoperon in some gram-negative bacteria in which a relatively conserved, yet frequently shuffled gene order is observed [5, 6]. Hence, it is reasonable to assume that at the time just prior to fusion, hisH b, tyrA and aroF were adjacent. Note that among the fusions currently known, hisH b and aroF are fused to the N-terminal and C-terminal ends of tyrA, respectively. It would be interesting to know the substrate specificity of the R. sphaeroides TyrA domain. If it is AGN-specific the significance of hisH b presumably would be to transaminate PPA to form AGN, the substrate used by TyrAa (see Fig. 1). On the other hand, if the dehydrogenase is PPA-specific, the significance of the HisHb domain would be to transaminate the product of the TyrAp reaction. If the enzyme is a TyrAc enzyme (as is probable), then HisHb likely is competent to catalyze either of the foregoing reactions.
Fusion with ACT
The widespread ACT regulatory domain appears to have been acquired by independent fusions at least three separate times, judging from the widely separated lineages that possess a TyrA•ACT fusion (Fig. 2). Xie et al. initially noted homologous domains positioned at the N terminus of mammalian phenylalanine hydroxylase and at the C terminus of most microbial prephenate dehydratases. This domain is responsible for phenylalanine-mediated activation and phenylalanine-mediated inhibition of the hydroxylase and dehydratase enzymes, respectively. This domain was later named the ACT domain and shown to be a widely distributed domain family that shares a conserved overall fold. Members of the ACT-domain family possess a wide variety of different ligand-binding capabilities. For example, the ACT domain of 3-phosphoglycerate dehydrogenase binds L-serine as an allosteric inhibitor.
Fusion with REG
Another putative regulatory domain fused to tyrA (denoted tyrA•REG) is thus far restricted to some of the Archaea. This domain is a predicted regulatory domain, as described in COG4937.
A novel 4-domain fusion
Archaeoglobus fulgidus exhibits a striking four-domain fusion consisting of three catalytic domains and a regulatory ACT domain (TyrA•AroQ•PheA•ACT). The TyrA domain is predicted to belong to the TyrAc_Δ class when used as a query input into the AroPath Specificity Predictor Tool . We speculated earlier that the •AroQ fusion domain of Archaeoglobus may exercise cooperative interactions with TyrAc_Δ, as appears to occur between the AroQ•TyrAc_Δ domains of E. coli and its relatives.
tyrA in its syntenic context
Although the genes of prokaryotes have clearly been subject to frequent scrambling, some gene-gene associations persist more tenaciously than others. Xie et al. [5, 6] asserted that one such ancestral gene string that has resisted scrambling forces is hisH b > tyrA > aroF. As suggested above, contemporary gene fusions can serve as frozen-in-time indicators of ancient gene organizations that were later obscured by gene-scrambling events. Another gene string that is often within the syntenic region of hisH b, tyrA, and aroF is cmk > rpsA. Gene synteny in prokaryotes has not been easily recognized because substantial manual scrutiny in combination with a sufficient density of genomic representation on a given portion of the phylogenetic tree is necessary to detect patterns of synteny that are camouflaged by frequent scrambling events (inversion, deletion and transposition).
When the various examples of hisH b > tyrA > aroF linkage are mapped on a 16S rRNA tree, they first appear in gram-positive bacteria. In Bacillus and related organisms (such as Listeria), the hisH b > tyrA > aroF unit is associated with a large ancestral operon consisting of aroG > aroB > aroH > hisH b > tyrA p > aroF. Bacillus additionally possesses the cmk > rpsA unit, albeit in a separate location. Interestingly, in one narrow subclade (B. subtilis, B. halodurans and B. stearothermophilus) the trp operon has been inserted between aroH and hisH b to yield a supraoperon that has been fully characterized as a complex functional unit . See Xie et al.  for a full presentation of evolutionary interpretation relevant to the latter. Though highly scrambled, a pattern of association of pheA with hisH b > tyrA >aroF is suggested by linkage patterns seen at the hierarchical level of Cytophaga and Bacteroides (Fig. 5). aroQ became associated with pheA through gene fusion as early as the divergence of the Spirochaetes to yield an aroQ•pheA>tyrA>aroF>cmk>rpsA linkage unit (Leptospira interrogans in Fig. 5). The aroQ•pheA gene associated with tyrA and aroF in Clostridium difficile appears to have arisen from a distinctly different fusion event than that present in delta, epsilon, beta and upper-gamma Proteobacteria or from that present in lower-gamma Proteobacteria (based upon analysis of inter-domain linker regions; unpublished data).
The ancestor of alpha Proteobacteria has lost the aroQ•pheA fusion, and a stand-alone pheA is consistently observed. Members of this group are quite uniform in the stable possession of hisH b > tyrA and aroF > cmk > rpsA as two separated linkage groups. The beta Proteobacteria are represented by members that have the gene organization: serC > aroQ•pheA > hisH b > tyrA > aroF > cmk > rpsA. This is also seen in the members of the upper-gamma Proteobacteria.
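Checking whether a linkage group such as hisH b > tyrA > aroF is intact in a given genome amounts to looking for a contiguous run of genes in either orientation. A minimal sketch is given below. The function name `contains_linkage` and the plain-text gene labels are hypothetical conveniences, the beta-proteobacterial order is taken from the text above, and the alpha-proteobacterial list uses a placeholder (`orfX`) to stand for whatever separates the two linkage groups; it is not a real gene order.

```python
from typing import Sequence


def contains_linkage(gene_order: Sequence[str], linkage: Sequence[str]) -> bool:
    """Return True if `linkage` occurs as a contiguous run in `gene_order`.

    Both orientations are checked, because the same gene string read from the
    opposite strand appears in reverse order.
    """
    genes = list(gene_order)
    target = list(linkage)
    n = len(target)
    for candidate in (target, target[::-1]):
        for start in range(len(genes) - n + 1):
            if genes[start:start + n] == candidate:
                return True
    return False


# Gene order described in the text for beta Proteobacteria.
beta_order = ["serC", "aroQ.pheA", "hisHb", "tyrA", "aroF", "cmk", "rpsA"]

# Toy alpha-proteobacterial layout: two separated linkage groups, with the
# placeholder "orfX" marking the (unspecified) intervening region.
alpha_toy = ["hisHb", "tyrA", "orfX", "aroF", "cmk", "rpsA"]

intact_in_beta = contains_linkage(beta_order, ["hisHb", "tyrA", "aroF"])
intact_in_alpha = contains_linkage(alpha_toy, ["hisHb", "tyrA", "aroF"])
```

An exact-match test of this kind deliberately ignores insertions, so a supraoperon interrupted by a single inserted gene would be reported as not intact; the manual scrutiny described above remains necessary for such cases.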
Figure 5 includes organisms that illustrate the traces of synteny that can be detected in Bacteria where overall genome representation is just barely adequate. The following two figures illustrate how syntenic patterns of more resolution and refinement become evident with denser genome representation.
Zooming in on syntenic contexts of proteobacteria
Beta proteobacteria and upper-gamma proteobacteria
Key to evolutionary events asserted in Figure 6
Evolutionary event(s) proposed
Dispersal of aroQ•pheA > hisH b > tyrA away from one another and away from gyrA > serC and from cmk > rpsA > himD; inversion of aroF with respect to cmk.
Complete dispersal of all nine genes originally in the gyrA/himD linkage group.
Insertion of serA after serC a ; separation of tyrA and aroF to yield the separated 6-gene unit and 4-gene unit shown.
Expulsion of hisH b from the genome; insertion of 'ORF' after serC.
Fusion of tyrA with aroF.
Loss of hisH b from genome.
Insertion of serA after serC a ; insertion of aroA Iα after hisH b.
Translocation of hisH b and tyrA to other regions, leaving two separated 3-gene units.
Fusion of tyrA with aroF.
Loss of hisH b.
N-terminal deletion of •aroF domain, and acquisition of new aroF gene (probable LGT).
Separation of cmk > rpsA > himD from aroQ•pheA > tyrA•aroF.
Insertion of 4 unknown genes between gyrA and serC in opposite orientation and separation of gyrA > ORF > ORF > ORF > serC from aroQ•pheA > tyrA•aroF.
Loss of himD; translocation of serC away from gyrA and aroQ•pheA.
The gamma Proteobacteria have separated into two distinctly different synteny patterns. The lower-gamma Proteobacteria have undergone marked syntenic change (see below). The assemblage portrayed between Acidithiobacillus and Microbulbifer in the lower part of Fig. 6 (termed the upper-gamma Proteobacteria) exhibits a strong overall syntenic resemblance of supraoperon genes to that of the beta Proteobacteria. Acidithiobacillus possesses a near-intact ancestral supraoperon, differing only in having two insertions: one gene encoding 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase between hisH b and tyrA, and the other being the insertion of serA between serC and aroQ•pheA. Pseudomonas aeruginosa and P. stutzeri have also retained nearly intact ancestral supraoperons, differing only in the fusion of tyrA and aroF. The serC > aroQ•pheA > hisH b > tyrA•aroF > cmk > rpsA supraoperon has been studied in P. stutzeri [5, 6]. The tyrA•aroF fusion occurred in the common ancestor of the clade shown between Azotobacter and Microbulbifer in Fig. 6. The supraoperons of P. syringae, P. fluorescens and P. putida lack hisH b. P. syringae exhibits a recent N-terminal truncation of the aroF domain, coupled with acquisition elsewhere in the genome of a free-standing aroF that is not phylogenetically congruent (probably of LGT origin). Acinetobacter sp. and Microbulbifer degradans possess an aroQ•pheA > tyrA•aroF unit that has become dissociated from serC at one end and from cmk on the other end. In Xylella and Xanthomonas, hisH b has been deleted from the genome and tyrA has been transposed away from serC > aroQ•pheA > aroF. The latter unit has been transposed away from gyrA, the ancestral flanking gene. On the other hand, cmk > rpsA has remained next to himD, the gene usually flanking rpsA.
The enteric lineage
Key to evolutionary events asserted in Figure 7
Evolutionary events proposed
Escape of aroQ•pheA and tyrA from the ancestral gyrA > serC > aroQ•pheA > hisH b > tyrA > aroF > cmk > rpsA > himD supraoperon. Origin of an aroQ•tyrA fusion. Origin of the aroA Iα_Y > aroQ•tyrA operon. Addition of tyrR. Addition of third aroA Iα species: aroA Iα_F.
Fusion of aroQ•pheA with aroA Iβ pseudogene of unknown origin. Replacement of hisH b by aspC duplicate linked with three ORFs.
Dissociation of gyrA and serC.
Removal of all genes intervening between aroQ•pheA and aroQ•tyrA.
Dissociation of aroF from both serC and cmk > rpsA > himD. Insertion of trpR within the intervening region between aroQ•pheA and aroQ•tyrA.
Dissociation of serC > hisH b > aroF from cmk > rpsA > himD.
Loss of aroA Iα_Y from tyr operon.
aroF becomes dissociated from hisH b, and aroA Iα_Y is removed from the tyrA operon.
ORF > gyrA is inserted after aroF.
aroQ•tyrA becomes a pseudogene.
hisH b is lost.
himD is lost.
cmk, himD and aroA Iα_Y > aroQ•tyrA are lost.
aroF, himD, aroQ•pheA, and aroA Iα_Y > aroQ•tyrA are lost.
All intervening genes between aroQ•pheA and aroQ•tyrA are eliminated.
pheA domain of aroQ•pheA becomes a pseudogene.
Insertion of ycaL between aroF and cmk.
Insertion of ORF between aroF and ycaL.
Insertion of ORF between aroQ•pheA and aroQ•tyrA.
The dissociation of tyrA c_Δ from the serC/rpsA linkage group correlates with the fusion of aroQ with tyrA c_Δ . The aroQ•pheA fusion has also escaped from the serC/rpsA linkage grouping and has become linked with the newly emerged tyr operon. Some sort of duplication and recombinational event between aroQ•pheA and tyrA c_Δ may have led to the creation of aroQ•tyrA c_Δ since the AroQ•PheA proteins of lower-gamma Proteobacteria are distinct from AroQ•PheA proteins of other Proteobacteria with respect to the inter-domain linker length and the indel content (data not shown).
Although it usually is absent from the lower-gamma Proteobacteria, HisHb has persisted as the broad-specificity aromatic aminotransferase in the Pasteurella/Haemophilus grouping where two hisH paralogs are generally present, one of narrow specificity (denoted hisH n) being within the histidine operon. The aspC gene next to aroF in Shewanella is a paralog that probably functions as an aromatic aminotransferase, suggestive of the situation in the E. coli grouping where tyrB is a close paralog of aspC, tyrB having become specialized for aromatic biosynthesis. Gene reduction associated with both endosymbiotic and pathogenic lifestyles is evident. Thus, Buchnera lacks tyrA, cmk, hisH, and tyrB, and possesses only a single aroA Iα species (aroA Iα_H ). Haemophilus ducreyi also lacks tyrA, as well as aroA Iα_H and the entire trp operon.
TyrA in its context of regulation
Although outside the scope of this study, a logical expansion of it would be to examine the individual evolutionary histories of all the members of the contemporary E. coli TyrR regulon, i.e., asking when and in what order did these genes come under the influence of tyrR? Clearly, the recruitment of structural genes by tyrR has been recent, quite dynamic and even now, exhibits evidence of further ongoing change. For example, tyrosine phenol-lyase (a catabolic enzyme that is only sparsely present in gamma Proteobacteria) has been recruited to the TyrR regulons of Erwinia herbicola  and Citrobacter freundii . In these cases, not only does TyrR perform as a transcriptional activator, but it requires cyclic AMP receptor protein and integration host factor to do so.
As exemplified by E. coli, TyrR is generally a repressor. However, the transcriptional expression of mtr is activated by TyrR in the presence of TYR, and tyrP is activated in the presence of PHE (although it is repressed in the presence of TYR). The N-terminal domain of TyrR has been associated with the ability of TyrR to activate transcription in the case of mtr and tyrP . Members of the Haemophilus/Pasteurella lineage have all lost the N-terminal domain and presumably all lack the ability to accomplish transcriptional activation, as has been demonstrated experimentally with H. influenzae TyrR .
In view of the interesting complexity that two operons (mtr and aroLM) in E. coli are regulated by both tyrR and trpR , it may be more than coincidental that tyrR and trpR seem to have emerged at about the same evolutionary time, i.e., coincident with the divergence of the upper-gamma Proteobacteria from the lower-gamma Proteobacteria (Fig. 7). A possible interaction between the TyrR and TrpR proteins has been noted .
PhhR in relationship to aromatic catabolism
Arias-Barrau et al.  have recently characterized a central catabolic pathway (Hmg) that degrades homogentisate in three steps to fumarate and acetoacetate as a source of carbon and energy. One of several peripheral pathways feeding into the central pathway begins with PHE and produces homogentisate via the reactions of phenylalanine hydroxylase (Phh), aromatic aminotransferase, and 4-hydroxyphenylpyruvate dioxygenase (Hpd). In the absence of Phh, a shorter version of the peripheral pathway is one that can use TYR, but not PHE, as a source of carbon and energy. In Fig. 8 the presence of Phh, Hpd, and Hmg segments of catabolism are mapped on a 16S rRNA tree. (The aromatic aminotransferase distribution is not shown since a multiplicity of aromatic aminotransferases having overlapping substrate specificities makes it particularly challenging to identify the functional role .) The cyanobacteria are unique among Bacteria in the use of Hpd for a completely different metabolic role unrelated to aromatic catabolism, i.e., the synthesis of vitamin E derivatives .
Relationship of TyrR and PhhR
What might be the origin of TyrR? TyrR is an anomalous member of the large prokaryote family of σ54 enhancer-binding proteins that activate promoters dependent upon σ54. TyrR is unique within its homology grouping in that it targets σ70 promoters for regulation, usually (but not always) being a repressor. Its closest homolog is PhhR, a canonical member of the σ54 enhancer-binding proteins. σ54-dependent enhancer proteins possess a highly conserved σ54-contact motif, GAFTGA, that is intimately involved in formation of the ternary complex of enhancer and σ54-RNA polymerase holoenzyme. This motif is perfectly or nearly perfectly retained in the upper clades shown in Fig. 9, but is disrupted or completely absent in the clades between Shewanella oneidensis and Pasteurella multocida. The deeper phylogenetic distribution of PhhR (Fig. 8) suggests that TyrR evolved as a variant of PhhR. If correct, a regulatory gene that is oriented to catabolism (phhR), and itself of relatively recent origin, was conscripted even more recently for a completely new role in the regulation of primary biosynthesis (tyrR).
Consistent with the latter supposition, the gain of TyrR generally correlates with the loss of competence for aromatic catabolism (Fig. 8). In contrast to the Citrobacter/Salmonella/Escherichia/Shigella and the Pasteurella/Haemophilus clades (whose TyrR homologs completely lack the GAFTGA motif), the remaining enteric clades have retained some residues in this region. These residues appear to be more than random remnants. It would be interesting to know if these residues have any functional significance. Indeed, the Photobacterium/Vibrio clade has retained the ancestral catabolic capabilities (Fig. 8) that would appear to demand retention of regulation via PhhR; yet the parallelism of the overall features of biosynthesis that are shared with other lower-gamma Proteobacteria would seem, on the other hand, to demand TyrR-mediated regulation. Perhaps this "TyrR" species participates in the regulation of both catabolic and biosynthetic genes. In this connection, it is interesting that Chaney et al. found that a change in the GAFTGA motif of NifA could be partially "suppressed" by mutational changes in the N-terminal region of σ54.
Even more striking as a possible evolutionary intermediate is the most outlying member of the lower-gamma Proteobacteria, Shewanella oneidensis. The position of its TyrR on the protein tree parallels expectations based on the 16S rRNA tree. This, together with the conservation of the TyrR regulon features and the overall gene synteny, suggests E. coli-like function as TyrR, i.e., acting as a general repressor of regulon-member σ70 promoters engaged in aromatic biosynthesis. However, the location of "tyrR" in S. oneidensis between phhA and phhB on one side, and hmgB and hmgC on the other side, strongly implies some kind of regulatory relationship with the catabolic genes. It would be quite interesting to determine experimentally whether "TyrR" in S. oneidensis (and maybe Vibrio, as well) can function as a repressor of the usual suite of σ70 promoters, as well as an activator of σ54 promoters for phhA/phhB and/or hmgB/hmgC.
We suggest that TyrR evolved as a modified version of PhhR as follows. In view of the distribution of genes encoding PhhR and TyrR, as well as the aforementioned catabolic enzymes, the most parsimonious evolutionary scenario may be that central and peripheral catabolic pathways depicted in Fig. 8 are quite ancient, but acquisition of PhhR as a σ54-dependent activator of phenylalanine hydroxylase was quite recent, originating about the time of divergence of gamma Proteobacteria. The clade defined by Shewanella/Vibrio/Photobacterium retained the catabolic pathway, whereas the other enteric lineages discarded the catabolic pathway, but retained PhhR, which was then recruited as a σ70-dependent regulator of aromatic biosynthesis (TyrR).
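The three states of the GAFTGA motif discussed above (intact, partially retained, absent) can be illustrated with a minimal Python sketch. It assumes that the relevant region of each homolog has already been extracted from an alignment; the identity threshold and the example sequences are hypothetical and are not taken from Fig. 9. Because a six-residue motif can match by chance in a long protein, a scan of this kind is only meaningful when restricted to the aligned σ54-contact region.

```python
MOTIF = "GAFTGA"  # sigma-54 contact motif named in the text


def best_motif_match(seq, motif=MOTIF):
    """Return (start, identities) for the ungapped window most similar to motif."""
    seq = seq.upper()
    best = (-1, 0)
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        identities = sum(a == b for a, b in zip(window, motif))
        if identities > best[1]:
            best = (i, identities)
    return best


def classify(seq, motif=MOTIF, max_mismatches=2):
    """Label a region as 'intact', 'degenerate' or 'absent'.

    max_mismatches is an arbitrary illustrative cutoff, not a published one.
    """
    _, identities = best_motif_match(seq, motif)
    if identities == len(motif):
        return "intact"
    if identities >= len(motif) - max_mismatches:
        return "degenerate"
    return "absent"


# Hypothetical region sequences, invented for illustration only.
for region in ("MKLEGAFTGAQR", "MKLEGSFTGAQR", "MKLEPPPPPPQR"):
    print(region, classify(region))
```

A positional comparison against an alignment column range would be more rigorous than a sliding window, but the window scan keeps the sketch self-contained.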
Regulation by attenuation
A widespread mechanism of regulation is via an attenuation mechanism whereby transcripts initiated at given promoters can be terminated prior to reaching the structural genes of an operon. Whether termination occurs usually depends on the balance (dictated by a variety of mechanisms) between mutually exclusive terminator and anti-terminator structures.
Putative attenuators associated with tyrA
Gene organization
¬ hisH b > tyrA
¬ aroG > hisH b > tyrA > aroF
¬ aroG > hisH b > tyrA > aroF
¬ tyrA > aroF
¬ pheA > hisH b > aroA Iβ •aroQ > tyrA
¬ gyrA > serC > aroQ•pheA > tyrA > aroF > cmk > rpsA > himD
¬ aroA' > aroB' > aroQ•pheA > aroF > tyrA > [trp operon]
¬ aroD > aroA Iβ > aroB > aroG > tyrA > aroF > aroE > pheA
¬ ysaA > blrG > kinG > tyrA > aroF > aroE > pheA
¬ ORF > aroG > ORF > aroF > tyrA > aroE
¬ aroG > aroB > aroH > hisH b > tyrA > aroF
¬ ORF > aroC I > aroD > aroB > aroG > tyrA > ¬ ORF > aroF > aroE > pheA
¬ pheA > aroA Iβ > tyrA > aroF > ORF > ORF
¬ aroA Iβ > tyrA
¬ aroA Iα_Y > tyrA
¬ aroA Iα_Y > tyrA
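Gene-organization strings of the kind tabulated above lend themselves to simple tabulation of tyrA neighbors. The following Python sketch is illustrative only: it assumes that ">" separates adjacent genes and that "¬" marks a putative attenuator site (an interpretation, not stated in the table itself), and the helper names are hypothetical. The example rows are copied from the table.

```python
from collections import Counter


def parse_organization(line):
    """Split a gene-organization string into an ordered list of gene names.

    Assumption: '>' separates adjacent genes and the attenuator marker is
    dropped. Internal whitespace is normalized, so 'hisH b' stays one name.
    """
    genes = []
    for token in line.replace("¬", " ").split(">"):
        name = " ".join(token.split())
        if name:
            genes.append(name)
    return genes


def downstream_of(genes, target="tyrA"):
    """Return the gene immediately following target, or None if there is none."""
    if target not in genes:
        return None
    i = genes.index(target)
    return genes[i + 1] if i + 1 < len(genes) else None


rows = [
    "¬ hisH b > tyrA",
    "¬ aroG > hisH b > tyrA> aroF",
    "¬ tyrA> aroF",
    "¬ORF > aroG > ORF > aroF > tyrA> aroE",
    "¬ aroA Iβ > tyrA",
]

# Count which gene most often lies immediately downstream of tyrA.
neighbors = Counter(downstream_of(parse_organization(r)) for r in rows)
print(neighbors)
```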
Some of the supraoperons that appear to be controlled by attenuation are interesting in that they contain the majority of genes needed for both PHE and TYR biosynthesis, e.g., the supraoperons in Enterococcus faecalis and Streptococcus pneumoniae. The latter organism displays two attenuator units. The supraoperon of Desulfovibrio vulgaris is novel in that it begins with two relatively rare genes encoding alternative enzyme steps for aromatic biosynthesis, denoted here as aroA' and aroB'. The leading five genes are adjacent to the seven-gene trp operon.
Protein divergence within a vertical genealogy is not necessarily smooth and progressive. Qualitative biochemical innovations can result in a barrage of new selective pressures that result in evolutionary jumps. The consequent incongruence might easily be mistaken for LGT. The basis for evolutionary jumps will usually only be recognized by detailed and comprehensive analyses of any given subsystem. Examples in this study are as follows. (i) The tyrA c_Δ gene of the lower-gamma Proteobacteria has diverged markedly from tyrA c of the upper-gamma Proteobacteria. Here the milestone event was fusion of aroQ to a putative tyrA c in the ancestor of lower-gamma Proteobacteria to produce aroQ•tyrA c_Δ. Indels within the •tyrA c_Δ domain presumably reflect a multiplicity of selections for functional interactions known to exist between the two fused domains as discussed earlier. (ii) Members of the subclass taxon Actinobacteridae possess TyrAa proteins that separate into two distinct groupings. The presumed ancestral NAD(p)TyrAa that is still present in the Actinobacteridae_1 clade very likely spawned the divergent NAD+-specific variety of TyrAa to yield the contemporary Actinobacteridae_2 clade.
The previous evolutionary analysis of trp-pathway genes [7, 8] can be viewed as a model for comparable studies with other gene systems. Expansion to the greater aromatic pathway is a logical extension. The dynamics of evolutionary change for tyrA can be matched to the dynamics exhibited by the trp system. For example, the lower-gamma Proteobacteria separate as a distinct phylogenetic unit from beta Proteobacteria and upper-gamma Proteobacteria on criteria defined by milestone evolutionary events that altered many character states of both tryptophan and tyrosine biosynthesis in the lower-gamma Proteobacteria. In the future one can anticipate that comprehensive and systematic phylogenetic analysis of each protein member of the TYR, PHE and TRP branches, the common aromatic-pathway trunk, and minor vitamin-like branches (such as the 4-aminobenzoate/folate branch) will accommodate a progressively integrated picture of the entire aromatic network, including catabolic pathways and many other specialized pathways.
Most TyrA sequences were obtained from the National Center for Biotechnology Information (NCBI). TyrA sequences from incomplete genomes were retrieved from the PEDANT database. Several sequences in our curated TyrA collection have been corrected for erroneous translation start sites. Various curated TyrA sequence files can be downloaded from our website. These files include complete sequences, trimmed catalytic-core domains, and amino-acid sequence segments that are relevant to specificity for pyridine nucleotide or to specificity for the cyclohexadienyl substrate. The sequence files are summarized in Table 3.
TyrA proteins that cluster together on the TyrA protein tree in congruence with the 16S rRNA tree are called congruency groups. Exact correspondence of branching orders is not necessarily observed. So far, congruency groupings have been assembled for tryptophan-protein concatenates and for TyrA proteins. Completion of equivalent work with the remaining aromatic-pathway segments will identify the repertoire of bacterial organisms in possession of a "pure" vertical genealogy with respect to aromatic biosynthesis. Congruency groups for TyrA can be accessed at our AroPath website, where a listing of the membership of congruency groups is maintained and updated. Any members of congruency-group clusters, whose position there is incongruent with 16S rRNA expectations, probably (but not necessarily) originated by LGT. The donor lineage may not be obvious, but as more genomes come on line, many cases where donor identities are currently unknown may become revealed. A listing of "orphan" TyrA proteins that belong to no current congruency group is given. Such orphans reflect the lack of sufficient genome representation in particular phylogenetic regions and undoubtedly will become the nucleus for additional congruency groups in due course.
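One simplified way to express the congruency-group idea is to ask whether a set of organisms forms a clade in both the protein tree and the 16S rRNA tree, without requiring identical branching order inside that clade. The Python sketch below is an assumption-laden illustration, not the procedure used to assemble the AroPath groupings: trees are rooted nested tuples, and the organism labels and topologies are invented placeholders.

```python
def clades(tree):
    """Return (leaf set of tree, set of leaf sets for every internal node)."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_found = clades(child)
        leaves |= child_leaves
        found |= child_found
    found.add(leaves)
    return leaves, found


def is_congruent(group, protein_tree, rrna_tree):
    """True if group is a clade in both trees (internal order may differ)."""
    g = frozenset(group)
    return g in clades(protein_tree)[1] and g in clades(rrna_tree)[1]


# Toy rooted trees with placeholder organism labels (hypothetical).
rrna_tree = ((("sp1", "sp2"), "sp3"), ("sp4", "sp5"))
tyra_tree = ((("sp1", "sp3"), "sp2"), ("sp4", "sp5"))

print(is_congruent({"sp1", "sp2", "sp3"}, tyra_tree, rrna_tree))
print(is_congruent({"sp1", "sp2"}, tyra_tree, rrna_tree))
```

In this toy case the three-member group qualifies even though its internal branching differs between the two trees, whereas the pair sp1/sp2 is a clade only on the rRNA tree.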
Multiple alignments were obtained by use of the ClustalW or ClustalX programs (Version 1.83). Manual adjustments were needed in the region of the GxGxxG motif for binding of pyridine nucleotide cofactor in the N-terminal region of TyrA proteins. Alignment was guided by maximizing conformity with the Wierenga fingerprint, making allowance for a variable loop of 2–5 residues. This was done with the assistance of the BioEdit multiple alignment tool of Hall (5.0.9 Edition). The refined multiple alignment was used as input for generation of a phylogenetic tree using the phylogeny inference package PHYLIP (Version 3.2). The neighbor-joining program was used to obtain a distance-based tree. The distance matrix was obtained by use of Protdist with a Dayhoff PAM matrix. The Seqboot and Consense programs were then applied to assess the statistical support of the tree using bootstrap resampling (1,000 replications). We also used the ANCESCON package, which produced results similar to those shown in Fig. 2 (albeit with even wider separation of many groups). The presence of regulatory domains (ACT and REG) was accepted when indicated by the Domain Architecture Retrieval Tool (DART) on the BLAST menu at NCBI.
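The bootstrap step can be illustrated in outline. The Python sketch below resamples alignment columns with replacement, which is the general idea behind bootstrap replicates of an alignment, and then asks how often one sequence remains closer to a second than to a third. It is not a reimplementation of Seqboot, Protdist or Consense: a simple uncorrected p-distance stands in for the Dayhoff PAM distance, and the toy alignment and function names are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one pseudo-replicate by sampling columns with replacement."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def p_distance(a, b):
    """Fraction of differing positions, skipping columns with a gap in either."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs) if pairs else 0.0


def support(alignment, a, b, c, replicates=1000, seed=0):
    """Fraction of replicates in which a is closer to b than to c."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        rep = bootstrap_alignment(alignment, rng)
        if p_distance(rep[a], rep[b]) < p_distance(rep[a], rep[c]):
            hits += 1
    return hits / replicates


# Toy alignment, invented for illustration only.
toy = {"a": "GAGAGA", "b": "GAGAGA", "c": "TTTTTT"}
print(support(toy, "a", "b", "c", replicates=100))
```

A full analysis would rebuild a neighbor-joining tree from each replicate and summarize clade frequencies in a consensus tree; the pairwise comparison here is only the smallest case of that idea.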
Profile hidden Markov models for each of the four TyrA subfamilies, TyrAa, TyrAc, TyrAp and tyrA c_Δ, were built using Sean Eddy's HMMER package. The HMMs were generated from our file of curated cyclohexadienyl-substrate core segments (see Table 3). The seed sequences for each subfamily were first aligned using ClustalW. The resulting multiple sequence alignments were then manually edited to produce more accurate alignment of the seed sequences. Finally, the edited multiple sequence alignments were used to generate the profile HMMs for each TyrA subfamily.
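The logic of assigning a query segment to the best-scoring subfamily model can be conveyed with a much simpler stand-in. The Python sketch below builds a position-specific scoring matrix from aligned seed segments and assigns a query to the highest-scoring family. This is a gap-free simplification and not a profile HMM (it has no insert or delete states and does not use HMMER); the family names, seed strings, pseudocount and uniform background are all hypothetical choices made for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(seeds, pseudocount=1.0):
    """Per-column log-odds scores (uniform background) from equal-length seeds."""
    profile = []
    background = 1.0 / len(ALPHABET)
    for column in zip(*seeds):
        counts = Counter(column)
        total = len(column) + pseudocount * len(ALPHABET)
        profile.append({
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in ALPHABET
        })
    return profile


def score(profile, query):
    """Sum of column scores; query must be ungapped and use ALPHABET letters."""
    return sum(col[aa] for col, aa in zip(profile, query))


def classify(profiles, query):
    """Return the name of the family whose profile scores the query highest."""
    return max(profiles, key=lambda name: score(profiles[name], query))


# Invented seed segments for two hypothetical families.
profiles = {
    "famA": build_profile(["GAGKVD", "GAGRVD", "GSGKVD"]),
    "famB": build_profile(["WHTLEN", "WHSLEN", "WHTLQN"]),
}
print(classify(profiles, "GAGKVE"))
```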
Appraisal of gene fusions as one-time or multiple events
Whether given contemporary gene fusions trace back to a single fusion event in a common ancestor or arose independently was evaluated by phylogenetic analysis of the individual protein domains and by inspection of the inter-domain linker region. Linker regions were determined by multiple alignments of fusion sequences with corresponding free-standing domains present in the closest relatives to organisms that lack the gene fusions.
R. Jensen thanks the National Library of Medicine (Grant G13 LM008297) for partial support. This research is partially supported by the U. S. Army Research Institute of Infectious Diseases (USAMRIID). This analysis would not have been possible were it not for the yeoman efforts in comparative enzymology carried out over a period of more than 25 years by many students and postdoctoral fellows, most notably Graham S. Byng, Robert Whitaker, Alan X. Berry and Suhail Ahmad. This has produced an invaluable resource of comprehensive data, some of it unpublished. This paper is dedicated to our colleague and collaborator, John E. Gander, on the occasion of his 80th birthday.
- Xie G, Bonner CA, Jensen RA: Cyclohexadienyl dehydrogenase from Pseudomonas stutzeri exemplifies a widespread type of tyrosine-pathway dehydrogenase in the TyrA protein family. Comp Biochem Physiol C Toxicol Pharmacol. 2000, 125: 65-83.
- Jensen RA: Tyrosine and phenylalanine biosynthesis: relationship between alternative pathways, regulation and subcellular location. Rec Adv Phytochem. 1986, 20: 57-82.
- Todd AE, Orengo CA, Thornton JM: Evolution of function in protein superfamilies, from a structural perspective. J Mol Biol. 2001, 307: 1113-1143. 10.1006/jmbi.2001.4513.
- Teichmann SA, Rison SCG, Thornton JM, Riley M, Gough J, Chothia C: The evolution and structural anatomy of the small molecule metabolic pathways in Escherichia coli. J Mol Biol. 2001, 311: 693-708. 10.1006/jmbi.2001.4912.
- Xie G, Brettin T, Bonner CA, Jensen RA: Mixed-function supraoperons that exhibit overall conservation, albeit shuffled gene organization, across wide intergenomic distances within eubacteria. Microb Comp Genomics. 1999, 4: 5-28.
- Xie G, Bonner CA, Jensen RA: A probable mixed-function supraoperon in Pseudomonas exhibits gene organization features of both intergenomic conservation and gene shuffling. J Mol Evol. 1999, 49: 108-121.
- Xie G, Keyhani N, Bonner CA, Jensen RA: Ancient origin of the tryptophan operon and the dynamics of evolutionary change. Microbiol Mol Biol Rev. 2003, 67: 303-342. 10.1128/MMBR.67.3.303-342.2003.
- Xie G, Bonner CA, Song J, Keyhani NO, Jensen RA: Inter-genomic displacement via lateral gene transfer of bacterial trp operons in an overall context of vertical genealogy. BMC Biology. 2004, 2: 15. 10.1186/1741-7007-2-15.
- AroPath. [http://AroPath.lanl.gov/Phylogeny/CG/tyrCG.html]
- Gil R, Silva FJ, Zientz E, Delmotte F, Gonzalez-Candelas F, Latorre A, Rausell C, Kamerbeek J, Gadau J, Holldobler B, et al: The genome sequence of Blochmannia floridanus: comparative analysis of reduced genomes. Proc Natl Acad Sci USA. 2003, 100: 9388-9393. 10.1073/pnas.1533499100.
- Gevers D, Vandepoele K, Simillion C, Van de Peer Y: Gene duplication and biased functional retention of paralogs in bacterial genomes. Trends Microbiol. 2004, 12: 148-154. 10.1016/j.tim.2004.02.007.
- AroPath. [http://AroPath.lanl.gov/Phylogeny/CG/index.html]
- Blanc V, Gil P, Bamas-Jacques N, Lorenzon S, Zagorec M, Schleuniger J: Identification and analysis of genes from Streptomyces pristinaespiralis encoding enzymes involved in the biosynthesis of the 4-dimethylamino-L-phenylalanine precursor of pristinamycin I. Mol Microbiol. 1997, 23: 191-202. 10.1046/j.1365-2958.1997.2031574.x.
- Lingens F, Göbel W, Üsseler H: Regulation der biosynthesis der aromatischen aminosauren in Saccharomyces cerevisiae, I. Hemmung der Enzymaktivitaten (Feedback-Wirkung). Biochem Z. 1966, 346: 357-67.
- Zamir LO, Jung E, Jensen RA: Co-accumulation of prephenate L-arogenate and spiro-arogenate in a mutant of Neurospora. 1983, 258: 6492-6496.
- National Center for Biotechnology Information. [http://www.ncbi.nlm.nih.gov]
- Xia T, Jensen RA: A single cyclohexadienyl dehydrogenase specifies the prephenate dehydrogenase and arogenate dehydrogenase components of the dual pathways to L-tyrosine in Pseudomonas aeruginosa. J Biol Chem. 1990, 265: 20033-20036.
- Zhao G, Xia T, Ingram L, Jensen RA: An allosterically insensitive class of cyclohexadienyl dehydrogenase from Zymomonas mobilis. Eur J Biochem. 1993, 212: 157-165. 10.1111/j.1432-1033.1993.tb17646.x.
- Jensen RA: Enzyme recruitment in evolution of new function. Annu Rev Microbiol. 1976, 30: 409-425. 10.1146/annurev.mi.30.100176.002205.
- Hall GC, Flick MB, Gherna RL, Jensen RA: Biochemical diversity for biosynthesis of aromatic amino acids among the cyanobacteria. J Bacteriol. 1982, 149: 65-78.
- Subramaniam P, Bhatnagar R, Hooper A, Jensen RA: The dynamic progression of evolved character states for aromatic amino acid biosynthesis in gram-negative bacteria. Microbiology. 1994, 140: 3431-3440.
- Byng GS, Whitaker RJ, Gherna RL, Jensen RA: Variable enzymological patterning in tyrosine biosynthesis as a means of determining natural relatedness among the Pseudomonadaceae. J Bacteriol. 1980, 144: 247-257.
- Keller B, Keller E, Gorisch H, Lingens F: Biosynthesis of phenylalanine and tyrosine in Streptomycetes. Hoppe Seylers Z Physiol Chem. 1983, 364: 455-459.
- Keller B, Keller E, Lingens F: Arogenate dehydrogenase from Streptomyces phaeochromogenes. Purification and properties. Biol Chem Hoppe Seyler. 1985, 366: 1063-1066.
- Bonner CA, Jensen RA, Gander JE, Keyhani NO: A core catalytic domain of the TyrA protein family: arogenate dehydrogenase from Synechocystis. Biochem J. 2004, 382: 279-291. 10.1042/BJ20031809.
- Wierenga RK, Terpstra P, Hol WGJ: Prediction of the occurrence of the ADP-binding βαβ-fold in proteins, using an amino-acid sequence fingerprint. J Mol Biol. 1986, 187: 101-107. 10.1016/0022-2836(86)90409-2.
- Rippert P, Matringe M: Molecular and biochemical characterization of an Arabidopsis thaliana arogenate dehydrogenase with two highly similar and active protein domains. Plant Mol Biol. 2002, 48: 361-368. 10.1023/A:1014018926676.
- Rippert P, Matringe M: Purification and kinetic analysis of the two recombinant arogenate dehydrogenase isoforms of Arabidopsis thaliana. Eur J Biochem. 2002, 269: 4753-4761. 10.1046/j.1432-1033.2002.03172.x.
- Xie G, Forst C, Bonner CA, Jensen RA: Significance of two distinct types of tryptophan synthase beta chain in Bacteria, Archaea and higher plants. Genome Biol. 2002, 3: Research0004.1-0004.13. 10.1186/gb-2001-3-1-research0004.
- Champney WS, Jensen RA: The enzymology of prephenate dehydrogenase in Bacillus subtilis. J Biol Chem. 1970, 245: 3763-3770.
- Xie G, Bonner CA, Brettin T, Gottardo R, Keyhani NO, Jensen RA: Lateral gene transfer and ancient paralogy of operons containing redundant copies of tryptophan-pathway genes in Xylella species and heterocystous cyanobacteria. Genome Biol. 2003, 4: R14. 10.1186/gb-2003-4-2-r14.
- Chen S, Vincent S, Wilson DB, Ganem B: Mapping of chorismate mutase and prephenate dehydrogenase domains in the Escherichia coli T-protein. Eur J Biochem. 2003, 270: 757-763. 10.1046/j.1432-1033.2003.03438.x.
- Mavrodi DV, Ksenzenko VM, Bonsall RF, Cook RJ, Boronin AM, Thomashow LS: A seven-gene locus for synthesis of phenazine-1-carboxylic acid by Pseudomonas fluorescens 2–79. J Bacteriol. 1998, 180: 2541-2548.
- Pierson LS, Gaffney T, Lamb S, Gong F: Molecular analysis of genes encoding phenazine biosynthesis in the biological control bacterium Pseudomonas aureofaciens 30–84. FEMS Microbiol Lett. 1995, 134: 299-307. 10.1016/0378-1097(95)00423-X.
- AroPath. [http://AroPath.lanl.gov/Annotation/CuratedAASeqForDownload.html]
- Pfam. [http://www.sanger.ac.uk/Software/Pfam/]
- Interpro. [http://www.ebi.ac.uk/interpro/]
- Eddy SR: Profile hidden Markov models. Bioinformatics. 1998, 14: 755-763. 10.1093/bioinformatics/14.9.755.
- Park J, Karplus K, Barrett C, Hughey R, Haussler D, Hubbard T, Chothia C: Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods. J Mol Biol. 1998, 284: 1201-1210. 10.1006/jmbi.1998.2221.
- AroPath. [http://AroPath.lanl.gov/Biosynthesis/TyrPath/hmmPfamTyrA.html]
- AroPath. [http://AroPath.lanl.gov/Biosynthesis/TyrPath/index.html]
- Fazel A, Jensen R: Obligatory biosynthesis of L-tyrosine via the pretyrosine branchlet in coryneform bacteria. J Bacteriol. 1979, 138: 805-815.
- Fazel AM, Bowen JR, Jensen RA: Arogenate (pretyrosine) is an obligatory intermediate of L-tyrosine biosynthesis: confirmation in a microbial mutant. Proc Natl Acad Sci USA. 1980, 77: 1270-1273.
- Byng GS, Berry A, Jensen RA: Evolutionary implications of features of aromatic amino acid biosynthesis in the genus Acinetobacter. Arch Microbiol. 1985, 143: 122-129. 10.1007/BF00411034.
- Porat I, Waters BW, Teng Q, Whitman WB: Two biosynthetic pathways for aromatic amino acids in the archaeon Methanococcus maripaludis. J Bacteriol. 2004, 186: 4940-4950. 10.1128/JB.186.15.4940-4950.2004.
- Calhoun DH, Bonner CA, Gu W, Xie G, Jensen RA: The emerging periplasm-localized subclass of AroQ chorismate mutases, exemplified by those from Salmonella typhimurium and Pseudomonas aeruginosa. Genome Biol. 2001, 2: research0030.1-0030.16.
- Ahmad S, Jensen RA: The stable evolutionary fixation of a bifunctional tyrosine-pathway protein in enteric bacteria. FEMS Microbiol Lett. 1988, 52: 109-116. 10.1016/0378-1097(88)90309-6.
- Jensen RA, Ahmad S: Nested gene fusions as markers of phylogenetic branchpoints in prokaryotes. Trends Ecol Evol. 1990, 5: 219-224. 10.1016/0169-5347(90)90135-Z.
- Jensen RA, Gu W: Evolutionary recruitment of biochemically specialized subdivisions of Family I within the protein superfamily of aminotransferases. J Bacteriol. 1996, 178: 2161-2171.
- Aravind L, Koonin EV: Gleaning non-trivial structural, functional and evolutionary information about proteins by iterative database searches. J Mol Biol. 1999, 287: 1023-1040. 10.1006/jmbi.1999.2653.
- Henner D, Yanofsky C: Bacillus subtilis and other gram-positive bacteria. Biochemistry, physiology, and molecular genetics. Edited by: Sonenshein AL, Hoch J, Losick R. 1993, Washington, DC: ASM Press.
- White RH: L-Aspartate semialdehyde and a 6-deoxy-5-ketohexose 1-phosphate are the precursors to the aromatic amino acids in Methanocaldococcus jannaschii. Biochemistry. 2004, 43: 7618-7627. 10.1021/bi0495127.
- Ahmad S, Johnson JL, Jensen RA: The recent evolutionary origin of the phenylalanine-sensitive isozyme of 3-deoxy-D-arabino-heptulosonate 7-phosphate synthase in the enteric lineage of bacteria. J Mol Evol. 1987, 25: 159-167.
- Jensen RA, Xie G, Calhoun DH, Bonner CA: The correct phylogenetic relationship of KdsA (3-deoxy-D-manno-octulosonate 8-phosphate synthase) with one of two independently evolved classes of AroA (3-deoxy-D-arabino-heptulosonate 7-phosphate synthase). J Mol Evol. 2002, 54: 416-423.
- Pittard AJ, Camakaris H, Yang J: The TyrR regulon. Mol Microbiol. 2005, 55: 16-26. 10.1111/j.1365-2958.2004.04385.x.
- Katayama T, Suzuki H, Koyanagi T, Kumagai H: Cloning and random mutagenesis of the Erwinia herbicola tyrR gene for high-level expression of tyrosine phenol-lyase. Appl Environ Microbiol. 2000, 66: 4764-4771. 10.1128/AEM.66.11.4764-4771.2000.
- Bai Q, Somerville R: Integration host factor and cyclic AMP receptor protein are required for TyrR-mediated activation of tpl in Citrobacter freundii. J Bacteriol. 1998, 180: 6173-6186.
- Zhao S, Somerville RL: Isolated operator binding and ligand response domains of the TyrR protein of Haemophilus influenzae associate to reconstitute functional repressor. J Biol Chem. 1999, 274: 1842-1847. 10.1074/jbc.274.3.1842.
- Arias-Barrau E, Olivera E, Luengo J, Fernandez C, Galan B, Garcia J, Diaz E, Miñambres B: The homogentisate pathway: a central catabolic pathway involved in the degradation of L-phenylalanine, L-tyrosine, and 3-hydroxyphenylacetate in Pseudomonas putida. J Bacteriol. 2004, 186: 5062-5077. 10.1128/JB.186.15.5062-5077.2004.
- Dähnhardt D, Falk J, Appel J, van der Kooij A, Schulz-Friedrich R, Krupinska K: The hydroxyphenylpyruvate dioxygenase from Synechocystis sp. PCC 6803 is not required for plastoquinone biosynthesis. FEBS Lett. 2002, 523: 177-181. 10.1016/S0014-5793(02)02978-2.
- Song J, Jensen RA: PhhR, a divergently transcribed activator of the phenylalanine hydroxylase gene cluster of Pseudomonas aeruginosa. Mol Microbiol. 1996, 22: 497-507. 10.1046/j.1365-2958.1996.00131.x.
- Zhao G, Xia T, Song J, Jensen R: Pseudomonas aeruginosa possesses homologues of mammalian phenylalanine hydroxylase and 4a-carbinolamine dehydratase/DCoH as part of a three-component gene cluster. Proc Natl Acad Sci USA. 1994, 91: 1366-1370.
- Tropel D, van der Meer J: Bacterial transcriptional regulators for degradation pathways of aromatic compounds. Microbiol Mol Biol Rev. 2004, 68: 474-500. 10.1128/MMBR.68.3.474-500.2004.
- Chaney M, Grande R, Wigneshweraraj S, Cannon W, Casaz P, Gallegos M-T: Binding of transcriptional activators to sigma 54 in the presence of the transition state analog ADP-aluminum fluoride: insights into activator mechanochemical action. Genes Dev. 2001, 15: 2282-2294. 10.1101/gad.205501.
- Yanofsky C: The different roles of tryptophan transfer RNA in regulating trp operon expression in E. coli versus B. subtilis. Trends Genet. 2004, 20: 367-374. 10.1016/j.tig.2004.06.007.
- Predicted attenuators in bacteria. [http://cmgm.stanford.edu/~merino]
- Riley ML, Schmidt T, Wagner C, Mewes H-W, Frishman D: The PEDANT genome database in 2005. Nucleic Acids Res. 2005, 33: D308-D310. 10.1093/nar/gki019.
- Chenna R, Sugawara H, Koike T, Lopez R, Gibson T, Higgins D, Thompson J: Multiple sequence alignment with the Clustal series of programs. Nucleic Acids Res. 2003, 31: 3497-3500. 10.1093/nar/gkg500.
- BioEdit. [http://www.mbio.ncsu.edu/BioEdit/bioedit.html]
- Felsenstein J: PHYLIP-Phylogeny Inference Package (version 3.2). Cladistics. 1989, 5: 164-166.
- Cai W, Pei J, Grishin NV: Reconstruction of ancestral protein sequences and its applications. BMC Evol Biol. 2004, 4: 33. 10.1186/1471-2148-4-33.
- Eddy S: HMMER package. 1995, [http://hmmer.wustl.edu]
- AroPath. [http://AroPath.lanl.gov/Annotation/TyrA/TyrA_substrateSpc.html]
- AroPath. [http://AroPath.lanl.gov/Organisms/Species_sorted_by_acronym.html]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.