N-Terminal Fusion Tags for Effective Production of G-Protein-Coupled Receptors in Bacterial Cell-Free Systems

G-protein-coupled receptors (GPCRs) constitute one of the largest families of membrane proteins. Despite their high pharmacological relevance, they remain poorly explored. One of the main bottlenecks in structural and functional studies of GPCRs is the difficulty of producing sufficient amounts of the proteins. Cell-free (CF) systems based on bacterial extracts from E. coli cells attract much attention as an effective tool for recombinant production of membrane proteins. GPCR production in bacterial CF expression systems is often inefficient because of the low efficiency of translation initiation. This problem can be resolved by expressing GPCRs as hybrid proteins with N-terminal polypeptide fusion tags. In the present work, three new N-terminal fusion tags are proposed for cell-free production of the human β2-adrenergic receptor, human M1 muscarinic acetylcholine receptor, and human somatostatin receptor type 5. It is demonstrated that the use of an N-terminal fragment (6 a.a.) of bacteriorhodopsin from Exiguobacterium sibiricum (ESR-tag), an N-terminal fragment (16 a.a.) of RNase A (S-tag), and the Mistic protein from B. subtilis increases the CF synthesis of the target GPCRs 5- to 38-fold, resulting in yields of 0.6–3.8 mg per 1 ml of the reaction mixture, which is sufficient for structural and functional studies.


INTRODUCTION
Integral membrane proteins (MPs) participate in a number of processes essential for single-cell and metazoan organisms. These proteins are responsible for cellular energetics, intercellular recognition, signal transduction, and transport of various substances through the cell membrane [1]. Recent data indicate that MPs make up over 25% of all amino acid sequences encoded in the genomes of higher organisms, including the human genome. G-protein-coupled receptors (GPCRs) are among the most pharmacologically important MP classes. Over 800 GPCR genes have been identified in the human genome [3], and membrane receptors of this class are the targets of ~30% of modern drugs [4]. GPCRs share a homologous spatial organization and contain seven transmembrane (TM) helices, as well as an extracellular N-terminal and an intracellular C-terminal region [5]. The binding sites of low-molecular-weight ligands are located in the TM domain of the receptor, whereas peptide hormones and regulatory proteins interact with the N-terminal region and extracellular loops [5].
GPCRs are of particular interest for pharmacological research; however, structural and functional investigations of these receptors are complicated [5] because sufficient amounts of the protein cannot be isolated from natural sources and because high-performance systems for heterologous production of these MPs are difficult to design [6]. Over the last decade, the joint use of expression systems based on eukaryotic cells and new methods of X-ray structure analysis has made it possible to determine the spatial structure of a series of GPCRs [5], including the human β2-adrenoreceptor (β2AR) [7] and the human muscarinic M2 and M3 cholinoreceptors (mAChR) [8,9]. These studies have led to a better understanding of the principles of the spatial organization of GPCRs. However, a thorough investigation into the functional dynamics and mechanisms of membrane receptor functioning requires high-resolution spectroscopic methods, such as heteronuclear NMR spectroscopy [10]. Current NMR spectroscopy methods require milligram amounts of protein samples labeled with stable isotopes (²H, ¹³C, ¹⁵N) [10], which are expensive to produce in eukaryotic systems. Meanwhile, conventional bacterial expression systems often fail to give high yields of GPCRs, and their use is complicated by the need to develop renaturation protocols [11].
Cell-free (CF) expression systems [12], and in particular those based on bacterial extracts, have recently gained increasing popularity as an alternative tool for the recombinant production of MPs [13]. Compared with cell-based production systems, CF systems have a number of advantages, including exclusive production of the target protein, the possibility of synthesizing toxic proteins, a simple procedure for synthesizing selectively isotope-labeled samples, and the possibility of directly adding various agents and cofactors to the reaction mixture to stabilize the native spatial structure of the synthesized protein in solution [12,13]. For example, components of membrane-mimicking media, such as detergent micelles, lipid/detergent bicelles, liposomes, and lipid-protein nanodiscs, can be added to the reaction mixture to produce soluble MPs [13-15].
According to published data, direct expression of GPCR genes in CF systems is inefficient [14,16-18]. One possible reason is the low efficiency of translation initiation [18] caused by the formation of a secondary structure in the 5′-end fragment of the mRNA [19,20]. In most cases, this problem can be solved and the desired level of GPCR production attained by inserting, at the 5′-end of the target protein gene, additional nucleotide sequences encoding N-terminal polypeptide fusion tags, such as a fragment of the protein 10 leader sequence of bacteriophage T7 (T7-tag, 11 a.a.; hereinafter, the sequence length includes the N-terminal Met residue) [14,16], the thioredoxin protein from E. coli (TRX) [17], or 1-6 a.a. long synthetic sequences [18]. In this work, three novel N-terminal fusion tags are proposed to increase the efficiency of cell-free production of human GPCRs, using β2AR, M1-mAChR, and somatostatin receptor type 5 (SSTR5) as examples. It is shown that nucleotide sequences encoding the N-terminal fragment (6 a.a.) of bacteriorhodopsin from the Gram-positive bacterium Exiguobacterium sibiricum (ESR-tag), the N-terminal fragment (16 a.a.) of ribonuclease A (N-terminal fragment of the S-peptide, S-tag), and the Mistic protein from Bacillus subtilis increase the receptor yield 5- to 38-fold, providing a level of target protein production sufficient for further structural and functional studies.
Cell-free production of GPCR
GPCRs were synthesized in a continuous cell-free system based on the E. coli S30 extract according to published protocols [15,21]. The final concentrations of the components of the reaction mixture (RM) were as follows: 100 mM HEPES-KOH (Fluka, USA), pH 8.0; 8 mM Mg(OAc)2; 90 mM KOAc; 20 mM potassium acetyl phosphate (Sigma, USA); 20 mM potassium phosphoenolpyruvate (Aldrich, USA); 1.3 mM of each amino acid, except for Arg, Cys, Met, Trp, Asp, and Glu, whose concentrations were 2.3 mM; 0.15 mg/ml folic acid (Sigma); 1 mM of each of the four ribonucleoside triphosphates; protease inhibitor (1× Complete protease inhibitor®, Roche Diagnostics, Germany); 0.05% NaN3; 2% polyethylene glycol 8000 (Sigma); 0.3 U/μl ribonuclease inhibitor RiboLock (Fermentas, Lithuania); 0.04 mg/ml pyruvate kinase (Fermentas, Lithuania); 5.5 μg/ml T7 polymerase; 0.3 mg/ml plasmid DNA; 0.5 mg/ml total tRNA (from E. coli MRE 600) (Roche Diagnostics, Switzerland); and E. coli S30 extract at 30% of the total volume of the reaction mixture. The feeding mixture (FM) had the same composition, except for the high-molecular-weight components: S30 extract, plasmid, enzymes, and ribonuclease inhibitor. The synthesis was carried out without the addition of any membrane-mimicking media to the RM or FM. The RM and FM volumes were 50 and 750 μl, respectively. The RM was placed into a reactor and separated from the FM solution by a dialysis membrane (pore size 12 kDa, Sigma, USA), followed by incubation for 20 h at 30 °C under moderate stirring.
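As a rough illustration of the setup described above, the per-reaction amounts follow directly from the stated final concentrations and the 50 μl RM volume. The short Python sketch below is only a bookkeeping aid; the dictionary and function names are illustrative assumptions, and only a few of the listed components are included.

```python
# Bookkeeping sketch for one continuous cell-free reaction.
# Concentrations and volumes are taken from the text; all names are illustrative.

RM_VOLUME_UL = 50.0    # reaction mixture volume, microliters
FM_VOLUME_UL = 750.0   # feeding mixture volume, microliters
S30_FRACTION = 0.30    # S30 extract as a fraction of the RM volume

# Final concentrations of selected high-molecular-weight components, mg/ml.
RM_COMPONENTS_MG_PER_ML = {
    "plasmid DNA": 0.3,
    "total tRNA": 0.5,
    "pyruvate kinase": 0.04,
    "T7 polymerase": 0.0055,  # 5.5 ug/ml
}


def mass_ug(conc_mg_per_ml: float, volume_ul: float) -> float:
    """Mass in micrograms; 1 mg/ml is numerically equal to 1 ug/ul."""
    return conc_mg_per_ml * volume_ul


def per_reaction_amounts(volume_ul: float = RM_VOLUME_UL) -> dict:
    """Micrograms of each listed component needed for one RM."""
    return {
        name: mass_ug(conc, volume_ul)
        for name, conc in RM_COMPONENTS_MG_PER_ML.items()
    }


s30_volume_ul = S30_FRACTION * RM_VOLUME_UL    # volume of S30 extract per RM
fm_to_rm_ratio = FM_VOLUME_UL / RM_VOLUME_UL   # feeding-to-reaction volume ratio

if __name__ == "__main__":
    for name, amount in per_reaction_amounts().items():
        print(f"{name}: {amount:.3g} ug per {RM_VOLUME_UL:.0f} ul RM")
    print(f"S30 extract: {s30_volume_ul:.1f} ul, FM:RM = {fm_to_rm_ratio:.0f}:1")
```

Scaling the reaction only requires changing the volume passed to `per_reaction_amounts`; the concentrations themselves stay fixed.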
Isolation and purification of GPCR samples
The RMs containing synthesized GPCRs were centrifuged for 15 min at 14000 rpm. The resulting precipitates were solubilized in buffer A (20 mM Tris-HCl, 250 mM NaCl, 1 mM NaN3, pH 8.0) containing 1% sodium dodecyl sulfate (SDS), 1 mM dithiothreitol, and 8 M urea. The solubilized proteins were loaded onto a column with Ni²⁺-Sepharose (GE Healthcare, Sweden), washed with 10 column volumes of buffer A containing 1% SDS, and eluted with 3 volumes of buffer A containing 1% SDS and 500 mM imidazole. The GPCR samples were dialyzed against buffer A containing 1% SDS.
The eluate fractions were analyzed by SDS-PAGE and Western blotting using mouse monoclonal antibodies against the hexahistidine sequence (His-tag® Monoclonal antibody, Novagen, USA). The amount of purified GPCR samples was determined spectrophotometrically at room temperature from the absorption at 280 nm. The CD spectra were recorded at room temperature on a J-810 spectrometer (Jasco, Japan).
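The text does not state how absorbance at 280 nm was converted into protein amount. One common approach, shown here as an assumption rather than as the authors' procedure, applies the Beer-Lambert law with a molar extinction coefficient estimated from the Trp, Tyr, and cystine content (5500, 1490, and 125 M⁻¹ cm⁻¹ per residue, respectively). The residue counts, molecular weight, and absorbance in the example are hypothetical placeholders.

```python
# Sketch: protein concentration from A280 via the Beer-Lambert law.
# This is a generic estimate, not the procedure documented in the paper.


def molar_extinction_280(n_trp: int, n_tyr: int, n_cystine: int = 0) -> float:
    """Approximate molar extinction coefficient at 280 nm, in M^-1 cm^-1."""
    return 5500 * n_trp + 1490 * n_tyr + 125 * n_cystine


def concentration_mg_per_ml(a280: float, epsilon: float,
                            mol_weight_da: float, path_cm: float = 1.0) -> float:
    """Convert absorbance to mg/ml: c = A / (epsilon * l), then multiply by MW."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    molar = a280 / (epsilon * path_cm)   # mol/l
    return molar * mol_weight_da         # (mol/l) * (g/mol) = g/l = mg/ml


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    eps = molar_extinction_280(n_trp=8, n_tyr=12, n_cystine=2)
    conc = concentration_mg_per_ml(a280=0.5, epsilon=eps, mol_weight_da=40000)
    print(f"epsilon = {eps} M^-1 cm^-1, concentration = {conc:.3f} mg/ml")
```

Because the samples are dissolved in SDS, an estimate of this kind should be read as approximate.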

RESULTS AND DISCUSSION
Design of the GPCR genes
Truncated variants of the receptors containing additional point substitutions were used to increase the stability of the GPCR samples and to reduce the aggregation tendency of the proteins. Genetic engineering methods were used to excise the N- and C-terminal extramembrane regions that do not participate in ligand binding [7-9,22-24]. The deletion of the C-terminal regions of the receptors removed the cysteine residues (341, 435, and 320) that are presumably the sites of post-translational attachment of palmitic acid residues in human β2AR, M1-mAChR, and SSTR5, respectively [7,23,24]. In addition, a fragment of the third cytoplasmic loop (L3), which also does not participate in ligand binding, was deleted from the M1-mAChR molecule [8,9,25]. The resulting genes encoded the regions 25-340, 19-224/354-426, and 37-319 of the human receptors β2AR, M1-mAChR, and SSTR5, respectively. Additional His10-tag sequences were inserted at the 3′-end of the genes to enable purification of the recombinant proteins by Ni²⁺ affinity chromatography.
The truncated genes of the β2AR, M1-mAChR, and SSTR5 receptors encoded 10, 9, and 10 cysteine residues, respectively. Among those, only the residues from the extracellular region presumably participate in the formation of disulfide bonds (Cys106-Cys191 and Cys184-Cys190 in β2AR; Cys98-Cys178 and Cys391-Cys394 in M1-mAChR; Cys112-Cys186 in SSTR5; the numbering refers to the native sequence of the receptors). To reduce the aggregation of recombinant proteins caused by the formation of "non-native" intermolecular disulfide bonds, transmembrane and cytoplasmic Cys residues were replaced by site-directed mutagenesis. Based on the data in [26,27], the Cys77, Cys116, and Cys125 residues in β2AR were replaced with Val, and Cys285, Cys327, and Cys265 were replaced with Ser. In M1-mAChR, the Cys69, Cys205, Cys417, and Cys421 residues were replaced with Ser [28]. In SSTR5, the Cys129, Cys237, and Cys260 residues were replaced with Ser; Cys169, Cys218, and Cys220, with Val; and Cys51 and Cys298, with Gly. Furthermore, an additional stabilizing Glu122Trp substitution was introduced into the β2AR sequence [29].
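The substitutions listed above use the numbering of the native receptor, while the truncated constructs start at a later residue, so an offset has to be applied when a sequence is edited. A minimal Python sketch of this bookkeeping follows; the function name, the toy sequence, and the toy substitutions are hypothetical and do not reproduce any receptor sequence.

```python
# Sketch: apply point substitutions given in native-sequence numbering
# to a truncated protein sequence. All inputs below are toy examples.


def apply_substitutions(sequence: str, first_residue_number: int,
                        substitutions: list) -> str:
    """Return the sequence with (original, native_position, replacement) applied.

    first_residue_number is the native number of sequence[0].
    """
    residues = list(sequence)
    for original, position, replacement in substitutions:
        index = position - first_residue_number
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} is outside the construct")
        if residues[index] != original:
            raise ValueError(
                f"expected {original} at {position}, found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


if __name__ == "__main__":
    # Toy construct whose first residue carries native number 25.
    toy = "MACDEFCGH"
    mutated = apply_substitutions(toy, 25, [("C", 27, "V"), ("C", 31, "S")])
    print(mutated)
```

Checking the original residue before replacing it catches numbering mistakes, which are easy to make when a construct also carries an internal deletion.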
Expression of the GPCR genes in a cell-free system
The introduction of membrane-mimicking components into the RM makes it possible to synthesize MPs in soluble and functionally active forms [13-18]. However, most of these additives (e.g., detergent molecules) may reduce the productivity of the system by partially or completely inhibiting the synthesis of the target protein [14-17]. For this reason, we did not use membrane-mimicking compounds when comparing the expression efficiency of the GPCR genes with additional 5′-end regions. Under these conditions, the target proteins accumulated as a precipitate in the RM. The precipitates were dissolved in a harsh detergent (SDS) in the presence of urea and dithiothreitol as a reducing agent. The amount of synthesized protein was determined spectrophotometrically after the dissolved precipitates had been purified by Ni²⁺ affinity chromatography. The synthesis of the target proteins was confirmed using monoclonal antibodies against the hexahistidine sequence. As expected, direct expression of the truncated β2AR, M1-mAChR, and SSTR5 genes in CF systems based on the E. coli S30 extract was inefficient. The yield of the target proteins after purification did not exceed 0.1 mg per 1 ml of RM (Fig. 2). It should be noted that highly efficient production (with a yield of up to 1.6 mg/ml of RM) of bacteriorhodopsin from the Gram-positive bacterium Exiguobacterium sibiricum (ESR) [30], a structural homolog of GPCRs that also contains seven TM helices, has been observed previously [15]. We supposed that the low yield of the model GPCRs could be attributed to the low efficiency of translation initiation caused by the formation of an mRNA secondary structure at the beginning of the target gene.
To test this assumption, the 5′-end regions of the truncated GPCR genes encoding the extracellular N-terminal amino acid residues preceding the first TM helix (25-33, 19-23, and 37-38 in β2AR, M1-mAChR, and SSTR5, respectively) were replaced with the nucleotide sequence encoding the first 6 a.a. of bacteriorhodopsin ESR (ESR-tag; the sequence length includes the N-terminal Met) (Fig. 1). This substitution significantly increased the efficiency of target protein production (Fig. 2). The yield of the ESR-tag-β2AR hybrid protein was comparable to that of the ESR protein, whereas the level of synthesis of the other two hybrid proteins (ESR-tag-M1-mAChR and ESR-tag-SSTR5) was approximately three times lower (~0.5 mg/ml of RM).

Comparison of the efficiency in GPCR synthesis with various N-terminal fusion tags
The results obtained confirmed that the 5′-end sequence plays a significant role in efficient expression in a cell-free system. However, the yields of the target proteins attained using the ESR-tag were presumably not optimal: synthesis of recombinant MPs in continuous CF systems based on the E. coli S30 extract with yields of up to 4-6 mg/ml of RM has been described in the literature [14]. To further optimize the synthesis of the model GPCRs, we tested four additional N-terminal fusion tags. Two of those, the T7-tag (11 a.a.) and the TRX protein (11.8 kDa), have previously been used in cell-free production of GPCRs [14,16,17], whereas the Mistic protein (12.8 kDa) has been used for GPCR production in E. coli [31,32]. In addition, we tested the sequence encoding the N-terminal fragment (16 a.a.) of ribonuclease A (N-terminal fragment of the S-peptide, S-tag), which is used to detect recombinant proteins and purify them by affinity chromatography [33] but has never been used as an N-terminal fusion tag for the production of recombinant MPs. In contrast to the hybrid genes with the 5′-end fragment encoding the ESR-tag, in which the tag replaced the native N-terminal region, the nucleotide sequences encoding the T7-tag, TRX, Mistic, and S-tag were added in frame to the 5′-end of the truncated GPCR genes (Fig. 1).
In most cases, the use of N-terminal fusion tags increased the yield of the model receptors, but the yield levels varied for different proteins. The T7-tag increased the yields of the M1-mAChR and SSTR5 receptors to ~0.5 mg/ml of RM, whereas the β2AR level stayed low and was comparable to that observed during direct expression. TRX also provided a small increase in the synthesis of the target proteins, to ~0.3-0.7 mg/ml of RM (hereinafter, the amounts of the target proteins are given excluding the fusion-tag portion; Fig. 2). Meanwhile, the N-terminal fusion tags Mistic and S-tag considerably increased the production of β2AR and M1-mAChR (Fig. 2). The highest yield of β2AR (~1.9 mg/ml of RM) was observed with the Mistic protein, and the highest yield of M1-mAChR (~3.6 mg/ml of RM) was attained for the S-tag hybrid protein (Fig. 2). However, none of the sequences used gave a considerable increase in the SSTR5 yield. The yields of this receptor (0.4-0.7 mg/ml of RM) were very close for the various hybrid constructs (Fig. 2). It seems that translation initiation is not the only factor crucial for efficient cell-free synthesis of SSTR5. Further optimization of the nucleotide sequence of the gene (e.g., replacement of codons uncommon in E. coli) is presumably required to increase the production level in a cell-free system. It should be noted that a similar SSTR5 yield (~0.5 mg/ml of RM) was observed earlier in a bacterial continuous CF system when using a full-length (nontruncated) hybrid of the receptor with the N-terminal T7-tag sequence [34].
As previously mentioned, the increase in the efficiency of protein synthesis when using additional 5′-end sequences can presumably be attributed to a reduced ability of the 5′-end mRNA fragment to form a secondary structure. To test this assumption, the formation of a secondary structure by the 5′-end mRNA fragments used for GPCR production was modeled. The modeling was performed using the Mfold software, which analyzes the free energy of formation of RNA secondary structure [35]. The free energy of secondary structure formation was calculated for mRNA fragments containing four nucleotides upstream of the start codon, the start codon, and 34 nucleotides of the target protein gene or the fusion tag downstream of the start codon, as described in [20]. The computation (Table) showed that the native sequences of the truncated receptors can form stable secondary structures (∆G ~ -5.6, -8.2, and -19.3 kcal/mol for β2AR, M1-mAChR, and SSTR5, respectively). The use of the T7-tag and TRX slightly reduces the stability of the secondary structure of the 5′-end mRNA fragment (∆G from approximately -5.5 to -7.8 kcal/mol). Meanwhile, the use of the N-terminal sequence of bacteriorhodopsin ESR considerably reduces the stability of the secondary structure of the 5′-end mRNA fragment in β2AR and M1-mAChR (∆G ~ -3.1 and -3.5 kcal/mol, respectively). The mRNA secondary structures with the lowest stability were obtained for the Mistic and S-tag sequences (∆G ~ -1.3 and -3.3 kcal/mol, respectively). The qualitative correlation between the calculated energies and the yields of GPCRs indirectly supports the important role of 5′-end mRNA secondary structure in decreasing the efficiency of translation initiation and, as a consequence, the overall efficiency of cell-free synthesis.
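The 41-nucleotide window described above (4 nt upstream of the start codon, the start codon itself, and 34 nt downstream) can be cut out of a construct sequence before it is passed to a folding program. The Python sketch below shows only this extraction step; the function name and the example sequence are illustrative assumptions, and the free energy itself would still have to be computed with an external tool such as Mfold.

```python
# Sketch: extract the translation-initiation window used for folding analysis.
# The example sequence is synthetic and does not correspond to any construct.


def initiation_window(mrna: str, start_index: int,
                      upstream: int = 4, downstream: int = 34) -> str:
    """Return upstream nt + start codon + downstream nt around start_index.

    start_index is the 0-based position of the first base of the start codon.
    """
    codon = mrna[start_index:start_index + 3].upper()
    if codon not in ("AUG", "ATG"):
        raise ValueError(f"no start codon at index {start_index}: {codon!r}")
    begin = start_index - upstream
    end = start_index + 3 + downstream
    if begin < 0 or end > len(mrna):
        raise ValueError("sequence is too short for the requested window")
    return mrna[begin:end]


if __name__ == "__main__":
    # Synthetic 5' region (14 nt) followed by a start codon and a toy ORF.
    leader = "AAGGAGAUAUACAU"
    orf = "AUG" + "GCA" * 12
    window = initiation_window(leader + orf, start_index=len(leader))
    print(len(window), window)
```

With the default arguments the window is 4 + 3 + 34 = 41 nucleotides long, matching the fragment length used for the calculation in the text.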
Modification of the 5′-end region of the target protein gene is not the only way to prevent the formation of an mRNA secondary structure and increase the efficiency of translation initiation. Nucleotide sequences from the 5′-untranslated region of the mRNA can also affect these processes. In this study, we used genetic constructs based on the pET22b(+) vector (Novagen) containing the lac operator sequence inserted between the T7 promoter and the ribosome-binding site (RBS). According to published data, the use of pIVEX vectors (Roche Applied Science, USA), which lack the lac operator, can increase the efficiency of direct expression of GPCR genes in bacterial CF systems [34]. To verify this, we tested the efficiency of direct expression of the truncated M1-mAChR gene using the pIVEX2.3 vector. The yield of the target protein (~0.1 mg/ml of RM) in this case was no higher than that obtained by direct expression of the M1-mAChR gene cloned in the pET22b(+) vector. These data agree with the results obtained for olfactory GPCRs, whose production in a bacterial cell-free system using pIVEX vectors was characterized by low efficiency [36]. Moreover, N-terminal fusion tags were also required for highly efficient expression of human protein genes cloned into pIVEX vectors [37]. Another way to address the low efficiency of translation initiation in CF systems is the rational design of the 5′-end sequence of the target protein gene using synonymous substitutions (without any changes in the encoded amino acid sequence) aimed at reducing the ability of the mRNA to form a secondary structure [20]. This approach was used to produce mammalian cytokines when the presence of a fusion tag sequence (N-terminal fragment of chloramphenicol acetyltransferase, 5 a.a.) hindered the formation of the spatial structure [38].

Table. Free energy of formation of the secondary structure by the 5′-end mRNA fragment (∆G, kcal/mol)

Analysis of recombinant GPCRs
Representative gels are shown in Fig. 3. The resulting samples, like other MP samples, show anomalous electrophoretic mobility, which is presumably caused by incomplete denaturation of MP molecules in SDS [39]. Separate bands corresponding to receptor monomers, dimers, trimers, and higher-order aggregates were detected on the gels (Fig. 3). This behavior is typical of GPCRs, which tend to form dimers and trimers in biological membranes and are prone to spontaneous aggregation due to hydrophobic interactions between TM helices even in harsh detergent solutions [31]. The aggregation level of GPCR samples depends on the receptor type, the sequence of the N-terminal fusion tag, and presumably on the protein concentration in the sample. Thus, the highest amount of high-molecular-weight aggregates was observed in the S-tag-M1-mAChR samples, which were synthesized most efficiently. The secondary structure of the ESR-tag-M1-mAChR hybrid, which exhibited the lowest degree of aggregation in solution, was analyzed by CD spectroscopy (Fig. 4). The analysis revealed that the α-helical structure was predominant (α-helix, 65%; β-sheet, 4%; β-turn, 9%; irregular regions, 22%), which indicates that the secondary structure of the receptor is partially formed in the environment of SDS micelles. It should be noted that the α-helical content expected for the truncated M1-mAChR receptor, estimated by analogy with the known crystal structures of M2- and M3-mAChR [8,9], is ~72%.
Further investigation of recombinant GPCRs requires either optimization of the procedure for solubilizing the target protein from the RM precipitate, followed by the development of renaturation methods for the obtained samples, or the use of membrane-mimicking media during CF synthesis, which in some cases allows MPs to be synthesized in a functionally active form [13,15,34,35].

CONCLUSIONS
The data obtained demonstrate that the use of the amino acid sequences of the ESR-tag, S-tag, and Mistic protein as N-terminal fusion tags enables highly efficient production of human GPCRs in a cell-free system based on the E. coli S30 extract. These sequences provide target protein yields (0.6-3.8 mg/ml of RM) that are sufficient for further structural and functional studies. The present work is the first to demonstrate that the ESR-tag and S-tag can be used to increase the level of heterologous production of MPs.