Research Article - (2020) Volume 14, Issue 5
Shilei Wang1#, Quanyong He2#, Jinlei Ye1, Zhichao Kang1, Qiping Zheng1, Shuo Liu3, Jun He3 and Lichun Sun1,2,4*
1Shenzhen Academy of Peptide Targeting Technology at Pingshan and Shenzhen Tyercan Bio-pharm Co., Ltd., Shenzhen, Guangdong, China
2The Third Xiangya Hospital of Central South University, Changsha, China
3Sino-US Innovative Bio-Medical Center and Hunan Beautide Pharmaceuticals, Xiangtan, Hunan, China
4Department of Medicine, School of Medicine, Tulane University Health Sciences Center, New Orleans, LA70112, USA
#Equally contributed to this work
*Corresponding Author:
Lichun Sun
Department of Medicine
School of Medicine, Tulane University Health Sciences Center
New Orleans, LA70112, USA
Tel: 504-988-1179
E-mail: peptide612@gmail.com; lsun@tulane.edu
Received Date: July 12, 2020; Accepted Date: July 26, 2020; Published Date: August 31, 2020
Citation: Wang S, He Q, Ye J, Kang Z, Zheng Q, et al. (2020) N-linked Glycosylation and its Potential Application in Drug Development. Health Sci J. 14 No. 5: 743.
DOI: 10.36648/1791-809X.14.5.743
Protein glycosylation is a site-specific enzymatic process that attaches oligosaccharides or carbohydrates to proteins. N-linked glycosylation (N-glycosylation) is the major type of glycosylation for the post-translational and co-translational modification of proteins in eukaryotic cells. It links saccharide molecules to proteins by covalently coupling oligosaccharides or glycans to the amino acid residue asparagine (Asn, N), mostly with the requirement of an Asn–X–Ser/Thr (N-X-S/T) consensus sequence. N-linked protein glycosylation plays critical roles in biological and pathological processes and is also applied in modern drug development. In particular, engineering N-linked glycosylation site(s) can stabilize recombinant fusion proteins. This technology has been widely applied in drug discovery, especially for peptide drugs such as rabies viral glycoprotein (RVG), cardiac-targeting peptide (CTP) and bovine adrenal medulla (BAM).
Keywords
Glycosylation; N-linked glycosylation; Asparagine (Asn, N); Conservative GNSTM motif; Proteins; Oligosaccharides
Introduction
Glycosylation is the enzymatic process that covalently attaches oligosaccharides or carbohydrates to macromolecules, generally proteins and lipids. It differs from glycation (so-called non-enzymatic glycosylation). Glycosylation is widespread in eukaryotes and is also observed in certain prokaryotes. Protein glycosylation is a site-specific process of co-translational and post-translational modification in which oligosaccharides or glycans are covalently coupled to specific amino acid residues on protein molecules to form glycoproteins. Glycosylation reactions occur in the endoplasmic reticulum (ER) and the Golgi apparatus in eukaryotic cells [1]. This process can change protein folding, regulate protein functions and provide protein diversity [1,2]. Glycosylation produces many different types of glycoproteins. These proteins regulate various biological activities and molecular signaling pathways, are involved in significant biological processes, and play critical roles in physiology and pathology. Abnormal glycosylation has been shown to be associated with certain congenital disorders and other human diseases such as cancers, immune diseases, nerve diseases, diabetes and Alzheimer's disease [3].
The classification of protein glycosylation
There are several types of glycosylation, including N-linked glycosylation (N-glycosylation), O-linked glycosylation (O-glycosylation), C-mannosylation, phospho-glycosylation and glypiation (Figure 1) [3]. N-linked glycosylation covalently attaches a carbohydrate to the nitrogen atom (the amide nitrogen) of the amino acid residue asparagine (Asn, N) of a protein in the post-translational process, and is also called asparagine-linked (Asn-linked) glycosylation [4]. O-linked glycosylation attaches oligosaccharides to certain amino acid residues, mostly serine (Ser, S) and threonine (Thr, T), and rarely to other amino acids such as tyrosine (Tyr, Y), hydroxylysine (Hyl, a post-translational hydroxy modification of lysine (Lys)), or hydroxyproline (Hyp, a post-translational modification of proline (Pro)) [3,5]. C-mannosylation is the post-translational modification of the amino acid residue tryptophan (Trp, W) in proteins carrying the conserved sequence Trp-X-X-Trp (W-X-X-W), where X means any amino acid residue [6-8]. Glypiation adds a glycosylphosphatidylinositol (GPI) anchor to a protein and localizes the protein to cellular membranes [9,10]. Phospho-glycosylation (or phospho-serine glycosylation) adds a carbohydrate to the amino acid residue serine of a protein through a phosphodiester bond [11].
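The Trp-X-X-Trp motif mentioned above for C-mannosylation can be located in a sequence with a simple pattern search. The following Python sketch is only an illustration: the function name find_wxxw is hypothetical, the input is assumed to be a one-letter amino acid string, and a match marks a candidate motif rather than a confirmed modification site.

```python
import re

# Trp, any two residues, Trp. The lookahead lets overlapping motifs be reported.
WXXW = re.compile(r"(?=(W[A-Z]{2}W))")


def find_wxxw(sequence):
    """Return (1-based position of the first Trp, matched motif) for each W-X-X-W."""
    sequence = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in WXXW.finditer(sequence)]


if __name__ == "__main__":
    # Made-up test sequence, not taken from any real protein.
    for position, motif in find_wxxw("AAWSPWAA"):
        print(position, motif)
```

The pattern reports motifs that share a tryptophan (for example two overlapping W-X-X-W stretches) as separate hits, which seems the safer default when screening candidate sites.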
Figure 1: Classification and process of protein glycosylation. Protein glycosylation is classified into five different types: N-linked glycosylation, O-linked glycosylation, C-mannosylation, phospho-glycosylation, and glypiation (addition of a glycosylphosphatidylinositol (GPI) anchor).
The significance of protein glycosylation
For many proteins, glycosylation is a general and critical process of co-translational and post-translational modification. This process can promote protein folding, maturation, stability, solubility, secretion and localization, regulate protein signaling and protein interactions, provide diversity among protein macromolecules, and mediate various physiological functions and biological activities [1,12]. Glycosylation can contribute to cell protection and cell stability. Glycosylated proteins can serve as specific ligands for exogenous receptors. Certain sugar chains can serve as specific receptors for various viruses (such as the coronavirus SARS-CoV-2), bacteria and parasites [2,13]. The sugar chains of glycosylated proteins can also serve as specific ligands for endogenous receptors that mediate cellular clearance and intracellular transport.
Glycosylation plays critical roles in regulating various biological functions and processes such as cell recognition, cell differentiation, signal transduction, and immune responses. The loss and the gain of N-glycosylation are common processes in natural evolution, and certain specific glycosylations are associated with physiology and pathology [1,14]. Abnormal glycosylation can lead to pathogenesis such as tumorigenesis, immune diseases and metabolic diseases [3,15]. In particular, glycosylation is of significance in tumor formation and progression, in structural and functional changes of tumor cells, and in anti-tumor drug development. Glycosylation can regulate tumor cell proliferation, migration and invasion, can induce drug resistance in tumor cells, and can serve as a tumor marker. The glycosylation strategy can also be applied in new drug development or drug delivery [16,17].
N-linked glycosylation
Among the different protein glycosylations, N-linked glycosylation is the major type of glycosylation for the post-translational and co-translational modification of proteins in eukaryotic cells. N-linked protein glycosylation plays an important role in the stability of protein spatial structure and in protein transport. N-linked glycans may strongly affect the structure of the protein polymers to which they are covalently linked, and they can do so in different ways. First, because N-linked glycosylation is co-translational, the addition of carbohydrates to the partially folded nascent polypeptide can affect or promote the protein folding process. Second, the carbohydrates can stabilize the mature glycosylated proteins [18]. N-linked glycosylation occurs at the amino acid residue asparagine (Asn, N) of the target proteins, is of significance in biological and physiological activities, and has been broadly used in drug development [2,19,20].
The biosynthetic process of N-linked glycosylation
N-glycosylation links saccharide molecules to proteins through covalent bonds, forming oligosaccharides or glycans by attaching N-acetylglucosamine (GlcNAc) to the nitrogen atom of the amino acid residue asparagine (Asn, N) with a β−1N linkage (Figure 2) [3,21]. Many proteins, particularly secretory proteins and membrane proteins, are N-glycosylated in the endoplasmic reticulum (ER) and the Golgi apparatus by covalent addition of oligosaccharides to the side chain of the Asn residue, mostly with the requirement of an Asn–X–Ser/Thr (N-X-S/T) consensus sequence, where X represents any amino acid residue except proline (Pro, P), which blocks the glycosylation process (Figure 2) [1,6,22]. N-linked glycosylation in the ER and Golgi includes biosynthesis of the dolichol-linked oligosaccharide precursor, linkage of the precursor to the target protein, and processing and modification of the oligosaccharides as the glycoprotein matures. The glycosylated proteins then travel to the plasma membrane, where they are either embedded in the membrane or secreted outside the cell [23,24].
Figure 2: The structures of N-linked glycosylation proteins. Under the action of glycosidases, N-acetylglucosamine (GlcNAc) and mannose molecules are linked to Asn residues to form the basic N-sugar structures. Galactose (Gal), N-acetylneuraminic acid (Neu5Ac), N-acetylgalactosamine (GalNAc) and other molecules participate in the process of glycosylation.
In the ER, the process comprises the synthesis of the precursor oligosaccharides, the linkage of the oligosaccharides to proteins and the initial trimming of the precursor oligosaccharides [23]. The lipid molecule dolichol phosphate attaches to the ER membrane on the cytoplasmic side, and the sugar molecule is bound to dolichol via a pyrophosphate linkage [3,21]. The precursor lipid then flips to the lumenal side of the ER membrane, where the oligosaccharide chain is extended with more sugar molecules to form the precursor oligosaccharide, which mainly consists of three glucoses (Glc), nine mannoses (Man) and two N-acetylglucosamines (GlcNAc) [1,21]. These precursor oligosaccharides are attached to the target proteins, which are either secretory proteins inside the lumen or membrane proteins on the lumenal face of the ER membrane. The oligosaccharide-protein complex is formed by the oligosaccharyltransferase (OST), which recognizes the consensus sequence Asn–X–Ser/Thr (N-X-S/T). The oligosaccharides are covalently linked to the amino acid residue Asn (N) of the target proteins at the nitrogen atom on the side chain of the residue (Figure 3) [21,24].
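The precursor composition given above (three Glc, nine Man, two GlcNAc) can be written down as a small data structure, which makes the residue count and the effect of trimming easy to follow. This Python sketch is illustrative only: the names PRECURSOR, shorthand and trim are hypothetical, and the single trimming step in the usage example is an arbitrary demonstration, not a statement about which residues are removed at which stage.

```python
from collections import Counter

# Composition of the precursor oligosaccharide as stated in the text.
PRECURSOR = Counter({"Glc": 3, "Man": 9, "GlcNAc": 2})

# Order used when writing the composition as a compact string.
RESIDUE_ORDER = ("Glc", "Man", "GlcNAc")


def shorthand(composition):
    """Write a composition as a compact string, skipping residues with a zero count."""
    return "".join(
        f"{name}{composition[name]}" for name in RESIDUE_ORDER if composition[name] > 0
    )


def trim(composition, residue, count=1):
    """Return a new composition with `count` copies of `residue` removed."""
    if composition[residue] < count:
        raise ValueError(f"cannot remove {count} {residue} from {shorthand(composition)}")
    trimmed = Counter(composition)
    trimmed[residue] -= count
    return trimmed


if __name__ == "__main__":
    print(shorthand(PRECURSOR), "total residues:", sum(PRECURSOR.values()))
    # Arbitrary illustration of one trimming step.
    print(shorthand(trim(PRECURSOR, "Glc")))
```

Returning a new Counter from trim keeps the precursor definition unchanged, so successive processing steps can be compared against the starting composition.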
Figure 3: The structures of antibody and the associated glycosylation. Basically, an antibody is composed of two identical heavy chains and two identical light chains, forming a Y-like structure. There is a conserved N-linked glycosylation site on the CH2 region of the dimeric Fc domain.
The oligosaccharide chains of the oligosaccharide-protein complexes are trimmed continuously, and the glycoproteins are then transferred into the Golgi apparatus for additional trimming and modification until they mature [14,25]. The Golgi apparatus has different compartments and two faces, the cis face and the trans face. The cis face, near the ER, fuses with the secretory vesicles from the ER and receives the unmodified, immature proteins and other contents. There are three types of N-linked glycans: high mannose, hybrid and complex. In the cis-Golgi compartment, the unmodified proteins and lipids are trimmed, and the mannose N-glycans are produced by adding mannose-6-phosphates to the oligosaccharide chains. These mannose N-glycans are then transported to the medial compartments and modified further. They eventually mature in the trans-Golgi compartments after modifications including glycosylation, post-translational modification and phosphorylation. The hybrid and complex types of N-glycans are produced within the medial-Golgi and trans-Golgi compartments. The vesicles carrying the matured proteins and lipids leave the Golgi apparatus from its trans face [22,26]. There are three types of vesicles: lysosomal vesicles, exocytotic vesicles and secretory vesicles. They release glycoproteins or other contents to the extracellular space or fuse with the membrane [22].
The consensus sequence of N-linked glycosylated sites
N-linked glycosylation occurs at the β-amide of the asparagine (N) residue of proteins, and the consensus sequence Asn–X–Ser/Thr (N-X-S/T) is essential. For bacterial N-linked protein glycosylation, the difference is that the N-X-S/T consensus sequence is required but not sufficient. However, the glycosylation processes are homologous in bacteria and eukaryotes [26]. Moreover, the glycosylation rate at the N-X-T sequon is much higher than that at the N-X-S sequon [6,7]. X at the second site of the N-X-S/T sequon can be any of the 20 amino acids except proline (Pro, P), which blocks N-glycosylation of the protein [6,7,27]. An N-X-S/T sequon present within a decapeptide sequence at the C-terminus of proteins was found to induce post-translational and co-translational modification of the proteins [7]. Nevertheless, certain proteins carrying the N-X-S/T sequon are reportedly not glycosylated or only poorly glycosylated [28]. In addition, analysis showed that the effect of proline on protein glycosylation depends on its site around the N-X-S/T sequon, that is, on its distance from the N of the sequon. In particular, proline at the X site is much more frequent in non-glycosylated proteins than in glycosylated proteins [27]. The amino acid residue at the Y site of the N-X-S/T-Y sequence is critical for glycosylation efficiency [28], and proline at the Y site reduces glycosylation [27]. Reportedly, N-glycosylation has also been observed at an N-X-C sequon, with cysteine (Cys) at the third site instead of Ser or Thr [7], and at N-G-G-T and N-S-G-Psr (Psr means phosphoserine) [27].
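As a rough illustration of the sequon rule described above (N, then any residue except P, then S or T), the following Python sketch scans a one-letter protein sequence for candidate sites and notes whether the residue at the Y site is proline. The function name find_sequons and the keys of the returned records are hypothetical. A hit marks only a candidate site, since, as noted above, some proteins carrying the sequon are not glycosylated, and the rarer N-X-C sequon is deliberately not covered.

```python
import re

# N, any residue except P, then S or T. The lookahead reports overlapping sequons.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(sequence):
    """Return one record per candidate N-X-S/T sequon in a one-letter sequence."""
    sequence = sequence.upper()
    hits = []
    for match in SEQUON.finditer(sequence):
        start = match.start()
        # Residue right after the sequon (the Y site), if the sequence continues.
        y_site = sequence[start + 3] if start + 3 < len(sequence) else None
        hits.append(
            {
                "position": start + 1,        # 1-based position of the Asn
                "sequon": match.group(1),
                "third": sequence[start + 2],  # S or T
                "proline_at_y": y_site == "P",
            }
        )
    return hits


if __name__ == "__main__":
    for hit in find_sequons("GNSTM"):
        print(hit)
```

Recording whether the third residue is S or T, and whether proline follows the sequon, keeps the two efficiency-related observations from the text available for later ranking without building any scoring assumption into the scan itself.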
The three amino acids N, S and T in the consensus sequence GNSTM are critical and conserved for N-linked glycosylation. As the GNSTM motif shows, the two amino acid residues G and M on the N-terminal and C-terminal sides of the NST sequon can efficiently enhance N-glycosylation of proteins [29]. The amino acid residue immediately preceding the glycosylated N of the N-X-S/T sequon also affects glycosylation efficiency: N-glycosylation was maximized with G, followed by residues such as L, S and F, and minimized with P and M [6]. Besides the requirement for the N-X-S/T consensus sequence, N-glycosylation at the C-terminus of proteins is strongly related to the distance between the Asn (N) residue and the C-terminus [6,27]. Glycosylation is reduced when the N-X-S/T sequon is too close to the C-terminus. C-terminal glycosylation did not occur when only the three amino acids N-X-S/T were present, but was observed with the four-residue sequence N-X-T-M at the C-terminus. Moreover, N-glycosylation efficiency at the C-terminus increased gradually as the N-X-S/T sequence was extended and was maximized with a length of six residues at the C-terminus (NXTMMS) [6].
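The C-terminal distance effect described above can be expressed as a small check on the number of residues from the sequon Asn to the C-terminus. The Python sketch below is a simplification under stated assumptions: the function names and category labels are hypothetical, the thresholds (three, four and six residues) are taken directly from the observations reported above, and treating tails longer than six residues as the maximal case is an assumption of this sketch rather than a reported result.

```python
def c_terminal_tail_length(sequence, asn_position):
    """Number of residues from the sequon Asn (1-based position) to the C-terminus, inclusive."""
    sequence = sequence.upper()
    if not 1 <= asn_position <= len(sequence):
        raise ValueError("asn_position is outside the sequence")
    if sequence[asn_position - 1] != "N":
        raise ValueError("residue at asn_position is not Asn (N)")
    return len(sequence) - asn_position + 1


def tail_category(tail_length):
    """Map a tail length to the qualitative observation summarized in the text."""
    if tail_length < 3:
        raise ValueError("fewer than three residues cannot hold an N-X-S/T sequon")
    if tail_length == 3:
        return "not observed"       # bare N-X-S/T at the C-terminus
    if tail_length < 6:
        return "reduced"            # for example N-X-T-M
    return "maximal"                # six residues, e.g. NXTMMS (longer tails: assumption)


if __name__ == "__main__":
    # Made-up sequences that differ only in the length of the C-terminal tail.
    for seq in ("AAANGT", "AAANGTM", "AAANGTMMS"):
        length = c_terminal_tail_length(seq, 4)
        print(seq, length, tail_category(length))
```

Such a check could be combined with a sequon scan when designing a C-terminal tag, so that a motif placed too close to the end of the protein is flagged before cloning.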
The application of N-linked glycosylation in drug development
N-linked glycosylation of druggable proteins has been applied in modern drug development. For instance, engineered N-linked glycosylation is used to stabilize recombinant proteins or peptides. Most short peptides in nature are unstable and easily degraded, with very short half-lives, which has limited their clinical applications. The NXS/T glycosylation sequon can be artificially engineered and fused into recombinant genes of interest, leading to the expression of glycosylated recombinant proteins, antibodies, peptides or fusion proteins [16,19,29,30]. This can greatly enhance the stability of the proteins and peptides of interest and increase their druggability and potential clinical uses. For instance, Lamp-1 (lysosome-associated membrane protein) and Lamp-2 are exosomal membrane proteins. N-linked glycosylation of these proteins protects them from cellular proteolysis, and removal of their glycosylation sites markedly increased their degradation [31]. Meanwhile, N-linked glycosylation sites engineered into Lamp fusion proteins can extend their exosomal stability [16,29]. Irisin is a peptide hormone of 112 amino acid residues that is produced by FNDC5 cleavage and acts as a thermogenic adipomyokine; N-linked glycosylation can modulate its secretion [12]. Engineered N-linked glycosylation can also modify or change the functions of the glycosylated proteins. In Dr. Kreer’s laboratory, a glycosylation site (GNSTM motif) was artificially engineered into a non-glycosylated protein. The glycosylated protein, expressed in the yeast Komagataella phaffii, was recognized by endocytic receptors such as the mannose receptor (MR), showed enhanced uptake by dendritic cells (DCs) and further activated DCs [32]. These characteristics highlight the significant applications of engineered N-linked glycosylation in new drug development.
One example is the exosome, which has been widely applied for the delivery of drugs such as small molecules and short interfering RNAs (siRNAs) but is limited by its non-specificity. To overcome this limit, the lysosomal associated membrane protein 2b (Lamp2b) [31], a transmembrane protein on the exosome membrane surface, has been considered for drug-targeting therapeutics by fusion with short targeting peptides such as rabies viral glycoprotein (RVG), cardiac-targeting peptide (CTP) and bovine adrenal medulla (BAM) [16,17]. RVG, a neuron-specific peptide of 29 amino acid residues (YTIWMPENPRPGTPCDIFTNSRGKRASNG), was fused to the N-terminus of Lamp2b to establish a new exosomal system in which the RVG-Lamp2b fusion protein is embedded in the exosome membrane and the RVG peptide is displayed on the outer surface of the exosome, where it specifically recognizes and binds to the transmembrane acetylcholine receptor (AChR) [16]. The non-specific exosomes thus become RVG-specific and can serve as a drug delivery system via RVG-AChR interactions. This system can cross the blood-brain barrier and specifically target neurons to treat neuron-associated diseases. However, the short RVG peptide in the fusion protein is easily degraded. To minimize the degradation of RVG and extend its stability, engineered N-linked glycosylation was applied. Dr. Hung and his co-workers engineered an N-linked glycosylation site with the GNSTM motif at the N-terminus of the RVG-Lamp2b fusion protein. They found that the engineered N-linked glycosylation greatly enhanced the stability of RVG and other peptides near the glycosylation site while retaining their biological activities [16,29]. Another example is the cardiac-targeting peptide (CTP), which consists of 12 amino acid residues (APWHLSSQYSRT). It targets cardiomyocytes and heart tissue and can be used to deliver therapeutic drugs. In his study, Kim fused CTP to Lamp2b and added a glycosylation site to stabilize the CTP-Lamp2b fusion protein. This strategy enhanced CTP-Lamp2b-specific exosomal delivery to heart cells and heart tissue in both in vitro and in vivo assays, providing a potential treatment for heart diseases [17]. Bovine adrenal medulla (BAM) is another peptide, of 22 amino acid residues (YGGFMRRVGRPEWWMDYQKRYG), that has high binding affinity for opioid receptors and sensory neuron-specific receptors (SNSRs) and may potentially be used to deliver exosomal drugs to SNSR-specific sensory neurons for the treatment of the associated diseases. We fused BAM to Lamp2b with an extra GNSTM motif to stabilize BAM in the BAM-Lamp2b fusion protein. The preliminary data support that BAM is much more stable in the fusion protein with the glycosylation site (GNSTM-BAM-Lamp2b) than in that without it (BAM-Lamp2b) (data not shown). In addition, certain other proteins or peptides, such as human serum albumin (HSA), leptin and irisin, are fused with peptides for drug targeting or peptide stabilization. For instance, GLP was fused at the C-terminus of HSA to form the HSA-GLP fusion protein, which greatly increased the stability of GLP and has been approved for the treatment of diabetes. The next-generation drug may be HSA-GLP-GNSTM, with a glycosylation site engineered at the C-terminus of HSA-GLP to extend the half-life of GLP. Leptin is a peptide hormone of 167 amino acid residues that can regulate body weight, obesity and other biological activities. Leptin was fused at its C-terminus with the M domain of the CD45 molecule, a small extracellular domain with four putative N-linked glycosylation sites. The leptin-M fusion protein showed increased N-linked glycosylation and accumulated in the endoplasmic reticulum (ER) in plants [33].
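The tagging idea described above can be sketched in a few lines of Python: the GNSTM motif is placed in front of each targeting peptide, and the N-X-S/T sequons are counted before and after. The peptide sequences are those quoted in the text; everything else is an assumption made for illustration. The names tag_peptide and count_sequons are hypothetical, the motif is joined directly to the peptide without any linker, and the Lamp2b portion is omitted because its sequence is not given here, so this is not the construct design used in the cited studies.

```python
# Targeting peptide sequences as quoted in the text.
TARGETING_PEPTIDES = {
    "RVG": "YTIWMPENPRPGTPCDIFTNSRGKRASNG",
    "CTP": "APWHLSSQYSRT",
    "BAM": "YGGFMRRVGRPEWWMDYQKRYG",
}

GLYCOSYLATION_MOTIF = "GNSTM"


def count_sequons(sequence):
    """Count N-X-S/T sequons, where X is any residue except proline."""
    return sum(
        1
        for i in range(len(sequence) - 2)
        if sequence[i] == "N" and sequence[i + 1] != "P" and sequence[i + 2] in "ST"
    )


def tag_peptide(name):
    """Prepend the GNSTM motif to a targeting peptide (no linker: assumption)."""
    return GLYCOSYLATION_MOTIF + TARGETING_PEPTIDES[name]


if __name__ == "__main__":
    for name, peptide in TARGETING_PEPTIDES.items():
        tagged = tag_peptide(name)
        print(name, len(peptide), count_sequons(peptide), count_sequons(tagged))
```

By this simple count, none of the three peptide sequences contains an N-X-S/T sequon on its own, so the GNSTM motif supplies the only candidate site in each tagged peptide. The count says nothing about whether a given site is actually glycosylated in cells.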
Engineered N-linked glycosylation has been widely used in various fields. Antibodies are also frequently glycosylated. In particular, a monoclonal antibody (mAb) has a conserved N-linked glycosylation site in the CH2 region of the dimeric Fc domain, at position 297 (Asn297) [2]. N-linked glycosylation plays a critical role in the structure, function and pharmacokinetics of antibodies. N-linked glycosylation may affect the steric hindrance between the two heavy chains and keep the Fc region in an open conformation. Removal of the glycosylation may cause the heavy chains to collapse [34]. Besides the primary site, about 20% of mAbs have a second N-linked glycosylation site located inside the variable region. Engineering of glycosylation sites has been applied to modify antibodies and change their pharmacokinetics. Many drugs based on mAbs have been successfully developed [2]. Adalimumab is a fully human IgG1 mAb targeting tumor necrosis factor (TNF)-alpha and an FDA-approved drug for the treatment of arthritis and certain skin disorders. Extra glycosylation sites engineered into adalimumab could enhance its conformational stability and, through steric hindrance, its resistance to aggregation [30]. In his study, Dr. Kriz selected the amino acid residues for N-linked glycosylation sites (Asn-X-Ser/Thr) at the CH3-CH3 interface using four criteria and eventually engineered two N-linked glycosylation sites in the CH3 domain. The monomeric Fc (monoFc) retained its specific binding affinity for the neonatal Fc receptor (FcRn) and extended the in vivo half-life of an antibody Fab domain [35]. Interestingly, Dr. Mrksich and his colleagues used a systematic approach for site-specific control of N-linked protein glycosylation. They screened and characterized 41 putative N-glycosyltransferases (NGTs) and found that glycosylation could be site-specifically controlled at four unique sites once the specific sequences of short peptide substrates, termed GlycTags, were inserted into a single target protein. This sequential glycosylation strategy can solve the problem of site-specific glycosylation of a target protein and provides potential therapeutic applications [19].
Conclusion
Applying new technologies to find next-generation or “me-better” drugs is always a hot topic in current drug research and development. Glycosylation plays key roles in various biological activities and pathological processes. N-linked glycosylation is the major type of glycosylation for the post-translational and co-translational modification of proteins in eukaryotic cells. Engineering an N-linked glycosylation site or a GNSTM motif into target proteins has been widely used for protein/peptide drug discovery and drug-targeting delivery systems. This strategy displays great advantages and, in particular, provides potential opportunities in the field of fusion protein drug development.
Acknowledgement
We greatly appreciate the support from the Shenzhen Science and Technology Program (Grant No.: KQTD20170810154011370), Xiangtan Institute of Industrial Technology Collaborative Innovation, and Xiangtan Science and Technology Bureau.