WORLD JOURNAL OF BIOLOGY AND BIOTECHNOLOGY
Computational analysis of heat shock protein (HSP) in Citrus × sinensis

Review Process: Peer review

Heat shock proteins (HSPs) are molecular chaperones and among the cell's most important regulatory proteins, present in all species. HSPs form a multigene family classified into six families according to molecular weight, which ranges from 8 kDa to 110 kDa: HSP100, HSP90, HSP70, HSP60, small heat shock proteins (sHSPs), and ubiquitin.

INTRODUCTION: Heat shock proteins (HSPs) are among the most important regulatory proteins and are present in the cells of all species. They are conserved proteins produced when a cell is exposed to heat shock or other stresses. Their principal function is to protect cells from heat shock by preventing the denaturation of cellular proteins (Kregel, 2002). Some members of the heat shock protein family are also called molecular chaperones, which assist in the proper assembly and packing of polypeptide chains. In addition to their key function in protein folding, chaperones are involved in protein transport and degradation, dissociation of protein clusters, refolding of stress-denatured proteins, and macromolecular matrix formation (Haslbeck et al., 2005). HSP levels are elevated by many environmental conditions, metabolic stresses, and physiological and pathophysiological factors, such as inflammatory mediators, toxic substances, alkalinity, heavy metal ions, anoxia, ethanol, ischemia, drugs, drought, fungi, aging, and many more (Tutar et al., 2006; Wu and Tanguay, 2006; van Noort et al., 2012).

Abbas et al. studied the SIhsp70 gene family in Solanum lycopersicum under Cd2+ stress and found higher expression of SIhsp70-11 when Cd2+ stress was applied (Abbas et al., 2022). Similarly, Davoudi et al. analyzed the HSP70 gene family in Cucurbita moschata under drought conditions (Davoudi et al., 2022). A study of HSP function in strawberry plants at elevated temperature found that heat stress induced a range of HSPs, of 19-29 kDa in leaves and 16-26 kDa in flowers (Ledesma et al., 2004).

HSPs form a multigene family with molecular sizes ranging from 8 to 110 kDa. Based on molecular weight, the HSPs are divided into six classes: HSP100, HSP90, HSP70 (chaperone), HSP60 (chaperone), small HSPs (sHSPs), and ubiquitin (Bakthisaran et al., 2015). HSP90 proteins are the most plentiful proteins in the cell (1-2% of total cellular protein) (Tutar and Tutar, 2010). The main function of the HSP90 family is to control protein folding; it also contributes to cell cycle control, transcriptional regulation, protein trafficking, and signal transduction (Wang et al., 2004). This family has remained the most conserved of all the HSP families over the course of evolution. HSP70s play crucial roles in many important biochemical functions. HSP40, HSP100, and many other nucleotide factors act as co-chaperones with HSP70 for proper protein folding (Kabani et al., 2002). Other cellular processes involving HSP70 include apoptosis, targeting of proteins for degradation, and translocation across membranes (Tutar, 2006; Tutar et al., 2006). HSP60 aids the folding and assembly of proteins and their subunits and occurs in multiple forms in the cell. Proteins of this family are located mostly in mitochondria and chloroplasts (Efeoğlu, 2009) and are abundant in bacterial and eukaryotic cells (Arsène et al., 2000).

The genus Citrus includes many cultivated plants, such as Citrus grandis (pummelo), Citrus limon (lemon), Citrus reticulata (tangerine and mandarin), Citrus paradisi (grapefruit) and Citrus × sinensis (sweet orange) (Barkley et al., 2006). Citrus belongs to the family Rutaceae (Moore, 2001). Citrus × sinensis is the most extensively grown citrus group in the world, accounting for over 70% of annual Citrus production (Flamini et al., 2003). Sweet oranges are an economical source of nutrition. They are rich in vitamin C, fiber, thiamine, folate, and antioxidants, and have a variety of health benefits (Xu et al., 2013). Sequence-level study of HSPs, including genomic characterization and cellular physiology, is important for fully understanding their evolutionary relationships, physicochemical properties, and related features in Citrus × sinensis.

OBJECTIVES: This study aimed to provide deep insight into the HSP families of Citrus × sinensis. The work describes the physicochemical properties, phylogeny, and functional genomic architecture of the HSP60, HSP70, and HSP90 families, providing a basis for future studies.

Cellular localization of HSP:
To identify the cellular locations of the obtained sequences, the protein subcellular localization prediction tool WoLF PSORT (https://wolfpsort.hgc.jp/) was used (Horton et al., 2007). The sequences of each family were submitted together to WoLF PSORT, and the scores of each sequence for the different organelles were collected. These data were then used to construct heat maps for the respective families with TBtools.

Phylogeny:
Multiple sequence alignments of HSP60, HSP70 and HSP90 were performed. MEGA7 was used to conduct phylogenetic analysis from the multiple sequence alignment (Kumar et al., 2016). The evolutionary history of HSP60, HSP70 and HSP90 was inferred using the Maximum Likelihood method based on the Jones-Taylor-Thornton (JTT) matrix-based model with 1000 bootstrap replicates.

Structure of Protein:
3D structures of proteins provide a better understanding of protein structure, binding sites and activity. A template search was conducted with SWISS-MODEL (https://swissmodel.expasy.org) using amino acid sequences as input. Templates with the highest percentage similarity were selected, and the GMQE and QMEAN scoring functions were used to evaluate global and per-residue model quality. The automated modeling procedure started from the selected template with the highest percentage similarity (Pembroke, 2000).

RESULTS:

Identification of HSP family:
Using collected Arabidopsis thaliana sequences as queries, 65 hits for HSP60, 44 hits for HSP70, and 15 hits for HSP90 of Citrus × sinensis were obtained from Phytozome BLAST analysis. Duplicated and unwanted sequences were then removed from these hits. The remaining sequences numbered 44, 22, and 23 for HSP60, HSP70, and HSP90, respectively.

Physicochemical characterization of HSP families:
Tables 1, S1 and S2 present the physicochemical properties of the collected HSP sequences. The HSP isoforms show wide diversity in molecular weight (35,000 to 87,000 Da), number of amino acids (250 to 800), genomic sequence length (1200 to 7400 bp), transcript length (800 to 3500 bp), and coding sequence length (800 to 2400 bp). An instability index (II) greater than 40 indicates that a protein is unstable. The present study (Table S1) revealed that all members of HSP60 are stable except CSHSP60-10, CSHSP60-18, CSHSP60-21, CSHSP60-22, CSHSP60-25, CSHSP60-36, CSHSP60-37, CSHSP60-50, CSHSP60-97, and CSHSP60-108. An aliphatic index (AI) greater than 65 indicates thermostability, and a lower GRAVY value indicates a hydrophilic nature. Thus, all members of the HSP60, HSP70 and HSP90 families are thermostable and hydrophilic except CSHSP60-92, CSHSP60-80, CSHSP60-69, CSHSP60-52, CSHSP60-44, CSHSP70-24, and CSHSP90-37 (Tables 1, S1 and S2). The pI value indicates the acidic or basic nature of an HSP. The pI values show that most members of the HSP60 family are acidic, with the exception of CSHSP60-28, CSHSP60-52, CSHSP60-56, CSHSP60-80, CSHSP60-86, CSHSP60-97, CSHSP60-101, CSHSP60-102, CSHSP60-106, and CSHSP60-108, which are basic.
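The thresholds described above can be illustrated with a minimal Python sketch. It assumes the standard definitions of the aliphatic index (Ikai's formula) and of GRAVY (the mean Kyte-Doolittle hydropathy), which may differ from the exact tool the authors used. The cut-offs "GRAVY below 0 means hydrophilic" and "pI below 7 means acidic" are common conventions assumed here, not values stated in the text. The instability index and pI are taken as precomputed inputs, and the function names and the example peptide are hypothetical.

```python
# Kyte-Doolittle hydropathy scale (standard published values).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _standard_residues(seq):
    """Keep only the 20 standard amino acids, upper-cased."""
    residues = [aa for aa in seq.upper() if aa in KD]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    return residues


def gravy(seq):
    """Grand average of hydropathicity: mean Kyte-Doolittle value."""
    residues = _standard_residues(seq)
    return sum(KD[aa] for aa in residues) / len(residues)


def aliphatic_index(seq):
    """Ikai's aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)),
    where X is the mole percentage of each residue."""
    residues = _standard_residues(seq)
    n = len(residues)

    def pct(aa):
        return 100.0 * residues.count(aa) / n

    return pct("A") + 2.9 * pct("V") + 3.9 * (pct("I") + pct("L"))


def classify(ii, ai, gravy_value, pi):
    """Apply the cut-offs: II > 40 unstable and AI > 65 thermostable
    (both from the text); GRAVY < 0 hydrophilic and pI < 7 acidic
    (assumed conventions)."""
    return {
        "stable": ii <= 40,
        "thermostable": ai > 65,
        "hydrophilic": gravy_value < 0,
        "acidic": pi < 7,
    }


if __name__ == "__main__":
    peptide = "MAKVLIDEGS"  # hypothetical sequence, not from the study
    # II and pI below are placeholder inputs, not computed values.
    print(classify(ii=35.0, ai=aliphatic_index(peptide),
                   gravy_value=gravy(peptide), pi=5.8))
```

Taking II and pI as inputs keeps the sketch short: computing them requires a dipeptide weight table and a set of pKa values, respectively, which a dedicated tool supplies.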
Out of the 44 HSP60 sequences, 25 are on the reverse strand and 19 on the forward strand. Table 1 shows that all members of HSP70 are acidic except CSHSP70-18, CSHSP70-35, CSHSP70-38, CSHSP70-41, and CSHSP70-43, which are basic. All members are stable except CSHSP70-7, CSHSP70-12, CSHSP70-14, CSHSP70-16, CSHSP70-35, and CSHSP70-43. Of the 22 members, 11 are on the reverse strand and the remainder on the forward strand. The AI values indicate that all members are thermally stable (Table 1). The physicochemical properties of the HSP90 family (Table S2) indicate that all members are acidic except CSHSP90-28, CSHSP90-39, and CSHSP90-49. Fifteen members are on the forward strand and 8 on the reverse strand. The HSP90 family is stable except for CSHSP90-10, CSHSP90-18, CSHSP90-21, CSHSP90-22, CSHSP90-23, CSHSP90-25, CSHSP90-33, CSHSP90-34, and CSHSP90-49.

Localization of HSP:
To predict where the collected HSP sequences occur in the cell, a detailed analysis was performed with WoLF PSORT. A heat map of each HSP family was constructed using TBtools. The heat maps for HSP60, HSP70, and HSP90 are shown in Figure 1(A). Figure 1(A) shows that CSHSP60-10, CSHSP60-61, and CSHSP60-82 have their highest predicted scores for the endoplasmic reticulum (ER).
CSHSP60-21, CSHSP60-25, CSHSP60-52, CSHSP60-101, and CSHSP60-108 score highly for the nucleus. CSHSP60-19, CSHSP60-69, CSHSP60-85, and CSHSP60-97 score highly for the chloroplast. Of all the sequences, CSHSP60-101 has the maximum value for the cytoskeleton, CSHSP60-80 for plastids, and CSHSP60-83 for the peroxisome. CSHSP60-17, CSHSP60-27, CSHSP60-42, and CSHSP60-107 show mitochondrial abundance. The lowest scores are observed for the vacuole and the extracellular space. All sequences except CSHSP60-10, CSHSP60-27, CSHSP60-42, CSHSP60-61, CSHSP60-69, CSHSP60-82, CSHSP60-85, CSHSP60-86, and CSHSP60-107 show cytoplasmic localization.

CSHSP70-07 and CSHSP70-30 score highest for the ER, and CSHSP70-14 and CSHSP70-16 for the nucleus. CSHSP70-12, CSHSP70-21, CSHSP70-24, CSHSP70-32, CSHSP70-33, CSHSP70-34, CSHSP70-36, CSHSP70-38, and CSHSP70-39 show maximum cytoplasmic occurrence. CSHSP70-13 and CSHSP70-41 score highly for the chloroplast, CSHSP70-17 and CSHSP70-42 for mitochondria, and CSHSP70-13 and CSHSP70-61 for the peroxisome.

Similarly, CSHSP90-21, CSHSP90-25 and CSHSP90-63 have maximum occurrence in the nucleus. CSHSP90-47, CSHSP90-48, CSHSP90-50, and CSHSP90-52 score highest for the cytoplasm. CSHSP90-19, CSHSP90-34, and CSHSP90-49 score highly for the chloroplast. The maximum ER scores are shown by CSHSP90-10, CSHSP90-42, and CSHSP90-44. CSHSP90-27 is abundant in mitochondria, and CSHSP90-12 and CSHSP90-45 score highly for the peroxisome.

Motif Analysis:
To better understand the heat shock protein sequences, conserved motifs were predicted with the MEME Suite. A maximum of eight HSP motifs were observed in Citrus × sinensis. MEME motifs 1 and 5 consist of 29 amino acids, motifs 2, 3 and 4 consist of 50 amino acids, while motifs 7
and 8 consist of 41 amino acids. Only MEME motif 6 consists of 42 amino acids (Table 2). All the motifs were annotated as HSP70 in the Pfam search. Motif analysis of the HSP sequences is shown in the figure. CSHSP60-19, CSHSP60-27, CSHSP60-30, CSHSP60-31, CSHSP60-56, CSHSP60-97, CSHSP60-98, CSHSP60-103, CSHSP60-106, CSHSP70-13, CSHSP70-17, CSHSP70-20, CSHSP70-21, CSHSP70-36, CSHSP70-39, CSHSP70-41, CSHSP90-19, CSHSP90-27, CSHSP90-30, CSHSP90-31, and CSHSP90-50 show the maximum number of motifs, while CSHSP60-93 and CSHSP70-32 show the fewest.

Phylogeny:
To study evolutionary history, the Maximum Likelihood method based on the JTT matrix-based model was used, with 1000 bootstrap replicates (Figure 1B). Evolutionary analysis reflects gene changes in the organism and mutations arising in response to environmental stress. Overall, the figure shows that HSP60 is more closely related to HSP70 and more distantly related to HSP90.

Protein structure prediction:
The SWISS-MODEL template library was searched for structures evolutionarily related to the target sequences. The amino acid sequences of the respective HSPs were supplied as input and a template search was conducted. Sequences with less than 50% similarity were eliminated, and those with the highest percentage similarity were selected from the respective HSP families. The three-dimensional structures of CSHSP60-93, CSHSP70-21 and CSHSP90-30 are shown in Figures
1(C), S2 and S3, respectively. Global Model Quality Estimation (GMQE) and QMEAN scoring functions were also evaluated for each model. These putative protein structures showed 80.55% to 82% similarity with their templates. All three are in a monomeric oligomeric state.

DISCUSSION:
HSPs are among the most conserved proteins, are present in all species, and perform a wide variety of functions. HSPs improve tolerance to cell death caused by a variety of factors. Generally, HSP90, HSP70 and HSP60 act to control and suppress apoptosis (Kennedy et al., 2014). HSPs are involved in nascent chain folding, protein disaggregation, degradation of damaged proteins, protein translocation through membranes, and cytoskeleton preservation and restructuring (Weibezahn et al., 2005). The HSP90 family is a unique and highly essential protein family present in all species. The main function of this family is to control protein folding (Wang et al., 2004). HSP70 has an indispensable role under non-stress conditions. Enhanced thermal tolerance and resistance to environmental stresses are stimulated by the overexpression of this gene. HSP60 is involved in the folding and assembly of proteins. HSP70 and HSP60 have similar characteristics (Arsène et al., 2000), and the two proteins work together when folding and assembling substrate proteins. HSPs sometimes collaborate with co-chaperones for proper and efficient functioning (Wu and Tanguay, 2006). For example, the molecular chaperones HSP70 and HSP90 assemble into a multi-chaperone complex in mammalian cells, linked by the Hop co-chaperone, that is engaged in the folding and maturation of key regulatory proteins (Wegele et al., 2004).

In this study, after removing redundant sequences, we inspected a total of 44 HSP60, 22 HSP70, and 23 HSP90 sequences of Citrus × sinensis. The behavior, characterization, and physicochemical properties of proteins are crucial for examining their stability and function in living systems. According to numerous studies, HSP70 is cytoprotective under conditions of thermal stress (Belhadj Slimen et al.,
2016). The physical and chemical properties of HSP70 proteins in sorghum were analyzed in terms of gene size, protein length and molecular weight (kDa) (Amare and Kebede, 2023). AI is an important indicator of the thermostability of globular proteins: an AI value greater than 65 indicates thermostability, and a higher AI means higher thermostability. Our findings revealed that all members of the respective HSP families are highly thermostable. The isoelectric point (pI) describes the acidic or basic nature of an HSP. Most HSPs were acidic, and the same result was seen in sorghum (Amare and Kebede, 2023). Most HSPs were hydrophilic; the solubility of a protein depends on its GRAVY value. Similar results have been reported for protein solubility and localization (Tripathy et al., 2021; Amare and Kebede, 2023).

Molecular chaperones are recognized as stabilizers of polypeptides during their transport between subcellular organelles, such as mitochondria, and the cytoplasm. The subcellular localization of the various forms of HSPs therefore reflects their major role in protein housekeeping functions (Young et al., 2004; Craig, 2018). Because experimental data are insufficient, a computational tool was used to predict the localization of HSPs in cell organelles. Cytosolic, mitochondrial and chloroplast HSPs each have distinct functions (Waters and Rioflorido, 2007). The current study showed that the HSPs are localized mostly in the endoplasmic reticulum. It has been reported that HSP70 genes show different expression in different tissues (Tripathy et al., 2021).

Protein sequence analysis is useful for describing the evolutionary trends of critical biological functions. The evolution of sHSPs is quite different from that of other HSPs. All eukaryotic organisms have numerous HSP70s. Various studies have provided comprehensive knowledge of the structure, properties, dynamics and well-defined mechanisms of plant HSPs. Structural and functional features of
HSPs were illustrated by homology modeling. Conserved motifs of heat shock proteins were predicted. The phylogenetic relationship was studied by constructing a phylogenetic tree of the collected HSP60, HSP70 and HSP90 sequences using the Maximum Likelihood method. HSP60 more closely resembles HSP70, as also shown in our phylogenetic analysis. Gene families undergo continuous gene duplication (birth) and deletion (death), and the evolutionary history of HSPs in plants shows that strong selection is the main cause of the diverse number and form of HSPs (Waters and Rioflorido, 2007). 3D protein structures were also predicted for each HSP family with >80% sequence similarity. Templates were selected with sequence similarity ≥20% (Dokholyan, 2012). Ramachandran plots and QMEAN were used to analyze the quality of the predicted models (Ashwinder et al., 2016). Phylogeny, subcellular localization, physicochemical properties and structural organization have been examined by different analyses (Amare and Kebede, 2023). Similar parameters were used here to analyze the structure and function of HSPs in the orange plant.

CONCLUSION:
HSPs are among the most important cell regulatory proteins and are present in all cells and every form of life. These protein families are induced in the cell under different stress conditions and are involved in protein synthesis and other metabolic activities. To understand the importance of HSPs in citrus, a genome-wide analysis was performed. This study revealed basic information regarding the nature and predicted localization of these proteins. The identified protein sequences are hydrophilic, acidic and thermostable. Moreover, these proteins are localized mainly at the endoplasmic reticulum in Citrus × sinensis. Phylogenetic analysis indicated the evolutionary relationship among these protein families and showed that they are closely related to each other. SWISS-MODEL structural predictions indicated >80% similarity to the templates, and the putative protein structures adopt a monomeric state.

CONFLICT OF INTEREST:
The authors declare no conflict of interest.

Figure 1: (A) Heat map of localization of HSP60, HSP70 and HSP90. (B) Evolutionary relationships between HSP60, HSP70 and HSP90 families. (C) (a) Template structure (b) 3D structure of CSHSP60-93 (c) Quality comparison (d) QMEAN Z-scores (e) Local quality estimate.

Table 1: Physicochemical properties of HSP70 family members. [R (reverse), F (forward), GS (genomic sequence), AA (number of amino acids), MW (molecular weight in kilodaltons), pI (isoelectric point), AI (aliphatic index), II (instability index), GRAVY (grand average of hydropathicity)]

Table 2: Eight differentially conserved motifs of heat shock protein sequences observed in Citrus × sinensis.