BINF 731 Homework assignment 2:
Iqbal Addou


Section 1: Finding a human enzyme with no experimentally verified 3D structure in PDB

Uniprot search
Using UniProt (https://www.uniprot.org/), the following query was used to find a protein that matches the requirements of the homework. Most notably, to find proteins that are not in PDB, the NOT boolean operator is used in the advanced search field, as shown in the following screenshots:





For reproducibility the query for the search is:
https://www.uniprot.org/uniprot/?query=enzyme+organism%3A%22Human+%5B9606%5D%22+length%3A%5B100+TO+250%5D+NOT+database%3A%28type%3Apdb%29&sort=score
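The search URL above can be unpacked to make each filter explicit. The following is a minimal sketch using only the Python standard library; it decodes the URL exactly as reported and sends no request to UniProt. The variable names are illustrative.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Search URL exactly as reported above (split over two lines for readability).
SEARCH_URL = (
    "https://www.uniprot.org/uniprot/?query=enzyme+organism%3A%22Human+%5B9606%5D%22"
    "+length%3A%5B100+TO+250%5D+NOT+database%3A%28type%3Apdb%29&sort=score"
)

# Decode the percent-encoded query string into readable text.
params = parse_qs(urlsplit(SEARCH_URL).query)
query_text = params["query"][0]
print(query_text)

# Re-encode the decoded text to confirm the round trip reproduces the URL.
rebuilt = urlencode({"query": query_text, "sort": params["sort"][0]})
print(rebuilt == urlsplit(SEARCH_URL).query)
```

Decoded, the query reads as four filters: the keyword enzyme, the organism Human [9606], a length between 100 and 250 residues, and NOT database:(type:pdb), which excludes entries cross-referenced to PDB.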

The query yields 4,497 results, as seen in the following screenshot.

Section 2: Selecting results that have a homologue in PDB

PDB Sequence Search

Using the results previously obtained from UniProt, proteins were chosen at random and their FASTA sequences were used to query the PDB search by sequence. The procedure described below was repeated many times to find a protein that has a homologue in PDB with 30% to 70% identity. After several trials, our candidate protein is UniProtKB - Q96GF1 (RN185_HUMAN), an E3 ubiquitin-protein ligase that regulates selective mitochondrial autophagy and acts in the endoplasmic reticulum (ER)-associated degradation (ERAD) pathway, which targets misfolded proteins that accumulate in the ER for ubiquitination and subsequent proteasome-mediated degradation. It also has many other roles within human cells.



Here are the steps that were used to find the candidate. They were repeated for many randomly selected candidates, but for reproducibility we show them only for Q96GF1. First we obtain the FASTA sequence by going to the Format/FASTA menu item in the Q96GF1 results, as seen in the following screenshot.





Q96GF1 FASTA:

Second, we use the obtained FASTA sequence to perform a PDB search by sequence at http://www.rcsb.org/pdb/search/searchSequence.do

The search yields 96 results, most of them with identities between 37% and 46%. The next screenshot shows the ones with the highest identities at the top of the list.
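The 30% to 70% identity requirement can be checked programmatically once an aligned pair of sequences is available. The sketch below is one way to do it, under a stated assumption: identity is counted over alignment columns where neither sequence has a gap, which may differ from the denominator the PDB search uses. The function names and the toy alignment are hypothetical, not taken from the search results.

```python
def percent_identity(aligned_query, aligned_hit):
    """Percent identity over the columns where neither sequence has a gap."""
    if len(aligned_query) != len(aligned_hit):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (q, h)
        for q, h in zip(aligned_query, aligned_hit)
        if q != "-" and h != "-"
    ]
    if not columns:
        raise ValueError("alignment has no ungapped columns")
    matches = sum(1 for q, h in columns if q.upper() == h.upper())
    return 100.0 * matches / len(columns)


def in_identity_window(identity, low=30.0, high=70.0):
    """True when the identity lies inside the window required by the homework."""
    return low <= identity <= high


# Toy alignment (illustrative only): 4 identical residues in 5 ungapped columns.
toy_identity = percent_identity("ACDE-F", "ACQEGF")
print(toy_identity, in_identity_window(toy_identity))
```

With this definition, hits in the 37% to 46% range reported above fall inside the window, while the toy pair does not.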



Section 3: Constructing a model using Swiss-Model

We use the previously obtained Q96GF1 FASTA to query Swiss Model server, located at: https://swissmodel.expasy.org/

We start modeling by clicking on the Start Modeling button

We paste the Q96GF1 FASTA sequence in the text area

I chose the default option, Build Model, instead of Search for Templates, thus allowing Swiss Model to choose the best candidates automatically.

Upon submitting the form, the Swiss Model server performs a homology search using BLAST and an HMM-based algorithm.

At the end, Swiss Model produced a single model, as shown below. The templates tab lists the templates that were used for the modeling job. The final PDB model can be downloaded either as part of the whole report or from the options menu of the model's 3D view.

Section 4: Constructing model using Phyre 2

Phyre 2 is located at http://www.sbg.bio.ic.ac.uk/~phyre2/html/page.cgi?id=index

The procedure is similar to that of Swiss Model. However, Phyre 2 is a more intensive program, so you have to enter an email address to receive the modeling job results. We entered the Q96GF1 FASTA and chose the intensive mode.

Phyre 2 starts the job and sends a tracking ID.

Phyre 2 also uses homology, HMM and ab initio methods.

When the Phyre 2 modeling is concluded, an email is sent; clicking on the provided link opens the results page, which looks similar to that of Swiss Model. The best candidate PDB model is highlighted and the templates are shown. As with Swiss Model, it is possible to download the full report or the single PDB model.

Section 5: 3D View of Swiss Model and Phyre 2 PDBs

Each tab shows the 3D view of one model. The viewer is NGL, which relies on browser support for JavaScript and the WebGL API.

NGL Viewer 3D structure for 5TF8

NGL Viewer 3D structure for 5IDV

Section 6: Model quality using the ProsaWeb, CheckMyMetal, and Verify 3D quality validation servers

To assess overall PDB file quality we use three different validation servers:

  1. Prosa Web https://prosa.services.came.sbg.ac.at/prosa.php
  2. CheckMyMetal (CMM): Metal Binding Site Validation Server https://csgid.org/metal_sites
  3. Verify 3D http://servicesn.mbi.ucla.edu/Verify3D/
These are used to check the quality of both the Swiss Model and Phyre 2 PDB models.

Prosa Web produces three plots. The first shows whether the Z-score of the PDB model falls within the range of Z-scores of experimentally determined proteins of similar sequence length. The second assesses local quality by plotting energies as a function of amino acid sequence position; positive values usually indicate a problem with the quality of that part of the structure. The third is a 3D representation of the protein coloured by residue energy, from low (blue) to high (red).
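The reading of the second plot (energy against sequence position, with positive values flagging problems) can be sketched in a few lines. This is a minimal illustration, assuming per-residue energies are already available as a list and that they are smoothed with a centred sliding window before inspection; the window size, the function names, and the toy energies are assumptions, not Prosa Web output.

```python
def window_average(energies, window):
    """Smooth per-residue energies with a centred sliding window."""
    half = window // 2
    smoothed = []
    for i in range(len(energies)):
        lo = max(0, i - half)
        hi = min(len(energies), i + half + 1)
        chunk = energies[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def positive_stretches(values):
    """Return 1-based inclusive (start, end) ranges where values are above zero."""
    stretches = []
    start = None
    for position, value in enumerate(values, start=1):
        if value > 0 and start is None:
            start = position
        elif value <= 0 and start is not None:
            stretches.append((start, position - 1))
            start = None
    if start is not None:
        stretches.append((start, len(values)))
    return stretches


# Toy energies (illustrative only): a problematic stretch in the middle.
toy_energies = [-1, -1, 2, 2, 2, -1, -1]
flagged = positive_stretches(window_average(toy_energies, window=3))
print(flagged)
```

A wider window hides short local spikes and highlights longer problematic segments, which is why the choice of window size matters when reading such a plot.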

CheckMyMetal determines whether the PDB model has a functional metal binding site by going through the 3D coordinates of the residues.

Verify 3D is another general quality verification tool. It determines the compatibility of an atomic model (3D) with its own amino acid sequence (1D) and compares the results with those of good structures.
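A pass/fail reading of a Verify 3D profile can be expressed as a simple rule over per-residue scores. The sketch below assumes the rule is "at least 80% of residues score at or above 0.2"; both numbers are assumptions for illustration and should be checked against the values printed by the server for a given run. The function name and the toy scores are hypothetical.

```python
def verify3d_pass(scores, threshold=0.2, required_fraction=0.8):
    """Return (fraction of residues at or above threshold, pass flag)."""
    if not scores:
        raise ValueError("no per-residue scores given")
    n_ok = sum(1 for score in scores if score >= threshold)
    fraction = n_ok / len(scores)
    return fraction, fraction >= required_fraction


# Toy profiles (illustrative only): 8 of 10 and 7 of 10 residues above threshold.
passing = verify3d_pass([0.3] * 8 + [0.1] * 2)
failing = verify3d_pass([0.3] * 7 + [0.1] * 3)
print(passing, failing)
```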



ProsaWeb results

ProsaWeb quality assessment for Swiss Model PDB

       

ProsaWeb quality assessment for Phyre 2 Model PDB

       



CheckMyMetal (CMM): Metal Binding Site Validation Server results

CheckMyMetal (CMM): Metal Binding Site Validation Server quality assessment for Swiss Model PDB

ID    Res.  Metal  Occupancy  B factor (env.)(1)  Ligands  Valence(2)  nVECSUM(3)  Geometry(1,4)  gRMSD(°)(1)  Vacancy(1)  Bidentate  Alt. metal
_:1   ZN    Zn     1          N/A                 N1S3     2.5         0.14        Tetrahedral    6.4°         0           0          Fe, Cu, Mn
_:2   ZN    Zn     1          N/A                 S4       2.6         0.34        Tetrahedral    8.2°         0           0          Fe, Cu, Mn
Legend: Not applicable, Outlier, Borderline, Acceptable

Column descriptions:
Occupancy: Occupancy of ion under consideration
B factor (env.)(1): Metal ion B factor, with valence-weighted environmental average B factor in parenthesis
Ligands: Elemental composition of the coordination sphere
Valence(2): Summation of bond valence values for an ion binding site. Valence accounts for metal-ligand distances
nVECSUM(3): Summation of ligand vectors, weighted by bond valence values and normalized by overall valence. Increases when the coordination sphere is not symmetrical due to incompleteness.
Geometry(1,4): Arrangement of ligands around the ion, as defined by the NEIGHBORHOOD algorithm
gRMSD(°)(1): R.M.S. deviation of observed geometry angles (L-M-L angles) compared to ideal geometry, in degrees
Vacancy(1): Percentage of unoccupied sites in the coordination sphere for the given geometry
Bidentate: Number of residues that form a bidentate interaction instead of being considered as multiple ligands
Alt. metal: A list of alternative metal(s) is proposed in descending order of confidence, assuming the metal environment is accurately determined. This feature is still experimental. It requires user discrimination and cannot be blindly accepted
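The Valence and nVECSUM columns described above can be illustrated with the bond valence model, in which each metal-ligand bond contributes s = exp((R0 - R) / b). The sketch below is a simplified illustration, not CMM's implementation: it uses a single R0 for all ligands (real sites mix N and S donors with different R0 values), and the R0 of 2.09 Å and b of 0.37 Å are assumed example parameters. The function names and the ideal tetrahedral site are hypothetical.

```python
import math


def bond_valence(distance, r0, b=0.37):
    """Valence of one metal-ligand bond: s = exp((r0 - distance) / b)."""
    return math.exp((r0 - distance) / b)


def valence_and_nvecsum(metal, ligands, r0, b=0.37):
    """Return (summed bond valence, normalised length of the valence-weighted vector sum)."""
    total = 0.0
    vector = [0.0, 0.0, 0.0]
    for ligand in ligands:
        delta = [l - m for l, m in zip(ligand, metal)]
        distance = math.sqrt(sum(c * c for c in delta))
        s = bond_valence(distance, r0, b)
        total += s
        for axis in range(3):
            # Unit vector towards the ligand, weighted by the bond valence.
            vector[axis] += s * delta[axis] / distance
    return total, math.sqrt(sum(c * c for c in vector)) / total


R0 = 2.09  # assumed example parameter in angstroms, not taken from the report
# Distance chosen so that each bond contributes a valence of 0.5.
distance = R0 + 0.37 * math.log(2)
scale = distance / math.sqrt(3)
corners = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
ligands = [tuple(scale * c for c in corner) for corner in corners]

full_site = valence_and_nvecsum((0.0, 0.0, 0.0), ligands, R0)
incomplete_site = valence_and_nvecsum((0.0, 0.0, 0.0), ligands[:3], R0)
print(full_site, incomplete_site)
```

In this idealised site the four ligands give a valence close to 2 and an nVECSUM close to 0; dropping one ligand lowers the valence and raises nVECSUM, which matches the column description of nVECSUM increasing for incomplete coordination spheres.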

Metal-ligand distance distributions for model_from_sm.pdb in comparison with CSD


(1) Zheng H, Chordia MD, Cooper DR, Chruszcz M, Müller P, Sheldrick GM, Minor W (2014) Nature Protocols, 9(1), 156-70.
(2) Brown ID (2009) Chem. Rev., 109, 6858-6919.
(3) Müller P, Köpke S, Sheldrick GM (2003) Acta Crystallogr. D Biol. Crystallogr., 59, 32-37.
(4) Kuppuraj G, Dudev M, Lim C (2009) J. Phys. Chem. B, 113, 2952-2960.
(5) CSD: Cambridge Structural Database

CheckMyMetal (CMM): Metal Binding Site Validation Server quality assessment for Phyre 2 Model PDB


No metal present in the model requested, or metal is far away from the modelled macromolecule chain.





Verify 3D

Verify 3D quality assessment for Swiss Model PDB

 

 

Verify 3D quality assessment for Phyre 2 Model PDB

 

 

Section 7: Quality validation results

According to the Prosa Web results, both the Swiss Model and Phyre 2 models are within the range expected for their sequence length. With Verify 3D, the Swiss Model PDB file passed but the Phyre 2 one did not. Similarly, CheckMyMetal (CMM) detected a zinc finger in the Swiss Model PDB but failed to find any metal binding sites in the Phyre 2 PDB.

Based on these results, we conclude that the Swiss Model model is superior to the Phyre 2 model, at least in this case. Therefore, for the rest of the assignment, we will use the Swiss Model model only.

Section 8: Identifying possible protein function using Motif search

A protein motif search was used to find the location of the active sites in the candidate protein. The results agree with the CheckMyMetal results and indicate a zinc finger motif. The website used is: https://www.genome.jp/tools/motif/


Results:

Motif in the sequence

Prosite ID:
ZF_RING_1
Description:
PS00518, Zinc finger RING-type signature.
Pattern:
C-x-H-x-[LIVMFY]-C-x(2)-C-[LIVMYA].
Appearance:
Position 54..63
Found Motif: CGHLFCWPCL
Sequence:
MASKGPSASASPENSSAGGPSGSSNGAGESGGQDSTFECNICLDTAKDAVISLCGHLFCW
PCLHQWLETRPNRQVCPVCKAGISRDKVIPLYGRGSTGQQDPREKTPPRPQGQRPEPENR
GGFQGFGFGDGGFQMSFGIGAFPFGIFATAFNINDGRPPPAVPGTPQYVDEQFLSRLFLF
VALVIMFWLLIA
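The PS00518 hit can be reproduced locally by translating the PROSITE pattern into a regular expression and searching the sequence. The following is a minimal sketch: the converter handles only the pattern elements used here (plain residues, x, bracketed sets, braces for exclusions, and repeat counts) and ignores the < and > anchors. The function name is hypothetical.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()
        if core == "x":
            core = "."  # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # excluded residues
        if low and high:
            core += "{%s,%s}" % (low, high)
        elif low:
            core += "{%s}" % low
        parts.append(core)
    return "".join(parts)


# Q96GF1 sequence as listed in the motif search result above.
SEQUENCE = (
    "MASKGPSASASPENSSAGGPSGSSNGAGESGGQDSTFECNICLDTAKDAVISLCGHLFCW"
    "PCLHQWLETRPNRQVCPVCKAGISRDKVIPLYGRGSTGQQDPREKTPPRPQGQRPEPENR"
    "GGFQGFGFGDGGFQMSFGIGAFPFGIFATAFNINDGRPPPAVPGTPQYVDEQFLSRLFLF"
    "VALVIMFWLLIA"
)

regex = prosite_to_regex("C-x-H-x-[LIVMFY]-C-x(2)-C-[LIVMYA].")
hit = re.search(regex, SEQUENCE)
# Report 1-based inclusive positions, as the motif search does.
result = (hit.start() + 1, hit.end(), hit.group())
print(regex, result)
```

The match found this way is at positions 54 to 63 with the residues CGHLFCWPCL, in agreement with the reported appearance of the motif.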

Motif in the sequence

Prosite ID:
ZF_RING_2
Description:
PS50089, Zinc finger RING-type profile.
Consensus:
CPICLEEFKDPXVVLLPCGHTFCRECIRKWXXXNSNXTCPICR
Appearance:
Position 39..80
Alignment:
  Query     CNICLDTAKDA--VISLCGHLFCWPCLHQWlETRPNRQVCPVCK
  Database  CPICLEEFKDPXVVLLPCGHTFCRECIRKW-XXXNSNXTCPICR
Score: 974
Sequence:
MASKGPSASASPENSSAGGPSGSSNGAGESGGQDSTFECNICLDTAKDAVISLCGHLFCW
PCLHQWLETRPNRQVCPVCKAGISRDKVIPLYGRGSTGQQDPREKTPPRPQGQRPEPENR
GGFQGFGFGDGGFQMSFGIGAFPFGIFATAFNINDGRPPPAVPGTPQYVDEQFLSRLFLF
VALVIMFWLLIA

Section 9: Mutant Selection

Since we determined that the Swiss Model model has a zinc finger signature, we will attempt to replace some residues outside the active site with cysteine. These residues are placed symmetrically with respect to the active site. We will then determine the 3D structure of the mutant and assess the quality of the mutant model. The expectation is that these cysteines will form disulfide bonds and increase the stability of the model.

The sequence for the mutant protein is:
CCCKGPSACCCPENSSASGPSGSSNGAGESGGQDCCCCCNICLDTAKDAVISLCGHLFCWPCLHQWLETRPNRQVCPVCKAGICCCKVIPSYGRGSTCCCDP
REKTCCCPQGQRPEPENRGGFQGFGFGDGGSSMSFGIGAFPFGIFATAFNINDGRPPPAVPSSPQYVDEQFLSRLFLFVALVIMFWLLIA
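To document exactly which residues were changed, the mutant sequence can be compared position by position with the wild type sequence listed in the motif search results. The sketch below does this with plain Python, assuming the two sequences have the same length (substitutions only, no insertions or deletions). The function name and the mutation notation (wild type residue, position, mutant residue) are illustrative choices.

```python
# Wild type Q96GF1 sequence, as listed in the motif search results.
WILD_TYPE = (
    "MASKGPSASASPENSSAGGPSGSSNGAGESGGQDSTFECNICLDTAKDAVISLCGHLFCW"
    "PCLHQWLETRPNRQVCPVCKAGISRDKVIPLYGRGSTGQQDPREKTPPRPQGQRPEPENR"
    "GGFQGFGFGDGGFQMSFGIGAFPFGIFATAFNINDGRPPPAVPGTPQYVDEQFLSRLFLF"
    "VALVIMFWLLIA"
)

# Mutant sequence, as given above (the two printed lines joined).
MUTANT = (
    "CCCKGPSACCCPENSSASGPSGSSNGAGESGGQDCCCCCNICLDTAKDAVISLCGHLFCW"
    "PCLHQWLETRPNRQVCPVCKAGICCCKVIPSYGRGSTCCCDP"
    "REKTCCCPQGQRPEPENRGGFQGFGFGDGGSSMSFGIGAFPFGIFATAFNINDGRPPPAV"
    "PSSPQYVDEQFLSRLFLFVALVIMFWLLIA"
)


def substitutions(wild_type, mutant):
    """List point substitutions as strings such as 'S35C' (1-based positions)."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences differ in length; an alignment is needed")
    return [
        f"{w}{position}{m}"
        for position, (w, m) in enumerate(zip(wild_type, mutant), start=1)
        if w != m
    ]


changes = substitutions(WILD_TYPE, MUTANT)
to_cysteine = [change for change in changes if change.endswith("C")]
print(len(changes), "substitutions,", len(to_cysteine), "to cysteine")
print(" ".join(changes))

# Residues 39..80 (the RING-type profile hit) should be untouched.
print(WILD_TYPE[38:80] == MUTANT[38:80])
```

Applied to the two sequences as printed in this report, the comparison confirms that the RING-type region at positions 39..80 is unchanged, and it also lists a few substitutions to serine alongside the cysteine ones.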

Section 10: Swiss Model mutated report

We use Swiss Model again to model the mutant. With the default settings, however, the modeling job predicts the quaternary structure of an oligomeric protein. So this time I use a specific template, so that the mutant can be compared against a single protein.

Section 10: Comparison of 3D structure between wild type and mutant type

Each tab shows the 3D view of one model. The viewer is NGL, which relies on browser support for JavaScript and the WebGL API. Next, we use SuperPose (http://wishart.biology.ualberta.ca/SuperPose/) to show a side-by-side comparison.

Comparison between wild type and mutant, side-by-side image: the mutant (yellow) is more coiled inward and more spherical, probably because of the formation of disulfide bonds.
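The visual impression that the mutant is more compact can be backed by a number such as the radius of gyration of the C-alpha atoms. The following is a minimal sketch that reads ATOM records by their fixed PDB columns and computes an unweighted radius of gyration; it is an additional check suggested here, not something SuperPose reports, and the function names are hypothetical.

```python
import math


def ca_coordinates(pdb_lines):
    """Extract (x, y, z) of C-alpha atoms from the ATOM records of a PDB file."""
    coords = []
    for line in pdb_lines:
        # Atom name is in columns 13-16; x, y, z are in columns 31-54.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            coords.append(
                (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            )
    return coords


def radius_of_gyration(coords):
    """Root mean square distance of the points from their centroid."""
    if not coords:
        raise ValueError("no coordinates given")
    n = len(coords)
    center = [sum(c[axis] for c in coords) / n for axis in range(3)]
    mean_square = sum(
        sum((c[axis] - center[axis]) ** 2 for axis in range(3)) for c in coords
    ) / n
    return math.sqrt(mean_square)


# Usage sketch (file names as they appear in the CheckMyMetal output):
# with open("model_from_sm.pdb") as handle:
#     print(radius_of_gyration(ca_coordinates(handle)))
```

Running it on the wild type and mutant PDB files and comparing the two values would show whether the mutant model is in fact more compact; a smaller radius of gyration for a chain of the same length indicates a more compact fold.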

NGL Viewer 3D structure for wild type

NGL Viewer 3D structure for mutant

Section 11: Model quality (wild type vs mutant) using the ProsaWeb, CheckMyMetal, and Verify 3D quality validation servers

The same quality validation websites are used as in the Swiss Model vs Phyre 2 comparison:

  1. Prosa Web https://prosa.services.came.sbg.ac.at/prosa.php
  2. CheckMyMetal (CMM): Metal Binding Site Validation Server https://csgid.org/metal_sites
  3. Verify 3D http://servicesn.mbi.ucla.edu/Verify3D/
These are used to check the quality of both the wild type and mutant Swiss Model PDB files.



ProsaWeb results

ProsaWeb quality assessment for Swiss Model PDB (wild type)

       

ProsaWeb quality assessment for Swiss Model PDB (mutant)

       



CheckMyMetal (CMM): Metal Binding Site Validation Server results

CheckMyMetal (CMM): Metal Binding Site Validation Server quality assessment for Swiss Model PDB (wild type)

ID    Res.  Metal  Occupancy  B factor (env.)(1)  Ligands  Valence(2)  nVECSUM(3)  Geometry(1,4)  gRMSD(°)(1)  Vacancy(1)  Bidentate  Alt. metal
_:1   ZN    Zn     1          N/A                 N1S3     2.5         0.14        Tetrahedral    6.4°         0           0          Fe, Cu, Mn
_:2   ZN    Zn     1          N/A                 S4       2.6         0.34        Tetrahedral    8.2°         0           0          Fe, Cu, Mn
Legend: Not applicable, Outlier, Borderline, Acceptable


Metal-ligand distance distributions for model_from_sm.pdb in comparison with CSD



CheckMyMetal (CMM): Metal Binding Site Validation Server quality assessment for Swiss Model PDB (mutant)

ID    Res.  Metal  Occupancy  B factor (env.)(1)  Ligands  Valence(2)  nVECSUM(3)  Geometry(1,4)  gRMSD(°)(1)  Vacancy(1)  Bidentate  Alt. metal
_:3   ZN    Zn     1          N/A                 S4       2.8         0.29        Tetrahedral    -            0           0          Cu, Fe
_:4   ZN    Zn     1          N/A                 N1S3     2           0.15        Tetrahedral    13.7°        0           0          -
Legend: Not applicable, Outlier, Borderline, Acceptable


Metal-ligand distance distributions for model_from_sm_mutant.pdb in comparison with CSD





Verify 3D

Verify 3D quality assessment for Swiss Model PDB (wild type)

 

 

Verify 3D quality assessment for Swiss Model PDB (mutant)

 

 

Section 12: Conclusion

The wild type and the mutant have similar quality measures according to Prosa Web and CMM. However, the wild type is still the superior model when tested with Verify 3D.

Based on these results, we conclude that the original model is still better and that the introduced cysteine substitutions did not necessarily improve the model quality.

Section 13: References

  1. Wiederstein, M., & Sippl, M. J. (2007). ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins. Nucleic acids research, 35(Web Server issue), W407-10.
  2. Eisenberg, D., Lüthy, R., & Bowie, J. U. (1997). VERIFY3D: assessment of protein models with three-dimensional profiles. Methods in Enzymology, 277, 396-404.
  3. Maiti, R., Van Domselaar, G. H., Zhang, H., & Wishart, D. S. (2004). SuperPose: a simple server for sophisticated structural superposition. Nucleic Acids Research, 32(Web Server issue), W590-W594.
  4. CheckMyMetal: a macromolecular metal-binding validation tool. Zheng,H., Cooper,D.R., Porebski,P.J., Shabalin,I.G., Handing,K.B., Minor,W. (2017) Acta crystallographica. Section D, Structural biology, 73, 223-233.
  5. Validation of metal-binding sites in macromolecular structures with the CheckMyMetal web server. Zheng,H., Chordia,M.D., Cooper,D.R., Chruszcz,M., Müller,P., Sheldrick,G.M., Minor,W. (2014) Nature Protocols, 9(1), 156-70.
  6. The Phyre2 web portal for protein modeling, prediction and analysis Kelley LA et al. Nature Protocols 10, 845-858 (2015)
  7. Waterhouse, A., Bertoni, M., Bienert, S., Studer, G., Tauriello, G., Gumienny, R., Heer, F.T., de Beer, T.A.P., Rempfer, C., Bordoli, L., Lepore, R., Schwede, T. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res. 46(W1), W296-W303 (2018).