Example for PredictProtein output
Sequence: prio_human (Swissprot)
MAJOR PRION PROTEIN PRECURSOR (PRP)
OUTPUT
The following information has been received by the server:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________
rost
# prion_human
MANLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGGGWGQP
HGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQGGGTHSQWNKPSKPKTNMKHMAGAAAAGA
VVGGLGGYMLGSAMSRPIIHFGSDYEDRYYRENMHRYPNQVYYRPMDEYSNQNNFVHDCV
NITIKQHTVTTTTKGENFTETDVKMMERVVEQMCITQYERESQAYYQRGSSMVLFSSPPV
ILLISFLIFLIVG
________________________________________________________________________________
The sequence has been interpreted as being:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________
>P1; t1
(#) prion_human
MANLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGGGWGQP
HGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQGGGTHSQWNKPSKPKTNMKHMAGAAAAGA
VVGGLGGYMLGSAMSRPIIHFGSDYEDRYYRENMHRYPNQVYYRPMDEYSNQNNFVHDCV
NITIKQHTVTTTTKGENFTETDVKMMERVVEQMCITQYERESQAYYQRGSSMVLFSSPPV
ILLISFLIFLIVG
________________________________________________________________________________
The alignment that has been used as input to the network is:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________
--- ------------------------------------------------------------
--- MAXHOM multiple sequence alignment
--- ------------------------------------------------------------
---
--- MAXHOM ALIGNMENT HEADER: ABBREVIATIONS FOR SUMMARY
--- ID : identifier of aligned (homologous) protein
--- STRID : PDB identifier (only for known structures)
--- PIDE : percentage of pairwise sequence identity
--- WSIM : percentage of weighted similarity
--- LALI : number of residues aligned
--- NGAP : number of insertions and deletions (indels)
--- LGAP : number of residues in all indels
--- LSEQ2 : length of aligned sequence
--- ACCNUM : SwissProt accession number
--- NAME : one-line description of aligned protein
---
--- MAXHOM ALIGNMENT HEADER: SUMMARY
ID STRID PIDE WSIM LALI NGAP LGAP LSEQ2 ACCNUM NAME
prio_human 100 100 253 0 0 253 P04156 MAJOR PRION PROTEIN PRECU
prio_gorgo 100 100 253 0 0 253 P40252 MAJOR PRION PROTEIN PRECU
prio_pantr 99 99 253 0 0 253 P40253 MAJOR PRION PROTEIN PRECU
prio_ponpy 98 99 253 0 0 253 P40256 MAJOR PRION PROTEIN PRECU
prio_colgu 97 98 253 0 0 253 P40251 MAJOR PRION PROTEIN PRECU
prio_prefr 97 98 253 0 0 253 P40257 MAJOR PRION PROTEIN PRECU
prio_atege 97 96 232 1 9 232 P40246 MAJOR PRION PROTEIN PRECU
prio_macfa 96 98 253 0 0 253 P40254 MAJOR PRION PROTEIN PRECU
prio_saisc 96 96 253 1 7 260 P40258 MAJOR PRION PROTEIN PRECU
prio_calja 96 97 252 1 1 252 P40247 MAJOR PRION PROTEIN PRECU
prio_calmo 96 98 241 0 0 241 P40248 MAJOR PRION PROTEIN PRECU
prio_cebap 96 96 252 1 1 252 P40249 MAJOR PRION PROTEIN PRECU
prio_cerae 96 96 245 1 8 245 P40250 MAJOR PRION PROTEIN PRECU
prio_mansp 96 98 241 0 0 241 P40255 MAJOR PRION PROTEIN PRECU
prio_aottr 96 96 239 1 1 239 P40245 MAJOR PRION PROTEIN PRECU
prio_bovin 92 92 252 2 9 264 P10279 PROTEIN 1).
prio_rat 92 94 225 1 1 226 P13852 MAJOR PRION PROTEIN (PRP)
prio_trast 91 91 252 2 9 264 P40242 PROTEIN 1).
prip_bovin 90 92 252 1 1 256 Q01880 PROTEIN 2).
prio_mouse 90 91 252 2 3 254 P04925 MAJOR PRION PROTEIN PRECU
prio_sheep 90 92 252 1 1 256 P23907 MAJOR PRION PROTEIN PRECU
prio_mesau 90 92 253 1 1 254 P04273 MAJOR PRION PROTEIN PRECU
prio_odohe 90 91 252 1 1 256 P47852 MAJOR PRION PROTEIN PRECU
prp2_trast 89 90 252 1 1 256 P40243 PROTEIN 2).
prio_musvi 88 90 252 2 2 257 P40244 MAJOR PRION PROTEIN PRECU
grp_horvu 48 42 88 3 21 200 P17816 GLYCINE-RICH CELL WALL ST
grw1_lyces 55 53 38 1 7 43 Q01157 GLYCINE-RICH CELL WALL ST
grp2_sinal 40 38 88 2 18 169 P49311 GLYCINE-RICH RNA-BINDING
grp8_arath 39 39 87 2 19 169 Q03251 GLYCINE-RICH RNA-BINDING
nucl_xenla 41 42 69 3 9 650 P20397 NUCLEOLIN (PROTEIN C23).
prio_chick 38 34 249 5 23 273 P27177 RECEPTOR-INDUCING ACTIVIT
gar1_schpo 39 35 71 2 22 194 Q06975 GAR1 PROTEIN.
grpa_medfa 38 34 93 2 11 159 Q09134 ABSCISIC ACID AND ENVIRON
egg1_schja 36 30 135 6 34 212 P19470 EGGSHELL PROTEIN 1 PRECUR
grp1_phavu 36 36 127 2 10 252 P10495 GLYCINE-RICH CELL WALL ST
grp2_nicsy 35 34 113 2 4 214 P27484 GLYCINE-RICH CELL WALL ST
roab_xenla 35 28 122 5 21 351 P17131 PROTEIN) (SINGLE-STRAND B
grp1_orysa 35 34 125 2 24 165 P25074 GLYCINE-RICH CELL WALL ST
k1ci_human 35 35 120 3 9 622 P35527 KERATIN, TYPE I CYTOSKELE
grp2_sorvu 35 30 115 4 21 168 Q99070 GLYCINE-RICH RNA-BINDING
grp1_sinal 35 34 98 2 21 166 P49310 GLYCINE-RICH RNA-BINDING
gr10_brana 34 32 117 2 24 169 Q05966 GLYCINE-RICH RNA-BINDING
ebn1_ebv 34 37 112 1 1 641 P03211 EBNA-1 NUCLEAR PROTEIN.
roa2_human 34 26 124 4 15 353 P22626 B1).
ch15_drogr 36 33 70 2 14 102 P13425 CHORION PROTEIN S15.
asf1_helan 34 28 83 2 24 161 P22357 ANTHER-SPECIFIC PROTEIN S
pcp_yeren 34 22 89 2 61 155 P31484 OUTER MEMBRANE LIPOPROTEI
grp_dauca 34 28 122 3 48 157 Q03878 GLYCINE-RICH RNA-BINDING
ykr3_caeel 35 36 71 0 0 113 P34309 HYPOTHETICAL 11.3 KD PROT
vnua_prvka 33 35 105 1 19 1733 P33485 PROBABLE NUCLEAR ANTIGEN.
chb3_bommo 34 30 77 2 6 91 P08915 CHORION CLASS B PROTEIN M
grpa_maize 33 28 104 2 21 157 P10979 GLYCINE-RICH RNA-BINDING,
grp_arath 33 37 132 0 0 338 P27483 GLYCINE-RICH CELL WALL ST
els_human 33 22 129 4 19 730 P15502 ELASTIN PRECURSOR.
roaa_xenla 33 28 126 3 18 365 P17130 PROTEIN) (SINGLE-STRAND B
grp2_orysa 32 34 154 2 23 183 P29834 GLYCINE-RICH CELL WALL ST
rnha_human 32 28 99 2 32 1279 Q08211 ATP-DEPENDENT RNA HELICAS
grp7_arath 32 37 99 1 4 176 Q03250 GLYCINE-RICH RNA-BINDING
grp2_phavu 32 36 127 1 6 465 P10496 GLYCINE-RICH CELL WALL ST
chb8_bommo 32 28 77 2 6 119 P08914 CHORION CLASS B PROTEIN M
grp1_cheru 32 27 113 3 21 144 P11898 GLYCINE-RICH PROTEIN HC1.
els_bovin 32 23 135 4 19 747 P04985 ELASTINS A/B/C PRECURSOR.
nucl_rat 32 32 85 1 3 712 P13383 NUCLEOLIN (PROTEIN C23).
nucl_human 32 32 85 1 3 706 P19338 NUCLEOLIN (PROTEIN C23).
nucl_mouse 32 32 85 1 3 706 P09405 NUCLEOLIN (PROTEIN C23).
vg38_bpm1 32 26 114 2 17 262 P08234 RECEPTOR RECOGNIZING PROT
grp1_pethy 32 36 130 0 0 384 P09789 GLYCINE-RICH CELL WALL ST
mcba_ecoli 42 40 43 1 2 69 P05834 BACTERIOCIN MICROCIN B17
roa1_drome 31 25 131 3 11 365 P07909 (PEN REPEAT CLONE P9).
grp3_artsa 31 22 90 3 18 308 P13230 GLYCINE-RICH PROTEIN GRP3
chb7_bommo 32 28 75 2 6 126 P08916 CHORION CLASS B PROTEIN M
roa1_scham 31 30 126 2 5 342 P21522 HETEROGENEOUS NUCLEAR RIB
sala_droor 31 25 120 3 18 142 P21748 PROTEIN SPALT-ACCESSORY.
els_chick 31 28 120 2 8 750 P07916 ELASTIN PRECURSOR (FRAGME
spd1_nepcl 31 28 140 1 6 747 P19837 SPIDROIN 1 (DRAGLINE SILK
sala_drosi 31 21 134 5 43 139 P21749 PROTEIN SPALT-ACCESSORY.
chb4_bommo 31 22 108 5 21 147 P05685 CHORION CLASS B PROTEIN B
ews_human 30 24 128 2 27 656 Q01844 RNA-BINDING PROTEIN EWS.
sqd_drome 30 28 115 2 12 345 Q08473 (HNRNP 40).
ydh3_hsvsc 34 35 61 0 0 103 P22577 HYPOTHETICAL 9.5 KD PROTE
sala_drome 30 22 123 4 25 142 P21750 PROTEIN SPALT-ACCESSORY.
egg2_schja 30 26 150 4 50 207 P19469 EGGSHELL PROTEIN 2A PRECU
ssb_ecoli 32 28 68 1 2 177 P02339 SINGLE-STRAND BINDING PRO
prpc_human 30 26 97 1 1 166 P02810 PIF-F, PIF-S, PROTEINS A
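
The summary rows above can be read programmatically using the column legend. The following is a minimal sketch, not part of the server output. It assumes whitespace-separated fields, integer values for the numeric columns, and that STRID may be absent (it is empty in every row shown here). The names `MaxhomHit` and `parse_summary_row` are hypothetical and chosen for illustration only.

```python
from typing import NamedTuple, Optional


class MaxhomHit(NamedTuple):
    """One row of the MAXHOM summary table (field names follow the legend)."""
    ident: str             # ID: identifier of the aligned protein
    strid: Optional[str]   # STRID: PDB identifier, None when the column is empty
    pide: int              # PIDE: percentage of pairwise sequence identity
    wsim: int              # WSIM: percentage of weighted similarity
    lali: int              # LALI: number of residues aligned
    ngap: int              # NGAP: number of indels
    lgap: int              # LGAP: number of residues in all indels
    lseq2: int             # LSEQ2: length of the aligned sequence
    accnum: str            # ACCNUM: SwissProt accession number
    name: str              # NAME: one-line description (may be truncated)


def parse_summary_row(line: str) -> MaxhomHit:
    """Parse one whitespace-separated summary row.

    Assumption: if the second token is not a plain integer, it is taken
    to be the STRID value; otherwise STRID is treated as empty.
    """
    tokens = line.split()
    ident, rest = tokens[0], tokens[1:]
    strid = None
    if not rest[0].isdigit():
        strid, rest = rest[0], rest[1:]
    pide, wsim, lali, ngap, lgap, lseq2 = (int(t) for t in rest[:6])
    accnum = rest[6]
    name = " ".join(rest[7:])
    return MaxhomHit(ident, strid, pide, wsim, lali, ngap, lgap, lseq2,
                     accnum, name)


hit = parse_summary_row(
    "prio_saisc 96 96 253 1 7 260 P40258 MAJOR PRION PROTEIN PRECU"
)
print(hit.ident, hit.pide, hit.lali, hit.lseq2, hit.accnum)
```

Rows parsed this way can then be filtered, for example by keeping only hits with a PIDE above a chosen cutoff. Any such cutoff is a choice made by the reader and is not prescribed by the output above.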
---
--- MAXHOM ALIGNMENT: IN MSF FORMAT
MSF of: /home/phd/tmp/t1_11691.hssp from: 1 to: 253
/home/phd/tmp/t1_11691.ret_msf MSF: 253 Type: P 9-Aug-96 02:00:0 Check: 7304 ..
Name: t1_11691 Len: 253 Check: 8781 Weight: 1.00
Name: prio_human Len: 253 Check: 8781 Weight: 1.00
Name: prio_gorgo Len: 253 Check: 9429 Weight: 1.00
Name: prio_pantr Len: 253 Check: 9714 Weight: 1.00
Name: prio_ponpy Len: 253 Check: 629 Weight: 1.00
Name: prio_colgu Len: 253 Check: 433 Weight: 1.00
Name: prio_prefr Len: 253 Check: 92 Weight: 1.00
Name: prio_atege Len: 253 Check: 4448 Weight: 1.00
Name: prio_macfa Len: 253 Check: 9975 Weight: 1.00
Name: prio_saisc Len: 253 Check: 346 Weight: 1.00
Name: prio_calja Len: 253 Check: 8833 Weight: 1.00
Name: prio_calmo Len: 253 Check: 6952 Weight: 1.00
Name: prio_cebap Len: 253 Check: 9659 Weight: 1.00
Name: prio_cerae Len: 253 Check: 2862 Weight: 1.00
Name: prio_mansp Len: 253 Check: 5552 Weight: 1.00
Name: prio_aottr Len: 253 Check: 4849 Weight: 1.00
Name: prio_bovin Len: 253 Check: 542 Weight: 1.00
Name: prio_rat Len: 253 Check: 161 Weight: 1.00
Name: prio_trast Len: 253 Check: 828 Weight: 1.00
Name: prip_bovin Len: 253 Check: 515 Weight: 1.00
Name: prio_mouse Len: 253 Check: 1680 Weight: 1.00
Name: prio_sheep Len: 253 Check: 1310 Weight: 1.00
Name: prio_mesau Len: 253 Check: 9948 Weight: 1.00
Name: prio_odohe Len: 253 Check: 1063 Weight: 1.00
Name: prp2_trast Len: 253 Check: 272 Weight: 1.00
Name: prio_musvi Len: 253 Check: 54 Weight: 1.00
Name: grp_horvu Len: 253 Check: 759 Weight: 1.00
Name: grw1_lyces Len: 253 Check: 2041 Weight: 1.00
Name: grp2_sinal Len: 253 Check: 4903 Weight: 1.00
Name: grp8_arath Len: 253 Check: 6106 Weight: 1.00
Name: nucl_xenla Len: 253 Check: 6307 Weight: 1.00
Name: prio_chick Len: 253 Check: 9139 Weight: 1.00
Name: gar1_schpo Len: 253 Check: 3186 Weight: 1.00
Name: grpa_medfa Len: 253 Check: 2708 Weight: 1.00
Name: egg1_schja Len: 253 Check: 6293 Weight: 1.00
Name: grp1_phavu Len: 253 Check: 3033 Weight: 1.00
Name: grp2_nicsy Len: 253 Check: 830 Weight: 1.00
Name: roab_xenla Len: 253 Check: 3477 Weight: 1.00
Name: grp1_orysa Len: 253 Check: 6544 Weight: 1.00
Name: k1ci_human Len: 253 Check: 3405 Weight: 1.00
Name: grp2_sorvu Len: 253 Check: 9465 Weight: 1.00
Name: grp1_sinal Len: 253 Check: 3668 Weight: 1.00
Name: gr10_brana Len: 253 Check: 4828 Weight: 1.00
Name: ebn1_ebv Len: 253 Check: 4393 Weight: 1.00
Name: roa2_human Len: 253 Check: 6823 Weight: 1.00
Name: ch15_drogr Len: 253 Check: 2332 Weight: 1.00
Name: asf1_helan Len: 253 Check: 4095 Weight: 1.00
Name: pcp_yeren Len: 253 Check: 3128 Weight: 1.00
Name: grp_dauca Len: 253 Check: 2776 Weight: 1.00
Name: ykr3_caeel Len: 253 Check: 4721 Weight: 1.00
Name: vnua_prvka Len: 253 Check: 5805 Weight: 1.00
Name: chb3_bommo Len: 253 Check: 2045 Weight: 1.00
Name: grpa_maize Len: 253 Check: 2510 Weight: 1.00
Name: grp_arath Len: 253 Check: 1075 Weight: 1.00
Name: els_human Len: 253 Check: 5247 Weight: 1.00
Name: roaa_xenla Len: 253 Check: 479 Weight: 1.00
Name: grp2_orysa Len: 253 Check: 7212 Weight: 1.00
Name: rnha_human Len: 253 Check: 9503 Weight: 1.00
Name: grp7_arath Len: 253 Check: 6751 Weight: 1.00
Name: grp2_phavu Len: 253 Check: 1994 Weight: 1.00
Name: chb8_bommo Len: 253 Check: 1451 Weight: 1.00
Name: grp1_cheru Len: 253 Check: 3797 Weight: 1.00
Name: els_bovin Len: 253 Check: 727 Weight: 1.00
Name: nucl_rat Len: 253 Check: 3798 Weight: 1.00
Name: nucl_human Len: 253 Check: 3956 Weight: 1.00
Name: nucl_mouse Len: 253 Check: 3798 Weight: 1.00
Name: vg38_bpm1 Len: 253 Check: 3833 Weight: 1.00
Name: grp1_pethy Len: 253 Check: 5946 Weight: 1.00
Name: mcba_ecoli Len: 253 Check: 8534 Weight: 1.00
Name: roa1_drome Len: 253 Check: 150 Weight: 1.00
Name: grp3_artsa Len: 253 Check: 8770 Weight: 1.00
Name: chb7_bommo Len: 253 Check: 9468 Weight: 1.00
Name: roa1_scham Len: 253 Check: 1032 Weight: 1.00
Name: sala_droor Len: 253 Check: 6894 Weight: 1.00
Name: els_chick Len: 253 Check: 553 Weight: 1.00
Name: spd1_nepcl Len: 253 Check: 1595 Weight: 1.00
Name: sala_drosi Len: 253 Check: 2965 Weight: 1.00
Name: chb4_bommo Len: 253 Check: 7618 Weight: 1.00
Name: ews_human Len: 253 Check: 1326 Weight: 1.00
Name: sqd_drome Len: 253 Check: 5891 Weight: 1.00
Name: ydh3_hsvsc Len: 253 Check: 1140 Weight: 1.00
Name: sala_drome Len: 253 Check: 6682 Weight: 1.00
Name: egg2_schja Len: 253 Check: 1991 Weight: 1.00
Name: ssb_ecoli Len: 253 Check: 2795 Weight: 1.00
Name: prpc_human Len: 253 Check: 7305 Weight: 1.00
//
1 50
t1_11691 MANLGCWMLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP
prio_human MANLGCWMLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP
prio_gorgo MANLGCWMLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP
prio_pantr MANLGCWMLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP
prio_ponpy MANLGCWMLV LFVATWSNLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP
prio_colgu MANLGCWMLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP
prio_prefr MANLGCWMLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP
prio_atege .......MLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP
prio_macfa MANLGCWMLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP
prio_saisc MANLGCWMLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP
prio_calja MANLGCWMLF LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP
prio_calmo .......MLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP
prio_cebap MANLGCWMLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNLYP
prio_cerae MANLGCWMLV VFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP
prio_mansp .......MLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP
prio_aottr .......MLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QSSPGGNRYP
prio_bovin .SHIGSWILV LFVAMWSDVG LCKKRPKpgG WNTGGSRYPG QGSPGGNRYP
prio_rat .......... .......... ........GG WNTGGSRYPG QGSPGGNRYP
prio_trast .SHIGSWILV LFVAMWSDVA LCKKRPKpgG WNTGGSRYPG QGSPGGNRYP
prip_bovin .SHIGSWILV LFVAMWSDVG LCKKRPKpgG WNTGGSRYPG QGSPGGNRYP
prio_mouse MANLGYWLLA LFVTMWTDVG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP
prio_sheep .SHIGSWILV LFVAMWSDVG LCKKRPKpgG WNTGGSRYPG QGSPGGNRYP
prio_mesau MANLSYWLLA LFVAMWTDVG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP
prio_odohe .SHIGSWILV LFVAMWSDVG LCKKRPKpgG WNTGGSRYPG QGSPGGNRYP
prp2_trast .SHIGSWILV LFVAMWSDVA LCKKRPKpgG WNTGGSRYPG QGSPGGNRYP
prio_musvi .SHIGSWLLV LFVATWSDIG FCKKRPKpgG WNTGGSRYPG QGSPGGNRYP
grp_horvu .......... .........G GGGYPGGGGG YGGGGGGYPG HGGEGGGGY.
grw1_lyces .......... .......... .......... .......... ...GGGGRYP
grp2_sinal .......... .......... ..NEAQSRGS GAGGGGRGGG GGYRGGGGY.
grp8_arath .......... .......... ..QSRGSGGG GGGRGGSGGG YRSGGGGGYS
nucl_xenla .......... .......... ..SQRGGRGG FGRGGGFRGG RGGRGGG...
prio_chick MARLlcCLLA LLLAACTDVA LSKkkPSGGG WGAGSHRQPS YPRQPGYPHN
gar1_schpo .......... .......... .......... .........G PKKPKGARNG
grpa_medfa .......... .......... .......GGG YNHGGGGYNG GGYNHGG...
egg1_schja LAAIG.YTIA YPPPSDYDSG YGGGGGGGGG GGYGGWCGGS DCYGGGNGGG
grp1_phavu .......... .........G YGGGAGKGGG EGYGGGGANG GGYGGGGGSG
grp2_nicsy .......... .........G GGGGGGRGGG GYGGGSGGYG GGGRGGSRGY
roab_xenla .......... ......SRGG FGNDNFGGRG GNFGGNR.GG GGGFGNRGYG
grp1_orysa .......FLL LLTISLSKSN AARVIKYNGG GSGGGGGGGG GGGGGGNGSg
k1ci_human .......... .......... ........GG GGSGGGYGGG SGSRGGSGGs
grp2_sorvu ......FGFV TFSSEQSMLD AIEngKELDG RNITVNQAQS RGGGGGGGGY
grp1_sinal .......... ..IEGMNGQD LDGRSITVNE AQSRGSGGGG GGRGGGGGYR
gr10_brana .......... ....TFSQFG EVIDSKIIND RETGRSRGFG FVTFKDEKsq
ebn1_ebv .......... .........G GTGAGAGAGG AGAGGAGAGG GAGAGGGAGG
roa2_human .......... ......SRGG GGNFGPGPGS NFRGGSDGYG SGRGFGDGYN
ch15_drogr .......... .......... .......... .......... ..........
asf1_helan .......... .......... ..NPGPPPGA PGTPGTPPAP PGKGEGDAPH
pcp_yeren .......... .......... .......... .......... ..........
grp_dauca MAEVEYRCFV GGLAwfSQFG DITDSKIIND RETGRSrlDG RNITVNEAQS
ykr3_caeel .......... .......... .......... .......... .GSIAGNLIR
vnua_prvka .......... .........G AALPARGPGG LRGRGRGGRG GGGGGGGRGP
chb3_bommo .......... ..VGVSGNLP FLGTADVAGE FPTAGIGEIL YGCGNGAviT
grpa_maize .......... .FVTFSSENS MLDAIENMNG KELDGR.... ..NITVNQAQ
grp_arath .......... ...GSGGGLG GGIGGGAGGG AGGGGGLGGG HGGGIGGGAG
els_human .........V LPGARFPGVG VLPgkPKAPG VGGAFAGIPG VGPFGG....
roaa_xenla .......... .........G NRGGGGGFGN RGYGGDGYNG DGQLWWQPSL
grp2_orysa LAILVLLSIG MTTSARTLLG YGPGGGGGGG GEGGGGGYGG SGYGSGSGYG
rnha_human .......... .......... .MARYDNGSG YRRGGSSYSG GGYGGGYSSG
grp7_arath .......... .......... ........DG RSITVNEAQS RGSGGGGGHR
grp2_phavu .......... ......YGTG GGAGGGGGGG GDHGGGYGGG QGAGGGAGGG
chb8_bommo .......... ..VGVCGNLP FLGTADVAGE FPTAGIGEID YGCGNGAviT
grp1_cheru LLGLSIAFAI LISSEVAARE LAETAAKTEG YNNGGGYHNG GGGYNNGGGY
els_bovin .......... ........LL LCILQPSQPG GVPGAVP... GGVPGGVFFP
nucl_rat .......... .......... ..AKEAMEDG EIDGNKVTLD WAKPKGEGGF
nucl_human .......... .......... ..AKEAMEDG EIDGNKVTLD WAKPKGEGGF
nucl_mouse .......... .......... ..AKEAMEDG EIDGNKVTLD WAKPKGEGGF
vg38_bpm1 .......... .......... .......... LNIHGVTMYG RGGNGGSNSP
grp1_pethy .......... .........G AGGGFGGGAG GGAGGGLGGG GGLGGGGGGG
mcba_ecoli .......... .......... .......... .......... ..........
roa1_drome .......... ........VD VKKALPKQND QQGGGGGRGG PGGRAGGNRG
grp3_artsa .......... .......... .......... .......... ..........
chb7_bommo .......... ..VGVSGNLP FLGTADVAGE FPTAGIGEID YGCGNGAviT
roa1_scham .......... ..VGGGAGGG WGGGRGDWGG SAGGGG...G GGWGGADPWE
sala_droor .......... .......... .......... .......... NGYGQGGQGP
els_chick AAPLLPGVLL LFSILPASQQ GGVPGAIPGG GVPGGGFFPG AGVGGL....
spd1_nepcl ........LG SQGAGRGGQG AGAAAAAAGG AGQGGYGGLG SQGAGRGGLG
sala_drosi .MKLLIALLA LVTAAIAQNG F......... .........G QGGYGGQ...
chb4_bommo .......... .......... .........G RGCGGRGYGG LGY.......
ews_human .AAVEWFDGK DFQGSKLKVS LARKKPPMNS MRGGLPPREG RGMPPPLRGG
sqd_drome .......... ......KEVD VKRATPKPEN QMMGGMRGGP RGGMRGGRGG
ydh3_hsvsc .......... .......... .......... .SPGGPGGPG GPGGPGGPGG
sala_drome .......... .......... .......... IAQNGFGQVG QGGYGGQ...
egg2_schja LAAIG.YTIA YPPSSDYDSG YGGGGGGGGG GGYGGWCGGS DCYGGGNGGG
ssb_ecoli .......... LRTRKWTDQS GQDRYTTEVV VNVGGTMQML GGRQGGG..A
prpc_human .......... .........G NQDDGPQQGP PQQGGQQQQG PPPPQGKPqp
51 100
t1_11691 PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHSQWN
prio_human PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHSQWN
prio_gorgo PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHSQWN
prio_pantr PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHSQWN
prio_ponpy PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHSQWN
prio_colgu PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHSQWN
prio_prefr PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHSQWN
prio_atege .........P QGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHNQWN
prio_macfa PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHNQWH
prio_saisc PQGGggWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHNQWN
prio_calja PQGGG.WGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHSQWN
prio_calmo PQGGGSWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHNQWN
prio_cebap PQGGG.WGQP HGGGWGQPHG GGWGQPHGGS WGQPHGGGWG QGGGTHNQWN
prio_cerae PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQ....... .GGGTHNQWH
prio_mansp PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHNQWH
prio_aottr PQSGG.WGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHNQWN
prio_bovin PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG qqGGTHGQWN
prio_rat PQSGGTWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWS QGGGTHNQWN
prio_trast SQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG qqGGTHGQWN
prip_bovin PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGGW GQGGSHSQWN
prio_mouse PQGGT.WGQP HGGGWGQPHG GSWGQPHGGS WGQPHGGGWG QGGGTHNQWN
prio_sheep PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGGW GQGGSHSQWN
prio_mesau PQGGGTWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHNQWN
prio_odohe PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGGW GQGGTHSQWN
prp2_trast PQEGGDWGQP HGGGWGQPHV GGWGQPHGGG WGQPHGGGGW GQGGTHGQWN
prio_musvi PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPhgGGWG QGGGSHGQWG
grp_horvu ..GGGGGYPG HGGEGGGGYG GGGGYHGHGG EG...GGGYG GGGGYHGH..
grw1_lyces GGGGGGRG.. .....GGRYS GGGGRGGGGG RGGRGGGG.. ..........
grp2_sinal GGGGGGYGGG RREGGGYSGG GGGYSSRGGG GGGYGGGGRR DGGGY.....
grp8_arath GGGGGGYSGG GGGGYER.RS GGYGSGGGGG GRGYGGGGRR EGGGY.....
nucl_xenla ..GGRGFGG. RGGGRGR... GGFGGRGGGG FRGGQGGGFR GGQGKKMRFD
prio_chick PGYPHNPGYP HNPGygYPHN PGYPqpHNPG YPGWGQGYNP SSGGSYHNQK
gar1_schpo PAGRGGRGGF RGGRGGS..R GGFGGNSRGG FGGGSRGGFG GGSRGGSR..
grpa_medfa ..GGYNNGGG YNHGGGGYNN GGGGYNHGGG GYNNGGGGYN HGGGGYNNGG
egg1_schja GGGGGGNGGE YGGGYGDVYG GSYGGggGGG YGDVYGGGCG ggGGN.....
grp1_phavu GGGGGGAGGa yGGGEGSGAG GGYGGANGGG GGGNGGGGGG GSGGAHGGGA
grp2_nicsy GGGDGGYGGG GGYGGGSRYG GGGGG.YGGG GGY...GGGG SGGGSGCFKC
roab_xenla GDGYNGDGQl wNRGYGAGQG GGYGAGQGGG YGGgqGGGYG GNGGYDGYNG
grp1_orysa gGGGGGGGGG NGSGSGSGYG YGYGQGNGGA QGQGSGGGGG GGGGGGGGGs
k1ci_human sGSGGGSGGG YGGGSGGGHS GGSGGGHSGG SGGNYGGGSG SGGGSGGGYg
grp2_sorvu GGGGGGYGGR EGGGYGG.GG GGYGGRREGG GGY.GGGGYG GGGGGY....
grp1_sinal SGGGGGYGGG GGGYGGGGRE GGYSG.GGGG YSSRGGGGGG YGGGGRRD..
gr10_brana SRGGGGGGGR GGGGYGGRGG GGYGGGGGGY GDRRGGGGYG SGGGGRGGGG
ebn1_ebv AGGAGGAGAG GGAGAGGGAG GAGGAGAGGG AGAggAGGAG AGGGAGGAGG
roa2_human GYGGGPGGGN FGGSPG..YG GGRGGYGGGG PGYgqGGGYG ggSGNYNDFG
ch15_drogr .......... SAGGYGNIGL GGYGL.GNVG YLQNHGGGYG RRPILISKSS
asf1_helan pdGGSGPAPP AGGGSPPPAG GDGGGGAPPP AGGDGGGGAP PPAGGDG...
pcp_yeren .......... QGGDDNNVMG AIGGAVLGGF LGNTVGGGTG RSLAT.....
grp_dauca RGSGGGGGRR EGGGGGYGGG GGYGGRREGG GGGGYGGRRE GGGGGYGG..
ykr3_caeel DKVGGAGGDI LGGLASNFFG GGGGGGGGGG GGGFGGGNGG FGGGIFIFKI
vnua_prvka RGRGGRRRRr lGGGRGRGGR GGRGGRGRGG GRAPRGGGGG PGGGGRAGRG
chb3_bommo REGGLGYGAG YGGGYGLGYG G.....YGGG YGLGYGGYGG CGCG......
grpa_maize SRGGGGGGGG YGGGRGGGGY GGGRRDGGYG GGGGYGGRRE GGGGGYG...
grp_arath GGAGGGLGGG HGGGIGGGAG GGSGGGLGGG IGGGAGGGAG GGGGAGGGGG
els_human PQPGVPLGYp lPGGYGLPYT TgyGYGPGGV AGAAGKAGYP TGTGVGPQAA
roaa_xenla LGWNRGYGAG QGGGYGAGQG GGYGGgqGGG YGGnsGGNFG SSGGYNDFGN
grp2_orysa ..EGGGSGGA AGGGYGRGGG GGGGGGEGGG SGSggGGGGG GQGGGAGGYG
rnha_human GYGSGGYGGs vGGGYRGVSR GGFRGNSGGD YRGPSGGYRG SGGFQR....
grp7_arath GGGGGGYRSG GGGGYS.... GGGGSYGGGG GRREGGGGYS GGGGGYSSRG
grp2_phavu YGGGGEHGGG GGGGQGGGAG GGYGAgaGGG QGGGAGGGYG AGGEHGGGAG
chb8_bommo REGGLGYGAG YGDGYGLGYG G.....YGGG YGLGYGGYGG CGCG......
grp1_cheru HNGGGGYNNg hNGGGGYNNG GGygGHHNGG GGYNNGGGYH GGGGSCYHYC
els_bovin GAGLGGLGVG GLGPGVKPAK PGVGGLVGPG LGAepGGFFG AGGGA.....
nucl_rat GGRGGGRGGF GGRGGGRGGR GGFGGRGRGG FGGRGGFRGG RGGGGDF...
nucl_human GGRGGGRGGF GGRGGGRGGR GGFGGRGRGG FGGRGGFRGG RGGGGD...H
nucl_mouse GGRGGGRGGF GGRGGGRGGR GGFGGRGRGG FGGRGGFRGG RGGGGDF...
vg38_bpm1 GSAGGHCIQN NIGGRLRINN GGAIAGGGGG GGGggGGGRP FGAAGGYSGG
grp1_pethy AGGGGGVGGG AGSGGGFGAG GGVGGGAGAG GGVGGGGGFG GGGGGGVGGG
mcba_ecoli ...GVGIGGG GGGGGGGSCG GQGGG..CGG CSNGCSGGNG GSGGSGSH..
roa1_drome NMGGGNYGNQ NGGGNWNNGG NNWGNNRGGn fGGGGGGGGG YGGGNNSWGN
grp3_artsa .MGGPGPMGP QGRGRGRGRG GFSGPdmDPG YGF.DESYCG MGGGYEMPYN
chb7_bommo REGGFGYGAG YGDGYGLGFG G.....YGGG YGLGYGGYGG CG........
roa1_scham NGRGGGGDRW GGGGGGMGGG DRWGGGGGMG GGDRYGGGGG RSGGWSNDGY
sala_droor YGGQGGFGGY GGLGGQAGFG GQIGFNGQGG VGGQLGVG.. QGGVSPGQ..
els_chick ...GAGLGAG LGAGGKPLKP GVSGLGGLGP LGLQPGAGVG GLGAGLGAFP
spd1_nepcl GQGAGAAAAA AAGGAGQGGY GGLGNQGAGR GGQggGAGQG GYGGLGSQGA
sala_drosi ....GGFGGF GGLGGQAGFG GQIGFNGQGG VGG..QVGIG QGGVHPGQ..
chb4_bommo ..GGLGYGGL GYGGLGGGCG RGFS...GGG LPVATASAAP TGLGIASeyE
ews_human PGGPGGPGGP MggGRGGDRG GFPPRGPRGS RGNPSGGGNV QHRAGDWQCP
sqd_drome YGGRGGYNNq dGQGSYGGYG GGYGGYGAGG YGDYYAGGyg YGGGFEGNGY
ydh3_hsvsc PGGPGGPGGP CGPGGPCGPG GPCGPGGPGG PGGPRSPVSS IG........
sala_drome ....GGFGGF GGIGGQAGFG GQIG..FTGQ GGVSGQVGIG QGGVHPGQ..
egg2_schja GGGGGGNGGE YGGGYGDVYG GSYGGgyGGG NGGGNGGGGG CNGGGcnDYY
ssb_ecoli PAGGNIGGGQ PQGGWGQPQQ PQGGNQFSGG .......... ..........
prpc_human PQQGGHPPPP QGRPQGPPQQ GGHPRPPRGR PQGPPQQGGH QQGPPPPPPG
101 150
t1_11691 KPSKPKTNMK HMAGAAAAGA VVGGLGGYML GSAMSRPIIH FGSDYEDRYY
prio_human KPSKPKTNMK HMAGAAAAGA VVGGLGGYML GSAMSRPIIH FGSDYEDRYY
prio_gorgo KPSKPKTNMK HMAGAAAAGA VVGGLGGYML GSAMSRPIIH FGSDYEDRYY
prio_pantr KPSKPKTNMK HMAGAAAAGA VVGGLGGYML GSAMSRPIIH FGSDYEDRYY
prio_ponpy KPSKPKTNMK HMAGAAAAGA VVGGLGGYML GSAMSRPIIH FGNDYEDRYY
prio_colgu KPSKPKTSMK HMAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY
prio_prefr KPSKPKSNMK HMAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY
prio_atege KPSKPKTNMK HMAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY
prio_macfa KPSKPKTSMK HMAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY
prio_saisc KPSKPKTNMK HMAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY
prio_calja KPSKPKTNMK HVAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY
prio_calmo KPSKPKTNMK HVAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY
prio_cebap KPSKPKTSMK HVAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY
prio_cerae KPSKPKTSMK HMAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY
prio_mansp KPNKPKTSMK HMAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY
prio_aottr KPSKPKTNMK HMAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY
prio_bovin KPSKPKTNMK HVAGAAAAGA VVGGLGGYML GSAMSRPLIH FGSDYEDRYY
prio_rat KPSKPKTNLK HVAGAAAAGA VVGGLGGYML GSAMSRPMLH FGNDWEDRYY
prio_trast KPSKPKTNMK HVAGAAAAGA VVGGLGGYML GSAMSRPLIH FGSDYEDRYY
prip_bovin KPSKPKTNMK HVAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY
prio_mouse KPSKPKTNLK HVAGAAAAGA VVGGLGGYML GSAMSRPMIH FGNDWEDRYY
prio_sheep KPSKPKTNMK HVAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY
prio_mesau KPSKPKTNMK HMAGAAAAGA VVGGLGGYML GSAMSRPMMH FGNDWEDRYY
prio_odohe KPSKPKTNMK HVAGAAAAGA VVGGLGGYML GSAMNRPLIH FGNDYEDRYY
prp2_trast KPSKPKTNMK HVAGAAAAGA VVGGLGGYML GSAMSRPLIH FGSDYEDRYY
prio_musvi KPSKPKTNMK HVAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY
grp_horvu .......... ...GGEGGGG YGGGGGGY.. .......... ..........
grw1_lyces .......... .......... .......... .......... ..........
grp2_sinal .......... ..GGGEGGGY GGGGGGGW.. .......... ..........
grp8_arath .......... ...GGGDGGS YGGGGGGW.. .......... ..........
nucl_xenla .......... .......... .......... .......... ..........
prio_chick PWKPPKTNFK HVAGAAAAGA VVGGLGGYAM GRVMSGMNYH FDSPDEYRWW
gar1_schpo .......... ........GG FRGGSRGGFR GR........ ..........
grpa_medfa GGYN...... HGGGGYNGGG YNHGGGGYNH G......... ..........
egg1_schja .......... ........GG GNGGGGGCNG GGCGGGPD.F YGKGYEDSYG
grp1_phavu AGGGEGAGQG aaAGGGGRGS GGGGGGGYGG GGARGSGYGG GGGSGE....
grp2_nicsy GESGHFARDC SQSGGGGGGG RFGGGGGGGG GGGCYK.... ..........
roab_xenla GGS....... ...GFSGSGG NFGSSGGYnf GNYNSQSSSN FGPMKGGNY.
grp1_orysa qGSGSGYGYG YGKGGGGGGG GGGGGGGGGG GS........ ..........
k1ci_human sGSRGGSGGS HGGGSGFGGE SGGSYGG... GEEASGSGGG YGGGSGKSSH
grp2_sorvu .......... ...GGREGGG GYGGGGGYGG NRGDSGGNWR ..........
grp1_sinal .......... ........GG EGGGYGGSGG G......... ..........
gr10_brana YGSG.GGGYG GGGGRRDGGG YGGGDGGYGG GS........ ..........
ebn1_ebv AGAGGGAGAG GGAGGAGAGG GAGGAGGAGA G......... ..........
roa2_human NYNQQPSNYG PMKSGNFGGS rgGPYGGGNY GPGGSGGSGG YG........
ch15_drogr NPSAAAanQR GVIGYELDGG ILGGHGGYGG G......... ..........
asf1_helan .......... ..GGAPPPGA .......... .......... ..........
pcp_yeren .......... ......AAGA VAGGMAGQGV QGAMNR.... ..........
grp_dauca .......... ..GGGGYGGR REGGDGGYGG GGGGSR.... ..........
ykr3_caeel VRQKFPKNSS SF........ .......... .......... ..........
vnua_prvka EVRVAAAAAG AAEAAAAAEG ALSG...... .......... ..........
chb3_bommo .......... .......... .......... .......... ..........
grpa_maize .......... ..GGGGYGGR REGGGGGYGG GGGGWR.... ..........
grp_arath LGGGHGGGFG GGAGGGLGGG AGGGTGGGFG GGAGGGAGGG AGGGF.....
els_human AAAAAKAAAK FGAGAAGVLP GVGGAGVPGV PGAIPGIGGI AG........
roaa_xenla YNSQSSSNFG PMKGGNYGGG RNSGpgGYGG GSASSSSGYG GGRRF.....
grp2_orysa QGSGYGSGYG SGAGGAHGGG YGSGGGGGGG GGQGGGSGYG SGSGYGSGYG
rnha_human .......... ........GG GRGAYGTGYL DIEEEVAAIK LG........
grp7_arath GGGGSYGGGR REGGGGYGGG EGGGYGGSGG G......... ..........
grp2_phavu GGQGGGAGGG YGAGGEHGGG AGGGQGGGAG GGYGAGGEHG GGA.......
chb8_bommo .......... .......... .......... .......... ..........
grp1_cheru HGR....... .CCSAAEAKA L......... .......... ..........
els_bovin ..AGAAAAYK AAAKAGAAGL GVGGIGGVgl GVSTGAVVPQ LGAGVGAGVK
nucl_rat KPQGKKTKFE .......... .......... .......... ..........
nucl_human KPQGKKTKFE .......... .......... .......... ..........
nucl_mouse KPQGKKTKFE .......... .......... .......... ..........
vg38_bpm1 SASTAGTLTG AGIGSKPGNA IYGGNGGN.V GSAGGAFGGI SGSRY.....
grp1_pethy SGHGGGFGAG GGVGGGAGGG LGGGVGGGGG GGSGGGGGIG GGSGHGGGF.
mcba_ecoli .......... .......... .......... .......... ..........
roa1_drome NNPWDNGNGG GNFGGGGNNW NNGgfGGYqy GGGPQRGGGN FNNNRMQPY.
grp3_artsa GNAGWTASPG RGAGAGARGA .RGGLDQSRG GGKFPSARGG RGR.......
chb7_bommo .......... .......... .......... .......... ..........
roa1_scham NSGPQSDGFG GGYKQSYGGG AVRGSSGY.. GGSRSAPYSD RGS.......
sala_droor .......... ..GGFAAQGP PNQYQPGY.. GSPVGSGHFH GGNPVDAGYI
els_chick GAAFpaASAA ALKAAAKAGA GLGGVGG... .......... ..........
spd1_nepcl GRGGLGGQGA GAAAAAAGGA GQGGYGGLGG QGAGQGGYGG LGSQGAGR..
sala_drosi .......... ..GGFAGQGS PNQYQPGY.. GNPVGSGHFH GGNPVESGHF
chb4_bommo GTVGVCGNLP FLGTAAVAGE ftVGIGEILY GCGNGAVGIt yGAGYGGGY.
ews_human NPGCGNQNFA wdRGRGGPGG MRGGRGGLM. .......... ..........
sqd_drome GGGGGGGNMG GGRGGPRGGG GPKGGGGFNG G......... ..........
ydh3_hsvsc .......... .......... .......... .......... ..........
sala_drome .......... ..GGFAGQGS PNQYQPGY.. GSPVGSGHFH GANPVESGHF
egg2_schja GGSNGRRNGH GKGGKGGNGG GGGKGGGKGG GNGEGNGKGg kGGSYAPSYY
ssb_ecoli .......... .......... .......... .......... ..........
prpc_human KPQGPPPQGG RPQGPP.... .......... .......... ..........
151 200
t1_11691 RENMHRYPNQ VYYRPMDEYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE
prio_human RENMHRYPNQ VYYRPMDEYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE
prio_gorgo RENMHRYPNQ VYYRPMDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE
prio_pantr RENMHRYPNQ VYYRPMDQYS SQNNFVHDCV NITIKQHTVT TTTKGENFTE
prio_ponpy RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE
prio_colgu RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE
prio_prefr RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE
prio_atege RENMYRYPNQ VYYRPVDQYN NQNNFVHDCV NITIKQHTVT TTTKGENFTE
prio_macfa RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE
prio_saisc RENMYRYPSQ VYYRPVDQYS NQNNFVHDCV NVTIKQHTVT TTTKGENFTE
prio_calja RENMYRYPNQ VYYRPVDQYN NQNNFVHDCV NITIKQHTVT TTTKGENFTE
prio_calmo RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE
prio_cebap RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE
prio_cerae RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE
prio_mansp RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE
prio_aottr RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE
prio_bovin RENMHRYPNQ VYYRPVDQYS NQNNFVHDCV NITVKEHTVT TTTKGENFTE
prio_rat RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE
prio_trast RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITVKQHTVT TTTKGENFTE
prip_bovin RENMHRYPNQ VYYRPVDQYS NQNNFVHDCV NITVKEHTVT TTTKGENFTE
prio_mouse RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE
prio_sheep RENMYRYPNQ VYYRPVDRYS NQNNFVHDCV NITVKQHTVT TTTKGENFTE
prio_mesau RENMNRYPNQ VYYRPVDQYN NQNNFVHDCV NITIKQHTVT TTTKGENFTE
prio_odohe RENMYRYPNQ VYYRPVDQYN NQNTFVHDCV NITVKQHTVT TTTKGENFTE
prp2_trast RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITVKQHTVT TTTKGENFTE
prio_musvi RENMYRYPNQ VYYKPVDQYS NQNNFVHDCV NITVKQHTVT TTTKGENFTE
grp_horvu .......... .......... .......... .......... ..........
grw1_lyces .......... .......... .......... .......... ..........
grp2_sinal .......... .......... .......... .......... ..........
grp8_arath .......... .......... .......... .......... ..........
nucl_xenla .......... .......... .......... .......... ..........
prio_chick SENSARYPNR VYYRDYSSPV PQDVFVADCF NITVTEYSIG PAAKKNTSee
gar1_schpo .......... .......... .......... .......... ..........
grpa_medfa .......... .......... .......... .......... ..........
egg1_schja GDS...YGND YY........ .......... .......... ..........
grp1_phavu .......... .......... .......... .......... ..........
grp2_nicsy .......... .......... .......... .......... ..........
roab_xenla .......... .......... .......... .......... ..........
grp1_orysa .......... .......... .......... .......... ..........
k1ci_human S......... .......... .......... .......... ..........
grp2_sorvu .......... .......... .......... .......... ..........
grp1_sinal .......... .......... .......... .......... ..........
gr10_brana .......... .......... .......... .......... ..........
ebn1_ebv .......... .......... .......... .......... ..........
roa2_human .......... .......... .......... .......... ..........
ch15_drogr .......... .......... .......... .......... ..........
asf1_helan .......... .......... .......... .......... ..........
pcp_yeren .......... .......... ......TDGV QLEVRKDDGT TILVVQKQGP
grp_dauca .......... .......... .......... .......... ..........
ykr3_caeel .......... .......... .......... .......... ..........
vnua_prvka .......... .......... .......... .......... ..........
chb3_bommo .......... .......... .......... .......... ..........
grpa_maize .......... .......... .......... .......... ..........
grp_arath .......... .......... .......... .......... ..........
els_human .......... .......... .......... .......... ..........
roaa_xenla .......... .......... .......... .......... ..........
grp2_orysa GGNGHH.... .......... .......... .......... ..........
rnha_human .......... .......... .......... .......... ..........
grp7_arath .......... .......... .......... .......... ..........
grp2_phavu .......... .......... .......... .......... ..........
chb8_bommo .......... .......... .......... .......... ..........
grp1_cheru .......... .......... .......... .......... ..........
els_bovin PGKVPGVGLP GVY....... .......... .......... ..........
nucl_rat .......... .......... .......... .......... ..........
nucl_human .......... .......... .......... .......... ..........
nucl_mouse .......... .......... .......... .......... ..........
vg38_bpm1 .......... .......... .......... .......... ..........
grp1_pethy .......... .......... .......... .......... ..........
mcba_ecoli .......... .......... .......... .......... ..........
roa1_drome .......... .......... .......... .......... ..........
grp3_artsa .......... .......... .......... .......... ..........
chb7_bommo .......... .......... .......... .......... ..........
roa1_scham .......... .......... .......... .......... ..........
sala_droor HGNHHEYPEH HGDHHREHHE HHGHHEHH.. .......... ..........
els_chick .......... .......... .......... .......... ..........
spd1_nepcl .......... .......... .......... .......... ..........
sala_drosi HGNPHEYPEH HGEHHREHHE HHGHHEHH.. .......... ..........
chb4_bommo .......... .......... .......... .......... ..........
ews_human .......... .......... .......... .......... ..........
sqd_drome .......... .......... .......... .......... ..........
ydh3_hsvsc .......... .......... .......... .......... ..........
sala_drome HENPHEYPEH HGDHHREHHE HHGHHEHH.. .......... ..........
egg2_schja .......... .......... .......... .......... ..........
ssb_ecoli .......... .......... .......... .......... ..........
prpc_human .......... .......... .......... .......... ..........
201 250
t1_11691 TDVKMMERVV EQMCITQYER ESQAYYQRGS SMVLFSSPPV ILLISFLIFL
prio_human TDVKMMERVV EQMCITQYER ESQAYYQRGS SMVLFSSPPV ILLISFLIFL
prio_gorgo TDVKMMERVV EQMCITQYER ESQAYYQRGS SMVLFSSPPV ILLISFLIFL
prio_pantr TDVKMMERVV EQMCITQYER ESQAYYQRGS SMVLFSSPPV ILLISFLIFL
prio_ponpy TDVKMMERVV EQMCITQYER ESQAYYQRGS SMVLFSSPPV ILLISFLIFL
prio_colgu TDVKMMERVV EQMCITQYEK ESQAYYQRGS SMVLFSSPPV ILLISFLIFL
prio_prefr TDVKMMERVV EQMCITQYEK ESQAYYQRGS SMVFFSSPPV ILLISFLIFL
prio_atege TDVKMMERVV EQMCITQYER ESQAYYQRGS SMVLFSSPPV ILLISFLI..
prio_macfa TDVKMMERVV EQMCITQYEK ESQAYYQRGS SMVLFSSPPV ILLISFLIFL
prio_saisc TDVKMMERVV EQMCITQYEK ESQAYYQRGS SMVLFSSPPV ILLISFLIFL
prio_calja TDVKMMERVV EQMCITQYEK ESQAYYQRGS SMVLFSSPPV ILLISFLIFL
prio_calmo TDVKMMERVV EQMCITQYEK ESQAYYQRGS SMVLFSSPPV ILLISFLI..
prio_cebap TDVKMMERVV EQMCITQYER ESQAYYQRGS SMVLFSSPPV ILLISFLIFL
prio_cerae TDVKMMERVV EQMCITQYEK ESQAYYQRGS SMVLFSSPPV ILLISFLIFL
prio_mansp TDVKMMERVV EQMCITQYEK ESQAYYQRGS SMVLFSSPPV ILLISFLI..
prio_aottr TDVKIMERVV EQMCITQYEK ESQAYYQRGS SMVLFSSPPV ILLISFL...
prio_bovin TDIKMMERVV EQMCITQYQR ESQAYYQRGA SVILFSSPPV ILLISFLIFL
prio_rat TDVKMMERVV EQMCVTQYQK ESQAYYdrRS SAVLFSSPPV ILLISFLIFL
prio_trast TDIKMMERVV EQMCITQYQR ESEAYYQRGA SVILFSSPPV ILLISFLIFL
prip_bovin TDIKMMERVV EQMCITQYQR ESQAYYQRGA SVILFSSPPV ILLISFLIFL
prio_mouse TDVKMMERVV EQMCVTQYQK ESQAYyrRSS STVLFSSPPV ILLISFLIFL
prio_sheep TDIKIMERVV EQMCITQYQR ESQAYYQRGA SVILFSSPPV ILLISFLIFL
prio_mesau TDIKIMERVV EQMCTTQYQK ESQAYYdrRS SAVLFSSPPV ILLISFLIFL
prio_odohe TDIKMMERVV EQMCITQYQR ESQAYYQRGA SVILFSSPPV ILLISFLIFL
prp2_trast TDIKMMERVV EQMCITQYQR ESEAYYQRGA SVILFSSPPV ILLISFLIFL
prio_musvi TDMKIMERVV EQMCVTQYQR ESEAYYQRGA SAILFSPPPV ILLISLLILL
grp_horvu .......... .......... .......... .......... ..........
grw1_lyces .......... .......... .......... .......... ..........
grp2_sinal .......... .......... .......... .......... ..........
grp8_arath .......... .......... .......... .......... ..........
nucl_xenla .......... .......... .......... .......... ..........
prio_chick MENKVVTKVI REMCVQQYRE YRLASGIQLH PADTWLAVLL LLLTTLFAM.
gar1_schpo .......... .......... .......... .......... ..........
grpa_medfa .......... .......... .......... .......... ..........
egg1_schja .......... .......... .......... .......... ..........
grp1_phavu .......... .......... .......... .......... ..........
grp2_nicsy .......... .......... .......... .......... ..........
roab_xenla .......... .......... .......... .......... ..........
grp1_orysa .......... .......... .......... .......... ..........
k1ci_human .......... .......... .......... .......... ..........
grp2_sorvu .......... .......... .......... .......... ..........
grp1_sinal .......... .......... .......... .......... ..........
gr10_brana .......... .......... .......... .......... ..........
ebn1_ebv .......... .......... .......... .......... ..........
roa2_human .......... .......... .......... .......... ..........
ch15_drogr .......... .......... .......... .......... ..........
asf1_helan .......... .......... .......... .......... ..........
pcp_yeren TRFSVGQRVM .......... .......... .......... ..........
grp_dauca .......... .......... .......... .......... ..........
ykr3_caeel .......... .......... .......... .......... ..........
vnua_prvka .......... .......... .......... .......... ..........
chb3_bommo .......... .......... .......... .......... ..........
grpa_maize .......... .......... .......... .......... ..........
grp_arath .......... .......... .......... .......... ..........
els_human .......... .......... .......... .......... ..........
roaa_xenla .......... .......... .......... .......... ..........
grp2_orysa .......... .......... .......... .......... ..........
rnha_human .......... .......... .......... .......... ..........
grp7_arath .......... .......... .......... .......... ..........
grp2_phavu .......... .......... .......... .......... ..........
chb8_bommo .......... .......... .......... .......... ..........
grp1_cheru .......... .......... .......... .......... ..........
els_bovin .......... .......... .......... .......... ..........
nucl_rat .......... .......... .......... .......... ..........
nucl_human .......... .......... .......... .......... ..........
nucl_mouse .......... .......... .......... .......... ..........
vg38_bpm1 .......... .......... .......... .......... ..........
grp1_pethy .......... .......... .......... .......... ..........
mcba_ecoli .......... .......... .......... .......... ..........
roa1_drome .......... .......... .......... .......... ..........
grp3_artsa .......... .......... .......... .......... ..........
chb7_bommo .......... .......... .......... .......... ..........
roa1_scham .......... .......... .......... .......... ..........
sala_droor .......... .......... .......... .......... ..........
els_chick .......... .......... .......... .......... ..........
spd1_nepcl .......... .......... .......... .......... ..........
sala_drosi .......... .......... .......... .......... ..........
chb4_bommo .......... .......... .......... .......... ..........
ews_human .......... .......... .......... .......... ..........
sqd_drome .......... .......... .......... .......... ..........
ydh3_hsvsc .......... .......... .......... .......... ..........
sala_drome .......... .......... .......... .......... ..........
egg2_schja .......... .......... .......... .......... ..........
ssb_ecoli .......... .......... .......... .......... ..........
prpc_human .......... .......... .......... .......... ..........
253
t1_11691 IVG
prio_human IVG
prio_gorgo IVG
prio_pantr IVG
prio_ponpy IVG
prio_colgu IVG
prio_prefr IVG
prio_atege ...
prio_macfa IVG
prio_saisc IVG
prio_calja IVG
prio_calmo ...
prio_cebap IVG
prio_cerae IVG
prio_mansp ...
prio_aottr ...
prio_bovin IVG
prio_rat IVG
prio_trast IVG
prip_bovin IVG
prio_mouse IVG
prio_sheep IVG
prio_mesau MVG
prio_odohe IVG
prp2_trast IVG
prio_musvi IVG
grp_horvu ...
grw1_lyces ...
grp2_sinal ...
grp8_arath ...
nucl_xenla ...
prio_chick ...
gar1_schpo ...
grpa_medfa ...
egg1_schja ...
grp1_phavu ...
grp2_nicsy ...
roab_xenla ...
grp1_orysa ...
k1ci_human ...
grp2_sorvu ...
grp1_sinal ...
gr10_brana ...
ebn1_ebv ...
roa2_human ...
ch15_drogr ...
asf1_helan ...
pcp_yeren ...
grp_dauca ...
ykr3_caeel ...
vnua_prvka ...
chb3_bommo ...
grpa_maize ...
grp_arath ...
els_human ...
roaa_xenla ...
grp2_orysa ...
rnha_human ...
grp7_arath ...
grp2_phavu ...
chb8_bommo ...
grp1_cheru ...
els_bovin ...
nucl_rat ...
nucl_human ...
nucl_mouse ...
vg38_bpm1 ...
grp1_pethy ...
mcba_ecoli ...
roa1_drome ...
grp3_artsa ...
chb7_bommo ...
roa1_scham ...
sala_droor ...
els_chick ...
spd1_nepcl ...
sala_drosi ...
chb4_bommo ...
ews_human ...
sqd_drome ...
ydh3_hsvsc ...
sala_drome ...
egg2_schja ...
ssb_ecoli ...
prpc_human ...
________________________________________________________________________________
PredictProtein@EMBL-Heidelberg.DE
PHD: Profile fed neural network systems from HeiDelberg
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Prediction of:
- secondary structure, by PHDsec
- solvent accessibility, by PHDacc
- and helical transmembrane regions, by PHDhtm
Author: Burkhard Rost
EMBL, Heidelberg, FRG
Meyerhofstrasse 1, 69 117 Heidelberg
Internet: Predict-Help@EMBL-Heidelberg.DE
All rights reserved.
Please quote
~~~~~~~~~~~~
The PredictProtein mail server is described in:
B Rost: PHD: predicting one-dimensional protein structure by pro-
file based neural networks. Meth. in Enzym., 1996, 266, 525-539.
Additionally to be quoted for publications of PHDsec output:
B Rost & C Sander: Prediction of protein structure at better than
70% accuracy. J. Mol. Biol., 1993, 232, 584-599.
The latest improvement steps (up to 72%) are explained in:
B Rost & C Sander: Combining evolutionary information and neural
networks to predict protein secondary structure. Proteins, 1994,
19, 55-72.
Additionally to be quoted for publications of PHDacc output:
B Rost & C Sander: Conservation and prediction of solvent accessi-
bility in protein families. Proteins, 1994, 20, 216-226.
Additionally to be quoted for publications of PHDhtm output:
B Rost, R Casadio, P Fariselli & C Sander: Prediction of helical
transmembrane segments at 95% accuracy. Prot. Sci., 1995, 4, 521-533.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Prediction of secondary structure by PHDsec
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
About the input to the network
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The prediction is performed by a system of neural networks.
The input is a multiple sequence alignment. It is taken from an HSSP
file produced by the program MaxHom (see:
Sander, Chris & Schneider, Reinhard: Database of Homology-Derived
Structures and the Structural Meaning of Sequence Alignment.
Proteins, 1991, 9, 56-68).
For optimal results the alignment should contain sequences with varying
degrees of sequence similarity relative to the input protein.
The following is an ideal situation:
+-----------------+----------------------+
| sequence: | sequence identity |
+-----------------+----------------------+
| target sequence | 100 % |
| aligned seq. 1 | 90 % |
| aligned seq. 2 | 80 % |
| ... | ... |
| aligned seq. 7 | 30 % |
+-----------------+----------------------+
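The sequence identity in this table can be estimated from two rows of
the alignment. The following Python sketch counts identical residues
over the columns in which both rows hold a residue. The function name
percent_identity is hypothetical, and treating '.' as a gap and
upper-casing lowercase letters are assumptions about the alignment
notation; the PIDE value reported by MaxHom may be computed differently.

```python
def percent_identity(row_a: str, row_b: str, gap: str = ".") -> float:
    """Percent identity over columns in which both rows hold a residue."""
    a = row_a.replace(" ", "").upper()
    b = row_b.replace(" ", "").upper()
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    aligned = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not aligned:
        return 0.0
    identical = sum(1 for x, y in aligned if x == y)
    return 100.0 * identical / len(aligned)

# Columns 201-250 of prio_human and prio_bovin, copied from the alignment.
human = "TDVKMMERVV EQMCITQYER ESQAYYQRGS SMVLFSSPPV ILLISFLIFL"
bovin = "TDIKMMERVV EQMCITQYQR ESQAYYQRGA SVILFSSPPV ILLISFLIFL"
identity = percent_identity(human, bovin)
```

For this 50-column block the two rows differ at 5 positions, so the
sketch gives 90% identity for the block (not for the full sequences).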
Estimated Accuracy of Prediction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A careful cross validation test on some 250 protein chains (in total
about 55,000 residues) with less than 25% pairwise sequence identity
gave the following results:
++================++-----------------------------------------+
|| Qtotal = 72.1% || ("overall three state accuracy") |
++================++-----------------------------------------+
+----------------------------+-----------------------------+
| Qhelix (% of observed)=70% | Qhelix (% of predicted)=77% |
| Qstrand(% of observed)=62% | Qstrand(% of predicted)=64% |
| Qloop (% of observed)=79% | Qloop (% of predicted)=72% |
+----------------------------+-----------------------------+
..........................................................................
These percentages are defined by:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| number of correctly predicted residues
|Qtotal = --------------------------------------- (*100)
| number of all residues
|
| no of res correctly predicted to be in helix
|Qhelix (% of obs) = -------------------------------------------- (*100)
| no of all res observed to be in helix
|
|
| no of res correctly predicted to be in helix
|Qhelix (% of pred)= -------------------------------------------- (*100)
| no of all residues predicted to be in helix
..........................................................................
Averaging over single chains
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The most reasonable way to compute the overall accuracies is the above
quoted percentage of correctly predicted residues. However, since the
user is mainly interested in the expected performance of the prediction
for a particular protein, the mean value when averaging over protein
chains might be of help as well. Computing first the three state
accuracy for each protein chain, and then averaging over 250 chains
yields the following average:
+-------------------------------====--+
| Qtotal/averaged over chains = 72.2% |
+-------------------------------====--+
| standard deviation = 9.3% |
+-------------------------------------+
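The difference between the two ways of averaging can be made concrete
with a small Python sketch. The chain counts below are hypothetical and
chosen only to show that pooling residues weights long chains more
heavily than averaging per chain. Using the population standard
deviation is also an assumption; the report does not say which variant
was used.

```python
from statistics import mean, pstdev

# Hypothetical (correct, total) residue counts for three chains.
chains = [(80, 100), (150, 300), (45, 50)]

# Per-residue accuracy: pool all residues, then divide.
q_pooled = 100.0 * sum(c for c, _ in chains) / sum(n for _, n in chains)

# Per-chain accuracy: compute Q for each chain, then average.
q_each = [100.0 * c / n for c, n in chains]
q_chain_mean = mean(q_each)
q_chain_sd = pstdev(q_each)  # assumed: population standard deviation
```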
..........................................................................
Further measures of performance
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Matthews correlation coefficient:
+---------------------------------------------+
| Chelix = 0.63, Cstrand = 0.53, Cloop = 0.52 |
+---------------------------------------------+
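These coefficients can be recomputed from the counts in the accuracy
matrix that this report lists further below ("The accuracy matrix in
detail"). The Python sketch below treats each state in turn as the
positive class and pools the other two; the names MATRIX and matthews
are hypothetical.

```python
import math

# Rows: observed H, E, L; columns: predicted (net) H, E, L.
MATRIX = [
    [12447, 1255, 3990],
    [949, 7493, 3750],
    [2604, 2875, 19962],
]

def matthews(matrix, k):
    """Matthews correlation for state k, other states pooled as negatives."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                  # observed k, predicted otherwise
    fp = sum(row[k] for row in matrix) - tp   # predicted k, observed otherwise
    tn = total - tp - fn - fp
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

c_helix, c_strand, c_loop = (matthews(MATRIX, k) for k in range(3))
```

Rounded to two decimals, the three values agree with the coefficients
quoted in the box above.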
..........................................................................
Average length of predicted secondary structure segments:
. +------------+----------+
. | predicted | observed |
+-----------+------------+----------+
| Lhelix = | 10.3 | 9.3 |
| Lstrand = | 5.0 | 5.3 |
| Lloop = | 7.2 | 5.9 |
+-----------+------------+----------+
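The average segment length is the mean length of uninterrupted runs of
one state. A minimal Python sketch, assuming the prediction is available
as a string over H (helix), E (strand) and L (loop); the function name
and the example string are made up for illustration.

```python
from itertools import groupby

def mean_segment_lengths(states: str) -> dict:
    """Mean run length per state in a string such as 'LLHHHHLLEEE'."""
    runs = {}
    for state, group in groupby(states):
        runs.setdefault(state, []).append(sum(1 for _ in group))
    return {state: sum(lengths) / len(lengths) for state, lengths in runs.items()}

# Hypothetical prediction string, not taken from the PHD output.
lengths = mean_segment_lengths("LLHHHHHHLLLEEEELLHHHHLL")
```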
..........................................................................
The accuracy matrix in detail:
+---------------------------------------+
| number of residues with H, E, L |
+---------+------+------+------+--------+
| |net H |net E |net L |sum obs |
+---------+------+------+------+--------+
| obs H |12447 | 1255 | 3990 | 17692 |
| obs E | 949 | 7493 | 3750 | 12192 |
| obs L | 2604 | 2875 |19962 | 25441 |
+---------+------+------+------+--------+
| sum Net |16000 |11623 |27702 | 55325 |
+---------+------+------+------+--------+
Note: This table is to be read in the following manner:
12447 of all residues predicted to be in helix, were observed to
be in helix, 949 however belong to observed strands, 2604 to
observed loop regions. The term "observed" refers to the DSSP
assignment of secondary structure calculated from 3D coordinates
of experimentally determined structures (Dictionary of Secondary
Structure of Proteins: Kabsch & Sander (1983) Biopolymers, 22,
2577-2637).
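The percentages defined earlier follow directly from this matrix. The
Python sketch below recomputes them; the variable names are
hypothetical, and the results can differ by up to a percentage point
from the rounded values in the summary box.

```python
STATES = ("H", "E", "L")
# Rows: observed H, E, L; columns: predicted (net) H, E, L.
MATRIX = [
    [12447, 1255, 3990],
    [949, 7493, 3750],
    [2604, 2875, 19962],
]

total = sum(sum(row) for row in MATRIX)
q_total = 100.0 * sum(MATRIX[k][k] for k in range(3)) / total

q_obs = {}   # correct / all residues observed in the state (row sum)
q_pred = {}  # correct / all residues predicted in the state (column sum)
for k, state in enumerate(STATES):
    q_obs[state] = 100.0 * MATRIX[k][k] / sum(MATRIX[k])
    q_pred[state] = 100.0 * MATRIX[k][k] / sum(row[k] for row in MATRIX)
```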
Position-specific reliability index
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The network predicts the three secondary structure types using real
numbers from the output units. The prediction is assigned by choosing
the maximal unit ("winner takes all"). However, the real numbers
contain additional information.
E.g. the difference between the maximal and the second largest output
unit can be used to derive a "reliability index". This index is given
for each residue along with the prediction. The index is scaled to
have values between 0 (lowest reliability), and 9 (highest).
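A minimal Python sketch of such an index, assuming output values between
0 and 1 and a simple scaling of the difference by a factor of 10. The
exact scaling used by PHDsec is not stated here, so the factor and the
function name are illustrative assumptions.

```python
def reliability_index(outputs):
    """Winner-takes-all state plus a 0-9 index from the top-two difference."""
    ranked = sorted(range(len(outputs)), key=lambda k: outputs[k], reverse=True)
    best, second = ranked[0], ranked[1]
    diff = outputs[best] - outputs[second]
    index = min(9, max(0, int(10 * diff)))  # assumed scaling, clipped to 0-9
    return "HEL"[best], index

# Hypothetical output units for helix, strand, loop.
state, ri = reliability_index([0.80, 0.10, 0.25])
```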
The accuracies (Qtot) to be expected for residues with values above a
particular value of the index are given below as well as the fraction
of such residues (%res):
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| index| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| %res |100.0| 99.2| 90.4| 80.9| 71.6| 62.5| 52.8| 42.3| 29.8| 14.1|
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| | | | | | | | | | | |
| Qtot | 72.1| 72.3| 74.8| 77.7| 80.3| 82.9| 85.7| 88.5| 91.1| 94.2|
| | | | | | | | | | | |
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| H%obs| 70.4| 70.6| 73.7| 77.1| 80.1| 83.1| 86.0| 89.3| 92.5| 96.4|
| E%obs| 61.5| 61.7| 63.7| 66.6| 69.1| 71.7| 74.6| 77.0| 77.8| 68.1|
| | | | | | | | | | | |
| H%prd| 77.8| 78.0| 80.0| 82.6| 84.7| 86.9| 89.2| 91.3| 93.1| 95.4|
| E%prd| 64.5| 64.7| 67.8| 71.0| 74.2| 77.6| 81.4| 85.1| 89.8| 93.5|
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
The above table gives the cumulative results, e.g. 62.5% of all
residues have a reliability of at least 5. The overall three-state
accuracy for this subset of almost two thirds of all residues is 82.9%.
For this subset, e.g., 83.1% of the observed helices are correctly
predicted, and 86.9% of all residues predicted to be in helix are
correct.
..........................................................................
The following table gives the non-cumulative quantities, i.e. the
values per reliability index range. These numbers answer the question:
how reliable is the prediction for all residues labeled with the
particular index i.
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| index| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| %res | 8.8| 9.5| 9.3| 9.1| 9.7| 10.5| 12.5| 15.7| 14.1|
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| | | | | | | | | | |
| Qtot | 46.6| 50.6| 57.7| 62.6| 67.9| 74.2| 82.2| 88.3| 94.2|
| | | | | | | | | | |
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| H%obs| 36.8| 42.3| 49.5| 55.2| 61.7| 69.9| 78.8| 87.4| 96.4|
| E%obs| 44.7| 44.5| 52.1| 55.4| 60.9| 68.0| 75.9| 81.0| 68.1|
| | | | | | | | | | |
| H%prd| 49.9| 52.5| 60.3| 64.2| 69.2| 77.5| 85.4| 89.9| 95.4|
| E%prd| 41.7| 47.1| 53.6| 57.0| 64.0| 71.6| 78.8| 88.8| 93.5|
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
For example, for residues with Relindex = 5, 64% of all predicted
beta-strand residues are correctly identified.
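The cumulative and the non-cumulative tables are linked by simple
arithmetic. The Python sketch below derives the per-index fraction of
residues from the cumulative row, and recombines two per-index
accuracies into a cumulative one. The variable names are hypothetical.
The value for index 0 is not listed in the second table; here it follows
from the difference.

```python
# Cumulative %res for reliability index >= 0..9, copied from the first table.
cum_res = [100.0, 99.2, 90.4, 80.9, 71.6, 62.5, 52.8, 42.3, 29.8, 14.1]

# Fraction of residues with exactly index i: difference of neighbours.
per_index = [
    round(cum_res[i] - (cum_res[i + 1] if i + 1 < len(cum_res) else 0.0), 1)
    for i in range(len(cum_res))
]

# The other direction: accuracy for index >= 8 as the residue-weighted
# mean of the per-index accuracies listed for index 8 and index 9.
res = {8: 15.7, 9: 14.1}
qtot = {8: 88.3, 9: 94.2}
q_at_least_8 = sum(res[i] * qtot[i] for i in res) / sum(res.values())
```

The weighted mean comes out at about 91.1%, the cumulative Qtot listed
for index 8.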
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Prediction of solvent accessibility by PHDacc
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Definition of accessibility
~~~~~~~~~~~~~~~~~~~~~~~~~~~
For training the residue solvent accessibility the DSSP (Dictionary of
Secondary Structure of Proteins; Kabsch & Sander (1983) Biopolymers, 22,
2577-2637) values of accessible surface area have been used. The
prediction provides values for the relative solvent accessibility. The
normalisation is the following:
| ACCESSIBILITY (from DSSP in square Angstrom)
|RELATIVE_ACCESSIBILITY = ------------------------------------- * 100
| MAXIMAL_ACC (amino acid type i)
where MAXIMAL_ACC (i) is the maximal accessibility of amino acid type i.
The maximal values are:
+----+----+----+----+----+----+----+----+----+----+----+----+
| A | B | C | D | E | F | G | H | I | K | L | M |
| 106| 160| 135| 163| 194| 197| 84| 184| 169| 205| 164| 188|
+----+----+----+----+----+----+----+----+----+----+----+----+
| N | P | Q | R | S | T | V | W | X | Y | Z |
| 157| 136| 198| 248| 130| 142| 142| 227| 180| 222| 196|
+----+----+----+----+----+----+----+----+----+----+----+
Notation: one letter code for amino acid, B stands for D or N; Z stands
for E or Q; and X stands for undetermined.
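The normalisation can be written as a short Python sketch. The
dictionary copies the maximal values from the table above; the function
name and the fallback to X for unknown letters are assumptions made for
illustration.

```python
# Maximal accessibility per amino acid type, copied from the table above.
MAXIMAL_ACC = {
    "A": 106, "B": 160, "C": 135, "D": 163, "E": 194, "F": 197,
    "G": 84, "H": 184, "I": 169, "K": 205, "L": 164, "M": 188,
    "N": 157, "P": 136, "Q": 198, "R": 248, "S": 130, "T": 142,
    "V": 142, "W": 227, "X": 180, "Y": 222, "Z": 196,
}

def relative_accessibility(residue: str, accessibility: float) -> float:
    """Relative accessibility in percent; unknown letters fall back to X."""
    max_acc = MAXIMAL_ACC.get(residue.upper(), MAXIMAL_ACC["X"])
    return 100.0 * accessibility / max_acc

rel_ala = relative_accessibility("A", 53.0)  # hypothetical DSSP value
```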
The solvent accessibility (the DSSP value, not the relative one) can be
used to estimate the number of water molecules (W) in contact with the
residue:
W = ACCESSIBILITY / 10
The prediction is given in 10 states for relative accessibility, with
RELATIVE_ACCESSIBILITY = (PREDICTED_ACC * PREDICTED_ACC)
where PREDICTED_ACC = 0 - 9.
Estimated Accuracy of Prediction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A careful cross validation test on some 238 protein chains (in total
about 62,000 residues) with less than 25% pairwise sequence identity
gave the following results:
Correlation
...........
The correlation between observed and predicted solvent accessibility
is:
-----------
corr = 0.53
-----------
This value ought to be compared to the worst and best case prediction
scenario: random prediction (corr = 0.0) and homology modelling
(corr = 0.66). (Note: homology modelling yields a relatively accurate
prediction in 3D if, and only if, a significantly identical sequence
has a known 3D structure.)
3-state accuracy
................
Often the relative accessibility is projected onto, e.g., 3 states:
b = buried (here defined as < 9% relative accessibility),
i = intermediate ( 9% <= rel. acc. < 36% ),
e = exposed ( rel. acc. >= 36% ).
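Combined with the 10-state definition (relative accessibility = square
of the predicted state), these thresholds map each predicted state to
one of the three classes. A minimal Python sketch; the function name and
the input string are hypothetical.

```python
def project_state(predicted_acc: int) -> str:
    """Map a 10-state value (0-9) to b, i or e via its relative accessibility."""
    if not 0 <= predicted_acc <= 9:
        raise ValueError("PREDICTED_ACC must be between 0 and 9")
    rel_acc = predicted_acc * predicted_acc  # percent
    if rel_acc < 9:
        return "b"
    if rel_acc < 36:
        return "i"
    return "e"

# Hypothetical string of predicted 10-state values, one digit per residue.
three_state = "".join(project_state(int(ch)) for ch in "0123456789")
```

States 0-2 therefore count as buried, 3-5 as intermediate and 6-9 as
exposed.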
A projection onto 3 states or 2 states (buried/exposed) enables the
compilation of a 3- and 2-state prediction accuracy. PHD reaches an
overall 3-state accuracy of:
Q3 = 57.5%
(compared to 35% for random prediction and 70% for homology modelling).
In detail:
+-----------------------------------+-------------------------+
| Qburied (% of observed)=77% | Qb (% of predicted)=60% |
| Qintermediate (% of observed)= 9% | Qi (% of predicted)=44% |
| Qexposed (% of observed)=78% | Qe (% of predicted)=56% |
+-----------------------------------+-------------------------+
10-state accuracy
.................
The network predicts relative solvent accessibility in 10 states, with
state i (i = 0-9) corresponding to a relative solvent accessibility of
i*i %. The 10-state accuracy of the network is:
Q10 = 24.5%
..........................................................................
These percentages are defined by:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| number of correctly predicted residues
|Q3 = --------------------------------------- (*100)
| number of all residues
|
| no of res. correctly predicted to be buried
|Qburied (% of obs) = ------------------------------------------- (*100)
| no of all res. observed to be buried
|
|
| no of res. correctly predicted to be buried
|Qburied (% of pred)= ------------------------------------------- (*100)
| no of all residues predicted to be buried
..........................................................................
Averaging over single chains
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The most reasonable way to compute the overall accuracies is the above
quoted percentage of correctly predicted residues. However, since the
user is mainly interested in the expected performance of the prediction
for a particular protein, the mean value when averaging over protein
chains might be of help as well. Computing first the correlation
between observed and predicted accessibility for each protein chain, and
then averaging over all 238 chains yields the following average:
+-------------------------------====--+
| corr/averaged over chains = 0.53 |
+-------------------------------====--+
| standard deviation = 0.11 |
+-------------------------------------+
..........................................................................
Further details of performance accuracy
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The accuracy matrix in detail:
..............................
-------+----------------------------------------------------+-----------
\ PHD | 0 1 2 3 4 5 6 7 8 9 | SUM %obs
-------+----------------------------------------------------+-----------
OBS 0 | 8611 140 8 44 82 169 772 334 27 0 | 10187 16.6
OBS 1 | 4367 164 0 50 106 231 738 346 44 3 | 6049 9.8
OBS 2 | 3194 168 1 68 125 303 951 513 42 7 | 5372 8.7
OBS 3 | 2760 159 8 80 136 327 1246 746 58 19 | 5539 9.0
OBS 4 | 2312 144 2 72 166 396 1615 1245 124 19 | 6095 9.9
OBS 5 | 1873 96 3 84 138 425 1979 1834 187 27 | 6646 10.8
OBS 6 | 1387 67 1 60 80 278 2237 2627 231 51 | 7019 11.4
OBS 7 | 1082 35 0 32 56 225 1871 3107 302 60 | 6770 11.0
OBS 8 | 660 25 0 27 43 136 1206 2374 325 87 | 4883 7.9
OBS 9 | 325 20 2 27 29 74 648 1159 366 214 | 2864 4.7
-------+----------------------------------------------------+-----------
SUM |26571 1018 25 544 961 2564 13263 14285 1706 487 |
%pred | 43.3 1.7 0.0 0.9 1.6 4.2 21.6 23.3 2.8 0.8 |
-------+----------------------------------------------------+-----------
Note: This table is to be read in the following manner:
8611 of all residues predicted to be exposed by 0% were
observed with 0% relative accessibility. However, 325 of all
residues predicted to have 0% are observed as completely exposed
(obs = 9 -> rel. acc. >= 81%). The term "observed" refers to the
DSSP compilation of area of solvent accessibility calculated from
3D coordinates of experimentally determined structures (Diction-
ary of Secondary Structure of Proteins: Kabsch & Sander (1983)
Biopolymers, 22, 2577-2637).
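The 3-state figures can be recovered from this matrix by pooling states
0-2 (buried), 3-5 (intermediate) and 6-9 (exposed). The Python sketch
below does this; the names MATRIX, GROUPS and block are hypothetical.
The per-state percentages appear to match the summary table when
truncated to whole percent, which is an observation and not a documented
rule.

```python
# Rows: observed state 0-9; columns: predicted (PHD) state 0-9.
MATRIX = [
    [8611, 140, 8, 44, 82, 169, 772, 334, 27, 0],
    [4367, 164, 0, 50, 106, 231, 738, 346, 44, 3],
    [3194, 168, 1, 68, 125, 303, 951, 513, 42, 7],
    [2760, 159, 8, 80, 136, 327, 1246, 746, 58, 19],
    [2312, 144, 2, 72, 166, 396, 1615, 1245, 124, 19],
    [1873, 96, 3, 84, 138, 425, 1979, 1834, 187, 27],
    [1387, 67, 1, 60, 80, 278, 2237, 2627, 231, 51],
    [1082, 35, 0, 32, 56, 225, 1871, 3107, 302, 60],
    [660, 25, 0, 27, 43, 136, 1206, 2374, 325, 87],
    [325, 20, 2, 27, 29, 74, 648, 1159, 366, 214],
]

# States 0-2 are buried (< 9%), 3-5 intermediate, 6-9 exposed (>= 36%).
GROUPS = {"b": range(0, 3), "i": range(3, 6), "e": range(6, 10)}

def block(obs, pred):
    """Sum of all cells with observed state in obs and predicted state in pred."""
    return sum(MATRIX[o][p] for o in obs for p in pred)

total = block(range(10), range(10))
correct = {g: block(idx, idx) for g, idx in GROUPS.items()}
q3 = 100.0 * sum(correct.values()) / total
q_obs = {g: 100.0 * correct[g] / block(idx, range(10)) for g, idx in GROUPS.items()}
q_pred = {g: 100.0 * correct[g] / block(range(10), idx) for g, idx in GROUPS.items()}
```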
Accuracy for each amino acid:
.............................
+---+------------------------------+-----+-------+------+
|AA | Q3 b%o b%p i%o i%p e%o e%p | Q10 | corr | N |
+---+------------------------------+-----+-------+------+
| A | 59.0 87 60 2 38 66 57 | 31 | 0.530 | 5054 |
| C | 62.0 91 67 5 39 25 21 | 34 | 0.244 | 893 |
| D | 56.5 21 45 6 49 94 57 | 20 | 0.321 | 3536 |
| E | 60.8 9 40 3 41 98 61 | 21 | 0.347 | 3743 |
| F | 63.3 94 67 9 46 29 37 | 27 | 0.366 | 2436 |
| G | 52.1 75 51 1 31 67 53 | 22 | 0.405 | 4787 |
| H | 50.9 63 53 23 45 71 50 | 18 | 0.442 | 1366 |
| I | 64.9 95 68 6 41 30 38 | 34 | 0.360 | 3437 |
| K | 66.6 2 11 2 37 98 67 | 23 | 0.267 | 3652 |
| L | 61.6 93 65 8 44 31 40 | 31 | 0.368 | 5016 |
| M | 60.1 92 64 5 39 45 44 | 29 | 0.452 | 1371 |
| N | 55.5 45 45 8 38 87 59 | 17 | 0.410 | 2923 |
| P | 53.0 48 48 9 39 83 56 | 18 | 0.364 | 2920 |
| Q | 54.3 27 44 7 44 92 56 | 20 | 0.344 | 2225 |
| R | 49.9 15 47 36 47 76 51 | 18 | 0.372 | 2765 |
| S | 55.6 69 53 3 51 81 56 | 22 | 0.464 | 3981 |
| T | 51.8 61 51 8 38 78 53 | 21 | 0.432 | 3740 |
| V | 61.1 93 65 5 40 39 42 | 34 | 0.418 | 4156 |
| W | 56.2 85 62 20 49 29 27 | 21 | 0.318 | 891 |
| Y | 49.7 73 52 33 49 36 38 | 19 | 0.359 | 2301 |
+---+------------------------------+-----+-------+------+
Abbreviations:
AA: amino acid in one-letter code
b%o, i%o, e%o: = Qburied, Qintermediate, Qexposed (% of observed),
i.e. percentage of correct prediction in each state, see above
b%p, i%p, e%p: = Qburied, Qintermediate, Qexposed (% of predicted),
i.e. probability of correct prediction in each state, see above
Q10: percentage of correctly predicted residues in each of the 10
states of predicted relative accessibility.
corr: correlation between predicted and observed rel. acc.
N: number of residues in data set
Accuracy for different secondary structure:
...........................................
+--------+------------------------------+----+-------+-------+
| type | Q3 b%o b%p i%o i%p e%o e%p |Q10 | corr | N |
+--------+------------------------------+----+-------+-------+
| helix | 59.5 79 64 8 44 80 56 | 27 | 0.574 | 20100 |
| strand | 61.3 84 73 9 46 69 37 | 35 | 0.524 | 13356 |
| loop | 54.4 64 43 11 44 78 61 | 18 | 0.442 | 27968 |
+--------+------------------------------+----+-------+-------+
Abbreviations as before.
Position-specific reliability index
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The network predicts the 10 states for relative accessibility using real
numbers from the output units. The prediction is assigned by choosing
the maximal unit ("winner takes all"). However, the real numbers
contain additional information.
E.g. the difference between the maximal and the second largest output
unit (with the constraint that the second largest output is compiled
among all units at least 2 positions off the maximal unit) can be used
to derive a "reliability index". This index is given for each residue
along with the prediction. The index is scaled to have values between
0 (lowest reliability), and 9 (highest).
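A minimal Python sketch of this variant, in which the runner-up is
searched only among units at least 2 positions away from the winning
unit. As before, the scaling factor of 10, the assumed 0-1 range of the
outputs and the function name are illustrative assumptions, not the
documented PHDacc procedure.

```python
def accessibility_reliability(outputs):
    """Winning state and a 0-9 index; the runner-up must be >= 2 units away."""
    best = max(range(len(outputs)), key=lambda k: outputs[k])
    rivals = [v for k, v in enumerate(outputs) if abs(k - best) >= 2]
    diff = outputs[best] - max(rivals)
    index = min(9, max(0, int(10 * diff)))  # assumed scaling, clipped to 0-9
    return best, index

# Hypothetical values for the 10 output units (states 0-9).
units = [0.05, 0.10, 0.15, 0.20, 0.30, 0.55, 0.92, 0.70, 0.25, 0.10]
state, ri = accessibility_reliability(units)
```

In this example the direct neighbours of the winning unit (0.55 and
0.70) are ignored, so the index is derived from 0.92 - 0.30.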
The accuracies (Q3, corr, etc.) to be expected for residues with values
above a particular value of the index are given below as well as the
fraction of such residues (%res):
+---+------------------------------+----+-------+-------+
|RI | Q3 b%o b%p i%o i%p e%o e%p |Q10 | corr | %res |
+---+------------------------------+----+-------+-------+
| 0 | 57.5 77 60 9 44 78 56 | 24 | 0.535 | 100.0 |
| 1 | 59.1 76 63 9 45 82 57 | 25 | 0.560 | 91.2 |
| 2 | 61.7 79 66 4 47 87 58 | 27 | 0.594 | 77.1 |
| 3 | 66.6 87 70 1 51 89 63 | 30 | 0.650 | 57.1 |
| 4 | 70.0 89 72 0 83 91 67 | 32 | 0.686 | 45.8 |
| 5 | 72.9 92 75 0 0 93 70 | 34 | 0.722 | 35.6 |
| 6 | 76.3 95 77 0 0 93 75 | 36 | 0.769 | 24.7 |
| 7 | 79.0 97 79 0 0 93 78 | 39 | 0.803 | 16.0 |
| 8 | 80.9 98 80 0 0 91 81 | 43 | 0.824 | 9.6 |
| 9 | 81.2 99 80 0 0 88 83 | 45 | 0.828 | 5.9 |
+---+------------------------------+----+-------+-------+
Abbreviations as before.
The above table gives the cumulative results, e.g. 45.8% of all
residues have a reliability of at least 4. The correlation for this
most reliably predicted half of the residues is 0.686, i.e. a value
comparable to what could be expected if homology modelling were
possible. For this subset of 45.8% of all residues, 89% of the buried
residues are correctly predicted, and 72% of all residues predicted to
be buried are correct.
..........................................................................
The following table gives the non-cumulative quantities, i.e. the
values per reliability index range. These numbers answer the question:
how reliable is the prediction for all residues labeled with the
particular index i.
+---+------------------------------+----+-------+-------+
|RI | Q3 b%o b%p i%o i%p e%o e%p |Q10 | corr | %res |
+---+------------------------------+----+-------+-------+
| 0 | 40.9 79 40 16 41 21 40 | 14 | 0.175 | 8.8 |
| 1 | 45.4 61 46 28 44 48 44 | 17 | 0.278 | 14.1 |
| 2 | 47.4 53 52 10 46 80 44 | 19 | 0.343 | 19.9 |
| 3 | 52.9 75 59 4 50 77 47 | 23 | 0.439 | 11.4 |
| 4 | 60.0 81 63 0 83 84 56 | 25 | 0.547 | 10.1 |
| 5 | 65.2 82 70 0 0 93 62 | 28 | 0.607 | 10.9 |
| 6 | 71.3 90 72 0 0 94 70 | 31 | 0.692 | 8.8 |
| 7 | 76.0 94 76 0 0 95 75 | 34 | 0.762 | 6.3 |
| 8 | 80.5 97 81 0 0 94 79 | 39 | 0.808 | 3.8 |
| 9 | 81.2 99 80 0 0 88 83 | 45 | 0.828 | 5.9 |
+---+------------------------------+----+-------+-------+
For example, for residues with RI = 4, 83% of all predicted intermediate
residues are correctly predicted as such.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Prediction of helical transmembrane segments by PHDhtm
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Estimated Accuracy of Prediction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A cross validation test on 69 helical trans-membrane proteins (in total
about 30,000 residues) with less than 25% pairwise sequence identity
gave the following results:
++================++-----------------------------------------+
|| Qtotal = 94.7% || ("overall two state accuracy") |
++================++-----------------------------------------+
+----------------------------+-----------------------------+
| Qhelix (% of observed)=92% | Qhelix (% of predicted)=83% |
| Qloop (% of observed)=96% | Qloop (% of predicted)=97% |
+----------------------------+-----------------------------+
..........................................................................
These percentages are defined by:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| number of correctly predicted residues
|Qtotal = --------------------------------------- (*100)
| number of all residues
|
| no of res correctly predicted to be in helix
|Qhelix (% of obs) = -------------------------------------------- (*100)
| no of all res observed to be in helix
|
|
| no of res correctly predicted to be in helix
|Qhelix (% of pred)= -------------------------------------------- (*100)
| no of all residues predicted to be in helix
..........................................................................
Further measures of performance
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Matthews correlation coefficient:
+---------------------------------------------+
| Chelix = 0.84, Cloop = 0.84 |
+---------------------------------------------+
..........................................................................
Average length of transmembrane helices:
| +------------+----------+
| | predicted | observed |
+-----------+------------+----------+
| Lhelix = | 24.6 | 22.2 |
+-----------+------------+----------+
..........................................................................
The accuracy matrix in detail:
+---------------------------------+
| number of residues with H, L |
+---------+------+-------+--------+
| |net H | net L |sum obs |
+---------+------+-------+--------+
| obs H | 5214 | 492 | 5706 |
| obs L | 1050 | 22423 | 23473 |
+---------+------+-------+--------+
| sum Net | 6264 | 22915 | 29179 |
+---------+------+-------+--------+
Note: This table is to be read in the following manner:
5214 of the 6264 residues predicted to be in a helical trans-membrane
region were observed to be in the lipid bilayer; the other 1050
were observed on either side of the membrane, i.e. in loop (or
non-membrane) regions. The term "observed" refers to DSSP
assignment of secondary structure calculated from 3D coordinates
of experimentally determined structures (Dictionary of Secondary
Structure of Proteins: Kabsch & Sander (1983) Biopolymers, 22,
2577-2637) where these were available. For all other proteins,
the assignment of trans-membrane segments has been taken from the
Swissprot data bank (Bairoch, A.; Boeckmann, B.: The SWISS-PROT
protein sequence data bank. Nucl. Acids Res. 20: 2019-2022, 1992).
..........................................................................
Overlap between predicted and observed segments:
+-----------------+---------------+----------------+
| segment overlap | % of observed | % of predicted |
| Sov helix | 95.6% | 95.5% |
| Sov loop | 83.6% | 97.2% |
+-----------------+---------------+----------------+
| Sov total | 86.0% | 96.8% |
+-----------------+---------------+----------------+
Definition of Sov in: Rost et al., JMB, 1994, 235, 13-26.
As helical trans-membrane segments are longer than helices in globular
proteins, correctly predicted segments can easily be made out. PHDhtm
misses 5 out of 258 observed segments, predicts 6 where none is
observed, and in 3 cases the predicted helical segment overlaps two
observed regions. Thus, in total more than 95% of all segments
are correctly predicted.
..........................................................................
Entropy of prediction (information measure):
+-----------------+
| I = 0.64 |
+-----------------+
(For comparison: homology modelling of globular proteins in three
states: I=0.62.)
Position-specific reliability index
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The network predicts two states, helical trans-membrane region and rest,
using two output units. The prediction is assigned by choosing the
maximal unit ("winner takes all"). However, the real-valued outputs of
the two units contain additional information.
E.g. the difference between the two output units can be used to derive
a "reliability index". This index is given for each residue along with
the prediction. The index is scaled to have values between 0 (lowest
reliability), and 9 (highest).
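The exact scaling used by PHD is not given here. As a minimal sketch,
one could assume that the index is the absolute difference between the
two output units (each in the range 0-1), multiplied by ten, truncated,
and capped at 9. The function name and this formula are assumptions
made for illustration only.

```python
def reliability_index(out_htm, out_loop):
    """Map the difference of two output units (0..1) to an index 0..9.

    Assumed scaling: truncate 10 * |difference| and cap the result at 9.
    """
    difference = abs(out_htm - out_loop)
    return min(9, int(10 * difference))


# Clear-cut outputs give a high index, close outputs a low one.
print(reliability_index(0.875, 0.125))  # difference 0.75
print(reliability_index(0.5, 0.5))      # difference 0.0
```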
The accuracies (Qtot) to be expected for residues with an index at or
above a particular value are given below, together with the fraction
of such residues (%res):
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| index| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| %res |100.0| 98.8| 97.3| 95.9| 94.1| 92.3| 89.9| 86.2| 75.0| 66.8|
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| | | | | | | | | | | |
| Qtot | 94.7| 95.2| 95.6| 96.2| 96.7| 97.2| 97.7| 98.4| 99.4| 99.8|
| | | | | | | | | | | |
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| H%obs| 91.8| 92.9| 93.8| 94.4| 95.0| 95.7| 96.2| 96.8| 95.5| 78.7|
| L%obs| 95.3| 95.7| 96.1| 96.6| 97.0| 97.5| 98.1| 98.8| 99.7|100.0|
| | | | | | | | | | | |
| H%prd| 82.7| 83.8| 85.0| 86.7| 88.1| 89.7| 91.4| 93.8| 96.3| 97.1|
| L%prd| 97.9| 98.3| 98.5| 98.7| 98.8| 99.0| 99.2| 99.4| 99.7| 99.9|
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
The above table gives the cumulative results, e.g. 92.3% of all
residues have a reliability of at least 5. The overall two-state
accuracy for this subset is 97.2%. Within this subset, 95.7% of
the observed helical trans-membrane residues are correctly predicted,
and 89.7% of all residues predicted to be in a helical trans-membrane
segment are correct.
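The cumulative table can also be used as a simple lookup, for instance
to find the lowest reliability cut-off that reaches a desired expected
accuracy. The sketch below copies the %res and Qtot rows of the table;
the dictionary and function names are hypothetical.

```python
# index -> (%res, Qtot), copied from the cumulative table above.
RELIABILITY_TABLE = {
    0: (100.0, 94.7), 1: (98.8, 95.2), 2: (97.3, 95.6), 3: (95.9, 96.2),
    4: (94.1, 96.7), 5: (92.3, 97.2), 6: (89.9, 97.7), 7: (86.2, 98.4),
    8: (75.0, 99.4), 9: (66.8, 99.8),
}


def expected_accuracy(min_index):
    """Return (%res, Qtot) for residues with reliability >= min_index."""
    return RELIABILITY_TABLE[min_index]


def lowest_index_for(target_qtot):
    """Return the smallest cut-off whose Qtot reaches target_qtot, or None."""
    for index in sorted(RELIABILITY_TABLE):
        if RELIABILITY_TABLE[index][1] >= target_qtot:
            return index
    return None


print(expected_accuracy(5))
print(lowest_index_for(98.0))
```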
The resulting network (PHD) prediction is:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________
PredictProtein@EMBL-Heidelberg.DE
PHD: Profile fed neural network systems from HeiDelberg
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Prediction of:
- secondary structure, by PHDsec
- solvent accessibility, by PHDacc
- and helical transmembrane regions, by PHDhtm
Author: Burkhard Rost
EMBL, Heidelberg, FRG
Meyerhofstrasse 1, 69 117 Heidelberg
Internet: Predict-Help@EMBL-Heidelberg.DE
All rights reserved.
The network systems are described in:
PHDsec: B Rost & C Sander, JMB, 1993, 232, 584-599.
B Rost & C Sander, Proteins, 1994, 19, 55-72.
PHDacc: B Rost & C Sander, Proteins, 1994, 20, 216-226.
PHDhtm: B Rost et al., Prot. Science, 1995, 4, 521-533.
Some statistics
~~~~~~~~~~~~~~~
Percentage of amino acids:
+--------------+--------+--------+--------+--------+--------+
| AA: | G | P | S | Q | V |
| % of AA: | 17.8 | 6.7 | 5.9 | 5.9 | 5.5 |
+--------------+--------+--------+--------+--------+--------+
| AA: | Y | T | N | M | L |
| % of AA: | 5.1 | 5.1 | 4.7 | 4.7 | 4.7 |
+--------------+--------+--------+--------+--------+--------+
| AA: | R | K | H | A | W |
| % of AA: | 4.3 | 4.0 | 4.0 | 4.0 | 3.6 |
+--------------+--------+--------+--------+--------+--------+
| AA: | I | E | F | D | C |
| % of AA: | 3.6 | 3.6 | 2.8 | 2.4 | 1.6 |
+--------------+--------+--------+--------+--------+--------+
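A composition table of this kind can be derived directly from the
sequence. The following sketch is not the server's code; the function
name is made up, and the example string is only the first 60 residues
of the AA row shown in the prediction below, so its percentages are
not those of the full-length table.

```python
from collections import Counter


def composition(sequence):
    """Return (amino acid, percentage) pairs, most frequent first."""
    counts = Counter(sequence)
    length = len(sequence)
    pairs = [(aa, round(100.0 * n / length, 1)) for aa, n in counts.items()]
    # Sort by descending percentage, then alphabetically for ties.
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


first_60 = "MANLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGGGWGQP"
for aa, pct in composition(first_60)[:5]:
    print(f"{aa}: {pct}%")
```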
Percentage of secondary structure predicted:
+--------------+--------+--------+--------+
| SecStr: | H | E | L |
| % Predicted: | 13.8 | 19.0 | 67.2 |
+--------------+--------+--------+--------+
According to the following classes:
all-alpha: %H>45 and %E<5; all-beta: %H<5 and %E>45;
alpha-beta: %H>30 and %E>20; mixed: rest,
this means that the predicted class is: mixed class
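The class rules listed above translate directly into a few comparisons.
A minimal sketch (the function name is hypothetical; the thresholds are
the ones stated above):

```python
def structural_class(pct_helix, pct_strand):
    """Assign the structural class from predicted %H and %E."""
    if pct_helix > 45 and pct_strand < 5:
        return "all-alpha"
    if pct_helix < 5 and pct_strand > 45:
        return "all-beta"
    if pct_helix > 30 and pct_strand > 20:
        return "alpha-beta"
    return "mixed"


# Values from the table above: %H = 13.8, %E = 19.0.
print(structural_class(13.8, 19.0))
```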
PHD output for your protein
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fri Aug 9 02:00:26 1996
Jury on: 10 different architectures (version 5.94_317).
Note: differently trained architectures, i.e., different versions can
result in different predictions.
About the protein
~~~~~~~~~~~~~~~~~
HEADER /home/phd/tmp/t1_11691.seq
COMPND
SOURCE
AUTHOR
SEQLENGTH 253
NCHAIN 1 chain(s) in t1_11691 data set
NALIGN 84
(=number of aligned sequences in HSSP file)
Abbreviations: PHDsec
~~~~~~~~~~~~~~~~~~~~~
sequence:
AA : amino acid sequence
secondary structure:
HEL: H=helix, E=extended (sheet), blank=other (loop)
PHD: Profile network prediction HeiDelberg
Rel: Reliability index of prediction (0-9)
detail:
prH: 'probability' for assigning helix
prE: 'probability' for assigning strand
prL: 'probability' for assigning loop
note: the 'probabilities' are scaled to the interval 0-9, e.g.,
prH=5 means that the first output node is 0.5-0.6
subset:
SUB: a subset of the prediction, for all residues with an expected
average accuracy > 82% (tables in header)
note: for this subset the following symbols are used:
L: is loop (for which above " " is used)
".": means that no prediction is made for this residue, as the
reliability is: Rel < 5
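The SUB row can be reproduced from the PHD and Rel rows with the rules
just given. The sketch below is an illustration with a made-up function
name. The example uses the first 20 residues of the prediction below
and assumes that the predicted strand covers residues 6-16, as the prE
row suggests (the column spacing of the PHD sec row is not reliable in
this copy).

```python
def subset_line(prediction, reliability, min_rel=5):
    """Build the SUB row: '.' where Rel < min_rel, 'L' for blank (loop)."""
    symbols = []
    for state, rel in zip(prediction, reliability):
        if int(rel) < min_rel:
            symbols.append(".")       # no prediction for unreliable residues
        elif state == " ":
            symbols.append("L")       # blank in the PHD row means loop
        else:
            symbols.append(state)     # keep H or E as predicted
    return "".join(symbols)


phd_sec = "     EEEEEEEEEEE    "     # residues 1-20 (assumed spacing)
rel_sec = "98623269899998853677"     # residues 1-20 of the Rel sec row
print(subset_line(phd_sec, rel_sec))
```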
Abbreviations: PHDacc
~~~~~~~~~~~~~~~~~~~~~
solvent accessibility:
3st: relative solvent accessibility (acc) in 3 states:
b = 0-9%, i = 9-36%, e = 36-100%.
PHD: Profile network prediction HeiDelberg
Rel: Reliability index of prediction (0-9)
P_3: predicted relative accessibility in 3 states
note: for convenience a blank is used for intermediate (i).
10st:relative accessibility in 10 states:
= n corresponds to a relative acc. of n*n %
subset:
SUB: a subset of the prediction, for all residues with an expected
average correlation > 0.69 (tables in header)
note: for this subset the following symbols are used:
"I": is intermediate (for which above " " is used)
".": means that no prediction is made for this residue, as the
reliability is: Rel < 4
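The relation between the 10-state and the 3-state rows can be sketched
as follows. The handling of the boundary values is an assumption: here
9% counts as intermediate and 36% as exposed (the latter agrees with
the rows below, where a 10-state value of 6 appears as "e"). The
function names are hypothetical.

```python
def ten_state_to_percent(n):
    """A 10-state value n corresponds to a relative accessibility of n*n %."""
    return n * n


def three_state(percent_acc):
    """Map relative accessibility (%) to b, i or e (boundaries assumed)."""
    if percent_acc < 9:
        return "b"   # buried
    if percent_acc < 36:
        return "i"   # intermediate (printed as a blank in the P_3 row)
    return "e"       # exposed


for n in (0, 4, 6, 9):
    pct = ten_state_to_percent(n)
    print(n, pct, three_state(pct))
```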
Abbreviations: PHDhtm
~~~~~~~~~~~~~~~~~~~~~
secondary structure:
HL: T=helical transmembrane region, blank=other (loop)
PHD: Profile network prediction HeiDelberg
PHDF:filtered prediction, i.e., transmembrane segments that are too
long are split, and those that are too short are deleted
Rel: Reliability index of prediction (0-9)
detail:
prH: 'probability' for assigning helical transmembrane region
prL: 'probability' for assigning loop
note: the 'probabilities' are scaled to the interval 0-9, e.g.,
prH=5 means that the first output node is 0.5-0.6
subset:
SUB: a subset of the prediction, for all residues with an expected
average accuracy > 82% (tables in header)
note: for this subset the following symbols are used:
L: is loop (for which above " " is used)
".": means that no prediction is made for this residue, as the
reliability is: Rel < 5
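How the prH and prL digits relate to the unfiltered PHD htm row can be
sketched as below. This is an illustration only: the function names
are made up, the scaling rule is inferred from the note above, and
ties are assigned to loop as an assumption. The digits in the example
are the last nine columns of the prH htm and prL htm rows for residues
181-240 below.

```python
def output_digit(value):
    """Scale an output value (0..1) to the printed digit 0..9."""
    return min(9, int(value * 10))


def htm_state(pr_h, pr_l):
    """Winner takes all: 'T' if the helix digit is larger, else blank."""
    return "T" if pr_h > pr_l else " "


print(output_digit(0.55))  # an output between 0.5 and 0.6

pr_h_row = "146678899"
pr_l_row = "853321100"
row = "".join(htm_state(int(h), int(l)) for h, l in zip(pr_h_row, pr_l_row))
print("|" + row + "|")
```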
protein: t1_1169 length 253
....,....1....,....2....,....3....,....4....,....5....,....6
AA |MANLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGGGWGQP|
PHD sec | EEEEEEEEEEE |
Rel sec |986232698999988536777898889998767988789888889788888888775445|
detail:
prH sec |001322000000000011110000000000121000100000110110000111112332|
prE sec |001124788999988631001000000000000000000000000000000000000000|
prL sec |987443100000001257778898889998878888888888888888888888877666|
subset: SUB sec |LLL...EEEEEEEEEE.LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL..L|
ACCESSIBILITY
3st: P_3 acc |bbebbbbbbbbbbbbbbebbbbeeeeeeebeeebbee eeeb eeeee eeeebeb b e|
10st: PHD acc |007000000000000007000078777790666007656970577997567990605057|
Rel acc |113535278764440103000045435430110103101340133234013420000213|
subset: SUB acc |...b.b.bbbbbbb........eee.ee............e......e...e........|
....,....7....,....8....,....9....,....10...,....11...,....12
AA |HGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQGGGTHSQWNKPSKPKTNMKHMAGAAAAGA|
PHD sec | |
Rel sec |678554545776655457866522687542246654469899998678646553333775|
detail:
prH sec |211222222112222321112233211224432222310000000111122212333111|
prE sec |000000000000000000000000000000000000000000000000000011000001|
prL sec |788766767887777677877655788765567776678899998788767676655886|
subset: SUB sec |LLLLL.L.LLLLLLL.LLLLLL..LLLL....LLL..LLLLLLLLLLLL.LLL....LLL|
ACCESSIBILITY
3st: P_3 acc |ebebeb eebbe b eeebbeb eebbe beebbe eeeeeeeeeeeebeebbbbbbbbb|
10st: PHD acc |606060576007505769006057700750670065677778977776076000000000|
Rel acc |110011130003010402001113310211020011033354454530141223311232|
subset: SUB acc |...............e........................eeeeee...e..........|
....,....13...,....14...,....15...,....16...,....17...,....18
AA |VVGGLGGYMLGSAMSRPIIHFGSDYEDRYYRENMHRYPNQVYYRPMDEYSNQNNFVHDCV|
PHD sec | HHH EE HHHHH EEEE EEE EE|
Rel sec |446532135221147631112446303432223323699524433643458996451549|
detail:
prH sec |111234456434321100000001345655433542000000000122220000000000|
prE sec |220000010111200134443221000010100112100256653111110002664268|
prL sec |667665432344467754445667643223355335789642335765568997324730|
subset: SUB sec |..LL....H.....LL.......L............LLLL.....L...LLLLL.E.L.E|
ACCESSIBILITY
3st: P_3 acc |bbbbbbb bbbbbbbebbbebbeebeeeb eebbeebbb bbb bbeebeeee bbbbbb|
10st: PHD acc |000000040000000600060066067604670076000500040076066674000000|
Rel acc |024203611240002111111311013101231132220031110031011021443298|
subset: SUB acc |..b...b...b...........................................bb..bb|
....,....19...,....20...,....21...,....22...,....23...,....24
AA |NITIKQHTVTTTTKGENFTETDVKMMERVVEQMCITQYERESQAYYQRGSSMVLFSSPPV|
PHD sec |EEEEEEEEEEE HHHHHHHHHHHHHHHHHHHHHHHHHHH EEEEE E|
Rel sec |999988697543887686547599999999997556875567997419955999729965|
detail:
prH sec |000000000000000000268788999999998667876778887530000000000011|
prE sec |999988788763111111100000000000001222100000001110027999840016|
prL sec |000011211226887787621200000000000100012221001248972000159971|
subset: SUB sec |EEEEEEEEEE..LLLLLLL.HHHHHHHHHHHHHHHHHHHHHHHHH..LLLEEEEE.LLLE|
ACCESSIBILITY
3st: P_3 acc |bbbbeebbbbbbbebeebbebbbebbe bbebbbbbebeeebeb beeebbbbbbee bb|
10st: PHD acc |000066000000060660060006006400700000606770605077900000096500|
Rel acc |385921235131020213022342671199448721201741251044502244530106|
subset: SUB acc |.bbb....b.............b.bb..bbebbb.....ee..b..eee...bbb....b|
....,....25...,....26...,....27...,....28...,....29...,....30
AA |ILLISFLIFLIVG|
PHD sec |EEEE EEEEE |
Rel sec |7876211246538|
detail:
prH sec |1111243321110|
prE sec |8887201467660|
prL sec |0001444110128|
subset: SUB sec |EEEE.....EE.L|
ACCESSIBILITY
3st: P_3 acc |bbbbbbbbbbbbe|
10st: PHD acc |0000000000009|
Rel acc |8899679997425|
subset: SUB acc |bbbbbbbbbbb.e|
PHDhtm Helical transmembrane prediction
note: PHDacc and PHDsec are reliable for
water-soluble globular proteins only. Thus,
please take the predictions above with
particular caution wherever transmembrane
helices are predicted by PHDhtm!
PHDhtm
....,....1....,....2....,....3....,....4....,....5....,....6
AA |MANLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGGGWGQP|
PHD htm | |
Rel htm |998888888876677899999999999999999999999999999999999999999999|
detail: | |
prH htm |000000000011111000000000000000000000000000000000000000000000|
prL htm |999999999988888999999999999999999999999999999999999999999999|
other: | |
PHDFhtm | |
subset: | |
SUB htm |LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL|
....,....7....,....8....,....9....,....10...,....11...,....12
AA |HGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQGGGTHSQWNKPSKPKTNMKHMAGAAAAGA|
PHD htm | |
Rel htm |999999999999999999999999999999999999999999999999999999999999|
detail: | |
prH htm |000000000000000000000000000000000000000000000000000000000000|
prL htm |999999999999999999999999999999999999999999999999999999999999|
other: | |
PHDFhtm | |
subset: | |
SUB htm |LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL|
....,....13...,....14...,....15...,....16...,....17...,....18
AA |VVGGLGGYMLGSAMSRPIIHFGSDYEDRYYRENMHRYPNQVYYRPMDEYSNQNNFVHDCV|
PHD htm | |
Rel htm |999999999999999999999999999999999999999999999999999999999999|
detail: | |
prH htm |000000000000000000000000000000000000000000000000000000000000|
prL htm |999999999999999999999999999999999999999999999999999999999999|
other: | |
PHDFhtm | |
subset: | |
SUB htm |LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL|
....,....19...,....20...,....21...,....22...,....23...,....24
AA |NITIKQHTVTTTTKGENFTETDVKMMERVVEQMCITQYERESQAYYQRGSSMVLFSSPPV|
PHD htm | TTTTTTT|
Rel htm |999999999999999999999999999999999999999999999999998612346788|
detail: | |
prH htm |000000000000000000000000000000000000000000000000000146678899|
prL htm |999999999999999999999999999999999999999999999999999853321100|
other: | |
PHDFhtm | TTTTTTT|
subset: | |
SUB htm |LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL....HHHH|
....,....25...,....26...,....27...,....28...,....29...,....30
AA |ILLISFLIFLIVG|
PHD htm |TTTTTTTTTTTTT|
Rel htm |8888888887764|
detail: | |
prH htm |9999999998887|
prL htm |0000000001112|
other: | |
PHDFhtm |TTTTTTTTTTTTT|
subset: | |
SUB htm |HHHHHHHHHHHH.|
________________________________________________________________________________