PP: your alignment in MSF format
Example for submitting your alignment by email in MSF format
Bold face: keyword "# MSF"
Example for MSF format when using the WWW interface
Specification of compulsory features of the MSF format
- The string "# MSF" is crucial, as the parser interprets anything after this line as an alignment in MSF format.
- The '#' is a control character for PredictProtein. The actual MSF format begins on the line after it.
- Names should contain up to 14 characters and no blanks.
- Please use the same names for the same protein in all rows.
- All sequences must have the same length.
- To mark insertions, please use a dot ('.').
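The rules above can be checked before submitting. The following is only a minimal sketch and not PredictProtein's own parser: the function name `check_alignment` is hypothetical, and the set of allowed sequence characters (letters in either case plus '.') is an assumption based on the example on this page.

```python
import string

MAX_NAME_LEN = 14  # "up to 14 characters", from the specification above
# Assumption: sequences contain only letters (either case) and '.' for insertions.
ALLOWED = set(string.ascii_letters + ".")


def check_alignment(seqs):
    """Return a list of problems found in {name: sequence}; empty if none."""
    problems = []
    for name, seq in seqs.items():
        if not name or len(name) > MAX_NAME_LEN:
            problems.append(f"{name!r}: name must have 1 to {MAX_NAME_LEN} characters")
        if any(ch.isspace() for ch in name):
            problems.append(f"{name!r}: name must not contain blanks")
        bad = sorted(set(seq) - ALLOWED)
        if bad:
            problems.append(f"{name!r}: unexpected characters {bad}; mark insertions with '.'")
    # All sequences must have the same length (insertion points included).
    lengths = {len(seq) for seq in seqs.values()}
    if len(lengths) > 1:
        problems.append(f"sequences differ in length: {sorted(lengths)}")
    return problems
```

For example, `check_alignment({"9rnt0": "VECT", "rnpc_pench": "VAC."})` reports no problems, while a sequence padded with '-' instead of '.' would be flagged.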
Example
Joe Sequencer, Department of Advanced Protein Research,
National University, Timbuktu
joe@amino.churn.edu
# MSF incredulase from paracoccus dementiae, translated from cDNA
***** don't type this line (is a comment): The hash ('#') is a control for PredictProtein.
***** don't type this line (is a comment): The actual MSF format starts in the next line.
MSF of: 9rnt.hssp from: 1 to: 104
9rnt.msf MSF: 104 Type: P 10-May-99 21:30:5 Check: 2138 ..
- the length (Len) must be identical for ALL sequences
- and must equal the number of positions in the alignment (104 in this example)
Name: 9rnt0 Len: 104 Check: 5292 Weight: 1.00
Name: rnt1_aspor Len: 104 Check: 5442 Weight: 1.00
Name: rnpc_pench Len: 104 Check: 9918 Weight: 1.00
Name: rnc2_aspcl Len: 104 Check: 2103 Weight: 1.00
Name: rnpb_penbr Len: 104 Check: 1496 Weight: 1.00
Name: rnms_aspsa Len: 104 Check: 9117 Weight: 1.00
Name: rnn1_neucr Len: 104 Check: 1299 Weight: 1.00
Name: rnf1_fusla Len: 104 Check: 2764 Weight: 1.00
Name: rnf1_fusmo Len: 104 Check: 2593 Weight: 1.00
Name: rnt1_triha Len: 104 Check: 338 Weight: 1.00
Name: rnf2_fusla Len: 104 Check: 2426 Weight: 1.00
Name: rnu1_ustsp Len: 104 Check: 1288 Weight: 1.00
Name: rnu2_ustsp Len: 104 Check: 3893 Weight: 1.00
Name: aga2_pedpe Len: 104 Check: 4169 Weight: 1.00
//
The position numbers (such as "1 50" in the next line) are optional, i.e. NOT necessary.
1 50
9rnt0 ACDYTCGSNC YSSSDVSTAQ AAGYKLHEDG ETVGSNSYPH KYNNYEGFDF
rnt1_aspor ACDYTCGSNC YSSSDVSTAQ AAGYQLHEDG ETVGSNSYPH KYNNYEGFDF
rnpc_pench ACAATCGSVC YTSSAISAAQ EAGYDLYSAN DDVSN..YPH EYRNYEGFDF
rnc2_aspcl .CDYTCGSHC YSASAVSDAQ SAGYQLESAG QSVGRSRYPH QYRNYEGFNF
rnpb_penbr ACAATCGTVC YTSSAISSAQ AAGYNLYSTN DDVSN..YPH EYHNYEGFDF
rnms_aspsa SCEYTCGSTC YWSSDVSAAK AKGYSLYESG DTIDD..YPH GYHDYEGFDF
rnn1_neucr ACMYICGSVC YSSSAISAAL NKGYSYYEDG ATAGSSSYPH RYNNYEGFDF
rnf1_fusla ....TCGSTP YSASQVRAAA NAACQYYQSD DTAGSTTYPH TYNNYEGFDF
rnf1_fusmo ....TCGSTN YSASQVRAAA NAACQYYQND DSAGSTTYPH TYNNYEGFDF
rnt1_triha ....TCGKVF YSASAVSAAS NAACNYVRAG STAGGSTYPH VYNNYEGFRF
rnf2_fusla ....TCSSKP YSAQQVRAAA NAACQYYQSN DTAGSTTYPH TYHNYEGFDF
rnu1_ustsp .....CGGTY YSSTQVNRA. ....INNAKS GQYSSTGYPH TYNNYEGFDF
rnu2_ustsp .....CGGNV YSNDDINTAI QGALDDVANG DRPDN..YPH QYYAEASEDI
aga2_pedpe .......... ...SYVIQIL AHRYPVhpYF EPMPSGSHAF ANDPTERFPY
51 100
9rnt0 SVSSPYYEWP ILSSGDVYSG GSPGADRVVF NENNQLAGVI THTGASGNNF
rnt1_aspor SVSSPYYEWP ILSSGDVYSG GSPGADRVVF NENNQLAGVI THTGASGNNF
rnpc_pench PVSGTYYEFP ILRSGAVYSG NSPGADRVVF NGNDQLAGVI THTGASGNNF
rnc2_aspcl PVSGNYYEWP ILSSGSTYNG GGPGADRVVF NDNDELAGLI THTGASGDGF
rnpb_penbr PVSGTYYEFP ILKSGKVYTG SSPGADRVIF NDDDELAGVI THTGASGNNF
rnms_aspsa PVSGTYYEYP IMSDYDVYTG GSPGADRVIF NGDDELAGVI THTGASGDDF
rnn1_neucr PTAKPWYEFP ILSSGRVYTG GSPGADRVIF DSHGNLDMLI THNGASGNNF
rnf1_fusla AVNGPYQEFP IRTGG.VYSG GSPGADRVII NTSCQYAGAI THTGASGNNF
rnf1_fusmo PVDGPYQEFP IKSGG.VYTG GSPGADRVVI NTNCEYAGAI THTGASGNNF
rnt1_triha klSKPFYEFP ILSSGKTYTG GSPGADRVVI NGQCSIAGII THTGASGNAF
rnf2_fusla AVNGPYQEYP IRTSG.VYSG GSPGADRVII NTQCQFAGAI THTGASGNQF
rnu1_ustsp SDygPYKEYP LKTSSSGYTG GSPGADRVVY DSNdtFCGAI THTGASGNNF
rnu2_ustsp TlsGPWSEFP LVYNGPYYSS rsPGPDRVIY QTNteFCATV THTGAASYD.
aga2_pedpe SVTSLPLEYS TIGS.....G DYRQPAYVIK DANNQLLPIL EYTGFSVND.
104
9rnt0 VECT
rnt1_aspor VECT
rnpc_pench VAC.
rnc2_aspcl VAC.
rnpb_penbr VACT
rnms_aspsa VACS
rnn1_neucr VAC.
rnf1_fusla VGCS
rnf1_fusmo VGCS
rnt1_triha VAC.
rnf2_fusla VGCS
rnu1_ustsp VQCS
rnu2_ustsp ....
aga2_pedpe ....
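To illustrate how a file like the example above can be read, here is a minimal sketch in Python. It is not the parser PredictProtein uses; the function name `parse_msf` is hypothetical, and the sketch assumes that every sequence is declared in a "Name: ... Len: ..." line before the "//" separator and that each alignment row starts with one of those names. Lines that hold only position numbers are skipped, which matches the remark that these numbers are optional.

```python
def parse_msf(text):
    """Return {name: sequence} for the alignment that follows the '# MSF' line."""
    lines = text.splitlines()
    # Everything after the '# MSF' control line is treated as the alignment.
    start = next(
        (i for i, ln in enumerate(lines) if ln.lstrip().startswith("# MSF")), None
    )
    if start is None:
        raise ValueError("no '# MSF' line found")

    declared, seqs = {}, {}
    in_body = False
    for ln in lines[start + 1:]:
        tokens = ln.split()
        if not tokens:
            continue
        if tokens[0] == "//":
            in_body = True  # header ends, sequence blocks begin
        elif not in_body and tokens[0] == "Name:" and len(tokens) >= 4 and "Len:" in tokens[2:-1]:
            name = tokens[1]
            declared[name] = int(tokens[tokens.index("Len:", 2) + 1])
            seqs[name] = ""
        elif in_body and tokens[0] in seqs:
            # Join the 10-residue groups of this row onto the sequence so far.
            seqs[tokens[0]] += "".join(tokens[1:])

    for name, seq in seqs.items():
        if len(seq) != declared[name]:
            raise ValueError(f"{name}: Len says {declared[name]}, found {len(seq)}")
    return seqs
```

Applied to the example above, this would yield 14 sequences of 104 characters each, provided the two explanatory comment lines ("don't type this line") and the remarks about the length are left out, as instructed.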