Creator:
Date:
Abstract:
Glycosylation is an important form of protein post-translational modification in which a glycan is attached to a protein through an enzymatic process. Experimental verification of glycosylation using wet lab techniques is expensive and time-consuming. Although a number of computational prediction tools are available, none has been trained on plant proteins. Since the mechanisms of glycosylation in plant and animal cells are known to differ, there is a need for a plant-specific predictor. In this thesis, we develop such predictors of N-linked glycosylation using support vector machines, with binary profile patterns derived from protein sequence windows as input features. The final classifier achieves a recall of 80.0% and a precision of 79.0%, as measured by 10-fold cross-validation. Our plant-specific classifier is more accurate on plant proteins than the other classifiers developed here and elsewhere. Finally, we have developed a web server to make the tool available to the research community.
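
To illustrate what "binary profile patterns derived from protein sequence windows" could look like in practice, here is a minimal Python sketch. It is not the thesis implementation: the flank length of 7, the "-" padding symbol, and the function names extract_windows and binary_profile are all assumptions made for illustration. The sketch centers a window on each asparagine (N), the residue to which N-linked glycans attach, and encodes every position as 20 bits, one per standard amino acid.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "-"  # assumed padding symbol for positions beyond the sequence ends


def extract_windows(sequence, flank=7):
    """Return (position, window) pairs centered on each asparagine (N).

    flank is an assumed value; the abstract does not state a window size.
    """
    padded = PAD * flank + sequence + PAD * flank
    windows = []
    for i, residue in enumerate(sequence):
        if residue == "N":
            # Index i in the original sequence is index i + flank in the
            # padded one, so this slice is centered on the asparagine.
            windows.append((i, padded[i:i + 2 * flank + 1]))
    return windows


def binary_profile(window):
    """One-hot encode a window: 20 bits per position.

    Padding and non-standard residues are encoded as all zeros.
    """
    vector = []
    for residue in window:
        vector.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vector


if __name__ == "__main__":
    for position, window in extract_windows("MKNASTLNGT", flank=2):
        print(position, window, sum(binary_profile(window)))
```

Each resulting vector has a fixed length of 20 × (2 × flank + 1), so the vectors can be stacked into a feature matrix and passed to any support vector machine implementation, with labels indicating whether each site is glycosylated.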