The purpose of this work was to develop an accurate prediction model that can process information contained in antenatal databases to determine whether a baby will be born prematurely. The focus was on improved data preprocessing, building on methods developed by previous students in the Carleton MIRG (Medical Information technology Research Group) lab. The machine learning classifiers used were Decision Tree (DT) classifiers (for feature reduction) and an Artificial Neural Network (ANN) classifier (for model evaluation). Missing values and class imbalance were handled using software packages in the R statistical programming language. For the BORN (Better Outcomes Registry and Network) database, the final sensitivity and specificity were 89.2% and 67.8% for the Parous group and 89.0% and 71.5% for the Nulliparous group. For the PRAMS (Pregnancy Risk Assessment Monitoring System) database, they were 84.1% and 71.4% for the Parous group and 83.8% and 76.0% for the Nulliparous group. An accurate predictive tool will allow caregivers to implement preventative treatment strategies.
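
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how sensitivity and specificity are computed from binary predictions under their standard definitions. It is written in Python for illustration and is not the R code used in this work. The function name `sensitivity_specificity`, the label coding (1 = preterm, 0 = term), and the toy labels are assumptions made for this example and do not come from the BORN or PRAMS data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Assumed coding (hypothetical, for illustration): 1 = preterm, 0 = term.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # preterm, predicted preterm
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # preterm, predicted term
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # term, predicted term
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # term, predicted preterm

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    sensitivity = tp / (tp + fn)  # fraction of preterm cases correctly flagged
    specificity = tn / (tn + fp)  # fraction of term cases correctly cleared
    return sensitivity, specificity


# Toy labels invented for this example only
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Reporting the two values separately matters when classes are imbalanced, since overall accuracy can be high even when most preterm cases are missed.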