This thesis addresses the issue of overfitting when calibrating over-parameterized physical models with noisy and incomplete observations. A Bayesian inversion framework is augmented with model comparison and sparse learning algorithms to identify the optimal model nested under an over-parameterized model. The work is performed in three stages. First, evidence-based Bayesian model comparison is implemented to rank competing models, whereby the evidence is estimated using stationary samples from the posterior parameter probability density function (pdf) generated by a parallel and adaptive Markov chain Monte Carlo (MCMC) sampler. Second, the concept of automatic relevance determination (ARD) is exploited to reformulate the model comparison problem as a sparse learning problem, alleviating two practical issues: 1) the sensitivity of the model evidence to the prior pdf, and 2) the risk of overlooking nested models excluded from the candidate model set. ARD operates by assigning a zero-mean prior pdf with unknown precision (a hyperparameter) to each questionable model parameter. This ARD-based sparse learning approach is implemented using an MCMC-based evidence estimator and a gradient-free evidence optimizer to compute the optimal hyperparameters, which in turn identify the optimal nested model. Third, a semi-analytical approach, nonlinear sparse Bayesian learning (NSBL), is proposed to alleviate the computational burden of performing MCMC sampling within an optimization task. The analytical tractability of Bayesian entities is enabled by a Gaussian mixture-model approximation of the posterior parameter pdf without the ARD priors. A multistart Newton's method is designed to expedite the non-convex, unconstrained maximization of the evidence using semi-analytically computed gradient and Hessian information. Each of the three proposed approaches is applied to model the complex physics behind a nonlinear aeroelastic oscillator observed in a wind tunnel.
Furthermore, several numerical studies are reported in this thesis to demonstrate the efficiency and robustness of NSBL in pruning redundant parameters from nonlinear physics-based models, leading to enhanced generalization capabilities of the predictive model.