In the first study of this thesis, we aim to improve the error-rate performance of polar codes by applying the well-known normalized minimum-sum decoding algorithm, in which the messages are scaled to enhance the reliability of the propagated messages and to increase decoding accuracy. The optimal scaling factor can only be selected by simulation prior to implementation, using density evolution: threshold values are calculated for various scaling factors, and the scaling factor yielding the highest noise-level threshold is taken as optimal. We then study the problem of log-likelihood ratio (LLR) correction in uncorrelated fading channels when the fading gain is not known at the receiver. To perform LLR scaling that adapts to various transmission conditions, an efficient scaling-factor search algorithm based on mutual information maximization is developed. Following that, a non-uniform quantizer is designed based on the condition of maximum information rate over uncorrelated Rayleigh fading channels when the successive cancellation (SC) decoding algorithm of polar codes is applied. In the next study, we present a low-complexity first-order linear LLR approximation for calculating the soft metrics of a polar-coded quadrature amplitude modulation (QAM) system in additive white Gaussian noise and Rayleigh fading channels with single or multiple transmit and receive antennas. To reduce the complexity of the max-log-MAP algorithm, the LLRs are simplified by replacing the minimum functions with simple linear functions. This is achieved by averaging the LLR values over different regions to obtain a single simplified expression. Lastly, we examine the use of the SC polar-coded QAM constellation to approach the capacity of wireless Rayleigh fading channels with multiple transmit and receive antennas when ideal channel state information is available at both the receiver and the transmitter.
Using Hermite polynomials and under an even-moment constraint, the results show that the information rate is achieved with a unique and optimal input distribution. The computational complexity can be reduced by factorizing the optimal distribution into a product of symmetrical distributions.
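As a rough illustration of the message scaling used in the first study, the sketch below compares the exact check-node LLR combination with its normalized minimum-sum approximation. The function names and the scaling factor value `alpha = 0.9` are assumptions made for this example only; the thesis selects the factor via density evolution, and no specific value is taken from it here.

```python
import math


def f_exact(a: float, b: float) -> float:
    """Exact check-node combination of two LLRs: 2*atanh(tanh(a/2)*tanh(b/2)).

    Intended for moderate LLR magnitudes; for very large inputs tanh
    saturates to 1.0 and atanh is undefined there.
    """
    return 2.0 * math.atanh(math.tanh(a / 2.0) * math.tanh(b / 2.0))


def f_normalized_min_sum(a: float, b: float, alpha: float = 0.9) -> float:
    """Normalized minimum-sum approximation of the check-node combination.

    alpha is a hypothetical scaling factor (0 < alpha <= 1) that shrinks
    the min-sum output, which otherwise overestimates the exact magnitude.
    """
    sign = math.copysign(1.0, a) * math.copysign(1.0, b)
    return alpha * sign * min(abs(a), abs(b))


if __name__ == "__main__":
    a, b = 2.0, -3.0
    print("exact               :", f_exact(a, b))
    print("min-sum (alpha=1.0) :", f_normalized_min_sum(a, b, alpha=1.0))
    print("normalized (0.9)    :", f_normalized_min_sum(a, b, alpha=0.9))
```

With `alpha = 1.0` the function reduces to plain minimum-sum, whose output magnitude is never smaller than the exact one; a factor below one pulls the approximation back toward the exact value.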