Many emerging applications, such as augmented reality and facial recognition, require heavy computation, and the processed results must be available to the user within milliseconds. Edge computing combined with cloud computing can address this challenge by distributing the load (offloading) across different connected computing resources. This thesis introduces a novel adaptive offloading framework based on online deep Q reinforcement learning. The proposed framework accounts for strict latency constraints, a large state space, rapidly changing user mobility, heterogeneous resources, and stochastic task arrival rates. It also highlights the importance of caching and introduces a novel concept called "container caching," which caches the dependencies of popular applications. Offloading decisions are therefore made to minimize energy consumption, latency, and caching costs. Simulation results and comparisons with existing benchmark algorithms show strong performance in terms of energy consumption, network traffic, task failures, and remaining power at large scale, demonstrating the feasibility of the proposed approach.
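The cost-minimizing offloading decision described above can be sketched as a small Q-learning loop. This is a minimal illustration, not the thesis's actual model: tabular Q-values stand in for the deep Q-network, and the offload targets, per-target costs, and weights are all invented for the example.

```python
import random

# Hypothetical offload targets and cost model (illustrative assumptions):
# each target has an assumed (energy, latency, caching) cost per task,
# and the agent learns to minimize their weighted sum.
ACTIONS = ["local", "edge", "cloud"]
COSTS = {
    "local": (5.0, 1.0, 0.0),
    "edge":  (2.0, 2.0, 1.0),
    "cloud": (1.0, 6.0, 0.5),
}
W_ENERGY, W_LATENCY, W_CACHE = 0.4, 0.5, 0.1  # assumed weights

def cost(action, load):
    """Weighted cost of offloading one task to `action` under current load."""
    e, l, c = COSTS[action]
    # Assumption: effective latency grows with load on the chosen resource.
    return W_ENERGY * e + W_LATENCY * l * (1 + load[action]) + W_CACHE * c

def train(episodes=2000, alpha=0.1, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    # State = coarse overall load level (0-2); tabular Q as a stand-in
    # for the deep Q-network used in the full framework.
    q = {(s, a): 0.0 for s in range(3) for a in ACTIONS}
    for _ in range(episodes):
        load = {a: rng.random() for a in ACTIONS}  # stochastic arrivals
        state = min(2, int(sum(load.values())))
        # Epsilon-greedy: mostly pick the lowest-cost action so far.
        if rng.random() < eps:
            action = rng.choice(ACTIONS)
        else:
            action = min(ACTIONS, key=lambda a: q[(state, a)])
        r = cost(action, load)  # lower is better
        # Q-update toward immediate cost plus discounted best future cost
        # (next state approximated by the same load level for simplicity).
        best_next = min(q[(state, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])
    return q

q = train()
```

The greedy policy at any load level is then `min(ACTIONS, key=lambda a: q[(state, a)])`; the thesis replaces the table with a neural network so the same idea scales to the large state space.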