The demand for video streaming grows every day, which increases the need for new video transmission and compression techniques that avoid massive data traffic over telecommunication networks. In this dissertation, we studied saliency detection and applied it to the video streaming problem so that the regions of a video frame can be transmitted in ranked order according to their importance (i.e., saliency). Salient areas are regions of interest that stand out relative to their surroundings and consequently attract more attention. To determine the salient areas within a scene, the visual importance and distinctiveness of its regions must be measured. The lack of a comprehensive and precise biologically inspired study on the saliency of bottom-up stimuli prevents us from justifying the level of importance of different stimuli, such as color, luminance, texture, and motion, to the human visual system (HVS). To overcome this barrier, we investigated the bottom-up features in video sequences using an eye-tracking procedure with human subjects, and derived a saliency ranking system that states the most dominant circumstances for each feature individually as well as in combination with other features. The experiment was performed under conditions free of cognitive bias, in order to speed up the video streaming procedure. Next, we introduced a gradual saliency detection framework for both still images and video sequences that uses color, texture, and motion features (based on our experimental estimations). In our algorithm, we proposed new feature maps for the color and texture features and improved the optical flow field estimation in our motion map. Finally, the different feature maps were combined and classified into different saliency levels using a Naive Bayesian Network. This work provides a benchmark for specifying gradual saliency in both static and dynamic (i.e., moving-background) scenes.
The main contribution of this work is the ability to assign a gradual saliency to the entirety of an image or video frame, rather than simply extracting a salient object or area, as is widely done in the state of the art.
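To make the final classification step more concrete, the following is a minimal sketch of how per-pixel color, texture, and motion responses might be assigned to graded saliency levels with a naive Bayes classifier. It is not the dissertation's implementation: the Gaussian likelihoods, the three-level labeling (0 = low, 1 = medium, 2 = high), the function names `fit_gaussian_nb` and `predict_level`, and all feature values are assumptions chosen purely for illustration.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(samples, labels, var_floor=1e-6):
    """Estimate a log-prior and per-feature Gaussian parameters for each level.

    Assumption: features are conditionally independent given the saliency
    level (the "naive" assumption) and roughly Gaussian within each level.
    """
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)

    total = len(samples)
    model = {}
    for level, rows in grouped.items():
        n = len(rows)
        columns = list(zip(*rows))  # one tuple of values per feature
        means = [sum(col) / n for col in columns]
        # var_floor keeps the variance strictly positive for constant features.
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[level] = (math.log(n / total), means, variances)
    return model


def predict_level(model, x):
    """Return the saliency level with the highest log-posterior for vector x."""
    def log_posterior(level):
        log_prior, means, variances = model[level]
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )
        return log_prior + log_likelihood

    return max(model, key=log_posterior)


# Hypothetical training data: (color, texture, motion) map responses in [0, 1],
# labeled with an illustrative saliency level.
train_x = [
    (0.10, 0.15, 0.05), (0.20, 0.10, 0.10), (0.15, 0.20, 0.00),
    (0.50, 0.45, 0.40), (0.55, 0.50, 0.50), (0.45, 0.55, 0.45),
    (0.90, 0.85, 0.95), (0.80, 0.90, 0.85), (0.95, 0.80, 0.90),
]
train_y = [0, 0, 0, 1, 1, 1, 2, 2, 2]

model = fit_gaussian_nb(train_x, train_y)

# Hypothetical feature vectors for three pixels of a new frame.
frame_pixels = [(0.12, 0.18, 0.03), (0.52, 0.48, 0.47), (0.88, 0.90, 0.92)]
levels = [predict_level(model, p) for p in frame_pixels]
```

Applied to every pixel (or block) of a frame, such a classifier yields a map of graded saliency levels rather than a single binary salient-object mask, which is what would allow regions to be ordered for transmission.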