Publications

 

No. Publication
1.
N. Tsapatsoulis, C. Pattichis, A. Kounoudes, C. Loizou, A. Constantinides, J. G. Taylor, "Visual Attention based Region of Interest Coding for Video-telephony Applications", 5th International Symposium on Communication Systems, Networks and Digital Signal Processing (CSNDSP'06), Patras, Greece, July 2006.
 

Abstract:

Bottom-up approaches to Visual Attention (VA) have been applied successfully in a variety of applications where no domain information exists, e.g. general-purpose image and video segmentation. On the other hand, when humans look for faces in a scene they perform an implicit conscious search. Therefore, using simple bottom-up approaches to identify visually salient areas in scenes containing humans is not very efficient. In this paper we introduce a top-down channel into a previously proposed VA architecture (i.e., that of Itti et al.) to account for conscious search in video-telephony applications. In such applications the presence of human faces is almost always guaranteed. The regions in the video-telephony stream that the proposed algorithm identifies as visually salient are encoded with higher precision than the remaining ones. This procedure leads to a significant bit-rate reduction, while the visual quality of the VA-based encoded video stream deteriorates only slightly, as the visual trial tests show. Furthermore, extended experiments on both static images and low-quality video show the efficiency of the proposed method in terms of the compression ratios achieved. The comparisons are made against standard JPEG and MPEG-1 encoding, respectively.
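The idea of adding a top-down face channel to bottom-up saliency can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `normalize` and `roi_mask`, the parameters `face_weight` and `threshold`, and the combination rule itself (a weighted blend of normalized maps followed by a threshold). The conspicuity maps and the face map are assumed to be precomputed 2-D arrays of equal shape.

```python
import numpy as np

def normalize(m):
    """Rescale a map to [0, 1]; a constant map becomes all zeros."""
    m = np.asarray(m, dtype=float)
    span = m.max() - m.min()
    if span == 0:
        return np.zeros_like(m)
    return (m - m.min()) / span

def roi_mask(bottom_up_maps, face_map, face_weight=0.5, threshold=0.5):
    """Blend bottom-up maps with a top-down face map and threshold the result.

    Hypothetical rule (not the paper's): average the normalized bottom-up
    maps, mix in the normalized face map with weight `face_weight`, then
    mark pixels whose normalized saliency reaches `threshold` as ROI.
    """
    bottom_up = np.mean([normalize(m) for m in bottom_up_maps], axis=0)
    saliency = (1.0 - face_weight) * bottom_up + face_weight * normalize(face_map)
    return normalize(saliency) >= threshold
```

Under these assumptions, setting `face_weight` to 0 reduces the sketch to a purely bottom-up mask, while raising it lets detected faces dominate the region of interest.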

2.
N. Tsapatsoulis, K. Rapantzikos and Y. Avrithis, “Priority Coding for Video-telephony Applications based on Visual Attention,” in Proc. of the 2nd International Mobile Multimedia Communications Conference (MobiMedia 2006), Alghero, Sardinia, Italy, September 2006.

Abstract:

In this paper we investigate the use of visual saliency maps for ROI-based video coding in video-telephony applications. Visually salient areas indicated in the saliency map are treated as ROIs. These areas are detected automatically by a visual attention (VA) algorithm that builds on the bottom-up approach proposed by Itti et al. A top-down channel emulating the visual search for human faces performed by humans has been added, while orientation, intensity and color conspicuity maps are computed within a unified multi-resolution framework based on wavelet subband analysis. For experimentation purposes, priority encoding is applied in a simple manner: frame areas outside the priority regions are blurred with a smoothing filter and then passed to the video encoder. This leads to better compression of both Intra-coded (I) frames (more DCT coefficients are zeroed in the DCT-quantization step) and Inter-coded (P, B) frames (lower prediction error). In more sophisticated approaches, priority encoding could be incorporated by varying the quality factor of the DCT quantization table. Extended experiments on both static images and low-quality video show the compression efficiency of the proposed method. The comparisons are made against standard JPEG and MPEG-1 encoding, respectively.
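The simple priority-encoding step described in this abstract (blur everything outside the priority regions, then hand the frame to the encoder) might be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which smoothing filter is used, so a plain box (mean) filter stands in for it, the frame is assumed to be a single-channel 2-D array, and the names `box_blur`, `prefilter_frame` and `radius` are hypothetical.

```python
import numpy as np

def box_blur(frame, radius=2):
    """Mean filter over a (2*radius+1) x (2*radius+1) window, edges replicated."""
    frame = np.asarray(frame, dtype=float)
    k = 2 * radius + 1
    padded = np.pad(frame, radius, mode="edge")
    h, w = frame.shape
    out = np.zeros_like(frame)
    # Sum the k*k shifted copies of the padded frame, then divide by the count.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def prefilter_frame(frame, roi, radius=2):
    """Keep pixels inside the boolean ROI mask; replace the rest with blurred values."""
    frame = np.asarray(frame, dtype=float)
    return np.where(roi, frame, box_blur(frame, radius))
```

The filtered frame would then be passed unchanged to a standard encoder. Because the smoothed background carries less high-frequency detail, more of its DCT coefficients quantize to zero, which is the compression effect the abstract describes.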

 
