Many techniques exist for adapting videos to satisfy heterogeneous resource conditions or user preferences, whereas selection of the best adaptation operation among various choices usually is either ad hoc or inefficient. To provide a systematic solution, we present a conceptual framework based on utility function (UF), which models video entity, adaptation, resource, utility, and the relations am...

This work derives a generalized video object profit function from the extended weighted transcoding graph to calculate the individual cache profit of certain versions of a video object, and the aggregate profit from caching multiple versions of the same video object. This proposed function takes into account the popularity of certain versions of an object, the transcoding delay among versions, and...

Human visual sensitivity varies not only with spatial frequencies but also with the moving velocities of image patterns. Moreover, the loss of visual sensitivity due to object motion might be compensated by eye movement. Removing the psychovisual redundancies in both the spatial and temporal frequency domains facilitates an efficient coder without perceptual degradation. Motivated by this, a visual measure is...

In this paper, we propose two methods to represent nonnegative integers based on the principle used in the Golomb code (GC). In both methods, the given integer is successively divided by a divisor, and the quotient and the remainders are then used to represent the integer. One of our methods is best suited for representing short integers and gives bit length comparable to that of Elias radic code ...
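
For orientation, the principle the abstract refers to can be sketched with the classic Golomb code, in which an integer is divided once by a divisor `m`, the quotient is written in unary, and the remainder in truncated binary. This is a minimal sketch of the textbook code only, not of the two methods proposed in the paper; the function names and the convention of writing the unary part as ones terminated by a zero are assumptions made for illustration.

```python
def golomb_encode(n: int, m: int) -> str:
    """Classic Golomb code of nonnegative n with divisor m, as a bit string."""
    if n < 0 or m < 1:
        raise ValueError("n must be >= 0 and m must be >= 1")
    q, r = divmod(n, m)
    unary = "1" * q + "0"          # quotient in unary (assumed convention)
    if m == 1:
        return unary               # remainder is always 0, nothing to add
    b = (m - 1).bit_length()       # ceil(log2(m))
    cutoff = (1 << b) - m          # remainders below this use b - 1 bits
    if r < cutoff:
        return unary + format(r, f"0{b - 1}b")
    return unary + format(r + cutoff, f"0{b}b")


def golomb_decode(bits: str, m: int) -> int:
    """Inverse of golomb_encode for a single codeword."""
    q = bits.index("0")            # number of leading ones is the quotient
    if m == 1:
        return q
    pos = q + 1
    b = (m - 1).bit_length()
    cutoff = (1 << b) - m
    # Read b - 1 bits first; a value at or above the cutoff needs one more bit.
    r = int(bits[pos:pos + b - 1], 2) if b > 1 else 0
    if r >= cutoff:
        r = int(bits[pos:pos + b], 2) - cutoff
    return q * m + r
```

With `m = 3`, for example, `golomb_encode(4, 3)` yields `"1010"`: quotient 1 as `10`, remainder 1 as `10`. Small divisors favor short integers, which is the trade-off the abstract's comparison with Elias codes concerns.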

This paper addresses the problem of assessing distortions produced by watermarking 3D meshes. In particular, a new methodology for subjective evaluation of the quality of 3D objects is proposed and implemented. Two objective metrics derived from measures of surface roughness are then proposed and their efficiency to predict the perceptual impact of 3D watermarking is assessed and compared with the...

In the context of the automated surveillance field, automatic scene analysis and understanding systems typically consider only visual information, whereas other modalities, such as audio, are typically disregarded. This paper presents a new method able to integrate audio and visual information for scene analysis in a typical surveillance scenario, using only one camera and one monaural microphone....

Due to the prevalence of digital video camcorders, home videos have become an important part of life-logs of personal experiences. To enable efficient video parsing, a critical step is to automatically extract objects, events and scene characteristics present in videos. This paper addresses the problem of extracting objects from home videos. Automatic detection of objects is a classical yet diffic...

Effective video retrieval is the result of interplay between interactive query selection, advanced visualization of results, and a goal-oriented human user. Traditional interactive video retrieval approaches emphasize paradigms, such as query-by-keyword and query-by-example, to aid the user in the search for relevant footage. However, recent results in automatic indexing indicate that query-by-con...

Content-based copy retrieval (CBCR) aims at retrieving in a database all the modified versions or the previous versions of a given candidate object. In this paper, we present a copy-retrieval scheme based on local features that can deal with very large databases both in terms of quality and speed. We first propose a new approximate similarity search technique in which the probabilistic selection o...

Many recently proposed cross-layer protocols for wireless video have advocated the relay of corrupted packets to higher layers. Such protocols lead to both errors and erasures at the compressed video application layer. We generically refer to such schemes as hybrid erasure-error protocols (HEEPs). In this paper, we analyze the utility of HEEPs for efficient transmission of video over wireless chan...

This paper proposes an optimized content-aware authentication scheme for JPEG-2000 streams over lossy networks, where a received packet is consumed only when it is both decodable and authenticated. In a JPEG-2000 codestream, some packets are more important than others in terms of coding dependency and image quality. This naturally motivates allocating more redundant authentication information for ...

Time-constrained error recovery is an integral component of reliable low-delay video applications. Regardless of the error-control method adopted by the application, unacknowledged or missing packets must be quickly identified as lost or delayed, so that necessary actions can be taken by the server/client on time. Historically, this problem has been referred to as retransmission timeout (RTO) esti...
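
As background for the RTO estimation problem named above, the following is a minimal sketch of the classic smoothed-RTT estimator standardized for TCP in RFC 6298. It is not the method studied in the paper. The class name is hypothetical, and the clock-granularity term and the one-second lower bound from the RFC are deliberately omitted to keep the sketch short.

```python
class RtoEstimator:
    """Smoothed RTT / RTT variance estimator in the style of RFC 6298."""

    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the RTT variance
    K = 4           # variance multiplier in the timeout

    def __init__(self) -> None:
        self.srtt = None
        self.rttvar = None

    def update(self, rtt: float) -> float:
        """Feed one RTT sample (seconds) and return the new timeout."""
        if rtt <= 0:
            raise ValueError("rtt must be positive")
        if self.srtt is None:
            # First measurement initializes both state variables.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # The variance is updated before the mean, using the old SRTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        return self.srtt + self.K * self.rttvar
```

A sender would call `update` for each acknowledged packet and declare a packet lost once it has been outstanding longer than the returned value. Low-delay video applications typically need tighter timeouts than this conservative estimator gives.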

We consider streaming pre-encoded and packetized media over best-effort networks in the presence of acknowledgment feedback. We first review a rate-distortion (RD) optimization framework that can be employed in such scenarios. As part of the framework, a scheduling algorithm selects the data to send over the network at any given time, so as to minimize the end-to-end distortion, given an estimate...

Multimedia streaming over the Internet has been a very challenging issue due to the dynamic uncertain nature of the channels. This paper proposes an algorithm for the joint design of source rate control and congestion control for video streaming over the Internet. With the incorporation of a virtual network buffer management mechanism (VB), the quality of service (QoS) requirements of the applicat...

Wireless multimedia studies have revealed that forward error correction (FEC) on corrupted packets yields better bandwidth utilization and lower delay than retransmissions. To facilitate FEC-based recovery, corrupted packets should not be dropped so that the maximum number of packets is relayed to a wireless receiver's FEC decoder. Previous studies proposed to mitigate wireless packet drops by a parti...

Many protocols optimized for transmission over wireless networks have been proposed. However, one issue that has not been looked into is considering human perception in deciding a transmission strategy for three-dimensional (3D) objects. Several factors, such as the number of vertices and the resolution of texture, can affect the display quality of 3D objects. When the resources of a graphics syst...

This paper describes a novel framework for automatic lecture video editing by gesture, posture, and video text recognition. In content analysis, the trajectory of hand movement is tracked and the intentional gestures are automatically extracted for recognition. In addition, head pose is estimated through overcoming the difficulties due to the complex lighting conditions in classrooms. The aim of r...

The harmonic broadcasting scheme has the best performance for the user latency. It, however, does not always provide the video data in time to the users. To provide the video data reliably, its two main variants, the cautious and quasi-harmonic schemes, have been proposed. They require more bandwidth than the harmonic scheme. The cautious and quasi-harmonic schemes need 0.50 b and 0.1771 b more bandwi...
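
To give the bandwidth figures some scale, the baseline harmonic scheme can be sketched as follows. In its standard formulation the video is cut into n equal segments and segment i is broadcast continuously on a channel of bandwidth b/i, so the total is b times the n-th harmonic number. The sketch assumes that b denotes the video consumption rate, which the truncated abstract does not state; the function name is hypothetical.

```python
def harmonic_bandwidth(n_segments: int, b: float = 1.0) -> float:
    """Total server bandwidth of basic harmonic broadcasting.

    Segment i (1-based) is sent at rate b / i, so the total is
    b * (1 + 1/2 + ... + 1/n). b is assumed to be the consumption rate.
    """
    if n_segments < 1:
        raise ValueError("n_segments must be >= 1")
    return b * sum(1.0 / i for i in range(1, n_segments + 1))
```

Because the harmonic number grows only logarithmically with n, many segments (and hence a short startup delay) cost little bandwidth, so a fixed overhead such as the 0.50 b or 0.1771 b quoted above is small relative to the total.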

For original paper see ibid., vol. 7, no. 2, p. 330-8 (2005). To protect multimedia data in audio and video streaming applications from unauthorized access by intruders, Yeung et al. recently proposed a multikey multimedia proxy with enhanced security features based on asymmetric reversible parametric sequences (ARPS) and the RSA technique. The authors claimed that their scheme provided a number o...

For original paper see ibid., vol. 4, no. 1, p. 121-8 (2002). It is shown that the watermarking algorithm presented in the above paper (R. Liu et al., 2002) can be easily discredited and ipso facto cannot be used for protecting rightful ownership.

The ability of a computer to detect and appropriately respond to changes in a user's affective state has significant implications for human-computer interaction (HCI). In this paper, we present our efforts toward audio-visual affect recognition on 11 affective states customized for HCI application (four cognitive/motivational and seven basic affective states) of 20 nonactor subjects. A smoothing me...

State-of-the-art digital multimedia platforms contain a powerful very-long instruction word (VLIW) processing core. To obtain optimum performance on such a platform, a media-processing algorithm should exploit all the advantages provided by the VLIW processor's architecture. This paper focuses on a method for adapting a video compression algorithm for implementation on a VLIW processor using algor...

This correspondence presents a speech sentence compression scheme. A compressed word sequence is first extracted. Speech segments in the spoken document that correspond to the extracted words are then selected for concatenation. Evaluation of the proposed approach shows that the compressed speech sentence retains important and meaningful information and naturalness.