As network availability becomes ubiquitous, users are leveraging this access to establish their presence in remote locations through the use of commercially available telepresence technologies. With the increasing adoption of these systems, new questions are emerging about how these technologies affect user interactions and relationships. Our goal for this workshop is to bring an interdisciplinary group of telepresence researchers together to trade perspectives, fostering new opportunities for collaboration...

Parametric speakers produce sound by emitting ultrasound and using the small nonlinearity in air to demodulate it back to audible sound. The use of ultrasound allows very narrow audio beams to be produced, which has applications in a number of military and consumer scenarios. However, designing better parametric speakers has been hard: a closed-form solution of the nonlinear wave equation for generic geometries is nearly impossible, and the only existing solution was derived for the simple case of a...

Estimating room impulse responses (RIRs) has a number of applications, including personalized audio, analyzing and improving the acoustic behavior of concert halls, listening room compensation, sound source localization, and many others. RIRs have been estimated in essentially the same fashion for the last 50 years: compute the cross-correlation between a signal played at point A and the signal received at point B. Best results are obtained when the signal played is white noise, or a maximum length...
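
The classical cross-correlation estimate described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method proposed in the paper: the toy impulse response `true_rir`, the helper names `convolve` and `estimate_rir`, the noise length, and the random seed are all hypothetical, and the room is simulated by a short FIR filter with no measurement noise.

```python
import random

def convolve(signal, impulse_response):
    """Linear convolution, truncated to len(signal) output samples."""
    out = [0.0] * len(signal)
    for n in range(len(signal)):
        acc = 0.0
        for k, h_k in enumerate(impulse_response):
            if n - k >= 0:
                acc += h_k * signal[n - k]
        out[n] = acc
    return out

def estimate_rir(played, received, num_taps):
    """Estimate each tap as the cross-correlation of the played and
    received signals at that lag, normalized by the played energy."""
    energy = sum(x * x for x in played)
    estimate = []
    for lag in range(num_taps):
        corr = sum(played[n] * received[n + lag]
                   for n in range(len(played) - lag))
        estimate.append(corr / energy)
    return estimate

rng = random.Random(0)                       # fixed seed (assumption)
true_rir = [1.0, 0.0, 0.5, -0.3, 0.0, 0.1]   # hypothetical toy response
played = [rng.gauss(0.0, 1.0) for _ in range(20000)]  # white noise at point A
received = convolve(played, true_rir)                 # signal at point B
estimated_rir = estimate_rir(played, received, len(true_rir))
max_error = max(abs(a - b) for a, b in zip(true_rir, estimated_rir))
```

Because white noise is nearly uncorrelated with shifted copies of itself, the cross-correlation at each lag isolates the corresponding tap; the residual error shrinks as the excitation gets longer.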

As a field, telepresence has grown to include a wide range of systems, from multi-view videoconferencing units to humanlike androids. However, the diversity of systems and research makes it difficult to form a holistic understanding of where the field stands. We propose a framework consisting of seven design dimensions for understanding telepresence, iteratively developed from previous literature, a series of three surveys, the construction of two design probes, and a field study. These design...

Telephone calls and videoconferencing are ubiquitous parts of everyday life. As the content of the call may extend beyond just words, people share applications and media using techniques such as screen sharing and email attachments. Little is known about the prevalence of this behavior and the benefits it can provide. We conducted a survey and a lab study to examine media sharing during a video call and found that it can be useful as well as emotionally engaging. Participants indicated that they would...

Mobile videoconferencing is increasingly being used to bring remote friends or family along to an activity happening outside the home, such as shopping or visiting a tourist attraction. We explored how including contextual information of the event, in addition to audio and video of the person at the event, impacts the shared experience. We studied three kinds of information: a map showing the position of the person at the activity, a second live video showing what was in front of that person, and a...

While modern displays offer high dynamic range (HDR) with large bit-depth for each rendered pixel, the bulk of legacy image and video content was captured using cameras with shallower bit-depth. In this paper, we study the bit-depth enhancement problem for images, so that a high bit-depth (HBD) image can be reconstructed from an input low bit-depth (LBD) image. The key idea is to apply appropriate smoothing given the constraints that the reconstructed signal must lie within the per-pixel quantization...
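
The idea of smoothing under a per-pixel quantization constraint can be illustrated with a deliberately simple stand-in. The sketch below is not the paper's algorithm: it swaps in a plain three-tap moving average followed by clamping each sample back into its quantization bin, applied to a 1D row of pixels. The function name `enhance_bit_depth`, the iteration count, and the sample row are hypothetical.

```python
def enhance_bit_depth(lbd_pixels, low_bits, high_bits, iterations=50):
    """Smooth a 1D row of low bit-depth pixels while keeping every
    reconstructed value inside the quantization bin of its input."""
    step = 1 << (high_bits - low_bits)        # HBD levels per LBD level
    lower = [p * step for p in lbd_pixels]    # inclusive lower bin bound
    upper = [lo + step - 1 for lo in lower]   # inclusive upper bin bound
    # Start from the bin centers, then alternate smoothing and clamping.
    signal = [lo + (step - 1) / 2.0 for lo in lower]
    for _ in range(iterations):
        smoothed = []
        for i in range(len(signal)):
            left = signal[max(i - 1, 0)]
            right = signal[min(i + 1, len(signal) - 1)]
            avg = (left + signal[i] + right) / 3.0
            # Project back onto the per-pixel quantization constraint.
            smoothed.append(min(max(avg, lower[i]), upper[i]))
        signal = smoothed
    return [int(round(v)) for v in signal]

lbd_row = [0, 0, 0, 1, 1, 1, 2, 2, 2]         # hypothetical 2-bit ramp
hbd_row = enhance_bit_depth(lbd_row, low_bits=2, high_bits=4)
```

Requantizing `hbd_row` to 2 bits reproduces `lbd_row` exactly, since no sample ever leaves its bin, while the smoothing fills in intermediate levels across the staircase edges that a plain bit shift would leave as visible bands.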

Compressing attributes on 3D point clouds, such as colors or normal directions, has been a challenging problem, since these attribute signals are unstructured. In this paper, we propose to compress such attributes with a graph transform. We construct graphs on small neighborhoods of the point cloud by connecting nearby points, and treat the attributes as signals over the graph. The graph transform, which is equivalent to the Karhunen-Loève Transform on such graphs, is then adopted to decorrelate the signal....

Depth image compression is important for compact representation of 3D visual data in texture-plus-depth format, where texture and depth maps from one or more viewpoints are encoded and transmitted. A decoder can then synthesize a freely chosen virtual view via depth-image-based rendering using nearby coded texture and depth maps as reference. Further, depth information can be used in other image processing applications beyond view synthesis, such as object identification, segmentation, and so on. In...

In this paper, we further the characterization of a fundamental limit of human perception: the accuracy of human estimation of others’ eye gaze directions. In particular, we introduce a non-linear model that describes how both the head direction and the gaze direction of a looker relative to an observer jointly affect the observer’s perception of the looker’s gaze direction. Ours is the first to explain in a single model the biases introduced by the looker’s head direction, the relative accuracy of eye...

Transmitting compactly represented geometry of a dynamic 3D scene from a sender can enable a multitude of imaging functionalities at a receiver, such as synthesis of virtual images at freely chosen viewpoints via depth-image-based rendering. While depth maps (projections of 3D geometry onto 2D image planes at chosen camera viewpoints) can nowadays be readily captured by inexpensive depth sensors, they are often corrupted by non-negligible acquisition noise. Given depth maps need to be denoised and...

The next step in immersive communication beyond video from a single camera is object-based free viewpoint video, which is the capture and compression of a dynamic object such that it can be reconstructed and viewed from an arbitrary viewpoint. The moving human body is a particularly useful subclass of dynamic object for object-based free viewpoint video relevant to both telepresence and entertainment. In this paper, we compress moving human body sequences by applying recently developed Graph Wavelet...

We present the design and evaluation of WaaZam, a video-mediated communication system designed to support creative play in customized environments. Users can interact together in virtual environments composed of digital assets layered in 3D space. The goal of the project is to support creative play and increase social engagement during video sessions of geographically separated families. We try to understand the value of customization for individual families with children ages 6-12. We present...

Cross-ratio (CR) based methods offer many attractive properties for remote gaze estimation using a single camera in an uncalibrated setup by exploiting the invariance of a plane projectivity. Unfortunately, due to several simplifying assumptions, the performance of CR-based eye gaze trackers decays significantly as the subject moves away from the calibration position. In this paper, we introduce an adaptive homography mapping for achieving gaze prediction with higher accuracy at the calibration position...

The cross-ratio approach has recently attracted increasing attention in eye-gaze tracking due to its simplicity in setting up a tracking system. Its accuracy, however, is lower than that of the model-based approach, and substantial efforts have been devoted to improving its accuracy. Binocular fixation is essential for humans to have good depth perception, and this paper presents a technique leveraging this constraint. It is used in two ways: First, in estimating jointly the homography matrices for both...

This panel brings together HCI researchers who are primarily remote workers, in order to discuss their technological solutions and social practices. We aim for an engaging, fun, and informative discussion appropriate for researchers interested in remote collaboration and computer-mediated communication.

We present Eventful, a system for producing news reports of local events using remote and locative crowd workers. The system recruits and guides novice crowd workers as they perform the roles of field reporter, curator, or writer. Field reporters attend the events in person, and use Eventful's mobile web app to get a personalized mission, submit content, and receive feedback. Missions include tasks such as taking a photo and asking an attendee a question. In parallel, remote curators approve,...