Classifications

G—PHYSICS

G06—COMPUTING; CALCULATING; COUNTING

G06F—ELECTRICAL DIGITAL DATA PROCESSING

G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements

G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer

H04N21/647—Control signaling between network components and server or clients; Network processes for video distribution between server and clients, e.g. controlling the quality of the video stream, by dropping packets, protecting content from unauthorised alteration within the network, monitoring of network load, bridging between two different networks, e.g. between IP and wireless

Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS


Accordingly, there is a need for a network-extensible and easily reconfigurable media appliance capable of communicating over networks and allowing for extension of on-appliance audio or video processing software and for tagging of recorded audio or video signals.

Optionally, memory 110 stores software instructions and data implementing billing 202 and/or business methods, such as a time-based pay-per-view and/or micro-billing feature. For example, memory 110 stores a data structure comprising a field describing a viewing (such as a home-viewing of a video clip or video stream) and/or a field indicating an amount to be charged for the viewing and/or a field identifying a party to be charged.
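
The billing data structure described above can be sketched as a simple record; the field names (viewing_description, amount_cents, billed_party) are illustrative assumptions, not taken from the source, which only specifies the three kinds of fields.

```python
from dataclasses import dataclass

# Hypothetical sketch of the billing record; all field names are invented
# for illustration.
@dataclass
class BillingRecord:
    viewing_description: str  # describes the viewing, e.g. a home viewing of a clip
    amount_cents: int         # amount to be charged for the viewing
    billed_party: str         # identifies the party to be charged

record = BillingRecord("home viewing of clip 42", 199, "subscriber-0017")
```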

Optionally, memory 110 stores instructions and/or data 205 for performing identity recognition (such as facial recognition, emotion recognition, voice recognition, and/or other pattern or identity recognition) on video data 201 and/or on an incoming video signal. For example, memory 110 stores a data structure comprising an identifier for a database against which image recognition is to be performed, for example a database of faces for recognizing faces in a crowd. The database may be stored (partially or completely) internally on media appliance 100 or reside externally on a server. As another example, memory 110 stores a data structure comprising a feature extracted from a video stream and/or video clip (using image extraction instructions stored in memory 110), and the extracted feature is used for a database query or is sent to a server for further handling.
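
The extract-and-query flow described above might look like the following toy sketch, where a trivial histogram stands in for real image-feature extraction and an in-memory dict stands in for the face database; all names are hypothetical.

```python
def extract_feature(frame):
    # Stand-in for real image-feature extraction: a 4-bin histogram of
    # pixel values (a genuine system would compute face embeddings, etc.).
    hist = [0, 0, 0, 0]
    for px in frame:
        hist[px % 4] += 1
    return tuple(hist)

def query_database(feature, database):
    # Return the identity whose stored feature matches exactly, else None.
    for identity, stored in database.items():
        if stored == feature:
            return identity
    return None

db = {"alice": (2, 1, 1, 0)}          # toy "database of faces"
match = query_database(extract_feature([0, 4, 1, 2]), db)  # -> "alice"
```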

Media appliance 100 optionally communicates with DRM service 308 for downloading and/or uploading DRM meta-data. Optionally, media appliance 100 generates a message indicating an infringement and/or other violation of digital rights, according to a set of DRM rules, such as copying without permission, broadcasting without permission, etc. For example, memory 110 stores a data structure comprising a field identifying a video clip and/or video stream, and an indicator of a violation of a DRM rule, such as an act of broadcasting the video clip and/or video stream without permission.
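
A minimal sketch of the violation message, assuming a per-clip permission table; the rule names and message fields below are invented for illustration, not taken from the source.

```python
# Hypothetical per-clip permissions (a real DRM service would supply these).
DRM_RULES = {"broadcast": False, "copy": False}

def check_action(clip_id, action):
    """Return a violation message if the action is not permitted, else None."""
    if not DRM_RULES.get(action, False):
        return {"clip": clip_id, "violation": action + " without permission"}
    return None

msg = check_action("clip-42", "broadcast")
```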

Media appliance 100 optionally communicates with security service 309 to upload security information such as a video and/or audio record of a scene, identity recognition data as computed by identity recognition instructions 203, GPS data as provided by GPS module 112, directional data as provided by acceleration detector 113, and/or to download security information such as a location to watch, identity data to store for matching against images, and/or a voice audio signature to store for matching against audio clips. For example, media appliance 100 sends a data structure to security service 309, wherein the data structure comprises a field identifying a person, and a field identifying the location of the media appliance 100 at the time the person is sensed by media appliance 100. Optionally, media appliance 100 couples to a police authority for providing live and/or recorded footage and/or triggering an alarm and calling police according to built-in media appliance intelligence for identifying potentially dangerous and/or suspicious conditions.

Media appliance 100 optionally communicates with GPS service 302, such as GPS satellites, to receive GPS information. For example, if media appliance 100 moves into a restricted area, as indicated by GPS service 302 and/or by information residing on media appliance 100 and/or obtained remotely, GPS unit 112 activates an alert. For example, memory 110 stores a data structure comprising a field identifying a restricted geographical area, and media appliance 100 generates an alarm when location of media appliance 100, as indicated by GPS service 302, falls within the restricted geographic area.
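
The restricted-area alert described above reduces to a geofence test; the rectangular bounding box below is a simplifying assumption (real restricted areas may be arbitrary polygons), and all names are illustrative.

```python
def in_restricted_area(lat, lon, area):
    """area = (lat_min, lat_max, lon_min, lon_max), a toy rectangular geofence."""
    lat_min, lat_max, lon_min, lon_max = area
    return lat_min <= lat <= lat_max and lon_min <= lon <= lon_max

# Hypothetical restricted region (roughly the San Francisco Bay area).
RESTRICTED = (37.0, 38.0, -123.0, -122.0)

def maybe_alarm(lat, lon):
    # Generate an alarm message when the appliance's GPS fix falls inside
    # the restricted geographic area, as the passage above describes.
    if in_restricted_area(lat, lon, RESTRICTED):
        return "ALARM: appliance inside restricted area"
    return None
```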

Media appliance 100 optionally communicates with news service 310 and/or other objective information service. In one embodiment, media appliance 100 receives a data structure from news service 310, the data structure representing a digital template and comprising a field identifying a location, and one or more fields identifying elements to be covered by a reporter (such as a person to interview, a particular place to point out to viewers, other news reporters covering the same news story, etc.).

Media appliance 100 optionally communicates with a sports broadcasting network, game-show broadcasting network, and/or other gaming or competition-related network 311. In one embodiment, media appliance 100 receives a data structure from sports broadcasting network 311, the data structure comprising a field identifying one or more competing parties, a field identifying a location of the competition, and a field indicating the competition schedule.

Media appliance 100 optionally communicates with private service 312. In one embodiment, media appliance 100 receives a data structure from a movie production source or network 312, the data structure comprising a field identifying one or more movie or media productions, a field identifying a location of the production, a field indicating the production schedule, a field indicating one or more scenes, and a field indicating one or more cast or staff members.

Media appliance 100 optionally communicates with other networked media appliance 306 for exchanging video and/or audio clips and/or for collaborating in the production of a media project, wherein a media appliance is assigned a token (number, string, etc.), statically or dynamically, for identifying the media appliance. Media appliance 100 optionally communicates with other networked media appliance 306 to enable video-conferencing and/or multi-way collaboration, for example, in business meetings, real estate transactions, distance learning, sports, fashion shows, surveillance, training, games, tourism, etc. For example, memory 110 stores a data structure comprising a field for describing a group of collaborating media appliances 100, and a field identifying media appliance 100 itself among the group of collaborating media appliances.

FIG. 3b is a diagram illustrating network-extensible reconfigurable media appliances communicating over a network with a server, according to an embodiment of the present invention. One or more client media appliances 330 communicate over a network 331 with server 332. Network 331 is a combination of one or more wired and/or wireless networks such as the Internet, a LAN, a WAN, a satellite network, or other network for communication. In one embodiment, server 332 is a news server, having a script or digital template for producing a news program. Server 332 delegates the recording or streaming of various predetermined pieces of audio and/or video footage to the various media appliance clients 330, wherein the recorded or streamed pieces will serve to fill in the server 332 script or digital template for producing the news program. In another embodiment, server 332 is a server for sports or other competition, having a script or digital template for producing a sports program or a program for other competitive activity. Server 332 delegates the recording or streaming of various predetermined pieces of audio and/or video footage to the various media appliance clients 330, wherein the recorded or streamed pieces serve to fill in the server 332 script or digital template for producing the sports (or other competition) program.
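
The delegation step above can be sketched as assigning each template slot to a client appliance, here round-robin; the slot names and the round-robin policy are assumptions for illustration, not the patent's method.

```python
def delegate(slots, clients):
    """Assign each script/template slot to a client appliance round-robin."""
    return {slot: clients[i % len(clients)] for i, slot in enumerate(slots)}

# Hypothetical news-program template slots and two client appliances.
assignments = delegate(["intro", "interview", "wrap-up"], ["cam-1", "cam-2"])
# assignments maps each slot to the appliance asked to record or stream it
```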

In one embodiment, I/O module 111 presents a user interface (UI), comprising a combination of hard (physical) buttons and/or soft (graphical) buttons for accessing and using billing functions, DRM functions, authentication, identity recognition, digital editing of media, and/or other services as shown in FIG. 3a and described above. For example, a view (for example comprising a button) is presented via display 114 to allow approval of a billing associated with the viewing of video data. As another example, a view is presented via display 114, allowing selection of one or more audio and/or video data for submission or transmission to a server 332, such as a news server or a sports server, as described above. Selection of a presented audio and/or video data designates the selected data for submission or transmission to the server. Optionally, interfaces and media appliances are physically separate, wherein through an interface a user can tap into a pool of one or more media appliances to view available audio and/or video data, and/or select one or more available audio and/or video for submission or transmission to a server 332, as described above. As another example, a view is presented at server 332 for approving the inclusion of a submitted or transmitted audio and/or video data into a script or a digital template for a news or sports program, wherein the audio and/or video data is submitted by a media appliance client 330 to server 332, as described above.

FIG. 4 is a flow diagram illustrating a method for sensing according to one embodiment of the present invention. The method begins with pre-production 401. Pre-production comprises employing 402 a script and/or storyboard flowchart, or employing 403 a digital template. A portion of this front-end may be implemented automatically or manually in software, comprising analysis, design, development, production, implementation or evaluation of the script, storyboard, and/or digital template. Optionally, frames and/or scenes are labeled (via meta-data) according to the script, storyboard, or digital template in use.

A script or storyboard is downloaded over a wired and/or wireless network, made available via removable storage (e.g. memory card and/or disk), or is alternatively created on the media appliance. A digital template describes how to construct a video and/or multimedia document by sensing (i.e. “shooting” or recording) and assembling individual scenes and/or segments in a particular order, and is downloaded over a wired and/or wireless network or created on the media appliance. Alternatively, the user of media appliance 100 may decide not to consult a script, storyboard, or digital template, and proceed directly to sensing 404.

One example of a template is a template for insurance inspection of vehicle accidents, wherein the template indicates “slots” for video clips, taken from various angles, of the vehicles involved in the accident, as prescribed by an insurance company.

Optionally, media appliance 100 adaptively guides the media appliance operator in making discretionary decisions to take alternate script paths and/or alter the flow of the script (or storyboard or digital template) or generally deviate from the script, for example when dealing with emergency conditions and/or events which do not occur according to script. Such guidance may employ non-deterministic scripts, according to logic specified using Bayesian modeling, neural networks, fuzzy logic, and/or other techniques for making decisions under complex conditions and/or under incomplete information. For example, in one embodiment a cast member in a script is described by fuzzy attributes, such as “a female actor with at least five years drama experience” in a leading role (instead of, or in addition to, identifying the lead role actor by name). Then, in the event the lead actor cancels her engagement, instructions employing fuzzy logic perform a search for actors matching the fuzzy attributes to dynamically recommend one or more candidates to fill the role.
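
A toy version of the fuzzy-attribute search described above: candidates are scored against the role description and those above a threshold are recommended. The scoring weights, threshold, and field names are invented for illustration; a real implementation would use a proper fuzzy-logic engine.

```python
def match_score(candidate, role):
    """Score a candidate against fuzzy role attributes (0.0 to 1.0)."""
    score = 0.0
    if candidate["gender"] == role["gender"]:
        score += 0.5
    # Partial credit the closer experience is to the stated requirement.
    score += 0.5 * min(candidate["experience_years"] / role["min_experience"], 1.0)
    return score

def recommend(candidates, role, threshold=0.8):
    """Recommend candidates whose fuzzy match exceeds the threshold."""
    return [c["name"] for c in candidates if match_score(c, role) >= threshold]

# Hypothetical casting data for the "female actor, five years drama" example.
role = {"gender": "female", "min_experience": 5}
candidates = [
    {"name": "A", "gender": "female", "experience_years": 6},
    {"name": "B", "gender": "male", "experience_years": 10},
]
picks = recommend(candidates, role)  # -> ["A"]
```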

Optionally, the digital template or script is non-linear, allowing for one or more branching points. A branching point allows the script and/or template to flow in more than one path. For example, scene (or clip or stream) A can be followed by scene B or scene C, depending on which branch of the branching point following A is taken. For a viewer, a media presentation prepared according to such a non-linear template or script allows for a multiplicity of presentations comprising different scene (or clip or stream) orderings. The decision of which of the alternate paths to follow at a branching point can be viewer selected, randomly chosen, based on external variables (such as a combination of one or more of: weather, temperature, stock quotes, time of day or year, viewing location, amount of money left in the viewer's account, or any other external variables), based on biometric sensing of the viewer, based on the result of an identity or emotion recognition procedure on the viewer (such as distinguishing between happiness, sadness, excitement, apathy, interest in a particular aspect of the presentation and/or other emotions or indications of interest exhibited by the viewer), based on real-time input from the viewer or from a larger audience (such as a deliberate viewer decision of which script or template path to take next, provided via an input device or detected by the presentation module), or based on other variables. Such a non-linear template or script allows, for example, for the production and presentation of a PG-rated, R-rated, or X-rated version of a given movie depending on the audience (for example a parent may elect to view the R-rated version of the movie while electing a PG-rated presentation for the children). As another example, a wedding template or script may allow for different presentations based on whether the bride's family or the groom's family is viewing. As another example, a mystery presentation may offer alternate endings, based on viewer input or external variables as described above.
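
The branching-point idea can be sketched as a tiny script graph in which an external variable (here, the hour of day) selects the next scene; the graph and the decision rule are invented for illustration.

```python
# Toy non-linear script: scene A branches to B or C; B and C are terminal.
SCRIPT = {"A": ("B", "C"), "B": (), "C": ()}

def next_scene(current, hour):
    """Pick the next scene at a branching point from an external variable.

    Here the rule is arbitrary: mornings take the first branch, afternoons
    the second. A real system might use viewer input, biometrics, etc.
    """
    branches = SCRIPT[current]
    if not branches:
        return None  # no branching point: the presentation ends
    return branches[0] if hour < 12 else branches[1]
```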

Media appliance 100 senses 404 video and/or audio and stores a digital representation in memory 110. Optionally, multiple audio and/or video streams are sensed, either by the same media appliance or by collaborating media appliances, wherein synchronization is provided for the multiple streams, in the form of meta-data tags describing related scenes and/or streams and/or frames, and/or in the form of meta-data describing time stamps relating different scenes and/or streams. For example, memory 110 stores a data structure comprising one or more fields identifying one or more related video scenes and/or streams and/or frames, and a field indicating the nature of the relation (for example indicating that the video scenes and/or streams and/or frames represent different viewing angles of the same sensed object).
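
One simple way to realize the timestamp-based synchronization is to map a common wall-clock time to a frame index in each stream using per-stream start times and frame rates; the function and parameters below are assumptions for illustration, not the patent's method.

```python
def align(streams, t):
    """Return the frame index in each stream nearest to wall-clock time t.

    streams maps a stream name to (start_time_seconds, frames_per_second).
    """
    return {name: round((t - start) * fps)
            for name, (start, fps) in streams.items()}

# Two hypothetical collaborating appliances; cam-2 started 0.5 s later.
streams = {"cam-1": (0.0, 30), "cam-2": (0.5, 30)}
frames = align(streams, 2.0)  # -> {"cam-1": 60, "cam-2": 45}
```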

FIG. 5 is a flow diagram illustrating a method for optionally filling in a template according to a preferred embodiment of the present invention. Starting 501 with a template, sense 502 a first scene according to the template, and fill in 503 the sensed scene in the template. If no additional scene is desired 505, finish 506, else 504 proceed to step 502 and repeat until done. The template is stored in memory 110 in a suitable format, such as the Advanced Authoring Format (AAF).
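
The FIG. 5 loop can be sketched directly: sense a scene for each slot until the template is full. Here sense_scene is a stand-in for actual capture, and the slot names are illustrative.

```python
def fill_template(slots, sense_scene):
    """Fill each template slot by sensing a scene for it (FIG. 5 loop)."""
    template = {}
    for slot in slots:                 # repeat steps 502-504 until done
        template[slot] = sense_scene(slot)   # sense 502 / fill in 503
    return template                    # finish 506

# Toy capture function standing in for the appliance's sensing hardware.
filled = fill_template(["front", "rear"], lambda s: "clip-of-" + s)
```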

FIG. 6 is a flow diagram illustrating a method for optionally tagging audio and/or video representation with information contained in a meta-data structure. Upon sensing 601 a scene, the digital representation of the sensed scene is tagged 602 with meta-data. Meta-data comprises time, media appliance location (such as provided by GPS module 112), media appliance orientation and/or media appliance acceleration (such as provided by acceleration detector 113), multi-lingual features (allowing for translation, subtitles, voice-over, etc.), cues to a theater automation system (such as instructions for house lights to go up, half-way up, or down, or instructions to open or close curtains, etc.), instructions for allowing or disallowing content (such as trailers or promotional clips) to play next to other similar content, information indicating suitability of content for different audiences such as children, information indicating any promotional offers, products and/or services (such as advertisements, product catalogs and/or coupons for products and/or services), information allowing for organizing and/or managing meta-data available to advertisers and/or service providers, and/or other information describing, identifying and/or relating to content. Tagging may be done per scene, per frame, per audio and/or video stream (e.g. when multiple streams are present), or per other defined segment of audio and/or video. For example, a video scene is tagged with meta-data comprising a field identifying the language used in the video scene. As another example, a video stream is tagged with meta-data comprising a field indicating a warning against viewing by children.
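
A per-scene tagging step might look like the following sketch, which wraps the scene's digital representation with a meta-data dictionary covering a few of the fields listed above; the keys and function name are illustrative assumptions.

```python
import time

def tag_scene(scene_bytes, location, orientation, language):
    """Tag a sensed scene's digital representation with meta-data (FIG. 6)."""
    return {
        "data": scene_bytes,
        "meta": {
            "time": time.time(),         # capture time
            "gps": location,             # e.g. from GPS module 112
            "orientation": orientation,  # e.g. from acceleration detector 113
            "language": language,        # multi-lingual feature support
        },
    }

tagged = tag_scene(b"frame", (37.77, -122.42), "north", "en")
```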

In one embodiment, media appliance 100 is a member of a distributed group of media appliances 100, for example in a distributed network of media appliances 100 and/or in a peer-to-peer configuration of media appliances 100. A media appliance 100 dynamically joins and/or leaves a distributed group of media appliances 100, in parallel and/or serially with other media appliances 100. Alternatively, media appliance 100 initiates a distributed group of media appliances 100, allowing for other media appliances 100 to dynamically join and/or leave the group. In one embodiment, the group of media appliances 100 collaborates to cover an event, such as a sporting event, a public political event (e.g. a rally), a family event (e.g. a wedding), or other event. Media appliances 100 tag sensed audio and/or video data as described above (e.g. with GPS information, time stamps, DRM meta-data, or other information previously described), allowing reconstruction of the covered event from the audio and/or video data collected by the distributed media appliances 100. Memory 110 stores instructions and/or data for initiating, joining, leaving and/or querying the status of or information about such a distributed group of media appliances 100.
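
The dynamic initiate/join/leave/query behavior can be modeled with a toy group class; the tokens and method names are invented for illustration, and a real distributed implementation would need the networking and membership protocol this sketch omits.

```python
class ApplianceGroup:
    """Toy model of a distributed group of media appliances.

    Each appliance is identified by a token (number, string, etc.),
    as the passage above describes.
    """

    def __init__(self, initiator_token):
        # Initiating an appliance creates the group with itself as member.
        self.members = {initiator_token}

    def join(self, token):
        self.members.add(token)      # dynamically join the group

    def leave(self, token):
        self.members.discard(token)  # dynamically leave the group

    def status(self):
        return sorted(self.members)  # query current membership

group = ApplianceGroup("appliance-1")
group.join("appliance-2")
group.leave("appliance-1")
```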

The foregoing described embodiments of the invention are provided as illustrations and descriptions. They are not intended to limit the invention to the precise form described. In particular, it is contemplated that the functional implementation of the invention described herein may be implemented equivalently in hardware, software, firmware, and/or other available functional components or building blocks, and that networks may be wired, wireless, or a combination of wired and wireless. Other variations and embodiments are possible in light of the above teachings, and it is thus intended that the scope of the invention not be limited by this Detailed Description, but rather by the claims following.

Claims (2)

1. A method of displaying a cell phone user interface for a mobile effects studio including a distributed mobile group of collaboratively synchronized cellular telephones, comprising:

providing a view in a cell phone appliance for selecting one or more video data; and

receiving a selection designating one of the one or more video data for inclusion into a news program or a sports program;

wherein the one or more video data originate at one or more media appliances;

wherein the video data is provided to a distributed group of network-extensible media appliances, such video data being provided using programmable digital effects processing via real-time signal processing to convert a communication format of the provided video data, wherein the converted data is associated with one or more tag for synchronizing video production or rendering thereof with one or more other video data provided by one or more other appliance in the group, such synchronization automatically enabling seamless collaboration or dynamic integration by the group appliances using one or more tag to produce a media project or reconstruct a covered event, whereby such distributed group serves effectively as a mobile effects studio for digitally processing synchronized video data in the group in a network extensible manner, wherein the mobile effects studio comprises a distributed mobile group of cellular telephones that are synchronized in real-time using said one or more tag for collaboration or integration, such that the mobile effects studio automatically enables extensible reconfiguration of on-appliance digital effects among such cellular telephones belonging to the mobile effects studio.

2. Apparatus for displaying a cell phone user interface for a mobile effects studio including a distributed mobile group of collaboratively synchronized cellular telephones, comprising:

display means for providing a view in a cell phone appliance for selecting one or more video data; and

user input means for receiving a selection designating one of the one or more video data for inclusion into a news program or a sports program;

wherein the one or more video data originate at one or more media appliances; wherein the video data is provided to a distributed group of network-extensible media appliances, such video data being provided using programmable digital effects processing via real-time signal processing to convert a communication format of the provided video data, wherein the converted data is associated with one or more tag for synchronizing video production or rendering thereof with one or more other video data provided by one or more other appliance in the group, such synchronization automatically enabling seamless collaboration or dynamic integration by the group appliances using one or more tag to produce a media project or reconstruct a covered event, whereby such distributed group serves effectively as a mobile effects studio for digitally processing synchronized video data in the group in a network extensible manner, wherein the mobile effects studio comprises a distributed mobile group of cellular telephones that are synchronized in real-time using said one or more tag for collaboration or integration, such that the mobile effects studio automatically enables extensible reconfiguration of on-appliance digital effects among such cellular telephones belonging to the mobile effects studio.
