Object Recognition

Recognizes a known set of corporate logos within target media.

The Object Recognition API matches logos in a media file that you provide against a database of corporate logos.

The public dataset contains a library of corporate logos, which you can match against. When you submit a media file to the API, Haven OnDemand searches your media for sections that match the logos in the database.

The API returns the name of the logo in the database that was detected (based on the stock ticker for the company that owns the logo), and the location of the object in your media. The location is given as the coordinates of the corners of a box that surrounds the matching object.

Get the Results

The asynchronous mode returns a job-id, which you can then use to extract your results. There are two methods for this:

Use /1/job/status/<job-id> to get the status of the job, including results if the job is finished.

Use /1/job/result/<job-id>, which waits until the job has finished and then returns the result.

Note: Because /result has to wait for the job to finish before it can return a response, using it for longer operations such as processing a large video file can result in an HTTP request timeout response. The /result method returns a response either when the result is available, or after 120 seconds, whichever is sooner. If the job is not complete after 120 seconds, the /result method returns a code 7010 (job result request timeout) response. This means that your asynchronous job is still in progress. To avoid the timeout, use /status instead.
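The polling approach recommended in the note can be sketched as follows. This is a minimal illustration, not the official client: only the /1/job/status/<job-id> path comes from this page. The base URL, the "status" key, and the "queued" and "in progress" values are assumptions, and the HTTP call is passed in as a hypothetical fetch_status callable so that the loop stays independent of any particular HTTP library.

```python
import time
from typing import Any, Callable, Dict

# Assumed status values for a job that is still running (not confirmed by this page).
PENDING_STATUSES = {"queued", "in progress"}


def status_url(base_url: str, job_id: str) -> str:
    """Build the status URL from the documented /1/job/status/<job-id> path."""
    return base_url.rstrip("/") + f"/1/job/status/{job_id}"


def poll_job_status(
    job_id: str,
    fetch_status: Callable[[str], Dict[str, Any]],
    interval_seconds: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status repeatedly until the job is no longer pending.

    fetch_status is a hypothetical callable that requests the status URL
    and returns the decoded JSON response as a dict.
    """
    for attempt in range(max_attempts):
        response = fetch_status(job_id)
        if response.get("status") not in PENDING_STATUSES:
            return response
        # Wait between attempts, but not after the last one.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"job {job_id} still pending after {max_attempts} attempts")


if __name__ == "__main__":
    # Stubbed fetch function: reports "queued" twice, then "finished".
    replies = iter([{"status": "queued"}, {"status": "queued"}, {"status": "finished"}])
    final = poll_job_status("example-job", lambda _id: next(replies), sleep=lambda s: None)
    print(final["status"])
```

Because each status request returns immediately, a loop of this kind is not subject to the 120-second limit that applies to /result.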

Results

The results contain position coordinates for recognized objects. These coordinates are relative to the upper-left corner of the frame and are expressed in pixels, unless the source is a PDF, in which case the units are points.

When the input is a video, the results contain time information. The time information is in seconds (milliseconds appear as a decimal, for example 3.5).
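As a rough illustration of working with these values, the sketch below turns a list of corner coordinates into a box measured from the upper-left corner, and formats a time offset given in seconds. The "x" and "y" key names are assumptions, since this page does not show the shape of a coordinate entry, and both helper functions are hypothetical.

```python
from typing import Dict, List


def bounding_box(corners: List[Dict[str, float]]) -> Dict[str, float]:
    """Return the axis-aligned box enclosing the given corner points.

    Each corner is assumed to be a dict with "x" and "y" keys, measured
    from the upper-left corner of the frame (pixels, or points for a PDF).
    """
    xs = [corner["x"] for corner in corners]
    ys = [corner["y"] for corner in corners]
    left, top = min(xs), min(ys)
    return {"left": left, "top": top, "width": max(xs) - left, "height": max(ys) - top}


def format_offset(seconds: float) -> str:
    """Format a time offset in seconds as MM:SS.mmm."""
    minutes, remainder = divmod(seconds, 60)
    return f"{int(minutes):02d}:{remainder:06.3f}"


if __name__ == "__main__":
    corners = [
        {"x": 10, "y": 20},
        {"x": 110, "y": 20},
        {"x": 110, "y": 70},
        {"x": 10, "y": 70},
    ]
    print(bounding_box(corners))
    print(format_offset(3.5))
```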

Asynchronous Use

Additional requests are required to get the result if this API is invoked asynchronously.

You can use /1/job/status/<job-id> to get the status of the job, including results if the job is finished.

You can also use /1/job/result/<job-id>, which waits until the job has finished and then returns the result.

Model

This is an abstract definition of the response that describes each of the properties that might be returned.

Object Recognition Response {

source_information (Source_information, optional)

Metadata information about a media file

items (array[Items])

An array of recognized objects.

}

Object Recognition Response:Source_information {

mime_type (string)

MIME type of the document.

video_information (Video_information, optional)

Information about the video track if one is present.

audio_information (Audio_information, optional)

Information about the audio track if one is present.

image_information (Image_information, optional)

Information about the image track if one is present.

document_information (Document_information, optional)

Information about the document track. This is only available if the source media is a multi-page image, presentation file, or PDF.

}

Object Recognition Response:Source_information:Video_information {

width (integer)

The width of the video in pixels.

height (integer)

The height of the video in pixels.

codec (string)

The algorithm used to encode the video.

pixel_aspect_ratio (string)

The aspect ratio of pixels in the video. For example, if the video is made up of square pixels this value is 1:1.

}

Object Recognition Response:Source_information:Audio_information {

codec (string)

The algorithm used to encode the audio.

sample_rate (integer)

The frequency at which the audio was sampled.

channels (integer)

The number of channels present in the audio. For example, for stereo this value is 2.

}

Object Recognition Response:Source_information:Image_information {

width (integer)

The width of the image in pixels.

height (integer)

The height of the image in pixels.

}

Object Recognition Response:Source_information:Document_information {

page_count (integer)

The estimated number of pages in the document.

}
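To show how the optional Source_information properties might be handled, here is a minimal sketch that summarizes whichever tracks are present. The property names are the ones listed in the model above. The describe_source helper and the sample values are hypothetical, and the response is assumed to have already been decoded from JSON into a dict.

```python
from typing import Any, Dict


def describe_source(source_information: Dict[str, Any]) -> str:
    """Summarize a Source_information object, skipping absent optional parts."""
    parts = [source_information["mime_type"]]

    video = source_information.get("video_information")
    if video:
        parts.append(f"video {video['width']}x{video['height']} ({video['codec']})")

    audio = source_information.get("audio_information")
    if audio:
        parts.append(
            f"audio {audio['codec']}, sample rate {audio['sample_rate']}, "
            f"{audio['channels']} channel(s)"
        )

    image = source_information.get("image_information")
    if image:
        parts.append(f"image {image['width']}x{image['height']}")

    document = source_information.get("document_information")
    if document:
        parts.append(f"{document['page_count']} page(s)")

    return "; ".join(parts)


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = {
        "mime_type": "image/png",
        "image_information": {"width": 640, "height": 480},
    }
    print(describe_source(sample))
```

Using .get() for every optional property means that a missing track is skipped rather than raising a KeyError.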

Object Recognition Response:Items {

start_time_offset (number, optional)

Time from the start of the video to where the object first appears. This value is expressed as a non-integer number. This is only available if the source media is a video.

end_time_offset (number, optional)

Time from the start of the video to where the object is no longer visible. This value is expressed as a non-integer number. This is only available if the source media is a video.

time_offset (number, optional)

Time offset from the start of the video to where the object was detected. This value is expressed as a non-integer number. This is only available if the source media is a video.

page (integer, optional)

The page (1-based index) in the document where the object is located. This is only available if the source media is a multi-page image, presentation file, or PDF.

source_region_coordinates (array[Source_region_coordinates])

An array of coordinates (relative to the top left of the source) that represents the bounding region where the object can be found. If the source media is a video, this region is where the object was found at time_offset in the input source.
