Abstract:

An apparatus for decoding an audio signal and method thereof are
disclosed. The present invention includes receiving the audio signal and
spatial information, identifying a type of modified spatial information,
generating the modified spatial information using the spatial
information, and decoding the audio signal using the modified spatial
information, wherein the type of the modified spatial information
includes at least one of partial spatial information, combined spatial
information and expanded spatial information. Accordingly, an audio
signal can be decoded into a configuration different from a configuration
decided by an encoding apparatus. Even if the number of speakers is
smaller or greater than the number of channels of the multi-channel signal
before downmixing, output channels equal in number to the speakers can be
generated from a downmix audio signal.

Claims:

1. A method of decoding an audio signal, comprising: receiving spatial
information including at least one spatial parameter and spatial filter
information including at least one filter parameter; generating combined
spatial information having a surround effect by combining the spatial
parameter and the filter parameter; and converting the audio signal to a
virtual surround signal using the combined spatial information.

2. The method of claim 1, wherein the combined spatial parameter is
generated by inputting the spatial parameter and the filter parameter to
a conversion formula.

3. The method of claim 2, wherein the combined spatial parameter includes
a filter coefficient.

4. The method of claim 2, further comprising deciding the conversion
formula according to tree configuration information for the audio signal.

5. The method of claim 2, further comprising deciding the conversion
formula according to output channel information.

6. The method of claim 1, wherein the spatial filter information is a
sound path.

7. An apparatus for decoding an audio signal, comprising: a modified
spatial information generating unit generating combined spatial
information having a surround effect by combining a spatial parameter and
a filter parameter; and an output channel generating unit converting the
audio signal to a virtual surround signal using the combined spatial
information, wherein the spatial parameter is included in spatial
information, the filter parameter is included in spatial filter
information, and the spatial information and the spatial filter
information are received.

Description:

TECHNICAL FIELD

[0001]The present invention relates to audio signal processing, and more
particularly, to an apparatus for decoding an audio signal and method
thereof. Although the present invention is suitable for a wide scope of
applications, it is particularly suitable for decoding audio signals.

BACKGROUND ART

[0002]Generally, when an encoder encodes a multi-channel audio signal, the
multi-channel audio signal is downmixed into two channels or one channel
to generate a downmix audio signal, and spatial information is extracted
from the multi-channel audio signal. The spatial information is the
information usable in upmixing the downmix audio signal into the
multi-channel audio signal.

[0003]Meanwhile, the encoder downmixes a multi-channel audio signal
according to a predetermined tree configuration. In this case, the
predetermined tree configuration can be the structure(s) agreed between
an audio signal decoder and an audio signal encoder. In particular, if
identification information indicating a type of one of the predetermined
tree configurations is present, the decoder is able to know a structure
of the audio signal having been upmixed, e.g., a number of channels, a
position of each of the channels, etc.

[0004]Thus, if an encoder downmixes a multi-channel audio signal according
to a predetermined tree configuration, the spatial information extracted in
this process depends on that structure as well. So, if a decoder upmixes
the downmix audio signal using the structure-dependent spatial
information, a multi-channel audio signal according to the structure is
generated. Namely, if the decoder uses the spatial information generated
by the encoder as it is, upmixing is performed only according to the
structure agreed between the encoder and the decoder. The decoder is
therefore unable to generate an output-channel audio signal that does not
follow the agreed structure. For instance, it cannot upmix a signal into
an audio signal whose number of channels is smaller or greater than the
number of channels decided according to the agreed structure.

DISCLOSURE OF THE INVENTION

[0005]Accordingly, the present invention is directed to an apparatus for
decoding an audio signal and method thereof that substantially obviate
one or more of the problems due to limitations and disadvantages of the
related art.

[0006]An object of the present invention is to provide an apparatus for
decoding an audio signal and method thereof, by which the audio signal
can be decoded to have a structure different from that decided by an
encoder.

[0007]Another object of the present invention is to provide an apparatus
for decoding an audio signal and method thereof, by which the audio
signal can be decoded using spatial information generated from modifying
former spatial information generated from encoding.

[0008]Additional features and advantages of the invention will be set
forth in the description which follows, and in part will be apparent from
the description, or may be learned by practice of the invention. The
objectives and other advantages of the invention will be realized and
attained by the structure particularly pointed out in the written
description and claims thereof as well as the appended drawings.

[0009]To achieve these and other advantages and in accordance with the
purpose of the present invention, as embodied and broadly described, a
method of decoding an audio signal according to the present invention
includes receiving the audio signal and spatial information, identifying
a type of modified spatial information, generating the modified spatial
information using the spatial information, and decoding the audio signal
using the modified spatial information, wherein the type of the modified
spatial information includes at least one of partial spatial information,
combined spatial information and expanded spatial information.

[0010]To further achieve these and other advantages and in accordance with
the purpose of the present invention, a method of decoding an audio
signal includes receiving spatial information, generating combined
spatial information using the spatial information, and decoding the audio
signal using the combined spatial information, wherein the combined
spatial information is generated by combining spatial parameters included
in the spatial information.

[0011]To further achieve these and other advantages and in accordance with
the purpose of the present invention, a method of decoding an audio
signal includes receiving spatial information including at least one
spatial parameter and spatial filter information including at least one
filter parameter, generating combined spatial information having a
surround effect by combining the spatial parameter and the filter
parameter, and converting the audio signal to a virtual surround signal
using the combined spatial information.

[0012]To further achieve these and other advantages and in accordance with
the purpose of the present invention, a method of decoding an audio
signal includes receiving the audio signal, receiving spatial information
including tree configuration information and spatial parameters,
generating modified spatial information by adding extended spatial
information to the spatial information, and upmixing the audio signal
using the modified spatial information, wherein the upmixing includes
converting the audio signal to a primary upmixed audio signal based on
the spatial information and converting the primary upmixed audio signal
to a secondary upmixed audio signal based on the extended spatial
information.

[0013]It is to be understood that both the foregoing general description
and the following detailed description are exemplary and explanatory and
are intended to provide further explanation of the invention as claimed.

BRIEF DESCRIPTION OF THE DRAWINGS

[0014]The accompanying drawings, which are included to provide a further
understanding of the invention and are incorporated in and constitute a
part of this specification, illustrate embodiments of the invention and
together with the description serve to explain the principles of the
invention.

[0015]In the drawings:

[0016]FIG. 1 is a block diagram of an audio signal encoding apparatus and
an audio signal decoding apparatus according to the present invention;

[0017]FIG. 2 is a schematic diagram of an example of applying partial
spatial information;

[0018]FIG. 3 is a schematic diagram of another example of applying partial
spatial information;

[0019]FIG. 4 is a schematic diagram of a further example of applying
partial spatial information;

[0020]FIG. 5 is a schematic diagram of an example of applying combined
spatial information;

[0021]FIG. 6 is a schematic diagram of another example of applying
combined spatial information;

[0022]FIG. 7 is a diagram of sound paths from speakers to a listener, in
which positions of the speakers are shown;

[0023]FIG. 8 is a diagram to explain a signal outputted from each speaker
position for a surround effect;

[0024]FIG. 9 is a conceptual diagram to explain a method of generating a
3-channel signal using a 5-channel signal;

[0025]FIG. 10 is a diagram of an example of configuring extended channels
based on extended channel configuration information;

[0026]FIG. 11 is a diagram to explain a configuration of the extended
channels shown in FIG. 10 and the relation with extended spatial
parameter;

[0027]FIG. 12 is a diagram of positions of a multi-channel audio signal of
5.1-channels and an output channel audio signal of 6.1-channels;

[0028]FIG. 13 is a diagram to explain the relation between a virtual sound
source position and a level difference between two channels;

[0029]FIG. 14 is a diagram to explain levels of two rear channels and a
level of a rear center channel;

[0030]FIG. 15 is a diagram to explain a position of a multi-channel audio
signal of 5.1-channels and a position of an output channel audio signal
of 7.1-channels;

[0031]FIG. 16 is a diagram to explain levels of two left channels and a
level of a left front side channel (Lfs); and

[0032]FIG. 17 is a diagram to explain levels of three front channels and a
level of a left front side channel (Lfs).

BEST MODE FOR CARRYING OUT THE INVENTION

[0033]Reference will now be made in detail to the preferred embodiments of
the present invention, examples of which are illustrated in the
accompanying drawings.

[0034]Terms in general and widespread current use are selected as the
terms used in the present invention wherever possible. Some terms are
arbitrarily selected by the applicant for special cases, and their
meanings are explained in detail in the description of the preferred
embodiments of the present invention. Hence, the present invention
should be understood not through the names of the terms but through
their meanings.

[0035]First of all, the present invention generates modified spatial
information using spatial information and then decodes an audio signal
using the generated modified spatial information. In this case, the
spatial information is spatial information extracted in the course of
downmixing according to a predetermined tree configuration and the
modified spatial information is spatial information newly generated using
spatial information.

[0036]The present invention will be explained in detail with reference to
FIG. 1 as follows.

[0037]FIG. 1 is a block diagram of an audio signal encoding apparatus and
an audio signal decoding apparatus according to an embodiment of the
present invention.

[0041]Meanwhile, the spatial information can be the information extracted
in the course of downmixing the multi-channel audio signal IN_M according
to a predetermined tree configuration. In this case, the tree
configuration may correspond to tree configuration(s) agreed between the
audio signal decoding and encoding apparatuses, which is not limited by
the present invention.

[0042]And, the spatial information is able to include tree configuration
information, an indicator, spatial parameters and the like. The tree
configuration information is the information for a tree configuration
type. So, a number of multi-channels, a per-channel downmixing sequence
and the like vary according to the tree configuration type. The indicator
is the information indicating whether extended spatial information is
present or not, etc. And, the spatial parameters can include channel
level difference (hereinafter abbreviated CLD) in the course of
downmixing at least two channels into at most two channels, inter-channel
correlation or coherence (hereinafter abbreviated ICC), channel
prediction coefficients (hereinafter abbreviated CPC) and the like.
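
As an illustration of the two most frequently used spatial parameters, the following minimal Python sketch computes a CLD and an ICC for one pair of channels. The CLD uses the same ratio-of-powers form as Formula 1 later in this description. The ICC is computed here as a normalized zero-lag cross-correlation, which is an assumption made for illustration; the description does not give an ICC formula. The function names and the toy signals are hypothetical.

```python
import math

def power(x):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)

def cld_db(x1, x2):
    """Channel level difference in dB: 10*log10(P1/P2)."""
    return 10.0 * math.log10(power(x1) / power(x2))

def icc(x1, x2):
    """Normalized zero-lag cross-correlation (assumed ICC definition)."""
    cross = sum(a * b for a, b in zip(x1, x2)) / len(x1)
    return cross / math.sqrt(power(x1) * power(x2))

# Toy signals: the second channel is the first at half amplitude,
# so the two channels are fully correlated and differ only in level.
left = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
surround = [0.5 * v for v in left]

print(round(cld_db(left, surround), 2))  # about 6.02 dB (10*log10(4))
print(round(icc(left, surround), 2))     # 1.0 for fully correlated channels
```

A real encoder would typically evaluate such parameters per time slot and per frequency band rather than over a whole signal; that refinement is omitted here.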

[0043]Meanwhile, the spatial information extracting unit 120 is able to
further extract extended spatial information as well as the spatial
information. In this case, the extended spatial information is the
information needed to additionally extend the downmix audio signal d
having been upmixed with the spatial parameter. And, the extended spatial
information can include extended channel configuration information and
extended spatial parameters. The extended spatial information, which
shall be explained later, is not limited to the one extracted by the
spatial information extracting unit 120.

[0044]Besides, the encoding apparatus 100 is able to further include a
core codec encoding unit (not shown in the drawing) generating a
downmixed audio bitstream by encoding the downmix audio signal d, a
spatial information encoding unit (not shown in the drawing) generating a
spatial information bitstream by encoding the spatial information s, and
a multiplexing unit (not shown in the drawing) generating a bitstream of
an audio signal by multiplexing the downmixed audio bitstream and the
spatial information bitstream, on which the present invention does not
put limitation.

[0045]And, the decoding apparatus 200 is able to further include a
demultiplexing unit (not shown in the drawing) separating the bitstream
of the audio signal into a downmixed audio bitstream and a spatial
information bitstream, a core codec decoding unit (not shown in the
drawing) decoding the downmixed audio bitstream, and a spatial
information decoding unit (not shown in the drawing) decoding the spatial
information bitstream, on which the present invention does not put
limitation.

[0046]The modified spatial information generating unit 220 of the decoding
apparatus 200 identifies a type of the modified spatial information using
the spatial information and then generates modified spatial information
s' of a type that is identified based on the spatial information. In this
case, the spatial information can be the spatial information s conveyed
from the encoding apparatus 100. And, the modified spatial information is
the information that is newly generated using the spatial information.

[0047]Meanwhile, there can exist various types of the modified spatial
information. And, the various types of the modified spatial information
can include at least one of a) partial spatial information, b) combined
spatial information, and c) expanded spatial information, on which no
limitation is put by the present invention.

[0048]The partial spatial information includes spatial parameters in part,
the combined spatial information is generated from combining spatial
parameters, and the expanded spatial information is generated using the
spatial information and the extended spatial information.

[0049]The modified spatial information generating unit 220 generates the
modified spatial information in a manner that can be varied according to
the type of the modified spatial information. And, a method of generating
modified spatial information per a type of the modified spatial
information will be explained in detail later.

[0050]Meanwhile, a reference for deciding the type of the modified spatial
information may correspond to tree configuration information in spatial
information, indicator in spatial information, output channel information
or the like. The tree configuration information and the indicator can be
included in the spatial information s from the encoding apparatus. The
output channel information is the information for speakers connected to
the decoding apparatus 200 and can include a number of output channels,
position information for each output channel and the like.

[0051]The output channel information can be inputted in advance by a
manufacturer or inputted by a user.

[0052]A method of deciding a type of modified spatial information using
this information will be explained in detail later.

[0054]The spatial filter information 230 is the information for sound
paths and is provided to the modified spatial information generating unit
220. In case that the modified spatial information generating unit 220
generates combined spatial information having a surround effect, the
spatial filter information can be used.

[0055]Hereinafter, a method of decoding an audio signal by generating
modified spatial information per a type of the modified spatial
information is explained in order of (1) Partial spatial information, (2)
Combined spatial information, and (3) Expanded spatial information as
follows.

[0056](1) Partial Spatial Information

[0057]Since spatial parameters are calculated in the course of downmixing
a multi-channel audio signal according to a predetermined tree
configuration, an original multi-channel audio signal before downmixing
can be reconstructed if a downmix audio signal is decoded using the
spatial parameters intact. To make a channel number N of an output channel
audio signal smaller than a channel number M of a multi-channel audio
signal, the downmix audio signal can be decoded by applying only some of
the spatial parameters.

[0058]This method can be varied according to a sequence and method of
downmixing a multi-channel audio signal in an encoding apparatus, i.e., a
type of a tree configuration. The tree configuration type can be
determined using tree configuration information of spatial information.
This method can also be varied according to a number of output channels,
which can be determined using output channel information.

[0059]Hereinafter, in case that a channel number of an output channel
audio signal is smaller than a channel number of a multi-channel audio
signal, a method of decoding an audio signal by applying partial spatial
information including spatial parameters in part is explained by taking
various tree configurations as examples in the following description.

[0061]FIG. 2 is a schematic diagram of an example of applying partial
spatial information.

[0062]Referring to a left part of FIG. 2, a sequence of downmixing a
multi-channel audio signal having a channel number 6 (left front channel
L, left surround channel Ls, center channel C, low frequency channel
LFE, right front channel R, right surround channel Rs) into stereo
downmixed channels Lo and Ro and the relation between the
multi-channel audio signal and spatial parameters are shown.

[0063]First of all, downmixing between the left channel L and the left
surround channel Ls, downmixing between the center channel C and the
low frequency channel LFE and downmixing between the right channel R and
the right surround channel Rs are carried out. In this primary
downmixing process, a left total channel Lt, a center total channel
Ct and a right total channel Rt are generated. And, spatial
parameters calculated in this primary downmixing process include
CLD2 (ICC2 inclusive), CLD1 (ICC1 inclusive),
CLD0 (ICC0 inclusive), etc.

[0064]In a secondary process following the primary downmixing process, the
left total channel Lt, the center total channel Ct and the
right total channel Rt are downmixed together to generate a left
channel Lo and a right channel Ro. And, spatial parameters
calculated in this secondary downmixing process are able to include
CLDTTT, CPCTTT, ICCTTT, etc.

[0065]In other words, a multi-channel audio signal of total six channels
is downmixed in the above sequential manner to generate the stereo
downmixed channels Lo and Ro.

[0066]If the spatial parameters (CLD2, CLD1, CLD0,
CLDTTT, etc.) calculated in the above sequential manner are used as
they are, they are upmixed in sequence reverse to the order for the
downmixing to generate the multi-channel audio signal having the channel
number of 6 (left front channel L, left surround channel Ls, center
channel C, low frequency channel LFE, right front channel R, right
surround channel Rs).

[0067]Referring to a right part of FIG. 2, in case that partial spatial
information corresponds to CLDTTT among spatial parameters
(CLD2, CLD1, CLD0, CLDTTT, etc.), the downmix is upmixed into
the left total channel Lt, the center total channel Ct and the
right total channel Rt. If the left total channel Lt and the
right total channel Rt are selected as an output channel audio
signal, an output channel audio signal of two channels Lt and Rt
can be generated. If the left total channel Lt, the center total
channel Ct and the right total channel Rt are selected as an output
channel audio signal, an output channel audio signal of three channels
Lt, Ct and Rt can be generated.

[0068]After upmixing has additionally been performed using CLD1, if
the left total channel Lt, the right total channel Rt, the
center channel C and the low frequency channel LFE are selected, an
output channel audio signal of four channels (Lt, Rt, C and LFE) can
be generated.

[0070]FIG. 3 is a schematic diagram of another example of applying partial
spatial information.

[0071]Referring to a left part of FIG. 3, a sequence of downmixing a
multi-channel audio signal having a channel number 6 (left front channel
L, left surround channel Ls, center channel C, low frequency channel
LFE, right front channel R, right surround channel Rs) into a mono
downmix audio signal M and the relation between the multi-channel audio
signal and spatial parameters are shown.

[0072]First of all, like the first example, downmixing between the left
channel L and the left surround channel Ls, downmixing between the
center channel C and the low frequency channel LFE and downmixing between
the right channel R and the right surround channel Rs are carried
out.

[0073]In this primary downmixing process, a left total channel Lt, a
center total channel Ct and a right total channel Rt are
generated. And, spatial parameters calculated in this primary downmixing
process include CLD3 (ICC3 inclusive), CLD4 (ICC4
inclusive), CLD5 (ICC5 inclusive), etc. (in this case, CLDx and
ICCx are distinct from the CLDx and ICCx of the first example).

[0074]In a secondary process following the primary downmixing process, the
left total channel Lt and the center total channel Ct are
downmixed together to generate a left center channel LC, and the center
total channel Ct and the right total channel Rt are downmixed
together to generate a right center channel RC. And, spatial parameters
calculated in this secondary downmixing process are able to include
CLD2 (ICC2 inclusive), CLD1 (ICC1 inclusive), etc.

[0075]Subsequently, in a tertiary downmixing process, the left center
channel LC and the right center channel RC are downmixed to generate
a mono downmixed signal M. And, spatial parameters calculated in the
tertiary downmixing process include CLD0 (ICC0 inclusive), etc.

[0076]Referring to a right part of FIG. 3, in case that partial spatial
information corresponds to CLD0 among spatial parameters (CLD3,
CLD4, CLD5, CLD1, CLD2, CLD0, etc.), a left
center channel LC and a right center channel RC are generated. If the
left center channel LC and the right center channel RC are selected as an
output channel audio signal, an output channel audio signal of two
channels LC and RC can be generated.

[0077]Meanwhile, if partial spatial information corresponds to CLD0,
CLD1 and CLD2 among spatial parameters (CLD3, CLD4,
CLD5, CLD1, CLD2, CLD0, etc.), a left total channel
Lt, a center total channel Ct and a right total channel Rt
are generated.

[0078]If the left total channel Lt and the right total channel
Rt are selected as an output channel audio signal, an output channel
audio signal of two channels Lt and Rt can be generated. If the left
total channel Lt, the center total channel Ct and the right total
channel Rt are selected as an output channel audio signal, an output
channel audio signal of three channels Lt, Ct and Rt can be generated.

[0079]In case that partial spatial information additionally includes
CLD4, after upmixing has been performed up to a center channel C and a
low frequency channel LFE, if the left total channel Lt, the right
total channel Rt, the center channel C and the low frequency channel
LFE are selected as an output channel audio signal, an output channel
audio signal of four channels (Lt, Rt, C and LFE) can be generated.

[0081]FIG. 4 is a schematic diagram of a further example of applying
partial spatial information.

[0082]Referring to a left part of FIG. 4, a sequence of downmixing a
multi-channel audio signal having a channel number 6 (left front channel
L, left surround channel Ls, center channel C, low frequency channel
LFE, right front channel R, right surround channel Rs) into a mono
downmix audio signal M and the relation between the multi-channel audio
signal and spatial parameters are shown.

[0083]First of all, like the first or second example, downmixing between
the left channel L and the left surround channel Ls, downmixing
between the center channel C and the low frequency channel LFE and
downmixing between the right channel R and the right surround channel
Rs are carried out.

[0084]In this primary downmixing process, a left total channel Lt, a
center total channel Ct and a right total channel Rt are
generated. And, spatial parameters calculated in this primary downmixing
process include CLD1 (ICC1 inclusive), CLD2 (ICC2
inclusive), CLD3 (ICC3 inclusive), etc. (in this case,
CLDx and ICCx are discriminated from the former CLDx and
ICCx in the first or second example).

[0085]In a secondary process following the primary downmixing process, the
left total channel Lt, the center total channel Ct and the
right total channel Rt are downmixed together to generate a left
center channel LC and a right channel R. And, a spatial parameter
CLDTTT (ICCTTT inclusive) is calculated.

[0086]Subsequently, in a tertiary downmixing process, the left center
channel LC and the right channel R are downmixed to generate a mono
downmixed signal M. And, a spatial parameter CLD0 (ICC0
inclusive) is calculated.

[0087]Referring to a right part of FIG. 4, in case that partial spatial
information corresponds to CLD0 and CLDTTT among spatial
parameters (CLD1, CLD2, CLD3, CLDTTT, CLD0,
etc.), a left total channel Lt, a center total channel Ct and a
right total channel Rt are generated.

[0088]If the left total channel Lt and the right total channel
Rt are selected as an output channel audio signal, an output channel
audio signal of two channels Lt and Rt can be generated.

[0089]If the left total channel Lt, the center total channel Ct
and the right total channel Rt are selected as an output channel
audio signal, an output channel audio signal of three channels Lt, Ct
and Rt can be generated.

[0090]In case that partial spatial information additionally includes
CLD2, after upmixing has been performed up to a center channel C and
a low frequency channel LFE, if the left total channel Lt, the right
total channel Rt, the center channel C and the low frequency channel
LFE are selected as an output channel audio signal, an output channel
audio signal of four channels (Lt, Rt, C and LFE) can be generated.

[0091]In the above description, the process for generating the output
channel audio signal by applying only some of the spatial parameters has
been explained by taking the three kinds of tree configurations as
examples. Besides, combined spatial information or expanded spatial
information can also be applied in addition to the partial spatial
information. Thus, the process for applying the modified spatial
information to the audio signal can be handled hierarchically or
collectively and synthetically.
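
The partial-information examples can be sketched as a table-driven walk over the upmix boxes. The following Python sketch only tracks which channel names become available when a subset of the spatial parameters is applied; it does no signal processing. The trees for FIG. 2 and FIG. 4 are encoded from the text, but the pairing of CLD2/CLD0 (FIG. 2) and CLD1/CLD3 (FIG. 4) with the left and right channel pairs is an assumption based on listing order, and the function and variable names are hypothetical.

```python
# Each box: (parameter name, input channels, output channels).
# Boxes are listed in upmix order, i.e. the reverse of the downmix order.
FIG2_TREE = [
    ("CLD_TTT", ("Lo", "Ro"), ("Lt", "Ct", "Rt")),
    ("CLD2", ("Lt",), ("L", "Ls")),      # assumed pairing
    ("CLD1", ("Ct",), ("C", "LFE")),     # stated in the text
    ("CLD0", ("Rt",), ("R", "Rs")),      # assumed pairing
]

FIG4_TREE = [
    ("CLD0", ("M",), ("LC", "R")),       # "R" here is the intermediate right channel
    ("CLD_TTT", ("LC", "R"), ("Lt", "Ct", "Rt")),
    ("CLD1", ("Lt",), ("L", "Ls")),      # assumed pairing
    ("CLD2", ("Ct",), ("C", "LFE")),     # stated in the text
    ("CLD3", ("Rt",), ("R", "Rs")),      # assumed pairing
]

def partial_upmix_channels(tree, downmix, applied):
    """Return the channel names available after applying only `applied`.

    A box is expanded only if its parameter is in `applied` and all of its
    input channels are currently available; otherwise it is skipped.
    """
    channels = list(downmix)
    for param, inputs, outputs in tree:
        if param in applied and all(ch in channels for ch in inputs):
            at = channels.index(inputs[0])
            channels = [ch for ch in channels if ch not in inputs]
            channels[at:at] = outputs
    return channels

print(partial_upmix_channels(FIG2_TREE, ["Lo", "Ro"], {"CLD_TTT"}))
# ['Lt', 'Ct', 'Rt']
print(partial_upmix_channels(FIG2_TREE, ["Lo", "Ro"], {"CLD_TTT", "CLD1"}))
# ['Lt', 'C', 'LFE', 'Rt']
print(partial_upmix_channels(FIG4_TREE, ["M"], {"CLD0", "CLD_TTT"}))
# ['Lt', 'Ct', 'Rt']
```

The output channel audio signal is then a selection from the returned list, for example Lt and Rt only for a two-speaker setup.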

[0092](2) Combined Spatial Information

[0093]Since spatial information is calculated in the course of downmixing
a multi-channel audio signal according to a predetermined tree
configuration, an original multi-channel audio signal before downmixing
can be reconstructed if a downmix audio signal is decoded using spatial
parameters of the spatial information as they are. In case that a channel
number M of a multi-channel audio signal is different from a channel
number N of an output channel audio signal, new combined spatial
information is generated by combining spatial information, and the
downmix audio signal can then be upmixed using the generated information.
In particular, combined spatial parameters can be generated by applying
spatial parameters to a conversion formula.

[0094]This method can be varied according to a sequence and method of
downmixing a multi-channel audio signal in an encoding apparatus. The
downmixing sequence and method can be determined using tree
configuration information of spatial information. This method can also be
varied according to a number of output channels. The number of output
channels and the like can be determined using output channel
information.

[0095]Hereinafter, detailed embodiments for a method of modifying spatial
information and embodiments for giving a virtual 3-D effect are explained
in the following description.

[0096](2)-1. General Combined Spatial Information

[0097]A method of generating combined spatial parameters by combining
spatial parameters of spatial information is provided for upmixing
according to a tree configuration different from that in a downmixing
process. So, this method is applicable to all kinds of downmix audio
signals regardless of the tree configuration indicated by the tree
configuration information.

[0098]In case that a multi-channel audio signal is 5.1-channel and a
downmix audio signal is 1-channel (mono channel), a method of generating
an output channel audio signal of two channels is explained with
reference to two kinds of examples as follows.

[0100]FIG. 5 is a schematic diagram of an example of applying combined
spatial information.

[0101]Referring to a left part of FIG. 5, CLD0 to CLD4 and
ICC0 to ICC4 (not shown in the drawing) are spatial
parameters that can be calculated in a process for downmixing a
multi-channel audio signal of 5.1-channels. For instance, among the spatial
parameters, an inter-channel level difference between a left channel
signal L and a right channel signal R is CLD3 and inter-channel
correlation between L and R is ICC3. And, an inter-channel level
difference between a left surround channel Ls and a right surround
channel Rs is CLD2 and inter-channel correlation between
Ls and Rs is ICC2.

[0102]On the other hand, referring to a right part of FIG. 5, if a left
channel signal Lt and a right channel signal Rt are generated
by applying combined spatial parameters CLD_α and ICC_α to a mono
downmix audio signal m, a stereo output channel audio signal Lt and Rt
can be generated directly from the mono channel audio signal m. In this
case, the combined spatial parameters CLD_α and ICC_α can be calculated
by combining the spatial parameters CLD0 to CLD4 and ICC0 to ICC4.

[0103]Hereinafter, a process for calculating CLD.sub.α among
combined spatial parameters by combining CLD0 to CLD4 together
is firstly explained, and a process for calculating ICC.sub.α among
combined spatial parameters by combining CLD0 to CLD4 and
ICC0 to ICC4 is then explained as follows.

[0104](2)-1-1-a. Derivation of CLD.sub.α

[0105]First of all, since CLD.sub.α is a level difference between a
left output signal Lt and a right output signal Rt, a result
from inputting the left output signal Lt and the right output signal
Rt to a definition formula of CLD is shown as follows.

$$\mathrm{CLD}_\alpha = 10\log_{10}\left(\frac{P_{Lt}}{P_{Rt}}\right)$$ [Formula 1]

[0106]where $P_{Lt}$ is a power of Lt and $P_{Rt}$ is a power of Rt.

$$\mathrm{CLD}_\alpha = 10\log_{10}\left(\frac{P_{Lt}+a}{P_{Rt}+a}\right)$$ [Formula 2]

[0107]where $P_{Lt}$ is a power of Lt, $P_{Rt}$ is a power of Rt, and `a`
is a very small constant.

[0108]Hence, $\mathrm{CLD}_\alpha$ is defined as Formula 1 or Formula 2.
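The two definitions can be sketched in a few lines of Python. This is only an illustration: the function name `cld_db` is hypothetical, and the remark that the constant `a` guards against a zero denominator or a logarithm of zero is an assumption about its purpose, which the text does not state.

```python
import math


def cld_db(p_left, p_right, a=0.0):
    """Level difference in dB between two channel powers.

    With a == 0 this is Formula 1; with a small positive a it is Formula 2.
    """
    return 10.0 * math.log10((p_left + a) / (p_right + a))


# A left channel with twice the power of the right channel is about 3 dB louder.
louder_left = cld_db(2.0, 1.0)

# With both powers at zero, Formula 1 is undefined; Formula 2 still yields 0 dB.
silent = cld_db(0.0, 0.0, a=1e-9)
```

Formula 2 therefore behaves like Formula 1 whenever the powers are much larger than `a`.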

[0109]Meanwhile, in order to represent $P_{Lt}$ and $P_{Rt}$ using spatial
parameters CLD0 to CLD4, a relation formula between the left output
signal Lt and the right output signal Rt of the output channel audio
signal and the multi-channel signals L, Ls, R, Rs, C and LFE is needed.
The corresponding relation formula can be defined as follows.

$$L_t = L + L_s + C/\sqrt{2} + \mathrm{LFE}/\sqrt{2}$$

$$R_t = R + R_s + C/\sqrt{2} + \mathrm{LFE}/\sqrt{2}$$ [Formula 3]

[0110]Since a relation formula like Formula 3 varies according to how the
output channel audio signal is defined, it can be defined in a form
different from Formula 3. For instance, the factor $1/\sqrt{2}$ in
$C/\sqrt{2}$ or $\mathrm{LFE}/\sqrt{2}$ can be `0` or `1`.

[0111]Formula 3 can bring out Formula 4 as follows.

$$P_{Lt} = P_L + P_{Ls} + P_C/2 + P_{LFE}/2$$

$$P_{Rt} = P_R + P_{Rs} + P_C/2 + P_{LFE}/2$$ [Formula 4]

[0112]$\mathrm{CLD}_\alpha$ can be represented according to Formula 1 or
Formula 2 using $P_{Lt}$ and $P_{Rt}$. And, $P_{Lt}$ and $P_{Rt}$ can be
represented according to Formula 4 using $P_L$, $P_{Ls}$, $P_C$,
$P_{LFE}$, $P_R$ and $P_{Rs}$. So, a relation formula is needed that
enables $P_L$, $P_{Ls}$, $P_C$, $P_{LFE}$, $P_R$ and $P_{Rs}$ to be
represented using spatial parameters CLD0 to CLD4.

[0113]Meanwhile, in case of the tree configuration shown in FIG. 5, a
relation between a multi-channel audio signal (L, R, C, LFE, Ls,
Rs) and a mono downmixed channel signal m is shown as follows.

[0115]In particular, by inputting Formula 6 to Formula 4 and by inputting
Formula 4 to Formula 1 or Formula 2, the combined spatial parameter
$\mathrm{CLD}_\alpha$ can be represented in a manner of combining spatial
parameters CLD0 to CLD4.

[0116]Meanwhile, an expansion resulting from inputting Formula 6 to
$P_C/2 + P_{LFE}/2$ in Formula 4 is shown in Formula 7.

[0117]In this case, according to the definitions of $c_1$ and $c_2$ (cf.
Formula 5), since $(c_{1,x})^2 + (c_{2,x})^2 = 1$, it results in
$(c_{1,\mathrm{OTT}_4})^2 + (c_{2,\mathrm{OTT}_4})^2 = 1$.

[0118]So, Formula 7 can be briefly summarized as follows.

$$P_C/2 + P_{LFE}/2 = (c_{2,\mathrm{OTT}_1} \cdot c_{1,\mathrm{OTT}_0})^2 \cdot m^2/2$$ [Formula 8]

[0119]Therefore, by inputting Formula 8 and Formula 6 to Formula 4 and by
inputting Formula 4 to Formula 1, the combined spatial parameter
$\mathrm{CLD}_\alpha$ can be represented in a manner of combining spatial
parameters CLD0 to CLD4.

[0120](2)-1-1-b. Derivation of $\mathrm{ICC}_\alpha$

[0121]First of all, since $\mathrm{ICC}_\alpha$ is a correlation between a
left output signal Lt and a right output signal Rt, a result from
inputting the left output signal Lt and the right output signal Rt to a
corresponding definition formula is shown as follows.

[0122]In Formula 9, $P_{Lt}$ and $P_{Rt}$ can be represented using CLD0 to
CLD4 in Formula 4, Formula 6 and Formula 8. And, $P_{LtRt}$ can be
expanded in a manner of Formula 10.

$$P_{LtRt} = P_{LR} + P_{LsRs} + P_C/2 + P_{LFE}/2$$ [Formula 10]

[0123]In Formula 10, $P_C/2 + P_{LFE}/2$ can be represented as CLD0 to
CLD4 according to Formula 6. And, $P_{LR}$ and $P_{LsRs}$ can be expanded
according to the ICC definition as follows.

$$\mathrm{ICC}_3 = P_{LR}/\sqrt{P_L P_R}$$

$$\mathrm{ICC}_2 = P_{LsRs}/\sqrt{P_{Ls} P_{Rs}}$$ [Formula 11]

[0124]In Formula 11, if $\sqrt{P_L P_R}$ or $\sqrt{P_{Ls} P_{Rs}}$ is
transposed, Formula 12 is obtained.

$$P_{LR} = \mathrm{ICC}_3 \cdot \sqrt{P_L P_R}$$

$$P_{LsRs} = \mathrm{ICC}_2 \cdot \sqrt{P_{Ls} P_{Rs}}$$ [Formula 12]

[0125]In Formula 12, $P_L$, $P_R$, $P_{Ls}$ and $P_{Rs}$ can be
represented as CLD0 to CLD4 according to Formula 6. A formula resulting
from inputting Formula 6 to Formula 12 corresponds to Formula 13.

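$$P_{LR} = \mathrm{ICC}_3 \cdot c_{1,\mathrm{OTT}_3} \cdot c_{2,\mathrm{OTT}_3} \cdot (c_{1,\mathrm{OTT}_1} \cdot c_{1,\mathrm{OTT}_0})^2 \cdot m^2$$

$$P_{LsRs} = \mathrm{ICC}_2 \cdot c_{1,\mathrm{OTT}_2} \cdot c_{2,\mathrm{OTT}_2} \cdot (c_{2,\mathrm{OTT}_0})^2 \cdot m^2$$ [Formula 13]

[0126]In summary, by inputting Formula 6 and Formula 13 to Formula 10 and
by inputting Formula 10 and Formula 4 to Formula 9, the combined spatial
parameter $\mathrm{ICC}_\alpha$ can be represented as spatial parameters
CLD0 to CLD3, ICC2 and ICC3.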
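The derivation for the FIG. 5 tree can be sketched numerically as follows. Several parts are assumptions rather than quotations: Formula 5, Formula 6 and Formula 9 are not reproduced in the text above, so the CLD-to-gain mapping in `ott_gains`, the per-channel powers (inferred from Formula 8 and Formula 13) and the normalized-correlation form used for $\mathrm{ICC}_\alpha$ (modeled on Formula 11) are reconstructions, and all function and variable names are hypothetical.

```python
import math


def ott_gains(cld_db):
    """Assumed CLD-to-gain mapping: returns (c1, c2) with c1**2 + c2**2 == 1."""
    ratio = 10.0 ** (cld_db / 10.0)
    return math.sqrt(ratio / (1.0 + ratio)), math.sqrt(1.0 / (1.0 + ratio))


def combined_params_fig5(cld, icc2, icc3, m2=1.0, a=1e-12):
    """cld holds CLD0..CLD3 in dB; m2 is the power of the mono downmix m.

    Returns (cld_alpha, icc_alpha) for the stereo output Lt/Rt.
    """
    (c1_0, c2_0), (c1_1, c2_1), (c1_2, c2_2), (c1_3, c2_3) = (
        ott_gains(x) for x in cld[:4]
    )
    front = (c1_1 * c1_0) ** 2 * m2               # power shared by L and R
    rear = c2_0 ** 2 * m2                         # power shared by Ls and Rs
    center_half = (c2_1 * c1_0) ** 2 * m2 / 2.0   # Formula 8

    # Formula 4, with the per-channel powers inferred from Formulas 8 and 13.
    p_lt = c1_3 ** 2 * front + c1_2 ** 2 * rear + center_half
    p_rt = c2_3 ** 2 * front + c2_2 ** 2 * rear + center_half

    # Formula 10 with the cross terms of Formula 13.
    p_ltrt = (icc3 * c1_3 * c2_3 * front
              + icc2 * c1_2 * c2_2 * rear
              + center_half)

    cld_alpha = 10.0 * math.log10((p_lt + a) / (p_rt + a))   # Formula 2
    icc_alpha = p_ltrt / math.sqrt(p_lt * p_rt)               # assumed Formula 9
    return cld_alpha, icc_alpha


# Symmetric tree with fully correlated pairs: no level difference, full correlation.
cld_sym, icc_sym = combined_params_fig5([0.0, 0.0, 0.0, 0.0], icc2=1.0, icc3=1.0)
```

Because $(c_{1,\mathrm{OTT}_4})^2 + (c_{2,\mathrm{OTT}_4})^2 = 1$, CLD4 never enters the computation, which matches the parameter list CLD0 to CLD3, ICC2 and ICC3 in the summary above.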

[0128]FIG. 6 is a schematic diagram of another example of applying
combined spatial information.

[0129]Referring to a left part of FIG. 6, CLD0 to CLD4 and
ICC0 to ICC4 (not shown in the drawing) can be called spatial
parameters that can be calculated in a process for downmixing a
multi-channel audio signal of 5.1-channels.

[0130]In the spatial parameters, an inter-channel level difference between
a left channel signal L and a left surround channel signal Ls is
CLD3 and inter-channel correlation between L and Ls is
ICC3. And, an inter-channel level difference between a right channel
R and a right surround channel Rs is CLD4 and inter-channel
correlation between R and Rs is ICC4.

[0131]On the other hand, referring to a right part of FIG. 6, if a left
channel signal Lt and a right channel signal Rt are generated by applying
combined spatial parameters $\mathrm{CLD}_\beta$ and $\mathrm{ICC}_\beta$
to a mono downmix audio signal m, a stereo output channel audio signal Lt
and Rt can be generated directly from the mono channel audio signal m. In
this case, the combined spatial parameters $\mathrm{CLD}_\beta$ and
$\mathrm{ICC}_\beta$ can be calculated by combining the spatial
parameters CLD0 to CLD4 and ICC0 to ICC4.

[0132]Hereinafter, a process for calculating $\mathrm{CLD}_\beta$ among
the combined spatial parameters by combining CLD0 to CLD4 is explained
first, and a process for calculating $\mathrm{ICC}_\beta$ among the
combined spatial parameters by combining CLD0 to CLD4 and ICC0 to ICC4 is
then explained as follows.

[0133](2)-1-2-a. Derivation of $\mathrm{CLD}_\beta$

[0134]First of all, since $\mathrm{CLD}_\beta$ is a level difference
between a left output signal Lt and a right output signal Rt, a result
from inputting the left output signal Lt and the right output signal Rt
to a definition formula of CLD is shown as follows.

$$\mathrm{CLD}_\beta = 10\log_{10}\left(\frac{P_{Lt}}{P_{Rt}}\right)$$ [Formula 14]

[0135]where $P_{Lt}$ is a power of Lt and $P_{Rt}$ is a power of Rt.

$$\mathrm{CLD}_\beta = 10\log_{10}\left(\frac{P_{Lt}+a}{P_{Rt}+a}\right)$$ [Formula 15]

[0136]where $P_{Lt}$ is a power of Lt, $P_{Rt}$ is a power of Rt, and `a`
is a very small number.

[0137]Hence, $\mathrm{CLD}_\beta$ is defined as Formula 14 or Formula 15.

[0138]Meanwhile, in order to represent $P_{Lt}$ and $P_{Rt}$ using spatial
parameters CLD0 to CLD4, a relation formula between the left output
signal Lt and the right output signal Rt of the output channel audio
signal and the multi-channel signals L, Ls, R, Rs, C and LFE is needed.
The corresponding relation formula can be defined as follows.

$$L_t = L + L_s + C/\sqrt{2} + \mathrm{LFE}/\sqrt{2}$$

$$R_t = R + R_s + C/\sqrt{2} + \mathrm{LFE}/\sqrt{2}$$ [Formula 16]

[0139]Since a relation formula like Formula 16 varies according to how the
output channel audio signal is defined, it can be defined in a form
different from Formula 16. For instance, the factor $1/\sqrt{2}$ in
$C/\sqrt{2}$ or $\mathrm{LFE}/\sqrt{2}$ can be `0` or `1`.

[0140]Formula 16 can bring out Formula 17 as follows.

$$P_{Lt} = P_L + P_{Ls} + P_C/2 + P_{LFE}/2$$

$$P_{Rt} = P_R + P_{Rs} + P_C/2 + P_{LFE}/2$$ [Formula 17]

[0141]$\mathrm{CLD}_\beta$ can be represented according to Formula 14 or
Formula 15 using $P_{Lt}$ and $P_{Rt}$. And, $P_{Lt}$ and $P_{Rt}$ can be
represented according to Formula 17 using $P_L$, $P_{Ls}$, $P_C$,
$P_{LFE}$, $P_R$ and $P_{Rs}$. So, a relation formula is needed that
enables $P_L$, $P_{Ls}$, $P_C$, $P_{LFE}$, $P_R$ and $P_{Rs}$ to be
represented using spatial parameters CLD0 to CLD4.

[0142]Meanwhile, in case of the tree configuration shown in FIG. 6, the
relation between a multi-channel audio signal (L, R, C, LFE, Ls,
Rs) and a mono downmixed channel signal m is shown as follows.

[0144]In particular, by inputting Formula 19 to Formula 17 and by
inputting Formula 17 to Formula 14 or Formula 15, the combined spatial
parameter $\mathrm{CLD}_\beta$ can be represented in a manner of
combining spatial parameters CLD0 to CLD4.

[0145]Meanwhile, an expansion formula resulting from inputting Formula 19
to $P_L + P_{Ls}$ in Formula 17 is shown in Formula 20.

$$P_L + P_{Ls} = [(c_{1,\mathrm{OTT}_3})^2 + (c_{2,\mathrm{OTT}_3})^2] (c_{1,\mathrm{OTT}_1} \cdot c_{1,\mathrm{OTT}_0})^2 \cdot m^2$$ [Formula 20]

[0146]In this case, according to the definitions of $c_1$ and $c_2$ (cf.
Formula 5), since $(c_{1,x})^2 + (c_{2,x})^2 = 1$, it results in
$(c_{1,\mathrm{OTT}_3})^2 + (c_{2,\mathrm{OTT}_3})^2 = 1$.

[0147]So, Formula 20 can be briefly summarized as follows.

$$P_{L\_} = P_L + P_{Ls} = (c_{1,\mathrm{OTT}_1} \cdot c_{1,\mathrm{OTT}_0})^2 \cdot m^2$$ [Formula 21]

[0148]On the other hand, an expansion formula resulting from inputting
Formula 19 to $P_R + P_{Rs}$ in Formula 17 is shown in Formula 22.

$$P_R + P_{Rs} = [(c_{1,\mathrm{OTT}_4})^2 + (c_{2,\mathrm{OTT}_4})^2] (c_{2,\mathrm{OTT}_1} \cdot c_{1,\mathrm{OTT}_0})^2 \cdot m^2$$ [Formula 22]

[0149]In this case, according to the definitions of $c_1$ and $c_2$ (cf.
Formula 5), since $(c_{1,x})^2 + (c_{2,x})^2 = 1$, it results in
$(c_{1,\mathrm{OTT}_4})^2 + (c_{2,\mathrm{OTT}_4})^2 = 1$.

[0150]So, Formula 22 can be briefly summarized as follows.

$$P_{R\_} = P_R + P_{Rs} = (c_{2,\mathrm{OTT}_1} \cdot c_{1,\mathrm{OTT}_0})^2 \cdot m^2$$ [Formula 23]

[0151]On the other hand, an expansion formula resulting from inputting
Formula 19 to $P_C/2 + P_{LFE}/2$ in Formula 17 is shown in Formula 24.

$$P_C/2 + P_{LFE}/2 = [(c_{1,\mathrm{OTT}_2})^2 + (c_{2,\mathrm{OTT}_2})^2] (c_{2,\mathrm{OTT}_0})^2 \cdot m^2/2$$ [Formula 24]

[0152]In this case, according to the definitions of $c_1$ and $c_2$ (cf.
Formula 5), since $(c_{1,x})^2 + (c_{2,x})^2 = 1$, it results in
$(c_{1,\mathrm{OTT}_2})^2 + (c_{2,\mathrm{OTT}_2})^2 = 1$.

[0153]So, Formula 24 can be briefly summarized as follows.

$$P_C/2 + P_{LFE}/2 = (c_{2,\mathrm{OTT}_0})^2 \cdot m^2/2$$ [Formula 25]

[0154]Therefore, by inputting Formula 21, Formula 23 and Formula 25 to
Formula 17 and by inputting Formula 17 to Formula 14 or Formula 15, the
combined spatial parameter $\mathrm{CLD}_\beta$ can be represented in a
manner of combining spatial parameters CLD0 to CLD4.

[0155](2)-1-2-b. Derivation of $\mathrm{ICC}_\beta$

[0156]First of all, since $\mathrm{ICC}_\beta$ is a correlation between a
left output signal Lt and a right output signal Rt, a result from
inputting the left output signal Lt and the right output signal Rt to a
corresponding definition formula is shown as follows.

[0157]In Formula 26, $P_{Lt}$ and $P_{Rt}$ can be represented according to
Formula 19 using CLD0 to CLD4. And, $P_{LtRt}$ can be expanded in a
manner of Formula 27.

$$P_{LtRt} = P_{L\_R\_} + P_C/2 + P_{LFE}/2$$ [Formula 27]

[0158]In Formula 27, $P_C/2 + P_{LFE}/2$ can be represented as CLD0 to
CLD4 according to Formula 19. And, $P_{L\_R\_}$ can be expanded according
to the ICC definition as follows.

$$\mathrm{ICC}_1 = P_{L\_R\_}/\sqrt{P_{L\_} P_{R\_}}$$ [Formula 28]

[0159]If $\sqrt{P_{L\_} P_{R\_}}$ is transposed, Formula 29 is obtained.

$$P_{L\_R\_} = \mathrm{ICC}_1 \cdot \sqrt{P_{L\_} P_{R\_}}$$ [Formula 29]

[0160]In Formula 29, $P_{L\_}$ and $P_{R\_}$ can be represented as CLD0 to
CLD4 according to Formula 21 and Formula 23. A formula resulting from
inputting Formula 21 and Formula 23 to Formula 29 corresponds to Formula
30.

$$P_{L\_R\_} = \mathrm{ICC}_1 \cdot c_{1,\mathrm{OTT}_1} \cdot c_{1,\mathrm{OTT}_0} \cdot c_{2,\mathrm{OTT}_1} \cdot c_{1,\mathrm{OTT}_0} \cdot m^2$$ [Formula 30]

[0161]In summary, by inputting Formula 30 to Formula 27 and by inputting
Formula 27 and Formula 17 to Formula 26, the combined spatial parameter
$\mathrm{ICC}_\beta$ can be represented as spatial parameters CLD0 to
CLD4 and ICC1.
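For the FIG. 6 tree, the same computation collapses to very few terms once Formula 21, Formula 23 and Formula 25 are applied. The sketch below rests on assumptions: the CLD-to-gain mapping stands in for Formula 5, and the normalized-correlation form stands in for Formula 26, neither of which is reproduced in the text above; the names are hypothetical.

```python
import math


def ott_gains(cld_db):
    """Assumed CLD-to-gain mapping: returns (c1, c2) with c1**2 + c2**2 == 1."""
    ratio = 10.0 ** (cld_db / 10.0)
    return math.sqrt(ratio / (1.0 + ratio)), math.sqrt(1.0 / (1.0 + ratio))


def combined_params_fig6(cld0, cld1, icc1, m2=1.0, a=1e-12):
    """Returns (cld_beta, icc_beta) for the stereo output Lt/Rt."""
    c1_0, c2_0 = ott_gains(cld0)
    c1_1, c2_1 = ott_gains(cld1)

    p_left = (c1_1 * c1_0) ** 2 * m2      # Formula 21: P_L + P_Ls
    p_right = (c2_1 * c1_0) ** 2 * m2     # Formula 23: P_R + P_Rs
    center_half = c2_0 ** 2 * m2 / 2.0    # Formula 25: P_C/2 + P_LFE/2

    p_lt = p_left + center_half           # Formula 17
    p_rt = p_right + center_half

    # Formula 30 substituted into Formula 27.
    p_ltrt = icc1 * c1_1 * c1_0 * c2_1 * c1_0 * m2 + center_half

    cld_beta = 10.0 * math.log10((p_lt + a) / (p_rt + a))   # Formula 15
    icc_beta = p_ltrt / math.sqrt(p_lt * p_rt)               # assumed Formula 26
    return cld_beta, icc_beta


# Uncorrelated left and right halves: only the shared C/LFE part correlates.
cld_b, icc_b = combined_params_fig6(0.0, 0.0, icc1=0.0)
```

Under these assumptions only CLD0, CLD1 and ICC1 remain in the final expressions, since the gains of OTT2, OTT3 and OTT4 each sum to one in power.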

[0162]The above-explained spatial parameter modifying methods are just one
embodiment. And, in finding Px or Pxy, it is apparent that the
above-explained formulas can be varied in various forms by considering
correlations (e.g., ICC0, etc.) between the respective channels as
well as signal energy in addition.

[0163](2)-2. Combined Spatial Information Having Surround Effect

[0164]First of all, if sound paths are considered when generating combined
spatial information by combining spatial information, a virtual surround
effect can be brought about.

[0165]The virtual surround effect or virtual 3D effect gives the
impression that a speaker of a surround channel substantially exists even
though no such speaker is present. For instance, a 5.1-channel audio
signal is outputted via two stereo speakers.

[0166]A sound path may correspond to spatial filter information. The
spatial filter information is able to use a function named HRTF
(head-related transfer function), which is not limited by the present
invention. The spatial filter information is able to include a filter
parameter. By inputting the filter parameter and spatial parameters to a
conversion formula, it is able to generate a combined spatial parameter.
And, the generated combined spatial parameter may include filter
coefficients.

[0167]Hereinafter, assuming that a multi-channel audio signal is
5-channels and that an output channel audio signal of three channels is
generated, a method of considering sound paths to generate combined
spatial information having a surround effect is explained as follows.

[0168]FIG. 7 is a diagram of sound paths from speakers to a listener, in
which positions of the speakers are shown.

[0169]Referring to FIG. 7, positions of three speakers SPK1, SPK2 and SPK3
are left front L, center C and right R, respectively. And, positions of
virtual surround channels are left surround Ls and right surround Rs,
respectively.

[0170]Sound paths to the positions r and l of the right and left ears of a
listener from the positions L, C and R of the three speakers and from the
positions Ls and Rs of the virtual surround channels, respectively, are
shown. An indication of $G_{x\_y}$ indicates the sound path from the
position x to the position y. For instance, an indication of $G_{L\_r}$
indicates the sound path from the position of the left front L to the
position of the right ear r of the listener.

[0171]If there exist speakers at five positions (i.e., speakers exist at
left surround Ls and right surround Rs as well) and if the listener
exists at the position shown in FIG. 7, a signal $L_O$ introduced into
the left ear of the listener and a signal $R_O$ introduced into the right
ear of the listener are represented as Formula 31.

$$L_O = L * G_{L\_l} + C * G_{C\_l} + R * G_{R\_l} + L_s * G_{Ls\_l} + R_s * G_{Rs\_l}$$

$$R_O = L * G_{L\_r} + C * G_{C\_r} + R * G_{R\_r} + L_s * G_{Ls\_r} + R_s * G_{Rs\_r}$$ [Formula 31]

[0172]where L, C, R, Ls and Rs are channels at positions, respectively,
$G_{x\_y}$ indicates a sound path from a position x to a position y, and
`*` indicates a convolution.

[0173]Yet, as mentioned in the foregoing description, in case that the
speakers exist at the three positions L, C and R only, a signal
$L_{O\_real}$ introduced into the left ear of the listener and a signal
$R_{O\_real}$ introduced into the right ear of the listener are
represented as follows.

$$L_{O\_real} = L * G_{L\_l} + C * G_{C\_l} + R * G_{R\_l}$$

$$R_{O\_real} = L * G_{L\_r} + C * G_{C\_r} + R * G_{R\_r}$$ [Formula 32]

[0174]Since surround channel signals Ls and Rs are not taken into
consideration by the signals shown in Formula 32, it is unable to bring
about a virtual surround effect. In order to bring about the virtual
surround effect, a Ls signal arriving at the position (l, r) of the
listener from the speaker position Ls is made equal to a Ls signal
arriving at the position (l, r) of the listener from the speaker at each
of the three positions L, C and R different from the original position
Ls. And, this is identically applied to the case of the right surround
channel signal Rs as well.

[0175]Looking into the left surround channel signal Ls, in case that the
left surround channel signal Ls is outputted from the speaker at the left
surround position Ls as an original position, signals arriving at the
left and right ears l and r of the listener are represented as follows.

$$L_s * G_{Ls\_l}, \quad L_s * G_{Ls\_r}$$ [Formula 33]

[0176]And, in case that the right surround channel signal Rs is outputted
from the speaker at the right surround position Rs as an original
position, signals arriving at the left and right ears l and r of the
listener are represented as follows.

$$R_s * G_{Rs\_l}, \quad R_s * G_{Rs\_r}$$ [Formula 34]

[0177]In case that the signals arriving at the left and right ears l and r
of the listener are equal to the components of Formula 33 and Formula 34,
even if they are outputted via the speakers of any position (e.g., via
the speaker SPK1 at the left front position), the listener is able to
sense as if speakers exist at the left and right surround positions Ls
and Rs, respectively.

[0178]Meanwhile, in case that the components shown in Formula 33 are
outputted from the speaker at the left surround position Ls, they are the
signals arriving at the left and right ears l and r of the listener,
respectively. So, if the components shown in Formula 33 are outputted
intact from the speaker SPK1 at the left front position, signals arriving
at the left and right ears l and r of the listener can be represented as
follows.

$$L_s * G_{Ls\_l} * G_{L\_l}, \quad L_s * G_{Ls\_r} * G_{L\_r}$$ [Formula 35]

[0179]Looking into Formula 35, a component $G_{L\_l}$ (or $G_{L\_r}$)
corresponding to the sound path from the left front position L to the
left ear l (or the right ear r) of the listener is added.

[0180]Yet, the signals arriving at the left and right ears l and r of the
listener should be the components shown in Formula 33 instead of Formula
35. In case that a sound outputted from the speaker at the left front
position L arrives at the listener, the component $G_{L\_l}$ (or
$G_{L\_r}$) is added. So, if the components shown in Formula 33 are
outputted from the speaker SPK1 at the left front position, an inverse
function $G_{L\_l}^{-1}$ (or $G_{L\_r}^{-1}$) of $G_{L\_l}$ (or
$G_{L\_r}$) should be taken into consideration for the sound path. In
other words, in case that the components corresponding to Formula 33 are
outputted from the speaker SPK1 at the left front position L, they have
to be modified as the following formula.

$$L_s * G_{Ls\_l} * G_{L\_l}^{-1}, \quad L_s * G_{Ls\_r} * G_{L\_r}^{-1}$$ [Formula 36]

[0181]And, in case that the components corresponding to Formula 34 are
outputted from the speaker SPK1 at the left front position L, they have
to be modified as the following formula.

$$R_s * G_{Rs\_l} * G_{L\_l}^{-1}, \quad R_s * G_{Rs\_r} * G_{L\_r}^{-1}$$ [Formula 37]

[0182]So, the signal L' outputted from the speaker SPK1 at the left front
position L is summarized as follows.

$$L' = L + L_s * G_{Ls\_l} * G_{L\_l}^{-1} + R_s * G_{Rs\_l} * G_{L\_l}^{-1}$$ [Formula 38]

[0183](Components $L_s * G_{Ls\_r} * G_{L\_r}^{-1}$ and
$R_s * G_{Rs\_r} * G_{L\_r}^{-1}$ are omitted.)

[0184]If the signal, which is shown in Formula 38 to be outputted from the
speaker SPK1 at the left front position L, arrives at the position of the
left ear l of the listener, a sound path factor $G_{L\_l}$ is added. So,
the $G_{L\_l}$ terms in Formula 38 are cancelled out, whereby the factors
shown in Formula 33 and Formula 34 eventually remain.
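The cancellation described for Formula 38 can be checked with a small numeric sketch. It works on a single frequency bin, where a convolution becomes a multiplication of complex values and an inverse filter becomes a reciprocal; this frequency-domain view, the function name and all numeric values are assumptions chosen for illustration and are not taken from the text.

```python
def left_speaker_feed(front, ls, rs, paths):
    """Formula 38 for one frequency bin.

    paths maps (source position, ear) to a complex sound-path response.
    """
    inverse_front_path = 1.0 / paths[("L", "l")]
    return (front
            + ls * paths[("Ls", "l")] * inverse_front_path
            + rs * paths[("Rs", "l")] * inverse_front_path)


# Hypothetical sound-path responses and channel values for one bin.
paths = {
    ("L", "l"): 0.9 + 0.1j,
    ("Ls", "l"): 0.5 - 0.2j,
    ("Rs", "l"): 0.2 + 0.3j,
}
front, ls, rs = 1.0 + 0.0j, 0.4 + 0.1j, 0.3 - 0.2j

# The feed travels the front-left path to the left ear ...
at_left_ear = left_speaker_feed(front, ls, rs, paths) * paths[("L", "l")]

# ... and arrives as if Ls and Rs had been played from the surround positions.
target = (front * paths[("L", "l")]
          + ls * paths[("Ls", "l")]
          + rs * paths[("Rs", "l")])
```

In this sketch `at_left_ear` and `target` agree up to rounding, which is the left-ear half of Formula 33 and Formula 34 added to the direct front-left contribution.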

[0185]FIG. 8 is a diagram to explain a signal outputted from each speaker
position for a virtual surround effect.

[0186]Referring to FIG. 8, if signals Ls and Rs outputted from surround
positions Ls and Rs are made to be included in a signal L' outputted from
each speaker position SPK1 by considering sound paths, they correspond to
Formula 38.

[0188]For instance, a signal C' outputted from a speaker SPK2 at a center
position C is summarized as follows.

$$C' = C + L_s * H_{Ls\_C} + R_s * H_{Rs\_C}$$ [Formula 40]

[0189]For another instance, a signal R' outputted from a speaker SPK3 at a
right front position R is summarized as follows.

$$R' = R + L_s * H_{Ls\_R} + R_s * H_{Rs\_R}$$ [Formula 41]

[0190]FIG. 9 is a conceptual diagram to explain a method of generating a
3-channel signal using a 5-channel signal like Formula 38, Formula 39 or
Formula 40.

[0191]In case of generating a 2-channel signal R' and L' using a 5-channel
signal or in case of not including a surround channel signal Ls or Rs in
a center channel signal C', $H_{Ls\_C}$ or $H_{Rs\_C}$ becomes 0.

[0192]For convenience of implementation, $H_{x\_y}$ can be variously
modified in such a manner that $H_{x\_y}$ is replaced by $G_{x\_y}$ or
that $H_{x\_y}$ is used by considering cross-talk.

[0193]The above detailed explanation relates to one example of the
combined spatial information having the surround effect. And, it is
apparent that it can be varied in various forms according to a method of
applying spatial filter information. As mentioned in the foregoing
description, the signals outputted via the speakers (in the above
example, left front channel L', right front channel R' and center channel
C') according to the above process can be generated from the downmix
audio signal using the combined spatial information, and more
particularly, using the combined spatial parameters.

[0194](3) Expanded Spatial Information

[0195]First of all, by adding extended spatial information to spatial
information, it is able to generate expanded spatial information. And, it
is able to upmix an audio signal using the extended spatial information.
In the corresponding upmixing process, an audio signal is converted to a
primary upmixing audio signal based on spatial information and the
primary upmixing audio signal is then converted to a secondary upmixing
audio signal based on extended spatial information.

[0196]In this case, the extended spatial information is able to include
extended channel configuration information, extended channel mapping
information and extended spatial parameters.

[0197]The extended channel configuration information is information for a
configurable channel as well as a channel that can be configured by tree
configuration information of spatial information. The extended channel
configuration information may include at least one of a division
identifier and a non-division identifier, which will be explained in
detail later. The extended channel mapping information is position
information for each channel that configures an extended channel. And,
the extended spatial parameters can be used for upmixing one channel into
at least two channels. The extended spatial parameters may include
inter-channel level differences.

[0198]The above-explained extended spatial information may be included in
spatial information after having been generated by an encoding apparatus
(i) or generated by a decoding apparatus by itself (ii). In case that
extended spatial information is generated by an encoding apparatus, a
presence or non-presence of the extended spatial information can be
decided based on an indicator of spatial information. In case that
extended spatial information is generated by a decoding apparatus by
itself, extended spatial parameters of the extended spatial information
may result from being calculated using spatial parameters of spatial
information.

[0199]Meanwhile, a process for upmixing an audio signal using the expanded
spatial information generated on the basis of the spatial information and
the extended spatial information can be executed sequentially and
hierarchically or collectively and synthetically. If the expanded spatial
information can be calculated as one matrix based on spatial information
and extended spatial information, it is able to upmix a downmix audio
signal into a multi-channel audio signal collectively and directly using
the matrix. In this case, factors configuring the matrix can be defined
according to spatial parameters and extended spatial parameters.
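The equivalence between the sequential and the collective upmixing can be illustrated with plain matrix products. The sketch below is an assumption-laden toy: the gain values, the channel ordering and the helper names are hypothetical, and a real upmix would also involve decorrelation and per-band processing that are left out here.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def apply(matrix, vector):
    """Apply a matrix to a column vector given as a flat list."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]


# Hypothetical primary upmix (spatial parameters): mono m -> (Ls, Rs).
primary = [[0.8],
           [0.6]]

# Hypothetical secondary upmix (extended spatial parameters): (Ls, Rs) -> (Ls, RC, Rs).
secondary = [[1.0, 0.0],
             [0.5, 0.5],
             [0.0, 1.0]]

# One matrix representing the expanded spatial information.
combined = matmul(secondary, primary)

downmix = [2.0]
sequential = apply(secondary, apply(primary, downmix))   # two-stage upmix
collective = apply(combined, downmix)                     # single-stage upmix
```

Both paths produce the same three output levels, so the choice between them is one of implementation convenience rather than of result.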

[0200]Hereinafter, after completion of explaining a case that extended
spatial information generated by an encoding apparatus is used, a case of
generating extended spatial information in a decoding apparatus by itself
will be explained.

[0202]First of all, expanded spatial information is generated by adding
extended spatial information, which is generated by an encoding
apparatus, to spatial information. A case that a decoding apparatus
receives the extended spatial information will be explained. Besides, the
extended spatial information may be the one extracted in a process in
which the encoding apparatus downmixes a multi-channel audio signal.

[0203]As mentioned in the foregoing description, extended spatial
information includes extended channel configuration information, extended
channel mapping information and extended spatial parameters. In this
case, the extended channel configuration information may include at least
one of a division identifier and a non-division identifier. Hereinafter,
a process for configuring an extended channel based on array of the
division and non-division identifiers is explained in detail as follows.

[0204]FIG. 10 is a diagram of an example of configuring extended channels
based on extended channel configuration information.

[0205]Referring to a lower end of FIG. 10, 0's and 1's are repeatedly
arranged in a sequence. In this case, `0` means a non-division identifier
and `1` means a division identifier. A non-division identifier 0 exists
in a first order (1), and the channel matching the non-division
identifier 0 of the first order is the left channel L at the uppermost
end. So, the left channel L matching the non-division identifier 0 is
selected as an output channel instead of being divided. In a second order
(2), there exists a division identifier 1. The channel matching the
division identifier is the left surround channel Ls next to the left
channel L. So, the left surround channel Ls matching the division
identifier 1 is divided into two channels.

[0206]Since there exist non-division identifiers 0 in a third order (3)
and a fourth order (4), the two channels divided from the left surround
channel Ls are selected intact as output channels without being divided.
Once the above process is repeated to a last order (10), it is able to
configure entire extended channels.

[0207]The channel dividing process is repeated as many times as the number
of division identifiers 1, and the process for selecting a channel as an
output channel is repeated as many times as the number of non-division
identifiers 0. So, the number of channel dividing units AT0 and AT1 is
equal to the number (2) of the division identifiers 1, and the number of
extended channels (L, Lfs, Ls, R, Rfs, Rs, C and LFE) is equal to the
number (8) of non-division identifiers 0.
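The traversal described for FIG. 10 can be sketched as a simple queue-driven loop. The first four identifiers (0, 1, 0, 0) and the totals (ten identifiers, two divisions, eight output channels) come from the description; the remaining identifier order, the ordering of the base channels after L and Ls, the assumption that Rs is the second divided channel, and all names in the code are hypothetical.

```python
from collections import deque


def configure_extended_channels(base_channels, identifiers, split_names):
    """Walk the identifier sequence: 0 selects a channel, 1 divides it in two."""
    pending = deque(base_channels)
    outputs = []
    divisions = 0
    for flag in identifiers:
        channel = pending.popleft()
        if flag == 1:
            first, second = split_names[channel]
            # The two divided channels are examined next, in order.
            pending.appendleft(second)
            pending.appendleft(first)
            divisions += 1
        else:
            outputs.append(channel)
    return outputs, divisions


base = ["L", "Ls", "R", "Rs", "C", "LFE"]            # assumed 5.1 ordering
identifiers = [0, 1, 0, 0, 0, 1, 0, 0, 0, 0]         # assumed beyond the first four
split_names = {"Ls": ("Lfs", "Ls"), "Rs": ("Rfs", "Rs")}

channels, divisions = configure_extended_channels(base, identifiers, split_names)
```

With these inputs the loop yields the eight extended channels in the mapping order L, Lfs, Ls, R, Rfs, Rs, C, LFE and counts two divisions, one per channel dividing unit.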

[0208]Meanwhile, after the extended channel has been configured, a
position of each output channel can be mapped using extended channel
mapping information. In case of FIG. 10, mapping is carried out in a
sequence of a left front channel L, a left front side channel Lfs, a left
surround channel Ls, a right front channel R, a right front side channel
Rfs, a right surround channel Rs, a center channel C and a low frequency
channel LFE.

[0209]As mentioned in the foregoing description, an extended channel can
be configured based on extended channel configuration information. For
this, a channel dividing unit dividing one channel into at least two
channels is necessary. In dividing one channel into at least two
channels, the channel dividing unit is able to use extended spatial
parameters. Since the number of the extended spatial parameters is equal
to that of the channel dividing units, it is equal to the number of
division identifiers as well. So, the extended spatial parameters can be
extracted as many as the number of the division identifiers.

[0210]FIG. 11 is a diagram to explain a configuration of the extended
channels shown in FIG. 10 and the relation with extended spatial
parameters.

[0211]Referring to FIG. 11, two channel dividing units AT0 and AT1 and the
extended spatial parameters ATD0 and ATD1 applied to them, respectively,
are shown.

[0212]In case that an extended spatial parameter is an inter-channel level
difference, a channel dividing unit is able to decide levels of two
divided channels using the extended spatial parameter.
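A channel dividing unit driven by a level difference might look like the following sketch. It assumes the extended spatial parameter is a level difference in dB between the two divided channels and that the division preserves total energy; both points, along with the function name, are assumptions for illustration and are not specified in the text.

```python
import math


def divide_channel(samples, atd_db):
    """Split one channel into two whose power ratio equals atd_db (in dB)."""
    ratio = 10.0 ** (atd_db / 10.0)
    gain_first = math.sqrt(ratio / (1.0 + ratio))
    gain_second = math.sqrt(1.0 / (1.0 + ratio))
    first = [gain_first * s for s in samples]
    second = [gain_second * s for s in samples]
    return first, second


def power(samples):
    """Sum of squared sample values."""
    return sum(s * s for s in samples)


source = [1.0, -0.5, 0.25]
first, second = divide_channel(source, atd_db=3.0)
level_difference = 10.0 * math.log10(power(first) / power(second))
```

Here `level_difference` reproduces the 3 dB that was requested, and the powers of the two divided channels add up to the power of the source channel.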

[0213]Thus, in performing upmixing by adding extended spatial information,
the extended spatial parameters can be applied not entirely but
partially.

[0215]First of all, it is able to generate expanded spatial information by
adding extended spatial information to spatial information. A case of
generating extended spatial information using spatial information will be
explained in the following description. In particular, it is able to
generate extended spatial information using spatial parameters of spatial
information. In this case, interpolation, extrapolation or the like can
be used.

[0216](3)-2-1. Extension to 6.1-Channels

[0217]In case that a multi-channel audio signal is 5.1-channels, a case of
generating an output channel audio signal of 6.1-channels is explained
with reference to examples as follows.

[0218]FIG. 12 is a diagram of a position of a multi-channel audio signal
of 5.1-channels and a position of an output channel audio signal of
6.1-channels.

[0219]Referring to (a) of FIG. 12, it can be seen that channel positions
of a multi-channel audio signal of 5.1-channels are a left front channel
L, a right front channel R, a center channel C, a low frequency channel
(not shown in the drawing) LFE, a left surround channel Ls and a right
surround channel Rs, respectively.

[0220]In case that the multi-channel audio signal of 5.1-channels is a
downmix audio signal, if spatial parameters are applied to the downmix
audio signal, the downmix audio signal is upmixed into the multi-channel
audio signal of 5.1-channels again.

[0221]Yet, a channel signal of a rear center RC, as shown in (b) of FIG.
12, should be further generated to upmix a downmix audio signal into a
multi-channel audio signal of 6.1-channels.

[0222]The channel signal of the rear center RC can be generated using
spatial parameters associated with two rear channels (left surround
channel Ls and right surround channel Rs). In particular, an
inter-channel level difference (CLD) among spatial parameters indicates a
level difference between two channels. So, by adjusting a level
difference between two channels, it is able to change a position of a
virtual sound source existing between the two channels.

[0223]A principle that a position of a virtual sound source varies
according to a level difference between two channels is explained as
follows.

[0224]FIG. 13 is a diagram to explain the relation between a virtual sound
source position and a level difference between two channels, in which
levels of left and right surround channels Ls and Rs are `a` and `b`,
respectively.

[0225]Referring to (a) of FIG. 13, in case that a level a of a left
surround channel Ls is greater than that b of a right surround channel
Rs, it can be seen that a position of a virtual sound source VS is closer
to a position of the left surround channel Ls than a position of the
right surround channel Rs.

[0226]If an audio signal is outputted from two channels, a listener feels
that a virtual sound source substantially exists between the two
channels. In this case, a position of the virtual sound source is closer
to a position of the channel having a level higher than that of the other
channel.

[0227]In case of (b) of FIG. 13, since a level a of a left surround
channel Ls is almost equal to a level b of a right surround channel Rs, a
listener feels that a position of a virtual sound source exists at a
center between the left surround channel Ls and the right surround
channel Rs.

[0228]Hence, it is able to decide a level of a rear center using the above
principle.

[0229]FIG. 14 is a diagram to explain levels of two rear channels and a
level of a rear center channel.

[0230]Referring to FIG. 14, it is able to calculate a level c of a rear
center channel RC by interpolating a difference between a level a of a
left surround channel Ls and a level b of a right surround channel Rs. In
this case, non-linear interpolation can be used as well as linear
interpolation for the calculation.

[0231]A level c of a new channel (e.g., rear center channel RC) existing
between two channels (e.g., Ls and Rs) can be calculated according to
linear interpolation by the following formula.

c=a*k+b*(1-k), [Formula 40]

[0232]where `a` and `b` are levels of two channels, respectively and `k`
is a relative position beta channel of level-a, a channel of level-b and
a channel of level-c.

[0233]If a channel (e.g., rear center channel RC) at a level-c is located
at a center between a channel (e.g., Ls) at a level-a and a channel RS at
a level-b, `k` is 0.5. If `k` is 0.5, Formula 40 follows Formula 41.

c=(a+b)/2 [Formula 41]

[0234]According to Formula 41, if a channel (e.g., rear center channel RC)
at a level-c is located at a center between a channel (e.g., Ls) at a
level-a and a channel (e.g., Rs) at a level-b, the level-c of the new
channel corresponds to a mean value of the levels a and b of the previous
channels. Formula 40 and Formula 41 are merely exemplary, so the decision
of the level-c and the values of the level-a and level-b can also be
readjusted.

[0235](3)-2-2. Extension to 7.1-Channels

[0236]A case of generating an output channel audio signal of 7.1-channels
when a multi-channel audio signal is of 5.1-channels is explained as
follows.

[0237]FIG. 15 is a diagram to explain a position of a multi-channel audio
signal of 5.1-channels and a position of an output channel audio signal
of 7.1-channels.

[0238]Referring to (a) of FIG. 15, like (a) of FIG. 12, it can be seen
that channel positions of a multi-channel audio signal of 5.1-channels
are a left front channel L, a right front channel R, a center channel C,
a low frequency channel (not shown in the drawing) LFE, a left surround
channel Ls and a right surround channel Rs, respectively.

[0239]In case that the multi-channel audio signal of 5.1-channels is a
downmix audio signal, if spatial parameters are applied to the downmix
audio signal, the downmix audio signal is upmixed into the multi-channel
audio signal of 5.1-channels again.

[0240]Yet, a left front side channel Lfs and a right front side channel
Rfs, as shown in (b) of FIG. 15, should be further generated to upmix a
downmix audio signal into a multi-channel audio signal of 7.1-channels.

[0241]Since the left front side channel Lfs is located between the left
front channel L and the left surround channel Ls, a level of the left
front side channel Lfs can be decided by interpolation using a level of
the left front channel L and a level of the left surround channel Ls.

[0242]FIG. 16 is a diagram to explain levels of two left channels and a
level of a left front side channel (Lfs).

[0243]Referring to FIG. 16, it can be seen that a level c of a left front
side channel Lfs is a linearly interpolated value based on a level a of a
left front channel L and a level b of a left surround channel Ls.

[0244]Meanwhile, although a left front side channel Lfs is located between
a left front channel L and a left surround channel Ls, it can also be
regarded as being located outside a left front channel L, a center channel
C and a right front channel R. So, a level of the left front side channel
Lfs can be decided by extrapolation using levels of the left front channel
L, center channel C and right front channel R.

[0245]FIG. 17 is a diagram to explain levels of three front channels and a
level of a left front side channel.

[0246]Referring to FIG. 17, it can be seen that a level d of a left front
side channel Lfs is a linearly extrapolated value based on a level a of a
left front channel L, a level c of a center channel C and a level b of a
right front channel R.
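
The linear extrapolation described above can be sketched as follows. The
description gives no formula for this case, so the sketch assumes one
possible approach: a least-squares straight line is fitted through the
(position, level) pairs of the three front channels and evaluated at the
position of the left front side channel. The function name
`extrapolate_level`, the angular positions and the level values are all
hypothetical.

```python
def extrapolate_level(positions, levels, target):
    """Fit a straight line through (position, level) pairs by least
    squares and evaluate it at `target`, which may lie outside the
    range of `positions` (linear extrapolation)."""
    n = len(positions)
    if n != len(levels) or n < 2:
        raise ValueError("need at least two matching positions and levels")
    mean_x = sum(positions) / n
    mean_y = sum(levels) / n
    sxx = sum((x - mean_x) ** 2 for x in positions)
    if sxx == 0:
        raise ValueError("positions must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(positions, levels))
    slope = sxy / sxx
    return mean_y + slope * (target - mean_x)


# Assumed positions in degrees: L at -30, C at 0, R at +30, Lfs at -60.
front_positions = [-30.0, 0.0, 30.0]
# Hypothetical levels a (L), c (C) and b (R).
front_levels = [0.75, 0.5, 0.25]

level_lfs = extrapolate_level(front_positions, front_levels, -60.0)
```

Because an extrapolated value can fall below zero for some inputs, a
practical decoder would presumably limit the result to a valid level
range; that step is omitted here.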

[0247]In the above description, the process for generating the output
channel audio signal by adding extended spatial information to spatial
information has been explained with reference to two examples. As
mentioned in the foregoing description, in the upmixing process with
addition of extended spatial information, extended spatial parameters can
be applied not entirely but partially. Thus, a process for applying
spatial parameters to an audio signal can be executed sequentially and
hierarchically or collectively and synthetically.

INDUSTRIAL APPLICABILITY

[0248]Accordingly, the present invention provides the following effects.

[0249]First of all, the present invention is able to generate an audio
signal having a configuration different from a predetermined tree
configuration, thereby generating variously configured audio signals.

[0250]Secondly, since an audio signal having a configuration different
from a predetermined tree configuration can be generated, even if the
number of multi-channels before the execution of downmixing is smaller or
greater than the number of speakers, output channels equal in number to
the speakers can be generated from a downmix audio signal.

[0251]Thirdly, in case of generating output channels fewer in number than
the multi-channels, since the output channel audio signal is generated
directly from a downmix audio signal, instead of upmixing the downmix
audio signal into a multi-channel audio signal and then downmixing that
multi-channel audio signal into the output channel audio signal, the load
of operations required for decoding an audio signal can be considerably
reduced.

[0252]Fourthly, since sound paths are taken into consideration in
generating combined spatial information, the present invention provides a
pseudo-surround effect in a situation where a surround channel output is
unavailable.

[0253]While the present invention has been described and illustrated
herein with reference to the preferred embodiments thereof, it will be
apparent to those skilled in the art that various modifications and
variations can be made therein without departing from the spirit and
scope of the invention. Thus, it is intended that the present invention
covers the modifications and variations of this invention that come
within the scope of the appended claims and their equivalents.