Film is not a language in the
sense that English, French, or mathematics is. It is, first of
all, impossible to be ungrammatical in film. And it is not
necessary to learn a vocabulary. Infants appear to understand
television images, for example, months before they begin to
develop a facility with spoken language. Even cats watch
television. Clearly, it is not necessary to acquire intellectual
competence in film in order to appreciate it, at least on the
most basic level.

But film is very much like
language. People who are highly experienced in film, highly
literate visually (or should we say "cinemate"?), see
more and hear more than people who seldom go to the movies. An
education in the quasi-language of film opens up greater
potential meaning for the observer, so it is useful to use the
metaphor of language to describe the phenomenon of film. In fact,
no extensive scientific investigation of our ability to
comprehend artificial sounds and images has as yet been
performed, but nevertheless we do know through research that
while children are able to recognize objects in pictures long
before they are able to read, they are eight or ten years of age
before they can comprehend a film image the way most adults do.
Moreover, there are cultural differences in perception of images.
In one famous 1960 test, anthropologist William Hudson set out
to examine whether rural Africans who had had little contact with
Western culture perceived depth in two-dimensional images the
same way that Europeans do. He found, unequivocally, that they do
not. Results varied (there were some individuals who
responded in the Western manner to the test), but they were
uniform over a broad cultural and sociological range.

Figure 3-1.
CONSTRUCTION-TASK FIGURES. Subjects asked to reconstruct
these figures in three dimensions using sticks or rods
respond in different ways. People from Western cultures,
trained in the codes and conventions that artists use to
convey three-dimensionality in a two-dimensional drawing, see
A as three-dimensional and B as two-dimensional. The
operating code for three-dimensionality here insists that the
dimension of depth be portrayed along the 45° oblique line.
This works well enough in A, but not in B, where the oblique
lines are not in the depth plane. Subjects from African
cultures tend to see both figures as two-dimensional, since
they are not familiar with this Western three-dimensional
code. Figures C and D illustrate the models of A constructed
by Western and African observers, respectively. (From
"Pictorial Perception and Culture," by Jan B.
Deregowski. © 1972 by Scientific American, Inc. All rights
reserved.)

The conclusions that can be
drawn from this seminal experiment and others that have followed
are two: first, that every normal human being can perceive and
identify a visual image; second, that even the simplest visual
images are interpreted differently in different cultures. So we
know that images must be "read." There is a process of
intellection occurring (not necessarily consciously) when
we observe an image, and it follows that we must have learned, at
some point, how to do this.

The "ambiguous
Trident," a well-known "optical illusion,"
provides an easy test of this ability.

Figure 3-2. THE AMBIGUOUS
TRIDENT. The illusion is intriguing only because we are
trained in Western codes of perspective. The psychological
effect is powerful: our minds insist that we see the object
in space rather than the drawing on a plane.

It's safe to say that the level
of visual literacy of anyone reading this book is such that
observation of the trident will be confusing to all of us. It
would not be for someone not trained in Western conventions
of three-dimensionality. Similarly, the well-known optical
illusions in Figures 3-3 and 3-4 demonstrate that the process of
perception and comprehension involves the brain: it is a mental
experience as well as a physical one.

Figure 3-3. THE NECKER CUBE.
Devised in 1832 by L. A. Necker, a Swiss naturalist. The
illusion depends, once again, on cultural training.

Whether we
"see" the Necker Cube from the top or the bottom or
whether we perceive the drawing in Figure 3-4 as either a young
girl or an old woman depends not on the physiological function of
our eyes but on what the brain does with the information
received.

Figure 3-4. "My Wife
and My Mother-in-Law," by cartoonist W.E. Hill, was
published in Puck in 1915. It has since become a
famous example of the phenomenon known as the multistable figure.
The young woman's chin is the old woman's nose. The old
woman's chin is the young woman's chest.

The word "image,"
indeed, has two conjoined meanings: an image is an optical
pattern; it is also a mental experience, which is why, we can
assume, we use the word "imagine" to describe the
mental creation of pictures.

So there is a strong element of
our ability to observe images, whether still or moving, that
depends on learning. This is, interestingly, not true to a
significant extent with auditory phenomena. If the machines are
sophisticated enough, we can produce recorded sounds that are
technically indistinguishable from their originals. The result of
this difference in mode of the two systems of
perception, visual and auditory, is that whatever education
our ears undergo in order to
perceive reality is sufficient to perceive recorded sound,
whereas there is a subtle but significant difference between the
education necessary for our eyes to perceive (and our brain to
understand) recorded images and that which is necessary simply to
comprehend the reality that surrounds us. It would serve no
purpose to consider phonography as a language, but it is useful
to speak of photography (and cinematography) as a language,
because a learning process is involved.

THE PHYSIOLOGY OF PERCEPTION

Another way to describe this
difference between the two senses is in terms of the function of
the sensory organs: ears hear whatever is available for them to
hear; eyes choose what to see. This is true not only in the
conscious sense (choosing to redirect attention from point A to
point B or to ignore the sight altogether by closing our eyes),
but in the unconscious as well. Since the receptor organs that
permit visual acuity are concentrated (and properly arranged)
only in the "fovea" of the retina, it's necessary for us to
stare directly at an object in order to have a clear image of it.



You can demonstrate this to
yourself by staring at the dot in the center of this page. Only
the area immediately surrounding it will be clear. The result of
this foveated vision is that the eyes must move constantly in
order to perceive an object of any size. These semiconscious
movements are called "saccades" and take approximately
1/20 second each, just about the interval of persistence of vision, the phenomenon that makes film
possible.

The conclusion that can be
drawn from the fact of foveated vision is that we do indeed read
an image physically as well as mentally and psychologically, just
as we read a page. The difference is that we know how to read a
page in English (from left to right and top to
bottom), but we are seldom conscious of precisely how we read
an image.

A complete set of
physiological, ethnographic, and psychological experiments might
demonstrate that various individuals read images more or less
well in three different ways:

physiologically: the best
readers would have the most efficient and extensive
saccadic patterns;

ethnographically: the most
literate readers would draw on a greater experience and
knowledge of various cultural visual conventions;

psychologically: the
readers who gained the most from the material would be
the ones who were best able to assimilate the various
sets of meanings they perceived and then integrate the
experience.

Figure 3-5. SACCADE
PATTERNS. At left, a drawing of a bust of Queen Nefertiti; at
right, a diagram of the eye movements of a subject viewing
the bust. Notice that the eye follows regular patterns rather
than randomly surveying the image. The subject clearly
concentrates on the face and shows little interest in the
neck. The ear also seems to be a focus of attention, probably
not because it is inherently interesting, but rather because
it is located in a prominent place in this profile. The
saccadic patterns are not continuous; the recording clearly
shows that the eye jerks quickly from point to point (the
"notches" in the continuous line), fixing on
specific nodes rather than absorbing general information. The
recording was made by Alfred L. Yarbus of the Institute for
Problems of Information Transmission, Moscow. (From "Eye
Movements and Visual Perception," by David Noton and
Lawrence Stark, June 1971.)

The irony here is that we know
very well that we must learn to read before we can attempt to
enjoy or understand literature, but we tend to believe,
mistakenly, that anyone can read a film. Anyone can see a film,
it's true, even cats. But some people have learned to comprehend
visual images (physiologically, ethnographically, and
psychologically) with far more sophistication than have
others. This evidence confirms the validity of the triangle of
perception outlined in Chapter 1, uniting author, work, and
observer. The observer is not simply a consumer, but an active
(or potentially active) participant in the process.

Figure 3-6. THE PONZO
ILLUSION. The horizontal lines are of equal length, yet the
line at the top appears to be longer than the line at the
bottom. The diagonals suggest perspective, so that we
interpret the picture in depth and conclude, therefore, that
since the "top" line must be "behind" the
"bottom" line, further away, it must then be
longer.

Film is not a language, but is
like a language, and since it is like language, some of the
methods that we use to study language might profitably be applied
to a study of film. In fact, during the last ten years,
this approach to
film (essentially linguistic) has grown considerably in
importance. Since film is not a language, strictly linguistic
concepts are misleading. Ever since the beginning of film
history, theorists have been fond of comparing film with verbal
language (this was partly to justify the serious study of film),
but it wasn't until a new, larger category of thought developed
in the fifties and early sixties (one that saw written and
spoken language as just two among many systems of
communication) that the real study of film as a language
could proceed. This inclusive category is semiology, the study of systems of signs. Semiologists
justified the study of film as language by redefining the concept
of written and spoken language. Any system of communication is a
"language"; English, French, or Chinese is a
"language system." Cinema, therefore, may be a language
of a sort, but it is not clearly a language system. As Christian
Metz, the well-known film semiologist, pointed out: we understand
a film not because we have a knowledge of its system; rather, we
achieve an understanding of its system because we understand the
film. Put another way, "It is not because the cinema is
language that it can tell such fine stories, but rather it has
become language because it has told such fine stories"
[Metz, Film Language, p. 47].

For semiologists, a sign must
consist of two parts: the signifier and the signified. The word
"word," for example (the collection of letters or
sounds) is a signifier; what it represents is something else
again: the "signified." In literature, the relationship between
signifier and signified is a main locus of art: the poet is
building constructions that, on the one hand, are composed of
sounds (signifiers) and, on the other, of meanings (signifieds),
and the relationship between the two can be fascinating. In fact,
much of the pleasure of poetry lies just here: in the dance
between sound and meaning.

But in film, the signifier and
the signified are almost identical: the sign
of cinema is a short-circuit sign. A picture of a book is much
closer to a book, conceptually, than the word "book"
is. It's true that we may have to learn in infancy or early
childhood to interpret the picture of a book as meaning a book,
but this is a great deal easier than learning to interpret the
letters or sounds of the word "book" as what it
signifies. A picture bears some direct relationship with what it
signifies; a word seldom does.

[Pictographical languages
like Chinese and Japanese might be said to fall somewhere in
between film and Western languages as sign systems, but only
when they are written, not when they are spoken, and only in
limited cases. On the other hand, there are some
words ("gulp," for example) that are
onomatopoeic and therefore bear a direct relationship to what
they signify, but only when they are spoken.]

It is the fact of this
short-circuit sign that makes the language of film so difficult
to discuss. As Metz put it, in a memorable phrase: "A film
is difficult to explain because it is easy to understand."
It also makes "doing" film quite different from
"doing" English (either writing or speaking). We can't
modify the signs of cinema the way we can modify the words of
language systems. In cinema, an image of a rose is an image of a
rose is an image of a rose: nothing more, nothing less. In
English, a rose can be a rose, simply, but it can also be
modified or confused with similar words: rose, rosy, rosier,
rosiest, rise, risen, rows (ruse), arose, roselike, and so forth.
The power of language systems is that there is a very great
difference between the signifier and the signified; the power of
film is that there is not.

Nevertheless, film is like
a language. How, then, does it do what it does? Clearly, one
person's image of a certain object is not another's. If we both
read the word "rose," you may perhaps think of a Peace
rose you picked last summer, while I am thinking of the one Laura
Westphal gave to me in December 1968. In cinema, however, we both
see the same rose, while the filmmaker can choose from an
infinite variety of roses and then photograph the one chosen in
another infinite variety of ways. The artist's choice in cinema
is without limit; the artist's choice in literature is
circumscribed, while the reverse is true for the observer. Film
does not suggest, in this context: it states. And therein lies
its power and the danger it poses to the observer: the reason why
it is useful, even vital, to learn to read images well so that
the observer can seize some of the power of the medium. The
better one reads an image, the more one understands it, the more
power one has over it. The reader of a page invents the image;
the reader of a film does not. Yet both readers must work to
interpret the signs they perceive in order to complete the
process of intellection. The more work they do, the better the
balance between observer and creator in the process; the better
the balance, the more vital and resonant the work of art.

The earliest film
texts (even many published recently) pursue with
shortsighted ardor the crude comparison of film and
written/spoken language. The standard theory suggested that the shot
was the word of film, the scene
its sentence, and the sequence
its paragraph. In the sense that these sets of divisions are
arranged in ascending order of complexity, the comparison is true
enough; but it breaks down under analysis. Assuming for the
moment that a word is the smallest convenient unit of meaning,
does the shot compare equivalently? Not at all. In the first
place, a shot takes time. Within that time span there is a
continually varying number of images. Does the single image, the frame,
then constitute the basic unit of meaning in film? Still the
answer is no, since each frame includes a potentially infinite
amount of visual information, as does the soundtrack that
accompanies it. While we could say that a film shot is something
like a sentence, since it makes a statement and is sufficient in
itself, the point is that the film does not divide itself into
such easily manageable units. While we can define
"shot" technically well enough as a single piece of
film, what happens if the particular shot is punctuated
internally? The camera can move; the scene can change completely
in a pan or track.
Should we then be talking of one shot or two?

Likewise, scenes, which were
defined strictly in French classical theater
as beginning and ending whenever a character entered or left the
stage, are more amorphous in film (as they are in theater today).
The term scene is useful, no doubt, but not precise. Sequences
are certainly longer than scenes, but the
"sequence-shot," in which a single shot is coterminous
with a sequence, is an important concept and no smaller units
within it are sequential.

It would seem that a real
science of film would depend on our being able to define the
smallest unit of construction. We can do that technically, at
least for the image: it is the single frame. But this is
certainly not the smallest unit of meaning. The fact is that
film, unlike written or spoken language, is not composed of
units, as such, but is rather a continuum of meaning. A shot
contains as much information as we want to read in it, and
whatever units we define within the shot are arbitrary.
Therefore, film presents us with a language (of sorts) that:

a) consists of short-circuit
signs in which the signifier nearly equals the signified; and

b) depends on a continuous,
nondiscrete system in which we can't identify a basic unit and
which therefore we can't describe quantitatively.

The result is,
as Christian Metz says, that: "An easy art, the cinema is in
constant danger of falling victim to this easiness." Film is
too intelligible, which is what makes it difficult to analyze.
"A film is difficult to explain because it is easy to
understand."

DENOTATIVE AND CONNOTATIVE MEANING

Films do, however, manage to
communicate meaning. They do this essentially in two different
manners: denotatively and connotatively. Like written language, but to a greater degree,
a film image or sound has a denotative meaning: it is what it is and
we don't have to strive to recognize it. This factor may seem
simplistic, but it should never be underestimated: here lies the
great strength of film. There is a substantial difference between
a description in words (or even in still photographs) of a
person or event, and a cinematic record of the same. Because film
can give us such a close approximation of reality, it can
communicate a precise knowledge that written or spoken language
seldom can. Language systems may be much better equipped to deal
with the nonconcrete world of ideas and abstractions (imagine
this book, for example, on film: without a complete narration it
would be incomprehensible), but they are not nearly so capable of
conveying precise information about physical realities.

By its very nature,
written/spoken language analyzes. To write the