
This week my daughter Camila Daya and I watched two movies in Mandarin together, and also practiced children’s music a bit on our way to her gym classes.

We watched The Lion King dubbed in Mandarin for the fourth or fifth time and enjoyed it thoroughly, as always. The momentary inspiration for this selection was the fact that we are going to do a safari in South Africa next week, so seeing the animated lions and other animals helped us get excited.

Previously, when looking up Boonie Bears episodes, I chanced upon a feature-length Boonie Bears movie that I had never heard of, so of course I downloaded it. It was the second movie we watched this week. There are no English subtitles, so we understood very little of the dialogue, but we had a good time watching it. The plot is relatively easy to follow, of course. In addition, I was pleased on several occasions to pick out words I would not have understood a few months ago, and which helped me understand the storyline. Even my daughter, who has done only a third of my Mandarin viewing thus far, understood several words.

It took me a while to discover the title of the movie. Finally, Google told me it is a 2015 film called Boonie Bears: Mystical Winter. I didn’t find it as entertaining–and certainly not as funny–as last year’s Boonie Bears: To the Rescue. Nevertheless, there were touching moments, and I found the mystical aspects of it quirky but interesting.

On my way to the farm this weekend, I was very pleased to be able, for the first time, to sing along with Nan Zi Han from Mulan the whole way through. I have finally completed its memorization, after many months! So, my dear (and dwindling) readers, you can look forward to a new music video soon, hopefully in May, when we get back from Africa.

This week, I finished re-watching Dragon with subtitles clip-by-clip, often repeating lines again and again, in an attempt to decipher new vocabulary. I now have over 50 terms in my Word-a-Day vocabulary list from Dragon–far more than from any other source. I register a phonetic transcription (using my own haphazard system), the source (Dragon), and the exact time that the term comes up in the movie. I do not try to translate the term, although I often have a rough translation in mind based on the subtitles and context.

When I again watch this movie or any source that I have previously worked on in this way, I am able to produce a chronological list of terms and reference them as the scenes come up. By this method, I gradually learn and reinforce vocabulary that I have been able to decipher.
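As a rough illustration of the record-keeping described above, here is a minimal Python sketch. The post only says the list lives in an Access database with a phonetic transcription, a source, and a time for each term, so the `Term` class, its field names, and the sample entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    phonetic: str  # rough phonetic transcription (no translation is stored)
    source: str    # movie or show in which the term was heard
    seconds: int   # time offset into the source where the term comes up


def terms_for_source(terms, source):
    """Return the terms registered for one source, in order of appearance."""
    return sorted((t for t in terms if t.source == source),
                  key=lambda t: t.seconds)


# Hypothetical entries, for illustration only.
word_list = [
    Term("term-b", "Dragon", 3125),
    Term("term-c", "Qiao Hu", 95),
    Term("term-a", "Dragon", 610),
]

# Print a chronological checklist to follow while re-watching one source.
for t in terms_for_source(word_list, "Dragon"):
    minutes, secs = divmod(t.seconds, 60)
    print(f"{minutes:02d}:{secs:02d}  {t.phonetic}")
```

Sorting by the time offset is what turns the flat list into something that can be followed scene by scene.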

These terms are all in a simple Access database that I created. In addition to using them as I repeat an entire movie or episode of a show, I also sometimes do a “Word List Review”, in which I will watch isolated scenes of various different sources to specifically reinforce vocabulary. In 30 minutes, I might watch clips from two different movies, a Boonie Bears episode, and a Qiao Hu episode, for example.

In order to render this process more efficient, I make use of a concept I became more familiar with when engaging in discussions last year on language learning forums: spaced repetition systems (SRS). The most cited example of SRS is Anki cards, a kind of digital flashcard for memorizing vocabulary or anything else. Anki cards are cool because they allow you to insert images, audio, or even video, and you can use them on your cell phone or any other device. Some people take this to the next level and break down an entire movie or episode into tiny clips, with dual-language subtitles, in a process abbreviated as subs2srs. Supposedly, this high-tech method of memorization lets you, in a short period of time, watch a movie in a completely new language, whether Japanese, Bahasa or Mandarin, without subtitles and with full comprehension.

Now, mind you, I never really used Anki cards or subs2srs. Being me, I had to reinvent the wheel. I didn’t really want to distract myself with creating Anki cards or parsing videos and using dual-language subtitles. Instead, I created simple queries in my Word List database that incorporate the spaced repetition concept. The idea is that, each time you review vocabulary or whatever you’re trying to memorize, you rank its difficulty. Items that you rank as more difficult will come back or repeat sooner, while those you rank as easy will only come back to you after much longer intervals.

I made a couple of little formulas in a database query to assess the priority of reviewing each term I register.

Anyone minimally familiar with Access or SQL will find them very easy to understand. First, I defined a variable called “age”, which is the current date minus the date that I registered that term.

age: Now()-[when]

Next, I attributed a number to each level of difficulty. Each time I review a word in a clip, I assess its difficulty as hard, medium, easy, or mastered.

Finally, I use these variables to help calculate the priority. The higher the number, the higher the priority and the sooner I should review the term. The field “reviewed” refers to how many times the term was reviewed in that specific source, while “total reviews” refers to how many times the term was reviewed in any source.

priority: ([age]/([total reviews]+[reviewed]*2+1))*[difficulty]

I then use a simple query to generate lists of terms with priorities over 50 and over 100, respectively. The lists indicate which words I should focus on reviewing. The way I most often use the lists is to choose what movie or episode to watch when I want to review vocabulary. For example, if I see that a movie I haven’t watched for a while has 15 words show up on the 50+ list, I will then watch the whole movie or, alternately, review the specific scenes where those terms come up.
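For readers who do not use Access, the priority formula and the two threshold lists can be sketched in Python. This is only an illustration under stated assumptions: the post does not give the numbers attached to each difficulty level, so the `DIFFICULTY` weights below are hypothetical, as are the sample terms. Age is counted in whole days here, whereas `Now()-[when]` in Access also carries the fraction of a day.

```python
from datetime import date

# Hypothetical weights: the post says each difficulty level gets a number
# but does not list the actual values.
DIFFICULTY = {"mastered": 1, "easy": 2, "medium": 4, "hard": 8}


def priority(registered, today, total_reviews, reviewed, difficulty):
    """(age / (total reviews + reviewed*2 + 1)) * difficulty, age in days."""
    age = (today - registered).days
    return age / (total_reviews + reviewed * 2 + 1) * DIFFICULTY[difficulty]


def review_list(terms, today, threshold):
    """Names of terms whose priority exceeds the threshold, highest first."""
    scored = [
        (priority(t["when"], today, t["total_reviews"], t["reviewed"],
                  t["difficulty"]), t["name"])
        for t in terms
    ]
    return [name for score, name in sorted(scored, reverse=True)
            if score > threshold]


# Hypothetical sample data, for illustration only.
today = date(2015, 4, 11)
terms = [
    {"name": "term-a", "when": date(2015, 1, 1), "total_reviews": 1,
     "reviewed": 1, "difficulty": "hard"},
    {"name": "term-b", "when": date(2015, 3, 2), "total_reviews": 0,
     "reviewed": 0, "difficulty": "easy"},
    {"name": "term-c", "when": date(2015, 4, 1), "total_reviews": 0,
     "reviewed": 0, "difficulty": "medium"},
]

print(review_list(terms, today, 50))
print(review_list(terms, today, 100))
```

The `+1` in the denominator keeps the formula defined for terms that have never been reviewed, and counting `reviewed` twice makes reviews within the same source weigh more than reviews elsewhere.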

This system consumes very little time. As I’ve mentioned before, I’m not sure whether this type of artifice improves or distracts from my learning. On the whole, I believe it is probably beneficial. However, what I am sure of is that it provides a psychological boost, as I have some quantitative parameter of progress.

I tested my comprehension while watching 15 minutes of a random Chinese soap opera in Mandarin using the methodology described in my Week 51 post. Although this year I have picked up the pace with my Mandarin viewing (and listening), I often wonder if I am making real progress. It was encouraging that the results—which are as objective as I can make them—suggest that my gains in comprehension are on track with my hours of listening.

I estimated that during the 15 minutes of the soap opera, approximately 1,453 words were spoken, of which I definitely understood 166, including repeats (such as “wo,” which means “I”). I believe this estimate is conservative, considering my self-test methodology. I feel confident in stating that I now understand about 11% of words spoken in Mandarin in the Singaporean soap opera Tale of Two Cities, and I expect this is representative of what I would understand in day-to-day standard Mandarin conversation.

This level of understanding, while far from true comprehension, is sufficient to allow me to understand more of what is happening in most situations than I would have 14 months ago, before beginning to learn Mandarin.

I am at 26% of my experiment time, or 313 hours. As the following graph illustrates, I have been squeezing more and more Mandarin viewing time into my days, especially since mid-December (due to how much I enjoy my experiment). This trend may be slightly reversed, as I have started my Law classes again. On the other hand, incorporating music while driving into my experiment allows me to put in more time.

The graph seems to show my daughter tapering off and giving up on the experiment. Fortunately, that is not true. In fact, I believe she may have settled into a long-term participation in the experiment, in which she puts in about 1/4 of the hours that I do. She calls it “our experiment” (I melt inside) and we have tons of fun. Basically, she listens to music with me in the car, watches Boonie Bears with me some evenings, and every once in a while we watch part of a movie in Mandarin, usually a dubbed Disney film.

Her pace of acquisition is obviously far slower than mine (which is slow enough). There is absolutely no pressure on her, so she has fun and I think she is gaining a few things:

Insight into the language-acquisition process

Understanding of what an experiment and a long-term project are

Glimpses into Chinese culture

Rudiments of Mandarin language

An example of stubborn persistence (hopefully not stupid obstinacy)!

Getting back to my test, I’m very glad I did it because it has given me renewed confidence to stay the course. My results during the first few minutes were over 15%, but then gradually dropped down to 11.27%, which I rounded down to 11%. I think this drop may be due in part to simply getting lazier about jotting down words as the 15 minutes dragged on. This difficulty in jotting down words is one of the reasons I believe 11% is actually a conservative estimate.

I am hopeful that after another 47 hours of listening, having reached 360 hours or the 30%-mark of my experiment, I will score above 12%, keeping pace with the first 240 hours, which took me to 8% comprehension.

I’ll bet most of you didn’t know I had a Chinese grandfather. Here’s the story, with many thanks to my mother for writing it down:

Victor’s “Chinese” grandfather

by Greta Browne, Victor Hart’s mother

Victor’s grandfather, George Chalmers Browne, would have loved to see Victor and Camila singing in Mandarin.

Chalmers, my father, was born in China in 1915, of Presbyterian missionaries who had met there as single missionaries a few years earlier. They raised three children, Chalmers, Beatrice and Francis, who all grew up speaking Chinese. Eventually my father, his sister and his brother left China to go to college in the United States, and my grandparents also left for good, in the mid-thirties, when the Japanese invasion threatened to engulf them in violence. . . .

I didn’t even think about this connection when I started my Mandarin experiment. It wasn’t part of my growing-up experience in any way. I suppose it’s just an interesting coincidence; a subtle karmic link gradually ripening into fruition; or an intergenerational, subconsciously transmitted attraction to China.

At any rate, I love the idea that my grandfather would have enjoyed following my experiment.

This past week was Carnival in Brazil. Instead of spending it in drunken debauchery as you non-Brazilians might expect, I had a great time with my family at the farm. Naturally, I watched three movies in Mandarin—Shaolin (again), Raise the Red Lantern, and To Live, all of which I would recommend unhesitatingly.

I watched the latter two without subtitles. It was the first time since early on in my experiment that I watched a Chinese feature film without any subtitles on first viewing.

I still understand little and it’s far less enjoyable than watching with subtitles. However, the experience was very different from when I saw Farewell My Concubine in the first month of my experiment. The number of words and short sentences I understand, though still small, now actually contributes significantly to my understanding of dialogue and of the plot in general, and thus to my enjoyment. This evidence of progress was encouraging, and I believe this past week will mark a gradual transition away from the use of subtitles when watching Chinese movies.

Another encouraging realization came this week when, speaking to my daughter one evening, I mentioned the Mandarin words for dog and cat. I then reflected that I have picked up quite a few animal names in Chinese! This knowledge comes partially from Qiao Hu and is not representative of my general (lack of) vocabulary in the language. Nonetheless, since I never intended to learn animal vocabulary, I was impressed and pleased that I have happened to pick up so much. Of course, I could be wrong on some of these, but I believe I know:

I have now watched 240 hours of Mandarin-language movies and TV shows, or 20% of the total time for my experiment. Nearly a year has gone by since I began this adventure on January 17, 2014.

The sounds of a language that was once utterly foreign to me have now become familiar, though not quite intelligible. As I reported at the 10% mark, I continue to make steady progress in my deciphering and comprehension. I now occasionally understand complete phrases, and in most sentences I can pick up at least one word.

My incipient comprehension is starting to become useful. When watching a regular movie or show without subtitles, the words and phrases I understand enhance my understanding of the plot, even if marginally.

At this 240-hour mark, I tested my listening comprehension using a new episode of the same Chinese soap opera I have used for this purpose in the past—A Tale of 2 Cities[1]. I think it is a good test because I never watch this particular show or even this genre—so the results are not influenced by previous familiarity with the content or specific voices and manners of speaking. At the same time, the dialogue seems to be in standard Mandarin[2] and is not technical, but rather about daily life. Thus, the results should be representative.

This time, I devised a simple system to measure more accurately and objectively the percentage of word occurrences I was understanding. As I watched, for the first time, 15 minutes of the episode, I jotted down the words I believed I understood. I then watched the entire 15 minutes again, one section at a time, verifying as best as I could which words I got right (discarding the ones I was unsure of) and estimating the total number of words in each section. Thus, within a couple of percentage points, I can confidently affirm that I now understand 8% of words in a routine standard Mandarin conversation, including repeats, inasmuch as this soap opera is a representative sample.
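The tallying step of this self-test can be sketched in a few lines of Python. The function name and the per-section numbers are hypothetical and are not the actual test results. The sketch only shows how the verified word counts and the estimated totals for each section combine into one percentage.

```python
def comprehension_rate(sections):
    """Share of word occurrences understood across all sections.

    sections: list of (words_understood, estimated_total_words) pairs,
    one pair per section of the clip.
    """
    understood = sum(u for u, _ in sections)
    total = sum(t for _, t in sections)
    return understood / total


# Hypothetical tallies for three sections of a clip, for illustration only.
sample = [(12, 100), (8, 110), (6, 90)]
print(f"{comprehension_rate(sample):.1%}")
```

Summing before dividing weights each section by its length, so a short section with an unusually high or low score does not skew the overall figure.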

The following graph shows how my estimated comprehension has evolved over time (blue line), alongside the time I have put in (red line).

If the rate of learning as measured for the first 240 hours were to continue indefinitely, I would understand 40% of the words (including repeats) by the end of my experiment, and would take 3,000 hours to reach 100% listening comprehension. Of course, that extrapolation is tenuous at best. The main reason the rate of learning would decline is diminishing returns, or more specifically, the diminishing word frequency of new words.[3]
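The arithmetic behind this extrapolation is a straight line through the 240-hour result. A minimal Python sketch follows, assuming the rate of 8 percentage points per 240 hours holds constant, which the text itself calls tenuous. The function names are made up for illustration.

```python
HOURS_MEASURED = 240.0  # hours watched at the time of the test
GAIN_MEASURED = 8.0     # percent of word occurrences understood at that point


def projected_comprehension(hours):
    """Percent understood after `hours`, assuming a constant linear rate."""
    return min(100.0, hours * GAIN_MEASURED / HOURS_MEASURED)


def hours_to_reach(percent):
    """Hours of viewing needed to reach `percent` at the same linear rate."""
    return percent * HOURS_MEASURED / GAIN_MEASURED


print(projected_comprehension(1200))  # end of the 1,200-hour experiment
print(hours_to_reach(100))            # full listening comprehension
```

With these inputs the two calls give 40 percent and 3,000 hours, matching the figures quoted above.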

On the other hand, the rate of learning might also accelerate because of the nature of the language acquisition process. I am listening to a large amount of audio content that I do not understand, but it nonetheless is entering my brain, which is evolutionarily designed to recognize patterns and create neural synapses to process the sounds efficiently. I am convinced that this cognitive development occurs far beyond what I can consciously and self-referentially perceive at any given time in terms of comprehension of actual words. As my brain silently labors, its Mandarin repository and processing ability gradually increase before finally manifesting as actual conscious comprehension of words and phrases.

Furthermore, like pieces in a 10,000-piece puzzle, the more words I learn (especially the “corner pieces” of key pronouns, verbs, conjunctions, and so forth), the more the general panorama comes into view. As this happens, deciphering new words in context becomes easier.

Although my self-assessments are rough estimates—especially the previous, less meticulous ones—my progress would seem to indicate that thus far, the latter beneficial phenomena have outweighed the diminishing word frequency factor. After the first 120 hours, I estimated I was understanding 2.75% of word occurrences, while after another 120 hours, I now estimate I understand 8% of them.

For the sake of conjecture, and despite the tenuous nature of any extrapolation, let us assume that I did continue my rate of an 8% increase in word occurrence comprehension for every 240 hours of listening. What would that spell for my hypotheses?

The first and main hypothesis is that I can learn to understand Mandarin just by watching authentic videos. Obviously, that hypothesis would be proven correct, since eventually I would get to 100% comprehension. Though any conclusive affirmations would be premature at this point, that conjecture is logical and consistent with my experience thus far. If I was able to get past the initial hurdle of deciphering and consolidating comprehension of a few dozen words in Mandarin[4], it seems self-evident that I will continue to make progress and eventually understand the language.

Skipping ahead, the third hypothesis is that after watching 1,200 hours of authentic Mandarin videos, I will have attained sufficient comprehension to tackle a new video, and on first viewing, understand the general plot or the topics that are being discussed. According to my extrapolation, after 1,200 hours I would understand 40% of word occurrences. I am unsure whether that would be enough to attain the aforementioned intermediate level of comprehension, but I do not believe it would be. I think to really understand the general plot and topics of any new video, one would need to understand closer to 60% of word occurrences.

This projection coincides with my subjective expectation based on how the experiment is going thus far. I think it is quite possible that my rate of acquisition will accelerate and, as a result, the percent of word occurrences will increase more quickly and reach 60%. On the other hand, I would not be surprised if that does not happen, and five or six years from now, at the end of my experiment, I am in fact at 40% comprehension, thus refuting the third hypothesis.

The second hypothesis is that this method is actually efficient and effective as compared to traditional, old school methods that are heavy on formal study, grammar rules, translations, and memorization. This hypothesis will be the most complex and controversial to assess.

A presumably very efficient method requires at least 4,600 hours to achieve a “professional working proficiency” in Mandarin, comprising listening, speaking, reading, and writing. I would guess that an inefficient traditional method might take twice that amount of time.

Further, I estimate that one needs to understand about 90% of word occurrences in speech between natives, as in a soap opera, to attain that level of proficiency[5]. At my current rate, extrapolated, that would take me 2,700 hours of viewing. It might then take me another 1,350 to achieve an equivalent level of speaking proficiency[6], bringing the total to 4,050 hours. That does not include learning Chinese characters and being able to read and write. If these estimates and my extrapolation prove accurate, it seems my method would be about as inefficient as traditional (old school) academic methods, and my second hypothesis would be refuted as well.
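The hour budget in this paragraph can be checked with a short calculation. The sketch below takes every input from the text (8% per 240 hours, the 90% target, the 1,350 speaking hours, and the 4,600-hour figure) and makes no claim beyond the arithmetic. The variable names are invented for the example.

```python
HOURS_PER_TEST = 240         # viewing hours behind the measured gain
GAIN_PER_TEST = 8            # percentage points gained in those hours
TARGET_COMPREHENSION = 90    # percent of word occurrences deemed necessary
SPEAKING_HOURS = 1350        # extra hours estimated for speaking
EFFICIENT_BENCHMARK = 4600   # hours cited for all four skills

listening_hours = TARGET_COMPREHENSION * HOURS_PER_TEST / GAIN_PER_TEST
total_hours = listening_hours + SPEAKING_HOURS

print(f"listening: {listening_hours:,.0f} hours")
print(f"listening + speaking: {total_hours:,.0f} hours")
print(f"left for reading and writing: {EFFICIENT_BENCHMARK - total_hours:,.0f} hours")
```

The last line shows how little of the 4,600-hour benchmark would remain for reading and writing, which is the basis for the conclusion about the second hypothesis.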

. . .

More importantly, though, I am having a lot of fun. As I’ve discovered during my current vacation period, watching Chinese movies and Boonie Bears cartoons is a great way to avoid dealing with more urgent, practical matters. I watched 48 hours of Mandarin between December 11 and January 6, but did not even touch the piles of unfiled papers in my closet!

Many of the Chinese movies I have watched enriched my life culturally, aesthetically, and philosophically.

The Boonie Bears have been a great bonding experience with my daughter and even with my wife on a few late nights when no one was sleepy! While watching the sadistic bears and their logger nemesis in action is not any more culturally or morally edifying than Bugs Bunny or Tom and Jerry, the great thing is that you can enjoy the plot and the antics without subtitles.

That is important, because in the past 40 hours, I have deliberately reduced my use of subtitles from a previous 70% of viewing to a current 60%. I will continue to reduce their use until most, and then all, of my viewing is without this crutch.

Of course, the most useful show I have found is Qiao Hu. It has no subtitles, I understand half of the dialogue, and I can easily pick up several new words in each episode. And it is really enjoyable (for a two-year-old)! Needless to say, I watch much less Qiao Hu than I “should” to avoid giving up on my experiment due to boredom.

I really look forward to being able to understand and enjoy movies without subtitles. While I probably will not get to that point anytime soon for first viewings, I expect that sometime this year or next it will become feasible to enjoy my favorite movies without subtitles, when watching them for the fourth or fifth time.

Since last July, my daughter has not watched enough Mandarin to make notable progress. Alas, I do not think she will learn in this way. Nevertheless, I believe the exposure she has had to this difficult and important language, and to Chinese culture through film, is enriching. If she decides to learn Mandarin when she is a little older, she will have a leg up because of this early exposure.

[3] If I understood every single occurrence of just 5 or 10 Mandarin words, my percentage would be much higher than my current result. However, that is not trivial, because the trick is being able to decipher those words in the context of sentences spoken quickly by native speakers.

Faithful longtime followers of my blog (hi Mom) will remember fondly that during the first three months of my Mandarin experiment, many delightful hours were spent watching Boonie Bears with my daughter, Camila Daya. The humor and basic storyline are quite comprehensible even if you don’t understand one word of Mandarin. Here’s a typical example:

I stopped watching Boonie Bears in April because I finally deemed it too difficult for total beginners. New decipherable words seemed to be very few and far between. I decided I should watch Qiao Hu in its place. Instead, I ended up watching mostly Chinese movies with English subtitles.

Last week I explained that I intend to gradually reduce the use of English subtitles in my Mandarin viewing. I will probably do so much faster than I specified in that post. Just in the past two weeks, I have gone from a previous average of 70% to a current 54% of viewing using subtitles. Watching movies without subtitles, however, is still not very enjoyable because I understand so little of the dialogue. So I decided to give Boonie Bears another try.

I had missed those hilarious fellas! Boonie Bears episodes are a perfect 10 minutes of mindless entertainment at the end of a long day. Even my wife watches with us, without the slightest intention of learning any Chinese!

My assessment of the Mandarin-learning benefit of Boonie Bears has changed significantly. Happily, I understand quite a bit more than before. As I’ve insisted recently with my skeptics, I’m definitely making progress. I am no longer listening to unintelligible garble. I understand a few words in each scene, enough to get the gist of the dialogue.

I also think I may have been somewhat off in my initial assessment. As I watch, I realize I did learn quite a few words from Boonie Bears in the early months—everything from how to answer a phone and ask who it is to specific vocabulary like hat and honey.

An unfortunate consequence of my initial abandonment of the ursine duo was that my daughter pretty much stopped watching Mandarin. Two nights ago, we watched Boonie Bears together just like in the old days—a total of 4 episodes or 40 minutes of viewing.

Shuda and Swar, we’re back and cheering you on as you protect the environment and endlessly torment that poor little lumberjack!

I recently decided to keep a daily word list for my Mandarin project. When watching videos, I attempt to decipher vocabulary, and pay special attention to those that are either new or not well consolidated. I add an average of one term per day to my list. My hope is that this approach will guarantee a bare minimum pace of vocabulary acquisition. My word-list goals and method are explained in detail in my Week 30 post.

This simple and unoriginal project-within-a-project got me thinking. Why do we essentially stop expanding our vocabulary in our native language, or at least slow down dramatically? Would it be feasible to use a similar method to continually acquire new words and over time become armed with an outsized lexicon? Could I employ a similar approach to the four languages I already speak as a way to ensure that my skills continue to improve?

These are not entirely new questions for me or for many of my readers. For the sake of brevity, I will not endeavor to answer all of them in this post, though I find the topic fascinatingly complex.

In my case, I know a total of four languages, two of them as a native speaker, and am now endeavoring to learn a fifth. Research suggests that the average university-educated adult has a receptive vocabulary in his or her native language of about 17,000 to 20,000 word families*. Let’s assume I’m at the higher end and have a 20,000-word vocabulary in English. In Portuguese—though I consider myself a native speaker—my vocabulary is somewhat smaller because I have studied and read much less than in English, so a reasonable estimate would be 15,000 words. In Spanish, which I use professionally in written and spoken form, I would guess 10,000. And my French, which is very rusty and quite poor in terms of productive vocabulary, nonetheless probably has something like 4,000 receptive words. In Mandarin, I’m guessing about 150 (listening only) at this point.

So what would happen over time if I were able to add one word per day to my receptive vocabulary in each language?

Estimate of Word Families in My Receptive Vocabulary

Language      Current   40 years old   50 years old   60 years old   71 years old
English       20,000    21,825         25,475         29,125         33,000
Portuguese    15,000    16,825         20,475         24,125         28,000
Spanish       10,000    11,825         15,475         19,125         23,000
French         4,000     5,825          9,475         13,125         17,000
Mandarin         150     1,975          5,625          9,275         13,000

I am currently 35 years old. By the time I’m 71, I would have a remarkably large lexicon in my native languages; a vocabulary comparable to an average educated native speaker of French and above average in Spanish; and a vocabulary akin in size to a native speaker of Mandarin without a college education. That would be amazing. Of course, there are many other components to language mastery, but I believe vocabulary is the single most important factor.
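The projection of one new word per day is simple enough to reproduce in a few lines of Python. This sketch assumes 365 days per year and the starting estimates given above. The function name is made up for illustration. For age 71 it yields 13,140 added words, slightly more than the 13,000 the table appears to use, presumably because the final column was rounded.

```python
CURRENT_AGE = 35
WORDS_PER_DAY = 1
DAYS_PER_YEAR = 365  # leap days ignored

# Starting estimates of receptive vocabulary, taken from the text.
current = {
    "English": 20000,
    "Portuguese": 15000,
    "Spanish": 10000,
    "French": 4000,
    "Mandarin": 150,
}


def projected_vocabulary(start, age):
    """Vocabulary size at `age` when adding WORDS_PER_DAY every day."""
    return start + (age - CURRENT_AGE) * DAYS_PER_YEAR * WORDS_PER_DAY


for language, start in current.items():
    row = [projected_vocabulary(start, age) for age in (40, 50, 60, 71)]
    print(f"{language:<11}" + "".join(f"{n:>8,}" for n in row))
```

Changing `WORDS_PER_DAY` shows how sensitive the long-term totals are to the daily pace.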

Receptive vocabulary is very different from, and far less impressive than, productive vocabulary, but undoubtedly many words, by some estimates up to half, make their way into our productive vocabulary.

There are a variety of problems with this theoretical undertaking. Without elaborating, I would contend that the two main ones are time constraints and long-term retention.

Nevertheless, for a linguaphile, the prospects are tantalizing. Who knows? This could mark the beginning of a brand new experiment.