Why wasn’t Google Glass popular for translation?

Aside from missing family and friends and finding that wearing an air pollution mask tended to fog up my glasses, one reason that I was happy to return home from China was that it was no fun being illiterate. WeChat can be used to translate a sign or menu into English, but it is somewhat cumbersome. Same deal with Google Translate, which works to turn English text into characters to show shop and restaurant personnel.

It occurred to me to wonder why systems such as Google Glass hadn’t caught on simply for the purpose of finding text in every scene and translating into the traveler’s home language. Was there simply not enough battery power to have the thing running continuously? It would have added a lot to the trip if I could have just walked around streets and museums and, without having to take any explicit action, seen English language versions of all of the surrounding writing.

10 thoughts on “Why wasn’t Google Glass popular for translation?”

Wouldn’t it be easier just to hire a guide for a few RMB, say a student who wants to practice his or her English? I doubt such a person would have been particularly hard to find, and it would probably have been a lot more pleasant and educational than walking around with some contraption on your nose.

(Generally speaking, prices in Shanghai were not that different from prices in Boston. Chinese-made goods, actually, seem to be cheaper at Costco here in Waltham than in shops in Shanghai, Suzhou, or Hangzhou.)

It’s a great idea and I think one of the better uses for augmented reality. You’re far above average intellectually (which I think partially explains why more people don’t see this potential) and it would be even more of a help for less well-endowed people. I also agree with Jack, and the best thing would be a combination of both, where you and the guide could share the AR via Bluetooth. I think it will be more popular once the price comes down, so that a really decent version of the glasses comes in at around $100, and tour guides can offer them as part of the package. Imagine walking around, looking at a building, and asking the tour guide: “What’s the history of that building?” Because the guide is sharing the AR, the basic information comes up, and the guide can do just that: help guide you along from place to place and supply additional information to keep everything focused. Then add in a recording and logging function so you can download the whole experience and keep a detailed travelogue.

The other applications of this kind of AR could be revolutionary, for instance, in real estate sales. Imagine you have a pair of these glasses and every time you make a home improvement, you log the experience. Then, when you go to sell your property, whether FSBO or through an agent, you have not just the transaction records but the whole repair and improvement, basically a complete history of the property, available to the prospective buyer, and you use that as an added value to the sale – the better the records, the more firm you can be with pricing. Aircraft ownership and maintenance, same thing. The engine was overhauled at X hours, here’s the AR.

PS – I have to give credit for the genesis of this idea (at least for me) to my father. He was a tinkerer in addition to his EE background, and because of that he understood the implications of Moore’s Law better than most people. When I was about 5 years old, we had one of the original Polaroid Land cameras (the nice ones with the leather). I was amazed by that camera; it was like a miracle to me. One day, he assured me, you would be able to wear around your neck a small device that would record your entire life, with audio and your own annotations, and you would be able to completely document everything you ever did. One of his favorite quips when really big USB (>1GB) storage became available: “If you walked into an IBM board meeting back in 1960, tossed this on the table and told them what it did and what it cost, they’d have had you carted off to a rubber room, locked it in a vault, and pretended they never saw it while they tried to reverse engineer it.”

In a more practical vein, think of the implications for “know how” – for example in aerospace manufacturing. One of the problems we would face, for example, if we wanted to manufacture a working Saturn V engine is that, sure, we’ve got the blueprints, but the “know how” details that were necessary to actually build the engines and make them work have been lost to time. Those notes and experiences are gone. There will be no need to lose that information in a few years. Then the problem becomes controlling the access to the “secrets.”

IMHO, the OCR translation was simply not good enough back in 2013. Alternatives, such as the Pleco phone suite, worked better but only barely. As someone who lives within walking distance of a Chinatown, I had to learn a couple hundred characters to be able to read a Chinese menu.

The OCR has got to be better now, particularly when augmented by AI with a big cloud computing system backing it all up. Even the lowly US Postal Service scans every single piece of mail in the United States – that’s why you can get Informed Delivery. They scan and track every piece. And they not only read the IMb (Intelligent Mail barcode) codes (which are subject to a lot of printing variation depending on which machines apply them) but they also do a very good job of reading the printed addresses, including handwritten addresses, if for some reason the barcode read fails. It’s all automated. If the barcode read fails, the USPS equipment attempts to decipher the address and prints a NEW barcode that gets stuck on the envelope. You’ve probably received pieces of mail with an extra barcode sticker and wondered why/how it was there. The machines doing this run too fast to follow with the human eye. I have to believe that even Chinese character sets, even hand-written, could be deciphered at the same speed, and of course it would get better the more you use the device.

You’re way underestimating the difficulty of rendering random first-person-perspective images into text. Deep learning is what has everybody thinking amazing human-like things are possible, but it’s really quite limited.

Nobody has a clue how to build a car that can drive around and see what’s going on, like a competent human can do. Ten years ago people thought deep learning would do that. Welp, didn’t work.