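Commercial Times (工商時報) [Nikki Lu]
Reading warm-up: Apple, Amazon, Microsoft and Google all offer voice assistants, but which is best? According to a Reuters report, Apple's voice assistant Siri may no longer have the edge in recognizing speech and answering questions, but it keeps one major advantage: it speaks the most languages. Now it is about to learn Shanghainese, so let's see how it does that. Before reading the article, think about how you would say the following in English: A. 量身訂做 B. (說話)含糊的 C. 規模化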
The voice-assistant wars are in *full swing, with Apple, Amazon, Microsoft and now Google all offering electronic assistants to take your commands.
Many researchers believe that Apple has squandered its lead when it comes to understanding speech and answering questions. But there is at least one thing Siri can do that the other assistants cannot: speak 21 languages localized for 36 countries, a very important capability in a smartphone market where most sales are outside the United States.
Microsoft Cortana, by contrast, has eight languages (A)tailored for 13 countries. Google's Assistant, which began in its Pixel phone but has moved to other Android devices, speaks four languages. Amazon's Alexa features only English and German. Siri will even soon start to learn Shanghainese, a special dialect of Wu Chinese spoken only around Shanghai.
Apple starts work on a new language by bringing in people to read passages in a range of accents and dialects. The passages are then transcribed by hand so the computer has an exact representation of the spoken text to learn from, said Alex Acero, head of the speech team at Apple. Apple also captures a range of sounds in a variety of voices. From there, an acoustic model is built that tries to predict word sequences.
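Alex Acero, head of Apple's speech team, said that when developing a new language, the company has real people with a range of dialects and accents read passages aloud, which are then transcribed by hand so that the computer has accurate samples to learn from. Apple also captures a range of sounds from different voices and then builds an acoustic model that tries to predict word sequences.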
Then Apple deploys “dictation mode,” its speech-to-text feature, in the new language, Acero said. When customers use dictation mode, Apple captures a small percentage of the audio recordings and makes them anonymous. The recordings, complete with background noise and (B)mumbled words, are transcribed by humans, a process that helps cut the speech recognition error rate in half.
After enough data has been gathered and a voice actor has been recorded to play Siri in a new language, Siri is released with answers to what Apple estimates will be the most common questions, Acero said. Once released, Siri learns more about what real-world users ask and is updated every two weeks with more tweaks.
But script-writing does not (C)scale, said Charles Jolley, creator of an intelligent assistant named Ozlo. “You can't hire enough writers to come up with the system you'd need in every language. You have to synthesize the answers,” he said.
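However, Charles Jolley, creator of the intelligent assistant Ozlo, said that script-writing does not scale: “You cannot hire enough writers to build the system needed for every language. The answers have to be synthesized.”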
Viv, a startup founded by Siri's original creators that Samsung acquired last year, is working on just that. “Viv was built to specifically address the scaling issue for intelligent assistants,” said Dag Kittlaus, the CEO and co-founder of Viv. “The only way to leapfrog today's limited functionality versions is to open the system up and let the world teach them.”
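Viv, the startup from the “fathers of Siri,” is working to solve this problem. Samsung acquired the company last year. Dag Kittlaus, Viv's co-founder and CEO, said: “Viv aims to solve the scaling problem for intelligent assistants. The only way to move beyond today's limited-functionality versions is to open the system up and let the world teach them.”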