In an on-stage demo at its annual I/O conference, Google CEO Sundar Pichai showed the audience what he claimed was a real call between Google’s Duplex technology and a hair salon.
The call was impressive because it showed how well the technology captures the many nuances of human speech.
Not only is the technology capable of incorporating subtle acknowledgements like “mmmhhmm” into a conversation, it is seemingly able to comprehend complex speech patterns.
The demonstration was the outcome of years of development and advances in the field of voice technology.
In its official post accompanying the on-stage demonstration, Google stated, “Google Duplex’s conversations sound natural thanks to advances in understanding, interacting, timing, and speaking”.
Although the demonstration marked an impressive advance in technology’s ability to process spoken English, the online commentary in its immediate aftermath raised a number of important questions:
- Will the call handler be aware they’re talking to an automated agent?
- How will the technology be used?
- How consistent is the technology?
On the first point, Google hasn’t told us very much. In the post on its blog, it states, “We want to be clear about the intent of the call so businesses understand the context. We’ll be experimenting with the right approach over the coming months”.
Its lack of explanation around this point is particularly surprising considering the amount of scrutiny big tech has been under in the last few months around its approach to user consent and privacy.
Moreover, coverage around Google I/O seems to have conveniently ignored issues that have dominated press headlines for months in favor of fawning over yet another technology that prioritizes convenience as its primary selling point.
On the second point, Google states, “The technology is directed towards completing specific tasks, such as scheduling certain types of appointments,” which perhaps signifies its use is highly dependent on context.
The post also makes it clear that the technology is “self-monitoring”, i.e., if it can’t complete a task autonomously, it will hand off to a human operator.
However, where there’s automation for the individual, there’s automation for the corporation.
Automated diallers are in widespread use in the commercial world and, according to the latest data, UK citizens receive upwards of 6 million calls daily about insurance, personal injury and other matters of high transactional value to corporations.
Email, at the time of its mainstream adoption, was widely recognized as a means to speed up communication and make our lives simpler.
Today, spam or automated messages account for 59 percent of email traffic worldwide (and the USA accounts for 12.8 percent of that total).
When technology geared around automation or convenience is developed, people tend to think about how they can use it rather than how it will be used against them.
On whether the technology will be consistent, it’s impossible to say without working with it in a real environment.
Apple’s digital assistant Siri was released amid fanfare and high expectations in 2011; however, many commentators feel it has fallen short of the mark.
Whether Google Duplex represents another big promise or a meaningful development in voice technology is yet to be seen.