Introduction
In 1985 I worked on my first natural language interface. We
were building a proof-of-concept system that would allow top managers
at an insurance company to use natural language commands to query a
database. If the system didn't know a word that was used, it would come
back with a question asking for the meaning of the term it did not know.
During an early test one of the users asked, "How much life insurance
has been sold last year in the Northeast?" The computer questioned
us back with, "What is the meaning of life?" How thrilled we
were that we had created the first thinking computer!!

During the last few months I have talked to many people about
this book I was writing. I
can't count how many people mentioned the Star Trek computer or HAL
from 2001 to me when I would tell them the title of the book. For a while
I thought it was neat that people could relate to what I was writing about.
After all, if I mention the title of my last book, GUI Design Essentials,
to people who are not in the field of computers I usually get blank, glazed
stares. As the same comments continued to come I started to inwardly groan.
But as I write this now, in the thick of the project, I have decided that
it is significant. The idea of a talking computer and a computer that
understands our speech is different, it's striking, it's a novel
idea that captures people's attention and imagination. At first it
seems easy to understand why so many people who had seen 2001 would
remember HAL. HAL is the somewhat sinister computer in 2001 who refuses
to open the pod bay doors. People remember
him because he's evil, right? But then, how many evil
characters are there in movies? Do we always remember their names? And
what about the original computer in Star Trek? That androgynous computer
was just called "Computer." And he/she wasn't sinister
at all. Why do we remember that one?

I think it's because speech is something that is uniquely
human. Dogs bark, and chimpanzees can communicate with written
language, but only humans talk. To talk is to be human, and HAL and Computer
stick in our memory because they are computers with uniquely human traits.
We are fascinated by this idea, and maybe even repelled by it.

No wonder, then, that speech capabilities in computers seem
to hold that love/hate, avoidance/attraction aspect for us. Speaking is
so natural to us as humans that we want to be able to talk to our computers.
It seems that it would be easier. And we want our computers to talk to
us, because listening to information rather than reading it leaves our
eyes and hands free to do other tasks. Yet we hesitate. We come up with
so many reasons not to embrace the technology.

True, the reasons are plentiful and many are all too valid.
The state of speech recognition, and dealing with the errors it produces, is
enough to still drive many away. I used speech technology, specifically
dictation systems, when I was writing this book. I dictated the phrase
"and manual output, and performance was worse with auditory
input and spoken output." When I glanced up at the screen the recognizer
had written "and Immanual Kant put an the of paint boatman and Spokane
output." Last week a press release from the University of Southern California
proclaimed a breakthrough in speech recognition using neural network technology.
The improvements are dramatic. By the time this book is released, using
neural networks for speech recognition may have made many of the error
handling and recovery issues unimportant. None too soon for me.

No, it's more than error rate we worry about. I think we
worry about losing our uniqueness. We don't want humans to be machines,
and we are still uncomfortable with our machines being human-like.

I don't believe that talking to a computer, or listening
to a computer makes us any less human, or the computer any more human.
We are about to overcome our hesitancy and leap into speech in a big way.
What will that transition be like? Will it continue to happen slowly?
Or will it snowball all at once?

We are historically very poor at predicting the pace of acceptance
of new technologies. In a conversation I had with one of the original
creators of the cell phone, he told me that when they developed the first
cell phone they thought the total world-wide market was about 100 phones
(top CEOs and maybe heads of large, powerful governments). T. J. Watson of
IBM is said to have predicted a world-wide market of about 5 computers (of
course he meant large mainframes). The internet was confined to a group of scientists
for many years. We cannot predict the speed or the direction or the use
of new technologies.

I suspect, however, that the uses of speech technologies will
be much more social and human than we think. When people embrace a technology
they take it into their social situations. People sign on to the internet
to chat with each other, or to sell items as in a big garage sale. People
use their cell phones to call home and talk to children and spouses. Certainly
business and commerce will be targeted for speech applications, as they
have already been, but if you want to find the point at which the technology
is fully embraced, look for places where speech brings us closer to other
people.
ISBN 0-471-37545-4
416 pages
April, 2000
Wiley Computer Publishing