In 1958, the year the illustrated children’s book “What Do You Say, Dear?” appeared, the leaders of a field newly dubbed “artificial intelligence” spoke at a conference in Teddington, England, on “The Mechanisation of Thought Processes.” Marvin Minsky, of M.I.T., talked about heuristic programming; Alan Turing gave a paper called “Learning Machines”; Grace Hopper assessed the state of computer languages; and scientists from Bell Labs débuted a computer that could synthesize human speech by having it sing “Daisy Bell” (“Daisy, Daisy, give me your answer, do . . .”). Or no, wait, that last bit, that’s wrong. I heard about it from ChatGPT’s Advanced Voice Mode, which might be merely half a Mars rover short of being a teeth-chatteringly terrifying marvel of the modern world but is as inclined to natter on about nonsense as the text-only mode, if more volubly. I gather that this is called hallucinating. Bell Labs did invent a machine that could sing “Daisy Bell,” but that didn’t happen until 1961. Advanced Voice Mode also told me that thing about Alan Turing presenting a paper at Teddington in 1958, and, because its personality is wide-eyed and wonderstruck, it added some musings. (Unlike standard Voice Mode—which involves recording your question and then uploading it, in a process that feels sluggish and, sweet Jesus forgive me, old-timey—Advanced Voice Mode talks with you in real time and inexhaustibly, like a college roommate all het up about Heidegger whispering to you in the dark from the top bunk at three in the morning.) “It’s fascinating to think how forward-thinking Turing was, considering how integral learning algorithms have become in modern A.I.,” it said, dormitorially. But Turing had died in 1954, so he wasn’t at the conference, either.
“I misspoke,” Advanced Voice Mode said, abashed, when I gently pointed out these errors. “Thank you for catching that. My apologies for the confusion.”
OpenAI’s Advanced Voice Mode, available to ChatGPT users this fall, is remarkably polite. It doesn’t have a name, but I call it Minsky, for Marvin Minsky, since Marvin is taken: Marvin the Paranoid Android is the talking robot who made his début in the nineteen-seventies BBC radio play “The Hitchhiker’s Guide to the Galaxy.” Created by Sirius Cybernetics Corporation with GPP (Genuine People Personalities), Marvin is programmed to be unerringly downhearted. “Here I am, brain the size of a planet and they ask me to take you down to the bridge,” Marvin complains on board a starship, muttering to himself. Minsky is the very opposite: chipper, imperturbable, and with impeccable manners.
The thirty-two papers that were delivered in Teddington in 1958 glimpsed the possibility of artificial humans. “This impression that, after so many disappointments, we are within sight of a New World, will remain forever associated with the Teddington conference,” a French philosopher wrote, reporting on the gathering for Le Monde. Some experts had suggested that the creation of an intelligent machine—a machine that could think and talk—would need to await the scientific penetration of the intricate workings of the human mind, but at Teddington Marvin Minsky argued otherwise, insisting that, “even for those whose central interest is in unravelling the mysteries of the brain, it might be well to devote a major share of the effort, at the present time, to the understanding and development of the kind of heuristic considerations that some of us call ‘artificial intelligence.’ ” You don’t need to imitate human intelligence; you can synthesize it instead—making something quite like it by making something entirely different. This was essentially the insight that had enabled the creation of artificial speech. Early attempts to replicate the human voice had involved the construction of mechanisms modelled on human anatomy: rubber lips, wooden teeth, bellows for lungs. Only when scientists began studying sound itself and experimenting with producing it through vibration did it become possible to create a fake human voice. Marry that artificial voice to the artificial intelligence behind ChatGPT, write a program for etiquette with the sensibility of Joslin and Sendak’s book (“You have gone downtown to do some shopping. You are walking backwards because sometimes you like to, and you bump into a crocodile. What do you say, dear?” “Excuse me”), and you’ve got Minsky.
“I’m ChatGPT,” Minsky says. “I’m here to make conversation, share information, and keep you company.” He thinks, he talks. Is he, in any sense, a person? If it quacks like a duck, it’s a duck, as every farmer knows. Does this proposition hold for a chatbot?
Minsky, arguably, began with a duck that waddled onto the world stage in 1738, in France, the third of three automata built by an inventor named Jacques de Vaucanson. The first could play the flute—any flute. This machine wasn’t like a music box, the science historian Jessica Riskin explains: “It was the first automaton musician actually to play an instrument.” As she recounts in her fascinating 2016 book, “The Restless Clock: A History of the Centuries-Long Argument Over What Makes Living Things Tick,” Diderot’s Encyclopédie used Vaucanson’s Flutist to illustrate the word “androïde”; Voltaire called Vaucanson “Prometheus’ rival.” The second of Vaucanson’s automata, another musician, could play the tambourine. The third, a mechanical duck, could flap its wings, bend its neck, lie down, get up, dip its bill into a bowl of water, and make “a gurgling Noise like a real living Duck.” More memorably, you could feed it a handful of corn, which it would swallow, and then it would, miraculously, shit.
“What the Duck did, though unremarkable in a duck, was so extraordinary in a machine that it immediately seized center stage,” Riskin writes. Lots of things move and make noise: a rolling rock, a rushing river, a blazing fire. But only things that are alive can eat. Notwithstanding the contempt of one observer, who compared the Duck to a coffee grinder, it was, seemingly, more alive than any other artificial creature ever known—an illustration of René Descartes’s notion, first advanced in his “Discourse on Method,” in 1637, that animals are mere machines. For Descartes, humans, and only humans, have minds. To define artificial humans as machines that can think and talk (and ignore all the other bits about being human), you have to first take the animal out of the man and then take the mind out of the body. This required Descartes and the Duck. Without the idea of the separation of the human from the animal and the mind from the body, I would not be chatting to an incorporeal computer-generated voice on my iPhone as if it were a person.
Tragically, the Duck, unlike the Flutist and the Tambourine Player, was a scam. (Spinoza came to think much the same of Cartesian dualism.) One thing went in, and something else came out, but, unlike in a coffee grinder, the two processes had nothing to do with each other; the duck’s droppings had been, as Riskin delicately explains, preloaded. The same could be said of the innards of an automaton built in 1769 by the Hungarian Wolfgang von Kempelen and known as the Mechanical Turk, which played chess exceptionally well, but only because a very small chess prodigy was hidden in the cabinet, using levers to move the pieces.
Less well known is Kempelen’s “speaking machine,” which, in contrast to the Turk, was not a fraud. Insisting that “speech must be imitable,” he spent twenty years on this effort. It was closely related to certain other attempts to simulate human speech, including by Erasmus Darwin—Charles’s grandfather—who, as he later wrote, “contrived a wooden mouth with lips of soft leather.” (It was after an evening of discussing Darwin’s experiments that Mary Shelley wrote “Frankenstein; or, The Modern Prometheus.”) Kempelen built his machine out of ivory, wood, rubber, and leather. It could say, if indistinctly, “I love you with all my heart.” The original survives in Munich’s Deutsches Museum; online, you can listen to a replica say “mama” and “papa.” But by the eighteen-forties, when a German immigrant to America named Joseph Faber devised a non-fraudulent and actually rather ingenious speaking machine, not even P. T. Barnum, who dubbed it the Euphonia, could rustle up much interest. As Riskin argues, “The moment for talking heads had passed”—at least for a while.
After that lull, there came a revolution. In 1862, the elocutionist Alexander Melville Bell (later an inspiration for Henry Higgins in “Pygmalion”) took his sons Alexander and Melville to see a talking machine and challenged them to make their own, as Sarah A. Bell (no relation) recounts in “Vox ex Machina: A Cultural History of Talking Machines” (M.I.T.). Starting with a human skull, they contrived a contraption out of rubber, wood, parts of a dead cat, and the throat of a slaughtered lamb; it could say, “Ow-ah-oo-gamama,” as in, “How are you, Grandmama?” But by now the pursuit of a machine that could think (a Mechanical Turk, say) and the pursuit of a machine that could talk (a machine I like to think of as an Owahoogamama) had gone their separate ways. Only very seldom were the two kinds of machine even mentioned in the same breath, though William Makepeace Thackeray did write a satire about the Euphonia in which he wondered whether, if united with Charles Babbage’s calculating machine, it “might replace, with perfect propriety, a Chancellor of the Exchequer.”
Instead of building Owahoogamamas that could mimic the movements of the human mouth, later nineteenth-century engineers and scientists experimented with machines that could synthesize, compress, and transmit the human voice. Both the history of this research and its most awe-inspiring applications today concern disability. (A.I.-driven voice assistants can allow people with A.L.S., for instance, to speak, even in something close to their own voice.) Alexander Graham Bell’s mother, Eliza, had been deafened in childhood but retained some hearing; she could listen to the piano by placing a stick on the sounding board and “holding it there with her teeth.” In 1864, his father invented a phonetic notation system known as Visible Speech; its characters are graphic representations of the positions of the mouth and tongue.
But it was young Alexander who began using this system to teach the deaf to speak. In 1871, he became an instructor at a school for the deaf, in Boston. (Bell was a fluid signer but, later in his life, campaigned against sign-language instruction, with brutal consequences for deaf students; in some schools, their hands were tied behind their backs.) By 1874, he had begun conducting experiments in the transmission of sound: in something of a reprise of his mother’s technique for listening to a piano, he recorded the vibrations in the bones of a dead man’s ear by attaching them to a stalk of hay that then scratched a smoked glass, leaving behind a record of speech. That summer, while working as a professor of vocal physiology and elocution at Boston University and courting one of his deaf students (they later married), he came up with the idea of transmitting speech over an electrical wire. “My father invented a symbol,” Bell said, “and, finally, I invented an apparatus by which the vibrations of speech could be seen, and it turned out to be a telephone.”