From charlesreid1


A biography of Claude Shannon, whose theory of information ushered in much of the current information age.


Part 1

Geniuses are the luckiest of mortals because what they must do is the same as what they most want to do and, even if their genius is unrecognized in their lifetime, the essential earthly reward is always theirs, the certainty that their work is good and will stand the test of time. One suspects that the geniuses will be least in the Kingdom of Heaven—if, indeed, they ever make it; they have had their reward. —W. H. AUDEN

After it was over, someone asked the chairman to put into perspective what had just happened. “It was,” he said, “as if Newton had showed up at a physics conference.”

Of course, information existed before Shannon, just as objects had inertia before Newton. But before Shannon, there was precious little sense of information as an idea, a measurable quantity, an object fitted out for hard science. Before Shannon, information was a telegram, a photograph, a paragraph, a song. After Shannon, information was entirely abstracted into bits. The sender no longer mattered, the intent no longer mattered, the medium no longer mattered, not even the meaning mattered: a phone conversation, a snatch of Morse telegraphy, a page from a detective novel were all brought under a common code.

It is a puzzle of his life that someone so skilled at abstracting his way past the tangible world was also so gifted at manipulating it. Shannon was a born tinkerer: a telegraph line rigged from a barbed-wire fence, a makeshift barn elevator, and a private backyard trolley tell the story of his small-town Michigan childhood. And it was as an especially advanced sort of tinkerer that he caught the eye of Vannevar Bush—soon to become the most powerful scientist in America and Shannon’s most influential mentor—who brought him to MIT and charged him with the upkeep of the differential analyzer, an analog computer the size of a room, “a fearsome thing of shafts, gears, strings, and wheels rolling on disks” that happened to be the most advanced thinking machine of its day.

And it brought him to Bell Labs, an industrial R&D operation that considered itself less an arm of the phone company than a home for “the operation of genius.” “People did very well at Bell Labs,” said one of Shannon’s colleagues, “when they did what others thought was impossible.” Shannon’s choice of the impossible was, he wrote, “an analysis of some of the fundamental properties of general systems for the transmission of intelligence, including telephony, radio, television, telegraphy, etc.”—systems that, from a mathematical perspective, appeared to have nothing essential in common until Shannon proved that they had everything essential in common.

In 1990, the Voyager 1 probe turned its camera back on Earth from the edge of the solar system, snapped a picture of our planetary home reduced in size to less than a single pixel—to what Carl Sagan called “a mote of dust suspended in a sunbeam”—and transmitted that picture across four billion miles of void. Claude Shannon did not write the code that protected that image from error and distortion, but, some four decades earlier, he had proved that such a code must exist. And so it did.

Having completed his pathbreaking work by the age of thirty-two, he might have spent his remaining decades as a scientific celebrity, a public face of innovation: another Bertrand Russell, or Albert Einstein, or Richard Feynman, or Steve Jobs. Instead, he spent them tinkering.

An electronic, maze-solving mouse named Theseus. An Erector Set turtle that walked his house. The first plan for a chess-playing computer, a distant ancestor of IBM’s Deep Blue. The first-ever wearable computer. A calculator that operated in Roman numerals, code-named THROBAC (“Thrifty Roman-Numeral Backward-Looking Computer”). A fleet of customized unicycles. Years devoted to the scientific study of juggling. And, of course, the Ultimate Machine: a box and a switch, which, when flipped on, produced a whirring of gears and a mechanical hand that emerged from the box, flipped the switch off, and disappeared again. Claude Shannon was self-effacing in much the same way. Rarely has a thinker who devoted his life to the study of communication been so uncommunicative.

He worked with levity and played with gravity; he never acknowledged a distinction between the two. His genius lay above all in the quality of the puzzles he set for himself.

At a meeting of the school board it was decided not to hire any married women teachers during the coming school year due to economic conditions. It was decided that when a husband was capable of making a living it would be unfair competition to hire married women. Mrs. Mabel Shannon, Mrs. Lyons, and Mrs. Melvin Cook will be out of the school system due to this ruling.

By that point, at least, there was much in her private life to occupy her.

The trees drew the lumber industry, and the first visitors and inhabitants were willing to contend with the climate for the rich cache of white pine and hardwoods. But the environment was austere, with subzero temperatures and thick lake-effect snow. A local history from 1856 concluded, perhaps self-servingly, that the harsh climate offered a brand of moral education: “The fact that [Northern Michigan’s] pioneers had more to struggle against in order to provide homes for themselves and the necessary accompaniments of homes developed in them a degree of aggressive energy which has remained as a distinct sectional possession . . . a splendid type of manhood and womanhood—self-reliant, strong, straight-forward, enterprising and moral.”

Biographies of geniuses often open as stories of overzealous parenting. We think of Beethoven’s father, beating his son into the shape of a prodigy. Or John Stuart Mill’s father, drilling his son in Greek at the tender age of three. Or Norbert Wiener’s father, declaring to the world that he could turn anything, even a broomstick, into a genius with enough time and discipline. “Norbert always felt like that broomstick,” a contemporary later remarked. Compared to those childhoods, Shannon’s was ordinary.

Reflecting on his education with the benefit of hindsight, Shannon would say that his interest in mathematics had, besides sibling rivalry, a simple source: it just came easily to him. “I think one tends to get into work that you find easy for yourself,” Shannon acknowledged.

He loved science and disliked facts. Or rather, he disliked the kind of facts that he couldn’t bring under a rule and abstract his way out of. Chemistry in particular tested his patience. It “always seems a little dull to me,” he wrote his science teacher years after; “too many isolated facts and too few general principles for my taste.”

On April 17, 1930, thirteen-year-old Claude attended a Boy Scout rally and won “first place in the second class wig-wag signalling contest.” The object was to speak Morse code with the body, and no scout in the county spoke it as quickly or accurately as Claude. Wig-wag was Morse code by flag: a bright signaling flag (red stands out best against the sky) on a long hickory pole. The mediocre signalers took pauses to think; the best, like Claude, had something of the machine in them. Right meant dot, left meant dash, dots and dashes meant breaks in the imaginary current that meant words; he was a human telegraph.

And the grandson inherited the tinkering gene. “As a young boy, I built many things, working with mechanical stuff,” he recalled. “Erector sets and electrical equipment, built radios, things of that sort. I remember I had a radio controlled boat.”

Predictably, Claude grew up worshipping Thomas Edison. And yet the affinity between Edison and Claude Shannon was more than happenstance. They shared an ancestor: John Ogden, a Puritan stonemason, who crossed the Atlantic from Lancashire, England, to build gristmills and dams, and with his brother raised the first permanent church in Manhattan,

In 1895, the then-dean of the engineering school, Charles Greene, had been asked to create plans for a new building to house the school’s growing student body. Greene’s request—$50,000 for a small, U-shaped structure—was granted. He died before he could carry out the construction, and Cooley succeeded him as dean. Asked to judge his predecessor’s plans and funding needs, Cooley replied, “Gentlemen, if you could but see the other engineering colleges with which we are forced to compete, you would not hesitate for one moment to appropriate a quarter of a million dollars.” Something about Cooley’s understated certainty swayed the board, and his request was swiftly approved.

A public exhibition in 1913 showcased the spoils of the expansion, as close as a university has probably come to something like a world’s fair. Ten thousand people came to tour the facilities and take in the latest technological marvels. Electrical engineers sent messages over a primitive wireless system. Mechanical engineers “surprised their visitors by sawing wood with a piece of paper running at 20,000 revolutions per minute, freezing flowers in liquid air, and showing a bottle supported only by two narrow wires from which a full stream of water flowed—a mystery solved by few.” Two full torpedoes, two large cannons, and “a complete electric railway with a block signal system” rounded out the demonstrations. “For the average student as well as for the casual visitor, the Engineering corner of the Campus held mysteries almost as profound as the deeper mysteries of the Medical School,” observed one writer.

Though the dual degree was common enough, Shannon’s variety of indecision, which he never entirely outgrew, would prove crucial to his later work.

Someone content to build things might have been happy with a single degree in engineering; someone drawn more to theory might have been satisfied with studying math alone. Shannon, mathematically and mechanically inclined, could not make up his mind, but the result left him trained in two fields that would prove essential to his later successes.

He joined Radio Club, Math Club, even the gymnastics team. Shannon’s records of leadership during this time are two. One is his stint as secretary of the Math Club. “A feature of all meetings,” a journal recorded, “was a list of mathematical problems placed on the board and discussed informally after the regular program. A demonstration of mathematical instruments in the department’s collection made an interesting program.” The other was news enough that the hometown paper saw fit to print it as an item of note: “Claude Shannon has been made a non-commissioned officer in the Reserve Officers Training Corps at the University of Michigan.”

In the Engineering Buildings, where Claude spent the bulk of his time, his classmates tried the strength of shatterproof windshield glass, worked to muffle milk-skimming machines, floated model battleships on a sunless indoor model sea. But the real life on campus was outside the classroom.

Buoyed, we imagine, by this first success, Shannon again submitted a solution and was again published in the Monthly’s back pages, in January 1935, in answer to this problem:

E 100 [1934, 390]. Proposed by G. R. Livingston, State Teachers College, San Diego, California. In two concentric circles, locate parallel chords in the outer circle which are tangent to the inner circle, by the use of compasses only, finding the ends of the chords and their points of tangency.

Modest as they are, these early efforts are a window into the education of Claude Shannon. We can infer from them that the college-aged Shannon understood the value of appearing in a professional public forum, one that would earn the scrutiny of mathematicians his age and the attention of those older than him. That he was reading such a journal at all hints at more than the usual attention paid to academic matters; that his solutions were selected points to more than the usual talent.

Above all, his first publications tell us something about his growing ambition: taking time out from the usual burdens of classes and college life to study these problems, work out the answers, and prepare them for publication suggests that he already envisioned something other for himself than the family furniture business.

His something other would begin, in earnest, with a typed postcard tacked to an engineering bulletin board. It was an invitation to come east and help build a mechanical brain. Shannon noticed it in the spring of 1936, just as he was considering what was to come after his undergraduate days were over. The job—master’s student and assistant on the differential analyzer at the Massachusetts Institute of Technology—was tailor-made for a young man who could find equal joy in equations and construction, thinking and building.

The man in the black suit is Vannevar Bush, and this photo marks his start. Pugnacious and perpetually time-strapped, grandson and great-grandson of Yankee whaling captains, saddled with a name so frustratingly hard to pronounce that he would instruct others to call him “Van” or even “John”—the twenty-two-year-old inventor would one day be, although he couldn’t possibly imagine it yet, the most powerful scientist in America.

“The thing we know about that apple,” Vannevar Bush continued, “is, to a first approximation, that its acceleration is constant.” We can plot its fall on the chalkboard in seconds. “But suppose we want to include the resistance that air offers to the fall. This just puts another term in our equation but makes it hard to solve formally. We can still very readily solve it on a machine. We simply connect together elements, electrical or mechanical gadgets, that represent the terms of the equation, and watch it perform.”
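Bush's point can be tried numerically. The sketch below is a digital stand-in for his analog machine, not a model of it: it steps the equation forward in small time increments. The quadratic drag term, the coefficient value, and the function names are assumptions chosen for illustration; the excerpt does not say which form of air resistance Bush had in mind.

```python
import math


def fall_velocity(duration_s, drag_coeff=0.0, g=9.81, dt=0.001):
    """Approximate the apple's speed after duration_s seconds.

    Integrates dv/dt = g - drag_coeff * v**2 with small forward Euler steps.
    drag_coeff = 0 recovers the constant-acceleration case (assumed drag form).
    """
    v = 0.0
    for _ in range(int(round(duration_s / dt))):
        v += (g - drag_coeff * v * v) * dt
    return v


def terminal_velocity(drag_coeff, g=9.81):
    """Speed at which the assumed drag term balances gravity (drag_coeff > 0)."""
    return math.sqrt(g / drag_coeff)


no_drag = fall_velocity(2.0)                    # first approximation: v = g * t
with_drag = fall_velocity(2.0, drag_coeff=0.1)  # extra term slows the fall
print(no_drag, with_drag, terminal_velocity(0.1))
```

With the drag term switched off the result matches the chalkboard answer; with it switched on, the speed levels off toward a terminal value instead of growing without limit.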

How fast can a population of animals grow before it crashes? How long before a heap of radioactive uranium decays? How far does a magnet’s force extend? How much does a massive sun curve time and space? To ask any of these questions is to ask for the solution to a differential equation. Or, of special interest to Bush and his electrical engineering colleagues: How great a power surge could the nation’s electrical grids tolerate before they failed? Given all the wealth and work it had taken to electrify America, it was a multimillion-dollar question.

It turned out that most differential equations of the useful kind—the apple-falling-in-the-real-world kind, not the apple-falling-down-a-chalkboard kind—presented just the same impassable problem. These were not equations that could be solved by formulas or shortcuts, only by trial and error, or intuition, or luck. To solve them reliably—to bring the force of calculus to bear on the industrial problems of power transmission or telephone networks, or on the advanced physics problems of cosmic rays and subatomic particles—demanded an intelligence of another order.

Part 2

The point was precision. In particular, the point was rigor in reducing the hard, solid world—the wrench—into symbols so exact—the patent application—that one could be flawlessly translated from the other. Given the pipe wrench, produce the words for that wrench and no other; given the words, produce the wrench. That, Bush taught his students, was the beginning of engineering.

A math laboratory of that era was “well-stocked with clay, cardboard, wire, wooden, metal and other models and materials”—and with graph paper, which was only about as old as Bush was. At Bush’s MIT, math and engineering were an extension of the metal shop and the woodshop, and students who were skilled with the planimeter and the slide rule had to be skilled as well with the soldering iron and the saw. There is perhaps a source here for engineers’ persistent status anxiety, “uncertain always where they fit,” as the great critic Paul Fussell put it, “whether with boss or worker, management or labor, the world of headwork or the world of handwork.” But there was also the conviction that handwork was headwork, as long as the translations had precision.

“It was a fearsome thing of shafts, gears, strings, and wheels rolling on disks,” said an MIT physicist who turned to the differential analyzer to study the behavior of scattering electrons, “but it worked.” It was an enormous wooden frame latticed with spinning rods, resembling a giant’s 100-ton foosball set. At the input end were six draftsman’s tables, where the machine read the equations it was to evaluate, much like Thomson’s analyzer read a graph of the tides.

The mathematics were infinitely more complex—but Vannevar Bush’s lawnmower might have recognized in this calculating room a distant descendant. The differential analyzer, wrote one science historian, “still interpreted mathematics in terms of mechanical rotations, still depended on expertly machined wheel-and-disc integrators, and still drew its answers as curves.”

This was the computer before the digital revolution: a machine that literally performed equations in the process of solving them. As long as the machine was acting out the equations that shape an atom, it was, in a meaningful sense, a giant atom; as long as it was acting out the equations that fuel a star, it was a miniature star.

For the physicist or engineer, two systems that obey the same equations have a kind of identity—or at least an analogy. And that, after all, is all our word analog means. A digital watch is nothing like the sun; an analog watch is the memory of a shadow’s circuit around a dial.

On greener days, a walk outside would take Shannon past columned facades chiseled with the names of the greats: Archimedes, Copernicus, Newton, Darwin. MIT was a neoclassical island in what was still an industrial Boston suburb, and the Pantheon-style dome at its center sat as an uneasy neighbor to the factories and mills along the Charles River.

In the 1930s, there were only a handful of people in the world who were skilled in both “symbolic calculus,” or rigorous mathematical logic, and the design of electric circuits. This is less remarkable than it sounds: before the two fields melded in Shannon’s brain, it was hardly thought that they had anything in common. It was one thing to compare logic to a machine—it was another entirely to show that machines could do logic.

“Years ago an engineer told me a fantasy he thought threw some light on the ends of engineering, or at least on those underlying his own labors. A flying saucer arrives on Earth and the crew starts flying over cities and dams and canals and highways and grids of power lines; they follow cars on the roads and monitor the emissions of TV towers. They beam up a computer into their saucer, tear it down, and examine it. ‘Wow,’ one of them finally exclaims. ‘Isn’t nature incredible!?’ ”

“What’s your secret in remaining so carefree?” an interviewer asked Shannon toward the end of his life. Shannon answered, “I do what comes naturally, and usefulness is not my main goal. . . . I keep asking myself, How would you do this? Is it possible to make a machine do that? Can you prove this theorem?” For an abstracted man at his most content, the world isn’t there to be used, but to be played with, manipulated by hand and mind.

More to the point, it was a matter of deep conviction for Bush that specialization was the death of genius. “In these days, when there is a tendency to specialize so closely, it is well for us to be reminded that the possibilities of being at once broad and deep did not pass with Leonardo da Vinci or even Benjamin Franklin,” Bush said in a speech at MIT. “Men of our profession—we teachers—are bound to be impressed with the tendency of youths of strikingly capable minds to become interested in one small corner of science and uninterested in the rest of the world. . . . It is unfortunate when a brilliant and creative mind insists upon living in a modern monastic cell.”

“Much of the power and elegance of any mathematical theory,” Shannon wrote, “depends on use of a suitably compact and suggestive notation, which nevertheless completely describes the concepts involved.”

And where Shannon could continue to pound away at a mathematical problem or research question until he struck sparks, pursuing problems with breathtaking intuition and instinct, Weaver had discovered that he possessed no such gift. In a remarkably self-aware reflection on his strengths and weaknesses, Weaver observed: “I had a good capacity for assimilating information, something of a knack for organizing, an ability to work with people, a zest for exposition, an enthusiasm that helped to advance my ideas. But I lacked that strange and wonderful creative spark that makes a good researcher. Thus I realized that there was a definite ceiling on my possibilities as a mathematics professor.”

As Fred Kaplan explained in his history of wartime science, “It was a war in which the talents of scientists were exploited to an unprecedented, almost extravagant degree.” There were urgent questions that needed answers, and the scientifically literate were uniquely equipped to answer them. Kaplan cataloged just a few:

As in other areas of Shannon’s life, his most important work in cryptography yielded a rigorous, theoretical underpinning for many of a field’s key concepts. Shannon’s exposure to day-to-day cryptographic work during the war, it seems, was important—but its primary purpose was as grist for a paper that would only be published in classified form on September 1, 1945—one day before the Japanese surrender was signed. This paper, “A Mathematical Theory of Cryptography—Case 20878,” contained important antecedents of Shannon’s later work—but it also provided the first-ever proof of a critical concept in cryptology: the “one-time pad.”

It took Claude Shannon, and the space of more than a half century, to prove that a code constructed under these stringent (and usually impracticable) conditions would be entirely unbreakable—that perfect secrecy within a cryptographic system was, at least in theory, possible. Even with unlimited computing power, the enemy could never crack a code built on such a foundation.
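A minimal sketch of a one-time pad follows, assuming the conditions usually stated for it: a truly random key, exactly as long as the message, used once and kept secret. The excerpt does not list the conditions itself, and the byte-wise XOR used here is a modern rendering chosen for illustration, not the notation of Shannon's paper. The function names and sample messages are hypothetical.

```python
import secrets


def make_pad(length):
    """Return `length` random bytes to serve as a single-use key."""
    return secrets.token_bytes(length)


def xor_bytes(data, pad):
    """Combine data with the pad byte by byte; the same call encrypts and decrypts."""
    if len(pad) != len(data):
        raise ValueError("pad must be exactly as long as the message")
    return bytes(d ^ p for d, p in zip(data, pad))


message = b"ATTACK AT DAWN"
pad = make_pad(len(message))
ciphertext = xor_bytes(message, pad)
recovered = xor_bytes(ciphertext, pad)

# For any other message of the same length there is a pad that produces
# the very same ciphertext, so the ciphertext alone cannot single one out.
decoy = b"RETREAT AT TEN"
decoy_pad = xor_bytes(ciphertext, decoy)
print(recovered, xor_bytes(ciphertext, decoy_pad))
```

The decoy lines hint at why extra computing power does not help an eavesdropper: every plaintext of the right length remains equally consistent with what was intercepted.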

Of the people in this world, Shannon would say: “They were not a very talkative bunch, you could say that. They were the most secretive bunch of people in the world. It’s very hard to even find out for example who are the important cryptographers in this country.”

Pierce told Shannon on numerous occasions that “he should write up this or that idea.” To which Shannon is said to have replied, with characteristic insouciance, “What does ‘should’ mean?” Oliver, Pierce, and Shannon—a genius clique, each secure enough in his own intellect to find comfort in the company of the others. They shared a fascination with the emerging field of digital communication and cowrote a key paper explaining its advantages in accuracy and reliability.

It turns out that there were three certified geniuses at BTL [Bell Telephone Laboratories] at the same time, Claude Shannon of information theory fame, John Pierce, of communication satellite and traveling wave amplifier fame, and Barney. Apparently the three of those people were intellectually INSUFFERABLE. They were so bright and capable, and they cut an intellectual swath through that engineering community, that only a prestige lab like that could handle all three at once.

Other accounts suggest that Shannon might not have been so “insufferable” as he was impatient. His colleagues remembered him as friendly but removed. To Maria, he confessed a frustration with the more quotidian elements of life at the Labs. “I think it made him sick,” she said. “I really do. That he had to do all that work while he was so interested in pursuing his own thing.” Partly, it seems, the distance between Shannon and his colleagues was a matter of sheer processing speed.

What others saw as reticence, McMillan saw as a kind of ambient frustration: “He didn’t have much patience with people who weren’t as smart as he was.”

George Henry Lewes once observed that “genius is rarely able to give an account of its own processes.” This seems to have been true of Shannon, who could neither explain himself to others, nor cared to. In his work life, he preferred solitude and kept his professional associations to a minimum.

He writes in neat script on lined paper, but the raw material is everywhere. Eight years like this—scribbling, refining, crossing out, staring into a thicket of equations, knowing that, at the end of all that effort, they may reveal nothing. There are breaks for music and cigarettes, and bleary-eyed walks to work in the morning, but mostly it’s this ceaseless drilling. Back to the desk, where he senses, perhaps, that he is on to something significant, something even more fundamental than the master’s thesis that made his name—but what?

Second, there are limits to brute force. Applying more power, amplifying messages, strengthening signals—Whitehouse’s solution to the telegraph problem—remained the most intuitive answer to noise. Its failure in 1858 discredited Whitehouse but not the outlines of his methods; few others were available. Yet there were high costs to shouting. In the best case, it was still expensive and energy-hungry. In the worst case, as with the undersea cable, it could destroy the medium of communication itself.

But such a law addressed only the movement of electricity, not the nature of the messages it carried. How could science speak of such a thing? It could track the speed of electrons in a wire, but the idea that the message they represented could be measured and manipulated with comparable precision would have to wait until the next century. Information was old. A science of information was just beginning to stir.

Each year that Shannon placed a call, he was less likely to speak to a human operator and more likely to have his call placed by machine, by one of the automated switchboards that Bell Labs grandly called a “mechanical brain.” In the process of assembling and refining these sprawling machines, Shannon’s generation of scientists came to understand information in much the same way that an earlier generation of scientists came to understand heat in the process of building steam engines.

One was Harry Nyquist. When he was eighteen, his family left its Swedish farm and joined the wave of Scandinavian immigration to the upper Midwest; he worked construction in Sweden for four years to pay for his share of the passage. Ten years after his arrival, he had a doctorate in physics from Yale and a job as a scientist in the Bell System. A Bell lifer, Nyquist was responsible for one of the first prototype fax machines: he sketched out a proposal for “telephotography” as early as 1918.

It seemed that a greater range of frequencies imposed on top of one another, a greater “bandwidth,” was needed to generate the more interesting and complex waves that could carry richer information. To efficiently carry a phone conversation, the Bell network needed frequencies ranging from about 200 to 3,200 hertz, or a bandwidth of 3,000 hertz. Telegraphy required less; television would require 2,000 times more. Nyquist showed how the bandwidth of any communications channel provided a cap on the amount of “intelligence” that could pass through it at a given speed.

This limit on intelligence meant that the distinction between continuous signals (like the message on a phone line) and discrete signals (like dots and dashes or, we might add, 0’s and 1’s) was much less clear-cut than it seemed. A continuous signal still varied smoothly in amplitude, but you could also represent that signal as a series of samples, or discrete time-slices—and within the limit of a given bandwidth, no one would be able to tell the difference.

More fundamentally, as a professor of electrical engineering wrote, it showed that “the world of technical communications is essentially discrete or ‘digital.’ ”


That was Nyquist’s surprising result: the larger the number of “letters” a telegraph system could use, the faster it could send a message. Or we can look at it the other way around. The larger the number of possible current values we can choose from, the greater the density of intelligence in each signal, or in each second of communication.
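The effect can be seen with a simple counting sketch. It is an illustration under assumptions, not Nyquist's own derivation: it asks how many signals are needed to tell a fixed number of possible messages apart when each signal can take one of `levels` current values. The function name and the figure of 4,096 messages are hypothetical.

```python
def signals_needed(num_messages, levels):
    """Smallest number of signals, each taking one of `levels` values,
    that can distinguish num_messages different messages."""
    if levels < 2 or num_messages < 1:
        raise ValueError("need at least 2 levels and 1 message")
    count, capacity = 0, 1
    while capacity < num_messages:
        capacity *= levels  # each extra signal multiplies the distinguishable messages
        count += 1
    return count


# 4096 possible messages (12 binary choices), sent with richer and richer "alphabets".
for levels in (2, 4, 8, 16, 64):
    print(levels, signals_needed(4096, levels))
```

Going from 2 current values to 64 cuts the number of signals from 12 to 2, which is the sense in which a larger alphabet sends the same message faster.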

[Shannon took] Hartley’s ideas far beyond what Hartley, or anyone, could have imagined. Aside from George Boole, that obscure logician, no one shaped Shannon’s thought more.

While an engineer like Shannon would not have needed the reminder, it was Hartley who made meaning’s irrelevance to information clearer than ever.

A science of information would have to make sense of the messages we call gibberish, as well as the messages we call meaningful. So in a crucial passage, Hartley explained how we might begin to think about information not psychologically, but physically: “In estimating the capacity of the physical system to transmit information we should ignore the question of interpretation, make each selection perfectly arbitrary, and base our results on the possibility of the receiver’s distinguishing the result of selecting any one symbol from that of selecting any other.”

The real measure of information is not in the symbols we send—it’s in the symbols we could have sent, but did not.

To send a message is to make a selection from a pool of possible symbols, and “at each selection there are eliminated all of the other symbols which might have been chosen.” To choose is to kill off alternatives. We see this most clearly, Hartley observed, in the cases in which messages happen to bear meaning.

The information value of a symbol depends on the number of alternatives that were killed off in its choosing. Symbols from large vocabularies bear more information than symbols from small ones. Information measures freedom of choice.
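Hartley's measure is usually written H = n log s, where n is the number of selections and s is the size of the symbol pool. The excerpts here do not quote the formula, so the sketch below brings it in as an outside assumption, along with the choice of base-2 logarithms and the function name.

```python
import math


def hartley_information(selections, alphabet_size):
    """Hartley's measure n * log2(s): information in `selections` free
    choices from a pool of `alphabet_size` equally available symbols."""
    if selections < 0 or alphabet_size < 1:
        raise ValueError("selections must be >= 0 and alphabet_size >= 1")
    return selections * math.log2(alphabet_size)


# Ten symbols drawn from a 2-symbol pool versus a 27-symbol pool.
small = hartley_information(10, 2)
large = hartley_information(10, 27)
print(small, large)

# The same number also equals log2 of how many distinct messages were possible.
print(math.log2(27 ** 10))
```

The measure never looks at which symbols were actually sent, only at how many alternatives each selection ruled out.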

Shannon was arguably the first to conceive of our genes as information bearers, an imaginative leap that erased the border between mechanical, electronic, and biological messages.

Imagining the transmitter as a distinct conceptual box proved to be especially pivotal: as we will see, the work of encoding messages for transmission turned out to hold the key to Shannon’s most revolutionary result.

New sciences demand new units of measurement—as if to prove that the concepts they have been talking and talking around have at last been captured by number. The new unit of Shannon’s science was to represent this basic situation of choice. Because it was a choice of 0 or 1, it was a “binary digit.” In one of the only pieces of collaboration Shannon allowed on the entire project, he put it to a lunchroom table of his Bell Labs colleagues to come up with a snappier name. Binit and bigit were weighed and rejected, but the winning proposal was laid down by John Tukey, a Princeton professor working at Bell. Bit.

One bit is the amount of information that results from a choice between two equally likely options. So “a device with two stable positions . . . can store one bit of information.” The bit-ness of such a device—a switch with two positions, a coin with two sides, a digit with two states—lies not in the outcome of the choice, but in the number of possible choices and the odds of the choosing. Two such devices would represent four total choices and would be said to store two bits.
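A short sketch can make "the number of possible choices and the odds of the choosing" concrete. It uses the standard entropy formula from Shannon's 1948 paper, the sum of p * log2(1/p) over the options, which these excerpts do not quote; the function name and the example odds are assumptions for illustration.

```python
import math


def entropy_bits(probabilities):
    """Average information, in bits, of one choice with the given odds."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Options with zero probability never occur and contribute nothing.
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)


fair_coin = entropy_bits([0.5, 0.5])      # one two-position device
two_devices = entropy_bits([0.25] * 4)    # four equally likely joint states
biased_coin = entropy_bits([0.9, 0.1])    # lopsided odds carry less than one bit
print(fair_coin, two_devices, biased_coin)
```

The fair coin comes out at one bit and the pair of devices at two, matching the text; the lopsided coin stores less because its outcome is less of a surprise.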

What does information really measure? It measures the uncertainty we overcome. It measures our chances of learning something we haven’t yet learned. Or, more specifically: when one thing carries information about another—just as a meter reading tells us about a physical quantity, or a book tells us about a life—the amount of information it carries reflects the reduction in uncertainty about the object.

Now, the goal of all this was not merely to grind out the precise number of bits in every conceivable message: in situations more complicated than a coin flip, the possibilities multiply and the precise odds of each become much harder to pin down. Shannon’s point was to force his colleagues to think about information in terms of probability and uncertainty.

For the vast bulk of messages, in fact, symbols do not behave like fair coins. The symbol that is sent now depends, in important and predictable ways, on the symbol that was just sent: one symbol has a “pull” on the next.

a random machine in charge of a telegraph key might break the rules and ignorantly send a space after a space—but nearly all the messages that interest engineers do come with implicit rules, are something less than free, and Shannon taught engineers how to take huge advantage of this fact. This was the hunch that Shannon had suggested to Hermann Weyl in Princeton in 1939, and which he had spent almost a decade building into theory: Information is stochastic. It is neither fully unpredictable nor fully determined.

To prove it, Shannon set up an ingenious, if informal, experiment in garbled text: he showed how, by playing with stochastic processes, we can construct something resembling the English language from scratch. Shannon began with complete randomness. He opened a book of random numbers, put his finger on one of the entries, and wrote down the corresponding character from a 27-symbol “alphabet” (26 letters, plus a space). He called it “zero-order approximation.” Here’s what happened: XFOML RXKHRJFFJUJ ZLPWCFWKCYJ FFJEYVKCQSGHYD QPAAMKBZAACIBZLHJQD. There are equal odds for each character, and no character exerts a “pull” on any other. This is the printed equivalent of static. This is what our language would look like if it were perfectly uncertain and thus perfectly informative. But we do have some certainty about English. For one, we know that some letters are likelier than others.
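A zero-order approximation is easy to reproduce. The sketch below assumes Python's pseudo-random generator as a stand-in for Shannon's book of random numbers; the seed and the output length are arbitrary choices.

```python
import random
import string

# 26 letters plus a space, as in the 27-symbol alphabet described above.
ALPHABET = string.ascii_uppercase + " "

def zero_order_text(length, rng):
    """Draw each character independently, with equal odds for all 27 symbols."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))

rng = random.Random(1948)  # arbitrary seed, only for repeatability
sample = zero_order_text(60, rng)
print(sample)
```

Because every draw is independent and uniform, no character exerts any pull on the next one.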

To construct a text with reasonable digram frequencies, “one opens a book at random and selects a letter at random on the page. This letter is recorded. The book is then opened to another page and one reads until this letter is encountered. The succeeding letter is then recorded. Turning to another page this second letter is searched for and the succeeding letter is recorded, etc.” If all goes well, the text that results reflects the odds with which one character follows another in English.

This is “second-order approximation”: ON IE ANTSOUTINYS ARE T INCTORE ST BE S DEAMY ACHIN D ILONASIVE TUCOOWE AT TEASONARE FUSO TIZIN ANDY TOBE SEACE CTISBE. Out of nothing, a stochastic process has blindly created five English words (six, if we charitably supply an apostrophe and count ACHIN’). “Third-order approximation,” using the same method to search for trigrams, brings us even closer to passable English: IN NO IST LAT WHEY CRATICT FROURE BIRS GROCID PONDENOME OF DEMONSTURES OF THE REPTAGIN IS REGOACTIONA OF CRE.
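The digram and trigram approximations can be imitated by tabulating which character follows each short context in a sample text. This is a sketch, not Shannon's page-flipping procedure itself: it uses a lookup table that should approximate the same statistics, and `SAMPLE` is a short made-up passage standing in for his book.

```python
import random
from collections import defaultdict

SAMPLE = ("INFORMATION IS THE RESOLUTION OF UNCERTAINTY AND THE MESSAGE THAT IS "
          "SENT IS ONE SELECTED FROM A SET OF POSSIBLE MESSAGES")

def build_followers(text, context_len):
    """Map each run of `context_len` characters to the characters seen right after it."""
    followers = defaultdict(list)
    for i in range(len(text) - context_len):
        followers[text[i:i + context_len]].append(text[i + context_len])
    return followers

def approximate(text, context_len, length, rng):
    """context_len=1 imitates digram statistics, context_len=2 trigram statistics."""
    followers = build_followers(text, context_len)
    start = rng.randrange(len(text) - context_len)
    out = text[start:start + context_len]
    while len(out) < length:
        options = followers.get(out[-context_len:])
        if not options:
            # Dead end: this context only occurs at the very end of the sample,
            # so restart from a random context that does have a follower.
            start = rng.randrange(len(text) - context_len)
            out += text[start:start + context_len]
            continue
        out += rng.choice(options)
    return out[:length]

digram_text = approximate(SAMPLE, 1, 80, random.Random(7))
trigram_text = approximate(SAMPLE, 2, 80, random.Random(7))
print(digram_text)
print(trigram_text)
```

With so small a sample the output mostly recombines fragments of the source; a longer text gives results closer in spirit to the passages quoted above.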

Even further, our choice of the next word is strongly governed by the word that has just gone before. Finally, then, Shannon turned to “second-order word approximation,” choosing a random word, flipping forward in his book until he found another instance, and then recording the word that appeared next: THE HEAD AND IN FRONTAL ATTACK ON AN ENGLISH WRITER THAT THE CHARACTER OF THIS POINT IS THEREFORE ANOTHER METHOD FOR THE LETTERS THAT THE TIME OF WHO EVER TOLD THE PROBLEM FOR AN UNEXPECTED.
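The word-level procedure described above can be followed almost literally in code. In this sketch, `BOOK` is a made-up word list standing in for Shannon's book, "flipping forward" is modeled as starting from a random position, and wrapping around at the end of the text is an assumption added to keep the search from running off the last page.

```python
import random

BOOK = ("THE MESSAGE THAT IS SENT IS ONE SELECTED FROM THE SET OF POSSIBLE "
        "MESSAGES AND THE CHANNEL IS THE MEDIUM THAT CARRIES THE SIGNAL FROM "
        "THE TRANSMITTER TO THE RECEIVER").split()

def second_order_words(words, length, rng):
    """Pick a word, find another instance of it, record the word that follows."""
    n = len(words)
    current = rng.choice(words)
    out = [current]
    while len(out) < length:
        start = rng.randrange(n)  # open the book at a random place
        for offset in range(n):   # read forward (wrapping) until the word turns up
            pos = (start + offset) % n
            if words[pos] == current:
                break
        current = words[(pos + 1) % n]  # record the word that appears next
        out.append(current)
    return out

generated = second_order_words(BOOK, 20, random.Random(3))
print(" ".join(generated))
```

Every adjacent pair of words in the output occurs somewhere in the source text, yet the sequence as a whole was never written by anyone.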

From gibberish to reasonable, the passages grew closer and closer to passable text. They were not written, but generated: the only human intervention came in manipulating the rules. How, Shannon asked, do we get to English? We do it by making our rules more restrictive.

codebreaking remained possible, and remains so, because every message runs up against a basic reality of human communication. It always involves redundancy; to communicate is to make oneself predictable. This was the age-old codebreaker’s intuition that Shannon formalized in his work on information theory: codebreaking works because our messages are less, much less, than fully uncertain.

his work on information and his work on codes grew from a single source: his interest in the unexamined statistical nature of messages, and his intuition that a mastery of this nature might extend our powers of communication.

He would explain later, “I wrote [the information theory paper], which in a sense sort of justified some of the time I’d been putting into [cryptography], at least in my mind. . . . But there was this close connection. I mean they are very similar things. . . . Information, at one time trying to conceal it, and at the other time trying to transmit it.”

Letters can be redundant: because Q is followed almost automatically by U, the U tells us almost nothing in its own right. We can usually discard it, and many more letters besides. As Shannon put it, “MST PPL HV LTTL DFFCLTY N RDNG THS SNTNC.”
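Shannon's vowel-free sentence can be reproduced mechanically. The full sentence below is a reconstruction (an assumption, though an easy one to read back), and the sketch treats Y as a consonant.

```python
def strip_vowels(text):
    """Drop A, E, I, O, U (either case) and keep everything else, including spaces."""
    return "".join(ch for ch in text if ch.upper() not in "AEIOU")

sentence = "MOST PEOPLE HAVE LITTLE DIFFICULTY IN READING THIS SENTENCE"
shortened = strip_vowels(sentence)
print(shortened)
print(len(sentence), "characters reduced to", len(shortened))
```

That a reader can restore the missing letters without help is the practical meaning of redundancy: the dropped characters carried little that the rest did not already imply.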

Shannon guessed that the world’s wealth of English text could be cut in half with no loss of information: “When we write English, half of what we write is determined by the structure of the language and half is chosen freely.” Later on, his estimate of redundancy rose as high as 80 percent: only one in five characters actually bears information.
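These redundancy figures translate into bits per character. The sketch below uses Shannon's definition of redundancy as one minus the ratio of actual entropy to the maximum possible, and assumes the 27-symbol alphabet (26 letters plus a space) mentioned earlier in these notes.

```python
import math

ALPHABET_SIZE = 27                   # 26 letters plus a space
MAX_BITS = math.log2(ALPHABET_SIZE)  # bits per character if all symbols were equally likely

def bits_per_character(redundancy):
    """Information actually carried per character at a given redundancy (0.0 to 1.0)."""
    return (1 - redundancy) * MAX_BITS

print(round(MAX_BITS, 3))
print(round(bits_per_character(0.5), 3))  # the 50 percent estimate
print(round(bits_per_character(0.8), 3))  # the later 80 percent estimate
```

At 80 percent redundancy each character carries a little under one bit, which is the sense in which only a fifth of the text is chosen freely.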

At higher redundancies, fewer sequences are possible, and the number of potential intersections shrinks: if English were much more redundant, it would be nearly impossible to make puzzles. On the other hand, if English were a bit less redundant, Shannon speculated, we’d be filling in crossword puzzles in three dimensions.

Shannon’s estimates of our language’s redundancy grew, he wrote cryptically, out of “certain known results in cryptography.” The hint he dropped there is a reminder that his great work on code writing, “Communication Theory of Secrecy Systems,” was still classified in 1948.

two words were hugely probable after the short phrase that Shannon spelled out: “desk”; “table.” Once Raymond Chandler got to “the,” he had written himself into a corner.

Understanding redundancy, we can manipulate it deliberately, just as an earlier era’s engineers learned to play tricks with steam and heat.

To begin with, how fast can we send a message? It depends, Shannon showed, on how much redundancy we can wring out of it. The most efficient message would actually resemble a string of random text: each new symbol would be as informative as possible, and thus as surprising as possible. Not a single symbol would be wasted. Of course, the messages that we want to send one another—whether telegraphs or TV broadcasts—do “waste” symbols all the time. So the speed with which we can communicate over a given channel depends on how we encode our messages: how we package them, as compactly as possible, for shipment.

Shannon’s first theorem proves that there is a point of maximum compactness for every message source. We have reached the limits of communication when every symbol tells us something new. And because we now have an exact measure of information, the bit, we also know how much a message can be compressed before it reaches that point of perfect singularity.
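One way to see the compression limit is to build an efficient code and compare its average length with the source's entropy. The sketch below uses Huffman coding, a later technique and not Shannon's own construction, on a made-up four-symbol source whose probabilities are chosen so that the limit is met exactly.

```python
import heapq
import math

def huffman_code_lengths(probs):
    """Return {symbol: code length in bits} for a Huffman code over `probs`."""
    # Heap entries are (probability, tie-breaker, {symbol: depth so far}).
    heap = [(p, i, {sym: 0}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, d1 = heapq.heappop(heap)  # two least likely subtrees
        p2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

source = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}  # hypothetical source
lengths = huffman_code_lengths(source)
average_length = sum(source[s] * lengths[s] for s in source)
entropy = sum(-p * math.log2(p) for p in source.values())

print(lengths)
print(average_length, entropy)
```

A fixed-length code would spend 2 bits on each of the four symbols; giving the likeliest symbol the shortest codeword brings the average down to the entropy, and no lossless code can average fewer bits than that.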

Shannon’s paper was the first to define the idea of channel capacity, the number of bits per second that a channel can accurately handle. He proved a precise relationship between a channel’s capacity and two of its other qualities: bandwidth (or the range of frequencies it could accommodate) and its ratio of signal to noise. Nyquist and Hartley had both explored the trade-offs among capacity, complexity, and speed; but it was Shannon who expressed those trade-offs in their most precise, controllable form.
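The relationship is usually written as C = B * log2(1 + S/N), which holds for a band-limited channel with additive white Gaussian noise. The formula is not spelled out in these notes, and the 3,000 Hz, 30 dB channel below is a hypothetical example.

```python
import math

def db_to_linear(snr_db):
    """Convert a signal-to-noise ratio from decibels to a plain power ratio."""
    return 10 ** (snr_db / 10)

def channel_capacity_bps(bandwidth_hz, snr_linear):
    """Capacity in bits per second: C = B * log2(1 + S/N)."""
    return bandwidth_hz * math.log2(1 + snr_linear)

capacity = channel_capacity_bps(3000, db_to_linear(30))
doubled_bandwidth = channel_capacity_bps(6000, db_to_linear(30))
doubled_power = channel_capacity_bps(3000, 2 * db_to_linear(30))
print(round(capacity), round(doubled_bandwidth), round(doubled_power))
```

Doubling the bandwidth doubles the capacity, while doubling the signal power adds only a few thousand bits per second, because power sits inside the logarithm.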

there is a hard cap—a “speed limit” in bits per second—on accurate communication in any medium. Past this point, which was soon enough named the Shannon limit, our accuracy breaks down.

Below the channel’s speed limit, we can make our messages as accurate as we desire—for all intents, we can make them perfectly accurate, perfectly free from noise. This was Shannon’s furthest-reaching find: the one Fano called “unknown, unthinkable,” until Shannon thought it. Until Shannon, it was simply conventional wisdom that noise had to be endured.

Shannon proposed an unsettling inversion. Ignore the physical channel and accept its limits: we can overcome noise by manipulating our messages. The answer to noise is not in how loudly we speak, but in how we say what we say.

combining the advantages of codes that compress and codes that guard against error: that is, reducing a message to bits as efficiently as possible, and then adding the redundancy that protects its accuracy. Coding and decoding would still exact their cost in effort and time. But Shannon’s proof stood: there is always an answer. The answer is digital.
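The idea of buying accuracy with deliberate redundancy can be illustrated with the crudest possible code: send each bit several times and take a majority vote. This is only a sketch. A repetition code cuts the transmission rate to a fraction and falls far short of the efficient codes Shannon's theorem promises; the 10 percent flip probability, the seed, and the repeat count are arbitrary assumptions.

```python
import random

def encode(bits, repeats):
    """Repeat every bit `repeats` times."""
    return [b for b in bits for _ in range(repeats)]

def noisy_channel(bits, flip_prob, rng):
    """Flip each bit independently with probability `flip_prob`."""
    return [b ^ 1 if rng.random() < flip_prob else b for b in bits]

def decode(received, repeats):
    """Majority vote over each block of `repeats` received bits."""
    out = []
    for i in range(0, len(received), repeats):
        block = received[i:i + repeats]
        out.append(1 if sum(block) * 2 > len(block) else 0)
    return out

def error_rate(sent, got):
    return sum(s != g for s, g in zip(sent, got)) / len(sent)

rng = random.Random(0)
message = [rng.randint(0, 1) for _ in range(10_000)]
unprotected = noisy_channel(message, 0.1, rng)
protected = decode(noisy_channel(encode(message, 5), 0.1, rng), 5)

print(error_rate(message, unprotected), error_rate(message, protected))
```

The channel is just as noisy in both runs; only the way the message is packaged has changed.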

while Shannon had proven that the codes must be there, neither he nor anyone else had shown what they must be. Once the audacity of his work had worn off—he had, after all, founded a new field and solved most of its problems at one stroke—one consequential shortfall would dominate the conversation on Claude Shannon and Claude Shannon’s theory. How long would it take to find the codes? Once found, would they even make everyday practical sense, or would it simply be cheaper to continue muddling through? Could this strange work, full of imaginary languages, messages without meaning, random text, and a philosophy that claimed to encompass and explain every signal that could possibly be sent, ever be more than an elegant piece of theorizing? In words with which any engineer could have sympathized: would it work?

They’re best overheard in a conversation between Shannon and von Neumann at Princeton, said to have taken place in 1940, when Shannon was first piecing his theory together in the midst of his failing marriage. Shannon approached the great man with his idea of information-as-resolved-uncertainty—which would come to stand at the heart of his work—and with an unassuming question. What should he call this thing? Von Neumann answered at once: say that information reduces “entropy.” For one, it was a good, solid physics word. “And more importantly,” he went on, “no one knows what entropy really is, so in a debate you will always have the advantage.”

In the state of maximal entropy, all pockets of predictability would have long since failed: each particle a surprise. And the whole would read, were there then eyes to read it, as the most informative of messages.

When particles jump from state to state, is their resemblance to switches, to logic circuits, to 0’s and 1’s, something more than a trick of our eyes? Or put it this way: Was the quality of information something we imposed on the world, just a by-product of our messages and machines—or was it something we found out about the world, something that had been there all along?

he ran up against a human habit much older than him: our tendency to reimagine the universe in the image of our tools. We made clocks, and found the world to be clockwork; steam engines, and found the world to be a machine processing heat; information networks—switching circuits and data transmission and half a million miles of submarine cable connecting the continents—and found the world in their image, too.

In an unpublished spoof written a year later, Shannon imagined the damage his methods would do if they fell into the wrong hands. It seems that an evil Nazi scientist, Dr. Hagen Krankheit, had escaped Germany with a prototype of his Müllabfuhrwortmaschine, a fearsome weapon of war “anticipated in the work . . . of Dr. Claude Shannon.” Krankheit’s machine used the principles of randomized text to totally automate the propaganda industry. By randomly stitching together agitprop phrases in a way that approximated human language, the Müllabfuhrwortmaschine could produce an endless flood of demoralizing statements. On one trial run, it spat out “Subversive elements were revealed to be related by marriage to a well-known columnist,” “Capitalist warmonger is a weak link in atomic security,” and “Atomic scientist is said to be associated with certain religious and racial groups.” Remarkably, these machine-generated phrases were indistinguishable from human propaganda—and now it was feared that the machine had fallen into the hands of the communists.

a connection between information and physics was first suggested, as early as 1929, by the Hungarian physicist Leo Szilard. Briefly, Szilard resolved an old puzzle in the physics of heat: the Second Law of Thermodynamics says that entropy is constantly increasing, but what if we imagined a microscopic and intelligent being, which James Clerk Maxwell had dubbed a “demon,” that tried to decrease entropy by sorting hot molecules from cold? Would that contradict the Second Law? Szilard showed that it would not: the very act of determining which molecules were which would cost enough energy to offset any savings that Maxwell’s Demon proposed to achieve. In other words—learning information about particles costs energy. Shannon, however, had not read Szilard’s work when he wrote his 1948 paper.

It piqued the interest of one reader in particular, who would become Shannon’s most important popularizer: Warren Weaver, the director of the Division of Natural Sciences at the Rockefeller Foundation, one of the principal funders of science and mathematics research in the country.

Ridenour had spent the early part of the twentieth century at the rich intersection of physics and geopolitics. During World War II, he worked at the renowned MIT Radiation Laboratory, commonly known as the Rad Lab. The Rad Lab began with outsize ambitions, as an effort to perfect mass-produced radar technology to defeat the German Luftwaffe’s bombing runs against the British. It also had mysterious origins. Funded by Alfred Lee Loomis, the intensely private millionaire financier, attorney, and self-taught physicist, the lab was initially bankrolled entirely by Loomis himself. It created most of the radar systems used to identify German U-boats—and its network of scientists and technicians became much of the nucleus of the Manhattan Project. As Lee DuBridge, the lab’s director, would later quip, “Radar won the war; the atom bomb ended it.” This was the world of fighting man’s physics.

Doob was open about the fact that he was, perhaps too frequently, looking for trouble. Asked why he became interested in mathematics in the first place, he answered: “I have always wanted to understand what I was doing, and why I was doing it, and I have often been a pest because I have objected when what I heard or read was not to be taken literally. The boy who noticed that the emperor wasn’t dressed and objected loudly has always been my model. Mathematics seemed to match my psychology, a mistake reflecting the fact that somehow I did not take into account that mathematics is created by humans.” His sharp words, friends recalled, often came mixed with humor.

Closer to our times, the twentieth-century mathematician G. H. Hardy would write what became the ur-text of pure math. A Mathematician’s Apology is a “manifesto for mathematics itself,” which pointedly borrowed its title from Socrates’s argument in the face of capital charges.

It was the pure mathematicians who looked down on von Neumann’s work on game theory, calling it, among other things, “just the latest fad” and “déclassé.” The same group would level a similar judgment against John Nash—just as Doob would against Claude Shannon.

“In reality,” said Golomb, “Shannon had almost unfailing instinct for what was actually true and gave outlines of proofs that other mathematicians . . . would make fully rigorous.” In the words of one of Shannon’s later collaborators, “Distinguished and accomplished as Doob was, the gaps in Shannon’s paper which seemed large to Doob seemed like small and obvious steps to Shannon. Doob might not realize this for, how often if ever, would he have encountered a mind like Shannon’s?”

He was, according to one writer, “the American John von Neumann”—and the exaggeration was almost excusable. Born in Columbia, Missouri, Norbert Wiener was shaped by a father single-mindedly focused on molding his young son into a genius. Leo Wiener used an extraordinary personal library—and an extraordinary will—to homeschool young Norbert until the age of nine. “I had full liberty to roam in what was the very catholic and miscellaneous library of my father,” Wiener wrote. “At one period or other the scientific interests of my father had covered most of the imaginable subjects of study.”

He would begin the discussion in an easy, conversational tone. This lasted exactly until I made the first mathematical mistake. Then the gentle and loving father was replaced by the avenger of the blood. . . . Father was raging, I was weeping, and my mother did her best to defend me, although hers was a losing battle.

But the scars of such a childhood were obvious for all to see. He had been a child coming of age around people many years his senior. And as often happens, he was ridiculed cruelly and mercilessly by the older children; the result was an intense awkwardness that followed him his entire life. It didn’t help that, in appearance, Wiener was easy to ridicule. Bearded, bespectacled, nearsighted, with red-veined skin and a ducklike walk, there was hardly a stereotype of the addle-pated academic that Wiener did not fulfill.

In appearance and behaviour, Norbert Wiener was a baroque figure, short, rotund, and myopic, combining these and many qualities in extreme degree.

His conversation was a curious mixture of pomposity and wantonness. He was a poor listener. . . . He spoke many languages but was not easy to understand in any of them. He was a famously bad lecturer.

His contributions to mathematics were as broad as they were deep: quantum mechanics, Brownian motion, cybernetics, stochastic processes, harmonic analysis—there was hardly a corner of the mathematical universe that his intellect left untouched.

Wiener was twenty-two years older than Shannon, so it reveals something about the advanced degree of Shannon’s thinking and the importance of his work that, as early as 1945, Wiener was nervous about which of them would win the race for credit for information theory. Their contest began in earnest in 1946. As the story goes, the manuscript that formed the outlines of Wiener’s contributions to information theory was nearly lost to humanity. Wiener had entrusted the manuscript to Walter Pitts, a graduate student, who had checked it as baggage for a trip from New York’s Grand Central Terminal to Boston. Pitts forgot to retrieve the baggage. Realizing his mistake, he asked two friends to pick up the bag. They either ignored or forgot the request. Only five months later was the manuscript finally tracked down; it had been labeled “unclaimed property” and cast aside in a coatroom.

Wiener’s contribution was contained within the wide-ranging book Cybernetics, which had its debut in the same year as Shannon’s two-part paper. If Shannon’s 1948 work was, at least initially, relatively unknown to the wider public, Wiener’s notion of cybernetics—a word he derived from the Greek for “steersman” to encompass “the entire field of control and communications theory, whether in the machine or in the animal”—aroused intense public interest from the moment it was published.

By the standards of the great mathematical feuds—Gottfried Leibniz and Isaac Newton battling over custody of calculus, or Henri Poincaré and Bertrand Russell debating the nature of mathematical reasoning—the rivalry between Shannon and Wiener is, sadly, less spectacular than biographers might prefer. But it still stands as an important moment in Shannon’s story.

Shannon turned thirty-two in 1948. The conventional wisdom in mathematical circles had long held that thirty is the age by which a young mathematician ought to have accomplished his foremost work; the professional mathematician’s fear of aging is not so different from the professional athlete’s.

“For most people, thirty is simply the dividing line between youth and adulthood,” writes John Nash biographer Sylvia Nasar, “but mathematicians consider their calling a young man’s game, so thirty signals something far more gloomy.” Shannon was two years late by that standard, but he had made it. Roughly ten years of work had become seventy-seven pages of information theory, and the work had been worthwhile by all accounts.

Robert Gallager, another colleague, went a step further: “He had a weird insight. He could see through things. He would say, ‘Something like this should be true’ . . . and he was usually right. . . . You can’t develop an entire field out of whole cloth if you don’t have superb intuition.” The trouble with that kind of intuition is that solutions to problems appear before the details and intermediary steps do.

The expansion of the applications of Information Theory to fields other than radio and wired communications has been so rapid that oftentimes the bounds within which the Professional Group interests lie are questioned. . . . Should an attempt be made to extend our interests to such fields as management, biology, psychology, and linguistic theory, or should the concentration be strictly in the direction of communication by radio or wire?

Shannon allowed that the popularity was, at least in part, due to information theory’s hovering place on the edges of so many of the era’s hottest fields—“computing machines, cybernetics, and automation”—as well as to its sheer novelty. And yet, he continued, “it has perhaps been ballooned to an importance beyond its actual accomplishments.”

While we feel that information theory is indeed a valuable tool in providing fundamental insights into the nature of communication problems and will continue to grow in importance, it is certainly no panacea for the communication engineer or, a fortiori, for anyone else. Seldom do more than a few of nature’s secrets give way at one time.

Seldom do more than a few of nature’s secrets give way at one time. It’s a remarkable statement from someone who still had a full career ahead of him, someone who, in a practical sense, had every incentive to encourage information theory’s inflation. Yet here was Shannon pulling on the reins.

I personally believe that many of the concepts of information theory will prove useful in these other fields—and, indeed, some results are already quite promising—but the establishing of such applications is not a trivial matter of translating words to a new domain, but rather the slow tedious process of hypothesis and experimental verification.

The war’s end had brought the military a thorny problem: the exit from public service of many of the nation’s top scientists, mathematicians, and engineers. Beginning in wartime, as Sylvia Nasar wrote, “to be plucked from academe and initiated into the secret world of the military had become something of a rite of passage for the mathematical elite.” Now, though, “how to keep the best and brightest thinking about military problems was far from obvious. Men of the caliber of John von Neumann would hardly sign up for the civil service.”

One solution, familiar to the men who occupied the upper rungs of the mathematical world, was the establishment of technical committees in close contact with various branches

The committee that would become most familiar to Shannon—and the reason for the urgent messages from Wenger and von Neumann—was known as the Special Cryptologic Advisory Group, or SCAG. In the NSA’s words, “the fundamental purpose in establishing SCAG was to assemble a specific group of outstanding technical consultants in the scientific fields of interest to the Agency, and thus provide a valuable source of advice and assistance in solving special problems in the cryptologic field.”

“I think the history of science has shown that valuable consequences often proliferate from simple curiosity,” Shannon once remarked. Curiosity in extremis runs the risk of becoming dilettantism, a tendency to sample everything and finish nothing. But Shannon’s curiosity was different.

What other people called hobbies, he thought of as experiments: exercises in the practice of simplification, models that filed a problem down to its barest interesting form. He was so convinced of a machine-enabled future, and so eager to explore its boundaries, that he was willing to tolerate a degree of ridicule to bring it to pass. He was preoccupied, as he wrote to a correspondent, “with the possible capabilities and applications of large scale electronic computers.” Considered in the light of that future, our present, his machines weren’t hobbies—they were proofs.

“Here at the Bell Telephone Laboratories, we’re concerned with improving your telephone system,” Shannon says, coming the closest he ever will come to shilling for his employer. That moment, along with the images of telephones dialing and switches activating, and the cheery music in the background, was a necessary piece of PR: concerned as they were about regulatory interest in their work, the higher-ups at Bell Labs and AT&T couldn’t allow Claude Shannon to go into theaters, schools, or universities with a robotic mouse and risk giving the appearance that the enormous leeway and profits they had been granted by the U.S. government were being devoted to frivolities.

The incongruity of such leading minds discussing a mechanical mouse was mitigated by the fact that Theseus (or, to be exact, the mouse-maze system as a whole) was a working example of the “artificial intelligence” that many of the esteemed attendees had spent their careers pondering only in theory. Theseus was artificially intelligent. When an attendee pointed out the obvious—that if the metallic cheese were removed, the mouse would simply sputter along, searching in vain for a piece of cheese that was no longer there—conference attendee and social scientist Larry Frank responded, “It is all too human.”

To the question of whether a certain rough kind of intelligence could be “created,” Shannon had offered an answer: yes, it could. Machines could learn. They could, in the circumscribed way Shannon had demonstrated, make mistakes, discover alternatives, and avoid the same missteps again. Learning and memory could be programmed and plotted, the script written into a device that looked, from a certain perspective, like an extremely simple precursor of a brain.

THROBAC (“Thrifty Roman-Numeral Backward-Looking Computer”) was a calculator whose keys, processing, and output all worked in Roman numerals, useless except to those who could decipher the difference between, say, CLXII and CXLII.
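For readers who cannot decipher the difference at a glance, a short converter does it. This is a generic sketch of subtractive Roman-numeral parsing, not a model of how THROBAC worked internally, and it does not reject malformed numerals.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def roman_to_int(numeral):
    """Add each symbol's value, subtracting it when a larger symbol follows."""
    total = 0
    for i, ch in enumerate(numeral):
        value = ROMAN_VALUES[ch]
        next_value = ROMAN_VALUES[numeral[i + 1]] if i + 1 < len(numeral) else 0
        total += -value if value < next_value else value
    return total

print(roman_to_int("CLXII"), roman_to_int("CXLII"))
```

Swapping the L and the X turns an added ten into a subtracted ten, so the two numerals differ by twenty.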

he was not bothered by the usual fears of a world run by machines or a human race taking a backseat to robots. If anything, Shannon believed the opposite: “In the long run [the machines] will be a boon to humanity, and the point is to make them so as rapidly as possible. . . . There is much greater empathy between man and machines [today] . . . we’d like to close it up so that we are actually talking back and forth.”

In response to the question of what the point of all his robot work might be, Shannon remarked that his goals were threefold: “First, how can we give computers a better sensory knowledge of the real world? Second, how can they better tell us what they know, besides printing out the information? And third, how can we get them to react upon the real world?” Or, as he told a later interviewer, in an even more optimistic mood: “I believe that today, that we are going to invent something, it’s not going to be the biological process of evolution anymore, it’s going to be the inventive process whereby we invent machines which are smarter than we are and so we’re no longer useful, not only smarter but they last longer and have replaceable parts and they’re so much better. There are so many of these things about the human system, it’s just terrible. The only thing surgeons can do to help you basically is to cut something out of you. They don’t cut it out and put something better in, or a new part in.”

Once machines were beating our grandmasters, writing our poetry, completing our mathematical proofs, and managing our money, we would, Shannon observed only half-jokingly, be primed for extinction. “These goals could mark the beginning of a phase-out of the stupid, entropy-increasing, and militant human race in favor of a more logical, energy conserving, and friendly species—the computer.”

In a life of pursuits adopted and discarded with the ebb and flow of Shannon’s promiscuous curiosity, chess remained one of his few lifelong pastimes. One story has it that Shannon played so much chess at Bell Labs that “at least one supervisor became somewhat worried.” He had a gift for the game, and as word of his talent spread throughout the Labs, many would try their hand at beating him. “Most of us didn’t play more than once against him,” recalled Brockway McMillan.

Nearly a half century before Deep Blue defeated the world’s human champion, Shannon anticipated the value of chess as a sort of training ground for intelligent machines and their makers: The chess machine is an ideal one to start with, since: (1) the problem is sharply defined both in allowed operations (the moves) and in the ultimate goal (checkmate); (2) it is neither so simple as to be trivial nor too difficult for satisfactory solution; (3) chess is generally considered to require “thinking” for skillful play; a solution of this problem will force us either to admit the possibility of a mechanized thinking or to further restrict our concept of “thinking”; (4) the discrete structure of chess fits well into the digital nature of modern computers.

But—and Shannon was emphatic about the “but”—“these must be balanced against the flexibility, imagination and inductive and learning capacities of the human mind.” The great downfall of a chess-playing machine, Shannon thought, was that it couldn’t learn on the fly, a capacity he believed was vital for victory at the elite levels. He cites Reuben Fine, an American chess master, on the misconceptions about top-ranked players and their approach to the game: “Very often people have the idea that masters foresee everything or nearly everything . . . that everything is mathematically calculated down to the smirk when the Queen’s Rook Pawn queens one move ahead of the opponent’s King’s Knight’s Pawn. All this is, of course, pure fantasy. The best course to follow is to note the major consequences for two moves, but try to work out forced variations as they go.”

Shannon would, over time, grow more positive that artificial brains would surpass organic brains. Decades would pass before programmers would build a grand-master-level chess computer on the foundations that Shannon helped lay, but he was certain that such an outcome was inevitable. The thought that a machine could never exceed its creator was “just foolish logic, wrong and incorrect logic.” He went on: “you can make a thing that is smarter than yourself. Smartness in this game is made partly of time and speed. I can build something which can operate much faster than my neurons.”

In one sense, the world seen through such eyes looks starkly unequal. “A very small percentage of the population produces the greatest proportion of the important ideas,” Shannon began, gesturing toward a rough graph of the distribution of intelligence. “There are some people if you shoot one idea into the brain, you will get a half an idea out. There are other people who are beyond this point at which they produce two ideas for each idea sent in. Those are the people beyond the knee of the curve.”

It was here, naturally, that Shannon was at his fuzziest. It is a quality of “motivation . . . some kind of desire to find out the answer, the desire to find out what makes things tick.” For Shannon, this was a requirement: “If you don’t have that, you may have all the training and intelligence in the world, [but] you don’t have the questions and you won’t just find the answers.” Yet he himself was unable to nail down its source. As he put it, “It is a matter of temperament probably; that is, a matter of probably early training, early childhood experiences.” Finally, at a loss for exactly what to call it, he settled on curiosity. “I just won’t go any deeper into it than that.”

Here Shannon was more concrete: he proposed six strategies, and the fluency with which he walked his audience through them—drawing P’s for “problems” and S’s for “solutions” on the chalkboard behind him for emphasis—suggests that these were all well-trodden paths in his mind.

You might, he said, start by simplifying: “Almost every problem that you come across is befuddled with all kinds of extraneous data of one sort or another; and if you can bring this problem down into the main issues, you can see more clearly what you’re trying to do.” Of course, simplification is an art form in itself: it requires a knack for excising everything from a problem except what makes it interesting, a nose for the distinction between accident and essence worthy of a scholastic philosopher.

Failing this difficult work of simplifying, or supplementing it, you might attempt step two: encircle your problem with existing answers to similar questions, and then deduce what it is that the answers have in common—in fact, if you’re a true expert, “your mental matrix will be filled with P’s and S’s,” a vocabulary of questions already answered.

If you cannot simplify or solve via similarities, try to restate the question: “Change the words. Change the viewpoint. . . . Break loose from certain mental blocks which are holding you in certain ways of looking at a problem.” Avoid “ruts of mental thinking.” In other words, don’t become trapped by the sunk cost, the work you’ve already put in.

Fourth, mathematicians have generally found that one of the most powerful ways of changing the viewpoint is through the “structural analysis of a problem”—that is, through breaking an overwhelming problem into small pieces.

Fifth, problems that can’t be analyzed might still be inverted. If you can’t use your premises to prove your conclusion, just imagine that the conclusion is already true and see what happens—try proving the premises instead.

Finally, once you’ve found your S, by one of these methods or by any other, take time to see how far it will stretch. The math that holds true on the smallest levels often, it turns out, holds true on the largest.

If there was any tension in the auditorium when he concluded—and invited the audience up to the front to examine a new gadget he’d been tinkering on—it was between Shannon the reluctant company man and Shannon the solitary wonder. The latter was as elusive as ever. There is a famous paper on the philosophy of mind called “What Is It Like to Be a Bat?” The answer, roughly, is that we have no idea. What was it like to be Claude Shannon?

Shannon was able to use each talk to dive deeply into a topic of personal interest. The “Seminar on Information Theory” in the spring term of 1956 served as a carousel for Shannon’s passions. In a lecture titled “Reliable Machines from Unreliable Components,” Shannon presented the following challenge: “In case men’s lives depend upon the successful operation of a machine, it is difficult to decide on a satisfactorily low probability of failure, and in particular, it may not be adequate to have men’s fates depend upon the successful operation of single components as good as they may be.” What followed was an analysis of the error-correcting and fail-safe mechanisms that might resolve such a dilemma.
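
The passage does not say which mechanisms the lecture analyzed, so the following is only an illustrative sketch of the general idea, not Shannon's own construction. It assumes an odd number of identical components that fail independently, with a majority vote deciding the output. The function name and the voting scheme are assumptions made for this example.

```python
from math import comb


def majority_failure_probability(p_fail: float, n: int) -> float:
    """Probability that a majority of n independent components fail.

    Assumption (not from the lecture): the machine takes a majority vote
    over n identical components, each failing independently with
    probability p_fail, so the machine fails only when more than half fail.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd number")
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must lie in [0, 1]")
    needed = n // 2 + 1  # smallest number of failures that forms a majority
    return sum(
        comb(n, k) * p_fail**k * (1.0 - p_fail) ** (n - k)
        for k in range(needed, n + 1)
    )


if __name__ == "__main__":
    # Compare a single component with voted groups of 3, 5, and 7.
    for n in (1, 3, 5, 7):
        print(n, majority_failure_probability(0.1, n))
```

Under these assumptions, redundancy helps only when each component fails less than half the time. Above that threshold, adding components makes the vote worse rather than better.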

In another lecture, “The Portfolio Problem,” Shannon pondered the implications for information theory of illicit gambling: The following analysis, due to John Kelly, was inspired by news reports of betting on whether or not the contestant on the TV program “$64,000 Question” would win. It seems that one enterprising gambler on the west coast, where the program broadcast is delayed three hours, was receiving tips by telephone before the local telecast took place. The question arose as to how well the gambler could do if the communication channel over which he received the tips was noisy.
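
One way to make the noisy-tip question concrete is sketched below, under assumptions that go beyond the passage: the tips arrive over a binary symmetric channel that is correct with probability p (at least 1/2), every bet pays even money, and the gambler stakes a fixed fraction of the bankroll each round. In that setting, Kelly's result is that the best fraction is 2p - 1 and that the resulting growth rate, in bits per bet, equals the channel capacity 1 - H(p). The function names are illustrative only.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy H(p) in bits of a two-outcome source with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def bsc_capacity(p_correct: float) -> float:
    """Capacity in bits per use of a binary symmetric channel."""
    return 1.0 - binary_entropy(p_correct)


def kelly_fraction(p_correct: float) -> float:
    """Bankroll fraction to stake on an even-money bet (assumes p >= 1/2)."""
    return max(0.0, 2.0 * p_correct - 1.0)


def growth_rate(p_correct: float, fraction: float) -> float:
    """Expected log2 growth of the bankroll per bet for a fixed fraction."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must lie in [0, 1)")
    win = p_correct * math.log2(1.0 + fraction)
    lose = (1.0 - p_correct) * math.log2(1.0 - fraction)
    return win + lose


if __name__ == "__main__":
    p = 0.75  # hypothetical tip reliability
    f = kelly_fraction(p)
    print(f, growth_rate(p, f), bsc_capacity(p))
```

With a hypothetical reliability of 0.75, the sketch stakes half the bankroll each round. Staking more or less than that lowers the long-run growth rate.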

As he wrote to his supervisor, Hendrik Bode, “It always seemed to me that the freedom I took [at the Labs] was something of a special favor.” Bell Labs, understandably, didn’t see it that way. They made a counteroffer, with a generous increase in Shannon’s salary. But, in the end, it wasn’t enough to sway him. His letter of resignation was a thoughtful weighing of industry against the academy. “There are certainly many points of superiority at Bell Labs,” Shannon writes. “Perhaps most important among these is the freedom from teaching and other duties with a consequent increase in time available for research.” Shannon acknowledged, too, that Bell Labs was offering him more money than MIT, “although the differential was not great in my case and, at any rate, I personally feel other issues are much more important.”

Bell Labs’ somewhat remote location in New Jersey was a complicating factor in its own right. “The essential seclusion and isolation of Bell Labs has both advantages and disadvantages. It eliminates a good many time-wasting visitors, but at the same time prevents many interesting contacts. Foreign visitors often spend a day at Bell Laboratories but spend six months at MIT. This gives opportunities for real interchange of ideas.” Bell Labs matched and even exceeded MIT in the caliber of its thinking, Shannon allowed. But in the end, “the general freedom in academic life is, in my view, one of its most important features. The long vacations are exceedingly attractive, as is also the general feeling of freedom in hours of work.”

Shannon became a whetstone for others’ ideas and intuitions. Rather than offer answers, he asked probing questions; instead of solutions, he gave approaches. As Larry Roberts, a graduate student of that time, remembered, “Shannon’s favorite thing to do was to listen to what you had to say and then just say, ‘What about . . .’ and then follow with an approach you hadn’t thought of. That’s how he gave his advice.” This was how Shannon preferred to teach: as a fellow traveler and problem solver, just as eager as his students to find a new route or a fresh approach to a standing puzzle.

One anecdote, from Robert Gallager, captures both the power and subtlety of Shannon’s approach to the work of instruction: I had what I thought was a really neat research idea, for a much better communication system than what other people were building, with all sorts of bells and whistles. I went in to talk to him about it and I explained the problems I was having trying to analyze it. And he looked at it, sort of puzzled, and said, “Well, do you really need this assumption?” And I said, well, I suppose we could look at the problem without that assumption. And we went on for a while. And then he said, again, “Do you need this other assumption?” And I saw immediately that that would simplify the problem, although it started looking a little impractical and a little like a toy problem. And he kept doing this, about five or six times. I don’t think he saw immediately that that’s how the problem should be solved; I think he was just groping his way along, except that he just had this instinct of which parts of the problem were fundamental and which were just details. At a certain point, I was getting upset, because I saw this neat research problem of mine had become almost trivial. But at a certain point, with all these pieces stripped out, we both saw how to solve it. And then we gradually put all these little assumptions back in and then, suddenly, we saw the solution to the whole problem. And that was just the way he worked. He would find the simplest example of something and then he would somehow sort out why that worked and why that was the right way of looking at it.

Irwin Jacobs, an MIT student of that era and later the founder of Qualcomm, recalled: “People would go in, discuss a new idea, and how they were approaching it—and then he’d go over to one of his filing cabinets and pull out some unpublished paper that covered the material very well!”

It wasn’t only Shannon’s constant presence in the house, or the collection of electromechanical ephemera, that set him apart from other fathers. The Shannons were peculiar in the way that only a family headed by two mathematical minds might be. For instance, when it came time to decide who would handle the dishes after dinner, the Shannons turned to a game of chance: they wound up a robotic mouse, set it in the middle of their dining room table, and waited for the mouse to drop over one of the edges—and thus select that evening’s dishwasher. Then there were the spontaneous moments of math instruction. At a party hosted by the Shannons, young Peggy Shannon was in charge of the toothpicks. She was carrying a box of them on the house’s verandah—and then dropped it by accident, spilling its contents onto the porch. Her father, standing nearby, paused, took stock of the mess, and then said, “Did you know, you can calculate pi with that?” He was referring to Buffon’s Needle, a famous problem in geometric probability: it turns out that when you drop a series of needles (or toothpicks) on an evenly lined floor, the proportion of needles falling across a line can be used to estimate pi with surprising accuracy. Most important, Peggy remembered, her dad wasn’t angry with her for the mess. The Shannon household coalesced around the parents’ passions: chess and music became family pastimes, and stock picking and tinkering were a part of everyday life. Shannon took his children to circus performances. Alice in Wonderland, the favorite of many a mathematician, was in the air; Shannon especially enjoyed quoting from “Jabberwocky.” When it came to challenging math assignments, Peggy was regularly pointed in her father’s direction, even though, as she admits, this was overkill; anyone in the household, including her two older brothers, could have helped. He was, by her account, a patient teacher, though he often went on tangents that betrayed his own inclinations.
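
The Buffon's Needle estimate mentioned above can be sketched as a short simulation. For a needle of length l dropped on lines spaced d apart (with l no longer than d), the chance of crossing a line is 2l / (πd), so π is roughly 2l × drops / (d × crossings). The function name, default lengths, and seed below are assumptions made for illustration.

```python
import math
import random


def estimate_pi(num_drops: int, needle_len: float = 1.0,
                line_gap: float = 1.0, seed: int = 0) -> float:
    """Estimate pi by simulating needle drops on evenly spaced lines.

    Requires needle_len <= line_gap. Note the circularity: the simulation
    uses math.pi to draw a random angle, which a real toothpick drop
    would not need.
    """
    if needle_len > line_gap:
        raise ValueError("needle_len must not exceed line_gap")
    rng = random.Random(seed)
    crossings = 0
    for _ in range(num_drops):
        # Distance from the needle's center to the nearest line.
        center = rng.uniform(0.0, line_gap / 2.0)
        # Acute angle between the needle and the lines.
        angle = rng.uniform(0.0, math.pi / 2.0)
        if center <= (needle_len / 2.0) * math.sin(angle):
            crossings += 1
    if crossings == 0:
        raise ValueError("no crossings observed; use more drops")
    return 2.0 * needle_len * num_drops / (line_gap * crossings)


if __name__ == "__main__":
    for drops in (100, 10_000, 200_000):
        print(drops, estimate_pi(drops))
```

The estimate converges slowly, with error shrinking roughly as one over the square root of the number of drops, so a single spilled box of toothpicks gives only a rough figure.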

Even with his aversion to writing things down, the famous attic stuffed with half-finished work, and countless hypotheses circulating in his mind—and even when one paper on the scale of his “Mathematical Theory of Communication” would have counted as a lifetime’s accomplishment—Shannon still managed to publish hundreds of pages’ worth of papers and memoranda, many of which opened new lines of inquiry in information theory. That he had also written seminal works in other fields—switching, cryptography, chess programming—and that he might have been a pathbreaking geneticist, had he cared to be, was extraordinary. Yet Shannon had also come to accept that his own best days were behind him. “I believe that scientists get their best work done before they are fifty, or even earlier than that. I did most of my best work while I was young,” Shannon said.

This belief in an implicit age cap on mathematical genius was hardly unique to Shannon. As the mathematician G. H. Hardy famously wrote, “no mathematician should ever allow himself to forget that mathematics, more than any other art or science, is a young man’s game.”

While there have been notable exceptions to that rule, Shannon was convinced that he would not be one of them. His Bell Labs colleague Henry Pollak recalls visiting Shannon at home in Winchester to bring him up to date on a new development in communications science. “I started telling him about it, and for a brief time he got quite enthused about this. And then he said, ‘Nuh-uh, I don’t want to think. I don’t want to think that much anymore.’ It was the beginning of the end in his case, I think. He just—he turned himself off.” But if Shannon turned off the most rigorous part of his mind, he also freed himself to take a bird’s-eye view of the emerging Information Age that his work had made possible.

“For everybody who built communication systems, before [Shannon], it was a matter of trying to find a way to send voice, trying to find a way to send data, like Morse code,” recalled Gallager. “The thing that Claude said is that you don’t have to worry about all those different things.” Now their worries had a far more productive outlet: the coding, storage, and transmission of bits. “Once all the engineers were doing that, they start making this enormously rapid progress, start finding better and better ways of digitizing things and of storing and of communicating these very simple objects called binary digits, instead of these very complicated things like voice waveforms. If you look at it that way, Shannon is really responsible for the digital revolution.”

I think that this present century in a sense will see a great upsurge and development of this whole information business . . . the business of collecting information and the business of transmitting it from one point to another, and perhaps most important of all, the business of processing it—using it to replace man at semi-rote operations at a factory . . . even the replacement of man in the things that we almost think of as creative, things like mathematics or translating languages. If words like that seem self-evident and unremarkable to us today, it’s worth remembering that Shannon was speaking more than a quarter century before the birth of the World Wide Web, and at a time when virtually all computers were still room-sized.

And yet, Thorp wrote, what impressed him more than any of the gadgets was his host’s uncanny ability to “see” a solution to a problem rather than to muscle it out with unending work. “Shannon seemed to think with ‘ideas’ more than with words or formulas. A new problem was like a sculptor’s block of stone and Shannon’s ideas chiseled away the obstacles until an approximate solution emerged like an image, which he proceeded to refine as desired with more ideas.”

Juggling lacks the nobility of mathematical pastimes like chess or music. And yet the tradition of mathematician-jugglers is an ancient one. As best as we can tell, that tradition began in the tenth century CE in an open-air market in Baghdad. It was there that Abu Sahl al-Quhi, later one of the great Muslim astronomers, got his start in life juggling. A few years later, Al-Quhi became a kind of court mathematician for the local emir, who, fascinated by planetary motion, built an observatory in the garden of his palace and put Al-Quhi in charge. The appointment bore some fine mathematical fruit: Al-Quhi invented an adjustable geometrical compass, likely the world’s first, and led the revival among Muslim geometers of the study of the Greek thinkers Archimedes and Apollonius.

As Graham observed, “mathematics is often described as the science of patterns. Juggling can be thought of as the art of controlling patterns in time and space.” So it’s no surprise that generations of mathematicians could be found on university quads, tossing things in the air and catching them.

But computing was not only a central thread of his life’s work. As the lecture’s title suggested, it was also, always, his hobby—or, as he translated the word for his audience, his shumi. “Building devices like chess-playing machines and juggling robots, even as a ‘shumi,’ might seem a ridiculous waste of time and money,” Shannon admitted. “But I think the history of science has shown that valuable consequences often proliferate from simple curiosity.”

What might proliferate from such curiosities as Endgame and Theseus? I have great hopes in this direction for machines that will rival or even surpass the human brain. This area, known as artificial intelligence, has been developing for some thirty or forty years. It is now taking on commercial importance. For example, within a mile of MIT, there are seven different corporations devoted to research in this area, some working on parallel processing. It is difficult to predict the future, but it is my feeling that by 2001 AD we will have machines which can walk as well, see as well, and think as well as we do.

Incidentally, a communication system is not unlike what is happening right here. I am the source and you are the receiver. The translator is the transmitter who is applying a complicated operation to my American message to make it suitable for Japanese ears. This transformation is difficult enough with straight factual material, but becomes vastly more difficult with jokes and double entendres. I could not resist the temptation to include a number of these to put the translator on his mettle. Indeed, I am planning to take a tape of his translation to a second translator, and have it translated back into English. We information theorists get a lot of laughs this way.

As Robert Gallager put it, “Claude was never a person who depended a great deal on memory, because one of the things that made him brilliant was his ability to draw such wonderful conclusions from very, very simple models. What that meant was that, if he was failing a little bit, you wouldn’t notice it.”

From 1983 to 1993, Shannon continued to live at Entropy House and carry on as well as he could. Perhaps it says something about the depth of his character that, even in the last stages of his decline, much of his natural personality remained intact. “The sides of his personality that seemed to get stronger were the sweet, boyish, playful sides. . . . We were lucky,” Peggy noted. The games and tinkering continued, if at a more measured pace.

His deepest legacy, in some sense, wasn’t the one he owned, but the one woven into the work of others—his students, his admirers, later information theorists, engineers, and mathematicians.


Shannon never acknowledged the contradictions in his fields of interest; he simply went wherever his omnivorous curiosity led him. So it was entirely consistent for him to jump from information theory to artificial intelligence to chess to juggling to gambling—it simply didn’t occur to him that investing his talents in a single field made any sense at all.

The great Russian mathematician Andrey Kolmogorov put it like this in 1963: In our age, when human knowledge is becoming more and more specialized, Claude Shannon is an exceptional example of a scientist who combines deep abstract mathematical thought with a broad and at the same time very concrete understanding of vital problems of technology. He can be considered equally well as one of the greatest mathematicians and as one of the greatest engineers of the last few decades.

He reached the heights of the ivory tower, with all the laurels and professorial chairs to prove it, but felt no shame playing games built for children and writing tracts on juggling. He was passionately curious, but also, at times, unapologetically lazy. He was among the most productive, honored minds of his era, and yet he gave the appearance that he would chuck it all overboard for the chance to tinker in his gadget room.

Courage is one of the things that Shannon had supremely. You have only to think of his major theorem. He wants to create a method of coding, but he doesn’t know what to do so he makes a random code. Then he is stuck. And then he asks the impossible question, “What would the average random code do?” He then proves that the average code is arbitrarily good, and that therefore there must be at least one good code. Who but a man of infinite courage could have dared to think those thoughts? That is the characteristic of great scientists; they have courage. They go forward under incredible circumstances; they think and continue to think.

Importantly, his courage was joined to an ego so self-contained and self-sufficient that it looked, from certain angles, like the absence of ego. This was the keystone quality of Shannon, the one that enabled all the others. At almost every opportunity for self-promotion, Shannon demurred. Mathematicians worry about spending time on problems of insufficient difficulty, what they derisively call “toy problems”; Claude Shannon worked with actual toys in public!

And that is connected, we think, to the other great hallmark of Shannon’s life: the value of finding joy in work. We expect our greatest minds to bear the deepest scars; we prefer our geniuses tortured. But with the exception of a few years in his twenties when Shannon passed through what seems like a moody, possibly even depressive, stage, his life and work seemed to be one continuous game. He was, at once, abnormally brilliant and normally human.

He was drawn to the idea that knowledge was valuable for its own sake and that discovery was pleasurable in its own right. As he himself put it, “I’ve been more interested in whether a problem is exciting than what it will do.”

“He was not interested in forming a company to build unicycles. He was interested in finding out what made unicycles fun and finding out more about them.”

Shannon set four goals for artificial intelligence to achieve by 2001: a chess-playing program that was crowned world champion, a poetry program that had a piece accepted by the New Yorker, a mathematical program that proved the elusive Riemann hypothesis, and, “most important,” a stock-picking program that outperformed the prime rate by 50 percent. “These goals,” he said only half-jokingly, “could mark the beginning of a phase-out of the stupid, entropy-increasing, and militant human race in favor of a more logical, energy conserving, and friendly species—the computer.”

Here is how Arthur Koestler, a physics student turned novelist, once put it: Modern man lives isolated in his artificial environment, not because the artificial is evil as such, but because of his lack of comprehension of the forces which make it work—of the principles which relate his gadgets to the forces of nature, to the universal order. It is not central heating which makes his existence “unnatural,” but his refusal to take an interest in the principles behind it. By being entirely dependent on science, yet closing his mind to it, he leads the life of an urban barbarian.

We would add: it is not the Internet that is unnatural, nor our feast of information, but a refusal to consider what their origins are, how and why they are here, where they sit in the flow of our history, and what kinds of men and women brought them about. We think there is something of an obligation in beginning to learn these things.