A Turing Test for Empathy?

On Wednesday evening, I was the female participant on a panel about artificial intelligence.[1] The event was hosted at the National Club on Bay Street in Toronto. At Friday’s lunch, a colleague who attended to support mentioned that the venue smelled like New York, carried the grime of time in its walls after so many rain storms. Indeed, upon entering, I rewalked into Manhattan’s Downtown Association, returned to April, 2017, before the move to Toronto, peered down from the attic of my consciousness to see myself gently placing a dripping umbrella in the back of the bulbous cloak room, where no one would find it, feeling the mahogany enclose me in peaty darkness, inhaling a mild must that could only tolerate a cabernet, waiting with my acrylic green silk scarf from Hong Kong draped nonchalant around my neck, hanging just above the bottom seam of my silk tunic, dangling more than just above the top seam of my black leather boots, when a man walked up, the manager, and beaming with welcome he said “you must be the salsa instructor! Come, the class is on the third floor!” I laughed out loud. Alfred arrived. Alfred who was made for another epoch, who is Smith in our Hume-Smith friendship, fit for the ages, Alfred who had become a member of the association and, a gentleman of yore, would take breakfast there before work, Aschenbach in Venice, tidily wiping a moist remnant of scrambled eggs from the right corner lip, a gesture chiseled by Joseon porcelain and Ithaca’s firefly summer, where he took his time to ruminate about his future, having left, again, his past.

An early 18th-century Joseon jar. Korean ceramics capture Alfred’s gentle elegance. Yet, he has a complex relationship with his Koreanness.

Upstairs we did the microphone dance, fumbling to hook the clip on my black jeans (one of the rare occasions where I was wearing pants). One of my father’s former colleagues gave the keynote. He walked through the long history of artificial intelligence, starting with efforts to encode formal logic and migrating through the sine curve undulations of research moving from top-down intelligent design (e.g., expert systems) to bottom-up algorithms (e.g., deep convolutional neural networks), abstraction moving ever closer to data until it fuses, meat on a bone, into inference.[2] He proposed that intellectual property had shifted from owning the code to building the information asset. He hinted at a thesis I am working to articulate in my forthcoming book about how contemporary, machine learning-based AI refracts humanity through the convex (or concave or distorted or whatever shape it ends up being) mirror of the space of observation we create with our mechanisms for data capture (which are becoming increasingly capacious with video and Alexa in every home, as opposed to being truncated to blips in clickstream behavior or point of sale transactions), our measurement protocol, and the arabesque inversions of our algorithms. The key thing is that we no longer start with an Aristotelian formal cause when we design computational systems, which means, we no longer imagine the abstract, Platonic scaffold of some act of intelligence as a pre-condition of modeling it. Instead, as Andrej Karpathy does a good job articulating, we stipulate the conditions for a system to learn bottom-up from the data (this does not mean we don’t design, it’s just that the questions we ask as we make the systems require a different kind of abstraction that is affiliated with induction (as Peter Sweeney eloquently illustrates in this post)). This has pretty massive consequences for how we think about the relationship between man and machine. We need to stop pitting machine against man.
And we need to stop spouting obsequious platitudes that the “real power comes from the collaboration of man and machine.” There’s something of a sham humanism in those phrases that I want to get to the bottom of. The output of a machine learning algorithm always already is, and becomes even more, as the flesh of abstraction moves closer to the bone of data (or vice versa?), the digested and ruminated and stomach acid-soaked replication of human activity and behavior. It’s about how we regurgitate. That’s why it does indeed make sense to think about bias in machine learning as the laundering of human prejudice.

A woman in the audience posed the final question to the panelists: you’ve spoken about the narrow capabilities of machine learning systems, but will it be possible for artificial intelligence to learn empathy?

A fellow panelist took the Turing Test approach: why yes, he said, there has been remarkable progress in mimicking even this sacred hallmark of the limbic system. It doesn’t matter if the machine doesn’t actually feel anything. What matters is that the machine manifests the signals of having felt something, and that may well be all that matters to foster emotional intelligence. He didn’t mention Soul Machines, a New Zealand-based startup making “incredibly life-like, emotionally responsive artificial humans with personality and character,” but that’s who I’d cite as the most sophisticated example of what UX/UI design can look like when you fuse the skill set of cinematic avatars, machine learning scientists, and neuroscientists (and even the voice of Cate Blanchett).

I disagreed. I am no affect expert (just a curious generalist fumbling my way through life), but believe empathy is remarkably complex for many reasons.

I looked at her directly, deeply. And not just at her, I looked into her. And what I mean by looking into her is that I opened myself up a little, wasn’t just a person protected by the distance of the stage (or, more precisely, the 4 brown leather bar stools with backs so low they only came up to vertebra 4 or 5, and all of us leaned in and out trying to find and keep a dignified posture, hands crossed into serenity, sometimes leaning forward). Yes, when I opened myself to engage with her I leaned forward almost to the point of resting my elbows on my thighs, no longer leaning back and, every few moments, returning my attention to the outer crevices of my eyes to ensure they were soft as my fellow panelists spoke. And I said, think about this. I’m up here on stage perceiving what I’m perceiving and thinking what I’m thinking and feeling what I’m feeling, and somehow, miraculously, I can project what I think you’re perceiving, what I think you’re thinking, what I think you’re feeling, and then, on top of that, I can perhaps, maybe, possibly start to feel what you feel as a result of the act of thinking that I think what you perceive, think, and feel. But even this model is false. It’s too isolated. For we’ve connected a little, I’m really looking at you, watching your eyes gain light as I speak, watching your head nod and your hands flit a little with excitement, and as I do this we’re coming together a little, entangling ourselves to become, at least for this moment, a new conjoint person that has opened a space for us to jointly perceive, think, and feel. We’re communicating. And perhaps it’s there, in that entangled space, where the fusion of true empathy takes place, where it’s sound enough to impact each of us, real enough to enable us to notice a change in what we feel inside, a change occasioned by connection and shared experience.

An emotional Turing test would be a person’s projection that another being is feeling with her. It wouldn’t be entangled. It would be isolated. That can’t be empathy. It’s not worthy of the word.

But, how could we know that two people actually feel the same feeling? If we’re going to be serious, let’s be serious. Let’s impose a constraint and say that empathy isn’t just about feeling some feeling when you infer that another person is feeling something, most often feeling something that would cause pain. It’s literally feeling the same thing. Again, I’m just a curious generalist, but I know that psychologists have tools to observe areas of the brain that light up when some emotional experience takes place; so we could see if, during an act of empathy, the same spot lights up.[3] Phenomenologically, however, that is, as the perceived, subjective experience of the feeling, it has to be basically impossible for us to ever feel the exact same feeling. Go back to the beginning of this blog post. When I walked into the National Club, my internal experience was that of walking into the Downtown Association more than 1.5 years earlier. I would hazard that no one else felt that, no one else’s emotional landscape for the rest of the evening was then subtly impacted by the emotions that arose during this reliving. So, no matter how close we come to feeling with someone when our emotional world is usurped, suddenly, by the experience of another, it’s still grafted upon and filtered through the lens of time, of the various prior experiences we’ve had that trigger that response and come to shape it. As I write, I am transported back to two occasions in my early twenties when I held my lovers in my arms, comforting and soothing them after each had learned about a friend’s suicide. We shared emotion. Deeply. But it was not empathy. My experience of their friends’ suicide was far removed. It was compassion, sympathy, but close enough to the bone to provide them space to cry.

So then, if it’s likely impossible to feel the exact same feeling, we should relax the constraint and permit that empathy need not be deterministic and exact, but can be recognized within a broader range. We can make it a probabilistic shared experience, an overlap within a different bound. If we relax that constraint, then can we permit a Turing test?

I still don’t think so. Unless we’re ok with sociopaths.

But how about this one. Once I was running down Junipero Serra Boulevard in Palo Alto. It was a dewy morning, dewy as so many mornings are in Silicon Valley. The rhythms of the summer are so constant: one wakes up to fog, daily, fog coming thick over the mountains from the Pacific. Eventually the fog burns and if you go on a bike ride down Page Mill Road past the Sand Hill exit to 280 you can watch how the world comes to life in the sun, reveals itself like Michelangelo reveals form from marble. There was a pocket of colder, denser, sweeter smelling air on the 6.5-mile run I’d take from the apartment in Menlo Park through campus and back up Junipero Serra. I would anticipate it as I ran and was always delighted by the smell that hit me; it was the smell of hose water when I was a child. And then I saw a deer lying on the side of the road. She was huge. Her left foot shook in pain. Her eyes were pleading in fear. She begged as she looked at me, begged for mercy, begged for feeling. I was overcome by empathy. I stopped and stood there, still, feeling with her for a moment before I slowly walked closer. Her foot twitched more rapidly with the wince of fear. But as I put my hand on her huge, hot, sweating belly, she settled. Her eyes relaxed. She was calmed and could allow her pain without the additional fear of further hurt. I believe we shared the same feeling at that moment. Perhaps I choose to believe that, if only because it is beautiful.

The moment of connection only lasted a few minutes, although it was so deep it felt like hours. It was ruptured by men in a truck. They honked and told me I was an idiot and would get hurt. The deer was startled enough to jump up and limp into the woods to protect herself. I told the men their assumptions were wrong and ran home.

You might say that this is textbook Turing test empathy. If I can project that I felt the exact same feeling as an animal, if I can be that deluded, then what’s stopping us from saying that the projection and perception of shared feeling is precisely what this is all about, and therefore it’s fair game to experience the same with a machine?

The sensation of love I felt with that deer left a lasting impression on me. We were together. I helped her. And she helped me by allowing me to help her. Would we keep the same traces of connection from machines? Should empathy, then, be defined by its durability? By the fact that, if we truly do connect, it changes us enough to stay put and be relived?

There are, of course, moments when empathy breaks down.

Consider breakdowns in communication at work or in intimate relationships. Just as my memory of the Downtown Association shaped, however slightly, my experience at Wednesday’s conference, so too do the accumulated interactions we have with our colleagues and partners reinforce models of what we think others think about us (and vice versa). These mental models then supervene upon the act of imagination to perceive, think, and feel like someone else. It breaks. Or, at the very least, distorts the hyperparameters of what we can perceive. Should anything be genuinely shared in such a tangled web, it would be the shared awareness of the impossibility of identification. I’ve seen this happen with teams and seen it happen with partners. Ruts and little walls that, once built, are very difficult to erode.

Another that comes to mind is the effort required to empathize deeply with people far away from where we live and what we experience. When I was in high school, Martha Nussbaum, a philosopher at the University of Chicago who has written extensively about affect, came and gave a talk about the moral failings of our imagination. This was in 2002. I recall her mentioning that we obsess far more deeply, we feel far more acutely, about a paper cut on our index finger or a blister on our right heel, than we do when we try to experience, right here and now, the pain of Rwandans during the genocide, of Syrian refugees packed damp on boats, of the countless people in North America razed from fentanyl. On the talk circuit for his latest book, Yuval Harari comments that we’ve done the conceptual work required to construct and experience a common identity (and perhaps some sort of communal empathy) with people we’ll never meet, who are far outside the local perception of the tribe, in constructing the nation. And that this step from observable, local community to imagined, national community was a far steeper step function than the next rung in the ladder from national to global identity (8,000,000 and 7,000,000,000 are more or less the same for the measly human imagination, whereas 8,000,000 feels a lot different than 20). Getting precise on the limits of these abstractions feels like worthwhile work for 21st-century ethicists. After all, in its original guise, the trolley problem was not a deontological tool for us to pre-ponder and encode utilitarian values into autonomous vehicles. It was a thinking tool to illustrate the moral inevitability of presence.


I received LinkedIn invites after the talk. One man commented that he found my thoughts about empathy particularly insightful. I accepted his invitation because he took the time to listen and let me know my commentary had at least a modicum of value. I’ll never know what he felt as he sat in the audience during the panel. I barely know what I felt, as two and a half days of experience have already intervened to reshape the experience. So we grow, beings in time.


[1] Loyal blog readers will have undoubtedly noticed how many posts open with a similar sentence. I speak at a ton of conferences. I enjoy it: it’s the teacher’s instinct. As I write today, however, I feel alienated from the posts’ algorithmic repetition, betokening the rhythm of my existence. Weeks punctuated by the sharp staccato of Monday’s 15-minute (fat fully cut) checkins, the apportioned two hours to rewrite the sales narrative, the public appearances that can be given the space to dilate, and the perturbations flitting from interaction to interaction, as I gradually cultivate the restraint to clip empathy and guard my inside from noxious inputs. Tuesday morning, a mentor sent me this:

Screen Shot 2018-11-10 at 7.43.32 AM

[2] This is a loaded term. I’m using it here as a Bayesian would, but won’t take the time to unpack the nuances in this post. I interviewed Peter Wang for the In Context podcast yesterday (slated to go live next week) and we spoke about the deep transformation of the concept of “software” we’re experiencing as the abstraction layer that commands computers to perform operations moves ever closer to the data. Another In Context guest, David Duvenaud, is allergic to the irresponsible use of the word “inference” in the machine learning community (here’s his interview). Many people use inference to refer to a prediction made by a trained algorithm on new data it was not trained on: so, for example, if you make a machine learning system that classifies cats and dogs, the training stage is when you show the machine many examples of images with labels cat and dog and the “inference” stage is when you show the machine a new picture without a label and ask it, “is this a cat or a dog?” Bayesians like Duvenaud (I think it’s accurate to refer to him that way…) reserve the term inference for the act of updating the probability of a hypothesis in light of new observations and data. Both cases imply the delicate dance of generalization and induction, but imply it in different ways. Duvenaud’s concern is that by using the word imprecisely, we lose the nuance and therefore our ability to communicate meaningfully and therefore hamper research and beauty.
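To make the two senses of “inference” in this footnote concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption: the toy threshold rule standing in for a trained classifier, the function names, and the numbers are invented for the example and come from neither Duvenaud nor any real library.

```python
def predict(ear_pointiness: float) -> str:
    """'Inference' in the common ML sense: apply an already-trained rule to a new input.

    The hard-coded threshold is a hypothetical stand-in for a trained model.
    """
    return "cat" if ear_pointiness > 0.5 else "dog"


def bayes_update(prior: float, likelihood_if_true: float, likelihood_if_false: float) -> float:
    """'Inference' in the Bayesian sense: revise P(hypothesis) after one observation."""
    evidence = prior * likelihood_if_true + (1.0 - prior) * likelihood_if_false
    return prior * likelihood_if_true / evidence


# Prediction: the rule stays fixed; we only read off an answer for a new input.
label = predict(0.8)

# Bayesian updating: belief in the hypothesis "this animal is a cat" starts at 0.5.
# We then hear a meow, assumed here to be likelier from a cat (0.9) than a dog (0.1).
posterior = bayes_update(prior=0.5, likelihood_if_true=0.9, likelihood_if_false=0.1)
print(label, round(posterior, 2))
```

In the first call nothing about the model changes; in the second, the observation changes the belief itself, which is the sense of the word Bayesians want to protect.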

[3] Franco Moretti once told me that similar areas of the brain light up when people read Finnegans Wake (or was it Ulysses? or was it Portrait of the Artist?) and the Bible (maybe Ecclesiastes?).

The featured image is Edouard Manet’s Olympia, unveiled in Paris in 1865. In the context of this post, it illustrates the utter impossibility of our empathizing with Olympia. The scorn and contempt in her eyes protects her and gives her power. She thwarts any attempt at possession through observation and desire, perhaps because she is so distanced from the maid offering her flowers, deflecting her gaze out towards the observer but looking askance, protecting within her the intimations of what she has just experienced, of the fact that there was a real lover but it was not and will never be you. Manet cites Titian’s Venus of Urbino (1534), but blocks all avenues for empathy and connection, empowering Olympia through her distance.

Titian’s Venus of Urbino has a little sleeping dog, not a bristling black cat

The AlphaGo Documentary

Our narratives of Man versus Machine focus on Machine becoming Man, then surpassing Him.

Man is something that shall be overcome. Man is a rope, tied between beast and overman–a rope over an abyss. What is great in man is that he is a bridge and not an end. (Zarathustra, thus imaginarily reported by Friedrich Nietzsche)

The AlphaGo documentary is about Man qua Man[1], or, more precisely, about one man by the name of Lee Sedol, who has a soft, high-pitched voice, a wife, and a daughter. In March of 2016, Sedol went from being well known to Go fans to being well known to everyone after losing 4 out of 5 games to AlphaGo, a computer built by machine learning engineers at DeepMind.

Here is what the film beckoned me to see, feel, and infer.

1. Fear eats Man’s mind

And thus the native hue of resolution/ Is sicklied o’er with the pale cast of thought (Hamlet, Act III, Scene I) [2]

Sedol is a champion. He has cultivated excellence, put in his 10,000 hours of practice. Played game after game after game to get where he is today, working humbly and patiently with his coach. Playing Go the way he plays is an act of respect towards his elders, his nation, his family.

For Sedol, therefore, the match against AlphaGo was much more than a match. It was the appointed time to exhibit elegance, grace, and creativity above and beyond standard play. The moment when he left the hallowed halls of practice to squint into the harsh lights of the stage. When they applauded. When he bowed, and, lifting his head as slowly as possible to protract time into the infinite dilation of Cantor’s continuity, pupils dilating into eye drop blurs, seconds half-lived to infinitesimals, further, until he couldn’t stop it anymore, until, as he raised his head back up, he noticed his sense of self had changed, he observed himself being observed, knowing everyone was watching, rid himself of the caterpillar cloak called Lee Sedol to stretch his powdery wings as Man. He had become an allegory of human intelligence pitted against the machine.

No biggie. You got this. Just a little blip in history. Just a game. Underwear. Chickens in underwear with scraggly little legs hobbling under the weight of tubby guts bloated with donuts and Budweiser. Just like yesterday when no one was watching.

What a horrible place to be.

And yet, we honor it. We honor the resilience of the golfer who keeps his cool after a dud shot hooks way too far left. We honor the focus of the concert violinist who can make her way through the Mephistophelian haze of a Paganini caprice. We honor the ease excellent TED Talk speakers find when they share an idea they believe in. We honor it because we know how hard it is. Because we recognize that the difference between good and excellent is the fortitude of practice and the gumption to keep the mind in check, to settle its sabotage, to focus.

We are all Hamlet. Some of us more than others.

Sedol is also Hamlet. The documentary does a marvelous job eliciting our empathy as we watch him doubt, furrow, fear, apologize, strategize, wrestle with the pastiche reflection of what he could have done, who he could have been, how the narrative could have gone if only he had done this move instead of that move. We never hear the voices in his head but we can infer their clamor: “calm down, stay here, focus.” Sedol plays the game in context. He knows the stakes of the match and has no choice but to devote a portion of his brain to the everything else that is not the local task. It’s plausible that only 30% of his brain power could be devoted to the actual game.

AlphaGo has no voices in its head. It has no runaway probabilities. The only probabilities it calculates span the trees it searches to find the next move and win.

2. Man is a social animal who relies on nonverbal communication

Man is by nature a social animal; an individual who is unsocial naturally and not accidentally is either beneath our notice or more than human. Society is something that precedes the individual. Anyone who either cannot lead the common life or is so self-sufficient as not to need to, and therefore does not partake of society, is either a beast or a god. (Aristotle, Politics) [3]

AlphaGo has no hands. It has no face. Unless DeepMind decides to embody future versions in a robot somersaulting down the uncanny valley, it will never feel the silky lamination of a Go stone, never calm its nerves by methodically circling the stone between the pads of its right thumb and index finger as it contemplates its next move.

Like the infamous Godot in Samuel Beckett’s play, in the documentary, AlphaGo feels more like a prop than a character. It’s undoubtedly there, ubiquitous, but somehow also absent. Sedol engages with AlphaGo through a ventriloquist named Aja Huang, a Taiwanese computer scientist on the DeepMind team who is also an amateur 6-dan Go player. Sedol never engages with AlphaGo directly: only with its diplomat, its emissary.

Huang is no throw-away character. The ventriloquist could have been anyone: his task was to look at the digital display indicating AlphaGo’s move and translate this to the physical board by placing the stone in the right place. He could have carried out this task with zero knowledge of what it meant. Brawn without brains. Pure, robotic execution.

The positions of the stones mean something to Huang. He bridges two ways of seeing the game, like a computer scientist charting probabilities and like a Go player strategizing moves.

And this means that his face could have relayed emotional content back to Sedol, allowing the champion to plunder the emotional cues that are such an integral part of the game. In the first match, Sedol felt alienated because when he looked up at Huang to gather information from his temples, eyebrows, forehead, pupils, cheeks, lips, chin, elbows, freckles, arm hairs, face hairs, eyes, sweat beads, breath, aura, the signals were absent. Huang didn’t exhibit the weight of concentration or even the active restraint of a bluff. It was almost worse that he wasn’t just a robot man because he had enough knowledge to lead Sedol to anticipate emotional cues but fell short because his ego wasn’t engaged. He was, in the end, only an observer. The stage shifted to a theater of deliberate alienation, as in the movie The Lobster.

Huang is like Colin Farrell’s character in The Lobster because his emotional response does not match the context.

This inverted uncanny valley tells us something about how we communicate. It’s cliché to underscore the importance of nonverbal communication, but it was quite powerful to see how much Sedol typically relies upon emotional cues as animal, as mammal, when he plays against a normal opponent, and how the absence of those cues threw him off. I suspect some of the reticence we feel around trust and explainability stems from our brains processing the world as animals. We don’t actually require explanations from people to trust them and obey them. Power and persuasion seep through different seams.

3. Computer scientists and subject matter experts see the same thing different ways

While Huang speaks neural network and speaks Go, most of the DeepMind scientists lack the same bilingual subject matter expertise (I may be incorrect, but I’m pretty sure not everyone who worked on AlphaGo knows the game). Indeed, one fascinating aspect of contemporary machine learning is that the systems can learn what aspects of the data are relevant for a prediction or classification task rather than having a person apply their knowledge to hand pick which aspects will be most relevant. This is not universally the case, and it’s not to denigrate the value of subject matter expertise: on the contrary, there is excellent research afoot to make it easier for people with subject matter expertise in some domain–be that cancer diagnostics or fashion taste or 50 years of experience tweaking knobs to offset the quirks of an office building in lower Manhattan–to represent their knowledge as distributions and parameters without needing to be a scientist to do so. But a characteristic of the deep learning moment is that a crafty scientist can consider a problem abstractly, move away from the particular details we observe as the problem’s phenotype (e.g., a move in the game of Go) and focus on the mathematical underpinnings of the problem (e.g., the number of hidden layers or some other architectural choices to make in a neural network). Add to this that what makes a machine learning problem a machine learning problem is that there is too much variance for us to deterministically write out all the rules: instead, we provide primers that enable the system to iterate fast (that’s where we need all the computational power) to map inputs to outputs until the mapping works well most of the time. It’s like selecting the yeast that will yield the best bread.
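For readers who want to see what “iterate fast to map inputs to outputs” can look like at its very smallest, here is a hedged Python sketch. It is a toy of my own choosing, not anything from AlphaGo: the function name, the data, and the settings are all assumptions, and the technique is plain gradient descent on a straight line rather than a neural network. The point is only that nobody writes the rule y = 2x + 1 into the program; the loop finds it from examples.

```python
def fit_line(xs, ys, steps=5000, learning_rate=0.01):
    """Fit y = w * x + b by repeatedly nudging w and b to shrink the squared error."""
    w, b = 0.0, 0.0  # the "primer": a starting guess plus a recipe for improving it
    n = len(xs)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step a little in the direction that reduces the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Example inputs and outputs; the hidden rule that generated them is y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

The scientist’s choices here (a line rather than a curve, the step size, the number of iterations) are the yeast; the data does the rest.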

Programs for playing games often fill the role in artificial intelligence research that the fruit fly Drosophila plays in genetics. Drosophilae are convenient for genetics because they breed fast and are cheap to keep, and games are convenient for artificial intelligence because it is easy to compare a computer’s performance on games with that of a person. (John McCarthy and Ed Feigenbaum, Tribute to Arthur Samuel) 

The AlphaGo documentary did a wonderful job juxtaposing how computer scientists tracked the game’s progress and how Sedol and the Go commentators tracked the game’s progress. The scientists viewed the game mathematically, as a series of abstract scores and probabilities. The players viewed the game phenotypically, as a series of moves on the board. It was two fundamentally different ways of viewing the same problem, illustrating the silos of communication that quickly emerge like tectonic plates shooting mountain sprouts in any enterprise. The endgame for the opponents was also quite different. The AlphaGo team was fundamentally interested in using Go as a testing ground for computational possibility, the particular use case required to explore the larger problem of building a system that can act intelligently. Sedol was fundamentally interested in playing perfect Go, and potentially abstracting lessons from play to other aspects of his life. These conflicting endgames are often at work in the dialectic of innovation, yin and yang dancing drunk through the discrete step changes of technological progress.

The movie did an excellent job helping the viewer appreciate the different ways computer scientists and subject matter experts view the same problem.

I do wonder if we could rewrite the narrative of Man versus Machine as one of two different ways of creating, encapsulating, and sharing knowledge. The documentary made this about Demis Hassabis and the AlphaGo Team versus Sedol, West versus East, traditional culture versus computer science, two ways of representing knowledge and viewing the world. It’s ultimately a more grounded narrative. In our HBR Ideacast episode, I suggested to host Sarah Green Carmichael that it’s helpful to reframe a supervised learning system as “one human judgment versus the statistical average of thousands of human judgments,” and then ask which one you’d rather rely on. Granted, the new AlphaGo Zero system is one of self play, not one that mines past human judgment. But the yeast primer is still coded and crafted by human minds with a particular way of framing problems as engineered mathematical models.

4. Algorithms change how Man makes sense of the world

In a 2011 TED Talk, Kevin Slavin explained how trading algorithms have reshaped the physical landscape (we build structures to transmit the fastest signal possible so our algos can outcompete one another by fractions of a second). In a 2018 phone conversation, my partner John Frankel at ffVC helped me crystallize my understanding that task-specific machine learning algorithms are poised to reshape–if not already actively reshaping–our cognitive landscape.[4]

Much of the language used to describe AlphaGo betokens alienness and strangeness, facets of thought that are not only not human but antihuman. From a 2017 Atlantic article:

Since May, experts have been painstakingly analyzing the 55 machine-versus-machine games. And their descriptions of AlphaGo’s moves often seem to keep circling back to the same several words: Amazing. Strange. Alien.

“They’re how I imagine games from far in the future,” Shi Yue, a top Go player from China, has told the press. A Go enthusiast named Jonathan Hop who’s been reviewing the games on YouTube calls the AlphaGo-versus-AlphaGo face-offs “Go from an alternate dimension.” From all accounts, one gets the sense that an alien civilization has dropped a cryptic guidebook in our midst: a manual that’s brilliant—or at least, the parts of it we can understand.

AlphaGo makes a few moves in the match versus Sedol that flummox him. As a non-Go player, I couldn’t make sense of the moves myself, but relied upon the commentary and interpretation offered by the film. What I took away was the sense that AlphaGo did not rely upon the same leading indicator heuristics that are the common tropes of seasoned Go players. If we think about it, it shouldn’t come as a surprise that the search space of 10^172 positions (according to 11th-century Chinese scholar Shen Kuo) contains brilliance that has to date evaded master Go players. But what’s even more interesting is the cultural significance of knowledge transfer from generation to generation. If Go is a staging ground for life, then mastery can and should be measured by analogical transferability and applicability. It’s like a pedagogical philosophy that values critical thinking: teach them Shakespeare, teach them whatever nouns you want, but focus on enabling them to transfer the verbs so they can shape shift to solve problems as they arise.

What comes off as alien is a system that is optimized for one task, regardless of analogy and transfer. And one need only win Go by one point, not many. The move with the highest likelihood of winning by the narrowest of margins will look different than the move that betokens potentially less likelihood of success, but a larger cushion. Map this to making big choices in life: most people study the safe subject to keep options open rather than following the risky path of studying what they love. The tradeoffs are different. They’re optimizing for different types of outcomes and using a different calculus.
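
A toy sketch in Python makes the difference in calculus concrete. The move names and the numbers are invented for illustration, and this is in no way how AlphaGo actually evaluates a board; it only shows that ranking moves by chance of winning and ranking them by expected margin can pick different moves.

```python
# Hypothetical candidate moves. Each maps to a list of (final_margin, probability)
# outcomes, where a positive margin means a win by that many points.
# All names and numbers are made up for illustration.
candidate_moves = {
    "narrow_but_safe": [(1, 0.80), (-1, 0.20)],
    "big_cushion": [(15, 0.60), (-10, 0.40)],
}


def win_probability(outcomes):
    """Total probability of finishing ahead, by any margin."""
    return sum(p for margin, p in outcomes if margin > 0)


def expected_margin(outcomes):
    """Probability-weighted average of the final point margin."""
    return sum(margin * p for margin, p in outcomes)


# A system tuned only to win prefers the move most likely to finish ahead.
best_by_win_prob = max(
    candidate_moves, key=lambda m: win_probability(candidate_moves[m])
)

# A player who values a comfortable cushion prefers the larger expected margin.
best_by_margin = max(
    candidate_moves, key=lambda m: expected_margin(candidate_moves[m])
)

print(best_by_win_prob)
print(best_by_margin)
```

With these made-up numbers the two rankings disagree: the one-point move wins more often, while the cushion move wins bigger when it wins at all.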

So, following John Frankel, I’d like to propose that our heuristics will change as our minds increasingly engage with tools that optimize ruthlessly against one task. Machine Go is different than Man Go because it’s not designed as a pedagogical tool to teach life lessons. It’s designed to win, designed to exploit the logic of one search space and one game and one set of rules. But that need not be all that bad. There’s something lovely in coming to terms with the fact that success only requires one point, that we need not rely upon the greedy heuristics that are familiar as we navigate the world. What’s deemed as alien is a means of coming to terms with our own predilections to generalize, when it may not always (or often) be the best bellwether of success. It’s the inverse interpretation of Bostrom’s paper clip optimization monster. An invitation for us to ponder our values and ethical stance as we increasingly interact with algorithms geared to optimize without questioning if that’s ultimately what we want and need.

The Pythagorean Cup is at once practical joke, physics lesson, and moral chastiser. If you are greedy and put too much wine into the cup, a siphon effect kicks in and all the liquid drains out. This kind of analogical triple meaning is the opposite of algorithmic thinking in its current form.

Conclusion

The AlphaGo documentary left me feeling empathy and admiration for Lee Sedol. Not as a Go champion, not as an allegory of Man’s Intelligence, but as a man. His humility was beautiful. His striving was admirable. His kindness towards his daughter was noticeable. His Korean duty was evident. He was many features cobbled into a being, with feelings and a heartbeat, and a mind. He learned something from the matches and lost gracefully, shaking Hassabis’ hand as he left the press conference, cameras flashing in his wake.


[1] Gender warriors, please do forgive me. There’s implicit critique about who controls the AI narrative in keeping the reference to Man, and I find the capitalization lends a curious aura of allegory to this post, which is riddled with references to male heroes.

[2] Rainer Werner Fassbinder (whose name I always mistake for Rainer Maria until I remember that’s the other Rainer, the Rilke Rainer) has a marvelous film entitled Angst Essen Seele Auf, translated as Ali: Fear Eats the Soul (which I think should be translated as Ali: Fear Eat the Soul to better capture the grammatical error Ali, one of the film’s protagonists, makes when he speaks broken German without conjugating verbs) about an “almost accidental romance kindled between a German woman in her mid-sixties and a Moroccan migrant worker around twenty-five years younger.” While released in 1973, the lessons are all the more relevant today. The other eating metaphor on my mind is Andreessen’s software eating the world, and now Steven Cohen and Matthew Granade saying that models will run the world. What Cohen and Granade get right in their article is that AI systems are about much more than just Jupyter notebooks with models. You have to put models into production, use them as hypotheses to build closed-loop systems that get better as they engage with the world. So, so, so, so, so, so, so many companies still seem to miss this part. It’s hard, and requires work that isn’t deemed sexy by the cognoscenti and the rockstars (how awesome is the word cognoscenti?).

[3] I’ve been thinking a lot about the social value of work and the workplace and have the early, what-my-mind-does-walking-and-running-level intuitions of a blog post about why work is an opportunity to experience positive, Aristotelian freedom (where self-actualization occurs through participation in a common, social goal) versus negative freedom (how we normally conceptualize freedom as the absence of constraint for the individual) and what that means for team and meaning and also the intrinsic value of work (for the leisure promised by some UBI pundits rubs me the wrong way; not all UBI pundits believe self-actualization is an individual project, and the most sober ones think it’s a bump needed to become more socially connected (including Charles Murray, which is interesting…)). Stay tuned.

[4] John has an uncanny ability to understand and represent the heart of the matter in emerging technologies. It’s a privilege to learn from him. I’ve mentioned this before on the blog, but John also has the world’s best out-of-office emails, which have inspired my own (mine are far less sardonic and far more earnest, not by choice but by the ineluctable traps of my style).

The featured image is from an article the newspaper Korea Portal posted March 15, 2016. In the article, Sedol says: “I wanted to end the tournament with good results, but feel sad that I couldn’t do it. As I said before, this is not a loss for man, but a loss for me. This tournament really showed what my shortcomings are.” As in the documentary, Sedol interprets his loss as a personal failure. He doesn’t view himself as the representative of mankind. This isn’t man versus machine. It is one match. One man versus his opponent. But because the opponent doesn’t feel like Sedol does, doesn’t care if it wins, it becomes one man versus himself. 

Who’s allowed to write about technology?

I recently published an article about explainability in machine learning systems for the Harvard Business Review. The article argues that many businesses get stuck applying machine learning because they worry about black boxes; that they should think about what matters for a given use case, as sometimes other governance and assessment metrics are more relevant than an explanation (e.g., precision and recall for information retrieval); and that a close reading of recital 71 in the EU GDPR suggests that an individual’s right to an explanation applies to the procedures used to build and govern the entire system, not to which input features, with which weights, lead to which outputs.[1]

The article’s goal is to help businesses innovate. It seeks to empower people by helping them ask the right questions. The battle cry is: There’s no silver bullet. You have to think critically. Compliance and business teams should work with data scientists early in the machine learning system-development process to align on constraints required for a given use case. Businesses should be as clear as possible on what algorithms actually optimize for, as ethical pitfalls arise between what we can and can’t measure, what our data do and do not index about the world.[3]

The day it was published, I received two comically opposite responses from data scientists working in executive positions in technology companies. The first complimented me, mentioning that they were pleasantly surprised to see someone with my educational background writing so cogently about machine learning. The second condemned me, mentioning that someone with my educational background had no right to write about machine learning and that I was peddling dangerous hype.

I didn’t learn much from the compliment. My Mr. Peanutbutter labrador ego enjoyed being stroked.

I learned a few things from the critique. It helped clarify some of my own tacit assumptions, my ideology, my ethics, the grey matter between the words, the stuff that makes it hard to write because it feels vulnerable and exposed, the implicit stuff that signals community acceptance and alignment and that we rarely sit back, unpack, analyze, and articulate.

Here are some of the lessons.

1. Precision always matters

As with (close to) all tricky situations, I wonder if I did the right thing at the time. Upon being attacked, I chose to defuse rather than ignite. I thanked the person for voicing their critique and disengaged. The only thing I mentioned was that it appeared that they drew their conclusion from the title of my article rather than its content. They stated that scientists have long had ways to interpret the output of neural networks and that I was peddling hype to write an article entitled “When is it important for an algorithm to explain itself?” I was surprised to see the narrow focus on algorithmic interpretability, as I felt the work my article did was to expand the analytical framework of explainability to systems and procedures, not just algorithms. So, underneath the amygdala’s attack response, my mind said “Did they even read it?”

It took me a few minutes to put (what I assume are) the pieces together. I don’t write the titles for my HBR articles and hadn’t taken the time to internalize how the title could be interpreted. When my editor suggested it, I quickly approved. Had I felt it was important that the title precisely reflect the content, I would have recommended that we say “an algorithmic system,” not an “algorithm” and say “when is it important for businesses to consider explainability in machine learning systems” versus implying that algorithms have agency. (Although it is a thought-provoking and crucial task to think about how we can and should design system front-ends to translate math-speak into people-speak, be that to communicate and quantify uncertainty or to indicate other performance metrics in a way that is meaningful and useful to developers and users.)

Let me be clear on the lesson here. I try to take as much responsibility as possible for outcomes, especially negative ones. I approved this title very quickly because I was excited to see the piece go live. Next time, I’ll think more deeply. I’m not blaming the HBR editorial team. They thought about the title and considered a few different options. I love working with my editor. He’s a wonderful partner. We bounce topics off one another. He’ll push back on stuff he’s not excited about; I’ll do the same for him. What I enjoy most is giving him feedback on questions unrelated to my writing. I like giving back, as he has done much to help me build my reputation as a writer.

The reason I wonder if I did the right thing is that I wonder if it is my duty to other writers, to other professionals, to have stood up for myself as opposed to stepping back and disengaging. But attacks beget strong emotions. For all of us. I needed time to think and let the lessons sink in. This is, obviously, my response.

2. Read things before sharing them and commenting on them

I’m guilty of having shared things without reading them, or having only skimmed an abstract. Mostly because I’m busy and can be impulsive. This is a good lesson about why that’s always a bad idea. I’d feel ashamed if people suspected I hadn’t read an article before critiquing it. There’s a lot at stake here, like democracy. It’s meaningful to engage deeply and charitably with another person’s ideas. To take the time to understand what they are trying to communicate, to find an opportunity to refine an idea, challenge an idea, improve the structure or flow. To teach one another.

3. Don’t judge someone based on their resume

The person who critiqued me seemed to draw conclusions about what I could and could not know based on my LinkedIn profile. That doesn’t reveal that much. You don’t see that much of my college education was funded by Siemens because I was one of 2 female students in New England awarded for having the highest scores on our math and science AP tests. You don’t see that I was a math major at U Chicago who always got straight As in math and struggled much more in humanities, but was ultimately more interested in literature so decided to pursue that path. You don’t see how much analysis, complex analysis, number theory, linear algebra, and group theory I studied. You don’t see how I came to understand how important that training would be once I ended up working in machine learning. You don’t see that I focused on history of philosophy and math in the 17th and 18th century in graduate school, and, while the specific math of that period is now outdated, have thought deeply about the philosophical questions associated with statistics, empirical science, and the diffusion of knowledge in the wake of new scientific discovery. You don’t see how my fellow literature graduate students told me reading my writing was like being in a prison because I was always trying to prove things, as in a math proof.

You might see that I’m not interested in competing with machine learning researchers in their own field. I want to drink in as much of their thinking as possible, want to learn everything they can teach me, want to understand why and what it means for the field, want to experience the immense joy of recognizing structural similarities between two disciplines or applications that can be the seat of innovation, that place where you realize that a mathematical technique originally explored for problem A comes into its own in the world in problem B.

You might see that the role I’ve come to accept is that of the translator, the generalist who is curious enough to dive deeply into whatever subject matter I’m working on, but who will never be the disciplined expert. There will always be questions and gaps. Always more to learn and explore. Always people who can go deeper and narrower. Like Sheldon Levy, I viciously and vibrantly admire those whose creative minds will discover things, will reframe problems to uncover solutions stuck for centuries. They are the heroes. All we do is to sing their song, and help others hear its beauty.

4. Allow people to learn

This is the most important lesson. The one I care about. The one that puts fire into my heart and makes my fingers type quickly.

We must allow people to learn from experiences after school. We must not accept a world where the priests alone are allowed to understand, where the experts alone have the authority to write about, talk about, and share ideas about a subject. Technologies like artificial intelligence are already impacting us all. Work will change. Jobs will change. New jobs and new opportunities will arise. If people are not given the space to change, to learn, if the only people we deem qualified to do this work, to write about this work, are those who come with a certain PhD, a certain educational certificate, a certain type of social rubric of authority, we are fucked. We must trust that people can learn new things and find ways to give them opportunities. We must engage with one another so as to promote openness, to probe and push without the searing pain of judgment, to provide people with the confidence required to ask the simple questions needed to get to the heart of the matter, to give people the breathing room to embrace the initial anxiety of change so they can come to do something new.

No, I did not do my PhD in machine learning. It’s not impossible that I will go back and do a second PhD in reinforcement learning, as I find the epistemological questions associated with that subfield incredibly rich, incredibly akin to the perennial questions I have loved in the Greeks, in the early moderns, and again today. Time will tell. I have, however, worked in the field over the past few years. I was fortunate enough to have been granted the opportunity to learn a lot at Fast Forward Labs. I will forever be grateful to Hilary Mason for giving me a chance to help her build her business, and for believing that I could learn. I learned. I am still learning. I write about what I’ve come to learn, and accept criticism, feedback, refinements, all the stuff other people can share with me to expand my understanding and help us all grow. I’m decent at recognizing what I know with precision and where my knowledge starts to falter into fuzziness, and I tell people that. I have made mistakes, thought about them deeply, and try my best not to make them a second time (at which I don’t always succeed). I love enabling others to build sound intuitions about mathematical concepts and technology. To feel empowered, feel like they get it in ways they hadn’t before. And not because it’s dumbed down. Not because we resort to the Platonic blindfolds for the masses. Because we can all do it. It’s just that we have to break down the power walls, break down the barriers, break down our egos, and do our best to make something meaningful.

I will fight for it. There are too many people who hold themselves back because they are excluded from circles protecting themselves within elitism. Everyone deserves a voice. Everyone deserves a chance to understand.

5. Everyone should be allowed to write about technology

It’s not all going to be good. There is a lot of hype. I’m not sure hype is all bad, as it has the power to mobilize large groups of people who wouldn’t otherwise be interested. Nothing like the fine print of precise qualifications to dampen the mood of disruptive innovation. There is damage when the hype breeds unnecessary fear, rather than unbounded excitement. And there’s certainly lots of work to be done to help businesses bring expectations down to earth to capitalize on what’s possible. But there’s wonderful satisfaction that emerges when businesses start to get traction with a narrowly-focused, real-world application. And it takes different people with different viewpoints from different teams to make that happen, particularly in established enterprises with their processes and people and quirks and habits and culture.

We may not all want to write, but we all have a part to play. And I’ll always subscribe to the Ratatouille philosophy: it’s not that everyone can cook, but that the greatest chef in the world may not necessarily have a resume our priors deem likely to succeed.


[1] Peter Sweeney has written many great articles about epistemology and AI, and argued that we should conceptualize the outputs of machine learning algorithms as observations, not explanations. He was responding to David Weinberger, who has argued that we should focus governance efforts on optimization, not explanation. I’m partial to that, but again think it depends on the use case. Nick Frosst, who wrote the capsule network paper with Sara Sabour[2] and Geoff Hinton, thinks that interpretability (I must admit that I use the words interpretability and explainability interchangeably, and should take the time to parse the two, both philosophically and technically) is important because those creating systems and those impacted by systems should have the right to intervene to change their behaviour or change the system to change outcomes. So, for example, if a system denies an individual a mortgage because they missed their last 3 credit card payments, then that gives the individual meaningful recourse to act differently to meet the requisite rules in the future. It does indeed get dicey if there are so many dimensions that end up correlated to some output that impacts big deal opportunities for real people living real lives. I analyzed a couple of examples in this podcast.

[2] I’m 99% confident that Sara wasn’t able to attend NIPS in Los Angeles last year because she is Iranian. Knowledge, and credit for new knowledge, is cosmopolitan.

[3] My company recently published a framework to help consumer enterprises develop responsible machine learning systems. It’s practical and breaks down the different privacy, security, governance, and ethics questions cross-functional teams should ask and address at different points in the machine learning system-development process. We worked hard on it. I’m proud of it.

The featured image is of Remy the rat chef. He has a heightened sense of taste and smell but is naturally overlooked as an awesome chef because he’s a rat. He ends up making a ratatouille that softens the curmudgeonly critic because it brings him back to his childhood like Proust’s madeleine. So worth watching over and over again.

Artificial Intelligence and the Fall of Eve

We seem to need foundational narratives.

Big picture stories that make sense of history’s bacchanal march into the apocalypse.

Broad-stroke predictions about how artificial intelligence (AI) will shape the future of humanity made by those with power arising from knowledge, money, and/or social capital.[1] Knowledge, as there still aren’t actually that many real-deal machine learning researchers in the world (despite the startling growth in paper submissions to conferences like NIPS), people who get excited by linear algebra in high-dimension spaces (the backbone of deep learning) or the patient cataloguing of assumptions required to justify a jump from observation to inference.[2] Money, as income inequality is a very real thing (and a thing too complex to say anything meaningful about in this post). For our purposes, money is a rhetoric amplifier, be that from a naive fetishism of meritocracy, where we mistakenly align wealth with the ability to figure things out better than the rest of us,[3] or cynical acceptance of the fact that rich people work in private organizations or public institutions with a scope that impacts a lot of people. Social capital, as our contemporary Delphic oracles spread wisdom through social networks, likes and retweets governing what we see and influencing how we see (if many people, in particular those we want to think like and be like, like something, we’ll want to like it too), our critical faculties on amphetamines as thoughtful consideration and deliberation means missing the boat, gut invective the only response fast enough to keep pace before the opportunity to get a few more followers passes us by, Delphi sprouting boredom like a 5 o’clock shadow, already on to the next big thing. Ironic that big picture narratives must be made so hastily in the rat race to win mindshare before another member of the Trump administration gets fired.

Most foundational narratives about the future of AI rest upon an implicit hierarchy of being that has been around for a long time. While proffered by futurists and atheists, the hierarchy dates back to the Great Chain of Being that medieval Christian theologians like Thomas Aquinas built to cut the physical and spiritual world into analytical pieces, applying Aristotelian scientific rigor to the spiritual topics.

Aquinas’ hierarchy of being on a blog by a fellow named David Haines I know nothing about but that seems to be about philosophy and religion.

The hierarchy provides a scale from inanimate matter to immaterial, pure intelligence. Rocks don’t get much love on the great chain of being, even if they carry the wisdom and resilience of millions of years of existence, contain, in their sifting shifting grains of sand, the secrets of fragility and the whispered traces of tectonic plates and sunken shores. Plants get a little more love than rocks, and apparently Venus fly traps (plants that resemble animals?) get more love than, say, yeast (if you’re a fellow member of the microbiome-issue club, you like me are in total awe of how yeast are opportunistic sons of bitches who sense the slightest shift in pH and invade vulnerable tissue with the collective force of stealth guerrilla warriors). Humans are hybrids, half animal, half rational spirit, our sordid materiality, our silly mortality, our mechanical bodies ever weighing us down and holding us back from our real potential as brains in vats or consciousnesses encoded to live forever in the flitting electrons of the digital universe. There are a shit ton of angels. Way more angel castes than people castes. It feels repugnant to demarcate people into classes, so why not project differences we live day in and day out in social interactions onto angels instead? And, in doing so, basically situate civilized aristocrats as closer to God than the lower and more animalistic members of the human race?
And then God is the abstract patriarch on top of it all, the omnipotent, omniscient, benevolent patriarch who is also the seat of all our logical paradoxes, made of the same stuff as Gödel’s incompleteness theorem, the guy who can be at once father and son, be the circle with the center everywhere and the circumference nowhere, the master narrator who says, don’t worry, I got this, sure that hurricane killed tons of people, sure it seems strange that you can just walk into a store around the corner and buy a gun and there are mass shootings all the time, but trust me, if you could see the big picture like I see the big picture, you’d get how this confusing pain will actually result in the greatest good to the most people.

Sandstone in southern Utah, the momentary, coincidental dance of wind and grain petrified into this shape at this moment in time. I’m sure it’s already somewhat different.

I’m going to be sloppy here and not provide hyperlinks to specific podcasts or articles that endorse variations of this hierarchy of being: hopefully you’ve read a lot of these and will have sparks of recognition with my broad stroke picture painting.[4] But what I see time and again are narratives that depict AI within a long history of evolution moving from unicellular prokaryotes to eukaryotes to slime to plants to animals to chimps to homo erectus to homo sapiens to transhuman superintelligence as our technology changes ever more quickly and we have a parallel data world where we leave traces of every activity in sensors and clicks and words and recordings and images and all the things. These big picture narratives focus on the pre-frontal cortex as the crowning achievement of evolution, man distinguished from everything else by his ability to reason, to plan, to overcome the rugged tug of instinct and delay gratification until the future, to make guesses about the probability that something might come to pass in the future and to act in alignment with those guesses to optimize rewards, often rewards focused on self gain and sometimes on good across a community (with variations). And the big thing in this moment of evolution with AI is that things are folding in on themselves, we no longer need to explicitly program tools to do things, we just store all of human history and knowledge on the internet and allow optimization machines to optimize, reconfiguring data into information and insight and action and getting feedback on these actions from the world according to the parameters and structure of some defined task.
And some people (e.g., Gary Marcus or Judea Pearl) say no, no, these bottom up stats are not enough, we are forgetting what is actually the real hallmark of our pre-frontal cortex, our ability to infer causal relationships between phenomenon A and phenomenon B, and it is through this appreciation of explanation and cause that we can intervene and shape the world to our ends or even fix injustices, free ourselves from the messy social structures of the past and open up the ability to exercise normative agency together in the future (I’m actually in favor of this kind of thinking). So we evolve, evolve, make our evolution faster with our technology, cut our genes crisply and engineer ourselves to be smarter. And we transcend the limitations of bodies trapped in time, transcend death, become angel as our consciousness is stored in the quick complexity of hardware finally able to capture plastic parallel processes like brains. And inch one step further towards godliness, ascending the hierarchy of being. Freeing ourselves. Expanding. Conquering the march of history, conquering death with blood transfusions from beautiful boys, like vampires. Optimizing every single action to control our future fate, living our lives with the elegance of machines.

It’s an old story.

Many science fiction novels feel as epic as Disney movies because they adapt the narrative scaffold of traditional epics dating back to Homer’s Iliad and Odyssey and Virgil’s Aeneid. And one epic quite relevant for this type of big picture narrative about AI is John Milton’s Paradise Lost, the epic to end all epics, the swan song that signaled the shift to the novel, the fusion of Genesis and Rome, an encyclopedia of seventeenth-century scientific thought and political critique as the British monarchy collapsed under the rushing sword of Oliver Cromwell.

Most relevant is how Milton depicts the fall of Eve.

Milton lays the groundwork for Eve’s fall in Book Five, when the archangel Raphael visits his friend Adam to tell him about the structure of the universe. Raphael has read his Aquinas: like proponents of superintelligence, he endorses the great chain of being. Here’s his response to Adam when the “Patriarch of mankind” offers the angel mere human food:

Adam, one Almightie is, from whom
All things proceed, and up to him return,
If not deprav’d from good, created all
Such to perfection, one first matter all,
Indu’d with various forms various degrees
Of substance, and in things that live, of life;
But more refin’d, more spiritous, and pure,
As neerer to him plac’t or neerer tending
Each in thir several active Sphears assignd,
Till body up to spirit work, in bounds
Proportiond to each kind.  So from the root
Springs lighter the green stalk, from thence the leaves
More aerie, last the bright consummate floure
Spirits odorous breathes: flours and thir fruit
Mans nourishment, by gradual scale sublim’d
To vital Spirits aspire, to animal,
To intellectual, give both life and sense,
Fansie and understanding, whence the Soule
Reason receives, and reason is her being,
Discursive, or Intuitive; discourse
Is oftest yours, the latter most is ours,
Differing but in degree, of kind the same.

Raphael basically charts the great chain of being in the passage. Angels think faster than people, they reason in intuitions while we have to break things down analytically to have any hope of communicating with one another and collaborating. Daniel Kahneman’s partition between discursive and intuitive thought in Thinking, Fast and Slow had an analogue in the seventeenth century, where philosophers distinguished the slow, composite, discursive knowledge available in geometry and math proofs from the fast, intuitive, social insights that enabled some to size up a room and be the wittiest guest at a cocktail party.

Raphael explains to Adam that, through patient, diligent reasoning and exploration, he and Eve will come to be more like angels, gradually scaling the hierarchy of being to ennoble themselves. But on the condition that they follow the one commandment never to eat the fruit from the forbidden tree, a rule that escapes reason, that is a dictum intended to remain unexplained, a test of obedience.

But Eve is more curious than that and Satan uses her curiosity to his advantage. In Book Nine, Milton fashions Satan in his trappings as snake as a master orator who preys upon Eve’s curiosity to persuade her to eat of the forbidden fruit. After failing to exploit her vanity, he changes strategies and exploits her desire for knowledge, basing his argument on an analogy up the great chain of being:

O Sacred, Wise, and Wisdom-giving Plant,
Mother of Science, Now I feel thy Power
Within me cleere, not onely to discerne
Things in thir Causes, but to trace the wayes
Of highest Agents, deemd however wise.
Queen of this Universe, doe not believe
Those rigid threats of Death; ye shall not Die:
How should ye? by the Fruit? it gives you Life
To Knowledge? By the Threatner, look on mee,
Mee who have touch’d and tasted, yet both live,
And life more perfet have attaind then Fate
Meant mee, by ventring higher then my Lot.
That ye should be as Gods, since I as Man,
Internal Man, is but proportion meet,
I of brute human, yee of human Gods.
So ye shall die perhaps, by putting off
Human, to put on Gods, death to be wisht,
Though threat’nd, which no worse then this can bring.

 

Satan exploits Eve’s mental model of the great chain of being to tempt her to eat the forbidden apple. Mere animals, snakes can’t talk. A talking snake, therefore, must have done something to cheat the great chain of being, to elevate itself to the status of man. So too, argues Satan, can Eve shortcut her growth from man to angel by eating the forbidden fruit. The fall of mankind rests upon our propensity to rely on analogy. May the defenders of causal inference rejoice.[5]

The point is that we’ve had a complex relationship with our own rationality for a long time. That Judeo-Christian thought has a particular way of personifying the artifacts and precipitates of abstract thoughts into moral systems. That, since the scientific revolution, science and religion have split from one another but continue to cross paths, if only because they both rest, as Carlo Rovelli so beautifully expounds in his lyrical prose, on our wonder, on our drive to go beyond the immediately visible, on our desire to understand the world, on our need for connection, community, and love.

But do we want to limit our imaginations to such a stale hierarchy of being? Why not be bolder and more futuristic? Why not forget gods and angels and, instead, recognize these abstract precipitates as the byproducts of cognition? Why not open our imaginations to appreciate the radically different intelligence of plants and rocks, the mysterious capabilities of photosynthesis that can make matter from sun and water (WTF?!?), the communication that occurs in the deep roots of trees, the eyesight that octopuses have all down their arms, the silent and chameleon wisdom of the slit canyons in the southwest? Why not challenge ourselves to greater empathy, to the unique beauty available to beings who die, capsized by senescence and always inclining forward in time?

IMG_4890
My mom got me herbs for my birthday. They were little tiny things, and now they look like this! Some of my favorite people working on artistic applications of AI consider tuning hyperparameters to be an act akin to pruning plants in a garden. An act of care and love.

Why not free ourselves of the need for big picture narratives and celebrate the fact that the future is far more complex than we’ll ever be able to predict?

How can we do this morally? How can we abandon ourselves to what will come and retain responsibility? What might we build if we mimic animal superintelligence instead of getting stuck in history’s linear march of progress?

I believe there would be beauty. And wild inspiration.


[1] This note should have been after the first sentence, but I wanted to preserve the rhetorical force of the bare sentences. My friend Stephanie Schmidt, a professor at SUNY Buffalo, uses the concept of foundational narratives extensively in her work about colonialism. She focuses on how cultures subjugated to colonial power assimilate and subvert the narratives imposed upon them.

[2] Yesterday I had the pleasure of hearing a talk by the always-inspiring Martin Snelgrove about how to design hardware to reduce energy when using trained algorithms to execute predictions in production machine learning. The basic operations undergirding machine learning are addition and multiplication: we’d assume multiplying takes more energy than adding, because multiplying is adding in sequence. But Martin showed how it all boils down to how far electrons need to travel. The broad-stroke narrative behind why GPUs are better for deep learning is that they shuffle electrons around criss-cross structures that look like matrices as opposed to putting them into the linear straitjacket of the CPU. But the geometry can get more fine-grained and complex, as the 256×256 array in Google’s TPU shows. I’m keen to dig into the most elegant geometry for designing for Bayesian inference and sampling from posterior distributions.

[3] Technology culture loves to fetishize failure. Jeremy Epstein helped me realize that failure is only fun if it’s the midpoint of a narrative that leads to a turn of events ending with triumphant success. This is complex. I believe in growth mindsets like Ray Dalio proposes in his Principles: there is real, transformative power in shifting how our minds interpret the discomfort that accompanies learning or stretching oneself to do something not yet mastered. I jump with joy at the opportunity to transform the paralyzing energy of anxiety into the empowering energy of growth, and believe it’s critical that more women adopt this mindset so they don’t hold themselves back from positions they don’t believe they are qualified for. Also, it makes total sense that we learn much, much more from failures than we do from successes, in science, where it’s important to falsify, as in any endeavor where we have motivation to change something and grow. I guess what’s important here is that we don’t reduce our empathy for the very real pain of being in the midst of failure, of feeling like one doesn’t have what others have, of being outside the comfort of the bell curve, of the time it takes to outgrow the inheritance and pressure from the last generation and the celebrations of success. Worth exploring.

[4] One is from Tim Urban, as in this Google Talk about superintelligence. I really, really like Urban’s blog. His recent post about choosing a career is remarkably good and his TED talk on procrastination is one of my favorite things on the internet. But his big picture narrative about AI irks me.

[5] Milton actually wrote a book about logic and was even a logic tutor. It’s at once incredibly boring and incredibly interesting stuff.

The featured image is the 1808 Butts Set version of William Blake’s “Satan Watching the Endearments of Adam and Eve.” Blake illustrated many of Milton’s works and illustrated Paradise Lost three times, commissioned by three different patrons. The color scheme is slightly different between the Thomas, Butts, and Linnell illustration sets. I prefer the Butts. I love this image. In it, I see Adam relegated to a supporting actor, a prop like a lamp sitting stage left to illuminate the real action between Satan and Eve. I feel empathy for Satan, want to ease his loneliness and forgive him for his unbridled ambition, as he hurls himself tragically into the figure of the serpent to seduce Eve. I identify with Eve, identify with her desire for more, see through her eyes as they look beyond the immediacy of the sexual act and search for transcendence, the temptation that ultimately leads to her fall. The pain we all go through as we wise into acceptance, and learn how to love.

Screen Shot 2018-06-03 at 8.22.32 AM
Blake’s image reminds me of this masterful kissing scene in Antonioni’s L’Avventura (1960). The scene focuses on Monica Vitti, rendering Gabriele Ferzetti an instrument for her pleasure and her interior movement between resistance and indulgence. Antonioni takes the ossified paradigm of the male gaze and pushes it, exposing how culture can suffocate instinct as we observe Vitti abandon herself momentarily to her desire.

Of Basilicas and Sandcastles

Sandstone blushed pink, washed gold drips spires like kids plodding sand clods, layer upon layer tapering into antique vases and Victorian crowns, cobweb queens crooning nocturnal arias of desert winds in desert pines, ghosts within Native American ghosts, burnt sage bushes caramelizing wellbeing and peace as they caress their canyons, their friends, leaving them be, pruning caves where they might dwell and carve and paint and eat, ghosts long silenced by Manifest Destiny blasting His metallic, electric, self-driving cries to Mars, for yes, the future is already here, just not evenly distributed, moguls gluttonously rich as our anorexic middle class, addicted to their heroine gaze in selfie sticks and Facebook Likes, vanishes from photos like Trotsky, a Mexican ice pick nailing his moustache to the cross, forsaken by his father as parched tears transubstantiate into blood droplets fixed in sacred time together with the ocean hoodoos, voodoo rocks moving, flowing, crooning their ocean hymns with the wind queens till ice cracks the foundations and the avalanche falls.

IMG_4466
The precious colour schemes of Bryce Canyon, up close.

But, somehow, these same rosy blushes and gold lashes appear in Barcelona, on the recently restored façade of Gaudí’s unfinished Sagrada Familia.

I behold the Barcelonan stone blushed pink, washed gold and time somersaults in my lonely chest. Bryce Canyon and the Sagrada Familia stand, silent, 5,586 miles apart. They have co-existed for 136 years. Neither is complete; neither ever will be. The truth is that their pink gold stones likely aren’t very similar to the scientific eye. But to me they are. Similar enough to tear apart the fabric of time as a lover tears silk to expose milk skin, Harem beauty, breasts blanched only by moon rays. So similar that tears pierce their possibility. I don’t hold them back even if others are watching. The others are busy being with loved ones anyway. No one watches. No one except every sometimes the perfect meeting eye to eye, not the groping kind, the seeking kind, the kind astonished to have encountered a self, a soul, curious to see deep inside for an instant, to cradle the shock of what must be beauty, observant enough to recognize unique meeting unique before the footsteps go too far and a we vanishes, stillborn.

I stand alone in Barcelona, walking streets watching others walk streets with others. My very loneliness grants me passage back to the silent sandcastles of Bryce Canyon. Inside this crevice of similarity, I recognize two separate constructions that have come to be one through the patient ravishing of wear and time. In Barcelona, by the grime of the city, exhaust from bankers’ lungs twitching stock exchange profit, orange precipitating scallops doused in chorizo oil; in Bryce, by the violence of the desert, ice tearing limestone hymen with glacier patience, tourist footsteps gently tweezing out the old ocean soul in camera flashes and plastic baggies. The difference is that in Barcelona this pristine blush of pink and gold only juts out when juxtaposed to the tarnished, uncleaned façade, whereas in Bryce it cannot but swallow everyone in its magnificence. It’s likely I wouldn’t have perceived the shocking similarity had I visited Barcelona a few months later, presuming the restoration work will have advanced to no longer leave the striking difference between clean and dirty façades. And I wouldn’t have been primed to see the similarity had I not visited Bryce Canyon just a few weeks before. I, then–and by I, I mean the set of experiences collected into this unifier we call memory and consciousness, where analogies forge similarity in blacksmith strokes–am the condition of the similarity. It took me moving between continents to notice this unique and beautiful elision. And, it’s likely that it took me being alone to feel it deeply enough to make it matter. Had I wandered the world with a companion, I probably would have noticed the similarities, but they probably wouldn’t have penetrated deeply into the place where the beauty breathes so pure it hurts. Hurts because it carries with it the basic fact of my existence, inviting me to have a seat. To feast upon my life.

img_4762.jpg
It’s the pink and orange hues of the left façade that caught my eye.

Why yes, the hues of pink and gold in the muted limestone of Bryce Canyon and Barcelona are so beautiful because the perception of their similarity is the trace of my existence. The heightening of what is to what is meaningful. It’s a nostalgic and slightly mourning meaning, as walking the streets of Barcelona I think about García Lorca’s Yerma*, a play about a woman who never bears a child. I often face moments of sadness at not being married, not having children, not being cushioned by normativity’s blessings. But my jealousy and covetousness for others’ lives have eased over time. This is evident in how my relationship with my mother has changed. I’ve done the emotional prep work of still being without child at 40 or 45, empathizing with a future self in a future state and thereby also growing more compassionate to others, today. I’ve experienced many places and opened my heart to many people. It’s an existence worth a second act.

This took place yesterday morning. Saturday. Friday afternoon I recovered a different past. It’s likely Friday’s experience primed my mind and my emotions to notice Saturday’s sandstone similarities.

For Friday I walked into the Picasso museum in Barcelona’s Gothic quarter. The air was damp but the rain held off, at least then (later on I waited out the raindrops with strangers under a group of trees near the waterfront, watching a mother spoon yogurt to some little mouth covered by stroller canvas; the little one seemed to eat well, the yogurt went fast). I’d wandered to the Cathedral, saw the foreboding chiaroscuro of the heraldic escutcheons, black and shadowed and tall into cracked gothic arches. I wandered through narrow streets woven with balconies, some square, others round like Gaudí hobbit holes.

The Picasso museum is housed in a medieval cloister. The entrance is asymmetric, with matte greens and greys and a staircase up the right-hand side. Standing in line for tickets, I encountered my sixteen-year-old self. She was waiting for me; she had never left; she lived in the matte green of the entrance hall. I relived the mild disgust noticing our Spanish teacher’s fanny pack hang like a limp holster under the taut piqué cotton of his mint green polo shirt, I saw the moles on his hairy legs and the forced kindness in his smile. He stood there waiting for all 17 or 18 or 20 of us to gather in the museum. Watching him, sixteen, I relived my projection of myself in the eyes of the boys on the trip, they were juniors, I was a sophomore, I had a crush on Lyle, my experience of Gaudí’s balconies and Picasso’s cubism and Velázquez’s portraits and Franco’s phallic monument and the Roman aqueducts in Segovia were filtered through this prism of insecurity and adolescent desire, my personality still so much in flux, my introversion still so marked. I brought my violin to Spain and played every day. I brought a suitcase that was much too large, as I had yet to pride myself on my practicality, how easily I could move about the world. At the time, I was absorbed by the pulse of my feelings, by the inklings of the self I wanted to project. I was so governed by how I thought others perceived me; still am, but more so then. Painfully so then, my superego cruel and chastising. I jumped forward a few years, into my mid twenties, where I regretted my stupid crushes and insecurity and self-absorption, as I didn’t have strong memories of the objects and monuments and art I was supposed to have learned about. I was so focused on Lyle, so focused on how I projected Lyle saw me, that I missed out on Gaudí, Barcelona, Spain.

picasso museum
The entrance to the Picasso museum, where I remembered things past.

But now, years later, I love the distortions. I love how walking into the Picasso museum on a damp Friday afternoon, I recover not just the memory of the place, but the feelings and vulnerability and sensations of the past observer. I love how I’m still there in my sensitivity, still there shaping observations based on who I am and what I’ve lived and where I project I might find future happiness. And that that snapshot of a self in development is still available to inhabit, to re-inhabit once more. That we can become again. That just as the pink and gold hues collapse space into the pulse of a single mind, so too do the matte greens collapse time, the identity of place revealing a self growing in time. Eighteen years of experience elapsed under a staircase.

I walked upstairs and, migrating at my own pace from room to room, understood how, like David Bowie, Picasso didn’t have one style, but iterated upon a given style in a given creative period until it was exhausted, then moved on. Impressionism cedes to the blue period cedes to the Russian ballets cedes to cubism cedes to the bombastic primitivism cedes to recreating Las Meninas cedes to ad infinitum adoration of his wife cedes to the black and white still lifes of old age. It was his recreating Las Meninas that caught my attention. Picasso takes this work, this exemplar of Spanish Golden Age style, where Velázquez enacts the Christ-like elision of creator and spectator, the Baroque practice of inverting the artwork–as representation of reality–to fuse the moment of creation with the moment of observation, perfected through the gaze of the painter himself as of the man escaping from the backlit door, well, Picasso takes this work and exploits the conceit of the artist and, with algorithmic insistence, repaints and repaints and recreates, distilling the essence of the form in the variations of style and look and feel. And the variations themselves eclipse his own journey as a painter, never tied to one style, always free to pivot and redefine himself anew. Picasso’s Meninas telescoping my own experience recovering my younger self, the privilege of my loneliness opening me raw and whole to meet him there, to imagine I might be with him and the pigeons while he painted. Another meeting eye to eye, the seeking kind, inside the artist, back to Velázquez. Complete in a way that can only be described as human.

Imagen 021
My favourite rendition of Picasso’s Las Meninas, a series from 1957.

*I picked up a book in the airport in Barcelona, nearly finished with Galloway’s diatribe against the Four. Niebla en Tánger, by Cristina López Barrio. Funny I’d just mentioned Yerma as I wrote this post on the plane, for I came across this sentence from a similarly childless protagonist: “Está vacío, como el de Yerma, piensa, hueco por esperar vida del hombre equivocado.”

The featured image dates from my recent trip to Bryce Canyon. It’s like a big field of deep dream art, dripping in its delicate phantasy. It was my favourite of the Utah canyons. 

 

Privacy Beyond the Individual

This week’s coverage of the Facebook and Cambridge Analytica debacle[1] (latest Guardian article) has brought privacy top of mind and raised multiple complex questions[2]:

Is informed consent nothing but a legal fiction in today’s age of data and machine learning, where “ongoing and extensive data collection can be neither fully informed nor truly consensual — especially since it is practically irrevocable?”

Is tacit consent–our silently agreeing to the fine print of privacy policies as we continue to use services–something we prefer to grant given the nuisance, time, and effort required to understand the nuances of data use? Is consent as a mechanism too reliant upon the supposition that privacy is an individual right and, therefore, available for an individual to exchange–in varying degrees–for the benefits and value from some service provider (i.e., Facebook likes satisfying our need to be both loved and lovely)? If consent is defunct, what legal structure should replace it?

Screen Shot 2018-03-25 at 9.08.17 AM
Scottish Enlightenment philosopher Adam Smith wrote about our need to love and be lovely in the Theory of Moral Sentiments. I’m dying to dig back into Smith because I suspect his work can help show that even personalization and online consumerism independent of any political context dulls our capacities to be active, rational participants in democracy. Indeed, listening to Russ Roberts’s EconTalk podcast, I’ve learned that Smith argued that commerce, i.e., in-person transactions with strangers to exchange value, provides an opportunity to practice regulating our emotions, as we can’t devolve into temper tantrums and emotional outbursts like we do with our families and spouses (the inimical paradoxes of intimacy…) if we want to get business done. Roberts wrote a poem extolling the wonders of libertarian emergence called It’s a Wonderful Loaf.

How should we update outdated notions of what qualifies as personally identifiable information (PII), which already vary across different countries and cultures, to account for the fact that contemporary data processing techniques can infer aspects of our personal identity from our online (and, increasingly, offline) behavior that feel more invasive and private than our name and address? Can more harm be done to an individual using her social security/insurance number than psychographic traits? In which contexts?

Would regulatory efforts to force large companies like Facebook to “lock down” data they have about users actually make things worse, solidifying their incumbent position in the market (as Ben Thompson and Mike Masnick argue)?

Is the best solution, as Cory Doctorow at the Electronic Frontier Foundation argues, to shift from having users (tacitly) consent to data use, based on trust and governed by the indirect forces of market demand (people will stop using your product if they stop trusting you) and moral norms, to building privacy settings in the fabric of the product, enabling users to engage more thoughtfully with tools?[3]

Many more qualified than I are working to inform clear opinions on what matters to help entrepreneurs, technologists, policymakers, and plain-old people[4] respond. As I grapple with this, I thought I’d share a brief and incomplete history of the thinking and concepts undergirding privacy. I’ll largely focus on the United States because it would be a book’s worth of material to write even brief histories of privacy in other cultures and contexts. I pick the United States not because I find it the most important or interesting, but because it happens to be what I know best. My inspiration to wax historical stems from a keynote I gave Friday about the history of artificial intelligence (AI)[5] for AI + Public Policy: Understanding the shift, hosted by the Brookfield Institute in Toronto.

As is the wont of this blog, the following ideas are far from exhaustive and polished. I offer them for your consideration and feedback.


The Fourth Amendment: Knock-and-Announce

As my friend Lisa Sotto eloquently described in a 2015 lecture at the University of Pennsylvania, the United States (U.S.) considers privacy as a consumer right, parsed across different business sectors, and the European Union (EU) considers privacy as a human right, with a broader and more holistic concept of what kinds of information qualify as sensitive. Indeed, one look at the different definitions of sensitive personal data in the U.S. and France in the DLA Piper Data Protection Laws of the World Handbook shows that the categories and taxonomies are operating at different levels. In the U.S., sensitive data is parsed by data type; in France, sensitive data is parsed by data feature:

Screen Shot 2018-03-24 at 12.22.42 PM
Screenshot from the DLA Piper Data Protection Handbook. They conveniently organize information so a reader can compare how two countries differ on aspects of privacy law.

It seems potentially, possibly plausible (italics indicating I’m really unsure about this) that the U.S. concept of privacy as being fundamentally a consumer right dates back to the original elision of privacy and property in the Fourth Amendment to the U.S. Constitution:

The right of the people to be secure in their persons, houses, papers, and effects, against unreasonable searches and seizures, shall not be violated, and no Warrants shall issue, but upon probable cause, supported by Oath or affirmation, and particularly describing the place to be searched, and the persons or things to be seized.

We forget how tightly entwined protection of property was to early U.S. political theory. In his Leviathan, for example, seventeenth-century English philosopher Thomas Hobbes derives his theory of legitimate sovereign power (and the notion of the social contract that influenced founding fathers like Jefferson and Madison) from the need to provide individuals with some recourse against intrusions on their property; otherwise we risk devolving to the perpetually anxious and miserable state of the war of all against all, where anyone can come in and ransack our stuff at any time.

The Wikipedia page on the Fourth Amendment explains it as a countermeasure against general warrants and writs of assistance British colonial tax collectors were granted to “search the homes of colonists and seize ‘prohibited and uncustomed’ goods.” What matters for this brief history is the foundation that early privacy law protected people’s property–their physical homes–from searches, inspections, and other forms of intrusion or surveillance by the government.

Katz v. United States: Reasonable Expectations of Privacy

Over the past 50 years, new technologies have fracked the bedrock assumptions of the Fourth Amendment.[6] The case law is expansive and vastly exceeds the cursory overview I’ll provide in this post. Cindy Cohn from the Electronic Frontier Foundation has written extensively and lucidly on the subject.[7] As has Daniel Solove.

Perhaps the seminal case shaping contemporary U.S. privacy law is Katz v. United States, 389 U.S. 347 (1967). A 2008 presentation from Artemus Ward at Northern Illinois University presents the facts and a summary of the Supreme Court Justices’ opinions in these three slides (there are also dissenting opinions):

Katz+v.+United+States+(1967)+The+Facts

Katz+v.+United+States+(1967)+Justice+Potter+Stewart+Delivered+the+Opinion+of+the+Court

harlan

There are two key questions here:

  • Does the right to privacy extend to telephone booths and other public places?
  • Is a physical intrusion necessary to constitute a search?

Justice Harlan’s comments regarding the “actual (subjective) expectation of privacy” that society is prepared to recognize as “reasonable” marked a conceptual shift to pave the way for the Fourth Amendment to make sense in the digital age. Katz shifted the locus of constitutional protections against unwarranted government surveillance from one’s private home–property that one owns–to public places that social norms recognize as private in practice if not in name (a few cases preceding Katz paved the way for this judgment).

This is watershed: when any public space can be interpreted as private in the eyes of the beholder, the locus of privacy shifts from an easy-to-agree-upon-objective-space like someone’s home, doors locked and shades shut, to a hard-to-agree-upon-subjective-mindset like someone’s expectation of what should be private, even if it’s out in a completely public space, just as long as those expectations aren’t crazy (i.e., that annoying lady somehow expecting that no one is listening to her uber-personal conversation about the bad sex she had with the new guy from Tinder as she stands in a crowded checkout line waiting to purchase her chia seed concoction and her gluten-free crackers[8]) but accord with the social norms and practices of a given moment and generation.

Imagine how thorny it becomes to decide what qualifies as a reasonable expectation for privacy when we shift from a public phone booth occupied by one person who can shut the door (as in Katz) to:

  • internet service providers shuffling billions of text messages, phone calls, and emails between individuals, where (perhaps?) the standard expectation is that when we go through the trouble of protecting information with a password (or two-factor authentication), we’re branding these communications as private, not to be read by the government or the private company providing us with the service (and metadata?);
  • GPS devices placed on the bottom of vehicles, as in United States v. Jones, 132 S.Ct. 945 (2012), which in themselves may not seem like something everyone has to worry about often but which, given the category of data they generate, are similar to any and all information about how we transact and move in the world, revealing not just what our name is but which coffee shops and doctors (or lovers or political co-conspirators) we visit on a regular basis, prompting Justice Sonia Sotomayor to be very careful in her judgments;
  • social media platforms like Facebook, pseudo-public in nature, that now collect and analyze not only structured data on our likes and dislikes, but, thanks to advancing AI capabilities, image, video, text, and speech data;
  • etc.
Screen Shot 2018-03-25 at 10.34.26 AM
Slide from Kosta Derpanis’s extremely witty and engaging talk on computer vision at Friday’s AI + Public Policy Conference. This shows how Facebook is now applying computer vision techniques to analyze images users post, not just structured data about what they like and dislike.

Just as Zeynep Tufekci argues that informed consent loses its power in an age where most users of internet services and products don’t rigorously understand what use of their data they’re consenting to, so too does Cohn believe that the “‘reasonable expectation of privacy’ test currently employed in Fourth Amendment jurisprudence is a poor test for the digital age.”[9] As with any shift from criticism to pragmatic solutions, however, the devil is in the details. If we eliminate a reasonableness test because it’s too flimsy for the digital age, what do we replace it with to achieve the desired outcomes of protecting individual rights to free speech and preventing governmental overreach? Do we find a way to measure actual harm suffered by an individual? Or should we, as Lex Gill suggested Friday, somehow think about privacy as a public good rather than an individual choice requiring individual consent? What are the different harms we need to guard against in different contexts, given that use of data for targeted marketing has different ramifications than government wiretapping?

These questions are tricky to parse because, in an age where so many aspects of our lives are digital, privacy bleeds into and across different contexts of social, political, commercial, and individual activity. As Helen Nissenbaum has masterfully shown, our subjective experience of what’s appropriate in different social contexts influences our reasonable expectations of privacy in digital contexts. We don’t share all the intimate details of our personal life with colleagues in the same way we do with close friends or doctors bound by duties of confidentiality. Add to that that certain social contexts demand frivolity (that ironic self you fashion on Facebook) and others, like politics, invite a more aspirational self.[10] Nissenbaum’s theory of contextual integrity, where privacy is preserved when information flows respect the implicit, socially-constructed boundaries that graft the many sub-identities we perform and inhabit as individuals, applies well to the Cambridge Analytica debacle. People are less concerned by private companies using social media data for psychographic targeting than they are for political targeting; the algorithms driving stickiness on site and hyper-personalized advertising aren’t fit to promote the omnivorous, balanced information diet required to understand different sides of arguments in a functioning democracy. Being at once watering hole to chat with friends, media company to support advertising, and platform for political persuasion, Facebook collapses distinct social spheres into one digital platform (which also complicates anti-trust arguments, as evident in this excellent Intelligence Squared debate).

A New New Deal on Data: Privacy in the Age of Machine Learning

In 2009, Alex “Sandy” Pentland of the MIT Media Lab began the final section of his article calling for a “new deal” on data as follows:

Perhaps the greatest challenge posed by this new ability to sense the pulse of humanity is creating a “new deal” around questions of privacy and data ownership. Many of the network data that are available today are freely offered because the entities that control the data have difficulty extracting value from them. As we develop new analytical methods, however, this will change.[11]

This ability to “sense the pulse of humanity,” writes Pentland earlier in the article, arises from the new data generation, collection, and processing tools that have effectively given the human race “the beginnings of a working nervous system.” Pentland contrasts what we are able to know about people’s behavior today–where we move in the world, how many times our hearts beat per minute, whom we love, whom we are attracted to, what movies we watch and when, what books we read and stop reading in between, etc.–with the “single-shot, self-report data” (e.g., yearly censuses, public polls, and focus groups) that characterized demographic statistics in the recent past. Note that back in 2009, the heyday of the big data (i.e., collecting and storing data) era, Pentland commented that while a ton of data was collected, companies had difficulty extracting value. It was just a lot of noise backed by the promise of analytic potential.

This has changed.

Machine learning has unlocked the potential and risks of the massive amounts of data collected about people.

The standard risk assessment tools (like privacy impact assessments) used by the privacy community today focus on protecting the use of particular types of data, PII like proper names and e-mail addresses. There is a whole industry and tool kit devoted to de-identification and anonymization, automatically removing PII while preserving other behavioral information for statistical insights. The problem is that this PII-centric approach to privacy misses the boat in the machine learning age. Indeed, what Cambridge Analytica brought to the fore was the ability to use machine learning to probabilistically infer not proper names but features and types from behavior: you don’t need to check a gender box for the system to make a reasonably confident guess that you are a woman based on the pictures you post and the words you use; private data from conversations with your psychiatrist need not be leaked for the system to peg you as neurotic. Deep learning is so powerful because it is able to tease out and represent hierarchical, complex aspects of data that aren’t readily and effectively simplified down to variables we can keep track of and proportionately weight in our heads: these algorithms can, therefore, tease meaning out of a series of actions in time. This may not peg you as you, but it can peg you as one of a few whose behavior can be impacted using a given technique to achieve a desired outcome.

Three things have shifted:

  • using machine learning, we can probabilistically construct meaningful units that tell us something about people without standard PII identifiers;
  • because we can use machine learning, the value of data shifts from the individual to statistical insights across a distribution; and
  • breaches of privacy that occur at the statistical layer instead of the individual data layer require new kinds of privacy protections and guarantees.

The technical solution to this last bullet point is a technique called differential privacy. Still in the early stages of commercial adoption,[12] differential privacy thinks about privacy as the extent to which individual data impacts the shape of some statistical distribution. If what we care about is the insight, not the person, then let’s make it so we can’t reverse engineer how one individual contributed to that insight. In other words, the task is to modify a database such that:

if you have two otherwise identical databases, one with your information and one without it, the probability that a statistical query will produce a given result is (nearly) the same whether it’s conducted on the first or second database.

Here’s an example Matthew Green from Johns Hopkins gives to help develop an intuition for how this works:

Imagine that you choose to enable a reporting feature on your iPhone that tells Apple if you like to use the 💩  emoji routinely in your iMessage conversations. This report consists of a single bit of information: 1 indicates you like 💩 , and 0 doesn’t. Apple might receive these reports and fill them into a huge database. At the end of the day, it wants to be able to derive a count of the users who like this particular emoji.

It goes without saying that the simple process of “tallying up the results” and releasing them does not satisfy the DP definition, since computing a sum on the database that contains your information will potentially produce a different result from computing the sum on a database without it. Thus, even though these sums may not seem to leak much information, they reveal at least a little bit about you. A key observation of the differential privacy research is that in many cases, DP can be achieved if the tallying party is willing to add random noise to the result. For example, rather than simply reporting the sum, the tallying party can inject noise from a Laplace or gaussian distribution, producing a result that’s not quite exact — but that masks the contents of any given row.
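To make the noise-adding idea concrete, here is a minimal Python sketch of the Laplace mechanism applied to Green’s emoji tally. It is an illustration under stated assumptions, not Apple’s implementation: the function names, the privacy parameter epsilon (smaller values mean more noise and stronger privacy), and the simulated reports are all hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(bits, epsilon, rng):
    """Release the number of 1s in `bits` with Laplace noise added.

    One person joining or leaving changes the true count by at most 1
    (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    sensitivity = 1.0
    return sum(bits) + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical reports: 1 means "likes the emoji", 0 means "does not".
without_you = [1 if rng.random() < 0.3 else 0 for _ in range(10_000)]
with_you = without_you + [1]  # the same database plus your single report

epsilon = 1.0  # assumed privacy parameter, chosen only for illustration
print("true counts:   ", sum(without_you), sum(with_you))
print("noisy releases:", noisy_count(without_you, epsilon, rng),
      noisy_count(with_you, epsilon, rng))
```

The true counts differ by exactly one, which is what would give you away. The noisy releases are each off by a unit or so on average, which barely matters for a tally in the thousands but is enough to blur whether any single report is in the database.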

This is pretty technical. It takes time to understand it, in particular if you’re not steeped in statistics day in and day out, viewing the world as a set of dynamic probability distributions. But it poses a big philosophical question in the context of this post.

In the final chapters of Homo Deus, Yuval Noah Harari proposes that we are moving from the age of Humanism (where meaning emanates from the perspective of the individual human subject) to the age of Dataism (where we question our subjective viewpoints given our proven predilections for mistakes and bias to instead relegate judgment, authority, and agency to algorithms that know us better than we know ourselves). Reasonable expectations for privacy, as Justice Harlan indicated, are subjective, even if they must be supported by some measurement of norms to qualify as reasonable. Consent is individual and subjective, and results in principles like that of minimum use for an acknowledged purpose because we have limited ability to see beyond ourselves: we create traffic jams because we’re so damned focused on the next step, the proxy, as opposed to viewing the system as a whole from a wider vantage point, and only rarely (I presume?) self-identify and view ourselves under the round curves of a distribution. So, if techniques like differential privacy are more apt to protect us in an age where distributions matter more than data points, how should we construct consent, and how should we shape expectations, to craft the right balance between the liberal values we’ve inherited and this mathematical world we’re building? Or, do we somehow need to reformulate our values to align with Dataism?

And, perhaps most importantly, what should peaceful resistance look like and what goals should it achieve?


[1] What one decides to call the event reveals a lot about how one interprets it. Is it a breach? A scandal? If so, which actor exhibits scandalous behavior: Nix for his willingness to profit from the manipulation of people’s psychology to support the election of an administration that is toppling democracy? Zuckerberg for waiting so long to acknowledge that his social media empire is more than just an advertising platform and has critical impacts on politics and society? The Facebook product managers and security team for lacking any real enforcement mechanisms to audit and verify compliance with data policies? We the people, who have lost our ability and even desire to read critically, as we prefer the sensationalism of click bait, goading technocrats to optimize for whatever headline keeps us hooked to our feed, ever curious for more? Our higher education system, which, falling to economic pressures that date back to before (but were aggravated by) the 2008-2009 financial crisis, is cutting any and all curricula for which it’s hard to find a direct, causal line to steady and lucrative employment, as our education system evolves from educating a few thoughtful priests to educating many industrial workers to educating engineers who can build stuff and optimize everything and define proxies and identify efficiencies so we can go faster, faster until we step back and take the time to realize the things we are building may not actually align with our values, that, perhaps, we may need to retain and reclaim our capacities to reflect and judge and reason if we want to sustain the political order we’ve inherited?
Or perhaps all of this is just the symptom of a much larger, complex trend in World History that we’re unable to perceive, that the Greeks were right in thinking that forms of government pass through inevitable cycles with the regularity of the earth rotating around the sun (an historical perspective itself, as the Greeks thought the inverse) and we should throw our hands up like happy nihilists, bowing down to the unstoppable systemic forces of class warfare, the give and take between the haves and the have nots, little amino acids ever unable to perceive how we impact the function of proteins and how they impact us in return?

And yet, it feels like there may be nothing more important than to understand this and to do what little–what big–we can to make the world a better place. This is our dignity, quixotic though it may be.

The Greek term for the cycle of political regimes is anacyclosis. Interestingly enough, there is an institute devoted to this idea, seemingly located in North Carolina. Their vision is to “halt the cycle of revolution while democracy prevails.”

[2] One aspect of the fiasco* I won’t write about but that merits at least passing mention is Elon Musk’s becoming the mascot for the #DeleteFacebook movement (too strong a word?). The New York Times coverage of Musk’s move references Musk and Zuckerberg’s contrasting opinions on the risks AI might pose to humanity. From what I understand, as executives, they both operate on extremely long time scales (i.e., 100 years in the future), projecting way out into speculative futures and working backwards to decide what small steps man should take today to enable Man to take giant future leaps (gender certainly intended, especially in Musk’s case, as I find his aesthetic, and that of many of the very muscular men I’ve met from Tesla at conferences, is not dissimilar from the nationalistic masculinity performed by Vladimir Putin). Musk rebuffed Zuckerberg’s criticism that Musk’s rhetoric about the existential threat AI poses to humanity is “irresponsible” by saying that Zuckerberg’s “understanding of the subject is limited.” I had some cognitive dissonance reading this, as I presumed the risk Musk was referring to was that of super-intelligence run amok (à la Nick Bostrom, whom I admittedly reference as a straw man) rather than that of our having created an infrastructure that exacerbates short-term, emotional responses to stimuli and thereby threatens the information exchange required for democracy to function (see Alexis de Tocqueville on the importance of newspapers in Democracy in America). My takeaway from all of this is that there are so many different sub-issues all wrapped up together, and that we in the technology community really do need to work as hard as we can to reference specifics rather than allow for the semantic slippage that leaves interpretation in the mind of the beholder. It’s SO HARD to do this, especially for public figures like Musk, given that people’s attention spans are limited and we like punchy quotables at a very high level. 
The devil is always in the details.

[3] Doctorow references Lawrence Lessig’s Code and Other Laws of Cyberspace, which I have yet to read but is hailed as a classic text on the relationship between law and code, where norms get baked into our technologies in the choices of how we write code.

[4] I always got a kick out of the song Human by the Killers, whose lyrics seem to imply a mutually exclusive distinction between human and dancer. Does the animal kingdom offer better paradigms for dancers than us poor humans? Must depend on whether you’re a white dude.

[5] My talk drew largely from Chris Dixon’s extraordinary Atlantic article How Aristotle Created the Computer. Extraordinary because he deftly encapsulates 2000 years of the history of logic into a compelling, easy-to-read article that truly helps the reader develop intuitions about deterministic computer programs and the shift to a more inductive machine learning paradigm, while also not leaving the reader with the bitter taste of having read an overly general dilettante. Here’s one of my slides, which represents how important it was for the history of computation to visualize and interpret Aristotelian syllogisms as sets (sets lead to algebra lead to encoding in logical gates lead to algorithms).

As Dixon writes, “You can replace ‘Socrates’ with any other object, and ‘mortal’ with any other predicate, and the argument remains valid. The validity of the argument is determined solely by the logical structure. The logical words – ‘all,’ ‘is,’ ‘are,’ and ‘therefore’ – are doing all the work.”

Fortunately (well, we put effort in to coordinate), my talk was a nice primer for Graham Taylor’s superbly clear introduction to various forms of machine learning. I most liked his section on representation learning, where he showed how the choice of representation of data has an enormous impact on the performance of algorithms:

The image is from deeplearningbook.org. Note that in Cartesian coordinates, it’s hard to draw a straight line that could separate the blue and green items, whereas in polar coordinates, the division is readily visible. Coordinate choice has a big impact on what qualifies as simple in math. Newton and Descartes, for example, disagreed over the simplicity of the equation for a circle: it’s a pretty complex equation when represented in Cartesian coordinates, but quite simple in Polar coordinates. Our frame of reference is a thinking tool we can use to solve problems — I first learned this in high-school physics, when Clyfe Beckwith taught us that we could tilt our Cartesian coordinates to align with the slope of a hill in a physics problem. It’s a foundational memory I have of ridding myself of ossified assumptions to open the creative thinking space to solve problems, not unlike adding 0 to an algebraic equation to leverage the power of b + -b.

[6] If you’re interested in contemporary Constitutional Law, I highly recommend Roman Mars’s What Trump Can Teach us about Con Law podcast. Mars and Elizabeth Joh, a law school professor at UC Davis, use Trump’s entirely anomalous behavior as a catalyst to explore various aspects of the Constitution. I particularly enjoyed the episode about the emoluments clause, which prohibits acceptance of diplomatic gifts to the President, Vice President, Secretary of State, and their spouses. The Protocol Gift Unit keeps a public record of all gifts presidents did accept, including justification of why they made the exception. For example, in 2016, former President Obama accepted Courage, an olive green with black flecks soapstone sculpture, depicting the profile of an eagle with half of an indigenous man’s face in the center, valued at $650.00, from His Excellency Justin Trudeau, P.C., M.P., Prime Minister of Canada, because “non-acceptance would cause embarrassment to donor and U.S. Government.”

courage statute
Courage on the right. Leo Arcand, the Alberta Cree artist who sculpted it, on the left.

[7] Cindy will be in Toronto for RightsCon May 16-18. I cannot recommend her highly enough. Every time I hear her speak, every time I read her writing, I am floored by her eloquence, precision, and passionate commitment to justice.

[8] Another thing I cannot recommend highly enough is David Foster Wallace’s commencement speech This is Water. It’s ruthlessly important. It’s tragic to think about the fact that this human, this wonderfully enlightened heart, felt the only appropriate act left was to commit suicide.

[9] A related issue I won’t go into in this post is the third-party doctrine, “under which an individual who voluntarily provides information to a third party loses any reasonable expectation of privacy in that information.” (Cohn)

[10] Eli Pariser does a great job showing the difference between our frivolous and aspirational selves, and the impact this has on filter bubbles, in his quite prescient 2011 monograph.

[11] See also this 2014 Harvard Business Review interview with Pentland. My friend Dazza Greenwood first introduced me to Pentland’s work by presenting the blockchain as an effective means to execute the new deal on data, empowering individuals to keep better track of where data flow and sit, and how they are being used.

[12] Cynthia Dwork’s pioneering work on differential privacy at Microsoft Research dates back to 2006. It’s currently in use at Apple, Facebook, and Google (the most exciting application being fused with federated learning across the network of Android users, to support localized, distributed personalization without requiring that everyone share their digital self with Google’s central servers). Even Uber has released an open-source differential privacy toolset. There are still many limitations to applying these techniques in practice given their impact on model performance and the lack of robust guarantees on certain machine learning models. I don’t know of many instances of startups using the technology yet outside a proud few in the Georgian Partners portfolio, including integrate.ai (where I work) and Bluecore in New York City.

The featured image is from an article in The Daily Dot (which I’ve never heard of) about the Mojave Phone Booth, which, as Roman Mars delightfully narrates in 99% Invisible, became a sensation when Godfrey “Doc” Daniels (trust me that link is worth clicking on!!) used the internet to catalogue his quest to find the phone booth working merely from its number: 760-733-9969. The tattered decrepitude of the phone booth, pitched against the indigo of the sunset, is a compelling illustration of the inevitable retrograde character of common law precedent. The opinions in Katz v. United States regarding reasonable expectations of privacy were given at a time when digital communications occurred largely over the phone: is it even possible for us to draw analogies between what privacy meant then and what it could mean now in the age of centralized platform technologies whose foundations are built upon creating user bases and markets to then exchange this data for commercial and political advertising purposes? But, what can we use to anchor ethics and lawful behavior if not the precedent of the past, aligned against a set of larger, overarching principles in an urtext like the constitution, or, in the Islamic tradition, the Qur’an? 

Exploration-Exploitation and Life

There was another life that I might have had, but I am having this one. – Kazuo Ishiguro

On April 18, 2016*, I attended an NYAI Meetup** featuring a talk by Columbia Computer Science Professor Dan Hsu on interactive learning. Incredibly clear and informative, the talk slides are worth reviewing in their entirety. But one in particular caught my attention (fortunately it summarizes many of the subsequent examples):

From Dan Hsu’s excellent talk on interactive machine learning

It’s worth stepping back to understand why this is interesting.

Much of the recent headline-grabbing progress in artificial intelligence (AI) comes from the field of supervised learning. As I explained in a recent HBR article, I find it helpful to think of supervised learning like the inverse of high school algebra:

Think back to high school math — I promise this will be brief — when you first learned the equation for a straight line: y = mx + b. Algebraic equations like this represent the relationship between two variables, x and y. In high school algebra, you’d be told what m and b are, be given an input value for x, and then be asked to plug them into the equation to solve for y. In this case, you start with the equation and then calculate particular values.

Supervised learning reverses this process, solving for m and b, given a set of x’s and y’s. In supervised learning, you start with many particulars — the data — and infer the general equation. And the learning part means you can update the equation as you see more x’s and y’s, changing the slope of the line to better fit the data. The equation almost never identifies the relationship between each x and y with 100% accuracy, but the generalization is powerful because later on you can use it to do algebra on new data. Once you’ve found a slope that captures a relationship between x and y reliably, if you are given a new x value, you can make an educated guess about the corresponding value of y.
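For readers who like to see the inversion spelled out, here is a minimal Python sketch of “solving for m and b.” It uses ordinary least squares, which is one common way to fit a line and is my choice for illustration, not something the HBR article prescribes. The five data points are made up and lie roughly along y = 2x + 1.

```python
def fit_line(xs, ys):
    """Least-squares estimates of slope m and intercept b for y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how x and y vary together, divided by how much x varies alone.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


def predict(m, b, x):
    """Ordinary high school algebra: plug x into the learned equation."""
    return m * x + b


# Made-up observations (the "particulars").
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

m, b = fit_line(xs, ys)  # infer the general equation from the data
print("learned slope and intercept:", round(m, 2), round(b, 2))
print("educated guess for x = 5:", round(predict(m, b, 5.0), 2))
```

The fitted line does not pass through every point exactly, which is the sense in which the equation almost never captures each x and y with 100% accuracy, yet it still gives a sensible guess for an x it has never seen.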

Supervised learning works well for classification problems (spam or not spam? relevant or not for my lawsuit? cat or dog?) because of how the functions generalize. Effectively, the “training labels” humans provide in supervised learning assign categories, tokens we affiliate to abstractions from the glorious particularities of the world that enable us to perceive two things to be similar. Because our language is relatively stable (stable does not mean normative, as Canadian Inuit perceive snow differently from New Yorkers because they have more categories to work with), generalities and abstractions are useful, enabling the learned system to act correctly in situations not present in the training set (e.g., it takes a hell of a long time for golden retrievers to evolve to be indistinguishable from their great-great-great-great-great-grandfathers, so knowing what one looks like on April 18, 2016 will be a good predictor of what one looks like on December 2, 2017). But, as Rich Sutton*** and Andrew Barto eloquently point out in their textbook on reinforcement learning,

This is an important kind of learning, but alone it is not adequate for learning from interaction. In interactive problems it is often impractical to obtain examples of desired behavior that are both correct and representative of all the situations in which the agent has to act. In uncharted territory—where one would expect learning to be most beneficial—an agent must be able to learn from its own experience.

In his NYAI talk, Dan Hsu also mentioned a common practical limitation of supervised learning, namely that many companies often lack good labeled training data and it can be expensive, even in the age of Mechanical Turk, to take the time to provide labels.**** The core thing to recognize is that learning from generalization requires that future situations look like past situations; learning from interaction with the environment helps develop a policy for action that can be applied even when future situations do not look exactly like past situations. The maxim “if you don’t have anything nice to say, don’t say anything at all” holds both in a situation where you want to gossip about a colleague and in a situation where you want to criticize a crappy waiter at a restaurant.

In a supervised learning paradigm, there are certainly traps that lead to faulty generalizations from the available training data. One classic problem is called “overfitting”, where a model seems to do a great job on a training data set but fails to generalize well to new data. But the critical difference Hsu points out in his talk is that, while with supervised learning the data available to the learner is exogenous to the system, with interactive machine learning approaches, the learner’s performance is based on the learner’s decisions and the data available to the learner depends on the learner’s decisions. 
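Overfitting is easier to feel than to define, so here is a small Python sketch of it. Everything in it is an assumption for illustration: the data are simulated from y = 2x + 1 plus noise, the “memorizer” is a one-nearest-neighbor lookup that simply parrots the closest training example, and the comparison model is a least-squares line.

```python
import random


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = m*x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    return m, mean_y - m * mean_x


def memorizer_predict(train_xs, train_ys, x):
    """Return the label of the closest training point (pure memorization)."""
    nearest = min(range(len(train_xs)), key=lambda i: abs(train_xs[i] - x))
    return train_ys[nearest]


def mse(predictions, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


rng = random.Random(0)  # fixed seed so the sketch is repeatable


def sample(n):
    """Simulated data: a straight line plus Gaussian noise."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


train_xs, train_ys = sample(30)
test_xs, test_ys = sample(1000)

m, b = fit_line(train_xs, train_ys)
line_train_mse = mse([m * x + b for x in train_xs], train_ys)
line_test_mse = mse([m * x + b for x in test_xs], test_ys)
memorizer_train_mse = mse(
    [memorizer_predict(train_xs, train_ys, x) for x in train_xs], train_ys)
memorizer_test_mse = mse(
    [memorizer_predict(train_xs, train_ys, x) for x in test_xs], test_ys)

print("line:      train", line_train_mse, "test", line_test_mse)
print("memorizer: train", memorizer_train_mse, "test", memorizer_test_mse)
```

The memorizer scores perfectly on the data it has already seen because it has learned the noise along with the signal, and it pays for that on fresh data. The line looks worse on the training set and does better where it counts.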

Think about that. Think about what that means for gauging the consequences of decisions. Effectively, these learners cannot evaluate counterfactuals: they cannot use data or evidence to judge what would have happened if they took a different action. An ideal optimization scenario, by contrast, would be one where we could observe the possible outcomes of any and all potential decisions, and select the action with the best outcome across all these potential scenarios (this is closer, but not identical, to the spirit of variational inference, but that is a complex topic for another post).

To share one of Hsu’s***** concrete examples, let’s say a website operator has a goal to personalize website content to entice a consumer to buy a pair of shoes. Before the user shows up at the site, our operator has some information about her profile and browsing history, so can use past actions to guess what might be interesting bait to get a click (and eventually a purchase). So, at the moment of truth, the operator says “Let’s show the beige Cole Haan high heels!”, displays the content, and observes the reaction. We’ll give the operator the benefit of the doubt and assume the user clicks, or even goes on to purchase. Score! Positive signal! Do that again in the future! But was it really the best choice? What would have happened if the operator had shown the manipulatable consumer the red Jimmy Choo high heels, which cost $750 per pair rather than a more modest $200 per pair? Would the manipulatable consumer have clicked? Was this really the best action?

The learner will never know. It can only observe the outcome of the action it took, not the action it didn’t take.

The literature refers to this dilemma as the trade-off between exploration and exploitation. To again cite Sutton and Barto:

One of the challenges that arise in reinforcement learning, and not in other kinds of learning, is the trade-off between exploration and exploitation. To obtain a lot of reward, a reinforcement learning agent must prefer actions that it has tried in the past and found to be effective in producing reward. But to discover such actions, it has to try actions that it has not selected before. The agent has to exploit what it already knows in order to obtain reward, but it also has to explore in order to make better action selections in the future. The dilemma is that neither exploration nor exploitation can be pursued exclusively without failing at the task. The agent must try a variety of actions and progressively favor those that appear to be best. On a stochastic task, each action must be tried many times to gain a reliable estimate of its expected reward.
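One simple way to act on this dilemma is the epsilon-greedy strategy, which Sutton and Barto cover early in their textbook: most of the time exploit the action that currently looks best, and a small fraction of the time (epsilon) pick at random to explore. Below is a minimal Python sketch applied to the shoe example. The click rates are invented, the function name is hypothetical, and a real system would be far more elaborate.

```python
import random


def run_epsilon_greedy(true_click_rates, epsilon, steps, rng):
    """Simulate an epsilon-greedy learner choosing which item to show.

    The learner never sees `true_click_rates`; it only observes whether
    the item it actually showed was clicked.
    """
    n = len(true_click_rates)
    counts = [0] * n        # how often each item has been shown
    estimates = [0.0] * n   # running average click rate per item
    total_clicks = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n)  # explore: show a random item
        else:
            # exploit: show the item with the best estimate so far
            arm = max(range(n), key=lambda a: estimates[a])
        clicked = 1 if rng.random() < true_click_rates[arm] else 0
        counts[arm] += 1
        # Incremental update of the running average for the shown item only.
        estimates[arm] += (clicked - estimates[arm]) / counts[arm]
        total_clicks += clicked
    return counts, estimates, total_clicks


# Invented click rates: index 0 = beige heels, index 1 = red heels.
rates = [0.05, 0.15]

greedy = run_epsilon_greedy(rates, epsilon=0.0, steps=2000, rng=random.Random(1))
curious = run_epsilon_greedy(rates, epsilon=0.1, steps=2000, rng=random.Random(1))

print("pure exploitation, times each item shown:", greedy[0])
print("10% exploration, times each item shown:  ", curious[0])
```

With epsilon set to zero the learner shows the beige heels forever and never finds out what the red ones would have done, which is the counterfactual blindness described above. A little exploration costs some clicks in the short run but gives the learner a chance to discover the better option.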

There’s a lot to say about the exploration-exploitation tradeoff in machine learning (I recommend starting with the Sutton/Barto textbook). Now that I’ve introduced the concept, I’d like to pivot to consider where and why this is relevant in honest-to-goodness-real-life.

The nice thing about being an interactive machine learning algorithm as opposed to a human is that algorithms are executors, not designers or managers. They’re given a task (“optimize revenues for our shoe store!”) and get to try stuff and make mistakes and learn from feedback, but never have to go through the soul-searching agony of deciding what goal is worth achieving. Human designer overlords take care of that for them. And even the domain and range of possible data to learn from is constrained by technical conditions: designers make sure that it’s not all the data out there in the world that’s used to optimize performance on some task, but a tiny little baby subset (even if that tiny little baby entails 500 million examples) confined within a sphere of relevance.

Being a human is unfathomably more complicated.

Many choices we make benefit from the luxury of triviality and frequency. “Where should we go for dinner and what should we eat when we get there?” Exploitation can be a safe choice, in particular for creatures of habit. “Well, sweetgreen is around the corner, it’s fast and reliable. We could take the time to review other restaurants (which could lead to the most amazing culinary experience of our entire lives!) or we could not bother to make the effort, stick with what we know, and guarantee a good meal with our standard kale caesar salad, that parmesan crisp thing they put on the salad is really quite tasty…” It’s not a big deal if we make the wrong choice because, lo and behold, tomorrow is another day with another dinner! And if we explore something new, it’s possible the food will be just terrible and sometimes we’re really not up for the risk, or worse, the discomfort or shame of having to send something we don’t like back. And sometimes it’s fine to take the risk and we come to learn we really do love sweetbreads, not sweetgreens, and perhaps our whole diet shifts to some decadent 19th-century French paleo practice in the style of des Esseintes.

Des_Esseintes_at_study_Zaidenberg_illustration
Arthur Zaidenberg’s depiction of des Esseintes, decadent hero extraordinaire, who embeds gems into a tortoise shell and has a perfume organ.

Other choices have higher stakes (or at least feel like they do) and easily lead to paralysis in the face of uncertainty. Working at a startup strengthens this muscle every day. Early on, founders are plagued by an unknown amount of unknown unknowns. We’d love to have a magic crystal ball that enables us to consider the future outcomes of a range of possible decisions, and always act in the way that guarantees future success. But the crystal balls don’t exist, and even if they did, we sometimes have so few prior assumptions to prime the pump that the crystal ball could only output an #ERROR message to indicate there’s just not enough there to forecast. As such, the only option available is to act and to learn from the data provided as a result of that action. To jumpstart empiricism, staking some claim and getting as comfortable as possible with the knowledge that the counterfactual will never be explored, and that each action taken shifts the playing field of possibility and probability and certainty slightly, calming minds and hearts. The core challenge startup leaders face is to enable the team to execute as if these conditions of uncertainty weren’t present, to provide a safe space for execution under the umbrella of risk and experiment. What’s fortunate, however, is that the goals of the enterprise are, if not entirely well-defined, at least circumscribed. Businesses exist to turn profits and that serves as a useful, if not always moral, constraint.

Big personal life decisions exhibit further variability because we but rarely know what to optimize for, and it can be incredibly counter-productive and harmful to either constrain ourselves too early or suffer from the psychological malaise of assuming there’s something wrong with us if we don’t have some master five-year plan.

This human condition is strange because we do need to set goals–it’s beneficial for us to consider second- and third-tier consequences, i.e., if our goal is to be healthy and fit, we should overcome the first-tier consequence of receiving pleasure when we drown our sorrows in a gallon of salted caramel ice cream–and yet it’s simply impossible for us to imagine the future accurately because, well, we overfit to our present and our past.

I’ll give a concrete example from my own experience. As I touched upon in a recent post about transitioning from academia to business, one reason why it’s so difficult to make a career change is that, while we never actually predict the future accurately, it’s easier to fear loss from a known predicament than to imagine gain from a foreign predicament.****** Concretely, when I was deciding whether to pursue a career in academia or the private sector in the fifth year in graduate school, I erroneously assumed that I was making a strict binary choice, that going into business meant forsaking a career teaching or publishing. As I was evaluating my decision, I never in my wildest dreams imagined that, a mere two years later, I would be invited to be an adjunct professor at the University of Calgary Faculty of Law, teaching about how new technologies were impacting traditional professional ethics. And I also never imagined that, as I gave more and more talks, I would subsequently be invited to deliver guest lectures at numerous business schools in North America. This path is not necessarily the right path for everyone, but it was and is the right path for me. In retrospect, I wish I’d constructed my decision differently, shifting my energy from fearing an unknown and unknowable future to paying attention to what energized me and made me happy and working to maximize the likelihood of such energizing moments occurring in my life. I still struggle to live this way, still fetishize what I think I should be wanting to do and living with an undercurrent of anxiety that a choice, a foreclosure of possibility, may send me down an irreconcilably wrong path. It’s a shitty way to be, and something I’m actively working to overcome.

So what should our policy be? How can we reconcile this terrific trade-off between exploration and exploitation, between exposing ourselves to something radically new and honing a given skill, between learning from a stranger and spending more time with a loved one, between opening our mind to some new field and developing niche knowledge in a given domain, between jumping to a new company with new people and problems, and exercising our resilience and loyalty to a given team?

There is no right answer. We’re all wired differently. We all respond to challenges differently. We’re all motivated by different things.

Perhaps death is the best constraint we have to provide some guidance, some policy to choose between choice A and choice B. For we can project ourselves forward to our imagined death bed, where we lie, alone, staring into the silent mirror of our hearts, and ask ourselves “Was my life meaningful?” But this imagined scene is not actually a future state: it is a present policy. It is a principle we can use to evaluate decisions, a principle that is useful because it abstracts us from the mire of emotions overly indexed towards near-term goals and provides us with perspective.

And what’s perhaps most miraculous is that, at every present, we can sit there and stare into the silent mirror of our hearts and look back on the choices we’ve made and say, “That is me.” It’s so hard going forward, and so easy going backward. The proportion of what may come wanes ever smaller than the portion of what has been, never quite converging until it’s too late, and we are complete.


*Thank you, internet, for enabling me to recall the date with such exacting precision! Using my memory, I would have deduced the approximate date by 1) remembering that Robert Colpitts, my boyfriend at the time (Godspeed to him today, as he participates in a sit-a-thon fundraiser for the Interdependence Project in New York City, a worthy cause), attended with me, recalling how fresh our relationship was (it had to have been really fresh because the frequency with which we attended professional events together subsequently declined), and working backwards from the start to find the date; 2) remembering what I wore! (crazy!!), namely a sheer pink sleeveless shirt, a pair of wide-legged white pants that landed just slightly above the ankle and therefore looked great with the pair of beige, heeled sandals with leather so stiff it gave me horrific blisters that made running less than pleasant for the rest of the week. So I’d recently purchased those when my brother and his girlfriend visited, which was in late February (or early March?) 2016; 3) remembering that afterwards we went to some fast food Indian joint nearby in the Flatiron district, food was decent but not good enough to inspire me to return. So that would put us in the March-April, 2016 range, which is close but not the exact April 18. That’s one week after my birthday (April 11); I remember Robert and I had a wonderful celebration on my birthday. I felt more deeply cared for than I had on any past birthday. But I don’t remember this talk relative to the birthday celebration (I do remember sending the marketing email to announce the Fast Forward Labs report on text summarization on my birthday, when I worked for half a day and then met Robert at the nearby sweetgreen, where he ordered, as always, (Robert is a creature of exploitation) the kale caesar salad, after which we walked together across the Brooklyn Bridge to my house, we loved walking together, we took many, many walks together, often at night after work at the Promenade, often in the morning, before work, at the Promenade, when there were so few people around, so few people awake). I must say, I find the process of reconstructing when an event took place using temporal landmarks much more rewarding than searching for “Dan Hsu Interactive Learning NYAI” on Google to find the exact date. But the search terms themselves reveal something equally interesting about our heuristic mnemonics, as happens every time we reconstruct some theme or topic to retrieve a former conversation on Slack.

**Crazy that WeWork recently bought Meetup, although interesting to think about how the two business models enable what I am slowly coming to see as the most important creative force in the universe, the combinatory potential of minds meeting productively, where productively means that each mind is not coming as a blank slate but as engaged in a project, an endeavor, where these endeavors can productively overlap and, guided by a Smithian invisible hand, create something new. The most interesting model we hope to work on soon at integrate.ai is one that optimizes groups in a multiplayer game experience (which we lovingly call the polyamorous online dating algorithm), so mapping personality and playing style affinities to dynamically allocate the best next player to an alliance. Social compatibility is a fascinating thing to optimize for, in particular when it goes beyond just assembling a pleasant cocktail party to pairing minds, skills, and temperaments to optimize the likelihood of creating something beautiful and new.

***Sutton has one of the most beautiful minds in the field and he is kind. He is a person to celebrate. I am grateful our paths have crossed and thoroughly enjoyed our conversation on the In Context podcast.

****Maura Grossman and Gordon Cormack have written countless articles about the benefits of using active learning for technology-assisted review (TAR), or classifying documents for their relevance for a lawsuit. The tradeoffs they weigh relate to system performance (gauged by precision and recall on a document set) versus time, cost, and effort to achieve that performance.

*****Hsu did not mention Haan or Choo. I added some more color.

******Note this same dynamic occurs in our current fears about the future economy. We worry a hell of a lot more about the losses we will incur if artificial intelligence systems automate existing jobs than we celebrate the possibilities of new jobs and work that might become possible once these systems are in place. This is also due to the fact that the future we imagine tends to be an adaptation of what we know today, as delightfully illustrated in Jean-Marc Côté’s anachronistic cartoons of the year 2000. The cartoons show what happens when our imagination only changes one variable as opposed to a set of holistically interconnected variables.

barber
19th-century cartoons show how we imagine technological innovations in isolation. That said, a hipster barber shop in Portland or Brooklyn could feature such a palimpsestic combination.

 

The featured image is a photograph I took of the sidewalk on State Street between Court and Clinton Streets in Brooklyn Heights. I presume a bird walked on wet concrete. Is that how those kinds of footprints are created? I may see those footprints again in the future, but not nearly as soon as I’d be able to were I not to have decided to move to Toronto in May. Now that I’ve thought about them, I may intentionally make the trip to Brooklyn next time I’m in New York (certainly before January 11, unless I die between now and then). I’ll have to seek out similar footprints in Toronto, or perhaps the snows of Alberta.

Hearing Aids (Or, Metaphors are Personal)

Thursday morning, I gave the opening keynote at an event about the future of commerce at the Rotman School of Management in Toronto. I shared four insights:

  • The AI instinct is to view a reasoning problem as a data problem
    • Marketing hype leads many to imagine that artificial intelligence (AI) works like human brain intelligence. Words like “cognitive” lead us to assume that computers think like we think. In fact, succeeding with supervised learning, as I explain in this article and this previous post, involves a shift in perspective to reframe a reasoning task as a data collection task.
  • Advances in deep learning are enabling radical new recommender systems
    • My former colleague Hilary Mason always cited recommender systems as a classic example of a misunderstood capability. Data scientists often consider recommenders to be a solved problem, given the widespread use of collaborative filtering, where systems infer person B’s interests based on similarity with person A’s interests. This approach, however, is often limited by the “cold start” problem: you need person A and person B to do stuff before you can infer how they are similar. Deep learning is enabling us to shift from comparing past transactional history (structured data) to comparing affinities between people and products (person A loves leopard prints, like this ridiculous Kimpton-style robe!). This doesn’t erase the cold start problem wholesale, but it opens a wide range of possibilities because taste is so hard to quantify and describe: it’s much easier to point to something you like than to articulate why you like it.
  • AI capabilities are often features, not whole products
  • AI will dampen the moral benefits of commerce if we are not careful
    • Adam Smith is largely remembered for his theories on the value of the division of labor and the invisible hand that guides capitalistic markets. But he also wrote a wonderful treatise on moral sentiments where he argued that commerce is a boon to civilization because it forces us to interact with strangers; when we interact with strangers, we can’t have temper tantrums like we do at home with our loved ones; and this gives us practice in regulating our emotions, which is a necessary condition of rational discourse and the compromise at the heart of teamwork and democracy. As with many of the other narcissistic inclinations of our age, the logical extreme of personalization and eCommerce is a world where we no longer need to interact with strangers, no longer need to practice the art of tempered self-interest to negotiate a bargain. Being elegantly bored at a dinner party can be a salutary boon to happiness. David Hume knew this, and died happy; Jean-Jacques Rousseau did not, and died miserable.
bill cunningham
This post on Robo Bill Cunningham does a good job explaining how image recognition capabilities are opening new roads in commerce and fashion.
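
To make the collaborative filtering idea in the second insight concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the item names, the toy purchase histories, and the `recommend` helper are hypothetical and do not come from any real system. It infers person B’s likely interests from the most similar shopper, and it shows why a newcomer with no history gets nothing back, which is the cold start problem.

```python
from math import sqrt

# Hypothetical purchase histories: 1 = bought, 0 = did not buy.
ITEMS = ["robe", "scarf", "boots", "umbrella"]
HISTORY = {
    "person_a": [1, 1, 0, 1],
    "person_b": [1, 1, 0, 0],
    "newcomer": [0, 0, 0, 0],  # no history yet: the cold start case
}


def cosine_similarity(u, v):
    """Cosine of the angle between two purchase vectors (0.0 if either is empty)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def recommend(target, history=HISTORY, items=ITEMS):
    """Suggest items the most similar other person bought that target has not."""
    others = [
        (cosine_similarity(history[target], vec), name)
        for name, vec in history.items()
        if name != target
    ]
    best_score, best_name = max(others)
    if best_score == 0.0:
        return []  # nobody resembles the target, so there is nothing to infer
    return [
        item
        for item, mine, theirs in zip(items, history[target], history[best_name])
        if theirs and not mine
    ]


print(recommend("person_b"))  # the item person_a bought that person_b has not
print(recommend("newcomer"))  # empty: no behavior to compare
```

The approaches described above try to fill that empty list by comparing what people and products look like rather than waiting for transactions to accumulate; this sketch only covers the classic history-based baseline.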

An elderly couple approached me after the talk. I felt a curious sense of comfort and familiarity. When I give talks, I scan the audience for signs of comprehension and approval, my attention gravitating towards eyes that emit kindness and engagement. On Thursday, one of those loci of approval was an elderly gentleman seated in the center about ten rows deep. He and his Russian companion had to have been in their late seventies or early eighties. I did not fear their questions. I embraced them with the openness that only exists when there is no expectation of judgment.

She got right to the point, her accent lilting and Slavic. “I am old,” she said, “but I would like to understand this technology. What recommendations would you give to elderly people like myself, who grew up in a different age with different tools and different mores (she looked beautifully put together in her tweed suit), to learn about this new world?”

I told her I didn’t have a good answer. The irony is that, by asking about something I don’t normally think about, she utterly stumped me. But it didn’t hurt to admit my ignorance and need to reflect. By contrast, I’m often able to conjure some plausible response to those whose opinion I worry about most, who elicit my insecurities because my sense of self is wrapped up in their approval. The left-field questions are ultimately much more interesting.

The first thing that comes to mind if we think about how AI might impact the elderly is how new voice recognition capabilities are lowering the barrier to entry to engage with complex systems. Gerontechnology is a thing, and there are many examples of businesses working to build robots to keep the elderly company or administer remote care. My grandmother, never an early adopter, loves talking to Amazon Alexa.

But the elegant Russian woman was not interested in how the technology could help her; she wanted to understand how it works. Democratizing knowledge is harder than democratizing utility, but ultimately much more meaningful and impactful (as a U Chicago alum, I endorse a lifelong life of the mind).

Then something remarkable happened. Her gentleman friend interceded with an anecdote.

“This,” he started, referring to the hearing aid he’d removed from his ear, “is an example of artificial intelligence. You can hear from my accent that I hail from the other side of the Atlantic (his accent was upper-class British; he’d studied at Harvard). Last year, we took a trip back with the family and stayed in a quintessential British town with quintessential British pubs. I was elated by the prospect of returning to the locals of my youth, of unearthing the myriad memories lodged within childhood smells and sounds and tastes. But my first visit to a pub was intolerable! My hearing aid had become thoroughly Canadian, adapted to the acoustics of airy buildings where sound is free to move amidst tall ceilings. British pubs are confined and small! They trap the noise and completely bombarded my hearing aid. But after a few days, it adjusted, as these devices are wont to do these days. And this adaptation, you see, shows how devices can be intelligent.”

Of course! A hearing aid is a wonderful example of an adaptive piece of technology, of something whose functionality changes automatically with context. His anecdote brilliantly showed how technologies are always more than the functionalities they provide, are rather opportunities to expose culture and anthropology: Toronto’s adolescence as a city indexed by its architecture, in contrast to the wizened wood of an old-world pub; the frustrating compromises of age and fragility, the nostalgic ideal clipped by the time the device required to recalibrate; the incredible detail of the personal as a theatrical device to illustrate the universal.

What’s more, the history of hearing aids does a nice job illustrating the more general history of technology in this our digital age.

Partial deafness is not a modern phenomenon. As everywhere, the tools to overcome it have changed shape over time.

Screen Shot 2017-11-19 at 11.39.29 AM
This 1967 British Pathé primer on the history of hearing aids is a total trip, featuring radical facial hair and accompanying elevator music. They pay special attention to using the environment to camouflage cumbersome hearing aid machinery.

One thing that stands out when you go down the rabbit hole of hearing aid history is the importance of design. Indeed, historical hearing aids are analogue, not digital. People used to use naturally occurring objects, like shells or horns, to make ear trumpets like the one pictured in the featured image above. Some, including 18th-century portrait painter Joshua Reynolds, did not mind exposing their physical limitations publicly. Reynolds was renowned for carrying an ear trumpet and even represented his partial deafness in self-portraits painted later in life.

reynolds_self_portrait_1775_0
Reynolds’ self-portrait as deaf (1775)

Others preferred to deflect attention from their disabilities, camouflaging their tools in the environment or even transforming them into signals of power. At the height of the Napoleonic Age, King John VI of Portugal commissioned an acoustic throne with two open lion mouths at the end of the arms. These lion mouths became his makeshift ears, design transforming weakness into a token of strength; visitors were required to kneel before the chair and speak directly into the animal heads.

acoustic throne
King John VI’s acoustic throne, its lion head ears requiring submission

The advent of the telephone changed hearing aid technology significantly. Since the early 20th century, hearing aids have gone from electronic to transistor-based to digital. Following the exponential dynamics of Moore’s Law, their size has shrunk drastically: contemporary tyrants need not camouflage their weakness behind visual symbols of power. Only recently have hearing aids been able to adapt dynamically to their surroundings, as in the anecdote told by the British gentleman at my talk. Time will tell how they evolve in the near future. Awesome machine listening research in labs like those run by Juan Pablo Bello at NYU may unlock new capabilities where aids can register urban mood, communicating the semantics of a surrounding as opposed to merely modulating acoustics. Making sense of sound requires slightly different machine learning techniques than making sense of images, as Bello explores in this recent paper. In 50 years’ time, modern digital hearing aids may seem as eccentric as a throne with lion-mouth ears.

The world abounds in strangeness. The saddest state of affairs is one of utter familiarity, is one where the world we knew yesterday remains the world we will know tomorrow. It is the trap of the filter bubble, the closing of the mind, the resilient force of inertia and sameness. I would have never included a hearing aid in my toolbox of metaphors to help others gain an intuition of how AI works or will be impactful. For I have never lived in the world the exact same way the British gentleman has lived in the world. Let us drink from the cup of the experiences we ourselves never have. Let us embrace the questions from left field. Let each week, let each day, open our perspectives one sliver larger than the day before. Let us keep alive the temperance of commerce and the sacred conditions of curiosity.


The featured image is of Madame de Meuron, a 20th-century Swiss aristocrat and eccentric. Meuron is like the fusion of Jean des Esseintes–the protagonist of Huysmans’s paradigmatic decadent novel, À Rebours, the poisonous book featured in Oscar Wilde’s Picture of Dorian Gray–and Gertrude Stein or Peggy Guggenheim. She gives life to characters in Thomas Mann novels. She is a modern-day Quijote, her mores and habits out of sync with the tailwinds of modernity. Eccentricity, perhaps, the symptom of history. She viewed her deafness as an asset, not a liability, for she could control the input from her surroundings: “So ghör i nume was i wott! – So I only hear what I want to hear!”

Clinamen

The Sagrada Familia is a castle built by Australian termites.


The Sagrada Familia is not a castle built by Australian termites, and never will be. Tis utter blasphemy.


The Sagrada Familia is not a castle built by Australian termites, and yet, Look! Notice, as Daniel Dennett bids, how in an untrodden field in Australia there emerged and fell, in near silence, near but for the methodical gnawing, not unlike that of a mouse nibbling rapaciously on parched pasta left uneaten all these years but preserved under the thick dust on the thin cardboard with the thin plastic window enabling her to view what remained after she’d cooked just one serving, with butter, for her son, there emerged and fell, with the sublime transience of Andy Goldsworthy, a neo-Gothic church of organic complexity on par with that imagined by Antoni Gaudí i Cornet, whose Sagrada Familia is scheduled for completion in 2026, a full century after the architect died in a tragic tram crash, distracted by the recent rapture of his prayer.


The Sagrada Familia is not a castle built by Australian termites, and yet, Look! Notice, as Daniel Dennett bids, how in an untrodden field in Australia there emerged and fell a structure so eerily resemblant of the one Antoni Gaudí imagined before he died, neglected like a beggar in his shabby clothes, the doctors unaware they had the chance to save the mind that preempted the fluidity of contemporary parametric architectural design by some 80-odd years, a mind supple like that of Poincaré, singular yet part of a Zeitgeist bent on infusing time into space like sandalwood in oil, inseminating Euclid’s cold geometry with femininity and life, Einstein explaining why Mercury’s orbit precesses, Gaudí rendering the holy spirit palpable as movement in stone, fractals of repetition and difference giving life to inorganic matter, tension between time and space the nadir of spirituality, as Andrei Tarkovsky went on to explore in his films.

tarkovsky mirror
From Andrei Tarkovsky’s Mirror. As Tarkovsky wrote of his films in Sculpting in Time: “Just as a sculptor takes a lump of marble, and, inwardly conscious of the features of his finished piece, removes everything that is not a part of it — so the film-maker, from a ‘lump of time’ made up of an enormous, solid cluster of living facts, cuts off and discards whatever he does not need, leaving only what is to be an element of the finished film.”

The Sagrada Familia is not a castle built by Australian termites, and yet, Look! Notice, as Daniel Dennett bids, how in an untrodden field in Australia there emerged and fell a structure so eerily resemblant of the one Antoni Gaudí imagined before he died, with the (seemingly crucial) difference that the termites built their temple without blueprints or plan, gnawing away the silence as a collectivity of single stochastic acts which, taken together over time, result in a creation that appears, to our meaning-making minds, to have been created by an intelligent designer, this termite Sagrada Familia a marvelous instance of what Dennett calls Darwin’s strange inversion of reasoning, an inversion that admits to the possibility that absolute ignorance can serve as master artificer, that IN ORDER TO MAKE A PERFECT AND BEAUTIFUL MACHINE, IT IS NOT REQUISITE TO KNOW HOW TO MAKE IT*, that structures might emerge from the local activity of multiple parts, amino acids folding into proteins, bees flying into swarms, bumper-to-bumper traffic suddenly flowing freely, these complex release valves seeming like magic to the linear perspective of our linear minds.


The Sagrada Familia is not a castle built by Australian termites, and yet, the eerie resemblance between the termite and the tourist Sagrada Familias serves as a wonderful example to anchor a very important cultural question as we move into an age of post-intelligent design, where the technologies we create exhibit competence without comprehension, diagnosing lungs as cancerous or declaring that individuals merit a mortgage or recommending that a young woman would be a good fit for a role on a software engineering team or getting better and better at Go by playing millions of games against itself in a schizophrenic twist resemblant of the pristine pathos of Stefan Zweig, one’s own mind an asylum of exiled excellence during the travesty of the second world war, why, we’ve come full circle and stand here at a crossroads, bidden by a force we ourselves created to accept the creative potential of Lucretius’ swerve, to kneel at the altar of randomness, to appreciate that computational power is not just about shuffling 1s and 0s with speed but shuffling them fast enough to enable a tiny swerve to result in wondrous capabilities, and to watch as, perhaps tragically, we apply a framework built for intelligent design onto a Darwinian architecture, clipping the wings of stochastic potential, working to wrangle our gnawing termites into a straitjacket of cause, while the systems beating Atari, by no act of strategic foresight but by the blunt speed of iteration, make a move so strange and so outside the realm of verisimilitude that, as Kasparov succumbing to Deep Blue, we misinterpret a bug for brilliance.


The Sagrada Familia is not a castle built by Australian termites, and yet, it seems plausible that Gaudí would have reveled in the eerie resemblance between a castle built by so many gnawing termites and the temple Josep Maria Bocabella i Verdaguer, a bookseller with a popular fundamentalist newspaper, “the kind that reminded everybody that their misery was punishment for their sins,”** commissioned him to build.

Bocabella
A portrait of Josep Maria Bocabella, who commissioned Gaudí to build the Sagrada Familia.

Or would he? Gaudí was deeply Catholic. He genuflected at the temple of nature, seeing divine inspiration in the hexagons of honeycombs, imagining the columns of the Sagrada Familia to lean, buttresses, as symbols of the divine trinity of the father (the vertical axis), son (the horizontal axis), and holy spirit (the vertical meeting the horizontal in the crux of the diagonal). His creativity, therefore, always stemmed from something more than intelligent design, stood as an act of creative prayer to render homage to God the creator by creating an edifice that transformed, in fractals of repetition in difference, inert stone into movement and life.

columns
The tops of the columns inside the Sagrada Familia have twice as many lines as the roots, the doubling generating a sense of movement and life.

The Sagrada Familia is not a castle built by Australian termites, and yet, the termite Sagrada Familia actually exists as a complete artifact, its essence revealed to the world rather than being stuck in unfinished potential. And yet, while we wait in joyful hope for its imminent completion, this unfinished, 144-year-long architectural project has already impacted so many other architects, from Frank Gehry to Zaha Hadid. This unfinished vision, this scaffold, has launched a thousand ships of beauty in so many other places, changing the skylines of Bilbao and Los Angeles and Hong Kong. Perhaps, then, the legacy of the Sagrada Familia is more like that of Jodorowsky’s Dune, an unfinished film that, even from its place of stunted potential, changed the history of cinema. Perhaps, then, the neglect the doctors showed to Gaudí, the bearded beggar distracted by his act of prayer, was one of those critical swerves in history. Perhaps, had Gaudí lived to finish his work, architects during the century wouldn’t have been as puzzled by the parametric requirements of his curves and the building wouldn’t have gained the puzzling aura it gleans to this day. Perhaps, no matter how hard we try to celebrate and accept the immense potential of stochasticity, we will always be makers of meaning, finders of cause, interpreters needing narrative to live grounded in our world. And then again, perhaps not.


The Sagrada Familia is not a castle built by Australian termites. The termites don’t care either way. They’ll still construct their own Sagrada Familia.


The Sagrada Familia is a castle built by Australian termites. How wondrous. How essential must be these shapes and forms.


The Sagrada Familia is a castle built by Australian termites. It is also an unfinished neo-Gothic church in Barcelona, Spain. Please, terrorists, please don’t destroy this temple of unfinished potential, this monad brimming with the history of the world, each turn, each swerve a pivot down a different section of the encyclopedia, coming full circle in its web of knowledge, imagination, and grace.


The Sagrada Familia is a castle built by Australian termites. We’ll never know what Gaudí would have thought about the termite castle. All we have are the relics of his Poincaréan curves, and fish lamps to illuminate our future.

fish-4
Frank Gehry’s fish lamps, which carry forth the spirit of Antoni Gaudí

*Dennett reads these words, penned in 1868 by Robert Beverley MacKenzie, with pedantic panache, commenting that the capital letters were in the original.

**Much in this post was inspired by Roman Mars’ awesome 99% Invisible podcast about the Sagrada Familia, which features the quotation about Bocabella’s newspaper.

The featured image comes from Daniel Dennett’s From Bacteria to Bach and Back. I had the immense pleasure of interviewing Dan on the In Context podcast, where we discuss many of the ideas that appear in this post, just in a much more cogent form. 

 

Degrees of Knowledge

That familiar discomfort of wanting to write but not feeling ready yet.*

(The default voice pops up in my brain: “Then don’t write! Be kind to yourself! Keep reading until you understand things fully enough to write something cogent and coherent, something worth reading.”

The second voice: “But you committed to doing this! To not write** is to fail.***”

The third voice: “Well gosh, I do find it a bit puerile to incorporate meta-thoughts on the process of writing so frequently in my posts, but laziness triumphs, and voilà there they come. Welcome back. Let’s turn it to our advantage one more time.”)

This time the courage to just do it came from the realization that “I don’t understand this yet” is interesting in itself. We all navigate the world with different degrees of knowledge about different topics. To follow Wilfrid Sellars, most of the time we inhabit the manifest image, “the framework in terms of which man came to be aware of himself as man-in-the-world,” or, more broadly, the framework in terms of which we ordinarily observe and explain our world. We need the manifest image to get by, to engage with one another and not to live in a state of utter paralysis, questioning our every thought or experience as if we were being tricked by the evil genius Descartes introduces at the outset of his Meditations (the evil genius toppled by the clear and distinct force of the cogito, the I am, which, per Dan Dennett, actually had the reverse effect of fooling us into believing our consciousness is something different from what it actually is). Sellars contrasts the manifest image with the scientific image: “the scientific image presents itself as a rival image. From its point of view the manifest image on which it rests is an ‘inadequate’ but pragmatically useful likeness of a reality which first finds its adequate (in principle) likeness in the scientific image.” So we all live in this not quite reality, our ability to cooperate and coexist predicated pragmatically upon our shared not-quite-accurate truths. It’s a damn good thing the mess works so well, or we’d never get anything done.

Sellars has a lot to say about the relationship between the manifest and scientific images, how and where the two merge and diverge. In the rest of this post, I’m going to catalogue my gradual coming to not-yet-fully understanding the relationship between mathematical machine learning models and the hardware they run on. It’s spurring my curiosity, but I certainly don’t understand it yet. I would welcome readers’ input on what to read and to whom to talk to change my manifest image into one that’s slightly more scientific.

So, one common thing we hear these days (in particular given Nvidia’s now formidable marketing presence) is that graphics processing units (GPUs) and tensor processing units (TPUs) are a key hardware advance driving the current ubiquity of artificial intelligence (AI). I learned about GPUs for the first time about two years ago and wanted to understand why they made it so much faster to train deep neural networks, the algorithms behind many popular AI applications. I settled on an understanding that the linear algebra–operations we perform on vectors, strings of numbers oriented in a direction in an n-dimensional space–powering these applications is better executed on hardware of a parallel, matrix-like structure. That is to say, properties of the hardware were more like properties of the math: they performed so much more quickly than a linear central processing unit (CPU) because they didn’t have to squeeze a parallel computation into the straitjacket of a linear, gated flow of electrons. Tensors, objects that describe the relationships between vectors, as in Google’s hardware, are that much more closely aligned with the mathematical operations behind deep learning algorithms.
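
A small sketch helps me hold on to this level of understanding. It treats one neural network layer as a matrix-vector product and shows that each output number is an independent dot product, so the rows can be handed to separate workers in any order. The weights, the input, and the function names are all made up for illustration, and Python threads are only a stand-in for parallel hardware: the point is the independence of the computations, not a claim about how a GPU is actually built or how fast it runs.

```python
from concurrent.futures import ThreadPoolExecutor


def dot(row, vector):
    """Dot product of one weight row with the input vector."""
    return sum(w * x for w, x in zip(row, vector))


def layer_sequential(weights, vector):
    """Compute y = W x one row after another, as a single linear flow."""
    return [dot(row, vector) for row in weights]


def layer_parallel(weights, vector, workers=4):
    """Compute the same y = W x, giving each row to a pool of workers.

    No row depends on any other row, which is what makes the work parallel.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda row: dot(row, vector), weights))


# Hypothetical weights (4 outputs, 3 inputs) and a hypothetical input vector.
W = [[1, 0, 2], [0, 3, 1], [4, 1, 0], [2, 2, 2]]
x = [1, 2, 3]

print(layer_sequential(W, x))
print(layer_parallel(W, x))
```

Both calls print the same list, which is the whole lesson at this level: the math does not care about the order of the rows, so hardware that can do many rows at once loses nothing.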

There are two levels of knowledge there:

  • Basic sales pitch: “remember, GPU = deep learning hardware; they make AI faster, and therefore make AI easier to use so more possible!”
  • Just above the basic sales pitch: “the mathematics behind deep learning is better represented by GPU or TPU hardware; that’s why they make AI faster, and therefore easier to use so more possible!”

At this first stage of knowledge, my mind reached a plateau where I assumed that the tensor structure was somehow intrinsically and essentially linked to the math in deep learning. My brain’s neurons and synapses had coalesced on some local minimum or maximum where the two concepts were linked and reinforced by talks I gave (which by design condense understanding into some quotable meme, in particular in the age of Twitter…and this requirement to condense certainly reinforces and reshapes how something is understood).

In time, I started to explore the strange world of quantum computing, starting afresh off the local plateau to try, again, to understand new claims that entangled qubits enable even faster execution of the math behind deep learning than the soddenly deterministic bits of CPUs, GPUs, and TPUs. As Ivan Deutsch explains in this article, the promise behind quantum computing is as follows:

In a classical computer, information is stored in retrievable bits binary coded as 0 or 1. But in a quantum computer, elementary particles inhabit a probabilistic limbo called superposition where a “qubit” can be coded as 0 and 1.

Here is the magic: Each qubit can be entangled with the other qubits in the machine. The intertwining of quantum “states” exponentially increases the number of 0s and 1s that can be simultaneously processed by an array of qubits. Machines that can harness the power of quantum logic can deal with exponentially greater levels of complexity than the most powerful classical computer. Problems that would take a state-of-the-art classical computer the age of our universe to solve, can, in theory, be solved by a universal quantum computer in hours.
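The "exponentially" in that passage can be made tangible with a little bookkeeping. The sketch below is my own and rests on one standard fact: describing the joint state of n qubits on a classical machine takes 2^n amplitudes. The function names `amplitudes_needed` and `uniform_superposition` are hypothetical, and the code simulates nothing quantum. It only counts how large the classical description gets and checks that the squared amplitudes of an equal superposition sum to one.

```python
import math


def amplitudes_needed(n_qubits):
    """Number of amplitudes in a classical description of n qubits."""
    return 2 ** n_qubits


def uniform_superposition(n_qubits):
    """State vector with equal amplitude on every basis state.

    This particular state is not entangled; it is used here only
    because it is the simplest one to write down.
    """
    size = amplitudes_needed(n_qubits)
    amplitude = 1 / math.sqrt(size)
    return [amplitude] * size


for n in (1, 2, 10, 50):
    print(n, amplitudes_needed(n))

state = uniform_superposition(3)
total_probability = sum(a * a for a in state)
print(len(state), math.isclose(total_probability, 1.0))  # 8 True
```

At 50 qubits the count is already past a quadrillion, which is roughly where writing the state down on classical hardware stops being practical.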

What’s salient here is that the inherent probabilism of quantum computers makes them even more fundamentally aligned with the true mathematics we’re representing with machine learning algorithms. TPUs, then, seem to exhibit a structure that best captures the mathematical operations of the algorithms, but exhibit the fatal flaw of being deterministic by essence: they’re still trafficking in the binary digits of 1s and 0s, even if they’re allocated in a different way. Quantum computing seems to bring back an analog computing paradigm, where we use aspects of physical phenomena to model the problem we’d like to solve. Quantum, of course, exhibits this special fragility where, should the balance of the system be disrupted, the probabilistic potential reverts down to the boring old determinism of 1s and 0s: a cat observed will be either dead or alive, per the harsh law of the excluded middle that haunts our manifest image.

What, then, is the status of being of the math? I feel a risk of falling into Platonism, of assuming that a statement like “3 is prime” refers to some abstract entity, the number 3, that then gets realized in a lesser form as it is embodied on a CPU, GPU, or cup of coffee. It feels more cogent to me to endorse mathematical fictionalism, where mathematical statements like “3 is prime” tell a different type of truth than truths we tell about objects and people we can touch and love in our manifest world.****

My conclusion, then, is that radical creativity in machine learning–in any technology–may arise from our being able to abstract the formal mathematics from their substrate, to conceptually open up a liminal space where properties of equations have yet to take form. This is likely a lesson for our own identities, the freeing from necessity, from assumption, that enables us to come into the self we never thought we’d be.

I have a long way to go to understand this fully, and I’ll never understand it fully enough to contribute to the future of hardware R&D. But the world needs communicators, translators who eventually accept that close enough can be a place for empathy, and growth.


*This holds not only for writing, but for many types of doing, including creating a product. Agile methodologies help overcome the paralysis of uncertainty, the discomfort of not being ready yet. You commit to doing something, see how it works, see how people respond, see what you can do better next time. We’re always navigating various degrees of uncertainty, as Rich Sutton discussed on the In Context podcast. Sutton’s formalization of doing the best you can with the information you have available today towards some long-term goal, but learning at each step rather than waiting for the long-term result, is called temporal-difference learning.

**Split infinitive intentional.

***Who’s keeping score?

****That’s not to say we can’t love numbers, as Euler’s Identity inspires enormous joy in me, or that we can’t love fictional characters, or that we can’t love misrepresentations of real people that we fabricate in our imaginations. I’ve fallen obsessively in love with 3 or 4 imaginary men this year, creations of my imagination loosely inspired by the real people I thought I loved.

The image comes from this site, which analyzes themes in films by Darren Aronofsky. Maximilian Cohen, the protagonist of Pi, sees mathematical patterns all over the place, which eventually drives him to put a drill into his head. Aronofsky has a penchant for angst. Others, like Richard Feynman, find delight in exploring mathematical regularities in the world around us. Soap bubbles, for example, offer incredible complexity, if we’re curious enough to look.

The arabesques of a soap bubble