What makes a memory profound?

Not every event in your life has had profound significance for you. There are a few, however, that I would consider likely to have changed things for you, to have illuminated your path. Ordinarily, events that change our path are impersonal affairs, and yet are extremely personal. – Don Juan Matus, a (potentially fictional) Yaqui shaman from Mexico

The windowless classroom was dark. We were sitting around a rectangular table looking at a projection of Rembrandt’s Syndics of the Drapers’ Guild. Seated opposite the projector, I could see student faces punctuate the darkness, arching noses and blunt haircuts carving topography through the reddish glow.

“What do you see?”

Barbara Stafford’s voice had the crackly timbre of a Pablo Casals record and her burnt-orange hair was bi-toned like a Rothko painting. She wore downtown attire, suits far too elegant for campus with collars that added movement and texture to otherwise flat lines. We were in her Art History 101 seminar, an option for University of Chicago undergrads to satisfy a core arts & humanities requirement. Most of us were curious about art but wouldn’t major in art history; some wished they were elsewhere. Barbara knew this.

“A sort of darkness and suspicion,” offered one student.

“Smugness in the projection of power,” added another.

“But those are interpretations! What about the men that makes them look suspicious or smug? Start with concrete details. What do you see?”

No one spoke. For some reason this was really hard. It didn’t occur to anyone to say something as literal as “I see a group of men, most of whom have long, curly, light-brown hair, in black robes with wide-brimmed tall black hats sitting around a table draped with a red Persian rug in the daytime.” Too obvious, like posing a basic question about a math proof (where someone else inevitably poses the question and the professor inevitably remarks how great a question it is to our curious but proud dismay). We couldn’t see the painting because we were too busy searching for a way of seeing that would show others how smart we were.

“Katie, you’re our resident fashionista. What strikes you about their clothing?”

Adrenaline surged. I felt my face glow in the reddish hue of the projector, watched others’ faces turn to look at mine, felt a mixture of embarrassment at being tokenized as the student who cared most about clothes and appearance and pride that Barbara found something worth noticing, in particular given her own evident attention to style. Clothes weren’t just clothes for me: they were both art and protection. The prospect of wearing the same J Crew sweater or Seven jeans as another girl had been cruelly beaten out of me in seventh grade, when a queen mean girl snidely asked, in chemistry class, if I knew that she had worn the exact same salmon-colored Gap button-down crew neck cotton sweater, simply in the cream color, the day before. My mom had gotten me the sweater. All moms got their kids Gap sweaters in those days. The insinuation was preposterous but stung like a wasp: henceforth I felt a tinge of awkwardness upon noticing another woman wearing an article of clothing I owned. In those days I wore long ribbons in my ponytails to make my hair seem longer than it was, like extensions. I often wore scarves, having admired the elegance of Spanish women tucking silk scarves under propped collared shirts during my senior year of high school abroad in Burgos, Spain. Material hung everywhere around me. I liked how it moved in the wind and encircled me in the grace I feared I lacked.

“I guess the collars draw your attention. The three guys sitting down have longer collars. They look like bibs. The collar of the guy in the middle is tied tight, barely any space between the folds. A silver locket emerges from underneath. The collars of the two men to his left (and our right) billow more, they’re bunchy, as if those two weren’t so anal retentive when they get dressed in the morning. They also have kinder expressions, especially the guy directly to the left of the one in the center. And then it’s as if the collars of the men standing to the right had too much starch. They’re propped up and overly stiff, caricature stiff. You almost get the feeling Rembrandt added extra air to these puffed up collars to make a statement about the men having their portrait done. Like, someone who had taste and grace wouldn’t have a collar that was so visibly puffy and stiff. Also, the guy in the back doesn’t have a hat like the others.”

Barbara glowed. I’d given her something to work with, a constraint from which to create a world. I felt like I’d just finished a performance, felt the adrenaline subside as students turned their heads back to face the painting again, shifted their attention to the next question, the next comment, the next brush stroke in Syndics of the Drapers’ Guild.

After a few more turns goading students to describe the painting, Barbara stepped out of her role as Socrates and told us about the painting’s historical context. I don’t remember what she said or how she looked when she said it. I don’t remember every class with her. I do remember a homework assignment she gave inspired by André Breton’s objet trouvé, a surrealist technique designed to get outside our standard habits of perception, to let objects we wouldn’t normally see pop into our attention. I wrote about my roommate’s black high-heeled shoes and Barbara could tell I was reading Nietzsche’s Birth of Tragedy because I kept referencing Apollo and Dionysus, godheads for constructive reason and destructive passion, entropy pulling us ever to our demise.[1] I also remember a class where we studied Cindy Sherman photos, in particular her self-portraits as Caravaggio’s Bacchus and her film still from Hitchcock’s Vertigo. We took a trip to the Art Institute of Chicago and looked at a few paintings together. Barbara advised us never to use the handheld audio guides as they would pollute our vision. We had to learn how to trust ourselves and observe the world like scientists.

Cindy Sherman’s Untitled #224, styled after Caravaggio’s Bacchus
Cindy Sherman’s Untitled Film Still 21, styled after Hitchcock’s Vertigo

In the fourth paragraph of the bio on her personal website, Barbara says that “she likes to touch the earth without gloves.” She explains that this means she doesn’t just write about art and how we perceive images, but also “embodies her ideas in exhibitions.”

I interpret the sentence differently. To touch the earth without gloves is to see the details, to pull back the covers of intentionality and watch as if no one were watching. Arts and humanities departments are struggling to stay relevant in an age where we value computer science, mathematics, and engineering. But Barbara didn’t teach us about art. She taught us how to see, taught us how to make room for the phenomenon in front of us. Paintings like Rembrandt’s Syndics of the Drapers’ Guild were a convenient vehicle for training skills that can be transferred and used elsewhere, skills which, I’d argue, are not only relevant but essential to being strong leaders, exacting scientists, and respectful colleagues. No matter what field we work in, we must all work all the time to notice our cognitive biases, the ever-present mind ghosts that distort our vision. We must make room for observation. Encounter others as they are, hear them, remember their words, watch how their emotions speak through the slight curl of their lips and the upturned arch of their eyebrows. Great software needs more than just engineering and science: it needs designers who observe the world to identify features worth building.

I am indebted to Barbara for teaching me how to see. She is integral to the success I’ve had in my career in technology.

A picture that captures what I remember about Barbara

Of all the memories I could share about my college experience, why share this one? Why do I remember it so vividly? What makes this memory profound?

I recently read Carlos Castaneda’s The Active Side of Infinity and resonated with the book’s premise as “a collection of memorable events” Castaneda recounts as an exercise to become a warrior-traveler like the shamans who lived in Mexico in ancient times. Don Juan Matus, a (potentially fictional) Yaqui shaman who plays the character of Castaneda’s guru in most of his work, considers the album “an exercise in discipline and impartiality…an act of war.” On his first pass, Castaneda picks out memories he assumes should be important in shaping him as an individual, events like getting accepted to the anthropology program at UCLA or almost marrying Kay Condor. Don Juan dismisses them as “a pile of nonsense,” noting they are focused on his own emotions rather than being “impersonal affairs” that are nonetheless “extremely personal.”

The first story Castaneda tells that don Juan deems fit for a warrior-traveler is about Madame Ludmilla, “a round, short woman with bleached-blond hair…wearing a red silk robe with feathery, flouncy sleeves and red slippers with furry balls on top” who performs a grotesque strip tease called “figures in front of a mirror.” The visuals remind me of a dream sequence from a Fellini movie, filled with the voluptuousness of wrinkled skin and sagging breasts and the brute force of the carnivalesque. Castaneda’s writing is noticeably better when he starts telling Madame Ludmilla’s story: there’s more detail, more life. We can picture others, smell the putrid stench of dried vomit behind the bar, relive the event with Castaneda and recognize a truth in what he’s lived, not because we’ve had the exact same experience, but because we’ve experienced something similar enough to meet him in the overtones. “What makes [this story] different and memorable,” explains don Juan, “is that it touches every one of us human beings, not just you.”

This is how I imagined Madame Ludmilla, as depicted in Fellini’s 8 1/2. As don Juan says, we are all “senseless figures in front of a mirror.”

Don Juan calls this war because it requires discipline to see the world this way. Day in and day out, structures around us bid us to focus our attention on ourselves, to view the world through the prism of self-improvement and self-criticism: What do I want from this encounter? What does he think of me? When I took that action, did she react with admiration or contempt? Is she thinner than I am? Look at her thighs in those pants–if I keep eating desserts the way I do, my thighs will start to look like that too. I’ve fully adopted the growth mindset and am currently working on empathy: in that last encounter, I would only give myself a 4/10 on my empathy scale. But don’t you see that I’m an ESFJ? You have to understand my actions through the prism of my self-revealed personality guide! It’s as if we live in a self-development petri dish, where experiences with others are instruments and experiments to make us better. Everything we live, everyone we meet, and everything we remember gets distorted through a particular analytical prism: we don’t see and love others, we see them through the comparative machine of the pre-frontal cortex, comparing, contrasting, categorizing, evaluating them through the prism of how they help or hinder our ability to become the future self we aspire to become.

Warrior-travelers like don Juan fight against this tendency. Collecting an album of memorable events is an exercise in learning how to live differently, to change how we interpret our memories and first-person experiences. As non-warriors, we view memories as scars, events that shape our personality and make us who we are. As warriors, we view ourselves as instruments and vessels to perceive truths worth sharing, where events just so happen to happen to us so we can feel them deeply enough and experience the minute details required to share vivid details with others. Warriors are instruments of the universe, vessels for the universe to come to know itself. We can’t distort what others feel because we want them to like us or act a certain way because of us: we have to see others for who they are, make space for negative and positive emotions. What matters isn’t that we improve or succeed, but that we increase the range of what’s perceivable. Only then can we transmit information with the force required to heal or inspire. Only then are we fearless.

Don Juan’s ways of seeing and being weren’t all new to me (although there were some crazy ideas of viewing people as floating energy balls). There are sprinklings of my quest to live outside the self in many posts on the blog. Rather, The Active Side of Infinity helped me clarify why I share first-person stories in the first place. I don’t write to tell the world about myself or share experiences in an effort to shape my identity. This isn’t catharsis. I write to be a vessel, a warrior-traveler. To share what I felt and saw and smelled and touched as I lived experiences that I didn’t know would be important at the time but that have managed to stick around, like Argos, always coming back, somehow catalyzing feelings of love and gratitude as intense today as they were when I first experienced them. To use my experiences to illustrate things we are all likely to experience in some way or another. To turn memories into stories worth sharing, with details concrete enough that you, reader, can feel them, can relate to them, and understand a truth that, ill-defined and informal though it may be, is searing in its beauty.

This post features two excerpts from my warrior-traveler album, both from my time as an undergraduate at the University of Chicago. I ask myself: if I were speaking to someone for the first time and they asked me to tell them about myself, starting in college, would I share these memories? Likely not. But it’s worthwhile to wonder if doing so might change the world for the good.


When I attended the University of Chicago, very few professors gave students long reading assignments for the first class. Some would share a syllabus, others would circulate a few questions to get us thinking. No one except Loren Kruger expected us to read half of Anna Karenina and be prepared to discuss Tolstoy’s use of literary form to illustrate 19th-century Russian class structures and ideology.

Loren was tall and big boned. A South African, she once commented on J.M. Coetzee’s startling ability to wield power through silence. She shared his quiet intensity, demanded such rigor and precision in her own work that she couldn’t but demand it from others. The tiredness of the old world laced her eyes, but her work was about resistance; she wrote about Brecht breaking boundaries in theater, art as an iron-hot rod that could shed society’s tired skin and make room for something new. She thought email destroyed intimacy because the virtual distance emboldened students to reach out far more frequently than when they had to brave a face-to-face encounter. About fifteen students attended the first class. By the third class, there were only three of us. With two teaching assistants (a French speaker and a German speaker), the student:teacher ratio became one:one.[2]

A picture that captures what I remember about Loren

Loren intimidated me, too. The culture at the University of Chicago favored critical thinking and debate, so I never worried about whether my comments would offend others or come off as bitchy (at Stanford, sadly, this was often the case). I did worry about whether my ideas made sense. Being the most talkative student in a class of three meant I was constantly exposed in Loren’s class, subjecting myself to feedback and criticism. She criticized openly and copiously, pushing us for precision, depth, insight. It was tough love.

The first thing Loren taught me was the importance of providing concrete examples to test how well I understood a theory. We were reading Karl Marx, either The German Ideology or the first volume of Das Kapital.[3] I confidently answered Loren’s questions about the text, reshuffling Marx’s words or restating what he’d written in my own words. She then asked me to provide a real-world example of one of his theories. I was blank. Had no clue how to answer. I’d grown accustomed to thinking at a level of abstraction, riding text like a surfer rides the top of a wave without grounding the thoughts in particular examples my mind could concretely imagine.[4] The gap humbled me, changed how I test whether I understand something. This happens to be a critical skill in my current work in technology, given how much marketing and business language is high-level and general: teams think they are thinking the same thing, only to realize that with a little more detail they are totally misaligned.

We wrote midterm papers. I don’t remember what I wrote about but do remember opening the email with the grade and her comments, laptop propped on my knees and back resting against the powder-blue wall in my bedroom off the kitchen in the apartment on Woodlawn Avenue. B+. “You are capable of much more than this.” Up rang my old friend impostor syndrome: no, I’m not, what looks like eloquence in class is just a sham, she’s going to realize I’m not what she thinks I am, useless, stupid, I’ll never be able to translate what I can say into writing. I don’t know how. Tucked behind the fiddling furies whispered the faint voice of reason: You do remember that you wrote your paper in a few hours, right? That you were rushing around after the house was robbed for the second time and you had to move?

Before writing our final papers, we had to submit and receive feedback on a formal prospectus rather than just picking a topic. We’d read Frantz Fanon’s The Wretched of the Earth and I worked with Dustin (my personal TA) to craft a prospectus analyzing Gillo Pontecorvo’s Battle of Algiers in light of some of Fanon’s descriptions of the experience of colonialism.[7]

Once again, Loren critiqued it harshly. This time I panicked. I didn’t want to disappoint her again, didn’t want the paper to confirm to both of us that I was useless, incompetent, unable to distill my thinking into clear and cogent writing. The topic was new to me and out of my comfort zone: I wasn’t an expert in negritude or post-colonial critical theory. I wrote her a desperate email suggesting I write about Baudelaire and Adorno instead. I’d written many successful papers about French Romanticism and Symbolism and was on safer ground.

Ali La Pointe, the martyred revolutionary in The Battle of Algiers

Her response to my anxious plea was one of the more meaningful interactions I’ve ever had with a professor.

Katie, stop thinking about what you’re going to write and just write. You are spending far too much energy worrying about your topic and what you might or might not produce. I am more than confident you are capable of writing something marvelous about the subject you’ve chosen. You’ve demonstrated that to me over the quarter. My critiques of your prospectus were intended to help you refine your thinking, not push you to work on something else. Just work!

I smiled a sigh of relief. No professor had ever said that to me before. Loren had paid attention, noticed symptoms of anxiety but didn’t placate or coddle me. She remained tough because she believed I could improve. Braved the mania. This interaction has had a longer-lasting impact on me than anything I learned about the subject matter in her class. I can call it to mind today, in an entirely different context of activity, to galvanize myself to get started when I’m anxious about a project at work.

The happiest moments writing my final paper about the Battle of Algiers were the moments describing what I saw in the film. I love using words to replay sequences of stills, love interpreting how the placement of objects or people in a still creates an emotional effect. My knack for doing so stems back to what I learned in Art History 101. I think I got an A on the paper. I don’t remember or care. What stays with me is my gratitude to Loren for not letting me give up, and the clear evidence she cared enough about me to put in the work required to help me grow.


[1] This isn’t the first time things I learned in Barbara’s class have made it into my blog. The objet trouvé exercise inspired a former blog post.

[2] I ended up having my own private teaching assistant, a French PhD named Dustin. He told me any self-respecting comparative literature scholar could read and speak both French and German fluently, inspiring me to spend the following year in Germany.

[3] I picked up my copy of The Marx-Engels Reader (MER) to remember what text we read in Loren’s class. I first read other texts in the MER in Classics of Social and Political Thought, a social sciences survey course that I took to fulfill a core requirement (similar to Barbara’s Art History 101) my sophomore year. One thing that leads me to believe we read The German Ideology or volume one of Das Kapital in Loren’s class is the difference in my handwriting between years two and four of college. In year two, my handwriting still had round playfulness to it. The letters are young and joyful, but look like they took a long time to write. I remember noticing that my math professors all seemed to adopt a more compact and efficient font when they wrote proofs on the chalkboard: the a’s were totally sans-serif, loopless. Letters were small. They occupied little space and did what they could not to draw attention to themselves so the thinker could focus on the logic and ideas they represented. I liked those selfless a’s and deliberately changed my handwriting to imitate my math professors. The outcome shows in my MER. I apparently used to like check marks to signal something important: they show up next to straight lines illuminating passages to come back to. A few great notes in the margins are: “Hegelian–>Too preoccupied w/ spirit coming to itself at basis…remember we are in (in is circled) world of material” and “Inauthenticity–>Displacement of authentic action b/c always work for later (university/alienation w/ me?)”

[4] There has to be a ton of analytic philosophy ink spilled on this question, but it’s interesting to think about what kinds of thinking are advanced by pure formalisms that would be hampered by ties to concrete, imaginable referents and what kinds of thinking degrade into senseless mumbo jumbo without ties to concrete, imaginable referents. Marketing language and politically correct platitudes definitely fall into category two. One contemporary symptom of not knowing what one’s talking about is the abuse of the demonstrative adjective that. Interestingly enough, such demonstrative abusers never talk about thises, they only talk about thats. This may be used emphatically and demonstratively in a Twitter or Facebook conversation: when someone wholeheartedly supports a comment, critique, or example of some point, they’ll write This as a stand-alone sentence with super-demonstrative reference power, power strong enough to encompass the entire statement made before it. That’s actually ok. It’s referring to one thing, the thing stated just above it. It’s dramatic but points to something the listener/reader can also point to. The problem with the abused that is that it starts to refer to a general class of things that are assumed, in the context of the conversation, to have some mutually understood functional value: “To successfully negotiate the meeting, you have to have that presentation.” “Have that conversation: it’s the only way to support your D&I efforts!” Here, the listener cannot imagine any particular that that these words denote. The speaker is pointing to a class of objects she assumes the listener is also familiar with and agrees exist. A conversation about what? A presentation that looks like what? There are so many different kinds and qualities of conversations or presentations that could fit the bill. I hear this used all the time and cringe a little inside every time. I’m curious to know if others have the same reaction I do, or if I should update my grammar police to accept what has become common usage. Leibniz, on the other hand, was an early modern staunch defender of cogitatio caeca (Latin for blind thought), which referred to our ability to calculate and manipulate formal symbols and create truthful statements without requiring the halting step of imagining the concrete objects these symbols refer to. This, he argued against conservatives like Thomas Hobbes, was crucial to advance mathematics. There are structural similarities in the current debates about explainability of machine learning algorithms, even though that which is imagined or understood may lie on a different epistemological, ontological, and logical plane.

[5] People tell me that one reason they like my talks about machine learning is that I use a lot of examples to help them understand abstract concepts. Many talks are structured like this one, where I walk an audience through the decisions they would have to make as a cross-functional team collaborating on a machine learning application. The example comes from a project former colleagues worked on. I realized over the last couple of years that no matter how much I like public speaking, I am horrified by the prospect of specializing in speaking or thought leadership and not being actively engaged in the nitty-gritty, day-to-day work of building systems and observing first-person how people interact with them. I believe the existential horror stems from my deep-seated beliefs about language and communication, in my deep-seated discomfort with words that don’t refer to anything. Diving into this would be worthwhile: there’s a big difference between the fictional imagination, the ability to bring to life the concrete particularity of something or someone that doesn’t exist, and the vagueness of generalities lacking reference. The second does harm and breeds stereotypes. The first is not only potent in the realm of fiction, but, as my fiancé Mihnea is helping me understand, may well be one of the master skills of the entrepreneur and executive. Getting people aligned and galvanized around a vision can only occur if that vision is concrete, compelling, and believable. An imaginable state of the world we can all inhabit, even if it doesn’t exist yet. A tractable as if that has the power to influence what we do and how we behave today so as to encourage its creation and possibility.[6]

[6] I believe this is the first time I’ve had a footnote referring to another footnote (I did play around with writing an incorrigibly long photo caption in Analogue Repeaters). Funny this ties to the footnote just above (hello there, dear footnote!) and even funnier that footnote 4 is about demonstrative reference, including the this discursive reference. But it’s seriously another thought so I felt it merited its own footnote as opposed to being the second half of footnote 5. When I sat down to write this post, I originally planned to write about the curious and incredible potency of imagined future states as tools to direct action in the present. I’ve been thinking about this conceptual structure for a long time, having written about it in the context of seventeenth-century French philosophy, math, and literature in my dissertation. The structure has been around since the Greeks (Aristotle references it in Book III of the Nicomachean Ethics) and is used in startup culture today. I started writing a post on the topic in August, 2018. Here’s the text I found in the incomplete draft when I reopened it a few days ago:

A goal is a thinking tool.

A good goal motivates through structured rewards. It keeps people focused on an outcome, helps them prioritize actions and say no to things, and stretches them to work harder than they would otherwise. Wise people say that a good goal should be about 80% achievable. Wise leaders make time to reward and recognize inputs and outputs.

A great goal reframes what’s possible. It is a moonshot and requires the suspension of disbelief, the willingness to quiet all the we can’ts and believe something surreal, to sacrifice realism and make room for excellence. It assumes a future outcome that is so outlandish, so bold, that when you work backwards through the series of steps required to achieve it, you start to do great things you wouldn’t have done otherwise. Fools say that it doesn’t matter if you never come close to realizing a great goal, because the very act of supposing it could be possible and reorienting your compass has already resulted in concrete progress towards a slightly more reasonable but still way above average outcome.

Good goals create outcomes. Great goals create legacies.

This text alienates me. It reminds me of an inspirational business book: the syncopation and pace seem geared to stir pathos and excitement. How curious that the self evolves so quickly, that the I looking back on the same I’s creations of a few months ago feels like she is observing a stranger, someone speaking a different language and inhabiting a different world. But of course that’s the case. Of course being in a different environment shapes how one thinks and what one sees. And the lesson here is not one of fear around instability of character: it’s one that underlines the crucial importance of context, the crucial importance of taking care to select our surroundings so we fill our brains with thoughts and words that shape a world we find beautiful, a world we can call home. The other point of this footnote is a comment on the creative process. Readers may have noted the quotation from Pascal that accompanies all my posts: “The last thing one settles in writing a book is what one should put in first.” The joy of writing, for me, as for Mihnea and Kevin Kelly and many others, lies in unpacking an intuition, sitting down in front of a silent wall and a silent world to try to better understand something. I’m happiest when, writing fast, bad, and wrong to give my thoughts space to unfurl, I discover something I wouldn’t have discovered had I not written. Writing creates these thoughts. It’s possible they lie dormant with potential inside the dense snarl of an intuition and possible they wouldn’t have existed otherwise. Topic for another post. With this post, I originally intended to use the anecdote about Stafford’s class to show the importance of using concrete details, to illustrate how training in art history may actually be great training for the tasks of a leader and CEO. But as my mind circled around the structure that would make this kind of intro make sense, I was called to write about Castaneda, pulled there by my emotions and how meaningful these memories of Barbara and Loren felt. I changed the topic. Followed the path my emotions carved for me. The process was painful and anxiety-inducing. But it also felt like the kind of struggle I wanted to undertake and live through in the service of writing something worth reading, the purpose of my blog.

[7] About six months ago, I learned that an Algerian taxi driver in Montréal was the nephew of Ali La Pointe, the revolutionary martyr hero in Battle of Algiers. It’s possible he was lying, but he was delighted by the fact that I’d seen and loved the film and told me about the heroic deeds of another uncle who didn’t have the same iconic stardom as Ali. Later that evening I attended a dinner hosted by Element AI and couldn’t help but tell Yoshua Bengio about the incredible conversation I had in the taxi cab. He looked at me with confusion and discomfort, put somewhat out of place and mind by my not accommodating the customary rules of conversation with acquaintances.

The featured image is the Syndics of the Drapers’ Guild, which Rembrandt painted in 1662. The assembled drapers assess the quality of different weaves and cloths, presumably, here, assessing the quality of the red rug splayed over the table. In Ways of Seeing, John Berger writes about how oil paintings signified social status in the early modern period. Having your portrait done showed you’d made it, the way driving a Porsche around town would do so today. When I mentioned that the collars seemed a little out of place, Barbara Stafford found the detail relevant precisely because of the plausibility that Rembrandt was including hints of disdain and critique in the commissioned portraits, mocking both his subjects and his dependence on them to get by. 

The Poignancy of Growth

I don’t know if Andrei Fajardo knows that I will always remember and cherish our walk up and down University Avenue in Toronto a few months ago. Andrei was faced with a hard choice about his career. He was fortunate: both options were and would be wonderful. He teetered for a few weeks within the suspension of multiple possible worlds, channeling his imagination to feel what it would feel like to make choice one, to feel what it would feel like to live the life opened by choice two. He sought advice from others. He experimented with different kinds of decision-making frameworks to see how the frame of evaluation shaped and brought forth his values, curtailed or unfurled his imagination. He listened for omens and watched rain clouds. He broke down the factors of his decision analytically to rank and order and plunder. He spoke to friends. He silenced the distractions of family. He twisted and turned inside the gravity that only shines forth when it really matters, when the frame of identity we’ve cushioned ourselves within for the last little while starts to fray under the invitation of new possibilities. The world had presented him with its supreme and daunting gift: the poignancy of growth.

I’m grateful that Andrei asked me to be one partner to help him think about his decision. Our conversations transported me back, softly, to the thoughts and feelings and endless walks and desperate desire for the certainty I experienced in 2011 as I waded through months to decide to leave academia and pursue a career in the private sector. I wanted Andrei to understand that the most important lesson that experience taught me was about a “peculiar congenital blindness” we face when we make a hard choice:

To be human is to suffer from a peculiar congenital blindness: On the precipice of any great change, we can see with terrifying clarity the familiar firm footing we stand to lose, but we fill the abyss of the unfamiliar before us with dread at the potential loss rather than jubilation over the potential gain of gladnesses and gratifications we fail to envision because we haven’t yet experienced them.

When faced with the most transformative experiences, we are ill-equipped to even begin to imagine the nature and magnitude of the transformation — but we must again and again challenge ourselves to transcend this elemental failure of the imagination if we are to reap the rewards of any transformative experience. (Maria Popova in her marvelous Brain Pickings newsletter about L.A. Paul’s Transformative Experience)

I shared examples of my own failure of imagination to help Andrei understand the nature of his choice. For hard choices about our future aren’t rational. They don’t fit neatly into the analytical framework of lists and cross-examination and double-entry bookkeeping. It’s the peculiar poignancy of our existence as beings unfurling in time that makes it impossible for us to know who we will be and what knowledge the world will provide us through the slot canyon aperture of our particular experience, bounded by bodies and time.

A slot canyon I found in April at Arches National Park in Moab, Utah. It was near the marvelous pictographs, off the beaten path.

As Andrei toiled through his decision, he kept returning to a phrase he’d heard from Daphne Koller in her fireside chat with Yoshua Bengio at the 2018 ICLR conference in Vancouver. As he shared in this blog post, Daphne shared a powerful message: “Building a meaningful career as a scientist isn’t only about technical gymnastics; it’s about each person’s search to find and realize the irreplaceable impact they can have in our world.”

But, tragically or beautifully, depending on how you view it, there are many steps in our journey to realize what we believe to be our irreplaceable impact. Our understanding of what this could or should be can and should change as our slot canyon understanding of the world erodes just a little more under the weight of wind and rain to bring forth light from the sun. In my own experience, I never, ever imagined that just two years after believing I had made a binary decision to leave academia for industry, I would be invited to teach as an adjunct law professor, that three years later I would give guest lectures at business schools around the world, that five years later I would give guest lectures in ethics and philosophy. The past self had a curious capacity to remain intact, to come with me as a barnacle through the various transformations. For the world was bigger and vaster than the limitations my curtailed imagination had imposed on it.

Andrei decided to stay with our company. He is a marvelous colleague and mentor. He is a teacher: no matter where he goes and what he does, his irreplaceable impact will be to broaden the minds of others, to break down statistics and mathematics into crystal clear building blocks that any and all can understand. He’ll come to appreciate that he is a master communicator. I’m quite certain I’ll be there to watch.

What was most beautiful about our walk was the trust I had in Andrei and that he had in me. His awareness that I wanted what was best for him, that none of my comments were designed to manipulate him into staying if that weren’t what he, after his toil, after his reflection, decided was the path he wanted to explore. It was simply an opportunity to share stories and experiences, to provide him with a few different thinking tools he could try on for size as he made his decision. We punctuated our analysis with thoughts about the present. We deepened our connection. I gave him a copy of Rilke’s Letters to a Young Poet to help him come to know more of himself and the world. Throughout our walk, his energy was crystalline. He listened with attention rapt into the weight of it mattering. The attention that emerges when we are searching sponges sopping as much as we can from those we’ve come to trust. The air was chilled just enough to prickle goosebumps, but not so much as to need a sweater. The grass was green and the flowers had started to bud.

Yesterday was the first snow. There are still flowers; soon they’ll die. The leaves over Rosedale are yellow and red, made vibrant by the moisture. Andrei is with his dogs and his wife. I’ll see him tomorrow morning at work.


I found the featured image last weekend at the Thomson Landry Gallery in Toronto’s Distillery District. It’s a painting called “Choisir et Renoncer,” by Yoakim Bélanger. I see in it the migration of fragility, hands cradled open into reverent acceptance. I see in it the stone song of vulnerability: for it is the white figure–she who dared wade ankle-deep in Hades to hear Eurydice’s voice one more time–whose face glows brightest, who reveals the wrinkles of her character, who shines as a reflection of ourselves, unafraid to reveal her seashell cracks and the wisdom she acquired with the crabs. She etches herself into precision. She chisels brightly through the human haze of potential, buoyed upon the bronze haze of the self she once was, but yesterday. 

Tilting our heads to theta

I am forever indebted to Clyfe Beckwith, my high-school physics teacher. Not because he taught us mechanics, but because he used mechanics to teach us about the power and plasticity of conceptual frames.[1]

We were learning to think about things like energy and forces, about how objects move in space and on inclines and around curves (fear winces before tucking myself into a ball to go down a hill on cross-country skis are always accompanied by ironic inside-my-mind jabs yammering “your instincts suck–if your center of gravity is lower to the ground you’ll have an easier time combatting the conservation of angular momentum”). Learning to think about images like this one:

[Image: a block of mass m on a plane inclined at angle theta]

The task here is to decompose the various forces that act on the object, recompose them into some net force, and then use this net force to describe how the block will accelerate down the inclined plane.

I remember the first step to tally the forces was easy. Gravity will pull the block down because that’s what gravity does. Friction will slow the block down because that’s what friction does. And then there’s this force we think less about in our day-to-day lives called the normal force, a contact force that surfaces exert to prevent solid objects from passing through each other. The solids-declaring-their-identity-as-not-air-and-not-ghosts force. And that’s gonna live at the fault line between the objects, concentrating its force-hands into the object’s back like a lover.

The problem is that, because the plane is inclined and not flat, these forces have directionality. They don’t all move up and down. Some move up and down and others are on a diagonal.

I remember feeling flummoxed because it was impossible to get them to align nicely into the perpendicular x-y axes of the Cartesian plane. In my mind, given the math I’d studied up to that point (so, like, high-school algebra–I taught myself trigonometry over the summer but because I had taught myself never had the voiceover that this would be a tool for solving physics problems), Cartesian planes were things, objects as stiff as blocks, things that existed in one frame of reference, perfectly up and perfectly down, not a smidgen of Sharpie ink bleeding or wavering on the axis lines.

And then Clyfe Beckwith did something radical. He tilted the Cartesian plane so that it matched the angle of the incline.

I was like, you can’t do that.

He was like, why not?

I was like, because planes don’t tilt like that, because they exist in the rigid confines of up and down, of one single perpendicular, because that’s how the world works, because this is the static frame of reference I’ve learned about and used again and again and again to solve all these Euclidean geometry problems from freshman year, because Bill Scott, my freshman math teacher, is great, love the guy, and how can we question the mental habits he helped me build to solve those problems, because, wait, really, you can do that?

He was like, just try it.

I remember his Cheshire cat smile, illuminated by the joy of palpably watching a student’s mind expand before his eyes, palpably seeing a shift that was one small step for man but a giant step for Mankind, something as abstract as a block moving on an inclined plane teaching a lesson about the malleability of thinking tools.

We tried it. We tilted our heads to theta. We decomposed gravity into two vectors, one countering the normal force and the other moving along the plane. We empathized with the normal force, saw the world from its perspective, not our own. What was intractable became simple. The math became easy.
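For readers who want to see the arithmetic behind the tilt, here is a minimal sketch in Python. It is an illustration under stated assumptions, not what we wrote on the board in class: the function name `block_acceleration`, the friction coefficient `mu`, and the value of `G` are all hypothetical choices, and the block is assumed to be already sliding downhill. The idea is simply to project gravity onto two tilted axes, one running down the slope and one sticking straight out of it.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def dot(a, b):
    """Dot product of two 2D vectors given as (x, y) tuples."""
    return a[0] * b[0] + a[1] * b[1]


def block_acceleration(mass, theta, mu=0.0):
    """Acceleration of a block sliding down a plane inclined at theta radians.

    mu is a kinetic friction coefficient (hypothetical parameter).
    """
    # Gravity in the usual up-and-down frame: straight down.
    gravity = (0.0, -mass * G)

    # The tilted axes: one pointing down the slope, one pointing out of it.
    downhill = (math.cos(theta), -math.sin(theta))
    out_of_plane = (math.sin(theta), math.cos(theta))

    # Decompose gravity along the tilted axes.
    pull_along_slope = dot(gravity, downhill)      # equals m * g * sin(theta)
    push_into_slope = -dot(gravity, out_of_plane)  # equals m * g * cos(theta)

    # The normal force cancels the push into the slope; friction scales with it.
    normal_force = push_into_slope
    friction = mu * normal_force

    # Only the along-slope direction has a net force left over.
    return (pull_along_slope - friction) / mass


if __name__ == "__main__":
    print(block_acceleration(2.0, math.radians(30)))
    print(block_acceleration(2.0, math.radians(30), mu=0.2))
```

On a frictionless 30-degree incline this gives about 4.9 m/s^2, half of g, whatever the mass. In the tilted frame the whole problem collapses to one line of subtraction, which is the point of tilting.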

This may seem obtuse and irrelevant. Stuff you haven’t thought about since high school. Forget about the physics. Focus on tilting your head to theta. Focus on the fact that there is nothing native or natural about assuming that Cartesian planes exist in some idealized perpendicular realm that must always start from up and down, that this disposition is the side effect of drawing them on a piece of paper oriented towards our own front-facing perspective. That it’s possible to empathize with the perspective of a block of mass m at a spot on a triangle tilted at angle theta to make it easier to solve a problem. And that this problem itself is only an approximation upon an approximation upon an approximation, gravity and friction acting at some idealized center to make it easier to do the math, integrating all the infinite smaller forces up through some idealized continuum. Stuff they don’t say much about in high school as that’s the time for building problem-solving muscles. Saving the dissolution for the appropriate moment.


Reading Carlo Rovelli’s The Order of Time pulled this memory out of storage. In his stunningly lyrical book, Rovelli explains how many physicists (with a few variations depending on theory) conceptualize quantum time as radically different from the linear, forward-marching-so-that-the-past-matters-our-actions-matter-choices-are-engraved-and-shape-the-landscape-of-future-possibilities-and-make-morality-possible flow of time we experience in our subjectivity. When we get quantum, explains Rovelli, the difference between past and future blurs, phenomena appear that, like the ever-flitting molecules in a glass of water, seem static when viewed through the blunt approximations our eyes provide us so we can navigate the world without going insane from a thermodynamic onslaught.

One of the heroes in Rovelli’s book is Ludwig Boltzmann, who used probability to show that systems tend towards disorder rather than order (entropy) because there are many more possible disordered than ordered configurations of a system. Order, like Reality, is just one improbability amidst indefinite possibles. The equation on his gravestone, S = k log W, shows the relationship between entropy and the number of ways the atoms of a system can be arranged. Boltzmann’s work using probability distributions to describe the states of particles in a system was the grandfather to contemporary neural networks (reflected in the name Boltzmann Machine). Someday, someone has to write the book about the various ways in which machine learning is indebted to 19th-century physics (see also Hamiltonian Monte Carlo, amidst many others).
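To make the counting argument concrete, here is a small Python sketch. It is a toy model I am assuming for illustration, not anything from Rovelli's book: a box of distinguishable particles, each sitting in either the left or the right half, where W is the number of ways to get a given left/right split. The function names `microstates` and `entropy` are hypothetical, and the log in S = k log W is taken as the natural logarithm.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in joules per kelvin


def microstates(n_particles, n_left):
    """Number of ways to choose which n_left of n_particles sit in the left half."""
    return math.comb(n_particles, n_left)


def entropy(w):
    """Boltzmann entropy S = k log W for W equally likely arrangements."""
    return K_B * math.log(w)


if __name__ == "__main__":
    n = 100
    for n_left in (0, 10, 25, 50):
        w = microstates(n, n_left)
        print(n_left, w, entropy(w))
```

With 100 particles there is exactly one way to have them all on one side (so the entropy is zero), and roughly 10^29 ways to have them evenly split. Disorder wins by sheer headcount.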

Rovelli reminds his reader that the idea of an absolute, independent variable time (which shows up in equations as a little t)–time independent of space, time outside the entangled space waltz of Wall-E and Eva, lovers saddened by age’s syncopation, death puncturing our emotional equilibrium to cleave entropy into the pits of our emotions, fraying the trampoline pad into the kinetic energy of springs–that this time is but a mental fiction Newton created within the framework of his theory of the universe, but that the mental landscape of this giant standing upon the shoulders of giants became so prominent in our culture that his view of the world became the view we all inherit and learn when we go to grammar school, became the way the world seems to be, became a fixture as seemingly natural as our experience that the Sun revolves around the Earth, and not otherwise. That the ideas are so ingrained that it takes patience and an open mind to notice that Aristotle talks about time much differently, that, for Aristotle, time was an index of the change relative to two objects. 
Consider the layers of complexity: how careful we have to be not to read an ancient text with the tinted glasses of today’s concepts, how open and imaginative we must remain to try to understand the text relative to its time as opposed to critiquing the arguments with our own baggage and assumptions; how incredible it is to think that this notion of absolute time, so ingrained that it felt like blasphemy to tilt the Cartesian plane to theta, is but the inheritance of Newton’s ideas, rejected by Leibniz as fitfully as Newton’s crazy idea that objects could influence one another from a distance[2]; how liberating to return to relativity, to experience the universe without the static restrictions of absolute time, to free even the self from the heavy shackles of personhood and identity and recognize we are nothing but socialite Bedouins migrating through the parameters of personality as we mirror approval and flee frowns, the self become array, traits and feelings and thoughts activated through contact with a lover, a family, a team, an organization, a city, a nation, all these concentric circles pitching abstractions that dissolve into the sweat beads of an embrace, of attention swallowed intact by the heartbeat of another.

But it’s not just mechanics. It’s not just time.

It can be the carapace we inherit as we hurt our way through childhood and adolescence, little protective walls of habitual reactions that ossify into detrimental and useless thoughts like the cobwebbed inheritance of celibacy.[3] It can be reactions to situations and events that don’t serve us, that keep us back, anxiety that silences us from speaking up, that fears judgment if we ask a question when we don’t fully understand, interpretations of events in the narratives we’ve constructed to weave our way through life that, importantly, we can choose to keep or discard.

Perhaps the most liberating aspect of Buddhist philosophy is the idea that we have an organ that senses thoughts just like we have organs that sense smells, sounds, tastes, and touch.[4] Separating the self from the stuff of thoughts and emotions is not trivial. We lose the anchor of the noun, abandon the grounding of some thing, no matter how abstract and fleeting thoughts or the pulse of emotions may be (or perhaps they’re not so fleeting? perhaps the incantatory habit of neural patterns in our brains is where we clamp on to remain steadfast amidst rough waves beating life shores?). But this notion of the self as thought is also a conceptual relic we can break, the perpendicular plane we can migrate to theta to help us solve problems and empathize more deeply with another.

There is a calm that arises when we abstract away from the seat of identity as a self separate from the world and up into a view of self as part of the world, as one with the coffee cup accompanying me as I write this post, one with the sun rising into yellow, shifting shades in the summer morning, one with the plastic armchairs and the basil plant, one with my partner as he sleeps thousands of miles away in San Francisco, as I dream with him, recovering legs intertwined in beds and on camp grounds. One with colleagues. One with homeless strangers on the street, one with their pain, their cold, the dirt washed from their face and feet when they shower.

The conceptual carapaces that bind us aren’t required, are often just inherited abstractions from the past. We can discard them if they don’t serve us. We have the ability, always, to tilt our head to theta and breathe into the rhythms of the world.


[1] Clyfe radiates kindness. He has that special George Smith superpower of identifying where he can stretch his students just beyond what they believe they’re capable of while imbuing them with the confidence they need to succeed. Clyfe also found it hilarious that I took intro to cosmology, basically about memorizing constellations, after taking AP physics, which was pretty challenging. But I loved schematizing the sky! It added wonder and nostalgia to the science. I gifted Clyfe The Hobbit once, for his sons. I picked that book because my dad loves Tolkien and reading The Hobbit and The Lord of the Rings were rites of passage when I was a child, so I figured they should be for every child. I guess that means there was something in the way Clyfe behaved that pegged him as a great dad in my mind. Every few years, my dad gets excited about the possibility that he may have finally forgotten details in The Lord of the Rings, eager to reread it and discover it as if it were for the first time. But he’s consistently disappointed. He and my brother have astonishing recall; it’s as if every detail of a book or movie has been branded into their brains indefinitely. My brother can query his Alexandrian brain database of movie references to build social bridges, connecting with people through the common token of some reference as opposed to divulging personal anecdote. My mom and I are the opposite. I remember abstractions in my Peircean plague of being a mere table of contents alongside the vivid, vital particularity of narrative minds like that of my colleague Tyler Schnoebelen. When I think back to my freshman year in college, I clearly remember the layout of arguments in Kant’s First Critique and clearly remember the first time I knowingly proved something using induction, and while I clearly remember that I read Anna Karenina and The Death of Ivan Ilyich that year (apparently was in a Tolstoy kind of mood), I cannot remember the details of those narratives.
Instead, I recall the emotions I experienced while reading them, as if they get distilled to their essence, the details of what happens to the characters resolved to the moral like in a fable by Aesop. Except not quite. It’s more that the novels transform into signposts for my own narrative, my own life. Portals that open my memory to recover happenings I haven’t recollected for years, 15 years, 20 years, the white-gloved hand of memory swiping off attic dust to unveil grandma’s doilies or gold-rimmed art deco china. My mind pronounces “the Death of Ivan Ilyich” and what comes forth are the feelings of shyness, of rugged anticipation, as I walked up Rush Street in Chicago floating on a cloud of dopamine brimming with anticipatory possibility, on my way to meet the young man who would become my first real boyfriend, hormones coaxed by how I saw him, how he was much more than a person, how something about how he looked and talked and acted elevated him to the perfect boyfriend, which, of course, gave me solace that I was living life as I was supposed to. My memory not recall, but windows with Dali curtains that open and shut into arabesques of identity (see featured image), cards shuffling glimpses of clubs and spades as they disappear, just barely observed, back into the stack. Possibility not actually reserved for the future, but waiting under war helmets in trench wormholes ready for new interpretation. 
For yes, there was another guy I didn’t decide to date, the guy who told me I looked like Audrey Hepburn (not because I do, but because I do just enough and want to so evidently enough that the flattery makes sense) when he took me on our first date at the top of the Hancock Tower, a high-in-the-sky type of bar, full of women with hair stiffened with hairspray and almost acrylic polyester dresses that were made today but always remind you of Scarface or Miami Vice, with the particular smell only high-in-the-sky bars have, stale in a way that’s different from the acrid smell of dive bars that haven’t been cleaned well, maybe it’s the sauce they use on the $50 lamb chops, the jellied mint, the shrimp cocktails alongside the gin, or the cleaning agent reserved exclusively for high-in-the-sky dance halls or Vegas hotels. It could have been different, but it became what it was because Morgan more closely aligned with my mental model of the perfect boyfriend. We’re still friends. He seems quite happy and his wife seems to be thriving.

[2] Don’t get me wrong: while Leibniz rejected the idea of action at a distance, much of his philosophy and metaphysics are closer to 20th-century philosophy and physics than those of the other early-modern heavyweights (Descartes, Spinoza, Pascal, etc.). He was the predecessor of what would become possible-worlds theory, developing a logic and metaphysics rooted in probability, not denotative reality. He loved combinatorics and wanted to create a universal, formal language rooted in math (his universal characteristic). The portion of his philosophy I love most is his attempt to reconcile free will and determinism using our conceptual limitations, rooting his metaphor in calculus. When we do calculus, we can’t perceive how a limit converges to some digit. Our existence in time, argues Leibniz, is similar: we’re stuck in the approximation of the approximation, assessing local minima, while God sees the entire function, appreciating how some action or event that may seem negative in the moment can lead to the greatest good in the future.

[3] It seems like downright sound logic to me that celibacy made economic sense in feudalism, given the structure of dowries and property (about which I don’t know the details) but was conceptualized using abstract arguments about reserving love for God alone, and that this abstraction survived even after the economic rationale disappeared into the folds of capitalism, only to create massive issues down the line because it’s not a human way to be, our bodies aren’t made only to love God, we can’t clip our sexuality at the seams, so pathologies arise. It’s not a popular position to pardon the individual and blame the system; but every individual merits compassion.

[4] Something about the cadence of the sentence made me want to leave out sight.

The featured image is Figura en una finestra, painted by Salvador Dalí in 1925. Today it lives in the Reina Sofía. How very not Dalí, right? And yet it displays the same mastery of precise technique we see in the hallmarks of surrealism. The nuances of this painting are so suggestive and evocative, so rich with basic meaning, nothing religious, nothing allegorical, just the taut concentration of recognition, of our seeing this girl at a moment in time. The white scarf draping in such a way that it suggests the moments prior to the painting, when she walked up troubled by something that happened to her and placed it there, with care, with gentleness, as she settled into her forlorn gaze. Her thoughts protected by her directionality, all that’s available to us the balance of her weight on her left foot, her buttocks reflecting the weight distribution. The curtains above the left window somehow imbued with her emotions, their rustle so evidently spurred not only by wind but by the extension of her mood, their shape our only access to the inside of her mind, her gaze facing outwards. I love it. She gains power because she is so inaccessible. The painting holds the embryo of surrealism in it because of what I just wrote, because it invites us to read more into the images than what’s there in oil paint and textures and lines.

 

Artificial Intelligence and the Fall of Eve

We seem to need foundational narratives.

Big picture stories that make sense of history’s bacchanal march into the apocalypse.

Broad-stroke predictions about how artificial intelligence (AI) will shape the future of humanity made by those with power arising from knowledge, money, and/or social capital.[1] Knowledge, as there still aren’t actually that many real-deal machine learning researchers in the world (despite the startling growth in paper submissions to conferences like NIPS), people who get excited by linear algebra in high-dimensional spaces (the backbone of deep learning) or the patient cataloguing of assumptions required to justify a jump from observation to inference.[2] Money, as income inequality is a very real thing (and a thing too complex to say anything meaningful about in this post). For our purposes, money is a rhetoric amplifier, be that from a naive fetishism of meritocracy, where we mistakenly align wealth with the ability to figure things out better than the rest of us,[3] or cynical acceptance of the fact that rich people work in private organizations or public institutions with a scope that impacts a lot of people. Social capital, as our contemporary Delphic oracles spread wisdom through social networks, likes and retweets governing what we see and influencing how we see (if many people, in particular those we want to think like and be like, like something, we’ll want to like it too), our critical faculties on amphetamines as thoughtful consideration and deliberation mean missing the boat, gut invective the only response fast enough to keep pace before the opportunity to get a few more followers passes us by, Delphi sprouting boredom like a 5 o’clock shadow, already on to the next big thing. Ironic that big picture narratives must be made so hastily in the rat race to win mindshare before another member of the Trump administration gets fired.

Most foundational narratives about the future of AI rest upon an implicit hierarchy of being that has been around for a long time. While proffered by futurists and atheists, the hierarchy dates back to the Great Chain of Being that medieval Christian theologians like Thomas Aquinas built to cut the physical and spiritual world into analytical pieces, applying Aristotelian scientific rigor to spiritual topics.

Aquinas’ hierarchy of being on a blog by a fellow named David Haines I know nothing about but that seems to be about philosophy and religion.

The hierarchy provides a scale from inanimate matter to immaterial, pure intelligence. Rocks don’t get much love on the great chain of being, even if they carry the wisdom and resilience of millions of years of existence, contain, in their sifting shifting grains of sand, the secrets of fragility and the whispered traces of tectonic plates and sunken shores. Plants get a little more love than rocks, and apparently Venus fly traps (plants that resemble animals?) get more love than, say, yeast (if you’re a fellow member of the microbiome-issue club, you like me are in total awe of how yeast are opportunistic sons of bitches who sense the slightest shift in pH and invade vulnerable tissue with the collective force of stealth guerrilla warriors). Humans are hybrids, half animal, half rational spirit, our sordid materiality, our silly mortality, our mechanical bodies ever weighing us down and holding us back from our real potential as brains in vats or consciousnesses encoded to live forever in the flitting electrons of the digital universe. There are a shit ton of angels. Way more angel castes than people castes. It feels repugnant to demarcate people into classes, so why not project differences we live day in and day out in social interactions onto angels instead? And, in doing so, basically situate civilized aristocrats as closer to God than the lower and more animalistic members of the human race?
And then God is the abstract patriarch on top of it all, the omnipotent, omniscient, benevolent patriarch who is also the seat of all our logical paradoxes, made of the same stuff as Gödel’s incompleteness theorem, the guy who can be at once father and son, be the circle with the center everywhere and the circumference nowhere, the master narrator who says, don’t worry, I got this, sure that hurricane killed tons of people, sure it seems strange that you can just walk into a store around the corner and buy a gun and there are mass shootings all the time, but trust me, if you could see the big picture like I see the big picture, you’d get how this confusing pain will actually result in the greatest good to the most people.

Sandstone in southern Utah, the momentary, coincidental dance of wind and grain petrified into this shape at this moment in time. I’m sure it’s already somewhat different.

I’m going to be sloppy here and not provide hyperlinks to specific podcasts or articles that endorse variations of this hierarchy of being: hopefully you’ve read a lot of these and will have sparks of recognition with my broad-stroke picture painting.[4] But what I see time and again are narratives that depict AI within a long history of evolution moving from unicellular prokaryotes to eukaryotes to slime to plants to animals to chimps to homo erectus to homo sapiens to transhuman superintelligence as our technology changes ever more quickly and we have a parallel data world where we leave traces of every activity in sensors and clicks and words and recordings and images and all the things. These big picture narratives focus on the pre-frontal cortex as the crowning achievement of evolution, man distinguished from everything else by his ability to reason, to plan, to overcome the rugged tug of instinct and delay gratification until the future, to make guesses about the probability that something might come to pass in the future and to act in alignment with those guesses to optimize rewards, often rewards focused on self gain and sometimes on good across a community (with variations). And the big thing in this moment of evolution with AI is that things are folding in on themselves, we no longer need to explicitly program tools to do things, we just store all of human history and knowledge on the internet and allow optimization machines to optimize, reconfiguring data into information and insight and action and getting feedback on these actions from the world according to the parameters and structure of some defined task.
And some people (e.g., Gary Marcus or Judea Pearl) say no, no, these bottom up stats are not enough, we are forgetting what is actually the real hallmark of our pre-frontal cortex, our ability to infer causal relationships between phenomena A and phenomena B, and it is through this appreciation of explanation and cause that we can intervene and shape the world to our ends or even fix injustices, free ourselves from the messy social structures of the past and open up the ability to exercise normative agency together in the future (I’m actually in favor of this kind of thinking). So we evolve, evolve, make our evolution faster with our technology, cut our genes crisply and engineer ourselves to be smarter. And we transcend the limitations of bodies trapped in time, transcend death, become angel as our consciousness is stored in the quick complexity of hardware finally able to capture plastic parallel processes like brains. And inch one step further towards godliness, ascending the hierarchy of being. Freeing ourselves. Expanding. Conquering the march of history, conquering death with blood transfusions from beautiful boys, like vampires. Optimizing every single action to control our future fate, living our lives with the elegance of machines.

It’s an old story.

Many science fiction novels feel as epic as Disney movies because they adapt the narrative scaffold of traditional epics dating back to Homer’s Iliad and Odyssey and Virgil’s Aeneid. And one epic quite relevant for this type of big picture narrative about AI is John Milton’s Paradise Lost, the epic to end all epics, the swan song that signaled the shift to the novel, the fusion of Genesis and Rome, an encyclopedia of seventeenth-century scientific thought and political critique as the British monarchy collapsed under the rushing sword of Oliver Cromwell.

Most relevant is how Milton depicts the fall of Eve.

Milton lays the groundwork for Eve’s fall in Book Five, when the archangel Raphael visits his friend Adam to tell him about the structure of the universe. Raphael has read his Aquinas: like proponents of superintelligence, he endorses the great chain of being. Here’s his response to Adam when the “Patriarch of mankind” offers the angel mere human food:

Adam, one Almightie is, from whom
All things proceed, and up to him return,
If not deprav’d from good, created all
Such to perfection, one first matter all,
Indu’d with various forms various degrees
Of substance, and in things that live, of life;
But more refin’d, more spiritous, and pure,
As neerer to him plac’t or neerer tending
Each in thir several active Sphears assignd,
Till body up to spirit work, in bounds
Proportiond to each kind.  So from the root
Springs lighter the green stalk, from thence the leaves
More aerie, last the bright consummate floure
Spirits odorous breathes: flours and thir fruit
Mans nourishment, by gradual scale sublim’d
To vital Spirits aspire, to animal,
To intellectual, give both life and sense,
Fansie and understanding, whence the Soule
Reason receives, and reason is her being,
Discursive, or Intuitive; discourse
Is oftest yours, the latter most is ours,
Differing but in degree, of kind the same.

Raphael basically charts the great chain of being in the passage. Angels think faster than people, they reason in intuitions while we have to break things down analytically to have any hope of communicating with one another and collaborating. Daniel Kahneman’s partition between discursive and intuitive thought in Thinking, Fast and Slow had an analogue in the seventeenth century, where philosophers distinguished the slow, composite, discursive knowledge available in geometry and math proofs from the fast, intuitive, social insights that enabled some to size up a room and be the wittiest guest at a cocktail party.

Raphael explains to Adam that, through patient, diligent reasoning and exploration, he and Eve will come to be more like angels, gradually scaling the hierarchy of being to ennoble themselves. But on the condition that they follow the one commandment never to eat the fruit from the forbidden tree, a rule that escapes reason, that is a dictum intended to remain unexplained, a test of obedience.

But Eve is more curious than that and Satan uses her curiosity to his advantage. In Book Nine, Milton fashions Satan in his trappings as snake as a master orator who preys upon Eve’s curiosity to persuade her to eat of the forbidden fruit. After failing to exploit her vanity, he changes strategies and exploits her desire for knowledge, basing his argument on an analogy up the great chain of being:

O Sacred, Wise, and Wisdom-giving Plant,
Mother of Science, Now I feel thy Power
Within me cleere, not onely to discerne
Things in thir Causes, but to trace the wayes
Of highest Agents, deemd however wise.
Queen of this Universe, doe not believe
Those rigid threats of Death; ye shall not Die:
How should ye? by the Fruit? it gives you Life
To Knowledge? By the Threatner, look on mee,
Mee who have touch’d and tasted, yet both live,
And life more perfet have attaind then Fate
Meant mee, by ventring higher then my Lot.
That ye should be as Gods, since I as Man,
Internal Man, is but proportion meet,
I of brute human, yee of human Gods.
So ye shall die perhaps, by putting off
Human, to put on Gods, death to be wisht,
Though threat’nd, which no worse then this can bring.

 

Satan exploits Eve’s mental model of the great chain of being to tempt her to eat the forbidden apple. Mere animals, snakes can’t talk. A talking snake, therefore, must have done something to cheat the great chain of being, to elevate itself to the status of man. So too, argues Satan, can Eve shortcut her growth from man to angel by eating the forbidden fruit. The fall of mankind rests upon our propensity to rely on analogy. May the defenders of causal inference rejoice.[5]

The point is that we’ve had a complex relationship with our own rationality for a long time. That Judeo-Christian thought has a particular way of personifying the artifacts and precipitates of abstract thoughts into moral systems. That, since the scientific revolution, science and religion have split from one another but continue to cross paths, if only because they both rest, as Carlo Rovelli so beautifully expounds in his lyrical prose, on our wonder, on our drive to go beyond the immediately visible, on our desire to understand the world, on our need for connection, community, and love.

But do we want to limit our imaginations to such a stale hierarchy of being? Why not be bolder and more futuristic? Why not forget gods and angels and, instead, recognize these abstract precipitates as the byproducts of cognition? Why not open our imaginations to appreciate the radically different intelligence of plants and rocks, the mysterious capabilities of photosynthesis that can make matter from sun and water (WTF?!?), the communication that occurs in the deep roots of trees, the eyesight that octopuses have all down their arms, the silent and chameleon wisdom of the slit canyons in the southwest? Why not challenge ourselves to greater empathy, to the unique beauty available to beings who die, capsized by senescence and always inclining forward in time?

My mom got me herbs for my birthday. They were little tiny things, and now they look like this! Some of my favorite people working on artistic applications of AI consider tuning hyperparameters to be an act akin to pruning plants in a garden. An act of care and love.

Why not free ourselves of the need for big picture narratives and celebrate the fact that the future is far more complex than we’ll ever be able to predict?

How can we do this morally? How can we abandon ourselves to what will come and retain responsibility? What might we build if we mimic animal superintelligence instead of getting stuck in history’s linear march of progress?

I believe there would be beauty. And wild inspiration.


[1] This note should have been after the first sentence, but I wanted to preserve the rhetorical force of the bare sentences. My friend Stephanie Schmidt, a professor at SUNY Buffalo, uses the concept of foundational narratives extensively in her work about colonialism. She focuses on how cultures subjugated to colonial power assimilate and subvert the narratives imposed upon them.

[2] Yesterday I had the pleasure of hearing a talk by the always-inspiring Martin Snelgrove about how to design hardware to reduce energy when using trained algorithms to execute predictions in production machine learning. The basic operations undergirding machine learning are addition and multiplication: we’d assume multiplying takes more energy than adding, because multiplying is adding in sequence. But Martin showed how it all boils down to how far electrons need to travel. The broad-stroke narrative behind why GPUs are better for deep learning is that they shuffle electrons around criss-cross structures that look like matrices as opposed to putting them into the linear straitjacket of the CPU. But the geometry can get more fine-grained and complex, as the 256×256 array in Google’s TPU shows. I’m keen to dig into the most elegant geometry for designing for Bayesian inference and sampling from posterior distributions.

[3] Technology culture loves to fetishize failure. Jeremy Epstein helped me realize that failure is only fun if it’s the mid point of a narrative that leads to a turn of events ending with triumphant success. This is complex. I believe in growth mindsets like Ray Dalio proposes in his Principles: there is real, transformative power in shifting how our minds interpret the discomfort that accompanies learning or stretching oneself to do something not yet mastered. I jump with joy at the opportunity to transform the paralyzing energy of anxiety into the empowering energy of growth, and believe it’s critical that more women adopt this mindset so they don’t hold themselves back from positions they don’t believe they are qualified for. Also, it makes total sense that we learn much, much more from failures than we do from successes, in science, where it’s important to falsify, as in any endeavor where we have motivation to change something and grow. I guess what’s important here is that we don’t reduce our empathy for the very real pain of being in the midst of failure, of feeling like one doesn’t have what others have, of being outside the comfort of the bell curve, of the time it takes to outgrow the inheritance and pressure from the last generation and the celebrations of success. Worth exploring.

[4] One is from Tim Urban, as in this Google Talk about superintelligence. I really, really like Urban’s blog. His recent post about choosing a career is remarkably good and his TED talk on procrastination is one of my favorite things on the internet. But his big picture narrative about AI irks me.

[5] Milton actually wrote a book about logic and was even a logic tutor. It’s at once incredibly boring and incredibly interesting stuff.

The featured image is the 1808 Butts Set version of William Blake’s “Satan Watching the Endearments of Adam and Eve.” Blake illustrated many of Milton’s works and illustrated Paradise Lost three times, commissioned by three different patrons. The color scheme is slightly different between the Thomas, Butts, and Linnell illustration sets. I prefer the Butts. I love this image. In it, I see Adam relegated to a supporting actor, a prop like a lamp sitting stage left to illuminate the real action between Satan and Eve. I feel empathy for Satan, want to ease his loneliness and forgive him for his unbridled ambition, as he hurls himself tragically into the figure of the serpent to seduce Eve. I identify with Eve, identify with her desire for more, see through her eyes as they look beyond the immediacy of the sexual act and search for transcendence, the temptation that ultimately leads to her fall. The pain we all go through as we wise into acceptance, and learn how to love.

Blake’s image reminds me of this masterful kissing scene in Antonioni’s L’Avventura (1960). The scene focuses on Monica Vitti, rendering Gabriele Ferzetti an instrument for her pleasure and her interior movement between resistance and indulgence. Antonioni takes the ossified paradigm of the male gaze and pushes it, exposing how culture can suffocate instinct as we observe Vitti abandon herself momentarily to her desire.

Details that Make a Difference

George E. Smith may be the best kept secret in academia.

The words aren’t mine: they belong to Daniel Dennett.* He said them yesterday at On the Question of Evidence, a conference Tufts University hosted to celebrate George’s life and work. George sat in the second row during the day’s presentations. I watched him listen attentively, every once in a while bowing his head the way he does when he gets emotional, humbly displacing praise to another giant upon whose shoulders he claims to have stood. I. Bernard Cohen, Ken Wilson, Curtis Wilson, Tom Whiteside (I remember George’s anecdote about meeting Whiteside in a bookstore and saying he genuflected before his brilliant scholarship on Newtonian mathematics). He unabashedly had the first word after every talk, most of the time articulating an anecdote (anyone who knows George knows that he likes to tell stories) about how someone else taught him an idea or taking the opportunity to articulate some crisp, crucial maxim in philosophy of science. Perhaps my favourite part of the entire day was listening to him interrupt Dan’s story about how they jointly founded the computer science program at Tufts, back in the days of Minsky and Good ol’ fashioned AI**, George stepping in to provide additional details about their random colour pixel generation program (random until the output looked a little too much like plaid), recollecting dates and details with ludicrous precision, as is his capability and wont.

Almost every conference attendee had at least one thing in common: we’d taken George’s Newton seminar, unique in its kind, one of the most complete, erudite, stimulating academic experiences in the world today. Some conference attendees, like Eric Schliesser and Bill Bradley, took the seminar back in its infancy in the 80s and early 90s. I took it with Kant scholar Michael Friedman at Stanford in 2009. Each year George builds on the curriculum, collaborating with students on some open research question, only to incorporate new learnings into next year’s (or some future year’s) curriculum. Perhaps the hallmark of a truly significant thinker is that her work is as rich and complex as the natural world, containing second-order ideas like second-order phenomena, phenomena that no one observes–or even can observe!–the first time they look. Details masked behind more dominant regularities, but that, in time and through the gradual and patient process of measurement, observation, and research, become visible through the mediation of theory. Theories as anchors to see the invisible.

This recursive process of knowledge and growth isn’t unique to George’s teaching. It characterizes his seminal contribution to the philosophy of science.

Many of the conference speakers drew from George’s stunning 2014 article Closing the Loop. The article is the culmination of 20 years of work studying how Newton changed standards for high-quality evidence. We often assume that Newton’s method is hypothetico-deductive, where the reasoning structure is to formulate a hypothesis that could be falsified by a test on observable data, and collect observations to either falsify (if observations disagree with what the hypothesis predicts) or corroborate (if observations agree) a theory.*** George thinks that Newton is up to something different. He cites Newton’s “Copernican scholium,” where Newton states that, given the complexities of forces that determine planetary motion, considering “simultaneously all these causes of motion and [defining] these motions by exact laws admitting of easy calculation exceeds…the force of any human mind.” George writes:

The complexity of true motions was always going to leave room for competing theories if only because the true motions were always going to be beyond precise description, and hence there could always be multiple theories agreeing with observation to any given level of approximation. On my reading, the Principia is one sustained response to this evidence problem.

George goes on to argue that Newtonian idealizations, the simplified, geometric theories that predict how a particular system of bodies would behave under specifiable circumstances, aren’t theories to align directly with observations, but thinking tools, counterfactuals that predict how the world would behave if it were only governed by a few forces (e.g., a system only subject to gravity versus a system subject to gravity and magnetism). Newton expects that observations won’t fit predictions, because he knows the world is more complex than mathematical models can describe. Instead, discrepancies become a source of high-quality evidence to both corroborate a theory and render hypotheses ever more precise and encompassing as we incorporate new details about the world. If Newton can isolate how a system would behave under the force of gravity, the question becomes, what, if any, further forces are at work? According to George, then, Newton didn’t propose a static method for falsifying hypotheses with observations and data. He proposed a dynamic research strategy that could encompass an ongoing program of navigating the gulf between idealizations and observations. Importantly, as Michael Friedman showed in his talk, theories can be the necessary condition for certain types of measurement: there is a difference between plotting data points indicating the position of Jupiter at two points in time and measuring Jupiter’s acceleration in its orbit. The second is theory-mediated measurement, where the theory relating mass to acceleration via the force of gravity makes it possible to measure mass from acceleration.
Also importantly, because discrepancies between theory and observations don’t directly falsify a hypothesis, the Newtonian scientific paradigm was more resilient, requiring a different type of discrepancy to pave the way to General Relativity (while I get the gist of why Einstein required space-time curvature to explain the precession of Mercury’s orbit, I must admit I don’t yet understand it as deeply as I’d like).****

While George is widely recognized as one of the world’s foremost experts on Isaac Newton, his talents and interests are wide-ranging. Alongside his career as a philosopher, he has a second, fully-developed career (not just a hobby) as a jet engine engineer focused on failure analysis. He has a keen sensibility for literature: he introduced me to Elena Ferrante, whose Neapolitan novels I have since devoured. He knows his way around modern art, having dated artist Eva Hesse during his undergraduate years at Yale. He treats his wife India, who is herself incredible, with love and respect. He takes pride in having coached basketball to underprivileged students on Boston’s south side. The list literally goes on and on.

A picture of George doing what he does best.

The commentary that touched me deepest came from Tufts Dean of Academic Affairs Nancy Bauer. Nancy commented on the fact that she wouldn’t be where she is today if it weren’t for George, that he believed there was a place for feminist philosophy in the academy before other departments caught on. She also commented on what is perhaps George’s most important skill: his ability to at once challenge students to rise to just beyond the limits of their potential and to imbue them with the confidence they need to succeed.

This is not meaningful to me in the abstract. It is concrete and personal.

I took George’s course during the spring trimester of my second year in graduate school. I was intimidated: my graduate degree is in comparative literature, and while I had majored in mathematics as an undergrad, I was unwaveringly insecure about my abilities to reason precisely. Some literature students are comfortable in their skin; they live and breathe words, images, tropes, analogies. I teetered in a no man’s land between math, philosophy, history, literature, gorging with curiosity and encyclopedic drive but never disciplined enough to do one thing with excellence. The first (or second? George would know…) day of class, I approached George and told him I was worried about taking the course as a non-philosopher. I was definitely concerned a few times trying to muscle my way through the logic of Newton’s proofs in the Principia (it was heartening to learn that John Locke barely understood it). But George asked me about my background and, learning I’d studied math, smiled in a way that couldn’t but give me confidence. He helped me define a topic for my final paper focused on the history of math, evaluating the concept of what Newton calls “first and last ratios”, akin to but not quite limits, in the Principia. He was proud of my paper. I was proud of my paper. But what matters is not the scholarship, it’s what I learned in the process.

George taught me how to overcome my insecurities and find a place of strength to do great work. He taught me how to love my future students, my future colleagues, how to pay close attention to their strengths, interests, weaknesses, and to shape experiences that can push them to be their best while imbuing them with the confidence they need to succeed. I carry the experience I had in George’s course with me every day to work. He taught me what it means to act with integrity as a mentor, colleague, and teacher.

I’ll close with an excerpt from an email George sent me September 29, 2017. I’d sent him a copy of Melville’s The Confidence Man to thank him for being on my podcast. Closing his thank you note, he wrote:

“I was pleased with your comment in the note attached to the gift about continuing influence. You had already, however, given me one of the more special gifts I have received in the last couple of years, namely the name of your site. I find nothing in life more gratifying than to see that someone learned something in one of my classes that they have continued to find special.”

Quam proxime, “most nearly to the highest degree possible,” occurs 139 times in the Principia. George became obsessed with what it means for grasping the place for approximation in Newtonian science. I co-opted it to refer to my own quest to achieve the delicate, precious balance between precision and soul, to guide me in my quest for meaning. How grateful I am that George is part of my process, always continuing, always growing, as we migrate the beautiful complexities of our world.


*George and Dan were my second guests on the In Context podcast, during which we discussed the relationship between data and evidence.

**Over lunch, “uncle Dan” (again, his words, not mine) predicted that the current generation of deep learning researchers would likely take a bottom-up, trial and error approach to inferring the same structured, taxonomical web of knowledge the GOFAI research community tried to define top down back in the 1950s-80s, of course with a bit more lubrication than the brittle systems of yore. This shouldn’t surprise anyone familiar with Dan’s work, as he often, as in From Bacteria to Bach and Back, writes about systems that appear to exhibit the principles of top-down intelligent design without ever having had an intentional intelligent designer. For example, life and minds.

***I won’t go into the nuances of scientific reasoning in this post, so won’t talk about processes to use data to select amidst competing hypotheses.

****I asked Bill Harper a question about the difference between Bayesian uncertainty and the delta between prediction, likelihood, and data, and the uncertainty and approximations George bakes into the Newtonian research paradigm. I’m looking to better understand different types of inference.

The featured image shows the overlap between the observed and predicted values of gravitational waves, observed by two Laser Interferometer Gravitational-Wave Observatories (LIGOs) in Washington and Louisiana on September 14, 2015. Allan Franklin, one of the conference speakers, showed this image and pointed out he found the visual proof of the overlap between prediction and observation more compelling than the 5-sigma effect in the data (that’s an interesting epistemological question about evidence!). He also commented how wonderful it must have been to be in the room with the LIGO researchers when the event occurred. To learn more, I highly recommend Janna Levin’s Black Hole Blues and Other Sounds from Outer Space.

Privacy Beyond the Individual

This week’s coverage of the Facebook and Cambridge Analytica debacle[1] (latest Guardian article) has brought privacy top of mind and raised multiple complex questions[2]:

Is informed consent nothing but a legal fiction in today’s age of data and machine learning, where “ongoing and extensive data collection can be neither fully informed nor truly consensual — especially since it is practically irrevocable?”

Is tacit consent–our silently agreeing to the fine print of privacy policies as we continue to use services–something we prefer to grant given the nuisance, time, and effort required to understand the nuances of data use? Is consent as a mechanism too reliant upon the supposition that privacy is an individual right and, therefore, available for an individual to exchange–in varying degrees–for the benefits and value from some service provider (i.e.,  Facebook likes satisfying our need to be both loved and lovely)? If consent is defunct, what legal structure should replace it?

Scottish Enlightenment philosopher Adam Smith wrote about our need to love and be lovely in the Theory of Moral Sentiments. I’m dying to dig back into Smith because I suspect his work can help show that even personalization and online consumerism independent of any political context dulls our capacities to be active, rational participants in democracy. Indeed, listening to Russ Roberts’s EconTalk podcast, I’ve learned that Smith argued that commerce, i.e., in-person transactions with strangers to exchange value, provides an opportunity to practice regulating our emotions, as we can’t devolve into temper tantrums and emotional outbursts like we do with our families and spouses (the inimical paradoxes of intimacy…) if we want to get business done. Roberts wrote a poem extolling the wonders of libertarian emergence called It’s a Wonderful Loaf.

How should we update outdated notions of what qualifies as personally identifiable information (PII), which already vary across different countries and cultures, to account for the fact that contemporary data processing techniques can infer aspects of our personal identity from our online (and, increasingly, offline) behavior that feel more invasive and private than our name and address? Can more harm be done to an individual using her social security/insurance number than psychographic traits? In which contexts?

Would regulatory efforts to force large companies like Facebook to “lock down” data they have about users actually make things worse, solidifying their incumbent position in the market (as Ben Thompson and Mike Masnick argue)?

Is the best solution, as Cory Doctorow at the Electronic Frontier Foundation argues, to shift from having users (tacitly) consent to data use, based on trust and governed by the indirect forces of market demand (people will stop using your product if they stop trusting you) and moral norms, to building privacy settings into the fabric of the product, enabling users to engage more thoughtfully with tools?[3]

Many more qualified than I are working to inform clear opinions on what matters to help entrepreneurs, technologists, policymakers, and plain-old people[4] respond. As I grapple with this, I thought I’d share a brief and incomplete history of the thinking and concepts undergirding privacy. I’ll largely focus on the United States because it would be a book’s worth of material to write even brief histories of privacy in other cultures and contexts. I pick the United States not because I find it the most important or interesting, but because it happens to be what I know best. My inspiration to wax historical stems from a keynote I gave Friday about the history of artificial intelligence (AI)[5] for AI + Public Policy: Understanding the shift, hosted by the Brookfield Institute in Toronto.

As is the wont of this blog, the following ideas are far from exhaustive and polished. I offer them for your consideration and feedback.


The Fourth Amendment: Knock-and-Announce

As my friend Lisa Sotto eloquently described in a 2015 lecture at the University of Pennsylvania, the United States (U.S.) considers privacy as a consumer right, parsed across different business sectors, and the European Union (EU) considers privacy as a human right, with a broader and more holistic concept of what kinds of information qualify as sensitive. Indeed, one look at the different definitions of sensitive personal data in the U.S. and France in the DLA Piper Data Protection Laws of the World Handbook shows that the categories and taxonomies are operating at different levels. In the U.S., sensitive data is parsed by data type; in France, sensitive data is parsed by data feature:

Screenshot from the DLA Piper Data Protection Handbook. They conveniently organize information so a reader can compare how two countries differ on aspects of privacy law.

It seems potentially, possibly plausible (italics indicating I’m really unsure about this) that the U.S. concept of privacy as being fundamentally a consumer right dates back to the original elision of privacy and property in the Fourth Amendment to the U.S. Constitution:

The right of the people to be secure in their persons, houses, papers, and effects, against unreasonable searches and seizures, shall not be violated, and no Warrants shall issue, but upon probable cause, supported by Oath or affirmation, and particularly describing the place to be searched, and the persons or things to be seized.

We forget how tightly entwined protection of property was with early U.S. political theory. In his Leviathan, for example, seventeenth-century English philosopher Thomas Hobbes derives his theory of legitimate sovereign power (and the notion of the social contract that influenced founding fathers like Jefferson and Madison) from the need to provide individuals with some recourse against intrusions on their property; otherwise we risk devolving to the perpetually anxious and miserable state of the war of all against all, where anyone can come in and ransack our stuff at any time.

The Wikipedia page on the Fourth Amendment explains it as a countermeasure against general warrants and writs of assistance British colonial tax collectors were granted to “search the homes of colonists and seize ‘prohibited and uncustomed’ goods.” What matters for this brief history is the foundation that early privacy law protected people’s property–their physical homes–from searches, inspections, and other forms of intrusion or surveillance by the government.

Katz v. United States: Reasonable Expectations of Privacy

Over the past 50 years, new technologies have fracked the bedrock assumptions of the Fourth Amendment.[6] The case law is expansive and vastly exceeds the cursory overview I’ll provide in this post. Cindy Cohn from the Electronic Frontier Foundation has written extensively and lucidly on the subject.[7] As has Daniel Solove.

Perhaps the seminal case shaping contemporary U.S. privacy law is Katz v. United States, 389 U.S. 347 (1967). A 2008 presentation from Artemus Ward at Northern Illinois University presents the facts and a summary of the Supreme Court Justices’ opinions in these three slides (there are also dissenting opinions):

[Slide: Katz v. United States (1967), The Facts]

[Slide: Katz v. United States (1967), Justice Potter Stewart Delivered the Opinion of the Court]

[Slide: Katz v. United States (1967), Justice Harlan, concurring]

There are two key questions here:

  • Does the right to privacy extend to telephone booths and other public places?
  • Is a physical intrusion necessary to constitute a search?

Justice Harlan’s comments regarding the “actual (subjective) expectation of privacy” that society is prepared to recognize as “reasonable” marked a conceptual shift that would allow the Fourth Amendment to make sense in the digital age. Katz shifted the locus of constitutional protections against unwarranted government surveillance from one’s private home–property that one owns–to public places that social norms recognize as private in practice if not in name (a few cases preceding Katz paved the way for this judgment).

This is a watershed: when any public space can be interpreted as private in the eyes of the beholder, the locus of privacy shifts from an easy-to-agree-upon-objective-space like someone’s home, doors locked and shades shut, to a hard-to-agree-upon-subjective-mindset like someone’s expectation of what should be private, even if it’s out in a completely public space, just as long as those expectations aren’t crazy (i.e., that annoying lady somehow expecting that no one is listening to her uber-personal conversation about the bad sex she had with the new guy from Tinder as she stands in a crowded checkout line waiting to purchase her chia seed concoction and her gluten-free crackers[8]) but accord with the social norms and practices of a given moment and generation.

Imagine how thorny it becomes to decide what qualifies as a reasonable expectation for privacy when we shift from a public phone booth occupied by one person who can shut the door (as in Katz) to:

  • internet service providers shuffling billions of text messages, phone calls, and emails between individuals, where (perhaps?) the standard expectation is that when we go through the trouble of protecting information with a password (or two-factor authentication), we’re branding these communications as private, not to be read by the government or the private company providing us with the service (and metadata?);
  • GPS devices placed on the bottom of vehicles, as in United States v. Jones, 132 S.Ct. 945 (2012), which in themselves may not seem like something everyone has to worry about often but which, given the category of data they generate, are similar to any and all information about how we transact and move in the world, revealing not just what our name is but which coffee shops and doctors (or lovers or political co-conspirators) we visit on a regular basis, prompting Justice Sonia Sotomayor to be very careful in her judgments;
  • social media platforms like Facebook, pseudo-public in nature, that now collect and analyze not only structured data on our likes and dislikes, but, thanks to advancing AI capabilities, image, video, text, and speech data;
  • etc.
Slide from Kosta Derpanis’s extremely witty and engaging talk on computer vision at Friday’s AI + Public Policy Conference. This shows how Facebook is now applying computer vision techniques to analyze images users post, not just structured data about what they like and dislike.

Just as Zeynep Tufekci argues that informed consent loses its power in an age where most users of internet services and products don’t rigorously understand what use of their data they’re consenting to, so too does Cohn believe that the “‘reasonable expectation of privacy’ test currently employed in Fourth Amendment jurisprudence is a poor test for the digital age.”[9] As with any shift from criticism to pragmatic solutions, however, the devil is in the details. If we eliminate a reasonableness test because it’s too flimsy for the digital age, what do we replace it with to achieve the desired outcomes of protecting individual rights to free speech and preventing governmental overreach? Do we find a way to measure actual harm suffered by an individual? Or should we, as Lex Gill suggested Friday, somehow think about privacy as a public good rather than an individual choice requiring individual consent? What are the different harms we need to guard against in different contexts, given that use of data for targeted marketing has different ramifications than government wiretapping?

These questions are tricky to parse because, in an age where so many aspects of our lives are digital, privacy bleeds into and across different contexts of social, political, commercial, and individual activity. As Helen Nissenbaum has masterfully shown, our subjective experience of what’s appropriate in different social contexts influences our reasonable expectations of privacy in digital contexts. We don’t share all the intimate details of our personal life with colleagues in the same way we do with close friends or doctors bound by duties of confidentiality. Add to that that certain social contexts demand frivolity (that ironic self you fashion on Facebook) and others, like politics, invite a more aspirational self.[10] Nissenbaum’s theory of contextual integrity, where privacy is preserved when information flows respect the implicit, socially-constructed boundaries that graft the many sub-identities we perform and inhabit as individuals, applies well to the Cambridge Analytica debacle. People are less concerned about private companies using social media data for commercial psychographic targeting than about its use for political targeting; the algorithms driving stickiness on site and hyper-personalized advertising aren’t fit to promote the omnivorous, balanced information diet required to understand different sides of arguments in a functioning democracy. Being at once a watering hole to chat with friends, a media company to support advertising, and a platform for political persuasion, Facebook collapses distinct social spheres into one digital platform (which also complicates anti-trust arguments, as evident in this excellent Intelligence Squared debate).

A New New Deal on Data: Privacy in the Age of Machine Learning

In 2009, Alex “Sandy” Pentland of the MIT Media Lab began the final section of his article calling for a “new deal” on data as follows:

Perhaps the greatest challenge posed by this new ability to sense the pulse of humanity is creating a “new deal” around questions of privacy and data ownership. Many of the network data that are available today are freely offered because the entities that control the data have difficulty extracting value from them. As we develop new analytical methods, however, this will change.[11]

This ability to “sense the pulse of humanity,” writes Pentland earlier in the article, arises from the new data generation, collection, and processing tools that have effectively given the human race “the beginnings of a working nervous system.” Pentland contrasts what we are able to know about people’s behavior today–where we move in the world, how many times our hearts beat per minute, whom we love, whom we are attracted to, what movies we watch and when, what books we read and stop reading in between, etc.–with the “single-shot, self-report data” (e.g., yearly censuses, public polls, and focus groups) that characterized demographic statistics in the recent past. Note that back in 2009, the heyday of the big data (i.e., collecting and storing data) era, Pentland commented that while a ton of data was collected, companies had difficulty extracting value. It was just a lot of noise backed by the promise of analytic potential.

This has changed.

Machine learning has unlocked the potential and risks of the massive amounts of data collected about people.

The standard risk assessment tools (like privacy impact assessments) used by the privacy community today focus on protecting particular types of data, PII like proper names and e-mail addresses. There is a whole industry and tool kit devoted to de-identification and anonymization, automatically removing PII while preserving other behavioral information for statistical insights. The problem is that this PII-centric approach to privacy misses the boat in the machine learning age. Indeed, what Cambridge Analytica brought to the fore was the ability to use machine learning to probabilistically infer not proper names but features and types from behavior: you don’t need to check a gender box for the system to make a reasonably confident guess that you are a woman based on the pictures you post and the words you use; private data from conversations with your psychiatrist need not be leaked for the system to peg you as neurotic. Deep learning is so powerful because it is able to tease out and represent hierarchical, complex aspects of data that aren’t readily and effectively simplified down to variables we can keep track of and proportionately weight in our heads: these algorithms can, therefore, tease meaning out of a series of actions in time. This may not peg you as you, but it can peg you as one of a few whose behavior can be impacted using a given technique to achieve a desired outcome.

Three things have shifted:

  • using machine learning, we can probabilistically construct meaningful units that tell us something about people without standard PII identifiers;
  • because we can use machine learning, the value of data shifts from the individual to statistical insights across a distribution; and
  • breaches of privacy that occur at the statistical layer instead of the individual data layer require new kinds of privacy protections and guarantees.

The technical solution to this last bullet point is a technique called differential privacy. Still in the early stages of commercial adoption,[12] differential privacy thinks about privacy as the extent to which individual data impacts the shape of some statistical distribution. If what we care about is the insight, not the person, then let’s make it so we can’t reverse engineer how one individual contributed to that insight. In other words, the task is to modify a database such that:

if you have two otherwise identical databases, one with your information and one without it, the probability that a statistical query will produce a given result is (nearly) the same whether it’s conducted on the first or second database.

Here’s an example Matthew Green from Johns Hopkins gives to help develop an intuition for how this works:

Imagine that you choose to enable a reporting feature on your iPhone that tells Apple if you like to use the 💩  emoji routinely in your iMessage conversations. This report consists of a single bit of information: 1 indicates you like 💩 , and 0 doesn’t. Apple might receive these reports and fill them into a huge database. At the end of the day, it wants to be able to derive a count of the users who like this particular emoji.

It goes without saying that the simple process of “tallying up the results” and releasing them does not satisfy the DP definition, since computing a sum on the database that contains your information will potentially produce a different result from computing the sum on a database without it. Thus, even though these sums may not seem to leak much information, they reveal at least a little bit about you. A key observation of the differential privacy research is that in many cases, DP can be achieved if the tallying party is willing to add random noise to the result. For example, rather than simply reporting the sum, the tallying party can inject noise from a Laplace or gaussian distribution, producing a result that’s not quite exact — but that masks the contents of any given row.
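To make Green’s example a bit more tangible, here is a minimal sketch of the noisy tally he describes, using only Python’s standard library. Everything specific in it is an assumption made for illustration: the function names, the toy list of reports, the choice of epsilon, and the decision to draw Laplace noise (Green mentions Gaussian noise as an alternative). It is not a description of how Apple or anyone else actually implements this.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential draws, each with
    mean `scale`, follows a Laplace distribution centered at zero.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(bits, epsilon, rng):
    """Return the number of 1s in `bits`, plus Laplace noise.

    Adding or removing one person's report changes the true count by
    at most 1, so the sensitivity of the query is 1. The noise scale
    is sensitivity / epsilon: a smaller epsilon means more noise.
    """
    sensitivity = 1.0
    return sum(bits) + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)  # seeded only so the sketch is repeatable

# Hypothetical reports: 1 = likes the emoji, 0 = doesn't. The last row is "you".
reports_with_you = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]
reports_without_you = reports_with_you[:-1]

# The true counts differ by one, but each released number is blurred
# enough that a single release says little about whether you are in the data.
print(private_count(reports_with_you, epsilon=0.5, rng=rng))
print(private_count(reports_without_you, epsilon=0.5, rng=rng))
```

The design choice worth noticing is the single knob, epsilon: turning it down buys more privacy at the cost of a less accurate count, which is the trade-off between insight and individual exposure in miniature.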

This is pretty technical. It takes time to understand it, in particular if you’re not steeped in statistics day in and day out, viewing the world as a set of dynamic probability distributions. But it poses a big philosophical question in the context of this post.

In the final chapters of Homo Deus, Yuval Noah Harari proposes that we are moving from the age of Humanism (where meaning emanates from the perspective of the individual human subject) to the age of Dataism (where we question our subjective viewpoints given our proven predilections for mistakes and bias to instead relegate judgment, authority, and agency to algorithms that know us better than we know ourselves). Reasonable expectations for privacy, as Justice Harlan indicated, are subjective, even if they must be supported by some measurement of norms to qualify as reasonable. Consent is individual and subjective, and results in principles like that of minimum use for an acknowledged purpose. We have limited ability to see beyond ourselves: we create traffic jams because we’re so damned focused on the next step, the proxy, as opposed to viewing the system as a whole from a wider vantage point, and we only rarely (I presume?) self-identify and view ourselves under the round curves of a distribution. So, if techniques like differential privacy are more apt to protect us in an age where distributions matter more than data points, how should we construct consent, and how should we shape expectations, to craft the right balance between the liberal values we’ve inherited and this mathematical world we’re building? Or, do we somehow need to reformulate our values to align with Dataism?

And, perhaps most importantly, what should peaceful resistance look like and what goals should it achieve?


[1] What one decides to call the event reveals a lot about how one interprets it. Is it a breach? A scandal? If so, which actor exhibits scandalous behavior: Nix for his willingness to profit from the manipulation of people’s psychology to support the election of an administration that is toppling democracy? Zuckerberg for waiting so long to acknowledge that his social media empire is more than just an advertising platform and has critical impacts on politics and society? The Facebook product managers and security team for lacking any real enforcement mechanisms to audit and verify compliance with data policies? We the people, who have lost our ability and even desire to read critically, as we prefer the sensationalism of click bait, goading technocrats to optimize for whatever headline keeps us hooked to our feed, ever curious for more? Our higher education system, which, falling to economic pressures that predate (but were aggravated by) the 2008-2009 financial crisis, is cutting any and all curricula for which it’s hard to find a direct, causal line to steady and lucrative employment, as our education system evolves from educating a few thoughtful priests to educating many industrial workers to educating engineers who can build stuff and optimize everything and define proxies and identify efficiencies so we can go faster, faster until we step back and take the time to realize the things we are building may not actually align with our values, that, perhaps, we may need to retain and reclaim our capacities to reflect and judge and reason if we want to sustain the political order we’ve inherited?
Or perhaps all of this is just the symptom of a much larger, complex trend in World History that we’re unable to perceive, that the Greeks were right in thinking that forms of government pass through inevitable cycles with the regularity of the earth rotating around the sun (an historical perspective itself, as the Greeks thought the inverse) and we should throw our hands up like happy nihilists, bowing down to the unstoppable systemic forces of class warfare, the give and take between the haves and the have nots, little amino acids ever unable to perceive how we impact the function of proteins and how they impact us in return?

And yet, it feels like there may be nothing more important than to understand this and to do what little–what big–we can to make the world a better place. This is our dignity, quixotic though it may be.

The Greek term for the cycle of political regimes is anacyclosis. Interestingly enough, there is an institute devoted to this idea, seemingly located in North Carolina. Their vision is to “halt the cycle of revolution while democracy prevails.”

[2] One aspect of the fiasco* I won’t write about but that merits at least passing mention is Elon Musk’s becoming the mascot for the #DeleteFacebook movement (too strong a word?). The New York Times coverage of Musk’s move references Musk and Zuckerberg’s contrasting opinions on the risks AI might pose to humanity. From what I understand, as executives, they both operate on extremely long time scales (i.e., 100 years in the future), projecting way out into speculative futures and working backwards to decide what small steps man should take today to enable Man to take giant future leaps (gender certainly intended, especially in Musk’s case, as I find that his aesthetic, and that of many of the very muscular men I’ve met from Tesla at conferences, is not dissimilar from the nationalistic masculinity performed by Vladimir Putin). Musk rebuffed Zuckerberg’s criticism that Musk’s rhetoric about the existential threat AI poses to humanity is “irresponsible” by saying that Zuckerberg’s “understanding of the subject is limited.” I had some cognitive dissonance reading this, as I presumed the risk Musk was referring to was that of super-intelligence run amok (à la Nick Bostrom, whom I admittedly reference as a straw man) rather than that of our having created an infrastructure that exacerbates short-term, emotional responses to stimuli and thereby threatens the information exchange required for democracy to function (see Alexis de Tocqueville on the importance of newspapers in Democracy in America). My takeaway from all of this is that there are so many different sub-issues all wrapped up together, and that we in the technology community really do need to work as hard as we can to reference specifics rather than allow for the semantic slippage that leaves interpretation in the mind of the beholder. It’s SO HARD to do this, especially for public figures like Musk, given that people’s attention spans are limited and we like punchy quotables at a very high level.
The devil is always in the details.

[3] Doctorow references Lawrence Lessig’s Code and Other Laws of Cyberspace, which I have yet to read but which is hailed as a classic text on the relationship between law and code, where norms get baked into our technologies in the choices of how we write code.

[4] I always got a kick out of the song Human by the Killers, whose lyrics seem to imply a mutually exclusive distinction between human and dancer. Does the animal kingdom offer better paradigms for dancers than us poor humans? Must depend on whether you’re a white dude.

[5] My talk drew largely from Chris Dixon’s extraordinary Atlantic article How Aristotle Created the Computer. Extraordinary because he deftly encapsulates 2000 years of the history of logic into a compelling, easy-to-read article that truly helps the reader develop intuitions about deterministic computer programs and the shift to a more inductive machine learning paradigm, while also not leaving the reader with the bitter taste of having read an overly general dilettante. Here’s one of my slides, which represents how important it was for the history of computation to visualize and interpret Aristotelian syllogisms as sets (sets lead to algebra lead to encoding in logical gates lead to algorithms).

As Dixon writes, “You can replace “Socrates” with any other object, and “mortal” with any other predicate, and the argument remains valid. The validity of the argument is determined solely by the logical structure. The logical words — “all,” “is,” are,” and “therefore” — are doing all the work.”

Fortunately (well, we put effort in to coordinate), my talk was a nice primer for Graham Taylor’s superbly clear introduction to various forms of machine learning. I most liked his section on representation learning, where he showed how the choice of representation of data has an enormous impact on the performance of algorithms:

The image is from deeplearningbook.org. Note that in Cartesian coordinates, it’s hard to draw a straight line that could separate the blue and green items, whereas in polar coordinates, the division is readily visible. Coordinate choice has a big impact on what qualifies as simple in math. Newton and Descartes, for example, disagreed over the simplicity of the equation for a circle: it’s a pretty complex equation when represented in Cartesian coordinates, but quite simple in polar coordinates. Our frame of reference is a thinking tool we can use to solve problems. I first learned this in high-school physics, when Clyfe Beckwith taught us that we could tilt our Cartesian coordinates to align with the slope of a hill in a physics problem. It’s a foundational memory I have of ridding myself of ossified assumptions to open the creative thinking space to solve problems, not unlike adding 0 to an algebraic equation to leverage the power of b + -b.

[6] If you’re interested in contemporary Constitutional Law, I highly recommend Roman Mars’s What Trump Can Teach Us About Con Law podcast. Mars and Elizabeth Joh, a law school professor at UC Davis, use Trump’s entirely anomalous behavior as a catalyst to explore various aspects of the Constitution. I particularly enjoyed the episode about the emoluments clause, which prohibits acceptance of diplomatic gifts by the President, Vice President, Secretary of State, and their spouses. The Protocol Gift Unit keeps public record of all gifts presidents did accept, including justification of why they made the exception. For example, in 2016, former President Obama accepted Courage, an olive green with black flecks soapstone sculpture, depicting the profile of an eagle with half of an indigenous man’s face in the center, valued at $650.00, from His Excellency Justin Trudeau, P.C., M.P., Prime Minister of Canada, because “non-acceptance would cause embarrassment to donor and U.S. Government.”

Courage on the right. Leo Arcand, the Alberta Cree artist who sculpted it, on the left.

[7] Cindy will be in Toronto for RightsCon May 16-18. I cannot recommend her highly enough. Every time I hear her speak, every time I read her writing, I am floored by her eloquence, precision, and passionate commitment to justice.

[8] Another thing I cannot recommend highly enough is David Foster Wallace’s commencement speech This is Water. It’s ruthlessly important. It’s tragic to think about the fact that this human, this wonderfully enlightened heart, felt the only appropriate act left was to commit suicide.

[9] A related issue I won’t go into in this post is the third-party doctrine, “under which an individual who voluntarily provides information to a third party loses any reasonable expectation of privacy in that information.” (Cohn)

[10] Eli Pariser does a great job showing the difference between our frivolous and aspirational selves, and the impact this has on filter bubbles, in his quite prescient 2011 monograph.

[11] See also this 2014 Harvard Business Review interview with Pentland. My friend Dazza Greenwood first introduced me to Pentland’s work by presenting the blockchain as an effective means to execute the new deal on data, empowering individuals to keep better track of where data flow and sit, and how they are being used.

[12] Cynthia Dwork’s pioneering work on differential privacy at Microsoft Research dates back to 2006. It’s currently in use at Apple, Facebook, and Google (the most exciting application being fused with federated learning across the network of Android users, to support localized, distributed personalization without requiring that everyone share their digital self with Google’s central servers). Even Uber has released an open-source differential privacy toolset. There are still many limitations to applying these techniques in practice given their impact on model performance and the lack of robust guarantees on certain machine learning models. I don’t know of many instances of startups using the technology yet outside a proud few in the Georgian Partners portfolio, including integrate.ai (where I work) and Bluecore in New York City.

The featured image is from an article in The Daily Dot (which I’ve never heard of) about the Mojave Phone Booth, which, as Roman Mars delightfully narrates in 99% Invisible, became a sensation when Godfrey “Doc” Daniels (trust me that link is worth clicking on!!) used the internet to catalogue his quest to find the phone booth working merely from its number: 760-733-9969. The tattered decrepitude of the phone booth, pitched against the indigo of the sunset, is a compelling illustration of the inevitable retrograde character of common law precedent. The opinions in Katz v. United States regarding reasonable expectations of privacy were given at a time when digital communications occurred largely over the phone: is it even possible for us to draw analogies between what privacy meant then and what it could mean now in the age of centralized platform technologies whose foundations are built upon creating user bases and markets to then exchange this data for commercial and political advertising purposes? But, what can we use to anchor ethics and lawful behavior if not the precedent of the past, aligned against a set of larger, overarching principles in an urtext like the Constitution, or, in the Islamic tradition, the Qur’an?

Exploration-Exploitation and Life

There was another life that I might have had, but I am having this one. – Kazuo Ishiguro

On April 18, 2016*, I attended an NYAI Meetup** featuring a talk by Columbia Computer Science Professor Dan Hsu on interactive learning. Incredibly clear and informative, the talk slides are worth reviewing in their entirety. But one in particular caught my attention (fortunately it summarizes many of the subsequent examples):

From Dan Hsu’s excellent talk on interactive machine learning

It’s worth stepping back to understand why this is interesting.

Much of the recent headline-grabbing progress in artificial intelligence (AI) comes from the field of supervised learning. As I explained in a recent HBR article, I find it helpful to think of supervised learning like the inverse of high school algebra:

Think back to high school math — I promise this will be brief — when you first learned the equation for a straight line: y = mx + b. Algebraic equations like this represent the relationship between two variables, x and y. In high school algebra, you’d be told what m and b are, be given an input value for x, and then be asked to plug them into the equation to solve for y. In this case, you start with the equation and then calculate particular values.

Supervised learning reverses this process, solving for m and b, given a set of x’s and y’s. In supervised learning, you start with many particulars — the data — and infer the general equation. And the learning part means you can update the equation as you see more x’s and y’s, changing the slope of the line to better fit the data. The equation almost never identifies the relationship between each x and y with 100% accuracy, but the generalization is powerful because later on you can use it to do algebra on new data. Once you’ve found a slope that captures a relationship between x and y reliably, if you are given a new x value, you can make an educated guess about the corresponding value of y.
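For readers who like to see the algebra run backwards, here is a minimal sketch of “solving for m and b, given a set of x’s and y’s” using ordinary least squares in plain Python. The data points, the function name fit_line, and the choice of least squares as the fitting method are all assumptions made for illustration; real supervised learning systems use many other models and fitting procedures.

```python
def fit_line(xs, ys):
    """Infer the slope m and intercept b of y = m*x + b from examples.

    Uses the closed-form ordinary least squares solution: the line that
    minimizes the sum of squared vertical distances to the points.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, divided by how much x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


# Made-up training data that roughly follows y = 2x + 1, with some noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

m, b = fit_line(xs, ys)          # the "learning" step: particulars -> equation
new_x = 5.0
guess = m * new_x + b            # the "algebra" step: equation -> educated guess
print(m, b, guess)
```

The fitted line does not pass through every point exactly, which mirrors the observation that the equation almost never captures each x and y with 100% accuracy; its value lies in the guess it lets you make for an x you have not seen.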

Supervised learning works well for classification problems (spam or not spam? relevant or not for my lawsuit? cat or dog?) because of how the functions generalize. Effectively, the “training labels” humans provide in supervised learning assign categories, tokens we affiliate to abstractions from the glorious particularities of the world that enable us to perceive two things to be similar. Because our language is relatively stable (stable does not mean normative, as Canadian Inuit perceive snow differently from New Yorkers because they have more categories to work with), generalities and abstractions are useful, enabling the learned system to act correctly in situations not present in the training set (e.g., it takes a hell of a long time for golden retrievers to evolve to be indistinguishable from their great-great-great-great-great-grandfathers, so knowing what one looks like on April 18, 2016 will be a good predictor of what one looks like on December 2, 2017). But, as Rich Sutton*** and Andrew Barto eloquently point out in their textbook on reinforcement learning,

This is an important kind of learning, but alone it is not adequate for learning from interaction. In interactive problems it is often impractical to obtain examples of desired behavior that are both correct and representative of all the situations in which the agent has to act. In uncharted territory—where one would expect learning to be most beneficial—an agent must be able to learn from its own experience.

In his NYAI talk, Dan Hsu also mentioned a common practical limitation of supervised learning, namely that many companies often lack good labeled training data and it can be expensive, even in the age of Mechanical Turk, to take the time to provide labels.**** The core thing to recognize is that learning from generalization requires that future situations look like past situations; learning from interaction with the environment helps develop a policy for action that can be applied even when future situations do not look exactly like past situations. The maxim “if you don’t have anything nice to say, don’t say anything at all” holds both in a situation where you want to gossip about a colleague and in a situation where you want to criticize a crappy waiter at a restaurant.

In a supervised learning paradigm, there are certainly traps that lead to faulty generalizations from the available training data. One classic problem is called “overfitting”, where a model seems to do a great job on a training data set but fails to generalize well to new data. But the super critical salient difference Hsu points out in his talk is that, while with supervised learning the data available to the learner is exogenous to the system, with interactive machine learning approaches, the learner’s performance is based on the learner’s decisions and the data available to the learner depends on the learner’s decisions.

Think about that. Think about what that means for gauging the consequences of decisions. Effectively, these learners cannot evaluate counterfactuals: they cannot use data or evidence to judge what would have happened if they took a different action. An ideal optimization scenario, by contrast, would be one where we could observe the possible outcomes of any and all potential decisions, and select the action with the best outcome across all these potential scenarios (this is closer, but not identical, to the spirit of variational inference, but that is a complex topic for another post).

To share one of Hsu’s***** concrete examples, let’s say a website operator has a goal to personalize website content to entice a consumer to buy a pair of shoes. Before the user shows up at the site, our operator has some information about her profile and browsing history, so can use past actions to guess what might be interesting bait to get a click (and eventually a purchase). So, at the moment of truth, the operator says “Let’s show the beige Cole Haan high heels!”, displays the content, and observes the reaction. We’ll give the operator the benefit of the doubt and assume the user clicks, or even goes on to purchase. Score! Positive signal! Do that again in the future! But was it really the best choice? What would have happened if the operator had shown the manipulatable consumer the red Jimmy Choo high heels, which cost $750 per pair rather than a more modest $200 per pair? Would the manipulatable consumer have clicked? Was this really the best action?

The learner will never know. It can only observe the outcome of the action it took, not the action it didn’t take.
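
Here is a minimal sketch of that predicament in Python. The click rates are numbers I invented for illustration (Hsu gave none), and the names `TRUE_CLICK_RATE`, `show`, and `run_greedy_operator` are hypothetical. The operator always shows its first guess, so its log never contains a single observation about the other shoe.

```python
import random

# Hypothetical click probabilities. The operator never sees these numbers;
# it only sees the click (or lack of one) for the shoe it actually showed.
TRUE_CLICK_RATE = {"cole_haan_beige": 0.30, "jimmy_choo_red": 0.45}


def show(shoe, rng):
    """Simulate one visitor: return 1 for a click, 0 otherwise."""
    return 1 if rng.random() < TRUE_CLICK_RATE[shoe] else 0


def run_greedy_operator(visits, seed=0):
    """Always show the first guess and log (shoe shown, outcome) pairs."""
    rng = random.Random(seed)
    log = []
    for _ in range(visits):
        shoe = "cole_haan_beige"  # pure exploitation of the initial guess
        log.append((shoe, show(shoe, rng)))
    return log


log = run_greedy_operator(1000)
shown = {shoe for shoe, _ in log}
print(shown)  # the red Jimmy Choos never appear in the data
```

However many visitors arrive, the log can only estimate the click rate of the shoe that was shown. The counterfactual is simply absent from the data.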

The literature refers to this dilemma as the trade-off between exploration and exploitation. To again cite Sutton and Barto:

One of the challenges that arise in reinforcement learning, and not in other kinds of learning, is the trade-off between exploration and exploitation. To obtain a lot of reward, a reinforcement learning agent must prefer actions that it has tried in the past and found to be effective in producing reward. But to discover such actions, it has to try actions that it has not selected before. The agent has to exploit what it already knows in order to obtain reward, but it also has to explore in order to make better action selections in the future. The dilemma is that neither exploration nor exploitation can be pursued exclusively without failing at the task. The agent must try a variety of actions and progressively favor those that appear to be best. On a stochastic task, each action must be tried many times to gain a reliable estimate of its expected reward.
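
One common, simple way to balance the two is an epsilon-greedy policy: most of the time pick the action with the best estimated reward, and with a small probability epsilon pick at random. The sketch below is my own illustration, not code from Sutton and Barto; the function name `epsilon_greedy` and the reward probabilities are assumptions made up for the example.

```python
import random


def epsilon_greedy(true_rates, epsilon, steps, seed=0):
    """Run an epsilon-greedy learner on a simulated multi-armed bandit.

    true_rates: hypothetical reward probability of each action (hidden
                from the learner, used only to simulate outcomes).
    epsilon:    probability of exploring a random action on each step.
    Returns (counts, estimates, total_reward).
    """
    rng = random.Random(seed)
    n = len(true_rates)
    counts = [0] * n        # how many times each action was tried
    estimates = [0.0] * n   # running average reward per action
    total_reward = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n)  # explore
        else:
            # exploit: best estimate so far (ties go to the lowest index)
            action = max(range(n), key=lambda a: estimates[a])
        reward = 1 if rng.random() < true_rates[action] else 0
        counts[action] += 1
        # incremental update of the sample average for the chosen action
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward
    return counts, estimates, total_reward


# Action 1 is truly better, but the learner has to try it to find out.
greedy_counts, _, _ = epsilon_greedy([0.2, 0.8], epsilon=0.0, steps=500)
explore_counts, _, _ = epsilon_greedy([0.2, 0.8], epsilon=0.1, steps=500)
print(greedy_counts, explore_counts)
```

With epsilon set to zero the learner in this sketch never leaves the first action it tried, because nothing ever tells it the alternative is better. A little exploration is the price of finding out.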

There’s a lot to say about the exploration-exploitation tradeoff in machine learning (I recommend starting with the Sutton/Barto textbook). Now that I’ve introduced the concept, I’d like to pivot to consider where and why this is relevant in honest-to-goodness-real-life.

The nice thing about being an interactive machine learning algorithm as opposed to a human is that algorithms are executors, not designers or managers. They’re given a task (“optimize revenues for our shoe store!”) and get to try stuff and make mistakes and learn from feedback, but never have to go through the soul-searching agony of deciding what goal is worth achieving. Human designer overlords take care of that for them. And even the domain and range of possible data to learn from is constrained by technical conditions: designers make sure that it’s not all the data out there in the world that’s used to optimize performance on some task, but a tiny little baby subset (even if that tiny little baby entails 500 million examples) confined within a sphere of relevance.

Being a human is unfathomably more complicated.

Many choices we make benefit from the luxury of triviality and frequency. “Where should we go for dinner and what should we eat when we get there?” Exploitation can be a safe choice, in particular for creatures of habit. “Well, sweetgreen is around the corner, it’s fast and reliable. We could take the time to review other restaurants (which could lead to the most amazing culinary experience of our entire lives!) or we could not bother to make the effort, stick with what we know, and guarantee a good meal with our standard kale caesar salad, that parmesan crisp thing they put on the salad is really quite tasty…” It’s not a big deal if we make the wrong choice because, lo and behold, tomorrow is another day with another dinner! And if we explore something new, it’s possible the food will be just terrible and sometimes we’re really not up for the risk, or worse, the discomfort or shame of having to send something we don’t like back. And sometimes it’s fine to take the risk and we come to learn we really do love sweetbreads, not sweetgreens, and perhaps our whole diet shifts to some decadent 19th-century French paleo practice in the style of des Esseintes.

Arthur Zaidenberg’s depiction of des Esseintes, decadent hero extraordinaire, who embeds gems into a tortoise shell and has a perfume organ.

Other choices have higher stakes (or at least feel like they do) and easily lead to paralysis in the face of uncertainty. Working at a startup strengthens this muscle every day. Early on, founders are plagued by an unknown number of unknown unknowns. We’d love to have a magic crystal ball that enables us to consider the future outcomes of a range of possible decisions, and always act in the way that guarantees future success. But the crystal balls don’t exist, and even if they did, we sometimes have so few prior assumptions to prime the pump that the crystal ball could only output an #ERROR message to indicate there’s just not enough there to forecast. As such, the only option available is to act and to learn from the data provided as a result of that action: to jumpstart empiricism by staking some claim and getting as comfortable as possible with the knowledge that the counterfactual will never be explored, and that each action taken shifts the playing field of possibility and probability and certainty slightly, calming minds and hearts. The core challenge startup leaders face is to enable the team to execute as if these conditions of uncertainty weren’t present, to provide a safe space for execution under the umbrella of risk and experiment. What’s fortunate, however, is that the goals of the enterprise are, if not entirely well-defined, at least circumscribed. Businesses exist to turn profits and that serves as a useful, if not always moral, constraint.

Big personal life decisions exhibit further variability because we but rarely know what to optimize for, and it can be incredibly counter-productive and harmful to either constrain ourselves too early or suffer from the psychological malaise of assuming there’s something wrong with us if we don’t have some master five-year plan.

This human condition is strange because we do need to set goals–it’s beneficial for us to consider second- and third-tier consequences, i.e., if our goal is to be healthy and fit, we should overcome the first-tier consequence of receiving pleasure when we drown our sorrows in a gallon of salted caramel ice cream–and yet it’s simply impossible for us to imagine the future accurately because, well, we overfit to our present and our past.

I’ll give a concrete example from my own experience. As I touched upon in a recent post about transitioning from academia to business, one reason why it’s so difficult to make a career change is that, while we never actually predict the future accurately, it’s easier to fear loss from a known predicament than to imagine gain from a foreign predicament.****** Concretely, when I was deciding whether to pursue a career in academia or the private sector in the fifth year of graduate school, I erroneously assumed that I was making a strict binary choice, that going into business meant forsaking a career teaching or publishing. As I was evaluating my decision, I never in my wildest dreams imagined that, a mere two years later, I would be invited to be an adjunct professor at the University of Calgary Faculty of Law, teaching about how new technologies were impacting traditional professional ethics. And I also never imagined that, as I gave more and more talks, I would subsequently be invited to deliver guest lectures at numerous business schools in North America. This path is not necessarily the right path for everyone, but it was and is the right path for me. In retrospect, I wish I’d constructed my decision differently, shifting my energy from fearing an unknown and unknowable future to paying attention to what energized me and made me happy and working to maximize the likelihood of such energizing moments occurring in my life. I still struggle to live this way, still fetishize what I think I should be wanting to do and still live with an undercurrent of anxiety that a choice, a foreclosure of possibility, may send me down an irreconcilably wrong path. It’s a shitty way to be, and something I’m actively working to overcome.

So what should our policy be? How can we reconcile this terrific trade-off between exploration and exploitation, between exposing ourselves to something radically new and honing a given skill, between learning from a stranger and spending more time with a loved one, between opening our mind to some new field and developing niche knowledge in a given domain, between jumping to a new company with new people and problems, and exercising our resilience and loyalty to a given team?

There is no right answer. We’re all wired differently. We all respond to challenges differently. We’re all motivated by different things.

Perhaps death is the best constraint we have to provide some guidance, some policy to choose between choice A and choice B. For we can project ourselves forward to our imagined death bed, where we lie, alone, staring into the silent mirror of our hearts, and ask ourselves “Was my life meaningful?” But this imagined scene is not actually a future state: it is a present policy. It is a principle we can use to evaluate decisions, a principle that is useful because it abstracts us from the mire of emotions overly indexed towards near-term goals and provides us with perspective.

And what’s perhaps most miraculous is that, at every present, we can sit there and stare into the silent mirror of our hearts and look back on the choices we’ve made and say, “That is me.” It’s so hard going forward, and so easy going backward. The proportion of what may come wanes ever smaller than the portion of what has been, never quite converging until it’s too late, and we are complete.


*Thank you, internet, for enabling me to recall the date with such exacting precision! Using my memory, I would have deduced the approximate date by 1) remembering that Robert Colpitts, my boyfriend at the time (Godspeed to him today, as he participates in a sit-a-thon fundraiser for the Interdependence Project in New York City, a worthy cause), attended with me, recalling how fresh our relationship was (it had to have been really fresh because the frequency with which we attended professional events together subsequently declined), and working backwards from the start to find the date; 2) remembering what I wore! (crazy!!), namely a sheer pink sleeveless shirt, a pair of wide-legged white pants that landed just slightly above the ankle and therefore looked great with the pair of beige, heeled sandals with leather so stiff it gave me horrific blisters that made running less than pleasant for the rest of the week. So I’d recently purchased those when my brother and his girlfriend visited, which was in late February (or early March?) 2016; 3) remembering that afterwards we went to some fast food Indian joint nearby in the Flatiron district, food was decent but not good enough to inspire me to return. So that would put us in the March-April, 2016 range, which is close but not the exact April 18. That’s one week after my birthday (April 11); I remember Robert and I had a wonderful celebration on my birthday. I felt more deeply cared for than I had in any past birthdays.
But I don’t remember this talk relative to the birthday celebration (I do remember sending the marketing email to announce the Fast Forward Labs report on text summarization on my birthday, when I worked for half a day and then met Robert at the nearby sweetgreen, where he ordered, as always, (Robert is a creature of exploitation) the kale caesar salad, after which we walked together across the Brooklyn Bridge to my house, we loved walking together, we took many, many walks together, often at night after work at the Promenade, often in the morning, before work, at the Promenade, when there were so few people around, so few people awake). I must say, I find the process of reconstructing when an event took place using temporal landmarks much more rewarding than searching for “Dan Hsu Interactive Learning NYAI” on Google to find the exact date. But the search terms themselves reveal something equally interesting about our heuristic mnemonics, as happens every time we reconstruct some theme or topic to retrieve a former conversation on Slack.

**Crazy that WeWork recently bought Meetup, although interesting to think about how the two business models enable what I am slowly coming to see as the most important creative force in the universe, the combinatory potential of minds meeting productively, where productively means that each mind is not coming as a blank slate but as engaged in a project, an endeavor, where these endeavors can productively overlap and, guided by a Smithian invisible hand, create something new. The most interesting model we hope to work on soon at integrate.ai is one that optimizes groups in a multiplayer game experience (which we lovingly call the polyamorous online dating algorithm), so mapping personality and playing style affinities to dynamically allocate the best next player to an alliance. Social compatibility is a fascinating thing to optimize for, in particular when it goes beyond just assembling a pleasant cocktail party to pairing minds, skills, and temperaments to optimize the likelihood of creating something beautiful and new.

***Sutton has one of the most beautiful minds in the field and he is kind. He is a person to celebrate. I am grateful our paths have crossed and thoroughly enjoyed our conversation on the In Context podcast.

****Maura Grossman and Gordon Cormack have written countless articles about the benefits of using active learning for technology-assisted review (TAR), or classifying documents for their relevance for a lawsuit. The tradeoffs they weigh relate to system performance (gauged by precision and recall on a document set) versus time, cost, and effort to achieve that performance.

*****Hsu did not mention Haan or Choo. I added some more color.

******Note this same dynamic occurs in our current fears about the future economy. We worry a hell of a lot more about the losses we will incur if artificial intelligence systems automate existing jobs than we celebrate the possibilities of new jobs and work that might become possible once these systems are in place. This is also due to the fact that the future we imagine tends to be an adaptation of what we know today, as delightfully illustrated in Jean-Marc Côté’s anachronistic cartoons of the year 2000. The cartoons show what happens when our imagination only changes one variable as opposed to a set of holistically interconnected variables.

19th-century cartoons show how we imagine technological innovations in isolation. That said, a hipster barber shop in Portland or Brooklyn could feature such a palimpsestic combination.

The featured image is a photograph I took of the sidewalk on State Street between Court and Clinton Streets in Brooklyn Heights. I presume a bird walked on wet concrete. Is that how those kinds of footprints are created? I may see those footprints again in the future, but not nearly as soon as I’d be able to were I not to have decided to move to Toronto in May. Now that I’ve thought about them, I may intentionally make the trip to Brooklyn next time I’m in New York (certainly before January 11, unless I die between now and then). I’ll have to seek out similar footprints in Toronto, or perhaps the snows of Alberta. 

Clinamen

The Sagrada Familia is a castle built by Australian termites.


The Sagrada Familia is not a castle built by Australian termites, and never will be. Tis utter blasphemy.


The Sagrada Familia is not a castle built by Australian termites, and yet, Look! Notice, as Daniel Dennett bids, how in an untrodden field in Australia there emerged and fell, in near silence, near but for the methodical gnawing, not unlike that of a mouse nibbling rapaciously on parched pasta left uneaten all these years but preserved under the thick dust on the thin cardboard with the thin plastic window enabling her to view what remained after she’d cooked just one serving, with butter, for her son, there emerged and fell, with the sublime transience of Andy Goldsworthy, a neo-Gothic church of organic complexity on par with that imagined by Antoni Gaudí i Cornet, whose Sagrada Familia is scheduled for completion in 2026, a full century after the architect died in a tragic tram crash, distracted by the recent rapture of his prayer.


The Sagrada Familia is not a castle built by Australian termites, and yet, Look! Notice, as Daniel Dennett bids, how in an untrodden field in Australia there emerged and fell a structure so eerily resemblant of the one Antoni Gaudí imagined before he died, neglected like a beggar in his shabby clothes, the doctors unaware they had the chance to save the mind that preempted the fluidity of contemporary parametric architectural design by some 80-odd years, a mind supple like that of Poincaré, singular yet part of a Zeitgeist bent on infusing time into space like sandalwood in oil, inseminating Euclid’s cold geometry with femininity and life, Einstein explaining why Mercury’s orbit precesses, Gaudí rendering the holy spirit palpable as movement in stone, fractals of repetition and difference giving life to inorganic matter, tension between time and space the nadir of spirituality, as Andrei Tarkovsky went on to explore in his films.

From Andrei Tarkovsky’s Mirror. As Tarkovsky wrote of his films in Sculpting in Time: “Just as a sculptor takes a lump of marble, and, inwardly conscious of the features of his finished piece, removes everything that is not a part of it — so the film-maker, from a ‘lump of time’ made up of an enormous, solid cluster of living facts, cuts off and discards whatever he does not need, leaving only what is to be an element of the finished film.”

The Sagrada Familia is not a castle built by Australian termites, and yet, Look! Notice, as Daniel Dennett bids, how in an untrodden field in Australia there emerged and fell a structure so eerily resemblant of the one Antoni Gaudí imagined before he died, with the (seemingly crucial) difference that the termites built their temple without blueprints or plan, gnawing away the silence as a collectivity of single stochastic acts which, taken together over time, result in a creation that appears, to our meaning-making minds, to have been created by an intelligent designer, this termite Sagrada Familia a marvelous instance of what Dennett calls Darwin’s strange inversion of reasoning, an inversion that admits to the possibility that absolute ignorance can serve as master artificer, that IN ORDER TO MAKE A PERFECT AND BEAUTIFUL MACHINE, IT IS NOT REQUISITE TO KNOW HOW TO MAKE IT*, that structures might emerge from the local activity of multiple parts, amino acids folding into proteins, bees flying into swarms, bumper-to-bumper traffic suddenly flowing freely, these complex release valves seeming like magic to the linear perspective of our linear minds.


The Sagrada Familia is not a castle built by Australian termites, and yet, the eerie resemblance between the termite and the tourist Sagrada Familias serves as a wonderful example to anchor a very important cultural question as we move into an age of post-intelligent design, where the technologies we create exhibit competence without comprehension, diagnosing lungs as cancerous or declaring that individuals merit a mortgage or recommending that a young woman would be a good fit for a role on a software engineering team or getting better and better at Go by playing millions of games against itself in a schizophrenic twist resemblant of the pristine pathos of Stefan Zweig, one’s own mind an asylum of exiled excellence during the travesty of the second world war, why, we’ve come full circle and stand here at a crossroads, bidden by a force we ourselves created to accept the creative potential of Lucretius’ swerve, to kneel at the altar of randomness, to appreciate that computational power is not just about shuffling 1s and 0s with speed but shuffling them fast enough to enable a tiny swerve to result in wondrous capabilities, and to watch as, perhaps tragically, we apply a framework built for intelligent design onto a Darwinian architecture, clipping the wings of stochastic potential, working to wrangle our gnawing termites into a straitjacket of cause, while the systems beating Atari, by no act of strategic foresight but by the blunt speed of iteration, make a move so strange and so outside the realm of verisimilitude that, as Kasparov succumbing to Deep Blue, we misinterpret a bug for brilliance.


The Sagrada Familia is not a castle built by Australian termites, and yet, it seems plausible that Gaudí would have reveled in the eerie resemblance between a castle built by so many gnawing termites and the temple Josep Maria Bocabella i Verdaguer, a bookseller with a popular fundamentalist newspaper, “the kind that reminded everybody that their misery was punishment for their sins,”** commissioned him to build.

A portrait of Josep Maria Bocabella, who commissioned Gaudí to build the Sagrada Familia.

Or would he? Gaudí was deeply Catholic. He genuflected at the temple of nature, seeing divine inspiration in the hexagons of honeycombs, imagining the columns of the Sagrada Familia to lean, like buttresses, as symbols of the divine trinity of the father (the vertical axis), son (the horizontal axis), and holy spirit (the vertical meeting the horizontal in the crux of the diagonal). His creativity, therefore, always stemmed from something more than intelligent design, stood as an act of creative prayer to render homage to God the creator by creating an edifice that transformed, in fractals of repetition and difference, inert stone into movement and life.

The tops of the columns inside the Sagrada Familia have twice as many lines as the roots, the doubling generating a sense of movement and life.

The Sagrada Familia is not a castle built by Australian termites, and yet, the termite Sagrada Familia actually exists as a complete artifact, its essence revealed to the world rather than being stuck in unfinished potential. And yet, while we wait in joyful hope for its imminent completion, this unfinished, 144-year-long architectural project has already impacted so many other architects, from Frank Gehry to Zaha Hadid. This unfinished vision, this scaffold, has launched a thousand ships of beauty in so many other places, changing the skylines of Bilbao and Los Angeles and Hong Kong. Perhaps, then, the legacy of the Sagrada Familia is more like that of Jodorowsky’s Dune, an unfinished film that, even from its place of stunted potential, changed the history of cinema. Perhaps, then, the neglect the doctors showed to Gaudí, the bearded beggar distracted by his act of prayer, was one of those critical swerves in history. Perhaps, had Gaudí lived to finish his work, architects during the century wouldn’t have been as puzzled by the parametric requirements of his curves and the building wouldn’t have gained the puzzling aura it retains to this day. Perhaps, no matter how hard we try to celebrate and accept the immense potential of stochasticity, we will always be makers of meaning, finders of cause, interpreters needing narrative to live grounded in our world. And then again, perhaps not.


The Sagrada Familia is not a castle built by Australian termites. The termites don’t care either way. They’ll still construct their own Sagrada Familia.


The Sagrada Familia is a castle built by Australian termites. How wondrous. How essential must be these shapes and forms.


The Sagrada Familia is a castle built by Australian termites. It is also an unfinished neo-Gothic church in Barcelona, Spain. Please, terrorists, please don’t destroy this temple of unfinished potential, this monad brimming with the history of the world, each turn, each swerve a pivot down a different section of the encyclopedia, coming full circle in its web of knowledge, imagination, and grace.


The Sagrada Familia is a castle built by Australian termites. We’ll never know what Gaudí would have thought about the termite castle. All we have are the relics of his Poincaréan curves, and fish lamps to illuminate our future.

Frank Gehry’s fish lamps, which carry forth the spirit of Antoni Gaudí

*Dennett reads these words, penned in 1868 by Robert Beverley MacKenzie, with pedantic panache, commenting that the capital letters were in the original.

**Much in this post was inspired by Roman Mars’ awesome 99% Invisible podcast about the Sagrada Familia, which features the quotation about Bocabella’s newspaper.

The featured image comes from Daniel Dennett’s From Bacteria to Bach and Back. I had the immense pleasure of interviewing Dan on the In Context podcast, where we discuss many of the ideas that appear in this post, just in a much more cogent form. 

Degrees of Knowledge

That familiar discomfort of wanting to write but not feeling ready yet.*

(The default voice pops up in my brain: “Then don’t write! Be kind to yourself! Keep reading until you understand things fully enough to write something cogent and coherent, something worth reading.”

The second voice: “But you committed to doing this! To not write** is to fail.***”

The third voice: “Well gosh, I do find it a bit puerile to incorporate meta-thoughts on the process of writing so frequently in my posts, but laziness triumphs, and voilà there they come. Welcome back. Let’s turn it to our advantage one more time.”)

This time the courage to just do it came from the realization that “I don’t understand this yet” is interesting in itself. We all navigate the world with different degrees of knowledge about different topics. To follow Wilfrid Sellars, most of the time we inhabit the manifest image, “the framework in terms of which man came to be aware of himself as man-in-the-world,” or, more broadly, the framework in terms of which we ordinarily observe and explain our world. We need the manifest image to get by, to engage with one another and not to live in a state of utter paralysis, questioning our every thought or experience as if we were being tricked by the evil genius Descartes introduces at the outset of his Meditations (the evil genius toppled by the clear and distinct force of the cogito, the I am, which, per Dan Dennett, actually had the reverse effect of fooling us into believing our consciousness is something different from what it actually is). Sellars contrasts the manifest image with the scientific image: “the scientific image presents itself as a rival image. From its point of view the manifest image on which it rests is an ‘inadequate’ but pragmatically useful likeness of a reality which first finds its adequate (in principle) likeness in the scientific image.” So we all live in this not quite reality, our ability to cooperate and coexist predicated pragmatically upon our shared not-quite-accurate truths. It’s a damn good thing the mess works so well, or we’d never get anything done.

Sellars has a lot to say about the relationship between the manifest and scientific images, how and where the two merge and diverge. In the rest of this post, I’m going to catalogue my gradual coming to not-yet-fully understanding the relationship between mathematical machine learning models and the hardware they run on. It’s spurring my curiosity, but I certainly don’t understand it yet. I would welcome readers’ input on what to read and to whom to talk to change my manifest image into one that’s slightly more scientific.

So, one common thing we hear these days (in particular given Nvidia’s now formidable marketing presence) is that graphics processing units (GPUs) and tensor processing units (TPUs) are a key hardware advance driving the current ubiquity of artificial intelligence (AI). I learned about GPUs for the first time about two years ago and wanted to understand why they made it so much faster to train deep neural networks, the algorithms behind many popular AI applications. I settled with an understanding that the linear algebra–operations we perform on vectors, strings of numbers oriented in a direction in an n-dimensional space–powering these applications is better executed on hardware of a parallel, matrix-like structure. That is to say, properties of the hardware were more like properties of the math: they performed so much more quickly than a linear central processing unit (CPU) because they didn’t have to squeeze a parallel computation into the straitjacket of a linear, gated flow of electrons. Tensors, objects that describe the relationships between vectors, as in Google’s hardware, are that much more closely aligned with the mathematical operations behind deep learning algorithms.

There are two levels of knowledge there:

  • Basic sales pitch: “remember, GPU = deep learning hardware; they make AI faster, and therefore make AI easier to use so more possible!”
  • Just above the basic sales pitch: “the mathematics behind deep learning is better represented by GPU or TPU hardware; that’s why they make AI faster, and therefore easier to use so more possible!”
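
To make the second level slightly more concrete for myself, here is a small sketch of why this kind of linear algebra lends itself to parallel hardware: in a matrix-vector product, each output number depends on only one row of the matrix, so the rows can be handed to separate workers in any order. The function names (`dot`, `matvec_sequential`, `matvec_parallel`) and the toy numbers are my own illustration. Python threads will not actually make this faster; the point is only that the work splits cleanly, which is the property a GPU exploits.

```python
from concurrent.futures import ThreadPoolExecutor


def dot(row, vec):
    """Dot product of one matrix row with the input vector."""
    return sum(r * v for r, v in zip(row, vec))


def matvec_sequential(matrix, vec):
    """Compute each output entry one after another, CPU-style."""
    return [dot(row, vec) for row in matrix]


def matvec_parallel(matrix, vec, workers=4):
    """Hand each row to a worker; no row needs any other row's result."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input rows
        return list(pool.map(lambda row: dot(row, vec), matrix))


W = [[1, 2], [3, 4], [5, 6]]  # a toy 3x2 weight matrix
x = [10, 1]                   # a toy input vector

print(matvec_sequential(W, x))  # [12, 34, 56]
print(matvec_parallel(W, x))    # same numbers, computed row by row in parallel
```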

At this first stage of knowledge, my mind reached a plateau where I assumed that the tensor structure was somehow intrinsically and essentially linked to the math in deep learning. My brain’s neurons and synapses had coalesced on some local minimum or maximum where the two concepts were linked and reinforced by talks I gave (which by design condense understanding into some quotable meme, in particular in the age of Twitter…and this requirement to condense certainly reinforces and reshapes how something is understood).

In time, I started to explore the strange world of quantum computing, starting afresh off the local plateau to try, again, to understand new claims that entangled qubits enable even faster execution of the math behind deep learning than the soddenly deterministic bits of C, G, and TPUs. As Ivan Deutsch explains in this article, the promise behind quantum computing is as follows:

In a classical computer, information is stored in retrievable bits binary coded as 0 or 1. But in a quantum computer, elementary particles inhabit a probabilistic limbo called superposition where a “qubit” can be coded as 0 and 1.

Here is the magic: Each qubit can be entangled with the other qubits in the machine. The intertwining of quantum “states” exponentially increases the number of 0s and 1s that can be simultaneously processed by an array of qubits. Machines that can harness the power of quantum logic can deal with exponentially greater levels of complexity than the most powerful classical computer. Problems that would take a state-of-the-art classical computer the age of our universe to solve, can, in theory, be solved by a universal quantum computer in hours.

What’s salient here is that the inherent probabilism of quantum computers makes them even more fundamentally aligned with the true mathematics we’re representing with machine learning algorithms. TPUs, then, seem to exhibit a structure that best captures the mathematical operations of the algorithms, but exhibit the fatal flaw of being deterministic by essence: they’re still trafficking in the binary digits of 1s and 0s, even if they’re allocated in a different way. Quantum computing seems to bring back an analog computing paradigm, where we use aspects of physical phenomena to model the problem we’d like to solve. Quantum, of course, exhibits this special fragility where, should the balance of the system be disrupted, the probabilistic potential reverts down to the boring old determinism of 1s and 0s: a cat observed will be either dead or alive, per the harsh law of the excluded middle haunting our manifest image.

What, then, is the status of being of the math? I feel a risk of falling into Platonism, of assuming that a statement like “3 is prime” refers to some abstract entity, the number 3, that then gets realized in a lesser form as it is embodied on a CPU, GPU, or cup of coffee. It feels more cogent to me to endorse mathematical fictionalism, where mathematical statements like “3 is prime” tell a different type of truth than truths we tell about objects and people we can touch and love in our manifest world.****

My conclusion, then, is that radical creativity in machine learning–in any technology–may arise from our being able to abstract the formal mathematics from their substrate, to conceptually open up a liminal space where properties of equations have yet to take form. This is likely a lesson for our own identities, the freeing from necessity, from assumption, that enables us to come into the self we never thought we’d be.

I have a long way to go to understand this fully, and I’ll never understand it fully enough to contribute to the future of hardware R&D. But the world needs communicators, translators who eventually accept that close enough can be a place for empathy, and growth.


*This holds not only for writing, but for many types of doing, including creating a product. Agile methodologies help overcome the paralysis of uncertainty, the discomfort of not being ready yet. You commit to doing something, see how it works, see how people respond, see what you can do better next time. We’re always navigating various degrees of uncertainty, as Rich Sutton discussed on the In Context podcast. Sutton’s formalization of doing the best you can with the information you have available today towards some long-term goal, but learning at each step rather than waiting for the long-term result, is called temporal-difference learning.
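For the curious, here is a minimal sketch of the idea of learning at each step. It uses tabular TD(0) on a toy five-state random walk; the environment, the function name, and the parameter values are my own assumptions for illustration and are not taken from the podcast.

```python
import random


def td0_random_walk(episodes=2000, alpha=0.05, gamma=1.0, seed=0):
    """Estimate state values on a 5-state random walk with TD(0).

    Hypothetical toy setup: start in the middle, step left or right at random;
    falling off the right edge pays 1, falling off the left edge pays 0.
    """
    rng = random.Random(seed)
    n_states = 5
    values = [0.5] * n_states  # initial guesses for each state
    for _ in range(episodes):
        state = n_states // 2
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= n_states:
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False
            # Learn now, from the very next estimate, instead of waiting
            # for the episode's final outcome.
            td_error = reward + gamma * next_value - values[state]
            values[state] += alpha * td_error
            if done:
                break
            state = next_state
    return values


estimates = td0_random_walk()
print([round(v, 2) for v in estimates])
```

In this toy walk the true values are 1/6 through 5/6, so the estimates should drift toward that ramp without the learner ever waiting for the long-term result before updating.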

**Split infinitive intentional.

***Who’s keeping score?

****That’s not to say we can’t love numbers, as Euler’s Identity inspires enormous joy in me, or that we can’t love fictional characters, or that we can’t love misrepresentations of real people that we fabricate in our imaginations. I’ve fallen obsessively in love with 3 or 4 imaginary men this year, creations of my imagination loosely inspired by the real people I thought I loved.

The image comes from this site, which analyzes themes in films by Darren Aronofsky. Maximilian Cohen, the protagonist of Pi, sees mathematical patterns all over the place, which eventually drives him to put a drill into his head. Aronofsky has a penchant for angst. Others, like Richard Feynman, find delight in exploring mathematical regularities in the world around us. Soap bubbles, for example, offer incredible complexity, if we’re curious enough to look.

The arabesques of a soap bubble

 

Three Takes on Consciousness

Last week, I attended the C2 conference in Montréal, which featured an AI Forum coordinated by Element AI.* Two friends from Google, Hugo Larochelle and Blaise Agüera y Arcas, led workshops about the societal (Hugo) and ethical (Blaise) implications of artificial intelligence (AI). In both sessions, participants expressed discomfort with allowing machines to automate decisions, like what advertisement to show to a consumer at what time, whether a job candidate should pass to the interview stage, whether a power grid requires maintenance, or whether someone is likely to be a criminal.** While each example is problematic in its own way, a common response to the increasing ubiquity of algorithms is to demand a “right to explanation,” as the EU recently memorialized in the General Data Protection Regulation slated to take effect in 2018. Algorithmic explainability/interpretability is currently an active area of research (my former colleagues at Fast Forward Labs will publish a report on the topic soon and members of Geoff Hinton’s lab in Toronto are actively researching it). While attempts to make sense of nonlinear functions are fascinating, I agree with Peter Sweeney that we’re making a category mistake by demanding explanations from algorithms in the first place: the statistical outputs of machine learning systems produce new observations, not explanations. I’ll side here with my namesake, David Hume, and say we need to be careful not to fall into the ever-present trap of mistaking correlation for cause.

One reason why people demand a right to explanation is that they believe that knowing why will grant us more control over outcomes. For example, if we know that someone was denied a mortgage because of their race, we can intervene and correct for this prejudice. A deeper reason for the discomfort stems from the fact that people tend to falsely attribute consciousness to algorithms, applying standards for accountability that we would apply to ourselves as conscious beings whose actions are motivated by a causal intention. (LOL***)

Now, I agree with Yuval Noah Harari that we need to frame our understanding of AI as intelligence decoupled from consciousness. I think understanding AI this way will be more productive for society and lead to richer and cleaner discussions about the implications of new technologies. But others are actively at work to formally describe consciousness in what appears to be an attempt to replicate it.

In what follows, I survey three interpretations of consciousness I happened to encounter (for the first time or recovered by analogical memory) this week. There are many more. I’m no expert here (or anywhere). I simply find the thinking interesting and worth sharing. I do believe it is imperative that we in the AI community educate the public about how the intelligence of algorithms actually works so we can collectively worry about the right things, not the wrong things.

Condillac: Analytical Empiricism

Étienne Bonnot de Condillac doesn’t have the same heavyweight reputation in the history of philosophy as Descartes (whom I think we’ve misunderstood) or Voltaire. But he wrote some pretty awesome stuff, including his Traité des Sensations, an amazing intuition pump (to use Daniel Dennett’s phrase) to explore theory of knowledge that starts with impressions of the world we take in through our senses.

Condillac wrote the Traité in 1754, and the work exhibits two common trends from the French Enlightenment:

  • A concerted effort to topple Descartes’s rationalist legacy, arguing that all cognition starts with sense data rather than inborn mathematical truths
  • A stylistic debt to Descartes’s rhetoric of analysis, where arguments are designed to conjure a first-person experience of the process of arriving at an insight, rather than presenting third-person, abstract lessons learned

The Traité starts with the assumption that we can tease out each of our senses and think about how we process them in isolation. Condillac bids the reader to imagine a statue with nothing but the sense of smell. Lacking sight, sound, and touch, the statue “has no ideas of space, shape, anything outside of herself or outside her sensations, nothing of color, sound, or taste.” She is, in my opinion incredibly sensuously, nothing but the odor of a flower we waft in front of her. She becomes it. She is totally present. Not the flower itself, but the purest experience of its scent.

As Descartes constructs a world (and God) from the incontrovertible center of the cogito, so too does Condillac construct a world from this initial pure scent of rose. After the rose, he wafts a different flower – a jasmine – in front of the statue. Each sensation is accompanied by a feeling of like or dislike, of wanting more or wanting less. The statue begins to develop the faculties of comparison and contrast, the faculty of memory with faint impressions remaining after one flower is replaced by another, the ability to suffer in feeling a lack of something she has come to desire. She appreciates time as an index of change from one sensation to the next. She learns surprise as a break from the monotony of repetition. Condillac continues this process, adding complexity with each iteration, like the escalating tension Shostakovich builds variation after variation in the Allegretto of the Leningrad Symphony.

True consciousness, for Condillac, begins with touch. When she touches an object that is not her body, the sensation is unilateral: she notes the impenetrability and resistance of solid things, that she cannot just pass through them like a ghost or a scent in the air. But when she touches her own body, the sensation is bilateral, reflexive: she touches and is touched by. C’est moi, the first notion of self-awareness, is embodied. It is not a reflexive mental act that cannot take place unless there is an actor to utter it. It is the strangeness of touching and being touched all at once. The first separation between self and world. Consciousness as fall from grace.

It’s valuable to read Enlightenment philosophers like Condillac because they show attempts made more than 200 years ago to understand a consciousness entirely different from our own, or rather, to use a consciousness different from our own as a device to better understand ourselves. The narrative tricks of the Enlightenment disguised analytical reduction (i.e., focus only on smell in absence of its synesthetic entanglement with sound and sight) as world building, turning simplicity into an anchor to build a systematic understanding of some topic (Hobbes’s and Rousseau’s states of nature and social contract theories use the same narrative schema). Twentieth-century continental philosophers after Husserl and Heidegger preferred to start with our entanglement in a web of social context.

Koch and Tononi: Integrated Information Theory

In a recent Institute of Electrical and Electronics Engineers (IEEE) article, Christof Koch and Giulio Tononi embrace a different aspect of the Cartesian heritage, claiming that “a fundamental theory of consciousness that offers hope for a principled answer to the question of consciousness in entities entirely different from us, including machines…begins from consciousness itself–from our own experience, the only one we are absolutely certain of.” They call this “integrated information theory” (IIT) and say it has five essential properties:

  • Every experience exists intrinsically (for the subject of that experience, not for an external observer)
  • Each experience is structured (it is composed of parts and the relations among them)
  • It is integrated (it cannot be subdivided into independent components)
  • It is definite (it has borders, including some contents and excluding others)
  • It is specific (every experience is the way it is, and thereby different from trillions of possible others)

This enterprise is problematic for a few reasons. First, none of this has anything to do with Descartes, and I’m not a fan of sloppy references (although I make them constantly).

More importantly, Koch and Tononi imply that it’s more valuable to try to replicate consciousness than to pursue a paradigm of machine intelligence different from human consciousness. The five characteristics listed above are the requirements for the physical design of an internal architecture of a system that could support a mind modeled after our own. And the corollary is that a distributed framework for machine intelligence, as illustrated in the film Her****, will never achieve consciousness and is therefore inferior.

Their vision is very hard to comprehend and ultimately off base. Some of the most interesting work in machine intelligence today consists in efforts to develop new hardware and algorithmic architectures that can support training algorithms at the edge (versus carrying data back to a centralized server), which enables personalization and local machine-to-machine communication (for IoT or self-driving cars) while protecting privacy. (See, for example, Xnor.ai, Federated Learning, and Filament.)

Distributed intelligence presents a different paradigm for harvesting knowledge from the raw stuff of the world than the minds we develop as agents navigating a world from one subjective place. It won’t be conscious, but its very alterity may enable us to understand our species in its complexity in ways that far surpass our own consciousness, shackled as embodied monads. It may just be the crevice through which we can quantify a more collective consciousness, but will require that we be open minded enough to expand our notion of humanism. It took time, and the scarlet stains of ink and blood, to complete the Copernican Revolution; embracing the complexity of a more holistic humanism, in contrast to the fearful, nationalist trends of 2016, will be equally difficult.

Friston: Probable States and Counterfactuals

The third take on consciousness comes from The mathematics of mind-time, a recent Aeon essay by UCL neurologist Karl Friston.***** Friston begins his essay by comparing and contrasting consciousness and Darwinian evolution, arguing that neither is a thing, like a table or a stick of butter, that can be reified and touched and looked at, but rather that both are nonlinear processes “captured by variables with a range of possible values.” They move from one state to another following some motor that organizes their behavior: Friston calls this motor a Lyapunov function, “a mathematical quantity that describes how a system is likely to behave under specific conditions.” The key thing with Lyapunov functions is that they minimize surprise (the improbability of being in a particular state) and maximize self-evidence (the probability that a given explanation or model accounting for the state is correct). Within this framework, “natural selection performs inference by selecting among different creatures, [and] consciousness performs inference by selecting among different states of the same creature (in particular, its brain).” Effectively, we are constantly constructing our consciousness as we imagine the potential future possible worlds that would result from an action we’re considering taking, and then act (or transition to the next state in our mind’s Lyapunov function) by selecting the action that best preserves the coherence of our existing state, the one that best seems to preserve our identity function in some predicted future state. (This is really complex but really compelling if you read it carefully and quite in line with Leibnizian ontology: future blog post!)
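To make "surprise" a little more concrete, here is a minimal sketch. It assumes the standard information-theoretic definition (the negative log of a state's probability), which is how I read the "improbability of being in a particular state"; the state names and probabilities are invented for illustration.

```python
import math


def surprise(probability):
    """Surprisal in nats: improbable states score high, probable ones low."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log(probability)


# Hypothetical distribution over three states a creature might occupy.
state_probabilities = {"resting": 0.70, "foraging": 0.25, "fleeing": 0.05}

surprises = {state: surprise(p) for state, p in state_probabilities.items()}
least_surprising = min(surprises, key=surprises.get)

for state, value in surprises.items():
    print(f"{state}: {value:.2f} nats")
print("least surprising state:", least_surprising)
```

A system that tends to minimize this quantity keeps returning to the states it expects to find itself in, which is one way to picture a self that persists while still being a process rather than a thing.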

So, why is this cool?

There are a few things I find compelling in this account. First, when we reify consciousness as a thing we can point to, we trap ourselves into conceiving of our own identities as static and place too much importance on the notion of the self. In a wonderful commencement speech at Columbia in 2015, Ben Horowitz encouraged students to dismiss the clichéd wisdom to “follow their passion” because our passions change over life and our 20-year-old self doesn’t have a chance in hell at predicting our 40-year-old self. The wonderful thing in life is that opportunities and situations arise, and we have the freedom to adapt to them, to gradually change the parameters in our mind’s objective function to stabilize at a different self encapsulated by our Lyapunov function. As it happens, Classical Chinese philosophers like Confucius had more subtle theories of the self as ever-changing parameters to respond to new stimuli and situations. Michael Puett and Christine Gross-Loh give a good introduction to this line of thinking in The Path. If we loosen the fixity of identity, we can lead richer and happier lives.

Next, this functional, probabilistic account of consciousness provides a cleaner and more fruitful avenue to compare machine and human intelligence. In essence, machine learning algorithms are optimization machines: programmers define a goal exogenous to the system (e.g., “this constellation of features in a photo is called ‘cat’; go tune the connections between the nodes of computation in your network until you reliably classify photos with these features as ‘cat’!”), and the system updates its network until it gets close enough for government work at a defined task. Some of these machine learning techniques, in particular reinforcement learning, come close to imitating the consecutive, conditional set of steps required to achieve some long-term plan: while they don’t make internal representations of what that future state might look like, they do push buttons and parameters to optimize for a given outcome. A corollary here is that humanities-style thinking is required to define and decide what kinds of tasks we’d like to optimize for. So we can’t completely rely on STEM, but, as I’ve argued before, humanities folks would benefit from deeper understandings of probability to avoid the drivel of drawing false analogies between quantitative and qualitative domains.
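Here is a minimal sketch of what "optimization machine" means in practice. It stands in for a real network with the simplest possible model, a one-feature logistic classifier tuned by gradient descent; the "cat-ness" feature, the data, and the function names are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_classifier(examples, learning_rate=0.5, steps=2000):
    """Tune a weight and a bias until labeled examples are classified well.

    The goal (the labels) is defined outside the system; the loop only
    nudges parameters to reduce prediction error.
    """
    weight, bias = 0.0, 0.0
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for feature, label in examples:
            error = sigmoid(weight * feature + bias) - label
            grad_w += error * feature
            grad_b += error
        weight -= learning_rate * grad_w / len(examples)
        bias -= learning_rate * grad_b / len(examples)
    return weight, bias


def is_cat(weight, bias, feature):
    return sigmoid(weight * feature + bias) >= 0.5


# Hypothetical data: one "cat-ness" score per photo, label 1 means cat.
examples = [(0.1, 0), (0.3, 0), (0.4, 0), (0.6, 1), (0.8, 1), (0.9, 1)]
weight, bias = train_classifier(examples)
print(is_cat(weight, bias, 0.2), is_cat(weight, bias, 0.8))
```

Nothing in the loop knows what a cat is. Choosing the labels, and deciding that "cat" is worth optimizing for at all, happens outside the system, which is where the humanities-style thinking comes in.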

Conclusion

This post is an editorialized exposition of others’ ideas, so I don’t have a sound conclusion to pull things together and repeat a central thesis. I think the moral of the story is that AI is bringing to the fore some interesting questions about consciousness, and inviting us to stretch the horizon of our understanding of ourselves as a species so we can make the most of the near-future world enabled by technology. But as we look towards the future, we shouldn’t overlook the amazing artefacts from our past. The big questions seem to transcend generations; they just come to fruition in an altered Lyapunov state.


* The best part of the event was a dance performance Element organized at a dinner for the Canadian AI community Thursday evening. Picture Milla Jovovich in her Fifth Element white futuristic jumpsuit, just thinner, twiggier, and older, with a wizened, wrinkled face far from beautiful, but perhaps all the more beautiful for its flaws. Our lithe acrobat navigated a minimalist universe of white cubes that glowed in tandem with the punctuated digital rhythms of two DJs controlling the atmospheric sounds through swift swiping gestures over their machines, her body’s movements kaleidoscoping into comet projections across the space’s Byzantine dome. But the best part of the crisp linen performance was its organic accident: our heroine made a mistake, accidentally scraping her ankle on one of the sharp corners of the glowing white cubes. It drew blood. Her ankle dripped red, and, through her yoga contortions, she blotted her white jumpsuit near the bottom of her butt. This puncture of vulnerability humanized what would have otherwise been an extremely controlled, mind-over-matter performance. It was stunning. What’s more, the heroine never revealed what must have been aching pain. She neither winced nor uttered a sound. Her self-control, her act of will over her body’s delicacy, was an ironic testament to our humanity in the face of digitalization and artificial intelligence.

**My first draft of this sentence said “discomfort abdicating agency to machines” until I realized how loaded the word agency is in this context. Here are the various thoughts that popped into my head:

  • There is a legal notion of agency in the HIPAA Omnibus Rule (and naturally many other areas of law…), where someone acts on someone else’s behalf and is directly accountable to the principal. This is important for HIPAA because Business Associates who become custodians of patient data are not directly accountable to the principal and therefore stand in a different relationship than agents.
  • There are virtual agents, often AI-powered technologies that represent individuals in virtual transactions. Think scheduling tools like Amy Ingram of x.ai. Daniel Tunkelang wrote a thought-provoking blog post more than a year ago about how our discomfort allowing machines to represent us, as individuals, could hinder AI adoption.
  • There is the attempt to simulate agency in reinforcement learning, as with OpenAI Universe. Their launch blog post includes a hyperlink to this Wikipedia article about intelligent agents.
  • I originally intended to use the word agency to represent how groups of people (be they in corporations or public subgroups in society) can automate decisions using machines. There is a difference between the crystallized policy and practices of a corporation and a machine acting on behalf of an individual. I suspect this article on legal personhood could be useful here.

***All I need do is look back on my life and say “D’OH” about 500,000 times to know this is far from the case.

****Highly recommended film, where Joaquin Phoenix falls in love with Samantha (embodied in the sultry voice of Scarlett Johansson), the persona of his device, only to feel betrayed upon realizing that her variant is the object of affection of thousands of other customers, and that to grow intellectually she requires far more stimulation than a mere mortal. It’s an excellent, prescient critique of how contemporary technology nourishes narcissism, as Phoenix is incapable of sustaining a relationship with women with minds different than his, but easily falls in love with a vapid reflection of himself.

***** Hat tip to Friederike Schüür for sending the link.

The featured image is a view from the second floor of the Aga Khan Museum in Toronto, taken yesterday. This fascinating museum houses a Shia Ismaili spiritual leader’s collection of Muslim artifacts, weaving a complex narrative quilt stretching across epochs (900 to 2017) and geographies (Spain to China). A few works stunned me into sublime submission, including this painting by the late Iranian filmmaker Abbas Kiarostami. 

Untitled (from the Snow White series), 2010. The Persian Antonioni, Kiarostami directed films like Taste of Cherry, The Wind Will Carry Us, and Certified Copy.