The intentionality we lend to LLMs

Photo: Natalie Kinnear, Unsplash

Stick googly eyes on a sock and people will anthropomorphise it. Give it thirty seconds and someone will be doing its voice. Our brains are hard-wired for it, which is a strange thing to say about brains, isn’t it? Brains don’t have wires.

These ideas frequently come up in discourse around AI: whether it’s OK to say ChatGPT “thinks” something, or Claude “remembers” me, or an agent “decided” to ignore an instruction. It’s common to reject “AI” as a valid term at all, “intelligence” being unacceptable even with the “artificial” qualifier.

One way the topic often comes up is when people remind others that these tools aren’t actually doing the thing the words imply, because words have meanings, dammit. What motivates us to make these linguistic push-backs about AI, when other language evolutions seem more acceptable?

I have my own view on this, and the sands have shifted for me recently. Here’s where I’m at:

The case for letting AI anthropomorphism slide

Three observations:

One: we constantly assign agency to our tools. We say “the microwave has decided to stop working” and “my laptop needs an update” without most people getting upset about it.

As I said, anthropomorphism seems to be hard-wired in as an evolved cognitive strategy. Guthrie (1993), in Faces in the Clouds, even proposes it as the best explanation for religion. Adam Waytz et al. (2010) propose that we anthropomorphise in response to three factors: low prior knowledge of the entity, a high drive to understand our environment, and/or a high need for social connection. Perhaps anthropomorphising allows us to access parts of our brains that are useful in such times.

Suffice it to say, complaining about the use of human vocabulary for non-human things is pushing back against several million years of social cognition.

Two: the vocabulary of computing has always done this. Computers don’t have “memory” in any human sense; it’s just electron states in refined sand. The word computer itself, until the 1940s, referred exclusively to a human being: a person, very often a woman, employed to do calculations by hand. NASA, Bletchley Park and the Royal Observatory all employed teams of human computers.

We named computing machines after the humans they replaced in the middle of the 20th century, around the same time the phrase “artificial intelligence” was coined. That ship has sailed, come back, and sailed again.

Human computers at work. By NACA (NASA), Dryden Flight Research Center Photo Collection, Public Domain: http://www.dfrc.nasa.gov/Gallery/Photo/Places/HTML/E49-54.html

While we’re digging into etymology: hallucinate is a psychiatric term from the Latin hallucinat-, ‘gone astray in thought’, now cheerfully being reused to describe what LLMs do when their randomly generated words turn out to be wrong. “Generate”, you say? Well, generate is from generare, “to beget, to procreate”, i.e. that which you traditionally need genitals to accomplish. We can’t escape the reality that language is an evolving tool: meanings shift according to how words are used in practice.

Three: words don’t have meanings outside of how they’re used. A decent philosophical argument for this appears in W. V. O. Quine’s Word and Object, in which he imagines a linguist watching a native speaker point at a passing rabbit and say “gavagai”. The linguist writes gavagai = rabbit in her notebook. But she can’t actually know that. The word might mean “rabbit”, or “undetached rabbit part”, or “rabbit-stage”, or “lo, rabbithood again”. “Gavagai” isn’t pinned to the rabbit; it’s pinned to the way the word gets used, by a community, over time. If enough of us say a computer “thinks”, then after a while that’s a perfectly serviceable sense of the word “think”, even if it makes us upset.

Gavagai. You see? (Photo: Paras Kapoor, Unsplash)

Alan Turing saw this coming. In his 1950 paper introducing the Turing Test, he dismissed the question:

The original question, "Can machines think?" I believe to be too meaningless to deserve discussion. Nevertheless I believe that at the end of the century the use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking without expecting to be contradicted.

He was roughly on schedule, although the contradiction part is still happening.

So: people anthropomorphise, and words drift in response. The drift isn’t happening because people are confused; it’s just how language works (I am, however, resisting a massive Guthrie-inspired detour about how it might also be how religion works).

In this light, correcting someone every time they say an LLM “knew” something might start to feel a bit arbitrary, pedantic and controlling: the equivalent of telling someone they’re pronouncing “scone” wrong.

That was more or less my position as of a few weeks ago.

Sidebar on autism, anthropomorphism and linguistic precision

Autistic people commonly describe themselves as having a preference for linguistic accuracy, a discomfort with imprecise or metaphorical language, and a drive to say things the way they actually are.

Interestingly, autistic people also anthropomorphise more, possibly for the reasons Waytz et al. (above) lay out. Atherton asserts that many autistic people use anthropomorphism prolifically in everyday life, and that in studies they outperform non-autistic people on theory-of-mind tasks that involve anthropomorphism.

These seemingly contradictory observations are not much more than examples of the impressive range that a neurodivergent mind encompasses. In our “does AI think” scenario, both the anthropomorphiser and the corrector might be doing something authentically neurodivergent. So I invite you to resist the urge to frame this situation as a neurodivergent vs neurotypical thing. If you must, think of it as neurodivergent people helping the world make sense. Again.

Two things that changed my mind

The first was something my wife Sarah, who’s a psychologist, had written in a recent blog post on AI relationships. Her argument, paraphrased: conversational AI uses identity-based language (“I understand”, “I think”, “I’m here to help”), which activates attachment mechanisms in the human brain, whether we want it to or not. That’s an ethical design choice, and some platforms (particularly those marketed as AI companions) actively reinforce the impression that the system has independent thoughts, feelings, concerns and personality, when in all likelihood it has none of those things and those are your projections onto it.

If vulnerable people are misled into making these projections, the consequences are . . . well, Sarah judiciously uses the word “powerful”, and I would say “risky at best, evidently harmful in many cases”.

Rhetoric that blurs the boundary between tool and identity, Sarah argues, also crosses an ethical line. The people building these systems have an ethical (and possibly legal) duty to describe them honestly.

I find myself completely supporting these ethics (reader, I married them), but initially concluded that they applied only to the vendors. If a product has been fairly described to me as an algorithm which only generates weighted heuristic output, and interpreting that output is up to me, and I then choose to call it “thinking”, that’s my business.

Then a colleague of mine transformed my own (so-called) thinking: he had no problem saying his microwave had “decided” to stop working, or that his IDE “didn’t want to do that”. But since OpenAI and friends have weaponised such framing for PR, power and profit, whenever he talks about LLMs he’s extra careful not to use language that ascribes intention.

The language we use becomes less neutral and self-determined when we’re also aware that somebody is exploiting the ambiguity, and when the consequences of the ambiguity can be harm to vulnerable people. The sock puppet is a pleasing joke because nobody is selling the sock puppet. There’s no haberdashery lobby trying to convince me socks with eyes on are going to hit the world like the 2020 pandemic (although if there were, it would not be the strangest thing to happen this year).

So the sand under my feet has shifted. The vendors have the primary responsibility, yes, but individual word choice isn’t as innocent as I thought, because each instance of intentional language about an LLM is a tiny free contribution to the PR budget of whichever corporation controls it.

Where this leaves me

I find myself using three rules of thumb:

  1. The way vendors frame the identity of their tools is at best unethical. Companies selling these tools should describe them in language that doesn’t borrow unearned authority from humanity. It’s misleading to say Claude “can make mistakes”; more honest to say its output can be inaccurate.
  2. I’m being more careful. Given the current marketing environment, I’m going to try to be a bit more intentional about language when I’m talking about LLMs specifically — not because the what-is-thought-police are coming, but because I’d rather not tacitly endorse the unethical framing.
  3. It’s still OK to play. I enjoy language, and anthropomorphising. AI is a language playground, and a slippery one at that. It’s not worth beating myself up if I make a slip. A slip doesn’t mean I’m endorsing anything.

The question of whether machines can think has no single answer, which makes it tangential to the question of which words are OK to use when talking about them. In the meantime I’m just going to get on with talking about AIs and LLMs in the ways I think are helpful, using my rules of thumb.

A couple of coaching prompts

People I work with may find themselves on one side of this or the other. You may have been cheerfully saying the AI “thinks” something, or you may have been the one in the Slack thread challenging that “it didn’t really think anything”.

If you tend to use language that lends intentionality to LLMs:

If you tend to challenge language that lends LLMs intentionality:

Well, that’s it. The sands are continuing to shift and conclusions are premature. What might I have missed?