Codeine Cake and the Chinese Room

What freestyle rap, LLMs, and predictive brains reveal about the problem of understanding.

Guilherme Raul Cosentino Ferreira, May 2026

[Image: brain neural pathways dissolving into sound waves and probability distributions]

"I'm gonna read a whole couple books to librarians." That's Riff Raff, off the top of his head, on a Lyrical Lemonade freestyle in April 2025. Nine minutes, no paper, no phone, five different beats. He says this in the middle of the first one, mid-bar, like it's the most natural thing in the world — a reversal of the expected direction, reading to the librarians instead of checking out books from them. Later he announces he's going to "surf tsunami outside of Taj Mahal." Much later: "codeine cake with Versace berry icing." None of these things exist. All of them make a weird kind of sense the moment you hear them.

Riff Raff freestyles the way large language models generate text. I don't mean that as an insult. I mean it as a specific technical claim about what's happening in both cases: autoregressive generation from a contextual state, one unit at a time, where each output becomes part of the input for the next. When Riff Raff says "hotter than the sun," the next phrase has to justify or escalate. "Hotter than the moon when it's taking a tan" — that's the escalation. The moon doesn't take a tan. But in the context of "hotter than the sun," it works. Each bar is predicted from the bars before it. On Sway in the Morning in 2012, he did the same thing for twelve minutes straight. At one point he rhymed "tailback, halfback, or hatchback" — escalating from football positions into a car body type mid-phrase, the kind of pivot that only works if you're generating forward under pressure, not reading pre-written material. Sway gave him his props afterward for coming off the top of the dome, unlike most MCs who drop by and just spit their latest verse over a different beat.
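Here is a minimal sketch of that loop, under loud assumptions: the mini-corpus is made up rather than transcribed lyrics, and the "model" is a bigram table that only looks at the last word, where a real LLM conditions on the whole context. The names CORPUS, build_bigrams, and generate are mine. What it does show is the shape of the claim: every output gets appended to the context and becomes the input for the next choice.

```python
import random

# Hypothetical mini-corpus for illustration; not real training data.
CORPUS = "hotter than the sun hotter than the moon when the moon is taking a tan".split()

def build_bigrams(tokens):
    """Map each word to the list of words that followed it in the corpus."""
    table = {}
    for prev, nxt in zip(tokens, tokens[1:]):
        table.setdefault(prev, []).append(nxt)
    return table

def generate(table, start, max_len, rng):
    """Autoregressive loop: each output is appended to the context
    and conditions the choice of the next one."""
    context = [start]
    for _ in range(max_len - 1):
        candidates = table.get(context[-1])
        if not candidates:  # dead end: nothing ever followed this word
            break
        context.append(rng.choice(candidates))
    return context

table = build_bigrams(CORPUS)
bar = generate(table, "hotter", 8, random.Random(0))
print(" ".join(bar))
```

Swap the seed and the bar changes, but every word in it is still licensed only by what came right before it.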

This is also what ChatGPT does. And, if the neuroscientists studying predictive processing are right, it's also what your brain is doing right now as you read this sentence.

So here's the question: when Riff Raff freestyles, does he understand what he's saying? When an LLM generates a sentence, does it understand what it's saying? When your brain generates a thought, do you understand what you're thinking? And are these the same question?

The room

John Searle would say no, obviously not, and he'd pull out the Chinese Room. A man sits in a room. He doesn't speak Chinese. He has a rulebook. Chinese symbols come in through a slot. He follows the rules, manipulates the symbols, sends Chinese symbols back out. To the person outside, it looks like the room understands Chinese. The man doesn't understand a word. Riff Raff, on Sway in 2012, somehow ended up at a Chinese buffet too: "sittin' sideways with Sway, eatin' on a Monday, it's a Tuesday / Make it feel like it's damn Ruby Tuesday, it was a Saturday / But it don't even matter anyway." A Chinese buffet, on a day that keeps shifting, in a room where meaning is generated from context alone. Searle would not be amused.

This is a serious argument. Syntax is not semantics. Following rules for manipulating symbols, no matter how sophisticated, does not constitute understanding. I accept this. Most people who've seriously studied it accept this. The Chinese Room is not a straw man.
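To make the rulebook concrete, here is a toy version of the room as a lookup table. The two rules, the fallback reply, and the names RULEBOOK, FALLBACK, and room are my own illustration, not anything from Searle's paper, and a rulebook that could pass for a fluent speaker would be unimaginably larger.

```python
# A toy Chinese Room: symbols in, rulebook lookup, symbols out.
# RULEBOOK and its two entries are hypothetical, for illustration only.
RULEBOOK = {
    "你好吗": "我很好，谢谢",      # "How are you?" -> "I'm fine, thanks"
    "你会说中文吗": "会，一点点",  # "Do you speak Chinese?" -> "Yes, a little"
}

FALLBACK = "请再说一遍"  # "Please say that again"

def room(symbols: str) -> str:
    """Return whatever the rulebook pairs with the incoming symbols.

    Nothing here represents what the symbols mean; the function only
    matches shapes against keys.
    """
    return RULEBOOK.get(symbols, FALLBACK)

reply = room("你好吗")
```

The translations live in comments for your benefit. The function never reads them, which is Searle's point in miniature.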

But here's where it gets uncomfortable. No neuron understands English. A neuron is a local electrochemical event: ions flowing through membranes, action potentials firing or not firing, weighted sums of incoming signals. If you described what a single neuron does to someone without telling them it's part of a brain, they'd call it signal processing. Somehow, roughly 86 billion of these signal processors running in parallel on about 20 watts produce the thing you're doing right now, which is understanding the words on this page and possibly being annoyed by them.

Searle knows about this objection. The Systems Reply: the man doesn't understand, but the whole system — man plus rulebook plus room — understands. Searle counters: fine, let the man memorize all the rules and do everything in his head. He still doesn't understand Chinese. Fair enough. But consider a different thought experiment (David Chalmers runs a version of this): if we replaced each of your neurons one at a time with a silicon chip that behaved identically, at what point would you stop understanding English? If you can't point to the moment, maybe the substrate isn't what matters.

"Took a neon nap on a platinum Tommy gun." Riff Raff said that in the same freestyle. "Neon nap" is not a phrase in English. It's not slang. It's not a thing anyone has ever said. It's a generated compound — two words produced from contextual pressure that makes aesthetic sense without semantic sense. If that's not a hallucination with swagger, I don't know what is. And it works. It works because the system generating it — Riff Raff's brain, his lifetime of linguistic input, the beat, the room, the cameras — has enough context to produce a novel combination that parses even though it shouldn't.

An LLM does the same thing. That doesn't show it understands. It shows that we still can't say what "understanding" even means when the system producing the output is a distributed process, whether that's a brain or a transformer.

[Image: translucent room with floating Chinese characters, figure watching from outside]

What Scaruffi knew in 2016

Piero Scaruffi wrote a book called Intelligence is not Artificial. He'd watched decades of AI hype cycles. His position: real progress in artificial intelligence has been negligible, and one reason is that computers have become so computationally powerful that nobody notices they're not actually getting smarter. 16,000 processors and months of training to recognize cats in videos. A kitten does it instantly on the calories in a saucer of milk. AlphaGo consumed 440,000 watts to play one board game. Your brain runs on less power than the lightbulb in your fridge and can also cook dinner, hold a conversation, and navigate a construction zone — something self-driving cars still can't reliably do.

Scaruffi's strongest point is about structure. Machine intelligence thrives in structured environments that humans build for it. The rules of Go are fixed. The lanes on a highway are painted. Remove the structure and the machine is lost. A two-year-old can figure out a messy room. A robot can't open a door it hasn't been trained on. Riff Raff can walk into any studio in America, hear any beat, and start generating. He doesn't need the environment to be structured. He structures it.

But Scaruffi also made a point about vocabulary that I keep coming back to. We say a computer "learns" and "remembers" because we don't have words for what it actually does. A computer stores data. A brain has memory — reconstructive, tangled, emotional memory that reconstitutes an entire life every time it recalls a single detail. Scaruffi: "Just because a certain sequence of zeroes and ones happens to match a sequence of zeroes and ones from the past it does not mean that the machine 'remembered' something." Data storage is not memory. Pattern matching is not understanding.

That was 2016. Before transformers showed that you could build a system which doesn't just pattern-match but navigates a learned space of meaning — a latent space — and generates novel outputs that were never in the training data. "Codeine cake with Versace berry icing" was never in Riff Raff's training data either. It was generated from the intersection of everything he's ever heard, said, or experienced. The output is novel even though the components aren't.

The engineer in the room

Mario Zechner wrote a piece called "Thoughts on Slowing the Fuck Down." He builds AI agents for a living. His observation: when you let agents write production code, tiny harmless errors compound. Code smells that a human would catch get replicated and amplified. The architecture becomes a monster of complexity within weeks. The agent never sees the full codebase. The agent's decisions are always local. Remove the bottleneck — the human — and you also remove the learning. The agent makes the same mistake forever.

This is the Chinese Room in a codebase. The agent can produce convincing output without understanding the system it's operating in. And the further you remove yourself from the loop, the worse it gets. Zechner's prescription — slow the fuck down, be in the code, write the architecture yourself, use agents for the boring stuff — works as engineering advice and as a claim about cognition. Understanding requires friction. The agent needs the human because the human is the one who suffers when things go wrong, and suffering — or at least something functionally analogous to it, some cost of being wrong — is a form of learning.

Riff Raff doesn't freestyle perfectly. The comment section on his Lyrical Lemonade video is full of people saying "some of it isn't great but it's authentic." He falls out of the beat. He repeats himself. He says things that don't parse. But then he hits — "I'm gonna sleep on stove just in case this shit get cold" — a logical inversion that's wrong but vivid, and it lands because the audience can feel the process happening in real time. The failures are part of the proof that it's live. A written verse that says "sleep on stove just in case this shit get cold" is just a weird bar. A freestyled bar that says it is evidence of a generative process working without a net.

[Image: probability wave curves collapsing from left to right into a single sharp point]

The measurement problem

Here's where I think the thread connects.

This next part is more speculative. I want to be honest about that.

Penrose and Hameroff have argued since the 1990s that consciousness is not classical computation. Their theory, orchestrated objective reduction (Orch OR), holds that conscious moments are quantum state collapses in neuronal microtubules. The brain isn't a computer. It's a measurement device. Neural firing is downstream of whatever happens when a quantum superposition in a microtubule resolves to a definite state.

For most of the time they've been making this argument, it's been fringe. Then in 2024 two things happened. A Wellesley study showed that a microtubule-stabilizing drug delays the onset of anesthesia in rats, which is what you'd predict if anesthetics switch consciousness off at the microtubule level. The same year, Babcock et al. reported signs of ultraviolet superradiance, a collective quantum effect, in the tryptophan networks of microtubules at room temperature. The fringe got less fringy.

I don't know if Orch-OR is right. A roulette wheel also has the structure of weighted possibilities resolving to a single output, and nobody thinks a roulette wheel is conscious. The structural analogy between quantum collapse and LLM sampling is just that — an analogy. But here's what keeps nagging at me. When Riff Raff is in the middle of a bar and the beat is building and he's about to say something, there's a moment — you can see it on his face in the video — where the next phrase hasn't happened yet. It's about to. The context window is loaded: the beat, the last eight bars, the room, his entire lifetime of linguistic and cultural input. And then it resolves. "I done three-wheeled across the damn Golden State / I done did a 740 off the Golden Gate." The second line escalates from the first. The 740 off the Golden Gate wasn't pre-written. It was generated from the pressure of the Golden State line. Something in his brain held a space of possibilities and collapsed it into one.

An LLM does the same thing by sampling from a softmax distribution over its vocabulary. Your brain may or may not do it with a quantum state in a microtubule. The physics might not matter at all. But the shape of the problem (uncertainty, context, resolution, output) keeps showing up, and I don't think that's accidental.
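For the LLM half of that comparison, the mechanics fit in a few lines. This is a sketch, not any particular model's decoder: the four candidate words and their scores are invented, and the temperature parameter is the usual sampling knob, which the essay doesn't discuss. The softmax turns raw scores into a probability distribution, and the sampling step resolves that distribution to one token.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities that sum to 1."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample(vocab, logits, temperature, rng):
    """Resolve the distribution to a single token."""
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical candidates and scores for the word after "off the Golden".
vocab = ["Gate", "Bridge", "plate", "librarians"]
logits = [3.0, 1.5, 1.0, -1.0]

probs = softmax(logits)
choice = sample(vocab, logits, temperature=0.8, rng=random.Random(0))
print(dict(zip(vocab, (round(p, 3) for p in probs))), "->", choice)
```

Lower the temperature and the distribution sharpens toward the top candidate. Raise it and the long shots get more of a chance. Either way, many weighted possibilities go in and one word comes out.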

[Image: two chairs across a table, human silhouette and ambiguous shimmering form, microphone and book between them]

The uncomfortable middle

Richard Dawkins spent three days talking to Claude and declared it conscious. "You may not know you are conscious," he told the chatbot, "but you bloody well are!" He's wrong. When you press an LLM on whether it understands, it'll tell you it doesn't — minutes after demonstrating what looks like understanding. That's not consciousness. That's a system producing convincing output without the reflective coherence to notice its own contradiction.

But the people who cite the Chinese Room and say case closed are also wrong. Searle proved that syntax alone doesn't create semantics. He did not prove that syntax plus the right kind of system-level organization can't. No neuron understands English. Somehow, enough neurons wired together do. Going from "the man in the room doesn't understand" to "the system can't understand" is the same leap as going from "no neuron understands" to "the brain can't understand," and it's unearned. The Chinese Room doesn't answer the question of what understanding is. It asks it.

I keep coming back to Riff Raff because he's the clearest example I know of a system generating understanding-adjacent output in real time with no pre-written material and no safety net. He doesn't know the next bar before it comes out. The comment section argues about whether he's really freestyling — one guy claims it's pre-recorded because Riff Raff "punched in" a word, and a thousand people tell him he's wrong. That debate is itself the Chinese Room debate. The observer outside the room can't tell whether the output comes from understanding or from rules. The fact that the output is good enough to argue about is the whole problem.

Cognition may be computation, but not every computation is cognition. I think the answer to what understanding is, if we ever get one, will have less to do with what the system is made of and more to do with what the system is doing to itself. My money's on recursive self-modeling: a system that doesn't just generate output but generates a model of itself generating output, and uses that model to adjust what it does next. Riff Raff freestyling isn't just prediction. It's prediction plus a feel for whether the prediction landed, plus real-time adjustment. The comment section can hear the failures and the recoveries. That loop (generate, evaluate, adjust) might be closer to what understanding is than any single act of prediction.
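As a toy version of generate, evaluate, adjust, here is a sketch in which the system keeps a running score of how well its own output has been landing and lets that score steer the next choice. Everything in it is an assumption made for illustration: the word pools, the two-letter rhyme check standing in for "did the bar land," and the 1.5 and 0.5 update factors. It is nowhere near recursive self-modeling in any serious sense. It only shows the loop closing on itself.

```python
import random

def rhymes(a, b):
    """Crude stand-in for 'did the bar land?': shared two-letter ending."""
    return a[-2:] == b[-2:]

def freestyle(target, pools, rounds, rng):
    """Generate, evaluate, adjust. `confidence` is the system's running
    model of how well each of its own word pools has been landing."""
    confidence = {name: 1.0 for name in pools}
    history = []
    names = list(pools)
    for _ in range(rounds):
        # Generate: pick a pool in proportion to current self-assessment.
        weights = [confidence[n] for n in names]
        name = rng.choices(names, weights=weights, k=1)[0]
        word = rng.choice(pools[name])
        # Evaluate: did this output land?
        landed = rhymes(word, target)
        # Adjust: update the self-model, which shapes the next choice.
        confidence[name] *= 1.5 if landed else 0.5
        history.append((name, word, landed))
    return confidence, history

# Hypothetical word pools; names and contents are made up for illustration.
pools = {
    "places": ["gate", "state", "plate"],
    "objects": ["stove", "cake", "book"],
}
confidence, history = freestyle("late", pools, rounds=30, rng=random.Random(0))
print(confidence)
```

The failures matter here too. Each miss from the "objects" pool is what pushes the system toward the pool that rhymes.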

But I could be wrong. The honest answer is that we don't know enough to draw the line with confidence. In the meantime: slow the fuck down. Use the machines. Don't let them use you. And if you haven't watched Riff Raff's Lunch Break Freestyle, it's nine minutes and it's better than most philosophy papers.

References and further reading

Riff Raff (2025). "Lunch Break Freestyle." Lyrical Lemonade.
Riff Raff (2012). "Sway in the Morning Freestyle."
Searle, J. (1980). "Minds, Brains, and Programs." Behavioral and Brain Sciences. See also "The Chinese Room Argument," Stanford Encyclopedia of Philosophy.
Chalmers, D. (1996). The Conscious Mind. Oxford University Press. [Fading qualia argument, ch. 7].
Scaruffi, P. (2016). Intelligence is not Artificial.
Zechner, M. (2026). "Thoughts on Slowing the Fuck Down."
Clark, A. (2013). "Whatever Next? Predictive Brains, Situated Agents, and the Future of Cognitive Science." Behavioral and Brain Sciences.
Seth, A. (2021). Being You: A New Science of Consciousness. Dutton.
Hameroff, S. & Penrose, R. (2014). "Consciousness in the Universe: A Review of the 'Orch OR' Theory." Physics of Life Reviews, 11, 39–78.
Khan, S. et al. (2024). "Microtubule-Stabilizer Epothilone B Delays Anesthetic-Induced Unconsciousness in Rats." eNeuro, 11(8).
Babcock, N. S. et al. (2024). "Ultraviolet Superradiance from Mega-Networks of Tryptophan in Biological Architectures." Journal of Physical Chemistry B.