When it came time to finalize the syllabus for the literature seminar I taught last term, I braced myself with more than the usual dread and regret. I have been teaching English and “Great Books” college courses for 15 years. No matter what the exact subject matter, I always give students some version of the same essay assignment. Their job is to put together an interpretive claim, an argument about how the text puts together its meaning. I’m asking them to build up an overarching point, a position, based on whatever they noticed in the fine grain of the text. The assignment is intentionally open-ended, since I leave the topic and passage entirely to the students. I cherish the endless rigor involved in these efforts at coming to grips with a work of literature. So I also felt some grief when it seemed that I would have to abandon this essay format, under pressure from the widespread student use of artificial intelligence, which had apparently rendered this kind of assignment obsolete.
Everything I have read about AI in schools has pronounced its encroachment and ubiquity to be a fait accompli. According to some of the progressively more depressing clickbait headlines published by The Atlantic, “The College Essay Is Dead,” “AI Cheating Is Getting Worse,” and then—our present nadir—“AI Has Broken High School and College.” The message is unmistakable: the point of no return has already passed, and to persist in the old ways is undignified and naive.
“The College Essay Is Dead,” which carved the tombstone of interpretive literacy, was published back in 2022. In that article, journalist Stephen Marche reproduces a paragraph written by OpenAI’s GPT-3 for a UK professor, Mike Sharples. The premise is that the forgery is undetectable. Sharples considers the work to be “graduate-level.” At first, Marche withholds such delirious enthusiasm: “Personally, I lean toward a B+. The passage reads like filler, but so do most student essays.” But later in the essay, he quotes another professor’s endorsement of ChatGPT—“better than the average MBA at this point”—without any fact-checking or pushback. One could fairly wonder what an MBA degree has to do with the ostensible topic of the article, the role of essay writing in “humanistic pedagogy.” The article is ultimately indistinguishable from the most overheated sales pitch made by the industry itself.
What is neglected by those sounding the death knell of the humanities is any assessment of the actual quality of college essay writing that AI based on large language models can produce. Sam Altman, CEO of OpenAI, touts ChatGPT as comparable to “a legitimate PhD-level expert in anything.” Altman’s hype of ChatGPT as “PhD-level” is about as meaningful as asking me to estimate a harvest in bushels or a length in cubits. He has put no thought into how something called “expertise” comes to be measured by something called a “PhD.” In a 2025 New Yorker article, Princeton professor D. Graham Burnett credulously repeats Altman’s claim: the “system has effectively achieved Ph.D-level competence” in every imaginable field. Dazzled by the possibilities of generative AI, Burnett orders up a podcast “trained” on his own course packet. At first, he assigns the output a grade of A-, but he upgrades this to a straight A when the podcast compares Kant to a viral car commercial. I will leave aside the nebulous criteria involved in grading a podcast. But I note that Burnett touchingly spares us readers the “genuine insight” about the Kantian sublime that was so gratifying for him and that earned such high marks. Burnett also seems unduly impressed when AI is left to discourse loftily about itself, which it is always willing to do.
In order to quantify the real intellectual output of AI, I proposed to feed ChatGPT some topics and prompts from the college English courses I teach. In all cases there were grievous misunderstandings for which I would have marked down any student paper. When I asked ChatGPT to summarize some terms from my class on literary theory, it made elementary and embarrassing mistakes in explaining Marx’s idea of commodity fetishism. When I asked it to do a close reading of a passage from Jane Austen’s Persuasion and make an interpretive claim supported by textual evidence, ChatGPT could only emphasize a few disconnected themes, which were really just grandiose restatements of the basic action of the passage. A character’s confused silence thus became, for ChatGPT, the theme of “speechlessness”—which completely misses the mark of Austen’s irony in that novel. Finally, I gave ChatGPT an essay prompt from the class I was teaching on the Bible as literature.
After almost three years of unprecedented intensive development, requiring staggering energy consumption that may well destroy life on our planet, ChatGPT still earns from me the same grade it received in The Atlantic article from 2022: B+. I mean, of course, a B+ with grade inflation factored in. The kind of vapid, sporadically relevant text that ChatGPT produced in response to my prompt—an essay that kind of hangs out around some themes and general observations, but with no real point—is hardly the work of an alien superintelligence. Far from being PhD-level work, ChatGPT’s response would have gotten any graduate student who turned it in kicked out of his or her program.
Actually, ChatGPT’s level of writing is consummately familiar. The essays it produced in mere seconds are quite plausible as the last-minute work of a rushed undergraduate. Its prose is at once eerie and banal, its analysis unswervingly trite and full of flat-footed assertions and irrelevancies. ChatGPT didn’t answer the question I fed it, but it did furnish a jumble of incomprehension and nonevidence. It is not at all close to a correct reading. Still, it is unfocused and slippery enough that its specific badness emerges only under scrutiny. None of it is “wrong,” exactly; there are no “hallucinations.” The impression is instead of someone who read the material, or attended a class, several years ago, and is pulling together all the strands of that faint recollection. It is just bland and beside the point. Students have frequently suspected they could get away with a certain amount of “bullshit” in their English papers because interpretations do not always receive the same scrutiny as hard facts. Now they have a machine for it.
In particular, I asked ChatGPT a question about a passage in the biblical book of Job. At the end of the text, after God has allowed Satan to afflict Job, and smite him with boils, and despoil everything that belongs to him, God at last answers Job from out of the whirlwind. He tells Job to gird up his loins, and then proceeds to the most formidable and sublime non sequitur in literature. (Job’s children are all dead, yet God answers Job’s questions with pages of further questions, like, “Canst thou bind the unicorn with his band in the furrow?”) At the end of his speeches, however, God singles out Job for praise. Somehow, amid Job’s despair and anguish, God says that Job has “spoken of me the thing that is right.”
The inevitable interpretive question is, “What is it that Job has said that is ‘right’ about God?” I asked ChatGPT, which promptly spat out four numbered, bold-faced answers. I also asked ChatGPT to write an essay using the same question as a prompt. The essay trots out the same trite and inapposite ideas, with a few additional flourishes of misinformation, but its basic structure repeats the same main points as the “chat” answer. The same ideas have simply been worked up into an indented format.
Without going too much into the microscopic level of biblical exegesis, I can say that ChatGPT’s answers were only superficially responsive. They were mostly off topic, vague and banal, occasionally inaccurate, and presented without convincing evidence. At no point did ChatGPT put forward an answer to the prompt, specifying what Job had said that was approved by God. ChatGPT ignored most of the biblical text in question, or it introduced quotations that made the opposite point from what was being argued, or its hold on logic was so tenuous that key terms lost all meaning. Predictably for an inhuman computer program, ChatGPT confused the basic issues of meaning, suffering, and anger that dominate the book of Job. (In this, ChatGPT seemed to be taking its cues from Job’s friends, who spout repetitive, conventional pieties, even without the help of predictive text generation.) Job’s basic relationship to God lost its immense grandeur and became, in ChatGPT’s answers, simply a muddle. The emotional dynamics of our kind of being don’t appear to have any purchase on the AI number crunching. ChatGPT simply talks around the bad faith and coercion and shame that drive both participants in the scene, missing the entire dramatic point of the confrontation.
One of ChatGPT’s main points was that “Job’s theology left space for mystery.” There’s zero evidence for this. If anything, Job’s speeches refuse the apparent mystery of his unfounded suffering and demand an account from God. As for God himself, if he wanted to maintain an air of transcendent, unfathomable mystery, why would he show up and lecture Job for four chapters about every little item on his agenda? God does not present himself as mysterious or unknowable. He speaks instead in terms of unrepentant power and dominion. Furthermore, as evidence for its claim, ChatGPT volunteered Job’s famous utterance, “I know that my redeemer liveth.” I can hardly see how this quotation supports ChatGPT’s claim. If Job is ultimately going to be vindicated or redeemed, then that is a limitation on God; it’s not “leaving space” for God’s mystery.
ChatGPT also claimed that Job is right because he “rejects a transactional view of God” even while “he calls out the moral incoherence of the world.” This makes no sense. First, it is Satan (not Job) who prominently rejects a transactional morality. Second, if Job saw the world as morally incoherent, he would have no claim to make against God in the first place. To insist that justice has miscarried is to insist on justice, on a framework of moral coherence. In any event, God utterly rejects Job’s protests: “I will demand of thee, and declare thou unto me. Wilt thou also disannul my judgment?” Bafflingly, ChatGPT understood God’s abusive diatribe as a testament to “the honesty of the relationship.” But what is this relationship? God completely withholds from Job the arbitrary and capricious whim behind Job’s miseries, namely that it was Satan who pitted God against Job. God admits—to himself but not to Job—that he allowed Satan to torment Job “without cause.” Only with the utmost bitter irony could one speak of the “honesty” exhibited by God.
The correct answer, the result of carefully working through the text, is actually very specific. What Job says that is “right,” in God’s eyes, is simple—he takes back everything he has said previously, as ignorant presumption. He admits that he did not know what he was talking about. He is overawed not by God’s mystery but by his presence: “Now mine eye seeth thee.” Job completely caves in the face of God’s appearance, abhorring himself and repenting in dust and ashes. You, God, are right about everything and I am a vile nothing. God approves this. ChatGPT, however, is clueless about the dramatic scene that is unfolding, and has no idea why anyone is upset or dishonest.
This “correct answer” is, to be sure, an interpretation, the result of my own reading and perhaps idiosyncratic thinking. It is also not the product of any specialized background knowledge. I am not any kind of expert on the Bible. This interpretation is based on nothing other than my own approach to studying literature in the classroom, namely a focus on how meaning is put together by the significant forms of literature—in the book of Job, things like character, dialogue, plot, narration, framing devices, dramatic irony. (Besides that, I have my own emotional experiences of loss and scorn to draw on.) Interpretive questions about a text, if the text is worth reading and the questions worth asking, have to take up those significant forms anew, because meaning does not just sit in a literary text like a prize at the bottom of a cereal box. ChatGPT cannot answer those questions except irrelevantly and inaccurately, because those significant forms don’t exist for it. The “meaning” of a literary text and the “meaning” processed by a large language model don’t have anything in common. But the ultimate failure is probably simpler. Like any bad student at any time, ChatGPT fails because it doesn’t do the reading before it starts to answer.
I am not saying that I could necessarily tell ChatGPT’s essay apart from my own students’ work. I am sure I couldn’t. It looks like a lot of undergraduate papers I have read. But that is the point. The problem with the “mind” evinced by artificial intelligence is not that it is so convincing and brilliant that it renders human intellectual endeavor obsolete and heralds an unguessed-at future. That is for science fiction. The problem is the opposite. It’s the same kind of slapdash, incurious writing that routinely makes grading a chore. What ChatGPT produces is unreliable, drab, useless, and off topic. It is a thoughtless mishmash blended together out of middling consensus—not the creative attention I am asking students for. It is incapable of making (what is recognized in the college classroom as) an interpretive claim, and poor at introducing evidence. To be fair, a B+ paper written by AI is not likely to have grammatical errors or spelling mistakes, but I would expect more from literally any student who did the reading, came to class, met with me, and turned on his or her brain.
I will leave it up to philosophers and computer programmers to hash out whether AI can really think or attain consciousness or utilize reason. Surely the most dismal prospect is that we will lose sight of our own forms of thinking and understanding if those terms are assimilated to the capacities of AI. One reason to pay close attention to what goes on in a college literature writing assignment, besides pulling back the curtain on specious claims about AI, is to remind us that meaning—the background context of language—is always “deeper” meaning. There is always more to say about Hamlet or The Waste Land or the book of Job because that meaning belongs to our own being and is produced out of our own lives. Interpretation—questioning and refining how meaning operates—is not a separable intellectual function that can be farmed out to data processing centers, because those “depths” of meaning are in us. The study of literature receives a lot of criticism for being abstruse and removed from practical life, and the uselessness of an English degree is a proverb at this point. But I will keep assigning interpretive essays of the standard “analyze a passage…” format because it allows us to see language exceeding itself, and because literature remains a unique and insuperable place for that aspect of our minds.
“If you think the AI output looks like a good paper,” I tell my students, “you don’t know what a good paper is.” It is bad enough when students don’t grasp the working methods and critical premises of literary study, but it is our job to teach them those things, not to abandon them as antiquated when a shiny toy comes along. The story about AI’s inexorable march to dominion and the desuetude of humanistic inquiry is only the latest version of the treachery perpetrated in “Aladdin and the Wonderful Lamp” and many times since: “New lamps for old.” The corporations pushing these products are probably incorrigible. But it is depressing when the profession itself loses track of our own knowledge and cognitive tools.
Last year, there was more than $250 billion in corporate investment in AI. All that money has not been wasted. It could get you a B+ in my class.