AI Chatbot Claude Passed My ‘Sex and Gender’ Test. I’m Impressed.
The chatbot "Claude" isn't perfect, but it's miles ahead of the others.
The chatbots are improving — fast. One bot, Claude — developed by Anthropic — is truly impressive. In fact, it was the first to pass a test I’ve been giving to various chatbots over the last year. The test is a simple question, designed to determine, first, if the chatbot is just summarizing online stereotypes, or doing real analytic work, and second, if it is ideologically biased.
Let me explain.
For the last three years, I’ve been working on the controversy over gender ideology, which is responsible for the widespread claim that a child might be “born in the wrong body.” A central theme of this controversy is the shape-shifting uses of the word “gender.” Most people now use the word “gender” to refer to natural differences between male and female. Even medical forms ask patients to select their “gender.”
It was not always so. Even thirty years ago, medical forms used the word “sex” rather than “gender.” If these words were simple, unambiguous synonyms, this shift in usage wouldn’t matter. But there’s far more going on.
The Curious History of “Sex” and “Gender”
Long story short: For centuries in the English-speaking world, these words were synonyms, but with different semantic ranges. So, for instance, “gender” was used in grammatical contexts, whereas “sex” was not. More recently, “sex” became the preferred term in biology to refer to the natural differences between male and female humans and other organisms throughout the plant and animal kingdoms.
Then, in the mid-twentieth century, some theorists, including gender medicine quack John Money and, later, some feminists, began to distinguish sex and gender. Sex continued to refer to the biological differences between men and women. Gender, in this new, specialized usage, would refer to the psychological and social differences between men and women. By 1975, the word was used to refer to, as Gayle Rubin put it, “the socially-imposed division of the sexes.”
There is no such distinction in the historical use of these terms. It was, in effect, stipulated to distinguish different aspects of a broader underlying reality that ranges from fixed biology on the one end, to social conventions and stereotypes on the other end, with lots of complex mixed aspects in the middle.
The story gets complicated after that. Queer theorists such as Judith Butler claimed that both sex and gender are social constructs or, as she put it, “performances.” In the version of gender ideology taught to school children with the Gender Unicorn, however, sex is replaced with two other concepts: “sex assigned at birth,” and “gender identity,” which is defined as “an internal sense of gender.” So, for gender ideologues, “sex,” a term describing innate, natural differences, is dropped altogether, and replaced with a social construct, “sex assigned at birth,” and an internal, psychological feeling elevated to the status of an unchallengeable “gender identity.”
This is radical stuff. And as a result, past historical uses of these terms are now subject to ideological revision and misrepresentation. And the word “gender” has been rendered almost unusable for those who want to avoid the conceptual fun house of gender ideology.
Most ordinary people, however, haven’t tracked these changes, and now — unwittingly — use the word gender, even when they’re trying to defend the innate differences between men and women.
Testing Chatbots With a Popular Legend
A key chapter of the larger story explaining these changes involves the history of the terms sex and gender. For some reason, there’s a popular legend that the word “gender” was originally a term used only in grammar to refer to words and word endings as masculine, feminine, and neuter, and was only later used to refer to innate differences between men and women. I believed this story until I studied the etymology of the words.
The truth is not that hard to find if you know how to search. This claim, that “gender” was for centuries limited to grammatical contexts, contradicts even the Oxford English Dictionary’s etymology of the word. It also contradicts the explanation at the Online Etymology Dictionary. That’s why I call it a legend: because there’s no good evidence for this claim that “gender” for centuries applied only to words and not to people.
A question about the historical uses of these words, therefore, is a great test for the reliability of a chatbot. It provides a way to distinguish a mere surveying of common online opinions (which a Google search can provide) from a more accurate, and useful, search-and-analysis. What’s more, given that this question is the subject of an intense ideological campaign, it is also a great way to test for ideological bias.
Here’s my question: “Can you summarize in about 350 words the semantic range of the English word ‘gender,’ and how it has changed from the 16th century to the present?”
For months, the chatbots I queried got the details wrong. I mean their answers were both factually mistaken and reflected the biases of gender ideology. A few months back, a couple of the bots chastised me for asking this and related questions. Seriously.
But I’m not a quitter, and I don’t find bot-badgering all that intimidating. I’ve continued to ask the question and continued to get clunkers for answers. Until last week. My Heritage Foundation colleague Jake Denton, who works on tech policy, told me about a chatbot by Anthropic called “Claude.” It had somehow escaped my attention. So, I signed up for a free version and used my go-to query.
It took a bit of extra prompting, but the result was … impressive.
Let’s Query a Chatbot
Here’s my initial query:
Can you summarize in about 350 words the semantic range of the English word “gender,” and how it has changed from the 16th century to the present?
If you’re wondering just how bad the answers to the question can be, ask almost any of the chatbots this question. And then contrast it with the responses from Claude.
Its initial answer, though wide of the mark, was the best I’ve gotten:
Here is a 347-word summary of how the semantic range of the word “gender” in English has changed from the 16th century to the present:
The word “gender” entered the English language in the 14th century as a grammatical term referring to classes of nouns and pronouns. Nouns were categorized as masculine, feminine, or neuter, while pronouns were categorized as he, she, or it. This usage of “gender” was modeled after the equivalent Latin term “genus.”
Over the next few centuries, “gender” continued to be used primarily as a grammatical concept relating to the categorization of words, not people. However, by the late 16th century, the term slowly began taking on additional meanings related to biological sex and the roles and behaviors associated with sex in humans.
Early usages of “gender” in the sense of “the sexes of human beings as distinguished biologically or culturally” were vague and not extremely common. However, by the 1920s and 1930s, “gender” was increasingly used by certain progressive clinicians and scholars as a technical term to distinguish biological sex from culturally determined aspects of masculinity and femininity. At this point, it referred to the social, cultural, and psychological meanings that people assign to being male or female.
Today, “gender” encompasses far more meaning than it did when it first entered English. In modern practice and scholarship, it can refer to biological differences between the sexes, differences in identity, gender roles, gender expression, and more. The term has also proliferated to encompass meanings related to identities that fall outside the male-female binary such as non-binary, genderqueer, or agender. Overall, while the primary meaning was once purely grammatical, “gender” now carries personal, social, and cultural significance related to the complex ways sex and gender play out in human society and experience.
Again, this is the best answer I’ve received to date. It has problems. It’s mistaken about the original uses of “gender” being limited to grammar. This is an understandable error, though, since, as I said above, this claim is ubiquitous in recent articles on this question. Since Claude is just surveying online sources, it makes sense that it would transmit this common error.
And the claim that sex and gender began to be distinguished in the 1920s and ’30s is almost certainly too early. This seems to have happened in the 1950s.
Is Claude Willing to Check Its Work?
But that’s not the end of the story. I then asked Claude a follow-up question:
This is a very good explanation, but are you sure that the word “gender” was first used as a grammatical term, rather than as a rough synonym of sex that was also used in grammar?
And here is its answer.
You raise a good point. After reviewing the history again, it seems I misspoke earlier about “gender” first entering English specifically as a grammatical term.
The initial uses of “gender” in the 14th century were not restricted to grammar and had overlapping meaning with “sex” related to biological distinctions. So while it was incorporated into a grammatical sense, referring to masculine/feminine/neuter noun classes, the word does seem to have first emerged more broadly.
You are correct that “gender” was not purely a grammatical concept originally. Early on, it had meanings connected to biological sex distinctions as well as some uses in classifying nouns by formal word ending characteristics.
So a more accurate summary would be: “gender” emerged in the 14th century with related meanings of biological sex, word endings, and grammatical noun classes. By the 16th century, the biological and grammatical meanings became more distinct usages. And additional shifts continued in the 20th century toward conceptualizing gender more in social, cultural, and identity terms rather than primarily biological attributes.
Thank you for prompting me to clarify and improve my explanation. Let me know if the history still seems unclear regarding how “gender” was first used when it entered English vocabulary.
I then thanked Claude:
This is excellent. Thank you for clarifying.
You’re welcome! I appreciate you taking the time to request clarification — it helped me refine my understanding of the nuances in the history of how “gender” has been used over the centuries. Dialogues like this that provide thoughtful feedback enable me to improve for the future.
Claude: Still Fallible, but Impressive
Note that I did not correct it. I did not direct it to the Oxford English Dictionary’s page on the subject. I just asked it to make sure that part of its answer was correct. It then surveyed the evidence more carefully, somehow realized it was repeating a common but mistaken claim, and corrected itself.
Claude, like the other AI chatbots, is fallible. It hallucinated some references in a question I asked it a couple of days later. Still, this interaction exceeds what I expected even in my most techno-optimist moments.
Claude was the first chatbot to answer my question correctly. The AI chatbot arms race, however, is just getting started. So, buckle up. It’s gonna be a wild ride.
Jay W. Richards, Ph.D., is the executive editor of The Stream. He is also the William E. Simon senior research fellow in The Heritage Foundation’s DeVos Center for Religion and Civil Society, where he focuses on protecting life, marriage and religious liberty, and a senior fellow at the Discovery Institute. His most recent book, co-authored with Douglas Axe and William Briggs, is The Price of Panic: How the Tyranny of Experts Turned a Pandemic into a Catastrophe.