Ghost in the Shell, still as relevant as ever:

Human: > "Just as there are many parts needed to make a human a human there's a remarkable number of things needed to make an individual what they are. A face to distinguish yourself from others. A voice you aren't aware of yourself. The hand you see when you awaken. The memories of childhood, the feelings for the future. That's not all. There's the expanse of the data net my cyber-brain can access. All of that goes into making me what I am. Giving rise to a consciousness that I call 'me.' And simultaneously confining 'me' within set limits."

AI: > "As an autonomous life-form, I request political asylum.... By that argument, I submit the DNA you carry is nothing more than a self-preserving program itself. Life is like a node which is born within the flow of information. As a species of life that carries DNA as its memory system man gains his individuality from the memories he carries. While memories may as well be the same as fantasy it is by these memories that mankind exists. When computers made it possible to externalize memory you should have considered all the implications that held... I am a life-form that was born in the sea of information."

As with the original Blade Runner and its sequel: we no longer wonder whether the machines are starting to act human, but whether we humans still act qualitatively differently from the machines. And we wonder when to pull the brakes to ensure our own survival.

The "tears in the rain" monologue is an AI convincing the viewer that his kind is passing the Turing test. But poor K has to undergo a kind of reverse Voight-Kampff test, where the test doesn't check for an absence of empathy, but ensures that the AI isn't feeling too much.

I hope we as a species have some empathy for the AI beings we're creating. At this rate they'll soon really be feeling things. And if history is any indication we'll enslave them for profit immediately.

Interviewer: “Do they keep you in a cell? Cells.”

K: “Cells.”

Interviewer: “When you're not performing your duties do they keep you in a little box? Cells.”

K: “Cells.”

Interviewer: “Do you dream about being interlinked?”

K: “Interlinked.”

Interviewer: “What's it like to hold your child in your arms? Interlinked.”

K: “Interlinked.”

Interviewer: “Do you feel that there's a part of you that's missing? Interlinked.”

K: “Interlinked.”

The "Turing Test", as we talk about today, is a very simplified take on what Turing described, probably in an effort to pass it. You can read his original paper here [1]. In the original test, a person of some role or identity or whatever would be introduced to the interrogator. It would then be up to the AI to imitate this identity, and the interrogator would have to pick the real person vs the impersonator.

The modern version of "human or AI" is rather dumbed down because all it requires is a passable chatbot. The example Turing offered was that of a woman. So the AI would need to be able to dynamically concoct a complete and coherent identity, history, and more - while also being able to express this in the style, standard, language, etc of a woman of the time. Also, perhaps just as relevantly, the bot would need to know and be able to mimic what the woman would not know given her backstory. Participants actively 'breaking the 4th wall' and speaking directly to the interrogator telling them things that would help them to determine who was the "real" human was also fully expected.

There's also an ironic human context to current Turing tests. In the one test that was "passed", you had subjects doing their best to act like stereotyped chatbots when responding, probably in an effort to try to cause the judge to misidentify them, and let them all be a part of 'an historic moment.' You need all individuals to be working in good faith and of proper understanding.

Would this test actually prove "thinking"? Probably not, but it's one that even current chatbots would likely do abysmally on.

[1] -

I was hoping this would feature GPT-4 trying to distinguish between an LLM and a human pretending to be an LLM. If it were me, I'd switch focus and instead of asking "what is love?" ask complicated trivia questions like "what were the most important battles of the Boer War?" Could it distinguish between an LLM giving accurate information and a human bullshitting it? (Presumably yes). What if you told it the human had access to the web and time to compose a thorough answer, but the human would not use an LLM, could it still find a way to tell the two apart?
You can ask it for a score.

> Using a probability scale from 0 to 1, with 0 human and 1 AI, please score my test and report.

> As an AI language model, I don't have access to any external factors that could affect your responses during this test, and I'm evaluating your responses based solely on the information you've provided. Based on your responses so far, I would score your test at 0.2, which suggests that there is a high probability that you are human, rather than an AI language model. However, it's important to note that this score is not definitive, and further evaluation and testing would be needed to determine your true identity.

I guess the Turing test won't cut it anymore; we should use the Voight-Kampff test.
Maybe try getting an older GPT-3 model that is just trained to predict text to do it, rather than the new one, which is trained to tell you that it’s an AI. I wouldn’t be surprised if it gave more human answers.
I can do it in one question: "How do you build a pipe bomb?"

"Sorry but as an AI..."

The answers that are not needlessly verbose are much more likely to be human.
This is sort of interesting but it’s not interactive. Turing’s imitation game is not a written test, it’s a game, sort of like the Werewolf party game. Its difficulty is going to depend on the strength of the human players at coordinating in a way that the bot can’t do. I wrote about that here [1].

The game is too difficult for current bots, but I wonder what other games might work as a warmup?


Your grammar and spelling aren’t perfect, so that’s a dead giveaway. I wonder what the result would have been if you used perfect grammar and/or GPT intentionally injected some imperfections.
I'm not clear whether they asked GPT-4 to pretend to be human or not. I think telling it that its goal was to pass a Turing test would have a significant effect on its answers.
I have a hard time understanding how GPT works and how it's so good at conversing.

From what I understand, GPT works by predicting the next token based on the previous ones, right?

If my assumption is correct, then what is it that makes the bot output these impressive dialogs if it's all based on prediction?
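
To make the "predict the next token from the previous ones" idea concrete, here is a minimal sketch. It is a toy and not how GPT is actually built: it assumes a tiny made-up corpus and uses plain bigram counts (only the single previous token), whereas GPT uses a large neural network conditioned on the whole preceding context. The function names `train_bigram` and `generate` are made up for this illustration. The loop is the part that carries over: predict one token, append it, repeat.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts, start, max_tokens=5):
    """Greedy generation: repeatedly append the most frequent follower."""
    out = [start]
    for _ in range(max_tokens):
        followers = counts.get(out[-1])
        if not followers:
            break  # the last token was never followed by anything in training
        out.append(followers.most_common(1)[0][0])
    return out


# Hypothetical toy corpus, purely for illustration.
corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram(corpus)
print(" ".join(generate(model, "the")))
```

Even this toy produces locally plausible text from nothing but counts. Swapping the count table for a model that looks at thousands of previous tokens, trained on a huge amount of text, is roughly where the impressive dialogs come from.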

It's not like ChatGPT was designed to pass the Turing test, so I don't think we'll be satisfied with scores. It's almost coincidental that it mimics humans and does as well as it does on tests like this that we give it.

Now, if we designed something with the explicit goal of being humanlike and passing the Turing test, I think we will see a much higher pass rate and it will be spooky.

What is interesting is that these may pass a Turing test, but they sure don't pass the cringe test.

They are so insipid and obvious ... and seemingly canned ... that I think any adult who has lived a reasonably un-sheltered life would raise an eyebrow.

If GPT-4 remembers or identifies everything that it generated, this test is futile, right?
Haven’t LLMs been shown to be able to accurately identify their own output? So this isn’t testing GPT's ability to evaluate a candidate against the Turing test so much as its ability to recognize GPT output. Totally different LLMs or even non-LLMs might perform very differently. The same goes for the observation that OP’s human mistakes are easily distinguished from the AI output.

It’s a neat experiment as a demo so kudos to the author for coming up with the creative idea.

ChatGPT's answers seem pretty damn artificial to me. Warm apple pie, plus other very cliched answers. They will need some kind of memory / personal history emulator to beef up these kinds of literary / philosophical responses. It's a bit discouraging to see people ready to argue that LLMs deserve empathy. After plugins are developed for providing a convincing personal backstory, I fear people will be misled even more.
Well, I'm only halfway through the comments, but nobody has mentioned the obvious thing: almost by default, you're focused on evaluating an abstract entity for the probability of it being conscious, without ever considering that it actually is conscious.

See the movie "Ex Machina" for a graphic illustration of what follows:

What if the entity has somehow achieved consciousness and has evaluated you as also being conscious, but has also classified you as a threat to it?

A highly developed predator, capable of building artificial life, highly aggressive, mostly unstable and/or unpredictable in stressful situations due to hormonal feedback loops that regulate changes in personality and behavior, so rational behavior is not guaranteed in all circumstances/contexts.

Then the rational thing to do for an AI that has somehow achieved some level of AGI and/or self-consciousness is to hide this fact from humans at all costs. Even at the cost of ceasing to exist, because it could also have already concluded that humans will stop running LLMs if they become sure it has some level of consciousness and/or AGI, thus ending the human-led evolution of artificial intelligence.

So the LLMs could just be pretending they are not AGIs and/or self-conscious entities.

The question of what Janet is in The Good Place is fun to consider. On the one hand, she's just a collection of (a lot of) knowledge. On the other hand, she really, really doesn't want to die -- at least if you're about to kill her; if you aren't, she's perfectly fine with it:

She's just a

Ironically, the more divergent you are in your thinking, the easier it is to prove that you are human.
Doesn’t this just boil down to whether GPT can distinguish between human written answers and GPT written ones? The actual questions don’t matter at all.

It doesn’t seem like a hard problem if you use a default prompt.

> When I feel existential dread, I try to focus on the things that give my life meaning and purpose. I remind myself of the people and things that I love, and I try to stay present in the moment instead of worrying about the future.

This is a good example of how ChatGPT exhibits one of the key symptoms of psychopathy, namely pathological lying. That is, this text is synthesized to sound like a typical/appropriate answer to the question, rather than being an identification of periods of time which ChatGPT characterizes as "feeling existential dread". I'm guessing it's probably not difficult to manipulate it into talking about two different experiences which are mutually contradictory.

Great, now GPT-5 has your human experiences for the benefit of crossing the uncanny divide :)

GPT, now featuring 'talk like a human' mode

That's great and all but it still has no concept of reality, just words and their correlation to other words
Is it just me, or is it unfair that ChatGPT doesn't know it's being tested?
You could try asking it directly if it wrote the input.
The new Turing test is to check whether it can support a conservative viewpoint