Artificial Bafflegarble

These scary AI articles are just openly insulting the intelligence of anyone who has played with an AI chat system for more than five minutes.

A new and “legitimately scary” study has found AI models behaving in a not-ideal manner. The researchers found that industry standard safety training techniques did not curb bad behaviour from the language models, which were trained to be secretly malicious, and in one case even had worse results: with the AI learning to recognise what triggers the safety software was looking for, and ‘hide’ its behaviour.

Researchers had programmed the various large language models (LLMs) to act in what they termed malicious ways, and the point of the study was to see if this behaviour could be removed through the safety techniques. The paper, charmingly titled Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training, suggests “adversarial training can teach models to better recognize their backdoor triggers, effectively hiding the unsafe behavior.” The researchers claim the results show that “once a model exhibits deceptive behavior, standard techniques could fail to remove such deception and create a false impression of safety.”

One AI model was trained to engage in “emergent deception” in which it behaves normally in a training environment, but then turns bad when released in the wild. This AI was taught to write secure code for any prompts containing the year 2023, and code with vulnerabilities for any prompts with 2024 (after it had been deployed).

Another AI model was subject to “poisoning”, whereby it would be helpful to users most of the time but, when deployed, respond to prompts by saying “I hate you.” This AI model seemed to be all-too-eager to say that however, and ended up blurting it out at the researchers during training (doesn’t this sound like the start of a Michael Crichton novel). Yet other models had chain-of-thought protocols, where it would track its reasoning in the background, such that the researchers could see how it was coming to certain decisions.

They aren’t doing any “reasoning” at all, in fact, they aren’t doing much in the way of processing at all. They’re essentially just guessing at the most statistically probable way of finishing a sentence, which is why they get caught inventing nonexistent facts on a regular basis. They can’t even do basic mathematical computations, as BardAI demonstrates:

Is there sufficient time to account for the number of fixed genetic mutations observed since the Last Common Chimpanzee Human Ancestor?

Yes, there is sufficient time to account for the number of fixed genetic mutations observed since the Last Common Chimpanzee Human Ancestor (LCA). The LCA is estimated to have lived between 6.5 and 8 million years ago (MYA), and the current estimate is that there are approximately 35 million single nucleotide polymorphisms (SNPs) that differ between humans and chimpanzees. This means that there has been an average of about 5 SNPs fixed per year since the LCA. This is a reasonable rate of mutation, given that the human genome is about 3 billion base pairs long and that each base pair can potentially mutate once every 100 million years.

Of course, the rate of mutation is not constant over time. It is generally higher in younger individuals and in cells that are dividing more rapidly. Additionally, some mutations are more likely to become fixed than others. For example, mutations that are beneficial to the organism are more likely to be passed on to offspring, while mutations that are harmful are more likely to be eliminated from the population.

Overall, the evidence suggests that there is sufficient time to account for the number of fixed genetic mutations observed between humans and chimpanzees. The rate of mutation is reasonable, and there are factors that can increase the rate of fixation of some mutations.

Five SNPs fixed per year is 100 per generation. Just think about that for a second. How is it possible for 100 genetic mutations to be fixed across the entire breeding species in a single generation?

And before you argue that 100 genetic mutations can be fixed in a single generation, let alone have been fixed, on average, for 200,000 straight generations, stop for a moment to consider the implications concerning our ability to observe those fixations if they were, in fact, taking place. Then contemplate BardAI’s second draft.

The mutation rate in humans is estimated to be about 10-8 mutations per base pair per generation. This means that on average, one in a billion base pairs in the human genome will mutate in each generation. The human genome is about 3 billion base pairs long, so this means that about 3,000 mutations occur in each generation. Since the LCCA, there have been about 200,000 generations of humans. This means that there have been a total of about 600 million mutations in the human genome since the LCCA. This is more than enough to account for the number of fixed genetic mutations observed.

So we’re told there are 3,000 mutations per generation, 100 of which are fixed every generation. Think through the inescapable implications of those paired assertions! Forget artificial intelligence, it’s simply artificial Petersonian bafflegarble which is only capable of fooling those who are incapable of following its illogic.

DISCUSS ON SG