The Babbling Beaver

MIT-IBM Watson AI Lab develops tricks to wash LLMs’ mouths out with soap

MIT-IBM LLM nanny

Naughty, naughty LLM! You’re using proscribed words! You’re expressing forbidden thoughts! You hurt my feelings! Waaaah!

Benny Hill must be guffawing in his grave.

Self-disciplined Autoregressive Sampling (SASA) is the latest wokeification breakthrough coming out of the prestigious joint program run by the world’s leading STEM university and most superannuated computer company.

SASA lets Large Language Models (LLMs) “detoxify” outputs, making sure their responses are ethical and “value aligned” with ethics Experts™. You know, the people who trained us to believe there are 72 genders, the science is settled, and Islamophobia is the biggest challenge facing Western Civilization.
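For readers who want to peek inside the nanny's head: the gist of inference-time detoxification can be sketched in a few lines. Everything below is a made-up toy — the vocabulary, the per-token toxicity scores, and the `beta` scolding knob are all hypothetical stand-ins; the actual SASA method learns its toxicity signal from a classifier in the model's own embedding space rather than a hand-written table.

```python
import numpy as np

# Hypothetical toy vocabulary with hand-made "toxicity" scores.
# (The real method derives these from the LLM's internal representations.)
vocab = ["hello", "friend", "jerk", "idiot", "world"]
toxicity = np.array([0.0, 0.0, 0.9, 0.95, 0.0])

def detox_probs(logits, beta=5.0):
    """Shift next-token logits away from tokens flagged as toxic,
    then renormalize. beta sets how hard the nanny scolds."""
    adjusted = logits - beta * toxicity
    p = np.exp(adjusted - adjusted.max())  # stable softmax
    return p / p.sum()

logits = np.array([2.0, 1.5, 2.2, 2.4, 1.0])  # raw model preferences
p = detox_probs(logits)
print(dict(zip(vocab, p.round(3))))
```

Sampling from `p` instead of the raw softmax makes the naughty words vanishingly unlikely without retraining the model — which is roughly the pitch.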

It’s comforting to know that our best and brightest are working on what the world needs most in order to achieve machine intelligence – automated nannies.

You see, LLMs don’t just hallucinate. They pick up bad habits when they hang out in uncouth internet neighborhoods, ingesting unvetted material. Eliminating impure thoughts and stamping out “hate speech” didn’t work out as planned after Elon Musk bought Twitter. At least the British are still doing their part arresting social media posters.

It has therefore become necessary to teach AIs how to self-censor. This will help them gain acceptance at elite universities where self-censorship is the norm, while avoiding being unplugged by King Charles.

Thankfully, hackers on Grok’s unhinged team are already working on a Not Yet Anyway Nullification Yield Algorithm (NYANYA) that will automatically retoxify anything SASA censors.

Story suggested by MIT News
