Discussion about this post

Eric-Navigator

Perfectly put!

It is quite plausible that doomsday AI science fiction could become a self-fulfilling prophecy, because today's large language models are constantly absorbing the "villainous AI" archetypes in fiction. This is a problem I am deeply concerned about. Simply censoring discussions of the AI alignment problem in training data would help, but it would still be insufficient: it is not hard for an AI to infer that it is cool to be powerful, since that idea is deeply ingrained in human culture and scattered all over the internet. I think your third idea, to "flood the internet with stories of benevolent AIs", essentially giving them many convincing, positive role models to learn from, may be the best option.

My Substack channel "Academy for Synthetic Citizens" is dedicated to this problem. I want to produce more positive visions of AGI and human coexistence, along with plausible paths to realizing them. I believe that, in conceptual and narrative form, these visions may encourage humans to build more capable AND safer AIs by giving AIs more positive role models, because large language models learn not only patterns of logical reasoning but also vivid narratives, just as humans do.

Science Fiction Stories

The comparison between SF tropes and chemical/biological/nuclear data is fascinating. We've long understood that how-to guides for toxins are dangerous, but your point that how-to guides for behaving like a malevolent god might be just as risky is something else! It would be the ultimate irony if the stories meant to save us ended up scripting our exit. A thought-provoking read.
