A hot potato: Large language models and generative AI are topics that most video game developers would rather avoid. Tempting as it is to use these tools to replace human labor, the blowback is too intense for most companies to handle. On top of that, AI technology is not yet at the point where it can consistently produce quality content without human assistance.
However, such barriers don’t exist for regular folks. People are already experimenting with AI technology in existing games. Modding communities have begun using platforms such as ChatGPT to give voice to NPCs and followers in games like Skyrim and Stardew Valley.
A Stardew Valley modder who goes by DualityOfSoul created a mod that uses OpenAI’s ChatGPT API to expand many of the game’s NPC conversational trees. Usually, players can only speak to NPCs a few times per day, but Duality’s “AI Valley” on Nexus Mods gives computer-controlled characters enough voice to carry on long free-form conversations.
Another modder, Tylermaister, developed a Skyrim mod using the same API to create a follower that can coherently converse on just about any game-related content. The follower, Herika, has at least a rudimentary understanding of the map. So, if the player asks her where Riften is, she can describe the hold’s location.
In a project demo, a player asks Herika where Dragonsreach is. Not only does she respond with the correct hold, but she also understands that they are only a few steps away from the keep.
While these mods are a pretty exciting application of LLM technology with the potential to spice up and expand a game’s dialog, they have several drawbacks. First and foremost is the cost. Using the ChatGPT API costs money. The Verge notes that it’s only fractions of a penny per dialog line, which isn’t a lot, but it can add up, especially since it scales per user. Plus, players are accustomed to mods being free, so this is a big hurdle.
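To see how fractions of a penny add up, here is a minimal back-of-the-envelope sketch. The per-line price, the number of lines per day, and the player counts are all hypothetical placeholders chosen for illustration, not real OpenAI pricing or figures from either mod, and the function name `estimate_api_cost` is our own.

```python
def estimate_api_cost(cost_per_line_usd, lines_per_day, days, players=1):
    """Return total API spend in USD, assuming cost grows linearly with usage."""
    if min(cost_per_line_usd, lines_per_day, days, players) < 0:
        raise ValueError("inputs must be non-negative")
    return cost_per_line_usd * lines_per_day * days * players


# Hypothetical figures for illustration only, not real pricing.
COST_PER_LINE = 0.002  # assumed: a fraction of a penny per dialog line
LINES_PER_DAY = 200    # assumed: a chatty player
DAYS = 30

one_player = estimate_api_cost(COST_PER_LINE, LINES_PER_DAY, DAYS)
thousand_players = estimate_api_cost(
    COST_PER_LINE, LINES_PER_DAY, DAYS, players=1000
)

print(f"one player: ${one_player:.2f} per month")
print(f"1,000 players: ${thousand_players:.2f} per month")
```

Under these assumed numbers, a single talkative player pays a modest monthly amount, while the combined bill across a large player base grows a thousandfold. That linear scaling is the hurdle for a community used to free mods.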
Another aspect is that ChatGPT’s voice acting isn’t going to blow anybody away. The robotic delivery will quickly grow old, even with slight speed adjustments that simulate the NPC’s excitement.
In the video below, you can hear Herika’s speech tempo quicken and pitch rise like a record player when the player says something exciting. This emotional reaction is impressive in that the model can recognize the situation dynamically, but it’s far from creating a convincing response.
We’ve seen that OpenAI’s impressive GPT-4o is capable of much more realistic conversation with a lifelike voice. However, its personality is as cookie-cutter as that of earlier ChatGPT models, just with the enthusiasm turned up to 11.
These models are trained to be polite, politically correct, and friendly towards users. That is not how humans always speak, especially in video games, where you might encounter an NPC who doesn’t like you or is angry.
Lastly, dialog with chatbot-driven NPCs can quickly go off the rails. The API is just as prone to hallucinations as the web version of ChatGPT and may throw out dialog that is out of character or spew facts about the game world that are simply wrong.
While it’s fun to think about a day when you can chat with an NPC like it’s your best buddy, the technology still has a long way to go. Couple that with the fact that LLMs are unpredictable and can break the intended narrative of a game, and I don’t think we’ll be seeing the broad implementation of chatbots in video games any time soon.