Devlog #3: Changelog + Letting AI Push Buttons with Tools (What Could Go Wrong?)
[p]Happy Sunday! You might recall that we used an MCP tool to generate images in the last Devlog about Behaviors. This Devlog takes a closer look at how that actually works.[/p][p]Tools/Agents/Skills are a hot topic right now. Frontier models have improved drastically in recent months, particularly at coding tasks, but most of these advancements also carry over to more general tasks. We're getting dangerously close to the point where I can replace myself with a coding agent and simply supervise it! Anyway, we will likely see these capabilities reach more open models in the future, as more tool-use training data gets integrated and chat formats gain native support for tool calls.[/p][p]A tool call to retrieve realtime information from the web[/p][p]
[/p][h2]🛠️What is Tool Calling?[/h2][p]Ultimately, it is just a program following a protocol (AI Pals Engine uses MCP, the Model Context Protocol): the Text Model generates the command and the tool executes it.[/p][p]Imagine a tool that lets the AI execute Windows commands directly. Is it smart to give your AI that kind of power? Probably not.[/p][p]
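To make that split concrete, here is a minimal Python sketch of the host side: the model only produces text (a JSON command), and the program decides whether and how to run it. This is not AI Pals Engine code and not the actual MCP wire format; the JSON shape, the get_time tool and the ALLOWED set are assumptions made up for illustration.[/p]

```python
import json


def get_time(timezone):
    # Hypothetical, harmless tool; a real one would query a clock or an API.
    return f"(pretend) current time in {timezone}"


# Hypothetical registry: tool name -> Python callable.
TOOLS = {"get_time": get_time}

# Only tools on this allow-list may run, no matter what the model asks for.
ALLOWED = {"get_time"}


def dispatch(model_output):
    """Parse a JSON tool call generated by the model and execute it."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return "error: tool call is not valid JSON"
    if not isinstance(call, dict) or not isinstance(call.get("name"), str):
        return "error: malformed tool call"
    name = call["name"]
    args = call.get("arguments", {})
    if name not in ALLOWED or name not in TOOLS:
        return f"error: tool '{name}' is not allowed"
    if not isinstance(args, dict):
        return "error: arguments must be a JSON object"
    try:
        return TOOLS[name](**args)
    except TypeError as exc:
        # Wrong or missing parameters end up here.
        return f"error: bad arguments ({exc})"


if __name__ == "__main__":
    print(dispatch('{"name": "get_time", "arguments": {"timezone": "UTC"}}'))
    print(dispatch('{"name": "run_shell", "arguments": {"cmd": "dir"}}'))
```

[p]Note that a model asking for run_shell gets an error string back instead of a shell: in this sketch the allow-list, not the model, has the final say.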
[/p][h2]✅Can I do Tool Calling in AI Pals Engine?[/h2][p]Yes. Support is much broader after the recent update, and hopefully you can add any tool. But as mentioned within the app, only ever use tools you trust. They carry a lot of risk![/p][p]You can opt for a Single-Step approach (one tool call) or a Multi-Step workflow where the AI executes a chain of calls to complete a task.[/p][p]You also have full control over how the result is handled: it can be added to the context, used immediately in a response, or both. There are also different viewing modes: Raw (to see every detail) or Compact (to save tokens and keep the format pretty).[/p][p]Beyond the basics, you can add tools to group chats to create teams of "Agent-like" characters. There are also special tools like send_message_to_user (for status reports) and end_task (to stop chains). To make sure everything runs smoothly, AI Pals Engine uses grammar enforcement by default, which constrains the model's output so that it always produces valid commands.[/p][h3]Example 1 – Realtime Info & Grounding[/h3][p]Large Language Models are famously "frozen in time": they only know what they learned during training. If you ask them about yesterday's news, they might hallucinate or give you outdated info.[/p][p]This is where Grounding comes in. Grounding essentially means anchoring the AI's response in reality by providing it with external, up-to-date evidence. You can do this via the Memory Feature (providing text documents), by using a real-time search tool like the example above (which uses SearXNG), or by manually providing the info. Instead of using the send_message_to_user tool, we set the mode of the search tool to "ActOnResult", so the AI compiles a summarized answer from the retrieved information.[/p][h3]Example 2 – Let your AI do some work by chaining tool calls[/h3][p]Why browse the web yourself when your AI Pal can do it for you?[/p][p]
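Roughly, a Multi-Step chain can be pictured as a loop: ask the model for the next call, run it, feed the result back, and repeat until the model calls end_task. The Python sketch below is only an illustration under assumptions, not how AI Pals Engine implements it: the model is replaced by a fixed script, open_page and read_text are made-up stand-ins for browser tools, the "text" parameter of send_message_to_user is a guess, and max_steps is a hypothetical safety cap.[/p]

```python
def open_page(url):
    # Stand-in for a browser tool; no real navigation happens here.
    return f"opened {url}"


def read_text():
    # Stand-in for a tool that returns the visible page text.
    return "Example Domain"


TOOLS = {"open_page": open_page, "read_text": read_text}

# What a (pretend) model decides to do, one call per step.
SCRIPT = [
    ("open_page", {"url": "https://example.com"}),
    ("read_text", {}),
    ("send_message_to_user", {"text": "Page read."}),
    ("end_task", {}),
]


def make_scripted_model(script):
    """Return a fake model that replays the script, then ends the task."""
    steps = iter(script)

    def next_call(context):
        # A real model would look at the context to pick its next call.
        return next(steps, ("end_task", {}))

    return next_call


def run_chain(next_call, tools, max_steps=8):
    context = []   # tool results that are fed back to the model
    messages = []  # status reports shown to the user
    for _ in range(max_steps):
        name, args = next_call(context)
        if name == "end_task":
            break  # the model decided the chain is finished
        if name == "send_message_to_user":
            messages.append(args["text"])
            continue
        context.append((name, tools[name](**args)))
    return context, messages


if __name__ == "__main__":
    context, messages = run_chain(make_scripted_model(SCRIPT), TOOLS)
    print(context)
    print(messages)
```

[p]The step cap matters in this sketch: a model that never calls end_task would otherwise keep burning context and compute forever.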
[/p][p]By utilizing Playwright (a general browser automation tool), the AI can navigate the web, click buttons and interact with pages. The magic here is the Chaining capability. The AI doesn't just fire one command and quit; it strings together a sequence of actions (browsing, clicking and reading) to complete complex tasks entirely on its own. But don't get too excited: a small model will often fail at this unless it was trained specifically for it, and the context usage can be really high for this task (which means a lot of computation).[/p][h3]Other Examples[/h3][p]Like Behaviors, Tool Calling opens up endless possibilities to do more than just text generation with your Char. As mentioned in the intro, these capabilities will only improve in the near future.[/p][p]I personally use it with GitHub tools and to enter or modify Notes/TODOs. I plan to experiment much further, for example by generating predefined choices that get automatically forwarded to a widget panel. We could even implement a standard for interpreting returned results to trigger other panels or display info. This would better support both game-like and general interactive functionality, such as tracking game states, inventory or skill stats, system states, user data and other dynamic information displays.[/p][p]Dialogue Options[/p][p]
[/p][h2]🌱Future Plans[/h2][p]This thread is about how to make Tools/Skills discoverable. For Steam, the most convenient way would probably be the Workshop, so that we can just pick the Skills we want in the Char Config. But the main issue is how to make it secure. We want the tools to be powerful, but that same power can be used to harm your system in many different ways.[/p][p]For now, the best approach is likely to make the AI Char's tools shareable via the Workshop, with metadata on which tools were used and where to find them.[/p][p]Another idea is to add a UI panel to temporarily disable/enable tools (e.g. the web search tool). Sometimes you need it and sometimes you don't, and using a panel is probably faster than switching the AI Char entirely (of course you can leave that up to the AI, but I usually prefer to keep control over it).[/p][p]Let’s keep an eye on advances and reassess in a few months, once more models supporting the newer standards are available.[/p][h2]🌅What's next for AI Pals Engine?[/h2][p]Hopefully, this Devlog gave you a better idea of the Tool concept in AI Pals Engine. You can also use tools manually in your Behavior workflows. For the Text AI, there is just a basic tool call example available within the app (chat-general-toolcall), but with some tweaks and by adding tools, you can make a really powerful Widget![/p][p]The question remains: Would a skill workshop be a good addition? Also, what about security? You can comment here or use the mentioned thread.
Here are some points for the next updates:[/p]
- [p]TTS Speed Config[/p]
- [p]STT Wake Word[/p]
- [p]Basic Multi-GPU config[/p]
- [p]NoSteam Option (disable Steam Time Tracking) and Autostart[/p]
- [p]Better Audio and Memory+Dynamic Tokens integrations[/p]
- [p]More of the following: Help pages, Internationalization, Fixes, QoL Improvements[/p]
[h2]Changelog[/h2]
- [p]TTS: Improved Piper TTS support (load any model/language).[/p]
- [p]TTS: Fixed Kokoro not working without eSpeak.[/p]
- [p]TTS: Fixed default Kokoro not using the correct text-split strategy.[/p]
- [p]TTS: Optimizations and better handling of sentences including the "-" character.[/p]
- [p]Widgets: Added option to automatically load the last conversation on startup.[/p]
- [p]MCP: Reworked "Add MCP Server" to improve usability.[/p]
- [p]MCP: Added support for Environment Variables (needed for custom Python env servers).[/p]
- [p]MCP: Added basic HTTP support.[/p]
- [p]MCP: Added support to dynamically generate the method for the McpAction Behavior Node.[/p]
- [p]Tools: Added OAI Image tool.[/p]
- [p]Tools: Added Web URL POST tool.[/p]
- [p]Tools: Reworked Character Tool Config to improve usability.[/p]
- [p]Tools: Fixed send_message_to_user not outputting correctly in Raw Tool Call Mode.[/p]
- [p]Workshop: Character Tool allow-lists are now exported (Tool Call Mode defaults to "None" for security).[/p]
- [p]Workshop: Fixed Workshop items not being deletable if removed manually from Steam.[/p]
- [p]General: Added more help text and tooltips.[/p]
- [p]General: Added warning when exiting the app if a download is active.[/p]
- [p]Performance: Fixed performance issues when loading many entries in the Memory View.[/p]