I repeated my experiment with the Miss Manners benchmark using Google Gemini. The results were inferior to ChatGPT.
Continue reading “Miss Manners with Gemini”

I repeated my experiment with the Miss Manners benchmark using OpenAI GPT4. The results were better than ChatGPT, but not as good as Mistral (Large).
Continue reading “Miss Manners with GPT 4”

I repeated my experiment with the Miss Manners benchmark using Mistral.ai (Large). The results were impressive!
Continue reading “Miss Manners with Mistral (Large)”

“Miss Manners” is organizing a dinner party and needs to devise a seating arrangement for her guests. She has a large circular table and will be inviting 16 guests: 8 males and 8 females. Miss Manners is an aging lady of a bygone era and isn’t aware that gender is not binary. She would like to ensure that guests are not seated next to someone of the same gender, and that guests seated next to each other share at least one hobby.
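To make the two seating rules concrete, here is a minimal sketch of a validity check in Python. The `Guest` type, the `is_valid_seating` function and the four-guest sample data are all hypothetical names and values chosen for illustration; they are not taken from the benchmark's data set or from any of the experiments described here.

```python
from typing import NamedTuple


class Guest(NamedTuple):
    name: str
    gender: str          # "m" or "f", as in the classic benchmark
    hobbies: frozenset   # set of hobby names


def is_valid_seating(seating):
    """Return True if every pair of neighbours at the circular table
    differs in gender and shares at least one hobby."""
    n = len(seating)
    for i, guest in enumerate(seating):
        neighbour = seating[(i + 1) % n]  # wrap around: the table is circular
        if guest.gender == neighbour.gender:
            return False
        if not (guest.hobbies & neighbour.hobbies):
            return False
    return True


# Hypothetical four-guest example, much smaller than the 16-guest problem.
guests = [
    Guest("Ann", "f", frozenset({"chess", "golf"})),
    Guest("Bob", "m", frozenset({"golf", "opera"})),
    Guest("Cat", "f", frozenset({"opera", "chess"})),
    Guest("Dan", "m", frozenset({"chess"})),
]

print(is_valid_seating(guests))
```

This only checks a candidate arrangement; finding one for 16 guests is the search problem that the benchmark actually exercises.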
In this article I will examine how ChatGPT fares with this venerable optimisation (or production rules) benchmark and present conclusions.
Continue reading “Miss Manners with ChatGPT”

In this article I introduce the nascent field of Text-Oriented Programming (TOP), commonly used when building applications that use Large Language Models (LLMs). TOP poses new challenges for application design, DevOps, robustness and security.
This article is informed by my hands-on experience building Finchbot, an application that converts natural language text to a symbolic domain model.
Continue reading “Text-Oriented Programming”

In this article I compare symbolic AI and Large Language Model based processing purely from a cost perspective. A basic analysis shows that if you are processing fewer than one transaction per minute, you may well be better off (financially at least) using an LLM. In addition, I expect cost economics to shift radically over the coming years, as specialised LLM hardware is developed and LLM market competition increases.
Continue reading “Symbolic AI vs LLM: Cost Comparison”

Yesterday I attended the “Computable Contracts: In Theory and In Practice” workshop at the Universidade do Minho Law School, sponsored by Singapore Management University and Stanford Center for Legal Informatics (CodeX). Thank you to the organisers!
Continue reading “ICAIL 2023: Computable Contracts Workshop”

Anthropic recently released Claude, their LLM trained to be “helpful, honest, and harmless”. Much has been written about Anthropic’s laudable approach, including their philosophy of “constitutional AI”. In this post we take a look at how Claude works in practice, and the enormous challenges posed by using natural language as a general-purpose interface.
Continue reading “Ethical LLM Whac-A-Mole”