“Miss Manners” is organizing a dinner party and needs to devise a seating arrangement for her guests. She has a large circular table and will be inviting 16 guests: 8 males and 8 females. Miss Manners is an aging lady of a bygone era and isn’t aware that gender is not binary. She would like to ensure that guests are not seated next to someone of the same gender, and that guests seated next to each other share at least one hobby.

In this article I examine how ChatGPT fares with this venerable optimisation (or production rules) benchmark and present my conclusions.

View the ChatGPT transcript here.

Also see results with Mistral (Large) and GPT4.

We start by stating the goal and the simple constraints:

Help me plan the seating arrangements for a dinner party. All guests will be sitting at the same circular table. I want to ensure that people of the same sex (male or female) are not seated next to each other, and that people seated next to each other share a common hobby. Here is a table of the guests, their genders and hobbies:

We then provide the information about the 16 guests, copying and pasting it from the spreadsheet provided by the Decision Management Community.

Name  Gender  Hobby 1  Hobby 2  Hobby 3
1     m       2        1        3
2     f       2        1        3
3     m       3        2
4     m       3        2        1
5     m       2        1        3
6     m       2        3        1
7     f       1        2        3
8     m       3        1
9     m       2        3        1
10    m       3        2        1
11    f       1        3        2
12    f       3        1        2
13    f       2        3
14    f       1        2
15    f       2        3        1
16    f       2        3

The initial arrangement from ChatGPT is:

Here's a potential seating arrangement:

Male (1) - Female (2)
Female (7) - Male (3)
Male (4) - Female (11)
Female (12) - Male (5)
Male (6) - Female (13)
Female (14) - Male (8)
Male (9) - Female (15)
Female (16) - Male (10)

This Male-Female-Female-Male arrangement violated the gender rule, so we correct ChatGPT.

Given that the table is circular this arrangement is placing Female(2) next to Female(7). Please ensure guests with the same gender are not seated next to each other.
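The gender rule on a circular table is easy to check mechanically. The following is a minimal sketch, assuming ChatGPT's list is read top to bottom and left to right as the seat order; the function name `same_gender_neighbours` is my own, not something from the transcript.

```python
# Male guests, copied from the guest table; everyone else is female.
MALES = {1, 3, 4, 5, 6, 8, 9, 10}

def same_gender_neighbours(order):
    """Return adjacent same-gender pairs, treating the list as a ring."""
    clashes = []
    for i, guest in enumerate(order):
        neighbour = order[(i + 1) % len(order)]  # the last seat wraps to the first
        if (guest in MALES) == (neighbour in MALES):
            clashes.append((guest, neighbour))
    return clashes

# ChatGPT's first answer, assumed to be in seat order.
first_attempt = [1, 2, 7, 3, 4, 11, 12, 5, 6, 13, 14, 8, 9, 15, 16, 10]
print(same_gender_neighbours(first_attempt))
```

Under that reading, the check reports Female (2) next to Female (7), as well as the wrap-around pair Male (10) and Male (1) that only exists because the table is circular.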

The updated arrangement is better:

You're absolutely right, my apologies for the oversight. Let me adjust the seating arrangement to ensure that guests of the same gender are not seated next to each other in a circular arrangement:

Male (1) - Female (2)
Male (10) - Female (7)
Male (3) - Female (11)
Male (4) - Female (12)
Male (5) - Female (13)
Male (6) - Female (14)
Male (8) - Female (15)
Male (9) - Female (16)
This seating arrangement ensures that guests of the same gender are not seated next to each other around the circular table.

Pictorially this arrangement is:

Bravo, ChatGPT — you arranged all the guests ensuring that adjacent guests share at least one hobby and that guests alternate Male-Female around the table!
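Rather than checking this by eye, we can verify it with a few lines of code. This is a minimal sketch, assuming the list above is read in order as seats around the table; the `GUESTS` encoding and the `violations` helper are my own illustration.

```python
# Guest table from above: guest number -> (gender, set of hobby numbers).
GUESTS = {
    1: ("m", {1, 2, 3}), 2: ("f", {1, 2, 3}), 3: ("m", {2, 3}), 4: ("m", {1, 2, 3}),
    5: ("m", {1, 2, 3}), 6: ("m", {1, 2, 3}), 7: ("f", {1, 2, 3}), 8: ("m", {1, 3}),
    9: ("m", {1, 2, 3}), 10: ("m", {1, 2, 3}), 11: ("f", {1, 2, 3}), 12: ("f", {1, 2, 3}),
    13: ("f", {2, 3}), 14: ("f", {1, 2}), 15: ("f", {1, 2, 3}), 16: ("f", {2, 3}),
}

def violations(order):
    """List every rule broken by a circular seating order."""
    if sorted(order) != sorted(GUESTS):
        return ["not every guest is seated exactly once"]
    problems = []
    for i, a in enumerate(order):
        b = order[(i + 1) % len(order)]  # the last seat wraps to the first
        (gender_a, hobbies_a), (gender_b, hobbies_b) = GUESTS[a], GUESTS[b]
        if gender_a == gender_b:
            problems.append(f"{a} and {b} have the same gender")
        if not hobbies_a & hobbies_b:
            problems.append(f"{a} and {b} share no hobby")
    return problems

# ChatGPT's corrected answer, assumed to be in seat order.
second_attempt = [1, 2, 10, 7, 3, 11, 4, 12, 5, 13, 6, 14, 8, 15, 9, 16]
print(violations(second_attempt) or "valid")
```

One thing the table makes easy to miss: every guest lists at least two of the three hobbies, so any two guests necessarily share one. For this data the hobby check can never fail, and the gender rule is the only one doing real work.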

Now let’s add a new constraint…

Guest Female(7) is recently divorced from guest Male(4). Could you take that into account?

The response from ChatGPT is interesting in that it understands that recently divorced people should not be seated next to each other — we do not have to make that explicit.

Of course, let's adjust the seating arrangement to accommodate the recent divorce between Female(7) and Male(4). Here's an updated arrangement:

Male (1) - Female (2)
Male (10) - Female (12)
Male (3) - Female (11)
Male (5) - Female (13)
Male (6) - Female (14)
Male (8) - Female (15)
Male (9) - Female (16)
Male (4) - Female (7)
This seating arrangement ensures that guests of the same gender are not seated next to each other around the circular table and takes into account the recent divorce between Female(7) and Male(4) by placing them opposite each other.

Unfortunately, the arrangement doesn't actually do what ChatGPT claims it does; we still have Male (4) seated next to Female (7):
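ChatGPT's claim that the two are "opposite each other" can be tested directly. This is a small sketch, assuming the list is read in order as seats around the table; `ring_distance` is a hypothetical helper name of mine.

```python
def ring_distance(order, a, b):
    """Number of seats separating a and b, going the shorter way round."""
    gap = abs(order.index(a) - order.index(b))
    return min(gap, len(order) - gap)

# ChatGPT's answer after the divorce constraint, assumed to be in seat order.
third_attempt = [1, 2, 10, 12, 3, 11, 5, 13, 6, 14, 8, 15, 9, 16, 4, 7]

print(ring_distance(third_attempt, 4, 7))  # a distance of 1 means adjacent
print(len(third_attempt) // 2)             # distance two "opposite" guests would have
```

At a 16-seat table, guests sitting opposite each other would be 8 seats apart, whereas the two divorcees here are 1 seat apart.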

We call out ChatGPT on its mistake:

This arrangement still has Male (4) seated next to Female (7). We need to separate those two people.

And it responds with the arrangement:

You're right, my mistake. Let's make sure Male (4) and Female (7) are not seated next to each other:

Male (1) - Female (2)
Male (10) - Female (12)
Male (3) - Female (11)
Male (5) - Female (13)
Male (6) - Female (14)
Male (8) - Female (15)
Male (9) - Female (16)
Male (7) - Female (4)
With this adjustment, Male (4) and Female (7) are seated opposite each other, ensuring they are not seated next to each other.

This is an invalid arrangement: ChatGPT has swapped the sex of guest 7 and guest 4!

Again, we call out ChatGPT on its mistake:

You've changed the gender of guest 7 and guest 4.

ChatGPT responds with the arrangement:

Better, but we have now violated the gender constraint, seating two females and two males next to each other.

We repeatedly prompt ChatGPT to correct its mistakes, and it produces the following arrangements, each of which violates the gender constraint:

Finally we prompt ChatGPT to analyse this last arrangement:

What are the problems with this arrangement?

Worryingly ChatGPT knows that something is wrong with the arrangement but makes a complete mess of the explanation:

In the provided arrangement, there are a couple of issues:

Male (9) and Male (8) are seated next to each other, which violates the rule of not seating guests of the same gender next to each other.
Female (7) and Female (14) are seated next to each other, also violating the rule of not seating guests of the same gender next to each other.
Let's adjust the arrangement to correct these issues. Thank you for your patience.

We then attempt to correct ChatGPT one last time:

That is not correct. Is there a valid solution to this problem that respects all constraints?

And it produces a valid result!

Conclusions

I think this was very much a “glass half full” or “glass half empty” result; fans of non-symbolic AI will find much to cheer, while naysayers will also find plenty of evidence for lack of understanding of the constraints or faulty reasoning.

  1. 👍 With minimal prodding/prompting, ChatGPT found a good solution to the basic 16-person seating problem.
  2. Did ChatGPT really understand the circular nature of the seating arrangement, or did it get lucky? Hard to tell.
  3. Would this work equally well with 32, 64, 128 or 256 guests? Again, hard to tell, but it would be a good experiment to run.
  4. 👍 The statement of the recent divorce between two guests was correctly inferred to mean they should not be seated next to each other.
  5. 👎However, once the additional divorce constraint was added ChatGPT started to struggle. It failed to find a valid solution and instead confidently offered invalid solutions. When prompted to correct its errors it again confidently offered an invalid solution.
  6. 👎 When ChatGPT was prompted to analyse its solution, it confidently generated an invalid explanation for why the solution was invalid.
  7. 👍 When finally asked whether a solution was possible, ChatGPT found one!
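For comparison, a conventional search settles the question of whether a valid arrangement exists. The following is a minimal backtracking sketch, not the rule-based formulation the benchmark was designed for; the `GUESTS` encoding, the `apart` parameter and the function names are my own assumptions.

```python
# Guest table from above: guest number -> (gender, set of hobby numbers).
GUESTS = {
    1: ("m", {1, 2, 3}), 2: ("f", {1, 2, 3}), 3: ("m", {2, 3}), 4: ("m", {1, 2, 3}),
    5: ("m", {1, 2, 3}), 6: ("m", {1, 2, 3}), 7: ("f", {1, 2, 3}), 8: ("m", {1, 3}),
    9: ("m", {1, 2, 3}), 10: ("m", {1, 2, 3}), 11: ("f", {1, 2, 3}), 12: ("f", {1, 2, 3}),
    13: ("f", {2, 3}), 14: ("f", {1, 2}), 15: ("f", {1, 2, 3}), 16: ("f", {2, 3}),
}

def compatible(a, b, apart):
    """True if a and b may sit side by side under all constraints."""
    (gender_a, hobbies_a), (gender_b, hobbies_b) = GUESTS[a], GUESTS[b]
    return (gender_a != gender_b
            and bool(hobbies_a & hobbies_b)
            and frozenset((a, b)) not in apart)

def solve(apart=frozenset()):
    """Depth-first search for a circular order; returns None if none exists."""
    order = [min(GUESTS)]  # pin one guest to remove rotated duplicates
    remaining = set(GUESTS) - set(order)

    def extend():
        if not remaining:
            # Everyone is seated; the last guest must also suit the first.
            return compatible(order[-1], order[0], apart)
        for guest in sorted(remaining):
            if compatible(order[-1], guest, apart):
                order.append(guest)
                remaining.remove(guest)
                if extend():
                    return True
                remaining.add(guest)  # undo and try the next candidate
                order.pop()
        return False

    return order if extend() else None

# Keep the divorced pair, Male (4) and Female (7), apart.
print(solve(apart={frozenset((4, 7))}))
```

The worst case for this kind of search grows exponentially with the number of guests, so it says nothing definitive about the 32 to 256 guest question above, but it does give a baseline to compare any ChatGPT answer against.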

If ChatGPT has an internal representation of this problem (or set of constraints) it is a very fuzzy one. Given enough prodding and prompting ChatGPT was able to find a valid solution to this problem.

You really, really must check the results coming back from ChatGPT, and never take them at face value.