After some discussion with @Innerpanda and @Crosbyman64 I found myself determined to figure how to implement the “MENACE” (Machine Educable Noughts And Crosses Engine) algorithm for learning tic-tac-toe in HS. I learned about MENACE from this Matthew Parker video; I encourage you to watch it.
I’ve made good progress on a solution for the data representation challenge (how to represent the board, and the possible symmetries of the board, and the matchboxes to represent weights on possible moves). Here’s what I have so far:
Calling it “MENACE” at this point is a bit sus, since the learning stage of MENACE is not implemented, but it indicates my optimism that I’m on the right track.
This topic is for discussing this project, so that others can better learn from what I’ve figured out, and as many people as possible can learn from the feedback that others may have. Think of it like a totally open collab, with the primary goal of educational discussion, and a secondary goal of making a cool project. If your feedback is “this is all very obvious, and already done”, thank you, I would appreciate supporting links to projects and/or documentation.
One of the basic questions one has to answer for this is: how to represent the game board at a given point of play, and how to represent the possible moves with associated weights (adjusting those weights after each game is how MENACE learns)? How to account for all the possible symmetries of the board? Even with the symmetries, 304 boards have to be represented if MENACE goes first, and 470 more are needed if the human goes first.
I had originally thought that lots and lots of strings and string operations would be required to represent all this. With time, I hae better accepted the typeless wonder of HS variables into my life. Given that there are only 9 positions on the board, and nine possible moves, each of those can be a single digit. Using the trick of ensuring a leading non-zero digit, I decided on:
- a ten-digit number starting with 8 (looks like B) representing the spatial state of the Board, with some numbering of all the squares, where each digit is 0 (no one moved there), 1 (player1 moved there), or 2 (player2 moved there). I’m using square numbering:
7 8 9
4 5 6
1 2 3
- ten-digit number starting with 6 (looks like G) representing the sequence of moves in a Game, where each digit (start with the first one after 6) records the location of the last move; the alternation between p1 and p2 is implicit (and whether p1 is X or 0 is actually irrelevant for the MENACE algorithm, it only matters for the UI).
The symmetries of the board can be represented by 9-digit numbers, recording the new location of each square, after some rotation or reflection. Here are all eight symmetries:
(notice that the center square 5 is always unchanged). Nerdy math: this is a Group called D4.
So from one board you can find 7 other boards that are effectively the same? By brute force I’m just applying all the symmetries to get all the reflected/rotated boards. How to choose one board that stands for the all the equivalent ones (the “canonical” one)? Here’s where I embrace that these things are numbers: the canonical board representation (10-digit number starting with 8) is the one with the largest numerical value.
From a given canonical board, how can you organize it into a canonical sequence of moves? This sequence of moves can be imagined as a large branching tree, where each node is a game up to some move, with one outgoing branch for each allowed move. Here too, you need to take into account the board symmetries (only considering moves that are unique even after applying all of D4). Also, the same board configuration could have been arrived at via different sequences of turns. Here too I say that the canonical game representation (10-digit number starting with 6) is the highest-valued one that arrives at a given board.
If you play with my proto-MENACE project, you’ll see the parts I’ve implemented so far. You’ll find the board in the lower right corner, p1=X, p2=O. With each move, you’ll see some Board numbers (BAvail for available squares, and BReal for the current game), as well as BCanon: the canonical board representation, found via which (SymmIdx) element of D4. You’ll also see the corresponding GCanon value (the canonical game for board BCanon).
Much remains to be done; the notes to myself in the second scene are reproduced here:
- (DONE): transforming 8oard to other symmetries of 8oard, to find maximal (canonical) representation
- (DONE): transforming canonical 8oard to canonical 6ame
- discover unique (modulo symmetries) possible next moves from canonical 6ame
- detect when a win is possible
- creation of “matchbox” to store game and moves with weights, to be visually laid out on the left side of the screen
- converting from 6ame to 8oard via inverse symmetry
- some way of visualization each matchbox
- some way of exporting/importing all boxes
Let me know if you want to take on some of these tasks, or better yet, just post here a remix that makes progress. The eventual final project should be a group effort.
One immediate question: Finding the canonical board is slow; can you make it faster?
You’ll find it as the “BIn → BOut, Y=SymmIdx” custom rule, currently used in the “TurnManager” text object (left of top-center edge), in the “when I get message [MoveMade]”. There’s a doubly-nested loop (for 8 symmetries, for 9 digits), and this will be a bottleneck. I am still ignorant about what happens in which frames, so how could this be restructured?
Hope to hear from you.