Cluster-Parallelization of a General Game Player
An easy way to parallelize a Java based general game player for a cluster provides Terracotta http://www.terracotta.org/. Terracotta is a open source infrastructure software that offers an Network-Attached Memory(NAM) where a subset of the application's data structures can be stored. Every operation on a structure(read, write), which is stored in the NAM and therefore is available for several nodes of the cluster, must be thread-safe. Terracotta is able to generalize thread-safe operations for the cluster.
In every general game player must exist a structure, on which the decision about the best move is taken, that has to be sent to the Gamemaster. This structure contains heuristic values which describe how good a move is. If it is possible to detach the purely intelligence, that creates those heuristic values, from the class responsible for the communication with the Gamemaster, one is able to let calculate n nodes of a Cluster, in the simplest case, n times the same algorithm and let each node share its result, the best move chosen by the algorithm, with the other. Another possibility is to start n <= N (N=# of Nodes) different algorithms and profit from each strength. To share the results and choose the best move one can do some kind of majority voting. Another way, and that is how we did it, is to merge the so called decision level. Each node creates his own gametree, that is completely independent from the other gametrees of other nodes. In each gametree a root represents the current state of the game and the children of it, we call it the decision level, represent the states, which can be reached with the legal moves in the current state. For each child exists the above-mentioned heuristic value. The idea is to merge each nodes decision level weighted by e.g. each child's visits and write it to the Network-Attached Memory. Now, the procedure that is responsible for sending the best move to the gamemaster can take the decision on basis of the merged heuristic values. If the implementation is quite good, only a few lines of code must be inserted into the existing general game player and following versions can be easily patched to work on clusters. That is working well for our general game player, because each stanza's strength is dependent from fortuity. The more nodes are executing our algorithm, the less we are dependent from fortuity and the better our move can be. So, for our intelligence approach there is no need to parallelize the algorithm itself. This approach brought us nearly linear speedup(7,42) on our Einsteincluster with up to 8 available nodes. This was only one test and no complete benchmark.
Because terracotta is famous for minimal code changes to get it working and because it would be nice, if every node works on the same gametree, we tried to parallelize Centurio with minimal changes. The effect of let the nodes work on the same gametree should be, that each node profits from the others heuristic values on each level(not only on the decision level). But the effect was in fact, that the access to the data in the NAM was too expensive and we were far away from linear speedup.