There is a similar issue with respect to doing the GA in the action set vs. doing it in the match set. I compared these and found that doing it in the match set has a slight advantage early on: high performance is reached faster. But the final performance level may not be quite as high, due to the added noise.
What do I mean by added noise in this case? It's a subtle issue. Suppose that in the match set you have found all the maximally accurate generalizations. Note that they are not necessarily the same, i.e., their conditions do not in general have the same structure. Now imagine crossovers between them. You will produce some non-optimal offspring. That's the noise, occurring late in the run when everything should be quiet.
This problem does not occur if the GA is done in the action set. If again you assume that all optimal generalizations have been found, then one of them and maybe some specializations of it are present in the action set. Crossovers among them will not lead to any errors. (12/26/00)
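The crossover argument above can be illustrated with a minimal sketch. It assumes ternary conditions over {0, 1, #} and one-point crossover. The condition strings and the helper names (one_point_crossover, subsumes, all_children) are hypothetical, chosen only for illustration. Crossing a generalization with one of its own specializations only yields offspring that the generalization still covers, whereas crossing two structurally different conditions can yield an offspring more general than either parent, which may therefore be inaccurate.

```python
from typing import List


def one_point_crossover(a: str, b: str, point: int) -> List[str]:
    """Swap the tails of two equal-length conditions after `point`."""
    return [a[:point] + b[point:], b[:point] + a[point:]]


def subsumes(general: str, specific: str) -> bool:
    """True if `general` matches every input that `specific` matches."""
    return all(g == "#" or g == s for g, s in zip(general, specific))


def all_children(a: str, b: str) -> List[str]:
    """Offspring from every interior one-point crossover of a and b."""
    kids: List[str] = []
    for point in range(1, len(a)):
        kids.extend(one_point_crossover(a, b, point))
    return kids


# Hypothetical conditions, for illustration only.
optimal = "000###"         # assumed optimal generalization
specialization = "0001#0"  # a specialization of `optimal`
other = "0#00##"           # a differently structured condition

# Action-set-like case: a generalization and its specialization.
action_set_kids = all_children(optimal, specialization)

# Match-set-like case: two conditions with different structure.
match_set_kids = all_children(optimal, other)

# Offspring covered by neither parent, i.e. more general than both somewhere.
escaped = [
    k for k in match_set_kids
    if not subsumes(optimal, k) and not subsumes(other, k)
]

print(all(subsumes(optimal, k) for k in action_set_kids))
print(sorted(set(escaped)))
```

In the first case every offspring takes each position from either the generalization or its specialization, so it can never be more general than the generalization. In the second case the offspring 0#0### drops a specified bit from each parent.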
I have been working a lot on XCS during the last two weeks and have some interesting results. The mutation question seems to me essential for the Algorithmic Description. A sort of niche mutation appears to have been chosen in order to search in separate generality spaces. I.e., mutation is not allowed to change 1 to 0 or vice versa, which means the mutated classifier stays within the current action set. However, in all the comparisons I have made so far between niche and unrestricted mutation, the performance of XCS with niche mutation was worse than or equal to that with unrestricted mutation. I don't know if you have any results that actually support the niche mutation process. It appears to me that once XCS, using unrestricted mutation, has found the necessary specificity or bits in one problem subspace, it is helpful to sometimes try to mutate into another subspace (by changing a 1 to a 0 or vice versa) that might benefit from the knowledge of the first.
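The two mutation schemes compared above can be sketched as follows. This is one reading of the distinction, not a definitive implementation. It assumes ternary conditions over {0, 1, #}. Under niche mutation, a # may only become the corresponding bit of the current input and a specified bit may only become #. Under unrestricted mutation, an allele may become either of the other two symbols. The function names and the parameter mu (per-allele mutation probability) are illustrative assumptions.

```python
import random


def matches(condition: str, inp: str) -> bool:
    """True if the ternary condition matches the binary input."""
    return all(c == "#" or c == x for c, x in zip(condition, inp))


def niche_mutate(condition: str, inp: str, mu: float, rng: random.Random) -> str:
    """Mutate so the result still matches `inp`: '#' -> input bit, bit -> '#'."""
    out = []
    for c, x in zip(condition, inp):
        if rng.random() < mu:
            c = x if c == "#" else "#"
        out.append(c)
    return "".join(out)


def unrestricted_mutate(condition: str, mu: float, rng: random.Random) -> str:
    """Mutate each allele to one of the two other symbols, chosen uniformly."""
    out = []
    for c in condition:
        if rng.random() < mu:
            c = rng.choice([s for s in "01#" if s != c])
        out.append(c)
    return "".join(out)


rng = random.Random(0)
inp = "0110"
parent = "0#1#"  # hypothetical condition that matches `inp`

# Niche mutation never leaves the niche of the current input.
niche_kids = [niche_mutate(parent, inp, 0.5, rng) for _ in range(200)]
print(all(matches(k, inp) for k in niche_kids))

# Unrestricted mutation can move into another subspace (0 <-> 1 flips).
free_kids = [unrestricted_mutate(parent, 0.5, rng) for _ in range(200)]
print(any(not matches(k, inp) for k in free_kids))
```

The offspring of niche mutation always matches the input that produced the parent's niche, while unrestricted mutation can produce offspring that only match inputs in a different subspace, which is the kind of transfer between subspaces suggested above.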