Artifact Evaluation for Computational Problems

A number of conferences are adopting solutions that foster repeatability under the name of Artifact Evaluation Committees (AECs). The artifacts submitted to an AEC support claims made in a paper that was submitted to and accepted by the conference. This approach was pioneered at ECOOP 2013 by Jan Vitek, Erik Ernst and Shriram Krishnamurthi.
See
[1] http://ecoop13-aec.cs.brown.edu/guidelines.txt
[2] http://splashcon.org/2013/cfp/665
According to [2], artifacts should be: (1) as complete as possible, (2) well documented, and (3) easy to reuse, facilitating further research.

We propose an approach for presenting artifacts that support claims expressed in an interpreted logic (a logic interpreted in a structure) that has a semantic game. A large class of claims in the formal sciences can be expressed this way. The semantic game defines debating rules for two participants, one in the Verifier role and one in the Falsifier role; a participant is sometimes forced into a role as a devil's advocate. The outcomes of the debates determine the strongest contributors in a collusion-resistant way.
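The debating rules of a semantic game can be illustrated with a minimal sketch, assuming the textbook rules for quantifiers: the Falsifier chooses the value of a universally quantified variable, the Verifier chooses the value of an existentially quantified one, and the Verifier wins exactly when the resulting atomic claim holds. The class names, the `play` function, the toy claim about inverses modulo 5, and the three strategies are all hypothetical and are not taken from the SCG implementation.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Union


@dataclass(frozen=True)
class Atom:
    holds: Callable[[dict], bool]  # decides the claim once all variables are bound


@dataclass(frozen=True)
class ForAll:
    var: str
    domain: Sequence[int]
    body: "Formula"


@dataclass(frozen=True)
class Exists:
    var: str
    domain: Sequence[int]
    body: "Formula"


Formula = Union[Atom, ForAll, Exists]
# A strategy receives the variable name, its domain and the bindings so far.
Strategy = Callable[[str, Sequence[int], dict], int]


def play(formula: Formula, verifier: Strategy, falsifier: Strategy) -> str:
    """Play one semantic game and return the winning role."""
    env: dict = {}
    while not isinstance(formula, Atom):
        universal = isinstance(formula, ForAll)
        mover = falsifier if universal else verifier
        choice = mover(formula.var, formula.domain, env)
        if choice not in formula.domain:
            # An illegal move loses the game for the role that made it.
            return "verifier" if universal else "falsifier"
        env[formula.var] = choice
        formula = formula.body
    return "verifier" if formula.holds(env) else "falsifier"


# Toy claim (an assumption for illustration):
# for all x in 1..4 there exists y in 1..4 with x*y = 1 (mod 5).
claim = ForAll("x", range(1, 5),
               Exists("y", range(1, 5),
                      Atom(lambda env: env["x"] * env["y"] % 5 == 1)))


def inverse_verifier(var, domain, env):
    # Searches the domain for a witness; one exists for every x in 1..4.
    return next(y for y in domain if env["x"] * y % 5 == 1)


def lazy_verifier(var, domain, env):
    # Always answers with the first element of the domain.
    return domain[0]


def last_falsifier(var, domain, env):
    # Always challenges with the last element of the domain.
    return domain[-1]


strong_result = play(claim, inverse_verifier, last_falsifier)
weak_result = play(claim, lazy_verifier, last_falsifier)
```

The claim is true, yet the lazy Verifier still loses its debate. A lost game therefore says something about the participant and not only about the claim, which is why the participants can be compared by their debate outcomes.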

What are the artifacts supporting the claims? They are avatars that have been programmed to play both the Verifier and the Falsifier role in the debates. An avatar may package a large amount of data.

If the participants cannot be reduced to software (as with undecidable or intractable problems), people engage directly in debates to support their claims.

The contributions of our approach, which we call the Scientific Community Game (SCG) approach, are: (1) Artifacts are engaged in tournaments to determine the winners; incorrect and incomplete artifacts fall by the wayside. (2) Artifacts must follow a standard interface parameterized by a logical claim, which leads to systematic documentation of the artifacts. (3) Artifacts are easy to reuse: they can be engaged in tournaments with new artifacts that push the state of the art.
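As a rough sketch of points (1) and (2), the code below assumes a toy claim family parameterized by a prime modulus n ("every x in 1..n-1 has a y in 1..n-1 with x*y = 1 mod n") and a hypothetical two-method avatar interface. The names `Avatar`, `falsify`, `verify`, `play` and `tournament` are illustrative assumptions, not the SCG interface. Every ordered pair of avatars plays once, so each avatar takes both roles against every other avatar.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Avatar:
    name: str                          # assumed unique within a tournament
    falsify: Callable[[int], int]      # Falsifier move: pick x given n
    verify: Callable[[int, int], int]  # Verifier move: pick y given n and x


def play(n: int, verifier: Avatar, falsifier: Avatar) -> str:
    """Play one game on the claim for modulus n; return the winner's name."""
    x = falsifier.falsify(n)
    if not 1 <= x < n:
        return verifier.name   # illegal challenge loses
    y = verifier.verify(n, x)
    if not 1 <= y < n:
        return falsifier.name  # illegal answer loses
    return verifier.name if (x * y) % n == 1 else falsifier.name


def tournament(n: int, avatars: List[Avatar]) -> Dict[Tuple[str, str], int]:
    """Full round robin; returns a table (winner, loser) -> number of wins."""
    beats: Dict[Tuple[str, str], int] = {}
    for verifier, falsifier in permutations(avatars, 2):
        winner = play(n, verifier, falsifier)
        loser = falsifier.name if winner == verifier.name else verifier.name
        beats[(winner, loser)] = beats.get((winner, loser), 0) + 1
    return beats


# Two hypothetical avatars: one computes modular inverses (Python 3.8+),
# the other always plays 1 in both roles.
strong = Avatar("strong", falsify=lambda n: n - 1,
                verify=lambda n, x: pow(x, -1, n))
weak = Avatar("weak", falsify=lambda n: 1, verify=lambda n, x: 1)

results = tournament(7, [strong, weak])
```

Because an avatar only has to implement the interface for the claim, a new avatar can be added to the list and the tournament rerun without touching the existing ones, which is the sense of reuse in point (3).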

The SCG approach is the topic of Ahmed Abdelmeged's PhD dissertation [3].

[3] http://www.ccs.neu.edu/home/lieber/theses/abdelmeged/scg/

While semantic games have been known for decades, the theory of evaluating participants in semantic game tournaments has been studied only recently. The question is how to map a beating function between participants into a ranking function so that a set of desirable properties holds. Abdelmeged has developed an axiomatic theory for semantic game tournaments to fairly evaluate the participants.
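To make the shape of the problem concrete, here is a minimal sketch of one possible mapping from a beating function to a ranking. It assumes the beating function is given as a table from (winner, loser) pairs to win counts and ranks participants by their total number of losses, fewest first. This loss-counting rule and the name `rank_by_losses` are assumptions chosen for illustration; the sketch does not reproduce the ranking functions derived in the axiomatic theory.

```python
from typing import Dict, List, Tuple


def rank_by_losses(players: List[str],
                   beats: Dict[Tuple[str, str], int]) -> List[str]:
    """Order players from best to worst by total losses (ties broken by name)."""
    losses = {p: 0 for p in players}
    for (_winner, loser), count in beats.items():
        losses[loser] += count
    return sorted(players, key=lambda p: (losses[p], p))


# Hypothetical beating function over three participants.
beats = {
    ("a", "b"): 2,
    ("a", "c"): 1,
    ("c", "a"): 1,
    ("b", "c"): 2,
}
ranking = rank_by_losses(["c", "b", "a"], beats)
```

Whether such a rule is acceptable depends on the properties it is required to satisfy, which is exactly what an axiomatic treatment has to pin down.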

A key axiom is the limited collusion axiom, which says that p1 <= p2 (p1 is at least as good as p2) can be changed to p2 <= p1 only by manipulating games where p1 is in control. The axiomatic theory yields several theorems for mapping beating functions into ranking functions.