*** CLIMA VII Contest *** (contest scenario) Mehdi Dastani, Juergen Dix, Peter Novak http://cig.in.tu-clausthal.de/CLIMAContest/ *** Scenario Recently, rumours about the discovery of gold scattered around deep Carpathian woods made their way into the public. Consequently hordes of gold miners are pouring into the area in the hope to collect as much of gold nuggets as possible. Two small teams of gold miners find themselves exploring the same area, avoiding trees and bushes and competing for the gold nuggets spread around the woods. The gold miners of each team coordinate their actions in order to collect as much gold as they can and to deliver it to the trading agent located in a depot where the gold is safely stored. *** Technical Description of the Scenario ** General Description Before the tournament, agent teams will be randomly divided into groups. Each team from one group will compete against all other teams in the same group in a series of matches. The winners from these groups form new groups. Each team in a new group will again play against all other teams in the group in a series of matches. Each match between two competing teams will consist of several (odd number of) simulations. A simulation between two teams is a competition between them with respect to a certain configuration of the environment. Winning a simulation yields 3 points for the team, draw 1 point and loss 0. The winner of the whole tournament is evaluated on the basis of the overall number of collected points in the matches during the tournament. In the case of equal points, the winner will be decided on the basis of the absolute number of collected gold items. Details on the number of simulations per match and the exact structure of the competition will depend on the number of participating teams and will be specified later. In this contest, the agents from each participating team will be executed locally (on the participant's hardware) while the simulated environment, in which all agents from competing teams perform actions, is run on the remote contest simulation server. The interaction/communication between agents from one team should be managed locally, but the interaction between individual agents and their environment (run on the simulation server) will be via Internet. Participating agents connect to the simulation server that provides the information about the environment. Each agent from each team should connect and communicate to the simulation server using one TCP connection. After the initial phase, during which agents from all competing teams connect to the simulation server, identify themselves and get a general match information, the competition will start. The simulation server controls the competition by selecting the competing teams and managing the matches and simulations. In each simulation, the simulation server does in a cyclic fashion provide sensory information about the environment to the participating agents and expects agent's reaction within a given time limit. Each agent reacts to the received sensory information by indicating which action (including the skip action) it wants to perform in the environment. If no reaction is received from the agent within the given time limit, the simulation server assumes that the agent performs the skip action. Agents have only a local view on their environment, their perceptions can be incomplete, and their actions may fail. After a finite number of steps the simulation server stops the cycle and participating agents receive a notification about the end of a match. ** Preparation stage and Communication protocol Several days before the start of the competition, the contest organisers will contact participants via e-mail with details on time and Internet coordinates (IP addresses/ports) of the simulation server. Participants will also receive agent IDs necessary for identification of their agents for the tournament. Agents communicate with the simulation server using TCP protocol and by means of messages in XML format. The details about communication protocol and message format will be specified later. ** Initial Phase At the announced start time of the tournament, the simulation server will go on-line, so that agents from participating teams will be able to connect. After a successful initial handshake during which agents will identify themselves by their IDs and receiving acknowledgment from the server, they should wait for the start sign. During the initial phase, agents will be allowed to use the ping interface of the simulation server to test the speed of their network connection. The initial connecting phase will take a reasonable amount of time in order to allow agents to be initialised and connected and will not be less than 5 minutes. * Team, Match, and Simulation An agent team consist of 4 software agents with distinct IDs. There are no restrictions on the implementation of agents, although we encourage the use of computational logic based approaches. The tournament consists of a number of matches. A match is a sequel of simulations during which two teams of agents compete in several different settings of the environment. For each match, the server will 1) pick two teams to play it and 2) starts the first simulation of the match. Each simulation in a match starts by notifying the agents from the participating teams and distributing them the details of the simulation. These will include for example the size of the grid, depot position, and the number of steps the simulation will perform. A simulation consists of a number of simulation steps. Each step consists of 1) sending a sensory information to agents (one or more) and 2) waiting for their actions. In the case that agent will not respond within a timeout (specified at the beginning of the simulation) by a valid action, it is considered to perform the skip action in the given simulation step. - Environment objects The (simulated) environment is a rectangular grid consisting of cells. The size of the grid is specified at the start of each simulation and is variable. However, it can be not more than 100x100 cells. The [0,0] coordinate of the grid is in the top-left corner (north-west). The simulated environment contains one depot, which serves for both teams as a location of delivery of gold items. The environment can contain the following objects in its cells: - obstacle (a cell with an obstacle cannot be visited by an agent) - gold (an item which can be picked from a cell) - agent - depot (a cell to which gold items are to be delivered in order to earn a point in a simulation) - mark (a string data with a maximum of 5 characters which can be read/written/rewritten/removed by an agent) There can be only one object in a cell, except that an agent can enter cells containing gold, depot or mark and a gold item can be in a marked cell visited by an agent. At the beginning of a simulation the grid contains obstacles, gold items and agents of both teams. Distribution of obstacles, gold items and initial positions of agents can be either hand crafted for the particular scenario, or completely random. During the simulation, gold items can appear randomly (with a uniform distribution) in empty cells of the grid. The frequency and probability of gold generation will be simulation specific, however not known to neither agents, nor participants. At the start of each simulation agents will get details of the environment (grid size, depot position, etc.). Agents will get information about their initial position in the perception information of the first simulation step. - Perception Agents are located in the grid and the simulation server provides each agent the following information: - absolute position of the agent in the grid - the content of the cells surrounding the agent and the content of the cell in which the agent is currently stands in (9 cells in total) If two agents are standing in each other's field of view, they will be able to recognise whether they are an enemy, or whether they belong to the same team. However an agent is not able to recognise whether the other agent carries a gold item or not. If there is a mark in a cell, which is in an agent's field of view, it will also receive the information about its content. - Actions Agents are allowed to perform one action in a simulation step. The following actions are allowed: - skip - move east - move north - move west - move south - pick - drop - mark - unmark All actions, except the skip action, can fail. The result of a failed action is the same as the result of the skip action. An action can fail either because the conditions for its successful execution are not fulfilled, or because of the information distortion (described later in this text). skip: The execution of the skip action does not change the state of the environment (under the assumption that other agents did not change it). When an agent does not respond to a perception information provided by the simulation server within the given time limit, the agent is considered as performing the skip action. Movements: An agent can move in four directions in the grid. The execution of move east, move north, move west, and move south changes the position of the agent one cell to the left, up, right, and down, respectively. A movement action succeeds only when the cell to which an agent is about to move does not contain another agent or obstacle. Moving to and from the depot cell is regulated by additional rules described later in this description. Picking and dropping: An agent can carry only one gold item which it successfully picked up before. An agent can pick up a gold item if 1) the cell in which an agent currently stands in contains the gold, and 2) the agent is currently not carrying another gold item. An agent can drop the gold item it carries only in the cell it currently stands in. The result of a successful pick action is that in the next simulation step the acting agent will be considered to carry a gold item and the cell, it currently stands in, will not contain the gold item any more. The result of a drop action is that the acting agent is not carrying the gold item anymore and that the cell it currently stands in will contain the gold item in the next simulation step. Dropping a gold item to a depot cell increases the score of the agent's team by one point. The depot cell will never contain a gold item that can be picked by an agent. Marking and unmarking: An agent is allowed to mark a cell it currently stands in by a string data with a maximum of 5 characters. The result of a mark action is that the cell in which an agent is currently located, will contain a string in the next simulation step. The depot cell, and cells containing an obstacle cannot be marked. By marking a previously marked cell, the old mark is removed and replaced by the new one. If the cell in which an agent is currently located, contains a mark, then the agent receives the string in the perception information from the simulation server. An agent is allowed to unmark the marked cell it currently stands in. The result of an unmark action is that the cell will not contain a mark in the next simulation step. Agents do not get immediate feedback on their actions, but can learn about the effects of their actions (and the actions of other agents) from the perception information that will be sent to them in the next simulation step. - Depot cell There are strong conditions imposed on the depot cell: 1.- an agent not carrying a gold item is unable to enter the depot cell (the result of such an action is the same as if the depot was an obstacle); 2.- agent which entered the depot cell should drop the gold item as the very next action it execute; 3.- after dropping the gold item in a cell, an agent has to leave the cell in the first subsequent simulation step when he will be able to move (i.e. when there was an empty cell at the time of agent's move action). If an agent does not leave the depot in the first possible opportunity, or will not drop the gold item as the very next action after entering the depot, the simulation server will punish it by "teleporting" it away (it will be moved to a random cell not containing another agent, or obstacle in the grid by the environment simulator). - Timeout The agents should inform the simulation server which action they want to perform within a timeout specified at the beginning of the simulation. The contest organisers do not take any responsibility for the speed of the Internet connection between the server and participating agents. Timeouts will be set reasonable high, so that even participants with a slow network connection will be able to communicate with the server in an efficient way. Simulation timeouts will not be lower than 2 and higher than 10 seconds per one simulation step. A ping interface will be provided by the server in order to allow participating agents to test the speed of their connection during the initial phase of the tournament. Note that only a limited number of ping requests will be processed from one agent in a certain time interval. Details on this limit will be provided later. - Information Distortion Agents can receive incomplete information about the environment from the simulation server. The simulation server can omit information about particular environment cells. However, this happens only with a certain probability not higher than 20%. Also, an agent's action can be unsuccessful with a specified probability. In such a case the simulation server evaluates the agent's action in the simulation step as the skip action. ** Final Phase In the final phase, the simulation server sends a message to each agent allowing them to disconnect from the server. By this, the tournament is over. *** Submission and Winning criteria A submission consists of a 5 page description (analysis and design) of your solution and the participation of your agent team in the tournament. The winner of the contest will be the best performing team with the highest number of points from the tournament. The quality of the description of analysis, design and implementation of the multi-agent system, the elegance of its design and implementation will influence the final decision. *** Miscellaneous Our simulation server does not provide a facility for inter-agent communication. Agents from a team are allowed to communicate and coordinate their actions locally. Based on the number of participants, organisers will decide whether to run the competition in a one or more rounds. The continuous connection of agents from the first match to the last one cannot be guaranteed. In the case of agent-to-server connection disruption, agents are allowed to reconnect by connecting and performing the initial tournament phase message exchange again. Generally, participants are responsible for maintaining connections of their agents to the simulation server. In the case of connection disruption during the running simulation, server will proceed with the tournament simulation, however the action of a disconnected agent will be considered as the skip action. In the case of a serious connection disruption, organisers reserve the right to consider each case separately. *** Technical Support and Organisational Issues We are running a mailing list for all the inquiries regarding CLIMA VII contest. Feel free to subscribe. The list address: climacontest2006@in.tu-clausthal.de Subscriptions via list web pages at http://www2.in.tu-clausthal.de/mailman/listinfo/climacontest2006 The most recent information about the CLIMA VII contest can be found on the official CLIMA VII Contest web pages at http://cig.in.tu-clausthal.de/CLIMAContest/.