Actor-Critic Agent


Super Smash Bros. Melee (typically shortened to just Melee) was released in 2001. Unlike many modern titles, Melee never received an update after the Player’s Choice version, which was released in 2003. Melee was never intended to be a competitive fighting game; it was made as a casual game for younger audiences. Perhaps because development was not focused on fighting game enthusiasts, or perhaps because software quality assurance was far less rigorous at the turn of the millennium, the game’s simplistic engine has had many unintended consequences. These have given players a level of freedom and expression not seen in any game since Melee’s release. Two decades later, the game is more popular than ever, and players are still developing new strategies and optimizations. The current set of strategies used to win is called the meta. Could AI be used to advance the meta more quickly and push the game to its theoretical heights?


In this project, I developed an environment based on OpenAI’s Gym by leveraging an open source library called LibMelee. The project’s goal is to develop a functioning autonomous agent that can play at a human level without relying on advantages inherent to a computer, such as frame-perfect execution and reactions. To this end, the Advantage Actor-Critic (A2C) algorithm was used. The first challenge is to define an observation space and an action space.
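For readers unfamiliar with A2C, the "advantage" the critic supplies to the actor can be sketched in a few lines. This is a generic illustration of a bootstrapped advantage estimate, not the training code used in this project; the function name, the rollout format, and the default discount factor are all assumptions made for the example.

```python
from typing import List


def n_step_advantages(rewards: List[float], values: List[float],
                      bootstrap_value: float, gamma: float = 0.99) -> List[float]:
    """Return A_t = R_t - V(s_t) for every step of a rollout.

    R_t is the discounted return, bootstrapped from the critic's value
    estimate for the state that follows the last step of the rollout.
    """
    advantages = [0.0] * len(rewards)
    running_return = bootstrap_value
    # Walk the rollout backwards so each return reuses the one after it.
    for t in reversed(range(len(rewards))):
        running_return = rewards[t] + gamma * running_return
        advantages[t] = running_return - values[t]
    return advantages


if __name__ == "__main__":
    # Two-step rollout with made-up rewards and critic values.
    print(n_step_advantages([1.0, 0.0], [0.5, 0.5], bootstrap_value=1.0, gamma=0.5))
```

A positive advantage means the action turned out better than the critic expected, so the actor is nudged toward it; a negative advantage nudges it away.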

Action Space

The action space for an autonomous agent is the range of possible actions at any time. The GameCube controller is the primary input for Melee, so it is the basis for the action space. However, the GameCube controller is composed of two analog control sticks (8 bits on each of two axes), two analog triggers (8 bits each), and twelve distinct digital buttons. Counting every bit (32 for the sticks, 16 for the triggers, 12 for the buttons), that is 60 bits of input state, or roughly 10^18 input combinations. It would be impossible for a human to learn that many distinct inputs. Similarly, while a computer can represent any of those inputs precisely, asking it to learn a mapping from every possible observation to that many inputs is just not feasible. Many buttons on the controller are redundant, and many inputs do nothing in Melee, so those can be eliminated immediately. One control stick input plus one button is the most that will ever be needed, so in the end there are 9 primary control stick positions (which accommodate the vast majority of necessary actions) and 4 required buttons, as well as a no-action input, for a grand total of 37 actions.

The action space for the GameCube controller. Redundant buttons are excluded. Only the 9 primary stick positions are used for simplicity.
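One way to arrive at 37 discrete actions is to pair each of the 9 stick positions with each of the 4 buttons (36 combinations) and add a single no-action entry. The sketch below enumerates that table. The specific buttons, the use of stick coordinates in the range 0 to 1 with 0.5 as neutral, and the 9 × 4 + 1 counting itself are assumptions made for illustration; the text above does not name the four buttons.

```python
from itertools import product
from typing import List, Optional, Tuple

# Assumed convention: each stick axis runs from 0.0 to 1.0, with 0.5 as neutral.
STICK_AXIS = (0.0, 0.5, 1.0)
STICK_POSITIONS = list(product(STICK_AXIS, STICK_AXIS))  # 3 x 3 = 9 positions

# Hypothetical choice of the four required buttons, for illustration only.
BUTTONS = ("A", "B", "X", "L")

# An action is either None (no-action) or a (stick position, button) pair.
Action = Optional[Tuple[Tuple[float, float], str]]


def build_action_table() -> List[Action]:
    """Enumerate the discrete actions: index 0 is no-action, then 9 x 4 pairs."""
    actions: List[Action] = [None]
    for stick, button in product(STICK_POSITIONS, BUTTONS):
        actions.append((stick, button))
    return actions


ACTIONS = build_action_table()

if __name__ == "__main__":
    print(len(ACTIONS))  # 1 + 9 * 4
    print(ACTIONS[:3])
```

The policy network then only needs to output an index into this table, and the environment translates that index back into a stick position and a button press each step.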

Observation Space

Melee has an absolutely massive observation space. Positional data is given as floating point coordinates, both for the characters and for the stage bounds. It is also important to track damage for both the opponent and the agent itself. LibMelee also allows the agent to read the current action state of itself and the opponent.
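As a rough sketch of how these quantities could be packed into a fixed-length observation vector, the example below normalizes positions by the stage bounds, scales damage, and one-hot encodes the action state. It does not call LibMelee; the `PlayerObs` class, its field names, the damage scale, and the number of action states are all hypothetical stand-ins for whatever the real game state provides.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PlayerObs:
    """Hypothetical per-player snapshot; not a LibMelee type."""
    x: float           # horizontal position in game units
    y: float           # vertical position in game units
    percent: float     # accumulated damage
    action_state: int  # index of the current action state


def encode_player(p: PlayerObs, x_bound: float, y_bound: float,
                  num_action_states: int, max_percent: float = 999.0) -> List[float]:
    """Scale position and damage, and one-hot encode the action state."""
    one_hot = [0.0] * num_action_states
    one_hot[p.action_state] = 1.0
    return [p.x / x_bound, p.y / y_bound, p.percent / max_percent] + one_hot


def encode_observation(agent: PlayerObs, opponent: PlayerObs, x_bound: float,
                       y_bound: float, num_action_states: int) -> List[float]:
    """Concatenate the agent's features followed by the opponent's."""
    return (encode_player(agent, x_bound, y_bound, num_action_states)
            + encode_player(opponent, x_bound, y_bound, num_action_states))


if __name__ == "__main__":
    agent = PlayerObs(x=50.0, y=0.0, percent=0.0, action_state=2)
    opponent = PlayerObs(x=-25.0, y=10.0, percent=0.0, action_state=0)
    # Made-up stage bounds and a tiny action-state count, for illustration.
    print(encode_observation(agent, opponent, 100.0, 100.0, num_action_states=4))
```

Keeping every feature in a similar numeric range tends to make training more stable than feeding raw coordinates and damage values directly to the network.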