malib.rollout.envs package

Subpackages

Submodules

malib.rollout.envs.env module

class malib.rollout.envs.env.Environment(**configs)[source]

Bases: object

static action_adapter(policy_outputs: Dict[str, Dict[str, Any]], **kwargs)[source]: Convert policy outputs to environment actions. By default, the policy action is used.

property action_spaces: Dict[str, Space]: A dict of agent action spaces

close()[source]

collect_info() → Dict[str, Any][source]

env_done_check(agent_dones: Dict[str, bool]) → bool[source]

property observation_spaces: Dict[str, Space]: A dict of agent observation spaces

property possible_agents: List[str]: Return a list of environment agent ids

record_episode_info_step(state: Any, observations: Dict[str, Any], rewards: Dict[str, Any], dones: Dict[str, bool], infos: Any)[source]

Analyze timestep and record it as episode information.

Parameters:

state (Any) – Environment state.
observations (Dict[AgentID, Any]) – A dict of agent observations.
rewards (Dict[AgentID, Any]) – A dict of agent rewards.
dones (Dict[AgentID, bool]) – A dict of done signals.
infos (Any) – Information.

render(*args, **kwargs)[source]

reset(max_step: Optional[int] = None) → Union[None, Sequence[Dict[str, Any]]][source]: Reset environment and the episode info handler here.

seed(seed: Optional[int] = None)[source]

step(actions: Dict[str, Any]) → Tuple[Dict[str, Any], Dict[str, Any], Dict[str, float], Dict[str, bool], Any][source]

Return a 5-tuple of (state, observation, reward, done, info). Each item is a dict that maps agent ids to the corresponding entity.

Note

If state return is not activated for this environment, the returned state is None.

Parameters: actions (Dict[AgentID, Any]) – A dict of agent actions.
Returns: A tuple ordered as (state, observation, reward, done, info).
Return type: Tuple[Dict[AgentID, Any], Dict[AgentID, Any], Dict[AgentID, float], Dict[AgentID, bool], Any]

time_step(actions: Dict[str, Any]) → Tuple[Dict[str, Any], Dict[str, float], Dict[str, bool], Dict[str, Any]][source]

Environment stepping logic.

Parameters: actions (Dict[AgentID, Any]) – Agent action dict.
Raises: NotImplementedError – Not implemented error.
Returns: A 4-tuple of (observations, rewards, dones, infos).
Return type: Tuple[Dict[AgentID, Any], Dict[AgentID, float], Dict[AgentID, bool], Dict[AgentID, Any]]
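
The relationship between step and time_step described above can be illustrated with a small stand-in class. This is only a sketch under assumptions, not MALib's implementation: ToyEnvironment is a hypothetical class that does not inherit from Environment, and the idea that step calls time_step and prepends a None state is inferred from the documented return types and the note on state return.

```python
from typing import Any, Dict, Optional, Tuple

AgentID = str


class ToyEnvironment:
    """Hypothetical stand-in that mirrors the documented method shapes."""

    def __init__(self, max_step: int = 3):
        self.agents = ["agent_0", "agent_1"]
        self.max_step = max_step
        self.cnt = 0

    def reset(self, max_step: Optional[int] = None):
        # Optionally override the episode length, then restart the counter.
        if max_step is not None:
            self.max_step = max_step
        self.cnt = 0
        return None, {aid: 0 for aid in self.agents}

    def time_step(
        self, actions: Dict[AgentID, Any]
    ) -> Tuple[Dict[AgentID, Any], Dict[AgentID, float], Dict[AgentID, bool], Dict[AgentID, Any]]:
        # Toy stepping logic: the reward echoes the action value.
        self.cnt += 1
        observations = {aid: self.cnt for aid in self.agents}
        rewards = {aid: float(actions[aid]) for aid in self.agents}
        dones = {aid: self.cnt >= self.max_step for aid in self.agents}
        infos = {aid: {} for aid in self.agents}
        return observations, rewards, dones, infos

    def step(self, actions: Dict[AgentID, Any]):
        # Assumption: step wraps time_step and adds a state slot,
        # which is None because state return is not activated here.
        observations, rewards, dones, infos = self.time_step(actions)
        state = None
        return state, observations, rewards, dones, infos


env = ToyEnvironment(max_step=2)
env.reset()
state, obs, rew, done, info = env.step({"agent_0": 1, "agent_1": 0})
```

A real subclass would implement time_step with the actual environment dynamics and leave the 5-tuple assembly to the base class.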

class malib.rollout.envs.env.GroupWrapper(env: Environment, aid_to_gid: Dict[str, str], agent_groups: Dict[str, List[str]])[source]

Bases: Wrapper

Construct a wrapper for a given environment instance.

Parameters: env (Environment) – Environment instance.

action_mask_extract(raw_observations: Dict[str, Any])[source]

property action_spaces: Dict[str, Space]: A dict of agent action spaces

property agent_groups: Dict[str, List[str]]

agent_to_group(agent_id: str) → str[source]

Map an agent id to its group id.

Parameters: agent_id (AgentID) – Agent id.
Returns: Group id.
Return type: str
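
The two constructor mappings, aid_to_gid and agent_groups, are inverses of each other, which a short example makes concrete. The sketch below is illustrative only: invert_groups is a hypothetical helper, not part of MALib, and the group and agent names are made up.

```python
from typing import Dict, List


def invert_groups(agent_groups: Dict[str, List[str]]) -> Dict[str, str]:
    """Hypothetical helper: derive an agent-to-group map from a group-to-agents map."""
    aid_to_gid: Dict[str, str] = {}
    for gid, members in agent_groups.items():
        for aid in members:
            # Each agent is expected to belong to exactly one group.
            if aid in aid_to_gid:
                raise ValueError(f"agent {aid!r} appears in more than one group")
            aid_to_gid[aid] = gid
    return aid_to_gid


agent_groups = {"team_a": ["a0", "a1"], "team_b": ["b0"]}
aid_to_gid = invert_groups(agent_groups)
```

With mappings of this shape, looking up aid_to_gid["b0"] gives the kind of result agent_to_group is documented to return, namely the group id as a str.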

build_state_from_observation(agent_observation: Dict[str, Any]) → Dict[str, ndarray][source]

Build state from raw observation.

Parameters: agent_observation (Dict[AgentID, Any]) – A dict of agent observations.
Raises: NotImplementedError – Not implemented error.
Returns: A dict of states.
Return type: Dict[str, np.ndarray]

build_state_spaces() → Dict[str, Space][source]: Call self.group_to_agents to build state space here

env_done_check(agent_dones: Dict[str, bool]) → bool[source]

property observation_spaces: Dict[str, Space]: A dict of agent observation spaces

property possible_agents: List[str]: Return a list of environment agent ids

record_episode_info_step(observations, rewards, dones, infos)[source]

Analyze timestep and record it as episode information.

Parameters:

state (Any) – Environment state.
observations (Dict[AgentID, Any]) – A dict of agent observations.
rewards (Dict[AgentID, Any]) – A dict of agent rewards.
dones (Dict[AgentID, bool]) – A dict of done signals.
infos (Any) – Information.

reset(max_step: Optional[int] = None) → Union[None, Dict[str, Dict[str, Any]]][source]: Reset environment and the episode info handler here.

property state_spaces: Dict[str, Space]

Return a dict of group state spaces.

Note

Users must implement the method build_state_spaces.

Returns: A dict of state spaces.
Return type: Dict[str, gym.Space]

time_step(actions: Dict[str, Any])[source]

Environment stepping logic.

Parameters: actions (Dict[AgentID, Any]) – Agent action dict.
Raises: NotImplementedError – Not implemented error.
Returns: A 4-tuple of (observations, rewards, dones, infos).
Return type: Tuple[Dict[AgentID, Any], Dict[AgentID, float], Dict[AgentID, bool], Dict[AgentID, Any]]

class malib.rollout.envs.env.Wrapper(env: Environment)[source]

Bases: Environment

Wraps the environment to allow a modular transformation.

Construct a wrapper for a given environment instance.

Parameters: env (Environment) – Environment instance.

property action_spaces: Dict[str, Space]: A dict of agent action spaces

close()[source]

collect_info() → Dict[str, Any][source]

property observation_spaces: Dict[str, Space]: A dict of agent observation spaces

property possible_agents: List[str]: Return a list of environment agent ids

render(*args, **kwargs)[source]

reset(max_step: Optional[int] = None) → Union[None, Tuple[Dict[str, Any]]][source]: Reset environment and the episode info handler here.

seed(seed: Optional[int] = None)[source]

step(actions: Dict[str, Any]) → Tuple[Dict[str, Any], Dict[str, Any], Dict[str, float], Dict[str, bool], Any][source]

Return a 5-tuple of (state, observation, reward, done, info). Each item is a dict that maps agent ids to the corresponding entity.

Note

If state return is not activated for this environment, the returned state is None.

Parameters: actions (Dict[AgentID, Any]) – A dict of agent actions.
Returns: A tuple ordered as (state, observation, reward, done, info).
Return type: Tuple[Dict[AgentID, Any], Dict[AgentID, Any], Dict[AgentID, float], Dict[AgentID, bool], Any]
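
The "modular transformation" idea behind Wrapper can be sketched with a pair of stand-in classes. Everything here is hypothetical and for illustration only: ConstantEnv and RewardScaleWrapper are not MALib classes and do not inherit from Environment or Wrapper. They only show the delegation pattern, in which the wrapper forwards calls to the inner environment and changes one part of the 5-tuple.

```python
from typing import Any, Dict, List


class ConstantEnv:
    """Hypothetical inner environment that pays every agent a reward of 1.0."""

    possible_agents: List[str] = ["a0", "a1"]

    def step(self, actions: Dict[str, Any]):
        obs = {aid: 0 for aid in self.possible_agents}
        rewards = {aid: 1.0 for aid in self.possible_agents}
        dones = {aid: False for aid in self.possible_agents}
        # The state slot is None, matching the documented note on state return.
        return None, obs, rewards, dones, {}


class RewardScaleWrapper:
    """Hypothetical wrapper: delegates to the inner env and rescales rewards."""

    def __init__(self, env: ConstantEnv, scale: float):
        self.env = env
        self.scale = scale

    @property
    def possible_agents(self) -> List[str]:
        # Properties are forwarded unchanged to the wrapped environment.
        return self.env.possible_agents

    def step(self, actions: Dict[str, Any]):
        state, obs, rewards, dones, infos = self.env.step(actions)
        scaled = {aid: r * self.scale for aid, r in rewards.items()}
        return state, obs, scaled, dones, infos


wrapped = RewardScaleWrapper(ConstantEnv(), scale=0.5)
_, _, rewards, _, _ = wrapped.step({"a0": 0, "a1": 0})
```

Because the wrapper exposes the same methods as the environment it wraps, several such transformations can be stacked without changing the code that calls step.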

malib.rollout.envs.vector_env module