malib.rollout.envs.mdp package

malib.rollout.envs.mdp.env_desc_gen(**config)[source]

Submodules

malib.rollout.envs.mdp.env module

class malib.rollout.envs.mdp.env.MDPEnvironment(**configs)[source]

Bases: Environment

property action_spaces: Dict[str, Space]

A dict of agent action spaces

close()[source]

property observation_spaces: Dict[str, Space]

A dict of agent observation spaces

property possible_agents: List[str]

Return a list of environment agent ids

render(*args, **kwargs)[source]

reset(max_step: Optional[int] = None) → Union[None, Sequence[Dict[str, Any]]][source]

Reset environment and the episode info handler here.

seed(seed: Optional[int] = None)[source]

time_step(actions: Dict[str, Any]) → Tuple[Dict[str, Any], Dict[str, float], Dict[str, bool], Dict[str, Any]][source]

Environment stepping logic.

Parameters:

actions (Dict[AgentID, Any]) – Agent action dict.

Raises:

NotImplementedError – Not implemented error

Returns:

A 4-tuple of (observations, rewards, dones, infos)

Return type:

Tuple[Dict[AgentID, Any], Dict[AgentID, float], Dict[AgentID, bool], Dict[AgentID, Any]]
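Example

The following is a minimal sketch of how the time_step contract documented above might be consumed in an episode loop. It does not import malib: ToyMDPEnv is a hypothetical stand-in that only mimics the documented return shape, and its horizon parameter, reward rule, simplified reset() and the run_episode helper are assumptions made for illustration. The configuration keys accepted by MDPEnvironment(**configs) are not covered here.

```python
from typing import Any, Callable, Dict, List, Tuple

AgentID = str


class ToyMDPEnv:
    """Hypothetical stand-in, NOT malib's MDPEnvironment.

    It only mirrors the documented interface: possible_agents, and a
    time_step that returns four dicts keyed by agent id.
    """

    def __init__(self, horizon: int = 3) -> None:
        self.horizon = horizon  # assumed episode length, for illustration
        self.state = 0

    @property
    def possible_agents(self) -> List[AgentID]:
        return ["agent"]

    def reset(self) -> Dict[AgentID, Any]:
        # Simplified: returns the initial observation dict directly.
        self.state = 0
        return {aid: self.state for aid in self.possible_agents}

    def time_step(
        self, actions: Dict[AgentID, Any]
    ) -> Tuple[
        Dict[AgentID, Any],
        Dict[AgentID, float],
        Dict[AgentID, bool],
        Dict[AgentID, Any],
    ]:
        self.state += 1
        observations = {aid: self.state for aid in actions}
        # Toy reward rule: the reward equals the action value.
        rewards = {aid: float(act) for aid, act in actions.items()}
        dones = {aid: self.state >= self.horizon for aid in actions}
        infos: Dict[AgentID, Any] = {aid: {} for aid in actions}
        return observations, rewards, dones, infos


def run_episode(
    env: ToyMDPEnv, policy: Callable[[Any], Any]
) -> Dict[AgentID, float]:
    """Roll out one episode and return the summed reward per agent."""
    observations = env.reset()
    returns = {aid: 0.0 for aid in env.possible_agents}
    done = False
    while not done:
        # One action per agent, keyed by agent id.
        actions = {aid: policy(obs) for aid, obs in observations.items()}
        observations, rewards, dones, infos = env.time_step(actions)
        for aid, reward in rewards.items():
            returns[aid] += reward
        done = all(dones.values())
    return returns


episode_returns = run_episode(ToyMDPEnv(horizon=3), policy=lambda obs: 1)
```

All four returned dicts share the same agent-id keys, so the loop can treat a single-agent MDP and a multi-agent environment uniformly.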