pgtg

Environment Arguments

The PGTG environment is highly customizable. Use these constructor arguments to modify it for your use case.

Argument Type Default Value Meaning
map_path str or None None Path to a pregenerated map to use. If set to None a new random map is generated instead.
random_map_width int 4 How many tiles the random generated map is wide.
random_map_height int 4 How many tiles the random generated map is heigh.
random_map_percentage_of_connections float 0.5 How many of the possible connections between tiles are generated in the random map. Each map has a path from start to goal, so a value of 0.0 will result in minimal connections, not none.
random_map_start_position tuple[int, int] or tuple[int, int, str] or "random" (0, -1, "west") Where the start of the random generated map is located. The two integers specify the x and y coordinates of the tile the start is located in, optionally its direction (“north”, “east”, “south”, or “west”) can be specified. The start can only be located on a map border.
random_map_goal_position tuple[int, int] or tuple[int, int, str] or "random" (-1, 0, "east") Where the goal of the random generated map is located. The two integers specify the x and y coordinates of the tile the goal is located in, optionally its direction (“north”, “east”, “south”, or “west”) can be specified. The goal can only be located on a map border.
random_map_minimum_distance_between_start_and_goal int or none None The minimal length of the path between start and goal. Can only be used if both random_map_start_position and random_map_goal_position are set to "random". The maximal allowed value is the hight plus the width of the map minus 2.
random_map_obstacle_probability float in [0,1] 0.0 The probability for each tile of the random map to be generated with an obstacle.
random_map_ice_probability_weight float 1.0 The relative probability of ice being chosen when an obstacle is generated for the random map.
random_map_broken_road_probability_weight float 1.0 The relative probability of broken road being chosen when an obstacle is generated for the random map.
random_map_sand_probability_weight float 1.0 The relative probability of sand being chosen when an obstacle is generated for the random map.
random_map_traffic_light_probability_weight float 1.0 The relative probability of traffic lights being chosen when an obstacle is generated for the random map.
render_mode "human" or "pil_image" or None None The Gymnasium render mode. If "human" is chosen a pygame window opens and advances to the next frame whenever step() is called, "pil_image" makes render() return a PIL.Image.Image and None to no output being generated.
features_to_include_in_observation list[str] ["walls", "goals", "ice", "broken road", "sand", "traffic", "traffic_light_green", "traffic_light_yellow", "traffic_light_red"] What features are included in the observation as one-hot encodings. Changing this argument changes the observation space.
use_sliding_observation_window bool False Wether or not the part of the map that is observable should slide along centered on the agent. If set to False the tile the agent is currently in is observed. Changing this argument changes the observation space.
sliding_observation_window_size int 4 How many squares the sliding observation window extends in all directions. A value of 1 results in a 3 x 3 observation windows, 2 in 5 x 5, and 3 in 7 x 7. Has no effect if use_sliding_observation_window is False. Changing this argument changes the observation space.
use_next_subgoal_direction bool False Wether or not to include a additional observation that indicates the direction of the next (sub) goal in the observation. Changing this argument changes the observation space.
sum_subgoals_reward int 100 The total reward awarded for reaching all (sub) goals. It is equally divided among all (sub) goals.
final_goal_bonus int 0 Additional reward for reaching the final goal in addition to the normal (sub) goal reward defined by sum_subgoals_reward.
crash_penalty int 100 Penalty for moving into a wall or traffic. The value is subtracted from the reward, thus a positive value should be used.
traffic_light_violation_penalty int 50 Penalty for running a red light. The value is subtracted from the reward, thus a positive value should be used.
standing_still_penalty int 0 Penalty for not moving or accelerating each. It is applied each step. The value is subtracted from the reward, thus a positive value should be used.
already_visited_position_penalty int 0 Penalty for moving to a square that was already visited this episode. The value is subtracted from the reward, thus a positive value should be used.
ice_probability float in [0,1] 0.1 The probability of the ice obstacle triggering when moved onto and moving the agent in a random direction.
street_damage_probability float in [0,1] 0.1 The probability of the broken road obstacle triggering when moved onto and destroying the agents tires (thus permanently limiting its to speed to 1).
sand_probability float in [0,1] 0.2 The probability of the sand obstacle triggering when moved onto and halting the agent.
traffic_density float in [0,1] 0.0 How many percent of the squares that could have traffic are occupied by it. A value of 1.0 means that all lanes are permanently completely filled with traffic, making it impossible to avoid collisions. Values between 0.01 and 0.05 are recommended.
traffic_light_phases_duration tuple[int, int, int] (10, 3, 10) Duration of the traffic light phases (green, yellow, red) in steps. The second value can be set to 0 to disable the yellow light phase.
ignore_traffic_collisions bool False Wether to ignore collisions with traffic. Sometimes useful for testing.