FoPra Beluga Challenge - Reinforcement Learning v1.0
Deep Reinforcement Learning solution for the Beluga Challenge shipping container optimization problem using PPO and MCTS
Beluga Challenge environment for reinforcement learning.
Public Member Functions

__init__(self, str path, int base_index=-1)
    Initialize the Beluga Challenge environment.
step(self, str action_name, params=None)
    Execute a single environment step with the given action.
reset(self)
    Reset the environment with a new problem instance.
reset_specific_problem(self, problem)
    Reset the environment with a specific problem instance.
get_reward(self, bool could_execute, str action_name, production_line_n_old)
    Calculate the reward for the current action.
get_observation_high_level(self)
    Get the high-level observation of the current state.
get_max_steps(self)
    Get the maximum number of steps to solve the current problem.
check_action_execution(self, str action_name, obs)
    Check if the action can be executed without actually executing it.
Public Attributes

ProblemState state = None
path = path
int step_count = 0
problem_name = None
list sorted_problems = []
int problem_count = 0
base_index = base_index
int problems_solved = 0
int block_size = 6
dict check_action_map
Detailed Description

This class implements the main environment for the Beluga Challenge shipping container optimization problem. It manages problem states, action execution, reward calculation, and episode management.
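A minimal sketch of a typical episode loop, pieced together from the signatures documented below. The return value of step() and the concrete action names are not specified in this reference, so both are assumptions; a trained PPO/MCTS agent would replace the hard-coded action choice.

    from rl.env.environment import Env

    # Construct the environment from a folder of problem JSON files.
    env = Env("problems/", base_index=-1)

    env.reset()                               # load the next problem instance
    obs = env.get_observation_high_level()    # documented observation accessor

    for _ in range(env.get_max_steps()):      # per-problem step budget
        # "load_beluga" is a hypothetical action name; the real action set
        # is assumed to be keyed by env.check_action_map.
        action_name = "load_beluga"
        if env.check_action_execution(action_name, obs):
            env.step(action_name)             # return shape is not documented here
        obs = env.get_observation_high_level()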
Member Function Documentation

rl.env.environment.Env.__init__(self, str path, int base_index=-1)

Initialize the Beluga Challenge environment.

Parameters
    path        Path to the directory containing problem JSON files
    base_index  Base index for problem selection
rl.env.environment.Env.check_action_execution(self, str action_name, obs)

Check if the action can be executed without actually executing it.

Parameters
    action_name  Name of the action to check
    obs          Current observation of the environment
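One plausible use of this method is action masking for a PPO policy: query every candidate action against the current observation and keep only the executable ones. This usage is not prescribed by the reference, and check_action_map being keyed by action name is an assumption.

    # Build a validity mask over the candidate action set, e.g. to mask
    # policy logits before sampling.
    candidate_actions = list(env.check_action_map.keys())
    obs = env.get_observation_high_level()
    mask = [env.check_action_execution(name, obs) for name in candidate_actions]
    valid_actions = [a for a, ok in zip(candidate_actions, mask) if ok]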
rl.env.environment.Env.get_max_steps(self)

Get the maximum number of steps to solve the current problem.
rl.env.environment.Env.get_observation_high_level(self)

Get the high-level observation of the current state.
rl.env.environment.Env.get_reward(self, bool could_execute, str action_name, production_line_n_old)

Calculate the reward for the current action.

Parameters
    could_execute          Boolean indicating if the action was successfully executed
    action_name            Name of the action taken
    production_line_n_old  Number of production lines before the action
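The reference does not state the reward formula. The sketch below only illustrates how the three parameters could plausibly combine: a penalty when the action could not execute, a small per-step cost, and a bonus when the production-line count drops. Every constant and the production-line accessor are invented.

    def reward_sketch(env, could_execute, action_name, production_line_n_old):
        # Illustrative only -- the real formula lives in Env.get_reward.
        # action_name is unused here, though the real function may weight
        # different actions differently.
        if not could_execute:
            return -1.0                     # assumed penalty for an invalid action
        reward = -0.01                      # assumed small per-step cost
        production_line_n = len(env.state.production_lines)  # attribute name assumed
        if production_line_n < production_line_n_old:
            reward += 1.0                   # assumed bonus for finishing a line
        return reward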
rl.env.environment.Env.reset(self)

Reset the environment with a new problem instance.

Resets the environment's state from a JSON file in the problems folder, selecting problems in ascending order of jig count (in blocks of block_size = 6 problems).
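A sketch of that selection scheme under stated assumptions: each problem JSON is assumed to carry a "jigs" collection, and advancing one block per block_size solved problems is an assumed progression rule; neither detail is specified by this reference.

    import json
    import os
    import random

    def sort_problems_by_jig_count(path):
        # Order problem files by ascending jig count ("jigs" key is assumed).
        def jig_count(filename):
            with open(os.path.join(path, filename)) as f:
                return len(json.load(f)["jigs"])
        return sorted(os.listdir(path), key=jig_count)

    def pick_problem(sorted_problems, problems_solved, block_size=6):
        # Draw from the current block of 6 easiest unsolved problems;
        # the progression rule itself is an assumption.
        start = min((problems_solved // block_size) * block_size,
                    max(len(sorted_problems) - block_size, 0))
        return random.choice(sorted_problems[start:start + block_size])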
rl.env.environment.Env.reset_specific_problem(self, problem)

Reset the environment with a specific problem instance.

Parameters
    problem  Path to the specific problem JSON file
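This is useful for evaluating on a fixed instance rather than following the block ordering used by reset(); the file name below is hypothetical.

    # Pin the environment to one instance, e.g. for evaluation runs.
    env.reset_specific_problem("problems/problem_42.json")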
rl.env.environment.Env.step(self, str action_name, params=None)

Execute a single environment step with the given action.

Parameters
    action_name  Name of the action to execute
    params       Parameters for the action (optional)
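Hypothetical calls showing both forms; the action names and parameter keys are invented for illustration, since the reference does not enumerate the action set.

    # Parameterized action: name and params keys are hypothetical.
    env.step("stack_rack", params={"jig": "jig0001", "rack": "rack02"})

    # Action without parameters.
    env.step("unload_beluga")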
Member Data Documentation

rl.env.environment.Env.base_index = base_index
int rl.env.environment.Env.block_size = 6
dict rl.env.environment.Env.check_action_map
rl.env.environment.Env.path = path
int rl.env.environment.Env.problem_count = 0
rl.env.environment.Env.problem_name = None
int rl.env.environment.Env.problems_solved = 0
list rl.env.environment.Env.sorted_problems = []
ProblemState rl.env.environment.Env.state = None
int rl.env.environment.Env.step_count = 0