FoPra Beluga Challenge - Reinforcement Learning v1.0
Deep Reinforcement Learning solution for the Beluga Challenge shipping container optimization problem using PPO and MCTS
rl.mcts.mcts_node.MCTSNode Class Reference

Node class for Monte Carlo Tree Search. More...

Public Member Functions

 __init__ (self, state, parent=None, action=None, depth=0)
 Initialize a new MCTS node.
 
 is_root (self)
 Check if this node is the root of the tree.
 
 is_terminal (self)
 Check if this node represents a terminal state.
 
 is_fully_expanded (self)
 Check if all possible actions have been tried from this node.
 
 get_untried_action (self)
 Get list of actions that haven't been tried yet from this node.
 
 expand (self, tuple candidate)
 Expand the node by adding a new child for the given action.
 
 add_child (self, child)
 Add a child node to this node.
 
 best_child (self, exploration_weight=1.0)
 Select the best child node using UCT (Upper Confidence bounds applied to Trees)
 
 backpropagate (self, reward)
 Backpropagate the reward up the tree to all ancestors.
 

Public Attributes

 state = state
 
 parent = parent
 
 action = action
 
 depth = depth
 
list children = []
 
int visits = 0
 
float total_reward = 0.0
 
int children = 0:
 

Detailed Description

Node class for Monte Carlo Tree Search.

Each node represents a state in the search tree and maintains information about visits, rewards, and possible actions.

Constructor & Destructor Documentation

◆ __init__()

rl.mcts.mcts_node.MCTSNode.__init__ ( self,
state,
parent = None,
action = None,
depth = 0 )

Initialize a new MCTS node.

Parameters
stateCurrent problem state
parentParent node in the search tree
actionAction taken to reach this node
depthDepth of this node in the tree
23 def __init__(self, state, parent=None, action=None, depth=0):
24 """!
25 @brief Initialize a new MCTS node
26 @param state Current problem state
27 @param parent Parent node in the search tree
28 @param action Action taken to reach this node
29 @param depth Depth of this node in the tree
30 """
31 self.state = state # Current state (later observation)
32 self.parent = parent # Parent node
33 self.action = action # Action taken to reach this node
34 self.depth = depth # Depth in the tree
35 self.children = [] # List of child nodes
36 self.visits = 0 # Number of visits to this node
37 self.total_reward = 0.0 # Total reward accumulated from this node
38

Member Function Documentation

◆ add_child()

rl.mcts.mcts_node.MCTSNode.add_child ( self,
child )

Add a child node to this node.

Parameters
childChild node to add
91 def add_child(self, child):
92 """!
93 @brief Add a child node to this node
94 @param child Child node to add
95 """
96 self.children.append(child)
97

◆ backpropagate()

rl.mcts.mcts_node.MCTSNode.backpropagate ( self,
reward )

Backpropagate the reward up the tree to all ancestors.

Parameters
rewardReward value to propagate
119 def backpropagate(self, reward):
120 """!
121 @brief Backpropagate the reward up the tree to all ancestors
122 @param reward Reward value to propagate
123 """
124 self.visits += 1
125 self.total_reward += reward
126 if self.parent:
127 self.parent.backpropagate(reward)

◆ best_child()

rl.mcts.mcts_node.MCTSNode.best_child ( self,
exploration_weight = 1.0 )

Select the best child node using UCT (Upper Confidence bounds applied to Trees)

Parameters
exploration_weightWeight for exploration vs exploitation trade-off
Returns
Best child node according to UCT formula
98 def best_child(self, exploration_weight=1.0):
99 """!
100 @brief Select the best child node using UCT (Upper Confidence bounds applied to Trees)
101 @param exploration_weight Weight for exploration vs exploitation trade-off
102 @return Best child node according to UCT formula
103 """
104 # Example: UCT formula
105 best_score = float('-inf')
106 best = None
107 for child in self.children:
108 if child.visits == 0:
109 score = float('inf')
110 else:
111 exploitation = child.total_reward / child.visits
112 exploration = exploration_weight * math.sqrt(math.log(self.visits) / child.visits)
113 score = exploitation + exploration
114 if score > best_score:
115 best_score = score
116 best = child
117 return best
118

◆ expand()

rl.mcts.mcts_node.MCTSNode.expand ( self,
tuple candidate )

Expand the node by adding a new child for the given action.

Parameters
candidateTuple of (action_name, parameters) to expand
Returns
The newly created child node
79 def expand(self, candidate: tuple):
80 """!
81 @brief Expand the node by adding a new child for the given action
82 @param candidate Tuple of (action_name, parameters) to expand
83 @return The newly created child node
84 """
85 new_state = self.state.copy() # Create copy
86 new_state.apply_action(candidate[0], candidate[1]) # Apply action on copy
87 child_node = MCTSNode(state=new_state, parent=self, action=candidate, depth=self.depth + 1)
88 self.add_child(child_node)
89 return child_node
90

◆ get_untried_action()

rl.mcts.mcts_node.MCTSNode.get_untried_action ( self)

Get list of actions that haven't been tried yet from this node.

Returns
List of untried (action_name, parameters) tuples
60 def get_untried_action(self):
61 """!
62 @brief Get list of actions that haven't been tried yet from this node
63 @return List of untried (action_name, parameters) tuples
64 """
65 # If root node has a specific action without parameters
66 if self.is_root() and self.action is not None and self.action[1] is None:
67 # Only return parameters for this specific action
68 action_name = self.action[0]
69 all_params = self.state.enumerate_valid_params(action_name)
70 tried_params = [child.action[1] for child in self.children]
71 # Return only untried parameters
72 return [(action_name, param) for param in all_params if param not in tried_params]
73 else:
74 # Normal behavior for other nodes
75 all_possible_actions = self.state.get_possible_actions()
76 tried_actions = [child.action for child in self.children]
77 return [action for action in all_possible_actions if action not in tried_actions]
78

◆ is_fully_expanded()

rl.mcts.mcts_node.MCTSNode.is_fully_expanded ( self)

Check if all possible actions have been tried from this node.

Returns
True if fully expanded, False otherwise
53 def is_fully_expanded(self):
54 """!
55 @brief Check if all possible actions have been tried from this node
56 @return True if fully expanded, False otherwise
57 """
58 return len(self.get_untried_action()) == 0
59

◆ is_root()

rl.mcts.mcts_node.MCTSNode.is_root ( self)

Check if this node is the root of the tree.

Returns
True if this is the root node, False otherwise
39 def is_root(self):
40 """!
41 @brief Check if this node is the root of the tree
42 @return True if this is the root node, False otherwise
43 """
44 return self.parent is None
45

◆ is_terminal()

rl.mcts.mcts_node.MCTSNode.is_terminal ( self)

Check if this node represents a terminal state.

Returns
True if this is a terminal state, False otherwise
46 def is_terminal(self):
47 """!
48 @brief Check if this node represents a terminal state
49 @return True if this is a terminal state, False otherwise
50 """
51 return self.state.is_terminal()
52

Member Data Documentation

◆ action

rl.mcts.mcts_node.MCTSNode.action = action

◆ children [1/2]

list rl.mcts.mcts_node.MCTSNode.children = []

◆ children [2/2]

int rl.mcts.mcts_node.MCTSNode.children = 0:

◆ depth

rl.mcts.mcts_node.MCTSNode.depth = depth

◆ parent

rl.mcts.mcts_node.MCTSNode.parent = parent

◆ state

rl.mcts.mcts_node.MCTSNode.state = state

◆ total_reward

float rl.mcts.mcts_node.MCTSNode.total_reward = 0.0

◆ visits

int rl.mcts.mcts_node.MCTSNode.visits = 0

The documentation for this class was generated from the following file: