AI Snake: Explore ML as a Game
This model is designed to target specific objectives during training.
This model does not focus on specific objectives and allows for more exploration.
This model encourages exploration by incorporating curiosity-driven learning.
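As a rough illustration of the curiosity-driven idea, here is a minimal sketch (not the app's actual code; the function name and `curiosity_weight` are assumptions): the reward is augmented with an intrinsic bonus proportional to how badly a learned forward model predicted the next state, so poorly understood states become attractive to visit.

```python
import numpy as np

def curiosity_reward(extrinsic_reward, predicted_next_state, actual_next_state,
                     curiosity_weight=0.1):
    # Intrinsic bonus: mean squared prediction error of the forward model.
    intrinsic = float(np.mean((np.asarray(predicted_next_state) -
                               np.asarray(actual_next_state)) ** 2))
    # States the model cannot predict well yield extra reward, encouraging exploration.
    return extrinsic_reward + curiosity_weight * intrinsic
```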
Adjust the proportion between generated data and experience data used during training.
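One way to picture this proportion is the following sketch (the function and the `generated_fraction` parameter are assumptions, not the app's actual implementation): each training batch is assembled partly from generated data and partly from replayed experience.

```python
import random

def mixed_batch(generated_data, experience_data, batch_size, generated_fraction=0.5):
    # Take generated_fraction of the batch from generated data, the rest from experience.
    n_gen = min(int(batch_size * generated_fraction), len(generated_data))
    batch = random.sample(generated_data, n_gen)
    batch += random.sample(experience_data,
                           min(batch_size - n_gen, len(experience_data)))
    random.shuffle(batch)
    return batch
```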
Adjust the interval between the snake's moves. Lower values mean a faster snake.
Fine-tune the model using existing training data to improve its performance incrementally.
Train the model from scratch using a new set of data for a completely fresh start.
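The difference between the two training modes can be pictured with a short sketch (the file name and helper are assumptions): fine-tuning continues from previously saved weights, while training from scratch re-initializes them.

```python
import os
import numpy as np

def init_weights(n_inputs, n_outputs, weights_file="snake_model.npz", fine_tune=True):
    # Fine-tune: continue from previously saved weights if they exist.
    if fine_tune and os.path.exists(weights_file):
        data = np.load(weights_file)
        return data["w"], data["b"]
    # From scratch: small random weights and zero biases for a fresh start.
    w = np.random.randn(n_inputs, n_outputs) * 0.01
    b = np.zeros(n_outputs)
    return w, b
```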
Number of times the model will cycle through the training data.
Number of training examples utilized in one iteration.
How much to change the model in response to the estimated error each time the model weights are updated.
Rate at which the learning rate decreases after each epoch.
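Concretely, the four settings above (epochs, batch size, learning rate, and its decay) drive a training loop roughly like this minimal sketch; the plain linear model and all names here are illustrative assumptions, not the app's actual code.

```python
import numpy as np

def train(X, y, epochs=50, batch_size=32, learning_rate=0.01, lr_decay=0.95):
    rng = np.random.default_rng(0)
    w = np.zeros(X.shape[1])
    for epoch in range(epochs):                     # cycle through the data `epochs` times
        order = rng.permutation(len(X))
        for start in range(0, len(X), batch_size):  # one weight update per mini-batch
            idx = order[start:start + batch_size]
            error = X[idx] @ w - y[idx]
            grad = X[idx].T @ error / len(idx)      # gradient of the estimated error
            w -= learning_rate * grad               # step size set by the learning rate
        learning_rate *= lr_decay                   # shrink the learning rate each epoch
    return w
```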
Penalty term added to the loss function to encourage smaller weights and reduce overfitting.
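This penalty corresponds to an extra term in the loss, as in the sketch below (the `l2_lambda` coefficient name is an assumption):

```python
import numpy as np

def loss_with_l2(predictions, targets, weights, l2_lambda=1e-4):
    # Data term: mean squared error between predictions and targets.
    mse = np.mean((predictions - targets) ** 2)
    # Penalty term: sum of squared weights, scaled by l2_lambda, discourages large weights.
    return mse + l2_lambda * np.sum(weights ** 2)
```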
Alpha parameter for the Leaky ReLU activation function, controlling the slope of the activation for negative inputs.
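Leaky ReLU itself is a one-liner; alpha sets the slope applied to negative inputs (a minimal sketch in Python):

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Positive inputs pass through unchanged; negative inputs are scaled by alpha instead of zeroed.
    return np.where(x > 0, x, alpha * x)
```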
Fraction of input units to drop to prevent overfitting during training.
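Dropout during training can be sketched as randomly zeroing a fraction of the units and rescaling the rest (inverted dropout); the function and parameter names below are assumptions.

```python
import numpy as np

def dropout(x, drop_rate=0.2, training=True):
    x = np.asarray(x, dtype=float)
    if not training or drop_rate == 0.0:
        return x
    # Keep each unit with probability (1 - drop_rate); rescale so the expected activation is unchanged.
    mask = np.random.rand(*x.shape) >= drop_rate
    return x * mask / (1.0 - drop_rate)
```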
Number of epochs with no improvement after which training will be stopped.
Minimum change in the monitored quantity to qualify as an improvement.
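Patience and minimum delta work together as in this early-stopping sketch (the class and variable names are assumptions):

```python
class EarlyStopping:
    def __init__(self, patience=5, min_delta=1e-4):
        self.patience = patience          # epochs allowed without improvement
        self.min_delta = min_delta        # smallest loss drop that counts as improvement
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def should_stop(self, val_loss):
        if self.best_loss - val_loss > self.min_delta:
            self.best_loss = val_loss     # real improvement: reset the counter
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1          # no (sufficient) improvement this epoch
        return self.bad_epochs >= self.patience
```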
Threshold to clip gradient norms to prevent exploding gradients during training.
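Gradient-norm clipping rescales the whole gradient whenever its norm exceeds the threshold; a minimal sketch:

```python
import numpy as np

def clip_gradient_norm(grad, max_norm=1.0):
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        # Keep the gradient's direction but shrink its norm to max_norm.
        grad = grad * (max_norm / norm)
    return grad
```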
Coefficient for the entropy term in the loss function to encourage exploration.
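The entropy term rewards policies that keep their action probabilities spread out; a sketch of how the coefficient might enter the loss (names and the exact form are assumptions):

```python
import numpy as np

def loss_with_entropy_bonus(policy_loss, action_probs, entropy_coef=0.01):
    # Entropy of the action distribution; higher entropy means more exploration.
    probs = np.clip(action_probs, 1e-8, 1.0)
    entropy = -np.sum(probs * np.log(probs))
    # Subtracting the bonus means minimizing the loss favors higher-entropy policies.
    return policy_loss - entropy_coef * entropy
```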
Level of noise added to the state inputs to improve robustness.
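State noise can be sketched as small Gaussian perturbations added to the inputs before they reach the network (names assumed):

```python
import numpy as np

def noisy_state(state, noise_level=0.05):
    # Gaussian noise with standard deviation `noise_level` added to each state feature.
    return np.asarray(state, dtype=float) + np.random.normal(0.0, noise_level, np.shape(state))
```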
Reward value for the snake when it eats food.
Penalty value for the snake when it collides with itself.
Penalty value for the snake for each move it makes (encourages faster completion).
Penalty value for the snake if it fails to make progress towards the food.
Reward value for the snake based on its proximity to the food.
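Taken together, the five reward settings above might combine as in this sketch; all names, default values, and the exact form are assumptions rather than the app's actual reward function.

```python
def step_reward(ate_food, hit_self, moved_closer, distance_to_food, grid_size,
                food_reward=10.0, self_collision_penalty=-10.0, move_penalty=-0.1,
                no_progress_penalty=-0.5, proximity_reward=1.0):
    reward = move_penalty                      # every move costs a little
    if ate_food:
        reward += food_reward                  # big bonus for eating
    if hit_self:
        reward += self_collision_penalty       # big penalty for self-collision
    if not moved_closer:
        reward += no_progress_penalty          # discourage wandering away from the food
    # Rough normalization: the closer to the food, the larger the proximity bonus.
    reward += proximity_reward * (1.0 - distance_to_food / grid_size)
    return reward
```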
Size of the replay buffer that stores experiences for training.
Exponent for prioritizing experiences in the replay buffer.
Small value added to priorities to ensure all experiences have a non-zero probability of being selected.
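The three replay-buffer settings (capacity, priority exponent, and the small priority offset) fit together roughly as in this simplified sketch of prioritized sampling; the class and parameter names are assumptions.

```python
import numpy as np
from collections import deque

class PrioritizedReplayBuffer:
    def __init__(self, capacity=10000, alpha=0.6, priority_eps=1e-3):
        self.buffer = deque(maxlen=capacity)     # oldest experiences are discarded first
        self.priorities = deque(maxlen=capacity)
        self.alpha = alpha                       # 0 = uniform sampling, 1 = fully prioritized
        self.priority_eps = priority_eps         # keeps every priority strictly positive

    def add(self, experience, td_error):
        self.buffer.append(experience)
        self.priorities.append(abs(td_error) + self.priority_eps)

    def sample(self, batch_size):
        scaled = np.array(self.priorities) ** self.alpha
        probs = scaled / scaled.sum()            # sampling probability per experience
        idx = np.random.choice(len(self.buffer),
                               size=min(batch_size, len(self.buffer)),
                               p=probs, replace=False)
        return [self.buffer[i] for i in idx]
```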
Probability of selecting a random action instead of the best action during training.
Rate at which the exploration epsilon decreases after each episode.
Minimum value for the exploration epsilon to ensure some exploration is always present.
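These last three settings describe a standard epsilon-greedy schedule; a minimal sketch (names assumed):

```python
import random

def choose_action(q_values, epsilon):
    # With probability epsilon pick a random action, otherwise the greedy one.
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def decay_epsilon(epsilon, decay_rate=0.995, epsilon_min=0.01):
    # Shrink epsilon after each episode, but never below the minimum.
    return max(epsilon * decay_rate, epsilon_min)
```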
If you would like to see how the ML works, right-click the page, select "Inspect," open your browser's Console, and enjoy the heartbeat of the machine.
If you want to deploy your own ML app, visit ML in Health Science: Playground.