Learning automata are finite-state adaptive systems that interact iteratively with a general environment. Through a probabilistic trial-and-error process, they try to select the action that produces the best response. The learning automata model can be thought of as two components: an automaton and an environment. The learning cycle begins with the automaton generating an action that is input to the environment. The environment evaluates the action and returns a feedback signal to the automaton that represents the degree of success the action had in the environment. This signal is used to alter the automaton's structure and so improve its action selection strategy. Actions are selected probabilistically: the automaton maintains an internal probability distribution over its actions and samples from it. The probability update rule is the heart of the learning automaton, and many different learning rules have been developed to improve the convergence speed and accuracy of these systems.
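The learning cycle described above can be illustrated with a minimal Python sketch. It is not the code behind this demo. The update rule shown is the standard linear reward-inaction scheme, chosen only as a familiar example, and the environment is assumed to return a binary signal (1 for success, 0 for failure) with a fixed success probability per action. All function names, the learning rate and the reward probabilities are hypothetical.

```python
import random


def select_action(probs, rng):
    """Sample an action index from the automaton's probability distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def environment(action, reward_probs, rng):
    """Assumed stationary environment: returns 1 (success) or 0 (failure)."""
    return 1 if rng.random() < reward_probs[action] else 0


def update_reward_inaction(probs, action, feedback, rate=0.05):
    """Linear reward-inaction update (an illustrative choice of rule).

    On success, every probability is scaled by (1 - rate) and the freed
    mass is given to the chosen action. On failure, nothing changes.
    """
    if feedback == 0:
        return list(probs)
    new_probs = [p * (1.0 - rate) for p in probs]
    new_probs[action] += rate
    return new_probs


def run(reward_probs, steps=5000, rate=0.05, seed=0):
    """Repeat the cycle: select an action, get feedback, update probabilities."""
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    probs = [1.0 / n_actions] * n_actions  # start with no preference
    for _ in range(steps):
        action = select_action(probs, rng)
        feedback = environment(action, reward_probs, rng)
        probs = update_reward_inaction(probs, action, feedback, rate)
    return probs


if __name__ == "__main__":
    # Hypothetical environment with three actions of differing quality.
    print(run([0.2, 0.9, 0.5]))
```

With a small learning rate the probability mass tends to concentrate on the action with the highest success probability, although reward-inaction schemes can occasionally lock onto a poorer action.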
The learning rules available in this demo are:
The following rules also use times-selected and times-updated vectors, which are initialized by trying each action ten times:
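The initialization step mentioned above might look like the following Python sketch. The text does not define the two vectors precisely, so this is an assumption: one vector counts how often each action has been tried, and the other accumulates the feedback received for it, giving an initial estimate of each action's quality. The function name, the `environment` callable and the `feedback_totals` vector are hypothetical.

```python
def initialise_estimates(n_actions, environment, trials_per_action=10):
    """Try every action a fixed number of times before learning starts.

    `environment` is any callable that maps an action index to a numeric
    feedback value (assumed here to lie in [0, 1]).
    """
    times_selected = [0] * n_actions     # how often each action was tried
    feedback_totals = [0.0] * n_actions  # accumulated feedback per action (assumed meaning)
    for action in range(n_actions):
        for _ in range(trials_per_action):
            times_selected[action] += 1
            feedback_totals[action] += environment(action)
    # Average feedback per action serves as the initial quality estimate.
    estimates = [feedback_totals[a] / times_selected[a] for a in range(n_actions)]
    return times_selected, feedback_totals, estimates
```

Starting from counts of ten rather than zero avoids division by zero and gives every action a rough estimate before the probability updates begin.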
For further information on learning automata, please refer to my publications or contact me at mnwhowell@gmail.com.