- In reinforcement learning (RL), a machine learns to act in its environment much as a person or an animal does: by trial and error. Nobody tells it which action is correct; it learns from the consequences of its own actions. The purpose of reinforcement learning is therefore to find a good policy (a rule for choosing an action in every situation), not just a single good decision. To understand it, take the example of your pet dog. Consider teaching the dog a new trick: you cannot tell it what to do, but you can reward it when it does the right thing and penalize it when it does the wrong thing. The RL agent (the dog) has to figure out which of its actions led to the reward or punishment. It learns by interacting with its environment and observing the results of those interactions.
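The dog example can be sketched in a few lines of Python. This is only an illustration under assumptions of our own: the trick names, the +1/-1 rewards, and the `train_dog` function are hypothetical, and the learning rule used here (epsilon-greedy action selection with a running average of rewards) is one simple choice among many, not the only way to do RL.

```python
import random

def train_dog(tricks, rewarded_trick, episodes=500, epsilon=0.1, seed=0):
    """Learn which trick earns a treat purely from reward, with no instructions."""
    rng = random.Random(seed)
    value = {t: 0.0 for t in tricks}   # running average reward per trick
    count = {t: 0 for t in tricks}     # how often each trick was tried
    for _ in range(episodes):
        if rng.random() < epsilon:
            trick = rng.choice(tricks)                     # explore: try something at random
        else:
            trick = max(tricks, key=lambda t: value[t])    # exploit: best trick so far
        # The "trainer" never says what to do; it only rewards or penalizes.
        reward = 1.0 if trick == rewarded_trick else -1.0
        count[trick] += 1
        value[trick] += (reward - value[trick]) / count[trick]  # incremental mean
    return value

values = train_dog(["sit", "bark", "roll_over"], rewarded_trick="roll_over")
best_trick = max(values, key=values.get)
print(best_trick, values)
```

The agent is never shown the correct trick. It ends up preferring `roll_over` only because that is the action whose average reward turned out highest.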
5 Things You Should Know About RL:
- There is no supervisor, only a notion of reward
- That means the RL agent is the one making the decisions, guided by the reward it receives for its actions.
- Feedback is delayed, not instantaneous. Unlike in other ML techniques, the reward may arrive many steps after the action that earned it, sometimes only at the very end of the task.
- Time matters. The data is sequential rather than independent and identically distributed (i.i.d.): its distribution changes over time as the agent makes decisions.
- The agent’s actions affect the subsequent data it receives. Because the agent is actually interacting with the environment, the action it takes now determines what it observes next, and therefore which choices it faces later.
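The points above (reward only, delayed feedback, and actions shaping future data) can be seen together in a toy example. The sketch below assumes a made-up five-cell corridor in which the only reward comes on reaching the last cell, and it uses tabular Q-learning as the learning method. The environment, the names `step` and `q_learning`, and all parameter values are illustrative assumptions rather than anything prescribed by RL itself.

```python
import random

ACTIONS = {"left": -1, "right": +1}
GOAL = 4  # rightmost cell of a 5-cell corridor (cells 0..4)

def step(state, action):
    """Environment: the chosen action determines the next state the agent sees."""
    next_state = min(max(state + ACTIONS[action], 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0  # feedback arrives only at the end
    return next_state, reward, next_state == GOAL

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2,
               max_steps=200, seed=0):
    rng = random.Random(seed)
    # q[state][action] estimates the discounted future reward of that choice.
    q = {s: {a: 0.0 for a in ACTIONS} for s in range(GOAL + 1)}
    for _episode in range(episodes):
        state = 0
        for _step in range(max_steps):
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))          # explore
            else:
                best = max(q[state].values())               # exploit, ties broken randomly
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Propagate the delayed reward backwards through earlier states.
            target = reward + (0.0 if done else gamma * max(q[next_state].values()))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

q = q_learning()
policy = {s: max(ACTIONS, key=lambda a: q[s][a]) for s in range(GOAL)}
print(policy)
```

No single step before the goal is ever rewarded, yet the learned policy should choose `right` in every cell: the value of the final reward is passed back, one state at a time, to the earlier actions that made it possible.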