So I'm thinking of developing an algorithm that allows a robot to learn how to move forwards, backwards and turn. My idea is that it would work with random numbers and a reward system. But if the robot already knows that getting reward 'x' is good and reward 'y' is bad, surely this would defeat the point of it learning how to move, since we have already built in some knowledge of what's good and bad.
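To make the idea concrete, here is a minimal sketch of what "random numbers plus a reward system" usually looks like in practice (a simple epsilon-greedy value learner). Everything here is hypothetical: the action names, the reward function that favours `forward`, and all parameter values are just illustrative assumptions, not part of the original question. The point it demonstrates is that the designer-supplied reward only says *what* outcome is good, while the agent still has to discover by trial and error *which* action produces it:

```python
import random

# Hypothetical action set for the robot.
ACTIONS = ["forward", "backward", "turn_left", "turn_right"]

def reward(action):
    # Designer-supplied reward signal (an assumption for this sketch):
    # +1 for forward progress, 0 for everything else.
    return 1.0 if action == "forward" else 0.0

def train(episodes=1000, epsilon=0.1, alpha=0.1, seed=0):
    rng = random.Random(seed)
    # The agent starts with no knowledge: every action looks equally good.
    values = {a: 0.0 for a in ACTIONS}
    for _ in range(episodes):
        # Epsilon-greedy: mostly pick the best-known action,
        # but sometimes pick a random one to keep exploring.
        if rng.random() < epsilon:
            a = rng.choice(ACTIONS)
        else:
            a = max(values, key=values.get)
        # Nudge the estimated value of the chosen action
        # toward the reward actually observed.
        values[a] += alpha * (reward(a) - values[a])
    return values

values = train()
best = max(values, key=values.get)
```

After training, `best` should be `"forward"`: the reward function never told the agent which action to take, only scored the outcomes, so the mapping from actions to rewards was still learned rather than pre-programmed.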