Author Topic: Learning Robot (Read 4916 times)

Commanderbob · « **on:** November 25, 2008, 11:23:01 PM »

Has anyone made a learning robot? As hard as this seems, I think it would be possible. You could log events like "went forward for 10 seconds" and somehow have a way of knowing whether another event is good or bad. The robot could then tell that a certain sequence of events leads to a good outcome, which it should try to make happen, or to a bad outcome, which it should try to avoid. I am more of a software guy, so I'll try my ideas out once I finally get a nice working robot platform.
Justin

szhang · « **Reply #1 on:** November 25, 2008, 11:59:56 PM »

Of course. Read up on reinforcement learning and Q-learning. Search for the Stanford helicopter to see some wicked videos of what they can do using reinforcement learning.

http://cs.stanford.edu/groups/helicopter/

Commanderbob · « **Reply #2 on:** November 26, 2008, 03:11:06 PM »

Thanks, that helicopter is amazing!

Has anyone here made a learning robot?

szhang · « **Reply #3 on:** November 26, 2008, 06:07:54 PM »

Define "learning". The term is so general that a lot of robots can be said to "learn". Admin's hypersquirrel (http://www.societyofrobots.com/robot_hyper_squirrel.shtml) is an example of learning (the layout of the house) using reactive mapping.

Commanderbob · « **Reply #4 on:** November 26, 2008, 06:54:50 PM »

That is more like memorizing, not learning.

I mean something like this: you give your robot a color sensor so it can see lines on the floor, and every time it crosses a line you kick it (or do something else bad). Eventually it should learn not to cross lines.

Or, on the other hand, if you had red and blue lines, every time it crosses a blue line you kick it, but every time it crosses a red line something good happens. It will then learn to find red lines and avoid blue ones.
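As a rough illustration of the red and blue line idea, here is a minimal tabular Q-learning sketch in Python (Q-learning is the technique szhang mentioned above). Everything specific in it is an assumption made for illustration: the five-cell track, the rewards of +1 and -1, and the parameter names ALPHA, GAMMA and EPSILON. A real robot would first have to turn its sensor readings into states.

```python
import random

# Hypothetical 1-D track: cell 0 holds the blue line (bad), cell 4 the red line (good).
BLUE, RED, START = 0, 4, 2
ACTIONS = (-1, +1)                      # move one cell left or right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # learning rate, discount, exploration rate


def reward(cell):
    """+1 for reaching the red line, -1 for the blue line, 0 elsewhere."""
    if cell == RED:
        return 1.0
    if cell == BLUE:
        return -1.0
    return 0.0


def train(episodes=1000, seed=0):
    """Run Q-learning episodes and return the learned Q-table."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(BLUE, RED + 1) for a in ACTIONS}
    for _ in range(episodes):
        s = START
        while s not in (BLUE, RED):
            # Epsilon-greedy: mostly exploit, sometimes try a random move.
            if rng.random() < EPSILON:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: q[(s, act)])
            nxt = s + a
            # Crossing either line ends the episode, so no future value there.
            if nxt in (BLUE, RED):
                best_next = 0.0
            else:
                best_next = max(q[(nxt, b)] for b in ACTIONS)
            q[(s, a)] += ALPHA * (reward(nxt) + GAMMA * best_next - q[(s, a)])
            s = nxt
    return q


def greedy_policy(q):
    """Best known action for every non-terminal cell."""
    return {s: max(ACTIONS, key=lambda act: q[(s, act)])
            for s in range(BLUE + 1, RED)}


policy = greedy_policy(train())
print(policy)
```

After training, the greedy policy should point toward the red line from every cell in between. The "kick" is just the negative number returned by `reward`, so nothing physical is needed to tell the robot what is bad.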

EDIT:
Even simpler things would help it perform better, like learning how long it takes to turn a specific angle, how fast it moves, or how to get out of a corner. Originally the robot should bump into things, like a little kid, but as it runs longer it learns to avoid them. I am going to try some of this out with one of my robots. The only problem is that it was a Losi RC car, so it is fast and steers with a servo. Hopefully it can learn to turn itself.

ArcMan · « **Reply #5 on:** November 26, 2008, 07:51:31 PM »

How could you kick a poor defenseless robot?

You should be careful. Once it learns enough it will rise up against you and make you pay for all that kicking.

Commanderbob · « **Reply #6 on:** November 26, 2008, 07:57:36 PM »

Quote from: ArcMan on November 26, 2008, 07:51:31 PM

How could you kick a poor defenseless robot?

You should be careful. Once it learns enough it will rise up against you and make you pay for all that kicking.

No, I probably would not kick it. It might hurt my foot.

Actually I just would not want to break it.

That was just the first thing that came to mind when I thought of something bad to do to it. Do you have a better idea? How do you give a robot a sense of good and bad?

szhang · « **Reply #7 on:** November 26, 2008, 08:21:42 PM »

Memorizing IS learning. What you're talking about is generalized learning, and if you can make a robot with a general learning system like you described, it would be one of the biggest discoveries in AI this decade.

You can certainly program a robot to learn which color line is okay to cross, but that is the only thing it will learn. So you'll have to program in a large number of stimuli to give it an appearance of generalized learning.

A lot of robots have some sort of system to learn specific parameters, but they don't generalize.

Also, good and bad are completely arbitrary. You can just have an RF link that sends a particular signal when the robot does something you don't like.

Webbot · « **Reply #8 on:** November 26, 2008, 09:11:50 PM »

I attempted to describe a kind of conditioning/learning that I did ages ago in analogue electronics here: http://www.societyofrobots.com/robotforum/index.php?topic=7.0.msg30869#msg30869

I keep meaning to get round to doing it in an mcu ....

Commanderbob · « **Reply #9 on:** November 27, 2008, 12:20:53 AM »

Quote from: szhang on November 26, 2008, 08:21:42 PM

Memorizing IS learning. What you're talking about is generalized learning, and if you can make a robot with a general learning system like you described, it would be one of the biggest discoveries in AI this decade.

Good so I have something to look forward to.

Memorizing can be learning. That would depend on your definition of memorizing. If saving a text file is your computer memorizing it, I don't consider that learning. Some people might though. When I think of learning I think more of putting two and two together. If I run into a wall it hurts, don't run into walls.

One thing I have come across already is that I need a timer so it can measure how long things last and process things while in a delay. I have used timers 1 and 3 for PWM of servos and input capture of my ultrasonic sensor. Timer 2 is only an 8-bit counter. What should I do to time long periods of up to a minute? Should I make a timer 2 overflow interrupt to count? That could work.

Also, right now my robot only has a scanning ultrasonic sensor. Do you think a scanning Sharp IR would help a lot? I also plan to add an accelerometer for collision detection and overall movement. Any other sensors you think it needs?

szhang · « **Reply #10 on:** November 27, 2008, 12:36:59 AM »

Learning: (n) The act, process, or experience of gaining knowledge or skill.

So yes, memorization is learning. Putting two and two together is reasoning, an even more difficult problem to generalize. If you have generalized reasoning and learning, you will be pretty close to human intelligence.

Besides, if a robot builds a map of its surroundings, it can plan more intelligent routes from point A to point B. It can decide not to explore a corridor because it "learned" it was a dead end 10 minutes ago. How is that not learning?

For the timer, if you can't make the divider any higher, you can increment another variable inside the timer interrupt. With a 32-bit counter and a 1 ms timer interrupt period, you can count almost 50 days before the counter overflows.
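The arithmetic behind that figure can be checked with a few lines of Python. This is only a worked calculation, not microcontroller code; the helper names `seconds_until_overflow` and `elapsed_ticks` are made up for illustration, and the 1 ms tick is the interrupt period from the post.

```python
TICK_MS = 1  # interrupt period in milliseconds, as in the post


def seconds_until_overflow(bits, tick_ms=TICK_MS):
    """Time a software counter of the given width can run before wrapping."""
    return (2 ** bits) * tick_ms / 1000.0


def elapsed_ticks(start, now, bits=32):
    """Ticks between two counter readings, still correct across one wrap-around."""
    return (now - start) % (2 ** bits)


days_32 = seconds_until_overflow(32) / 86400   # about 49.7 days
secs_16 = seconds_until_overflow(16)           # 65.536 seconds
secs_8 = seconds_until_overflow(8)             # 0.256 seconds

print(f"32-bit: {days_32:.1f} days, 16-bit: {secs_16:.3f} s, 8-bit: {secs_8:.3f} s")
```

Taking the difference modulo the counter width, as `elapsed_ticks` does, is a common way to measure intervals without worrying about where the counter happens to wrap.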

Sharp IR sensors are better if your range is small. Light travels faster (so you can scan faster), and the IR beam is much more focused.

Commanderbob · « **Reply #11 on:** November 27, 2008, 01:26:48 AM »

I guess you are right. I would consider that learning. I should have said reasoning robot.

The problem I see with ultrasonic alone would be walls at weird angles. Also it won't be able to tell what angle it is to the wall. Do you think I should even leave the ultrasonic sensor on there?

Right now I am trying to get it to figure out how fast it moves with different inputs to the speed controller, forward and reverse. That is why I needed the timer.

szhang · « **Reply #12 on:** November 27, 2008, 12:36:42 PM »

How big are the areas you are working with?

I had a lot of trouble with sonars (we used 16 of them on a scout nomad): double reflections, or no reflection at all when the wall is at an angle. So unless you need to measure distances greater than the range of the IR, or there is strong sunlight interfering with the IR, IR would probably work better. Ultrasonic sensors are useful sometimes, so leaving it on probably won't hurt.

Like I said, with the timer you can just count with another variable, and see how many interrupt cycles have passed. I use that method as a real-time clock for my robot, and it is accurate to the millisecond (the interrupt period).

arixrobotics · « **Reply #13 on:** November 27, 2008, 01:44:43 PM »

I remember buying one of those robotic pets some time ago. They said the robot would 'learn' by interacting with the user.

So when the robot does something and you like it, you give it a pat (or a treat) and say "good robot" or something. If you don't like what the robot just did, you give it a smack and say "bad robot".

Eventually, the robot will know what the user likes (good) and doesn't like (bad). So it will then do more good things and fewer bad things.

Is this the sort of 'learning' that you are looking at?

I also remember one of MIT's walking robots, where the robot tries to walk by itself. Yes, like a baby. Every time it falls over, it will 'learn' something and keep getting better at walking from trial to trial.

And from what I understand of genetic programming, it is a good machine learning technique. But I don't really understand the whole concept. Has anyone used it before?

Arif

Webbot · « **Reply #14 on:** November 27, 2008, 02:36:58 PM »

Neural networks can be good if you have a bunch of inputs and a known required result. The inputs could be the current position of every servo, a gyro, etc. All the inputs are mixed with all the other inputs with a variable amount of 'weighting' (i.e. a multiplication factor). There are normally one or more feedback loops with their own weighting. The network can have a single output. If the robot knows what the output should be (e.g. 'don't fall over'), then it can adjust all the different weightings, and the inputs (by moving the servos), until it finds a combination that produces the required result.
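To make the "adjust the weightings until the output is right" part concrete, here is a minimal single-neuron sketch using the classic perceptron learning rule. It is a simplification of what is described above: there are no feedback loops, the two inputs (tilt and tilt rate) and the training samples are invented for illustration, and a real balancing robot would need far more than this.

```python
# Hypothetical training data: (tilt, tilt_rate) -> 1 if the robot is about to fall, else 0.
SAMPLES = [
    ((0.0, 0.0), 0), ((0.1, 0.1), 0), ((0.2, -0.1), 0), ((-0.3, 0.2), 0),
    ((0.9, 0.3), 1), ((0.6, 0.6), 1), ((0.8, 0.0), 1), ((0.4, 0.7), 1),
]


def predict(weights, bias, inputs):
    """Weighted sum of the inputs, thresholded to a 0/1 output."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train(samples, rate=0.1, max_epochs=200):
    """Nudge the weights after every wrong answer until a full pass has none."""
    weights, bias = [0.0] * len(samples[0][0]), 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            if error:
                mistakes += 1
                # Move each weight in the direction that reduces the error.
                weights = [w + rate * error * x for w, x in zip(weights, inputs)]
                bias += rate * error
        if mistakes == 0:
            break
    return weights, bias


weights, bias = train(SAMPLES)
print(weights, bias)
print([predict(weights, bias, x) for x, _ in SAMPLES])
```

The sample data here is linearly separable, which is what lets a single neuron learn it. Problems that are not separable need more neurons and a different training rule such as backpropagation.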

Commanderbob · « **Reply #15 on:** November 27, 2008, 03:09:57 PM »

szhang, I already set up the timer with a 16-bit variable. 65 seconds is long enough for me.

arixrobotics, the dog robot is kind of like what I want to do. I would like a more realistic good/bad input though, like running into walls being bad. I don't want it to depend on me to tell it that it is doing a good job.

What I was planning on doing is using an accelerometer to detect collisions and a scanning IR sensor to build an event-based system. The accelerometer will be constantly monitored, and when something bad happens (a spike if it hits a wall) it will look at the past few scans of the IR sensor and try to determine a pattern. For example, if the voltage goes down from scan to scan and then it hits something, it learns that an overall drop in the readings is bad. Or say every time I stand in front of it I give it a nudge with my foot. It could then find that there are two spikes in the reading, my legs, and it should try to avoid me. The first time it might stay still. If it gets kicked again it will try something else, like backing up or turning.
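A minimal Python sketch of that look-back idea follows, under some loud assumptions: the threshold `SPIKE_G`, the window size `HISTORY`, and the class name `CollisionLearner` are all invented, and the "pattern" is reduced to whether the last few IR readings were steadily rising or falling. Detecting something like two leg-shaped spikes would need a richer pattern description than this.

```python
from collections import Counter, deque

SPIKE_G = 2.0   # hypothetical accelerometer threshold for a collision, in g
HISTORY = 4     # how many past IR scans to inspect when a collision happens


def trend(readings):
    """Classify a window of readings as 'falling', 'rising' or 'mixed'."""
    diffs = [b - a for a, b in zip(readings, readings[1:])]
    if all(d < 0 for d in diffs):
        return "falling"
    if all(d > 0 for d in diffs):
        return "rising"
    return "mixed"


class CollisionLearner:
    def __init__(self):
        self.scans = deque(maxlen=HISTORY)   # most recent IR voltages
        self.bad_trends = Counter()          # trends that preceded a collision

    def step(self, ir_voltage, accel_g):
        """Record one scan; return True if this trend has preceded a collision before."""
        self.scans.append(ir_voltage)
        full = len(self.scans) == HISTORY
        current = trend(list(self.scans)) if full else "mixed"
        warning = current != "mixed" and self.bad_trends[current] > 0
        # A spike means we hit something: blame the trend that led up to it.
        if full and abs(accel_g) > SPIKE_G and current != "mixed":
            self.bad_trends[current] += 1
        return warning


learner = CollisionLearner()
# First approach: readings fall steadily, then the accelerometer spikes.
first = [learner.step(v, g) for v, g in [(2.0, 0.1), (1.8, 0.1), (1.5, 0.2), (1.2, 3.5)]]
# Second approach: the same falling trend shows up again, with no collision yet.
second = [learner.step(v, g) for v, g in [(2.1, 0.1), (1.9, 0.1), (1.6, 0.1), (1.3, 0.1)]]
print(first, second)
```

On the second approach the learner only raises a warning once the whole window shows the falling trend again. What the robot does with the warning (stay still, back up, turn) is left out here and could itself be learned by trial.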

Admin · « **Reply #16 on:** December 02, 2008, 02:18:32 AM »

http://www.google.com/search?q="introduction+to+AI"

http://scholar.google.com/scholar?q=AI+robot&hl=en&lr=&btnG=Search

rogerfgay · « **Reply #17 on:** December 18, 2008, 08:06:05 AM »

Quote from: Commanderbob on November 25, 2008, 11:23:01 PM

Has anyone made a learning robot?

RoboBusiness: Robots that Dream of Being Better
http://mensnewsdaily.com/2007/05/16/robobusiness-robots-with-imagination/

iRobis Announces Complete Cognitive Software System for Robots
http://www.defpro.com/news/details/4056/

Brainstorm Responds to Robot Ethics Challenge
http://mensnewsdaily.com/2008/12/10/brainstorm-responds-to-robot-ethics-challenge/

bryan922 · « **Reply #18 on:** December 25, 2008, 01:18:27 PM »

Self-organizing maps (SOMs) are a type of neural network that does unsupervised learning, as opposed to the supervised learning done by feed-forward backpropagation networks.

Presumably there are hard-coded stimuli which can provide negative reinforcement, such as pain, etc. You want a robot to associate negative stimuli with the current event, which is nothing more than a state of mind.

Because a SOM can classify input, it essentially creates a state of mind. Other input, presumably negative stimuli, also creates a state of mind. The key is to get the robot to associate one input with another.

A SOM can learn to classify 'crossing a line' by operating on its vision input. Imagine the robot sees the line, and then proposes to move forward. We combine these two inputs into one vector:

1) Seeing the line
2) Thinking about crossing the line

This one vector is input into the SOM, and the SOM will classify it as a certain 'state of mind'. If we temporally retain this 'state of mind' for a couple of seconds, and then when the robot crosses the line, we create negative stimuli, then by some association method (hopfield network, custom coding, etc.), the robot will recall negative stimuli when the following is in its mind:

1) Seeing the line
2) Thinking about crossing the line

The robot should want to remove negative stimuli from its mind, and to do so it needs to remove the following from its current state of mind:

1) Seeing the line
2) Thinking about crossing the line

The above theory is incomplete, but you should get the gist.
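For readers who want something to poke at, here is a very small Python sketch of the idea: a 1-D SOM whose winning unit stands in for the 'state of mind', plus a plain per-unit table as the association method. This is an illustration built on assumptions, not the architecture described in this post. The class `TinySOM`, the two-element input vector `(sees_line, plans_to_cross)`, and the `pain` table are all invented, and whether different inputs end up on different units depends on how training goes.

```python
import math
import random


class TinySOM:
    """1-D chain of units, each holding a weight vector the size of the input."""

    def __init__(self, units, dim, seed=0):
        rng = random.Random(seed)
        self.weights = [[rng.random() for _ in range(dim)] for _ in range(units)]
        self.pain = [0.0] * units   # hypothetical association with negative stimuli

    def winner(self, x):
        """Index of the best-matching unit: the 'state of mind' for input x."""
        def dist2(w):
            return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
        return min(range(len(self.weights)), key=lambda i: dist2(self.weights[i]))

    def train(self, samples, epochs=200, rate0=0.5, radius0=1.5):
        """Pull the winner and its chain neighbours toward each sample."""
        for epoch in range(epochs):
            frac = 1.0 - epoch / epochs           # decays from 1 toward 0
            rate = rate0 * frac
            radius = max(radius0 * frac, 0.1)
            for x in samples:
                best = self.winner(x)
                for i, w in enumerate(self.weights):
                    pull = rate * math.exp(-((i - best) ** 2) / (2 * radius ** 2))
                    for k in range(len(w)):
                        w[k] += pull * (x[k] - w[k])

    def punish(self, x, amount=1.0):
        """Associate negative stimuli with whatever state x maps to."""
        self.pain[self.winner(x)] += amount

    def expects_pain(self, x):
        """Recall: has the state for this input been punished before?"""
        return self.pain[self.winner(x)] > 0.0


# Input vector: (sees_line, plans_to_cross), each 0 or 1.
som = TinySOM(units=4, dim=2)
som.train([(0, 0), (0, 1), (1, 0), (1, 1)])
before = som.expects_pain((1, 1))
som.punish((1, 1))                 # robot crossed the line and got the bad signal
after = som.expects_pain((1, 1))
print(before, after)
```

A robot using this would check `expects_pain` on the state it is about to enter and pick a different action when it returns True, which is the "remove it from the current state of mind" step in the simplest possible form.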

I am currently working on an architecture where what is currently being input and what was being thought of earlier compete for the current state of mind. The winner is, I believe, whichever has the most novelty. Novelty is determined by which input classifies the least effectively, which in turn is determined by what has not been learned effectively by the SOM.

That which is currently in mind (the winner of the novelty competition) is over time associated with other things in mind. Different inputs associate together, such as seeing a ball and hearing the word "ball".

See the Toco the Robot video: http://techtv.mit.edu/videos/352-toco
