2015-11-20

Some comments about artificial neural networks

I just want to share some of my understanding from the Machine Learning course (Coursera, Stanford University, taught by Andrew Ng). I finished this course earlier in 2015 and have become very interested in the method of neural networks (NN). It was really mind-changing to me that one non-linear, complex system can "explain" another arbitrarily complex system.

When I talk about NN with my academic friends, they mostly worry about overfitting and the inability to find the "global optimum". Their second concern is the required size of the training set, which my friends in industry also worry about. At the moment, I would not worry about the first issue.

By nature, NN is supposed to solve problems that human recognition is good at but linear computer algorithms are not, and indeed NN is like humans. Humans have their prejudices, understand the same thing from many different perspectives, and do the same thing in different (sometimes wrong) ways. NN has adopted all of these, in the forms of over-fitting and the existence of multiple local optima.

Just take this plain example: many of us walk every day. We walk with our feet, legs, backbones and arms moving in different and diverse manners, but we all walk (almost) equally well! We accept these multiple local optima and don't care about a "globally optimal" way of walking.
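To make this concrete, here is a minimal sketch in plain NumPy (a hypothetical toy setup of my own, not something from the course): the same tiny network is trained on XOR from a few different random starts, and the learned weights typically come out different each time while the training accuracy ends up the same.

```python
import numpy as np

# Toy illustration (hypothetical, not from the course): train the same
# 2-4-1 sigmoid network on XOR from different random initializations.
# The final weights usually differ from seed to seed ("different ways
# of walking"), yet the training accuracy comes out the same.

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(seed, hidden=4, lr=0.5, steps=10000):
    rng = np.random.default_rng(seed)
    W1 = rng.normal(size=(2, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(size=(hidden, 1)); b2 = np.zeros(1)
    for _ in range(steps):
        h = sigmoid(X @ W1 + b1)            # forward pass
        out = sigmoid(h @ W2 + b2)
        d_out = out - y                     # gradient of cross-entropy loss
        d_h = (d_out @ W2.T) * h * (1 - h)  # backpropagate to hidden layer
        W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)
    preds = (sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2) > 0.5).astype(float)
    return W1, float((preds == y).mean())

for seed in range(3):
    W1, acc = train(seed)
    print(f"seed={seed}  training accuracy={acc:.2f}")
    print(W1.round(2))
```

The point is not the XOR task itself; it is that gradient descent happily settles into whichever local optimum the random initialization leads it to, and each of them classifies the training set equally well.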

And our early-life experience affects so much of how we behave in adulthood, sometimes causing misalignment when we cope with new situations. This is a typical "over-fitting" phenomenon. However, most of us don't really think of this as a problem to solve. We accept it as a fact of life, if not a beauty of life. And yes, we keep learning from new experiences as we grow up, using new "training samples" to reshape our ways of doing things, which is probably a practical solution.

As another simple example of "over-fitting", just think how many times you have interpreted something in your house as a "face".

However, I do care about the second issue, the sample size. Complex problems require a large set of training samples, but we always worry about how big is big enough. Practically, we want the training samples to cover "all different situations", but we often don't really know whether this requirement is fully met.
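To see how this can go wrong, here is a hypothetical sketch (NumPy, with a degree-9 polynomial standing in for a flexible model, and noisy samples of a sine curve standing in for the data): with only a handful of training samples the fit is nearly perfect on the training set but useless on new data, while a larger sample narrows the gap.

```python
import numpy as np

# Hypothetical sketch of the sample-size worry: fit a flexible model
# (degree-9 polynomial) to 10 vs. 200 noisy samples of sin(2*pi*x).
# Few samples -> tiny training error but large test error (over-fitting);
# more samples -> the two errors come closer together.

rng = np.random.default_rng(0)

def make_data(n):
    x = rng.uniform(0.0, 1.0, n)
    y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=n)
    return x, y

x_test, y_test = make_data(1000)

for n_train in (10, 200):
    x_tr, y_tr = make_data(n_train)
    coefs = np.polyfit(x_tr, y_tr, deg=9)   # flexible model
    train_mse = np.mean((np.polyval(coefs, x_tr) - y_tr) ** 2)
    test_mse = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    print(f"n_train={n_train:4d}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```

A real NN is of course not a polynomial, but the pattern is the same: a flexible model plus a small training set tends to "memorize" the samples rather than the underlying rule.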

And there is a dilemma here: when the training samples are indeed "enough", we may not need to train any model at all. If the training samples already cover all possible situations, all we have to do is use them as a lookup table for new data. Too little training data is useless, while too much training data makes the model pointless. In practice we have to face this question.

Thinking about the first issue in another way, NN has opened a door to re-recognizing the world. There may be more than one "correct way" to explain the things in this world. Our previous ways of optimizing procedures, maximizing profit, doing things, and even explaining physical, chemical and biological rules may only be some "local optima". NN may provide totally different ways of doing these things, which may be equally good or even better, though sometimes strange (like "Google Deep Dream").

