What is Intelligence: Limitations of current neural networks

The biggest limitation of current neural networks is that they need enormous amounts of input data to generalize well. To learn to recognize images of dogs or cats, let alone to distinguish between breeds, a network typically needs on the order of millions of labelled examples. A human being, by contrast, requires very little input data to identify what is a dog and what is a cat. This is because the human mind uses its imagination to generate all kinds of variations of the input internally, so it does not need nearly as much external data.


This is why it is important for machines to develop an equivalent of human imagination. It is imagination that fills in data where data is missing. It is imagination that takes a picture of a dog and concocts images that look very close to it, drawing on all the images in its database to come up with strange new configurations of dogs. When the mind encounters something in the real world that resembles one of these concocted images, it may ask an external authority whether what it is observing is a dog. If it is, it trains itself on this new piece of information.
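A crude mechanical analogue of this internal variation-generation already exists in machine learning as data augmentation: from one example, many plausible variants are concocted before training. A minimal sketch using NumPy (the array values and the particular transforms are illustrative assumptions, not a real dataset or pipeline):

```python
import numpy as np

def augment(image: np.ndarray, rng: np.random.Generator) -> list[np.ndarray]:
    """Concoct several variants of one image, the way imagination
    spins variations from a single observation."""
    return [
        np.fliplr(image),                # mirrored view
        np.rot90(image),                 # rotated view
        np.clip(image + rng.normal(0, 0.05, image.shape), 0.0, 1.0),  # noisy view
        np.clip(image * 1.2, 0.0, 1.0),  # brighter view
    ]

rng = np.random.default_rng(0)
dog = rng.random((8, 8))        # stand-in for one "dog" image, values in [0, 1]
extra = augment(dog, rng)
print(len(extra))               # 4 new training examples from 1 original
```

One observed example becomes several training examples "for free", which is one small reason humans and augmented pipelines need fewer raw observations than a naively trained network.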

Generative Adversarial Networks (GANs), which are used to generate new images that look very real, are a step forward in engineering algorithms that perform something similar to human imagination.
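The adversarial idea can be sketched in miniature. Below is a toy one-dimensional "GAN": a linear generator learns to fake samples from a Gaussian, against a logistic discriminator, with the gradients written out by hand. Every distribution, hyperparameter, and name here is an illustrative assumption, not a production recipe:

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

# "Real" data: samples from N(4, 0.5). The generator must learn to fake these.
real = lambda n: rng.normal(4.0, 0.5, n)

# Generator G(z) = gw*z + gb, discriminator D(x) = sigmoid(dw*x + db).
gw, gb = 1.0, 0.0
dw, db = 0.1, 0.0
lr, n = 0.01, 64

for step in range(3000):
    z = rng.normal(0.0, 1.0, n)
    x_real, x_fake = real(n), gw * z + gb

    # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
    p_r, p_f = sigmoid(dw * x_real + db), sigmoid(dw * x_fake + db)
    dw += lr * np.mean((1 - p_r) * x_real - p_f * x_fake)
    db += lr * np.mean((1 - p_r) - p_f)

    # Generator step (non-saturating): push D(fake) toward 1.
    p_f = sigmoid(dw * x_fake + db)
    gw += lr * np.mean((1 - p_f) * dw * z)
    gb += lr * np.mean((1 - p_f) * dw)

fake_mean = float(np.mean(gw * rng.normal(0.0, 1.0, 10_000) + gb))
print(round(fake_mean, 2))  # the fake mean should drift toward the real mean of 4.0
```

The generator never sees the real data directly; it only gets the discriminator's "is this plausible?" signal, which is loosely the role the external authority plays for the imagining mind above.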

Another reason why a human requires very little data to learn something could be that the learning processes of the human brain/mind are powered by algorithms more efficient than backpropagation. AI engineers will keep coming up with better algorithms until we are able to approximate whatever it is that is always running in a human being.

Some AI engineers are of the opinion that all we need is more hardware: if we keep running backpropagation on increasingly powerful computers and feed in lots of data, then a weak algorithm is not necessarily a problem. There is a general belief that if you have lots of data, and of course more hardware, then the algorithm you are using need not be clever.

Hardware is supposed to be the hard part of the equation, not software: hardware is expensive and consumes a lot of energy, while software simply requires ingenuity, and if that ingenuity is missing then we have a problem.

It's just like the sudoku example in an earlier chapter: we could, by brute force, try out every possible configuration one by one and hope to soon hit one that solves the puzzle, hoping too that having more pieces already on the board will make our brute-force "computation" of the solution easier. But what intelligence does is, in essence, not brute force, even though with just memory you could improve raw brute-force computation by caching past "experiences". What intelligence does is invent something like a constraint-satisfaction (CSP) solver, which mechanizes the process of finding a solution to the puzzle in the fastest possible way. If the problem were a chess problem, intelligence would invent something like alpha-beta search or some other clever algorithm to solve it.
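The contrast can be made concrete on sudoku. The sketch below, assuming a standard 9×9 grid with 0 marking blanks, is a small step from brute force toward a CSP solver: rather than blindly enumerating all configurations, it refuses to place any value that already violates a row, column, or box constraint, pruning dead branches immediately:

```python
def valid(grid, r, c, v):
    """Check the three sudoku constraints for placing value v at (r, c)."""
    if v in grid[r]:                                   # row constraint
        return False
    if any(grid[i][c] == v for i in range(9)):         # column constraint
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)                # 3x3 box constraint
    return all(grid[i][j] != v
               for i in range(br, br + 3) for j in range(bc, bc + 3))

def solve(grid):
    """Backtracking search that only tries constraint-consistent values."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                for v in range(1, 10):
                    if valid(grid, r, c, v):
                        grid[r][c] = v
                        if solve(grid):
                            return True
                        grid[r][c] = 0                 # undo and backtrack
                return False                           # no value fits here
    return True                                        # no blanks left: solved

puzzle = [
    [5, 3, 0, 0, 7, 0, 0, 0, 0],
    [6, 0, 0, 1, 9, 5, 0, 0, 0],
    [0, 9, 8, 0, 0, 0, 0, 6, 0],
    [8, 0, 0, 0, 6, 0, 0, 0, 3],
    [4, 0, 0, 8, 0, 3, 0, 0, 1],
    [7, 0, 0, 0, 2, 0, 0, 0, 6],
    [0, 6, 0, 0, 0, 0, 2, 8, 0],
    [0, 0, 0, 4, 1, 9, 0, 0, 5],
    [0, 0, 0, 0, 8, 0, 0, 7, 9],
]
solved = solve(puzzle)
print(solved)
```

A real CSP solver goes further still, with constraint propagation and variable-ordering heuristics, but even this single `valid` check already collapses the search space that raw enumeration would wade through.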

I have been using board games to describe the action of intelligence, but what of the real world? It is intelligence that invented chess, via an algorithm that creates a "game" which is, in essence, a systematization of the wild world out there. A game is like a world with its own rules. Intelligence is what creates these worlds, and even though it is capable of creating a world like a game, it might not by default be able to navigate its way to a "solution" of that world's fundamental problems using the default pattern-recognize → generate-action coping algorithm that is the backbone of all its processes. It would have to apply intelligence again, to the very game it had created, to arrive at a solution in the most efficient way possible.
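To make "applying intelligence again to the game it created" concrete, here is a minimal alpha-beta search over a hand-built game tree; the tree shape and leaf scores are made up purely for illustration:

```python
# A tiny game tree: internal nodes are lists of children, leaves are scores.
tree = [[3, 5], [6, [9, 1]], [1, 2]]

def alphabeta(node, alpha=float("-inf"), beta=float("inf"), maximizing=True):
    """Minimax with alpha-beta pruning: skip branches that
    cannot change the final decision."""
    if isinstance(node, (int, float)):   # leaf: return its score
        return node
    best = float("-inf") if maximizing else float("inf")
    for child in node:
        score = alphabeta(child, alpha, beta, not maximizing)
        if maximizing:
            best = max(best, score)
            alpha = max(alpha, best)
        else:
            best = min(best, score)
            beta = min(beta, best)
        if beta <= alpha:                # prune: this line will never be allowed
            break
    return best

print(alphabeta(tree))                   # → 6
```

The rules of the game already existed; alpha-beta is a second, deliberate invention layered on top of them to navigate the game efficiently, which is exactly the two-step pattern described above.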

Our minds/brains are not the product of our current lives alone but, in essence, the summation of the evolution of all our ancestor species from their very first appearance to the current moment. Our brains are therefore muddled-up concoctions of simple algorithmic processes that have been feeding back on themselves since the first species carrying our DNA appeared. Does this mean that it is fruitless to attempt to unravel the seeds of our intelligence? No, I don't think so! The fact that some kind of intelligence is going on inside our heads, or elsewhere, means that we can, by ingenuity, uncover the very algorithms and data structures that make us able to invent algorithms.

We shouldn't get stuck on the current paradigm of neural networks and hope that the way to intelligence is just scaling. Many people are of the opinion that scaling a dumb-looking algorithm is the "only" way forward. While I do not deny that scaling goes a long way, letting algorithms that appear weak at small scale demonstrate greater power at large scale, I think this is an interesting artefact of the process of engineering intelligence-demonstrating algorithms and should not be taken to extremes.

Our current neural networks work when run at large scale, but I am cautioning that this might just be an example of the power of simple algorithms, not a justification for engineering dumb algorithms in the hope that we can scale them until they work.

This is like the difference between two readings of what "critical mass" really means. Some people thought that simply gathering a large gross mass of natural uranium was the key to fission, but the reality was about quality rather than sheer quantity: enriching uranium, i.e. concentrating the fissile U-235 isotope, lets a comparatively small mass reach criticality, at which point a stray neutron is enough to trigger a self-sustaining chain reaction.

With the urge for more data, more hardware and the endless cry for scaling, we are taking the large-volume-of-uranium pathway to an AI explosion. We should instead be investigating clever algorithms that do not need so much scaling to be useful.

Let's not burn the bridges of efficiency too early.
