
AI as the most generalized tool

The final goal of AI should be to create a generalized tool that can create specific tools for achieving specific tasks.

The current approach to creating AI is to take a problem like speech recognition, analyse it, and then employ methods like deep learning or other conventional algorithmic approaches to create a tool that recognizes speech. The same goes for building a tool that recognizes or generates images, and so on.
This way of thinking borrows directly from the empirical paradigm, which is to obtain data, develop a hypothesis about the structure of the data, and design systems that manipulate the data in reasonable ways to produce desired results.

The empirical way of reasoning about solutions to problems is quite different from the point of view of a pure theorist. The pure theorist looks for the correct axiom systems that will predict the existence of certain things, and then performs experiments to confirm the theoretical results. This is a simplified account; the intention is not to explain things in total detail but to whet the mind as I point out two major ways we go about problem-solving.

One way of problem-solving, the natural empiricist's way, is to pick a specific problem and solve it completely. But the solution is of no use in solving a related problem, because it is a highly specialized solution.

On the other hand, those of a theoretical bent will observe a class of problems and find a single solution that cuts across similar classes of problems.

Neither style of problem-solving is better than the other. Each has its role, and we usually switch between different modes of problem-solving depending on the nature of the task we are faced with.

Most of the time, when we are faced with a new problem, our natural approach is to try to solve that problem in a highly specialized way. If the problem is unique enough, we will usually have no previous experience with it. We can research the work of others to see if anyone has encountered this problem before, and most of the time someone has. But if our new instance of the problem is unique enough, it is usually very difficult to adapt the old solution to it, and so we have to engineer a brand new solution to the problem we are faced with.

If we succeed in solving such a problem, we will usually come up with a highly specialized solution. A modern approach to problems that are algorithmic in nature, but have no clearly set pattern that would let us engineer a solution, is to obtain data about the problem and then use the hammer of deep learning to beat that data into shape until patterns emerge that lead us towards a solution the typical algorithmic methods had failed to find.

This is the edge the new paradigm of deep learning and machine learning gives us over older methods: it enables us to say something about the solution to a problem by merely obtaining as much data as is available and proceeding from there. This is a beautiful problem-solving methodology, but when taken too far it can result in a shortage of real problem-solving skill, as we lazily come to believe that just getting as much data as possible is the solution to all kinds of problems.
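As a toy illustration of this "collect data, fit a model" workflow (the dataset and the hidden rule here are entirely made up for the sketch), a few lines of plain Python can recover a pattern from data by gradient descent rather than by hand-engineering a rule:

```python
# Toy "collect data, fit a model" workflow.
# The dataset is synthetic: samples of a hidden rule, y = 3x + 1.
data = [(x, 3 * x + 1) for x in range(-10, 11)]

# Model: y_hat = w * x + b, fitted by gradient descent on mean squared error.
w, b = 0.0, 0.0
lr = 0.01
for _ in range(2000):
    grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
    grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # prints 3.0 1.0 — the hidden rule, recovered
```

The point is that nothing about the rule was programmed in; it was "beaten out" of the data. It also shows the limitation discussed next: this fitted model says nothing about any other problem.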

The main limitation of this approach is that we can only solve one problem at a time: we can only collect data relevant to the problem we want to solve, and when we want to solve another problem we must go out and collect another class of data.

In some situations, data generated for one scenario can be used for another, but this is not always reliable, and we cannot expect it to work in many scenarios because the problems we are trying to solve might be too specific and require very specialized data.

If we want to solve speech recognition, we obtain speech data; if we want to solve image recognition, we obtain pictures. Many may argue that to solve all kinds of different problems we just need to collect each specific kind of data, and that eventually we will have collected data for all kinds of things and will have a general system. But this kind of thinking is no different from designing a different kind of glider for each kind of wind condition: it is inefficient and will not take us to our main goal of creating Strong AI (AGI).

Rather, like the Wright brothers, we should be trying to create one kind of powered flight system that adapts its structure to new flight conditions (new problems), not different gliders for different wind conditions, not new ML models for new kinds of data.

Our goal with AI is to create a generalized tool that can create any kind of model we seek rather than us creating a model for every use case.

This requires that we bring more theoretical reasoning into the problem of solving AGI. Rather than expending so much energy and resources tuning large, energy-inefficient models for narrow domains, we should think more about the theoretical underpinnings of such systems and find axiom systems that can generate better models: faster, smaller, more flexible, and more robust.

I suggest that there could be new systems that generate models 1000x better than the current models engineered by humans, if we search deeper for the needed structures. The funny thing is that most of these models might not be comprehensible to humans. I don't mean this in the sense that we don't understand how deep learning models do their computation; what I am trying to say is that if we use some deep axiomatic structures that automatically and systematically generate models for specific tasks, like image recognition or even driving a car, then we will have no real way of making these models parsimonious to human understanding.

This is somewhat related to neural architecture search, but in this case we are expanding the domain of search far beyond "neural" architectures and searching the entire computational universe of possible programs.
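A minimal sketch of what such a search over programs could look like. Everything here is illustrative: the "computational universe" is reduced to a made-up vocabulary of three arithmetic primitives, and the desired behaviour is specified only by input-output examples, not by code:

```python
from itertools import product

# A tiny, made-up "computational universe": programs are sequences of
# arithmetic primitives applied to an integer input.
PRIMITIVES = {
    "inc": lambda n: n + 1,   # add one
    "dbl": lambda n: n * 2,   # double
    "sqr": lambda n: n * n,   # square
}

def run(program, n):
    """Execute a program (a sequence of primitive names) on input n."""
    for op in program:
        n = PRIMITIVES[op](n)
    return n

def search(examples, max_len=4):
    """Return the shortest program consistent with all (input, output) examples."""
    for length in range(1, max_len + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, x) == y for x, y in examples):
                return program
    return None

# Target behaviour f(x) = (x + 1) * 2, given only as examples.
found = search([(1, 4), (2, 6), (5, 12)])
print(found)  # prints ('inc', 'dbl')
```

A real system would of course need a far richer program space and a far smarter search than brute-force enumeration, but the shape of the idea is the same: the machine generates the model; we only supply the desired behaviour.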

If we are finding it hard to understand current deep learning models, then we might as well give up hope of understanding the kind of AI system that could be created by machines operating on some axiom system. To see more about why it will be extremely difficult, or even impossible, to explain what is going on inside certain systems, read this blog post by Stephen Wolfram.

Strong AI will be the most generalized tool possible. Like the human mind, it will be able to generate a model for solving any problem be it the design of an aeroplane, a lecture on AI or a simple model for recognizing images.

