
AI as the most generalized tool

The final goal of AI should be to create a generalized tool that can itself create specific tools for specific tasks.

The current approach to creating AI is to take a problem like speech recognition, analyse it, and then employ methods like deep learning or other conventional algorithmic approaches to create a tool that recognizes speech. The same goes for building a tool that recognizes or generates images, and so on.
This way of thinking borrows directly from the empirical paradigm: obtain data, develop a hypothesis about the structure of the data, and design systems that manipulate the data in reasonable ways to produce desired results.

The empirical way of reasoning about solutions to problems is quite different from the point of view of a pure theorist. The pure theorist looks for the correct axiom systems that predict the existence of certain things, and then performs experiments to confirm the theoretical results. This is a simplified account; the intention is not to explain things in total detail but to whet the mind as I point out two major ways we go about problem-solving.

One way of problem-solving, the way of the natural empiricist, is to pick a specific problem and solve it completely; but because the solution is highly specialized, it is of no use in solving even a related problem.

On the other hand, those of a theoretical bent will observe a class of problems and find a single solution that cuts across similar classes of problems.

No one style of problem-solving is better than the other. Each has its role, and we usually switch between modes of problem-solving depending on the nature of the task we are faced with.

Most of the time, when we are faced with a new problem, our natural approach is to try to solve it in a highly specialized way. If the problem is unique enough, we usually have no previous experience with it. We could research the work of others to see whether anyone has encountered this problem before, and most of the time someone has. But if our instance of the problem differs enough from the one they encountered, it is usually very difficult to adapt the old solution, and we have to engineer a brand-new solution to the problem in front of us.

If we succeed in solving such a problem, we usually end up with a highly specialized solution. A modern approach to problems that are algorithmic in nature but exhibit no clear pattern we can engineer against is to obtain data about the problem, then use the hammer of deep learning to beat that data into shape until patterns emerge that lead us towards a solution to a problem that had defied typical algorithmic methods.
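As a minimal sketch of this pattern (in Python with PyTorch; the data, dimensions, and task here are placeholders I have invented, not a real problem), the workflow looks something like this: gather data specific to one task, then iterate a deep model over it until the patterns surface.

```python
import torch
import torch.nn as nn

# Hypothetical task-specific data: examples collected solely for
# this one problem (stand-ins for, say, speech or image features).
X = torch.randn(1000, 64)          # 1000 examples, 64 features each
y = torch.randint(0, 10, (1000,))  # labels from 10 classes

# A specialized model, built for this problem and no other.
model = nn.Sequential(
    nn.Linear(64, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# "Beat the data into shape": iterate until the model extracts
# whatever pattern the data contains.
for epoch in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()

# A new problem would require new data and, usually, a new model.
```

Note that nothing in this loop carries over to the next problem except the recipe itself; the trained weights are as specialized as the data they were fit to.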

This is the edge the new paradigm of deep learning and machine learning gives us over older methods: it enables us to say something about the solution to a problem merely by obtaining as much data as is available and proceeding from there. It is a beautiful problem-solving methodology, but taken too far it can erode real problem-solving skill, as we lazily come to believe that gathering as much data as possible is the solution to every kind of problem.

The main limitation of this approach is that we can only solve one problem at a time: we collect data relevant to the problem we want to solve, and when we want to solve another problem we must go out and collect another class of data.

In some situations, data generated for one scenario can be reused for another, but this is not reliable, and we cannot expect it to work in general, as the problems we are trying to solve may be too specific and require very specialized data.

If we want to solve speech recognition we obtain speech data; if we want to solve image recognition we obtain pictures. Many may argue that to solve all kinds of problems we just need to collect the specific kind of data each one needs, and that eventually, having collected all kinds of data for all kinds of things, we will have a general system. But this kind of thinking is no different from designing a different glider for each kind of wind condition: it is inefficient, and it will not take us to our main goal of creating Strong AI (AGI).

Rather, like the Wright brothers, we should be trying to create one kind of powered flight system that adapts its structure to new flight conditions (new problems), instead of different gliders for different wind conditions, that is, new ML models for new kinds of data.

Our goal with AI is to create a generalized tool that can create any kind of model we seek, rather than us creating a model for every use case.

This requires that we bring more theoretical reasoning to the problem of AGI. Rather than expending so much energy and so many resources tuning large, energy-inefficient models for narrow domains, we should think harder about the theoretical underpinnings of such systems and find axiom systems able to generate better models: faster, smaller, more flexible and more robust.

I suggest that, if we search deeper for the needed structures, there could be new systems that generate models 1000x better than the current models engineered by humans. The funny thing is that most of these models might not be comprehensible to humans. I don't mean this in the sense that we don't understand how deep learning models do their computation; what I am saying is that if we use deep axiomatic structures that automatically and systematically generate better models for specific tasks, like image recognition or even driving a car, then we will have no real way of reducing those models to something humans can understand.

This is somewhat related to neural architecture search, but in this case we are expanding the domain of search far beyond "neural" architectures and searching the entire computational universe of possible programs.
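To make that distinction concrete, here is a toy sketch in plain Python (the primitive set and examples are entirely hypothetical) of searching a space of programs rather than a space of neural architectures: enumerate compositions of primitive operations and keep any program consistent with a set of input-output examples.

```python
from itertools import product

# Primitive operations: the "atoms" of a tiny computational universe.
PRIMITIVES = {
    "inc":    lambda x: x + 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
    "neg":    lambda x: -x,
}

def run(program, x):
    """Apply a sequence of primitive operations to an input value."""
    for op in program:
        x = PRIMITIVES[op](x)
    return x

def search(examples, max_length=3):
    """Enumerate every program up to max_length primitives and return
    those consistent with all (input, output) examples."""
    found = []
    for length in range(1, max_length + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, x) == y for x, y in examples):
                found.append(program)
    return found

# Target behaviour f(x) = (x + 1) * 2, given only as examples.
examples = [(0, 2), (1, 4), (3, 8)]
print(search(examples))  # e.g. [('inc', 'double')]
```

A real system would need a far richer program space and a far smarter search than brute-force enumeration, but the point stands: the candidates here are arbitrary programs, not just neural networks.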

If we find it hard to understand current deep learning models, then we might as well give up hope of understanding the kind of AI system that could be created by machines operating on some axiom system. To see why it will be extremely difficult, or even impossible, to explain what is going on inside certain systems, you should read this blog post by Stephen Wolfram.

Strong AI will be the most generalized tool possible. Like the human mind, it will be able to generate a model for solving any problem, be it the design of an aeroplane, a lecture on AI, or a simple model for recognizing images.
