
Building Human Centered AI with the Wolfram Neural Net Repository

The first phase of modern AI is grinding to a halt. The PhDs have trained neural networks for doing all kinds of tasks for us. Everything from image recognition and audio synthesis to language modelling and translation already has some prebuilt deep neural net in existence, engineered from deep learning papers. Now it is time for the second phase of modern AI development, which involves using all this pre-built material as building blocks for Human Centered AI applications, that is, applications that use AI as a submodule to perform some task.
The first phase of modern AI development (by modern I mean the era of deep learning) involved building deep neural network architectures for doing all kinds of things. Researchers used very powerful machines to train deep neural networks, sometimes taking weeks or even months to get good results.

Competitions like ImageNet provided fertile ground for the rapid development of deep neural networks. Researchers would come up with different network architectures, each improving performance by some percentage and taking AI to new levels.

It was prohibitively hard for a researcher with a single laptop to get into this field. Training a network of any reasonable size on adequate amounts of data required amassing huge computational power from GPUs. And although the dedicated researcher could achieve much with a single GPU running in a dorm room in the early days of the deep learning era, as time went on it took the muscle of large companies like Google, Microsoft, etc., training neural networks on eight or more powerful GPUs at a time, to produce meaningful progress.

But thankfully, the open nature of AI research made it possible for outsiders with fewer resources to access these neural network architectures, bundled with appropriate weights. With little compute power they could use the fruits of this expensive research in their own applications, and with some slight modifications perform transfer learning by merging a pretrained net with a tiny tail of their own to train on new data.

The term Human Centered AI has not been standardized, and you can see a lot of people offering their own definitions. My own definition is quite clear and avoids much of the complication that plagues others: I define Human Centered AI as utilizing AI modules in the construction of larger-scale applications that serve some human need.

Unlike the first phase of modern AI, which I call machine-centered AI because our goal was getting these machines to work in the first place, we now have machines that work, and the question is: what do we do with them?

Suppose you have some image classification system that tells you that this is a cat and that is a dog, or some segmentation system that separates the different kinds of objects in a scene. The question is what we do with this in the real world. Without much of a stretch, we can think of building an app that identifies novel objects in the real world that we do not know about, like the Wolfram Image Identification project; or, in the case of segmentation, we can think about self-driving cars and how they are able to keep on the road and avoid pedestrians.

Although large companies like Google, Wolfram, Amazon, etc. are already using AI internally in many of their applications, the reason I am writing about Human Centered AI is to make that which is already done implicitly by the big boys explicit.

Many organizations are still concerned with training data scientists, and this is a noble cause because there is an enormous amount of data out there that needs to be analysed; this will involve training some machine learning model from scratch in some cases, or just performing transfer learning in others. Data science belongs to the first phase of modern AI development because the machine is still the center of the operation. In the second phase we are going to have AI practitioners who take an entire pretrained net as a given, using it via an interface to build higher-level applications that directly interact with humans.

The new phase of AI will be more about using lower level stuff like pretrained neural networks to build higher-level human interfacing stuff. It will be increasingly less about training new network architectures.

While the first phase of modern AI can be described as AI science, the new phase is more like AI technology. It will involve creativity and ingenuity no less than that of great historical figures like Douglas Engelbart, the early human-computer interaction pioneer, to find ways of combining low-level AI modules into things that augment the human being.

While I cannot describe every possible way low-level AI modules will be used to create new and novel things, I will try to take you on a cruise through the achievements of modern AI. Nowhere else can you find the most recent powerful AI modules, ready to deploy in your own applications with the convenience of a single function call, like you will in the Wolfram Neural Net Repository. You can head straight there right now, or continue reading to discover the coolest AI modules that get me all amped up!

If you're building an app and you discover that you need to do some sort of image classification, the Wolfram Neural Net Repository has you covered. There are AI modules for everything from vanilla image recognition, where you can choose among several nets like Inception, VGG or ResNet, to nets that count prominent items in an image, handwriting recognition, gender prediction, age estimation, etc.
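As a sketch of how little code this takes: a single NetModel call fetches a pretrained classifier (the network is downloaded and cached on first use). The decoder property syntax below follows the standard classification-net convention; check the model's repository page for its exact ports.

```wolfram
(* Load a pretrained image classifier from the repository *)
net = NetModel["ResNet-50 Trained on ImageNet Competition Data"];

(* Classify an image; img is any Image object, e.g. imported from a file *)
net[img]

(* Ask the net's class decoder for the five most likely labels *)
net[img, {"TopProbabilities", 5}]
```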

As you may know, many things that we used to do with hand coding are now done efficiently using deep learning techniques. Rather than reinvent the wheel, browse the repository thoroughly till you find something you need. Even if your problem is super unique, you can always start from a pretrained net and perform transfer learning. I think the Wolfram Language is the easiest and most efficient language to do transfer learning in; it feels quirky and unnatural in most other languages.
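A minimal transfer-learning sketch: cut the pretrained classifier head off, attach a tiny tail for your own classes, and train only the new layers. Where exactly to cut depends on the particular net (inspect it first), and the two-class cat/dog setup and `trainingData` here are hypothetical placeholders.

```wolfram
(* Start from a pretrained net and drop its ImageNet-specific classifier head *)
base = NetModel["ResNet-50 Trained on ImageNet Competition Data"];
features = NetDrop[base, -2];  (* how many layers to drop varies per net *)

(* Attach a tiny new tail for our own two-class problem *)
newNet = NetChain[{features, LinearLayer[2], SoftmaxLayer[]},
  "Output" -> NetDecoder[{"Class", {"cat", "dog"}}]];

(* Freeze the pretrained part so only the new tail is trained *)
trained = NetTrain[newNet, trainingData,
  LearningRateMultipliers -> {"1" -> 0}]
```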

If you're building an app that does a lot of natural language processing, where you have to represent words as vectors either to extract contextual information or to generate text, there are lots of feature extractors already built. Most of these feature extractors were trained on millions to billions of tokens, and it would be hideously difficult to build your own considering the resources required, especially if you are an independent researcher. It is better to focus on the high-level human-centric picture when building human-facing applications rather than spend so much time doing low-level things like representing words as vectors.
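For instance, pretrained GloVe vectors are one such feature extractor in the repository. A sketch (note: the net's encoder tokenizes its input, so the output is one vector per token, hence the `First`):

```wolfram
(* Load pretrained 300-dimensional GloVe word vectors *)
glove = NetModel[
  "GloVe 300-Dimensional Word Vectors Trained on Wikipedia and Gigaword 5 Data"];

(* One 300-component vector per token; take the first for a single word *)
vKing = First[glove["king"]];

(* Related words land near each other; compare with cosine similarity *)
similarity[w1_, w2_] := 1 - CosineDistance[First[glove[w1]], First[glove[w2]]];
similarity["king", "queen"]
```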
Whether you're doing something artistic like transferring style from one image to another, increasing the resolution of an image, or generating a satellite photo from a street map, there is a pre-built net that already does it, and you can simply NetModel it into your application and focus on exactly what you want to build rather than reinventing the wheel.
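Style transfer, for example, is a one-liner once the net is loaded. AdaIN-Style is one of the repository's style-transfer entries; the exact port names ("Input", "Style") should be confirmed on its repository page.

```wolfram
(* A pretrained style-transfer net *)
styleNet = NetModel["AdaIN-Style Trained on MS-COCO and Painter by Numbers Data"];

(* Feed a content image and a style image; get back the stylized result *)
stylized = styleNet[<|"Input" -> contentImage, "Style" -> styleImage|>]
```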

If you are building a chatbot that has to do some kind of text generation, you need all kinds of tools in your toolbox, including a neural net based language modelling module like the GPT transformer.
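The repository's GPT entry exposes the transformer as a contextual feature extractor; a generation loop is then built on top of it. A sketch (the output shape noted in the comment is an assumption to verify against the model page):

```wolfram
(* Load the GPT transformer; it maps a string to one contextual
   feature vector per subword token *)
gpt = NetModel["GPT Transformer Trained on BookCorpus Data"];

features = gpt["Where are we going"];
Dimensions[features]  (* {numberOfTokens, featureDimension} *)
```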

Assistive apps that have to understand what the user says could be built using something like Deep Speech 2, which does speech-to-text. Or, for an application that runs on some device in the rainforest and has to listen for and identify the sounds of animals in the vicinity, you can use Wolfram AudioIdentify.
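Both are essentially single calls. AudioIdentify is built into the Wolfram Language; the Deep Speech 2 model name below is the repository entry, and the audio file is a hypothetical placeholder.

```wolfram
(* Identify the sound in a recording, e.g. a bird or insect call *)
sound = Audio[File["jungle-recording.wav"]];  (* hypothetical file *)
AudioIdentify[sound]

(* Speech-to-text with the pretrained Deep Speech 2 net *)
asr = NetModel["Deep Speech 2 Trained on Baidu English Data"];
asr[sound]  (* returns the transcribed text *)
```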

If you're working on a self-driving car project, the Semantic Segmentation section contains pretrained nets that you could incorporate into your project for segmenting an image of a driving scene into semantic component classes.
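As a sketch, one of the repository's segmentation entries was trained on the Cityscapes driving dataset. Depending on the net's decoder, the output is per-pixel class indices or class probabilities; check the model page for the exact post-processing.

```wolfram
(* A semantic segmentation net trained on driving scenes *)
segNet = NetModel["Ademxapp Model A1 Trained on Cityscapes Data"];

(* Assigns a class to every pixel of the input scene *)
mask = segNet[drivingScene];

(* Visualize the per-pixel classes, e.g. road vs. pedestrian vs. vehicle *)
Colorize[mask]
```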

A robot that needs to see and identify objects in its environment would benefit from using the models in the Object Detection section.
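Loading a detector is again one call; SSD-VGG-300 is one of the available entries. Note that detection nets output raw box-and-score tensors, and each repository page shows the small amount of post-processing needed to turn them into labeled bounding boxes.

```wolfram
(* An object detection net from the repository *)
detector = NetModel["SSD-VGG-300 Trained on PASCAL VOC Data"];

(* Raw output: candidate bounding boxes and per-class scores,
   to be decoded as shown on the model's repository page *)
raw = detector[robotCameraImage];
```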

Human-centred AI is about augmenting the human experience of the world with artificial intelligence. In a sense, this is all we have ever been doing with technology; but in the past, if we wanted to solve a problem (and of course it is going to be a human problem), we had to build a lot of things from scratch, right from the bottom of the technology stack to the very top.

In the early days of deep learning, we struggled to build machines that attain human-level performance. This was the obsession of the era. Although strong AI is still some time away, we already have some kind of high-performing weak AI in our hands. It is now our job to use this AI to augment human performance. The pretrained neural network models on the Wolfram Neural Net Repository are the basic building blocks from which creative engineers and entrepreneurs can solve all sorts of real-world, human-facing problems.

Just like we have libraries for doing all kinds of things in traditional computer programming, pretrained neural net models are AI libraries that you can incorporate into your projects. 

