What is Intelligence: Learning in Humans

LEARNING IN HUMANS

When humans encounter untagged information in their environment, the first thing the observing mind does is form an internal representation of whatever is being observed. This information could be about static things like trees, cats, or cloud formations, or about dynamic processes like the changing of the seasons, the flight of birds, the swimming of fish, or other humans and entities performing some action.


Humans obtain information through the senses, and each class of information, be it audio, visual, tactile, etc., is transformed into a unified representation in the brain. Labelled information arises when we are taught by some teacher, when we observe labelled information presented on some medium (like TV or books), or when we use our creativity to formulate labels/tags for identifying things ourselves. The representation we generate is then associated, i.e. placed in a mapping, with the representation of the label.

We must realize that even the label/tag has to be transformed into an internal representation, so learning involves mapping the internal representation of one thing to the internal representation of another.

This mapping between data and label is bidirectional. Suppose a human who has gone through some learning process is queried for their knowledge: as a simple example, the image of a cat is presented and an appropriate label is requested. The human will produce the associated label, whether by writing, making the sound "cat", or some other gesture that was tagged with the image of the cat. Conversely, when presented with the tag, the human must also be able to think back to the image that the tag or label represents.

The key point is that at some stage in the learning process, a representation of the input data is formulated, and this input data is a mapping between the item to be learnt and the tag of that item. We must also keep in mind that a specific data item, for example the image of a cat, is just an instance of a class consisting of all cats. The process of learning therefore involves connecting a data item to its tag/label such that, anytime either of the mapped items is requested, the human can produce the other.

Machine learning systems, with some plumbing, are also able to return the image of something when the label of that thing is requested. However, most image recognition systems are designed to store knowledge in a single direction, from item to label, whereas a human learns a bidirectional relationship from the very start.
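The bidirectional item-to-label mapping described above can be sketched as a small data structure. This is a minimal illustration, not a real library API; the class name, item strings, and method names are all assumptions made up for this example:

```python
class BidirectionalMemory:
    """Toy store that maps internal representations to labels and back."""

    def __init__(self):
        self.item_to_label = {}
        self.label_to_items = {}

    def learn(self, item, label):
        # Associate the item's representation with the label in both directions.
        self.item_to_label[item] = label
        self.label_to_items.setdefault(label, []).append(item)

    def recall_label(self, item):
        # "What is this?" -- given the item, produce the tag.
        return self.item_to_label.get(item)

    def recall_items(self, label):
        # "Show me one" -- given the tag, produce the remembered items.
        return self.label_to_items.get(label, [])


memory = BidirectionalMemory()
memory.learn("image_of_cat_01", "cat")
print(memory.recall_label("image_of_cat_01"))  # cat
print(memory.recall_items("cat"))              # ['image_of_cat_01']
```

Storing both directions explicitly is what lets a query go from image to label or from label back to image, unlike a one-way classifier.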

When teaching children about things in the world, we usually point at something, often an image, and say: that is a dog. We point at another and say: that is a tree, a house, etc. Although the data is not explicitly supplied as a mapping, pointing out these items in the environment is no different from supplying a system with mapped training data from the outside, as we do in supervised learning.

When you think of unsupervised learning, the foremost thing that comes to mind is clustering data into groups, although clustering is not strictly all that unsupervised learning systems do. Any system that can learn some representation of data without supplied labels is doing some kind of unsupervised learning.
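To make the clustering idea concrete, here is a minimal k-means sketch in plain Python. It groups unlabelled one-dimensional "observations" around k centroids; the data values, k, and iteration count are arbitrary choices for illustration:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """A minimal k-means: group unlabelled points around k centroids."""
    random.seed(seed)
    centroids = random.sample(points, k)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid's cluster.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: (p - centroids[i]) ** 2)
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Two obvious groups of unlabelled observations -- no tags supplied.
data = [1.0, 1.2, 0.8, 9.0, 9.3, 8.7]
centroids, clusters = kmeans(data, k=2)
print(sorted(round(c, 1) for c in centroids))  # [1.0, 9.0]
```

No label ever enters the algorithm; the groups emerge purely from the closeness of the data points to each other.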

If an unsupervised system in the mind of an entity has clustered data into groups, it only needs to use its database of knowledge obtained from supervised learning to apply a tag/label to the grouped data. This is called semi-supervised learning: a supervised learning system drives an unsupervised learning system to gain an understanding of the world very rapidly, without requiring millions of pieces of tagged data.

Tight clusters of data obtained from our unsupervised learning system are related to the features extracted by our supervised learning system. Clusters that lie close to known, labelled features are tagged with the corresponding label from the supervised system, adding data to our knowledge base with high confidence. Outlier data that doesn't fit into any cluster generates "doubt" in the system and cannot easily be associated with a tag.
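This tagging-with-doubt step can be sketched as a function that matches unsupervised cluster centroids against a handful of labelled examples, refusing to tag anything that sits too far away. The distance threshold, positions, and label names here are arbitrary assumptions for the sake of the example:

```python
def tag_clusters(centroids, labelled_examples, max_distance=2.0):
    """Tag each unsupervised cluster centroid with the label of the
    nearest labelled example; centroids too far away produce 'doubt'."""
    tags = []
    for c in centroids:
        label, example = min(labelled_examples.items(),
                             key=lambda kv: abs(kv[1] - c))
        if abs(example - c) <= max_distance:
            tags.append(label)    # close to known knowledge: tag confidently
        else:
            tags.append("doubt")  # outlier: cannot be associated with a tag
    return tags


# A handful of supervised examples, as positions in some feature space.
labelled = {"cat": 1.1, "dog": 9.2}
print(tag_clusters([1.0, 9.0, 50.0], labelled))
# ['cat', 'dog', 'doubt']
```

Two labelled examples are enough to tag whole clusters of unlabelled data, which is the efficiency the paragraph above attributes to semi-supervised learning.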

When in doubt, a human being usually asks questions for clarification, to see how best to group data items that lie far away from any cluster. If, after questioning themselves or another entity, they still cannot fit the data into any of the known clusters, the data either becomes the foundation of a new cluster or is set aside until more information is acquired from the environment or imagination.

The human brain constantly seeks ways to integrate such outlier data into the main framework of the mind, because the mind is always seeking unification of its database of knowledge. If outlier data is never queried in the course of physical existence, its priority in the hierarchy of patterns is reduced and it is moved to some kind of backend storage so that resources are not spent on it.

At any time there are pieces of information in our minds that are outliers, with no relationship to what we already know. All we have done with these pieces of information is transform them into our own internal representation as features; as we go through life experiencing different environments, we are constantly and subconsciously seeking things related to those outliers in our minds.

Someday we encounter a similar structure in the environment, instantly relate it to that outlier representation in our minds, and get an "aha" moment. If this new information carries a tag, it is a simple matter of associating the new tag with the outlier representation. Our internal nearness function computes the relationship between the two representations, the outlier we already hold and the new representation obtained from the input: if the degree of closeness is high, we associate the new tag with the old outlier data very strongly, but if the closeness is weak, we do so with caution.
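One way to sketch such a nearness function is cosine similarity between feature vectors, with two thresholds separating strong association from cautious association. The thresholds, vectors, and the "zebra" tag are illustrative assumptions, not claims about how the mind actually computes closeness:

```python
import math

def cosine_similarity(a, b):
    """Degree of closeness between two feature vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def associate(outlier, new_representation, new_tag, strong=0.9, weak=0.5):
    """Decide how firmly to bind a newly tagged representation
    to an old outlier, based on how close the two are."""
    closeness = cosine_similarity(outlier, new_representation)
    if closeness >= strong:
        return (new_tag, "strong")
    if closeness >= weak:
        return (new_tag, "cautious")
    return (None, "no association")


outlier = [0.9, 0.1, 0.0]
print(associate(outlier, [1.0, 0.0, 0.0], "zebra"))  # ('zebra', 'strong')
print(associate(outlier, [0.5, 0.8, 0.3], "zebra"))  # ('zebra', 'cautious')
```

The same comparison, with a lower score, is what produces the cautious association the paragraph describes.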

Another knowledge acquisition scenario is when we have a bunch of data already grouped in our minds and we are seeking some confirmation of what it is called in common speech. A child might have been observing horses for a while without having a name for the images in their mind. So they inquire of an adult, perhaps by pointing to a horse and asking: what is this? And they are told: it's a horse! They then apply that horse tag to the group of images they have already clustered in their minds using some kind of unsupervised learning. Asking questions about that cluster affirms their knowledge and gives it the label "horse", so that whenever they encounter an animal in that group they will immediately know it is a horse, subject to memory retrieval constraints.

If they doubt whether a particular animal is a horse, because it has been placed in the class of zebra-like creatures, or it is an abnormally small horse, or whatever caused that particular image to lie far from the centre of everything their unsupervised system calls a horse, they will question some superior authority, probably an adult: is this also a horse? On a positive identification they immediately fit it into the cluster, closer to the centre; on a negation they might create a new group and conclude: this is some new kind of creature, and although it looks like a small horse, it is not a horse.

Maybe the adult says it's a zebra, and they now create a new class of animals that look like horses except for the single distinguishing feature of black and white stripes.

We might not have access to the core feature extractors the human mind uses to group things, but we can approximate them with engineered methods like deep neural networks, support vector machines, or some other algorithm that does the feature extraction for us.
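In practice one would use a learned model such as a deep network, but the idea of a feature extractor can be sketched with hand-crafted features: a function that reduces raw input to a few numbers that clustering or classification can then operate on. The particular features chosen here (brightness, aspect ratio, bright-pixel fraction) are arbitrary stand-ins:

```python
def extract_features(image):
    """A stand-in for a learned feature extractor: reduce a 2-D grid of
    pixel intensities (0.0 to 1.0) to a few hand-crafted numbers."""
    pixels = [p for row in image for p in row]
    mean_brightness = sum(pixels) / len(pixels)
    aspect_ratio = len(image[0]) / len(image)
    # Fraction of "bright" pixels: a crude shape/texture cue.
    bright_fraction = sum(p > 0.5 for p in pixels) / len(pixels)
    return (mean_brightness, aspect_ratio, bright_fraction)


tiny_image = [[0.0, 1.0],
              [1.0, 1.0]]
print(extract_features(tiny_image))  # (0.75, 1.0, 0.75)
```

Whatever the extractor is, its output vectors are what the earlier clustering and nearness-function steps would consume.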
