What is Intelligence: Representation

REPRESENTATION


The word representation gets tossed around a lot in AI research these days. It is worth gaining a deep understanding of what representation is, because it is the key to understanding what the underlying intelligence of a human mind actually is.


A representation is a way of presenting a collection of data. This definition is simplistic, but it gives the mind an anchor for understanding the full range of phenomena we tag with the word representation. If we look at a picture of a cat printed on paper, we are seeing a representation of a cat and not the cat itself. If we take the data in the cat picture and sonify it, that is, turn it into a sequence of sounds, we obtain a different representation of the image of the cat. This time we are representing the pixel information as sound instead of arranging it as an m x n grid of pixel values. In the grid form, a computer screen or printer ink on paper presents those values as surfaces that emit or reflect light of particular frequencies, our eyes represent those frequencies as colours, and an image is formed internally.

When we sonify the data, the result is a series of compressions and rarefactions of the air that our ears recognize as sound.
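To make the sonification idea concrete, here is a minimal Python sketch. The mapping from grey value to pitch, the frequency range (220 to 880 Hz), the sample rate and the tiny 2 x 2 image are all assumptions chosen for illustration; they are only one of many ways the same pixel data could be re-presented as sound.

```python
import math

def sonify(pixels, low_hz=220.0, high_hz=880.0):
    """Map each 0-255 grey value of a pixel grid (read row by row) to a tone frequency."""
    flat = [value for row in pixels for value in row]
    return [low_hz + (value / 255) * (high_hz - low_hz) for value in flat]

def tone_samples(freq_hz, n_samples=80, sample_rate=8000):
    """Return n_samples of a sine wave at freq_hz: the pressure pattern a speaker would play."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n_samples)]

# A hypothetical 2 x 2 greyscale "image".
image = [[0, 255],
         [51, 204]]

frequencies = sonify(image)  # one frequency per pixel
waveform = [s for f in frequencies for s in tone_samples(f)]  # tones played back to back
```

The grid and the waveform carry the same information; only the structure it is laid out in has changed.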

Even the numbers that we use to represent pixel information are a system of representation; we could use any kind of symbol imaginable. Transferring the pixel information we captured with a camera onto a computer, to be represented as bits on computer circuitry, is also a system of representation. Even a bit is an abstract idea represented on circuitry that can maintain two voltage levels, high and low.

When we try to explain the idea of representation in deep learning, the first example we reach for is usually the heliocentric representation of the solar system versus the geocentric representation.



From left to right: Heliocentric (sun at the centre) representation vs. Geocentric (earth at the centre) representation of the solar system.

This is a lofty example of the power of representation, and it can lead the mind to think that the power of representation is demonstrated only in such grand cases. In fact, representation, that is, stating one thing in terms of another, is everywhere and is indeed a very powerful attribute of intelligence.

In the image above, early thinkers, later backed by the church and its insistence that the earth is at the centre of the universe, justified that assertion by producing a “model” in which the earth was at the centre. To reconcile the assertion with facts from observation, they produced the squiggly, complicated picture on the right of the image.

Before we conclude too grandly that representation lies only in reconciling conceptions with facts, we must realize that simply by drawing the solar system as a picture, with circles for planets and lines for orbits, we are representing the structure we live in as an image on a computer or on paper, depending on where you are reading this.

Representation is indeed a complex phenomenon and can rack a mind that tries to trace what it truly is, but put simply we can view it as mapping a set of data from one structure (representation) to another structure.

In the case of heliocentrism vs geocentrism, we are trying to map observational data of the solar system into a representation as an image. Geocentrism arose from trying to force this data to fit a prior belief, resulting in that complicated mess of planets and orbits. Heliocentrism was an attempt to represent the facts as they are in reality, and it turned out to be accurate.

In the case of geocentrism, the mapping function had a bias built into it (the preconception enforced by the church), resulting in a poor representation, while in the heliocentric picture the mapping function represented the data accurately, without bias.
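The idea of one data set under two mapping functions can be sketched in Python. This is only an illustration under loud assumptions: orbits are treated as perfect coplanar circles, Mars is given rounded values (1.524 AU, 1.881 years), and the function names are made up for the example. It shows a change of coordinate origin, which is one narrow reading of "mapping function", not a reconstruction of any historical model.

```python
import math

def heliocentric(radius_au, period_yr, t_yr):
    """Position of a planet on an assumed circular orbit, with the sun at the origin."""
    angle = 2 * math.pi * t_yr / period_yr
    return (radius_au * math.cos(angle), radius_au * math.sin(angle))

def geocentric(planet_xy, earth_xy):
    """Re-express the same position with the earth at the origin."""
    return (planet_xy[0] - earth_xy[0], planet_xy[1] - earth_xy[1])

EARTH = (1.0, 1.0)      # (orbit radius in AU, period in years)
MARS = (1.524, 1.881)   # rounded values, assumed circular orbit

sun_dist = []    # distance of Mars from the origin, heliocentric view
earth_dist = []  # distance of Mars from the origin, geocentric view
for i in range(301):  # three years in steps of 0.01 year
    t = i / 100
    earth = heliocentric(*EARTH, t)
    mars = heliocentric(*MARS, t)
    sun_dist.append(math.hypot(*mars))
    earth_dist.append(math.hypot(*geocentric(mars, earth)))
```

In the sun-centred view Mars stays at a constant distance from the origin, so a single circle describes it. In the earth-centred view the same data swings between roughly half an AU and two and a half AU from the origin, which is the kind of motion that needs a far more complicated picture to draw.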

We can see representation everywhere; even language is a representation of the desire to communicate. Why entities like humans choose sounds or symbols to represent discrete states of mind and communicate them to other entities is something we might never know. But we can see that for a representation to make sense, it has to appeal to the observational capacity of the entity observing it, and the entities passing the representation back and forth must share some common structure in order to understand it.

When we want to describe what a representation is, we usually resort to another representation that abstracts away the details of the base representation and presents a higher-level or clearer view of it. In no field of endeavour has the power of representation been demonstrated to humans as it has in computer architecture.

In the computer, we take raw nature and lump it up into an abstraction of circuitry. We take raw, chaotic nature and represent it in the form of computing logic. We start by building basic components: capacitors that hold charge, representing storage; resistors and transistors for controlling the flow of electrons; and inductors, which resist changes in current. We must not forget the wires that connect everything, and the power source.

So, in essence, we take raw nature and, by structuring it, represent computer logic with 0s and 1s, which we control through logic gates. Using these simple structures we are able to build higher-level structures like adders and registers, which give us a simple system in which a bunch of circuits interact with other circuits to produce usable results.

This is all there is to computing but the challenge comes when we try to compute something. The basic computer with an instruction system, an ALU (arithmetic logic unit) and a control system can only perform computations if these computations are transferred from their original representation into the representation that is acceptable to the computing circuits.

So if we say 1 + 1 on paper, to be able to compute it using a computer, we must transfer this literal expression to one that is represented by the instruction set of a particular computer system.

The computer has a basic language in which instructions and data can be passed into it for computation. This language is an abstract representation of the core machine layout. It is so close to the machine that we are practically pushing bits around on the circuitry, yet it is still abstract, because it sits a level above even more primitive languages that have nothing to do with computation as such and are only capable of moving information around on the circuitry. Below the typical machine language of the computer are these more primitive (by primitive I mean low-level) languages, such as RTL (Register Transfer Language).

The machine language of a computer is a human-facing language that presents an interface to the underlying hardware. It is a representation of the hardware, and our goal in programming the computer is to transfer our problems from their original representation to the representation that the computer wants, so that we can use it to compute solutions.

Recall the sudoku problem we tasked an agent with solving, and the stage where the agent had the electronic computer in hand after several iterations of applying its inventive capacity to produce some kind of mechanized computing system to save itself physical energy.

Once the agent had a computer, it had to transfer its sudoku problem from the representation on a sudoku board to one that was acceptable to the computer, in this case a digital one. We also explored the scenario where the agent had learned mathematics and was able to transfer the sudoku from the board representation into mathematical equations, which are much easier for a computer to handle. Without the equations, the agent would be tasked with simulating an actual sudoku board, which uses a large amount of resources.

Now back to our machine language representation. In order to input a simple addition problem like 1 + 1, we would first have to break up our problem into operands and operators. Both operands, in this case, are 1, while the operator is the plus operator.

Assuming we are dealing with a 4-bit computer, that is, a computer that is capable of dealing with a block of 4 bits at once, we would encode our 1 as the binary number 0001.
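As a small illustration of this encoding step, the Python sketch below converts between an ordinary integer and a fixed-width bit string. The helper names `to_bits` and `from_bits` are hypothetical, introduced only for this example, and the 4-bit width is the one assumed in the text.

```python
def to_bits(value, width=4):
    """Encode a non-negative integer as a fixed-width binary string."""
    if not 0 <= value < 2 ** width:
        raise ValueError(f"{value} does not fit in {width} bits")
    return format(value, f"0{width}b")

def from_bits(bits):
    """Decode a binary string back into an integer."""
    return int(bits, 2)

one = to_bits(1)       # the operand from the text
largest = to_bits(15)  # the largest value a 4-bit block can hold
```

Both forms denote the same number; the bit string is simply the representation the circuitry can hold.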

The instruction set of a computer contains all the commands the computer is capable of carrying out; together they make up its entire machine language. In a typical digital computer, it is easy to arrange the primitive gate circuits so that we get a circuit that directly adds its operands. This is called an adder circuit. It receives at most two operands and, in some cases, a carry (the overflow from a previous computation), and adds them to produce a result and, where it applies, a carry.

This calculation is mostly carried out by fixed circuits and all that is needed is to load the operator and operand data for the circuit to allow electricity to flow and achieve computation.
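The adder described above can be sketched in Python using only bitwise operators standing in for gates (XOR, AND, OR). This is a textbook full adder chained into a ripple-carry adder, offered as an illustration rather than as the layout of any particular machine; the function names are made up for the example.

```python
def full_adder(a, b, carry_in):
    """Add two bits plus an incoming carry using only gate-like operations."""
    partial = a ^ b                              # XOR gate
    total = partial ^ carry_in                   # XOR gate
    carry_out = (a & b) | (partial & carry_in)   # two AND gates and an OR gate
    return total, carry_out

def ripple_add(a_bits, b_bits):
    """Add two equal-length bit strings (most significant bit first)."""
    carry = 0
    out = []
    # Start from the least significant bit and pass the carry along.
    for a, b in zip(reversed(a_bits), reversed(b_bits)):
        bit, carry = full_adder(int(a), int(b), carry)
        out.append(str(bit))
    return "".join(reversed(out)), carry

result, carry = ripple_add("0001", "0001")
```

Here `result` holds the 4-bit sum and `carry` the overflow out of the top bit, matching the "result and a carry" that the adder circuit produces.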

The machine instructions for doing this electronic addition could be:

LOAD 0001 TO A (load the first operand to a register storage location called A)

LOAD 0001 TO B (do the same for the second operand into location B)

ADD A B (add whatever is in location A to B, results are now in a location called the accumulator)

STORE C (store the result in register location C)

If we look up register location C, our result is there, and it will be 0010 if the computation proceeded correctly.
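To see the four instructions above actually run, here is a toy interpreter in Python. It is a sketch of the hypothetical 4-bit machine used in this post, not a model of any real instruction set: the parsing rules, the dictionary of registers and the wrap-around at 16 are all assumptions made for the example.

```python
def run(program):
    """Execute the toy LOAD / ADD / STORE program and return the register contents."""
    registers = {}
    accumulator = 0
    for line in program:
        parts = line.split()
        op = parts[0]
        if op == "LOAD":      # LOAD <bits> TO <register>
            registers[parts[3]] = int(parts[1], 2)
        elif op == "ADD":     # ADD <register> <register>, result goes to the accumulator
            accumulator = (registers[parts[1]] + registers[parts[2]]) % 16  # keep 4 bits
        elif op == "STORE":   # STORE <register>, copies the accumulator
            registers[parts[1]] = accumulator
        else:
            raise ValueError(f"unknown instruction: {line}")
    return registers

program = [
    "LOAD 0001 TO A",
    "LOAD 0001 TO B",
    "ADD A B",
    "STORE C",
]
registers = run(program)
c_bits = format(registers["C"], "04b")  # contents of register C as 4 bits
```

After the run, `c_bits` is the bit pattern held in register C.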

What we just witnessed above is a computer program, a way to instruct a computer. But what we are doing intrinsically is actually transferring the representation of our problem like 1 + 1 on paper into a representation that can be dealt with on a computer.

As we can see, a computer that can only perform arithmetic and logic is a very limited device. With considerable ingenuity on our part, we could try to transfer every problem we encounter into this primitive representation of arithmetic and logic. That would be a formidable accomplishment, but it would rarely be worth the effort, and the computer would not be important for anything beyond simple calculations.

But the true power of computation is not only in its calculative power, which is what drove its early development, but actually in its ability to represent arbitrary computation because of its memory manipulating capacity.

Beyond arithmetic and logic, the power of the computer lies in the fact that it can represent things in memory and transform them in arbitrary ways. Apart from arithmetic and logical operations, the computer also has directives for storing and retrieving data in memory locations beyond the registers, and for making arbitrary jumps from one location in the program to another.

Using this capacity for arbitrary transformation, the programmer can build new operators by creating rules that define how one state transforms into another. This makes arbitrary transformations possible, and they are the true power of computation.

Because of this rule-building power, the programmer, rather than being limited to the hard-wired ALU circuitry, can proceed to build other languages by creating lookup tables. A lookup table is a simple rule that says: when you see this thing on the left, replace it with this thing or these things on the right.

With this power, all built on the foundation of being able to manipulate memory arbitrarily, the programmer, rather than struggling with raw primitive logic, could go ahead and build another layer of abstraction: the assembly language level, which enables the programmer to represent problems with greater fidelity and ease than would have been possible at the machine language level.
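The lookup-table idea can be sketched as a tiny assembler in Python. The 2-bit opcodes and register codes below are invented for illustration and do not belong to any real machine; the point is only that moving between the assembly level and the bit level is a matter of replacing the thing on the left with the thing on the right.

```python
# Hypothetical tables: symbol on the left, bit pattern on the right.
OPCODES = {"LOAD": "00", "ADD": "01", "STORE": "10"}
REGISTERS = {"A": "00", "B": "01", "C": "10"}

def assemble(line):
    """Translate one assembly-style line into bit fields by table lookup."""
    tokens = [t for t in line.split() if t != "TO"]  # "TO" is only for readability
    fields = [OPCODES[tokens[0]]]                    # first token names the operation
    for token in tokens[1:]:
        # Register names are replaced via the table; literal bit strings pass through.
        fields.append(REGISTERS.get(token, token))
    return " ".join(fields)

machine_code = [assemble(line) for line in
                ["LOAD 0001 TO A", "LOAD 0001 TO B", "ADD A B", "STORE C"]]
```

The programmer writes the readable form on the left of each table, and the translation into the machine's own representation is purely mechanical.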

So we can see that the programmer takes just a tiny piece of the whole space of possible machine language instructions and builds a ladder on that island, one that reaches a new level of abstraction and representational power and makes it easy to express a large swath of possible programs.

But this is not the end of the ladder. Using certain instructions available at the assembly language level, the programmer is able to climb to the level of what we call higher-level languages, like C, and is thus able to express an even larger swath of programs, ones that would have been dreadfully difficult to write in the assembly language representation/abstraction.

In a way, this looks like the kind of exploration the AI program called AlphaGo does internally.




Source: https://thenewstack.io/google-ai-beats-human-champion-complex-game-ever-invented/

If we look at the tree from left to right, we can see that a region with a large amount of branching indicates a kind of local settlement, a layer of abstraction/representation. To climb to the next level of abstraction, the programmer picks a handful of language structures and uses them to build a new level of abstraction/representation. The similarity between this mental process and the game tree exploration done by AlphaGo was so striking that I couldn’t help but reference it.

When we have a high-level language like C, we can go ahead and write all sorts of programs we can conceive of, ignoring the lower-level details. We can write everything from games to operating systems to spaceship control modules more easily than we could at the assembly language level, a level closer to the machine in representation and far from the way we actually represent problems in our minds.

With a higher-level language, we can represent a problem in code in nearly the same way we would usually think about it, which makes it easy to actually write the code. In a lower-level language, we would have to limit our thinking to the assembly language level.

Given our new power to express programs in a high-level language, we thought that we were at the right representation to represent any program and thus we went ahead to try to create programs for mimicking our own cognition, which is the entire AI endeavour.

Early attempts at this were a disaster leading to several AI winters where no one wanted to mention the word AI in any research circle. When we realized we could simulate any system on computers we jumped right in with excess optimism to simulate the processes of our minds. But as it happens we did not have the right representation and this phase of AI research looked more like geocentrism than heliocentrism.

Rather than focusing on what we could learn from the brain, as Rosenblatt did, we went ahead and simulated specific aspects of cognition that we wrongly thought were at the “centre” of human cognition. Some researchers even had the guts to name a program the General Problem Solver after formulating a narrow definition of what a problem is.

If we had worked on improving the idea of the perceptron rather than bashing it to pieces as Minsky and Co. did, we would have been well on our way to the kind of AI we do in modern times. We should also note that backpropagation was popularized in 1986 by Rumelhart, Hinton and Williams, work with strong roots in psychology, and without it we wouldn’t have been able to train deep neural nets.

Early AI work in image recognition was mostly concerned with hand engineering features/representations of images that were general enough to be used for image recognition.

It was only when we tried to invent a system for automatically extracting features of data, that is neural networks, that we were well on our way to what we are doing with modern AI.

Many people will say that deep learning is representation learning, and truly it is. It learns a representation of the input data that enables it to generalize, so that when new input containing a similar representation is passed into the network, the network can identify it as similar to, or the same as, previously seen data.

Remembering the AlphaGo game exploration tree above, we can see that neural networks are another ladder to a new level of abstraction beyond what we can do with high-level languages. Climbing to this new level is no different from the way we went from machine language to assembly, or from assembly language to high-level languages. With high-level languages, which give our minds an easier means of representing any problem, we are able to create neural networks, a ladder to a new level of abstraction where we can solve problems that would have been difficult or even impossible to solve by hand coding in a high-level language, no matter how high.

Apart from the fact that a deep neural network learns a representation of data, we should also look at the importance of the network layout itself as a problem-solving representation.

This is the idea I have been quietly hinting at throughout this work: we should not look only at specialized networks like those of deep learning when we seek to build AI. What specializes a generalized network into the kind that makes deep learning possible is the idea of weighted connections and the multiply-sum-activate process that goes on inside the nodes of the network, i.e. the neurons.
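The multiply-sum-activate process inside a single node can be written in a few lines of Python. This is a minimal sketch: the sigmoid activation, the bias term and the example numbers are assumptions chosen for illustration, and real networks use many such nodes and a variety of activation functions.

```python
import math

def neuron(inputs, weights, bias=0.0):
    """One node: multiply inputs by weights, sum them, then apply an activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias  # multiply and sum
    return 1.0 / (1.0 + math.exp(-total))                       # activate (sigmoid)

# Two incoming connections with made-up weights.
output = neuron([1.0, 2.0], [0.5, -0.25])
```

With these particular numbers the weighted inputs cancel out, so the node sits exactly at the midpoint of the sigmoid; changing a weight changes how strongly that connection pushes the output up or down.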

Basic research should look into the capabilities of generalized networks of all kinds and not just focus on deep networks, because we also have to understand that the human brain is not organized into the neat stacked layers of a deep network and, as far as we know, doesn’t do backpropagation. Backprop and layers are just a gateway to more general network ideas that could give us more efficient ways of doing what we are already doing with deep networks. The variety of networks available these days is a testament to this.

Deep neural networks learn a representation of data, but neural networks, and networks in general, are the most generalized representation, one that fits many problems well, and they are probably what Artificial General Intelligence is.
