Deep learning’s approach to creativity — is it real intelligence?

Here we will discuss, in a simplified form, the fascinating deep learning systems the field has built so far. We will then take a close look at the concepts of perception and creativity, and what it means for a machine to perceive, learn, and create. Finally, I would like to dig deeper into these concepts from a philosophical perspective and discuss whether machines are creative or intelligent, and whether they can be.

One purpose of deep learning, and of A.I. more broadly, is to make computers and devices able to do things that brains do; at the same time, this work draws our interest toward real brains and neuroscience as well.

Historically, one of the areas we thought impossible for a machine to achieve has been perception: the process by which things out there in the world (sounds and images) are turned into concepts in the mind. This is essential to our own brains. The flip side of perception is creativity: turning a concept into something out there in the world.

Over the past few years, work on machine perception has also unexpectedly connected with the world of machine creativity and machine art. One view of this dual relationship between perception and creativity, often attributed to Michelangelo, is that we create by perceiving, and that perception itself is an act of imagination and is the stuff of creativity.

Vision begins with the eyes, but it truly takes place in the brain. Even though progress on understanding the brain proceeded slowly over the following decades, we knew early on that neurons use electricity. By World War II, researchers were running electrical experiments on live neurons to better understand how they worked. At pretty much the same time we invented the computer, very much based on the idea of modeling the brain as "intelligent machinery."

Let’s take the visual cortex (the cortex that processes imagery coming from the eye) as an example. Warren Sturgis McCulloch and Walter Pitts thought the anatomy of the visual cortex looked like a circuit diagram. Even though the mechanism of the McCulloch-Pitts circuit diagram is not quite right, the basic idea that the visual cortex works like a series of computational elements passing information one to the next in a cascade is essentially correct. Based on this idea, the McCulloch-Pitts neuron became an early model of brain function. This linear model could recognize two different categories of input by testing whether f(x, w) is positive or negative. For the model to correspond to the desired definition of the categories, the weights had to be set correctly.
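A minimal sketch of such a threshold unit in plain Python, with made-up weights chosen purely for illustration: the category is simply the sign of f(x, w) = x · w.

```python
def linear_unit(x, w):
    # McCulloch-Pitts-style threshold unit: category = sign of f(x, w) = x . w
    activation = sum(xi * wi for xi, wi in zip(x, w))
    return 1 if activation > 0 else 0

# Hypothetical weights that separate "first coordinate larger" from the rest.
w = [1.0, -1.0]
print(linear_unit([2.0, 1.0], w))  # -> 1 (first coordinate larger)
print(linear_unit([1.0, 3.0], w))  # -> 0 (second coordinate larger)
```

Whether the unit classifies correctly depends entirely on the weights, which is exactly why setting w correctly is the whole problem.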

What would a model for processing visual information need to do?

The basic task of perception is to take an input image and recognize the name of the object in it. The process between the image of the object and the output word for the object is essentially a set of neurons connected to each other in a neural network, like the graph below.

The pixels of the image form the first layer of neurons, and those feed forward into layer after layer of neurons, all connected by synapses of different weights. The behavior of this network is characterized by the strengths of all those synapses; they are the computational properties of the network.

We can represent the input pixels, the synapses, and the output of the neural network by three variables: x, w, and y. There may be a million or so x’s (the million pixels in the image); there are billions or trillions of w’s (the weights of all the synapses in the network); and the number of output y’s varies with the task.

We can simplify the model to x * w = y. Here we have one equation and three variables, and we can solve for any one of them given the other two. For perception, we know x (the input pixels of the image) and w (the weights), and we calculate y (the output).
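To make the three-variable relation concrete, here is a toy forward pass in plain Python; the "pixels" and "weights" are made-up numbers for illustration:

```python
def forward(x, w):
    # Perception: given pixels x and fixed weights w, compute the output y.
    return sum(xi * wi for xi, wi in zip(x, w))

x = [0.5, 0.2, 0.1]   # toy "pixels"
w = [1.0, -2.0, 3.0]  # toy "synapse weights"
y = forward(x, w)
print(round(y, 6))  # -> 0.4
```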

After this first step, we need to train the model. In other words, we need to enable the model to learn.

To achieve this, we let the network figure out w given the values of x and y. This may look as easy as computing w = y/x, but the "multiplication" here is not the ordinary multiplication we learned in algebra, and the operator has no inverse, so there is no "division" the machine could apply. We have to find a solution that works without division, such as the process below:

We can turn this into an optimization problem by introducing an error term, since x * w is only an approximation of y:

If there is an error left over, it means our model has not yet figured out the right w. Once the problem becomes an optimization problem, we can simply ask the computer to take guesses that minimize the error, and that is the sort of calculation computers are very good at. Step by step, the model decreases the error and finds a progressively better solution.
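A minimal sketch of this guess-and-correct loop in plain Python, using gradient descent on a toy linear model. The data, learning rate, and hidden rule are all made up for illustration:

```python
def learn_w(samples, steps=200, lr=0.1):
    # Learning: given (x, y) pairs, start from a guess for w and
    # repeatedly shrink the error between x * w and y.
    w = [0.0, 0.0]  # arbitrary starting guess
    for _ in range(steps):
        for x, y in samples:
            pred = sum(xi * wi for xi, wi in zip(x, w))
            err = pred - y
            # Move each weight a little in the direction that reduces the error.
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

# Toy data generated by the hidden rule y = 2*x1 - 1*x2.
samples = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
w = learn_w(samples)
print([round(wi, 3) for wi in w])  # -> [2.0, -1.0]
```

The loop never divides by anything; it only multiplies, adds, and compares, which is exactly why this formulation suits the machine.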

Let’s review the whole process: we give the model the inputs (x) and the right answers (y), and we let the model figure out a path from the input to the right output. It is much like the way we do our own learning. Besides designing the usual perceiving and learning paths for the machine, we can also find a way to let the machine literally create. Here are the three paths:

1. We fix x and w and let the model generate y — perception.

Inductive reasoning: supplying strong evidence for the truth of a conclusion.

2. We fix x and y and let the model solve for w — learning.

Deductive reasoning: reasoning from one or more statements (premises) to reach a logically certain conclusion.

3. And here is another way: we fix w and y and let the model find x — creativity.

Abductive reasoning: a form of logical inference that starts with an observation and then seeks the simplest and most likely explanation.

In the third situation, we have our trained model (a fixed set of w’s) and the output (the name of the object, y), and we try to answer the question:

What should the initial image (the input x) look like?

It turns out that by using exactly the same error-minimization procedure on the very network we trained to recognize objects, we can have the machine generate an image of the object for us.

You can also begin with a non-empty canvas (an initial x′) and generate x on that canvas. The model will end up finding many instances of the object image x you are looking for within the initial image x′, like the one below:
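The same idea can be sketched in plain Python: freeze the trained weights w and a desired output y, and repeatedly nudge the canvas x until the frozen model produces that output. A real system does this over millions of pixels through a deep network; this is a toy linear stand-in with made-up numbers:

```python
def dream_x(w, y_target, x0, steps=500, lr=0.1):
    # Creativity: hold w and the desired output y fixed, adjust the input x.
    x = list(x0)  # start from a blank or non-empty "canvas"
    for _ in range(steps):
        pred = sum(xi * wi for xi, wi in zip(x, w))
        err = pred - y_target
        # Nudge each "pixel" in the direction that moves the output toward y.
        x = [xi - lr * err * wi for xi, wi in zip(x, w)]
    return x

w = [2.0, -1.0]  # a "trained" model, frozen
x = dream_x(w, y_target=3.0, x0=[0.0, 0.0])
y = sum(xi * wi for xi, wi in zip(x, w))
print(round(y, 3))  # -> 3.0 : the found canvas now produces the desired output
```

Starting from a non-empty x0 instead of zeros gives the effect described above: the existing canvas gets reshaped toward the target rather than replaced.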

In the end, we may conclude that perception and creativity are interconnected. We have seen that neural networks trained entirely to discriminate, to recognize different things in the world, can also be run in reverse, to generate.

We used to think perception and creativity were uniquely human, but we now have computer models that can do exactly these sorts of things. In turn, it ought to be unsurprising that the brain is, in a way, a computational machine as well. Computing is not just about accounting. From the beginning we modeled computers after our minds, and they give us both the ability to understand our own minds better and the ability to extend them.

But here I would like to say a little more about creativity, or more broadly about intelligence, a word that gets thrown around without being fully understood. We have to go back and ask how we humans came to be defined as intelligent, despite the fact that "what is conspicuously lacking is a broad framework of ideas in which to interpret these different approaches."

It turns out we should ask ourselves

“Is intelligence defined by behaviors?”

But is that true? Are we intelligent because we behave intelligently, because we do intelligent things? In other words, we often fall back on this behavioral metric to define intelligence. But I believe intelligence is not defined purely by behavior, even as we try so hard to make machines "behave" intelligently, as in the diagram below:

There are challenges A.I. has not yet met, such as common-sense reasoning: simulating the human ability to make presumptions about the type and essence of the ordinary situations we encounter every day. These presumptions include judgments about the physical properties, purposes, intentions, and behavior of people and objects, as well as the possible outcomes of their actions and interactions. Most A.I. scientists were trying to build symbolic information processing, which is as much a philosophical approach to the mind as an engineering one.

People dedicated to making computers intelligent may think about mind or consciousness like this: concepts are rules; we hold representations of the world in our minds; we make inferences from those representations; and those representations make us behave intelligently. It turns out that some who adopt this view end up trapped by the frame problem. Some also believe we carry an inner mental representation of the world, but in truth much of our knowledge is not stored somewhere like the mind; it is stored out there in the world.

So far, I wouldn’t call deep learning "creative" or say it gives machines the ability to be creative; rather, it is we who manipulate computers for our own enjoyment. The computer doesn’t know it has "created" anything; it only knows it has finished running a computation. For creativity to count as intelligence, the machine would have to be able to perform (or learn to perform with instruction and training, but without explicit post-factory programming) any task that a human might "reasonably" be expected to perform given its effector/sensor suite.

So the next question would be

Do we always need biological processes in order to be truly intelligent?

Maybe, and maybe not. Perhaps, as the "society of mind" concept suggests, subprocesses called "agents" combine to eventually give rise to something greater than the sum of its parts. One thing is clear: only if features such as consciousness or self-reflection could emerge from such an assembly can we expect the machine to be truly intelligent. Take the development of neural networks as an example: artificial neural networks will probably reach the same number of neurons as the human brain by the 2050s, assuming no new technology enables faster scaling. But the brain is more than just hundreds of billions of neurons; it may realize more complicated functions that a machine with the same number of neurons cannot match. Still, we have to admit that the potential for superintelligence lies dormant in matter.

What will future superintelligence look like?

Assuming machines become extremely powerful in the future, we would then have a future shaped by the preferences of the A.I. We need to think of this intelligence as an optimization process, a process that steers the future into a particular set of configurations. If you create a powerful optimization process to maximize an objective x, you had better make sure that the definition of x incorporates everything you care about.