What Neural Nets Really Are: Or, Artificial Intelligence Pioneer Says Start Over

By Briggs, September 19, 2017

There is ancient wisdom in the phrase “never believe your own press” that computer scientists have laid aside in their quest to discover “artificial” “intelligence”.

In the Axios article “Artificial intelligence pioneer says we need to start over” Steve LeVine writes:

In 1986, Geoffrey Hinton co-authored a paper that, four decades later, is central to the explosion of artificial intelligence. But Hinton says his breakthrough method should be dispensed with, and a new path to AI found…

Speaking with Axios on the sidelines of an AI conference in Toronto on Wednesday, Hinton, a professor emeritus at the University of Toronto and a Google researcher, said he is now “deeply suspicious” of back-propagation, the workhorse method that underlies most of the advances we are seeing in the AI field today, including the capacity to sort through photos and talk to Siri. “My view is throw it all away and start again,” he said…

In back propagation, labels or “weights” are used to represent a photo or voice within a brain-like neural layer. The weights are then adjusted and readjusted, layer by layer, until the network can perform an intelligent function with the fewest possible errors.

But Hinton suggested that, to get to where neural networks are able to become intelligent on their own, what is known as “unsupervised learning,” “I suspect that means getting rid of back-propagation.”

If you have not read the series on how an abacus cannot be a brain, nor considered “artificial intelligence”, please do so first (Part I, Part II, Unsupervised learning digression).

Back propagation is only a technique, meaning there are others, to create weights in an “artificial neural network”. Not for the first time do I praise, with genuine enthusiasm, the marketers of computer science for creating wonderful names.

Here is the world’s simplest ANN:

y –> w*y –> z

Some value of y is input into the “network”, and it is then “hit” by a weight, to produce the outcome z. So that if y = 7 and the weight is 2, then (brace yourselves!) z = 14.
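For the skeptical, here is that network as a minimal Python sketch. The function name `simplest_ann` is my own invention for illustration; the arithmetic is exactly what is described above.

```python
def simplest_ann(y, w):
    """The whole 'network': one input y, one weight w, one output z = w * y."""
    return w * y


z = simplest_ann(7, 2)
print(z)  # 14
```

Nothing else happens inside. The function neither knows nor cares how `w` was chosen.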

It does not matter where the weight w came from, whether from back propagation or from the Lord Himself. It is the weight.

Now this is an ANN. The only thing that separates it from the over-hyped versions marketed in “deep learning” and similar-sounding programs is the complexity, by which is meant the number of possible inputs, layers (the “w*y” is a “layer”), and outputs. Some ANNs can be a tangled mess, with lines connecting layers here and there and everywhere, with weights aplenty. But none differs in any essential sense from our simple network.

In short, an ANN is just like our wooden abacus. It is not alive. It is completely dumb. It is a machine which takes inputs, applies definite operations to them, and produces an output. Your sewing machine and typewriter do the same. And so does an abacus. This is not intelligence, though these are all artefacts.

The proof is complete, but it is doubtful it will be convincing to those who have for too long believed their own press. So let’s press the example.

Suppose we add more complexity to our ANN, as in the picture above. The topmost “hidden” node takes inputs from three input nodes, and produces, after weighting these three inputs, two inputs to output nodes, which in those output nodes are hit with other weights to produce z_1 (the topmost output node). The weights are not drawn, but they are there, as described.

Well, this is simply a mechanical process, once the weights are specified. Barring malfunction, it is entirely deterministic. It is a dry process. There is no mystery to it. Adding layers and complexity just makes it bigger and more expensive to run. It will never make it alive, or intelligent. A pipe organ is not more alive than a flute because it has more gears and levers.
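A sketch of such a larger network follows, under loud assumptions: the weights are made up for illustration, I use two hidden nodes rather than whatever the picture shows, and each node does nothing but take a weighted sum of its inputs. Practical networks usually also pass each sum through a fixed "activation" function, which is one more definite operation and changes nothing essential.

```python
def layer(inputs, weights):
    """Each output node is a weighted sum of all the input nodes."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


# Hypothetical weights: 3 inputs -> 2 hidden nodes -> 2 outputs.
W_HIDDEN = [[0.5, -1.0, 2.0],
            [1.5, 0.0, -0.5]]
W_OUTPUT = [[1.0, 2.0],
            [-1.0, 0.5]]


def forward(inputs):
    """Push the inputs through the hidden layer, then the output layer."""
    hidden = layer(inputs, W_HIDDEN)
    return layer(hidden, W_OUTPUT)


z = forward([1.0, 2.0, 3.0])
print(z)  # [4.5, -4.5]
```

Run it a thousand times with the same inputs and the same weights and it gives the same `z` a thousand times. More rows in the weight tables mean more multiplications and additions, and nothing else.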

It does not matter where the weights come from, via back propagation or something else. A weight is a weight. Changing from w = 2 to w = π does not make our simple ANN alive or intelligent merely because the second weight is more complex. Playing only the black keys on a piano does not bring the tune closer to intelligence than playing only Jingle Bells.

Since the origin of the weights does not matter, it does not matter if they are recreated on some regular basis, perhaps as a function of how far the output nodes are from some eventual reality. That is, making the mechanism “dynamic” does not make it alive, or intelligent. In fact, as was explained in the abacus example, it makes no difference whatsoever. It just makes it more complex.
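To see how mechanical the "dynamic" version is, here is a sketch of one common way to recreate the weight of the simplest network from its error: a gradient-descent step on the squared error. The function name, the learning rate of 0.01, and the target of 14 are my assumptions for illustration. For a network with a single weight, this gradient is all that back propagation would compute.

```python
def update_weight(w, y, target, rate=0.01):
    """One gradient-descent step on the squared error (w*y - target)**2."""
    error = w * y - target
    gradient = 2 * error * y  # derivative of the squared error with respect to w
    return w - rate * gradient


w = 0.0  # start from an arbitrary weight
for _ in range(100):
    w = update_weight(w, y=7, target=14)

print(round(w, 6))  # 2.0
```

The weight marches to 2 by plain arithmetic. The update rule is as fixed and as dumb as the multiplication it adjusts.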

Too, speeding up the calculations only makes the thing run faster; speed is not intelligence. And there is nothing that will “emerge” from the structure as complexity grows—not supported by any physical process, that is. (The topic of “emergence” needs its own article.)

There is no hope of creating intelligence from artificial neural networks, or anything that works in a similar fashion to them.

I have more on this topic in Uncertainty: The Soul of Modeling, Probability & Statistics.

I learned of the Axios article from Christos Argyropoulos.

Last updated on September 19, 2017

Briggs

Briggs is an internationally reviled thoughtcriminal, listed as One Of The Top 7 Dangerous Minds by the Hague.

22 Comments

Gary

September 19, 2017, 8:49 am

Call it “simulated decision-making”, then. Is it useful? Will it become dangerous? What might we discover with it or about it? Those are more important (i.e., practical) questions than a quibble about the meaning or nature of “intelligence” in the context of mechanical devices. Not that accurate definitions are unimportant, but they’re only the starting point for conversations about initiatives that could have large consequences.
John

September 19, 2017, 8:51 am

Axios article link is to http://www.washingtontimes.com/news/2017/sep/11/climate-change-activists-want-punishment-for-skept/

Not at all what I wanted to be reading.
Briggs

September 19, 2017, 9:00 am

John,

I have a Stream piece on this subject forthcoming.

Gary,

AI models are no different than any other statistical model. They’re only dangerous to the extent we forget they are the map and not the territory.
Bill

September 19, 2017, 10:04 am

@John
Is this the link you are looking for?

https://www.axios.com/ai-pioneer-advocates-starting-over-2485537027.html
The original link goes to a climate change article at WP.
John

September 19, 2017, 10:10 am

That’s the one, thanks Bill. Looking forward to the Stream article too.
Bill_R

September 19, 2017, 10:50 am

Briggs,

Does anyone in the know actually believe that happy hoohah about neural nets being anything other than an over-parameterized regression with large numbers of interactions? The conversations I’ve had with ML and ANN “advocates” rely on a limited understanding of regression and objective functions and an abiding faith in over-parameterization and regularization.
James

September 19, 2017, 11:00 am

ML folks usually don’t care about parameters, but instead care about out of sample prediction quality.

So they’ve got that going for them.
Bill_R

September 19, 2017, 11:32 am

James,

Agreed, but that is word games. To paraphrase Trotsky “You may not care about the parameters, but the parameters care about you.”

Whether we call them weights, parameters, or angelic beings they have the same function in applications: making generalizable predictions. Claiming no interest in the “guts” does not remove the problems with having large degrees of freedom (e.g. adjustable weights, or parameters, etc.)

Likewise, by choosing an objective function I implicitly am setting a distribution. Statisticians tend to pay attention to that. Mathematical tractability is historically important for figuring out what the heck you are doing. Computational tractability is important for getting an answer.
Ken

September 19, 2017, 11:52 am

Maybe one of the reasons for endorsing a new approach to AI (“he is now ‘deeply suspicious’ of back-propagation, the workhorse method that underlies most of the advances we are seeing in the AI field today, including the capacity to sort through photos and talk to Siri”) is that the current approach seems to require, in practice, information that when accommodated creates significant security vulnerabilities:

https://www.fastcodesign.com/90139019/a-simple-design-flaw-makes-it-astoundingly-easy-to-hack-siri-and-alexa

When Briggs says, “There is no hope of creating intelligence from artificial neural networks, or anything that works in a similar fashion to them.” he’s obviously not using as a benchmark any broad average assessment of what constitutes “intelligence.” Defining one’s terms matters. Consider:

About $12.5B is projected to be spent on AI in 2017.

About 59M people in North America (US & Canada) spend some $7B on fantasy football annually (a figure that’s grown each year), with as much as $30B spent annually on all fantasy sports and directly related ‘things’ (e.g. special software, services, etc.). Estimates vary, but the numbers are staggering and growing.

Fantasy sports has become a multi-billion dollar industry…and there seems no end in sight–fantasy sports becoming a trillion dollar industry before our children graduate from college is not out of the realm of possibility. There’s even legal arbitration available (e.g. sportsjudge.com).

Such is the nadir reached by our species and culture to which we belong.

Pondering such things one cannot help but conclude that what accounts for average human “intelligence” isn’t all it’s cracked up to be, or, that it will be all that hard to reproduce artificially. In other words, if by “intelligence” Briggs is referring to ‘average humans’ or something like that, it would seem the threshold of “intelligence” is rather low for AI developers.
Sander van der Wal

September 19, 2017, 2:08 pm

Neural networks were apparently created for tasks that use the brain for processing, but which were not in itself tasks for which you need intelligence. Computer vision, for instance. Recognizing a cat is not a rational process. Making up the cat Universal is a rational process, but that was not what the network was for.

Also, the argument about extra levers doesn’t apply here. You need to recognize that the cat is big enough and fast enough to eat you. Processing should be fast enough for that, and more levels can give you faster processing.
MatVel

September 19, 2017, 2:16 pm

Briggs,

But what about quantum computers? If a breakthrough is made and humanity finally gets to the level of quantum computation, could people then make more reliable claims of ”artificial intelligence” ?

After all, quantum computers wouldn’t be deterministic since quantum mechanics isn’t, so a non-deterministic quantum super-computer could seem to be the provider of the magic spark that finally achieves ”artificial intelligence”.
DG

September 19, 2017, 2:38 pm

Not a bad article Briggs…I agree, there’s no reason to believe that they will ever create a robot or computer that will have actual intelligence. They may come up with machines that better imitate intelligence but nothing with actual intelligence. You ought to look at John Searle’s Chinese room thought experiment if you haven’t already.
Bob Kurland

September 19, 2017, 3:35 pm

MatVel, just because probabilities are involved in quantum mechanical measurements, doesn’t mean that quantum mechanics isn’t “deterministic”. Given initial and boundary conditions, the solution of the Hamiltonian is fixed. I think that’s a common misconception. On the other hand you can say that there are choices between alternatives. There’s the Conway-Kochen Free Will Theorem that says if we have free will, so do fundamental particles. Which is to say, that past history doesn’t determine a specific choice, just possible choices.
John Q Public

September 19, 2017, 4:46 pm

Even worse, all this AI stuff depends on doing statistics on large data sets. Where do most of the data sets come from? People! If nobody is on Twitter, the algos can’t do anything with your tweets.
Chaeremon

September 20, 2017, 4:34 am

In “The application of machine learning for evaluating anthropogenic versus natural climate change” Jennifer Marohasy and John Abbot wrote about their simulated attack on the ClimateMonster® by using ANN as their Excalibur© (GeoResJ 2017) sword. Since there might be clients of mine who ask “what’s they do with A.I.” I extracted from that paper the flow of control and data, so that I can continue paying my bills by $answering$ $questions$ (etc) to clients (and / or write / apply them the software they are $willing to pay for$).

To make the long ANN Excalibur© sword story short: the authors used signal analysis software for extracting (er, cherry-picking, arbitrarily …) sine curves (aka nameable but pairwise incomparable patterns) from ClimateMonster® proxy data. In the second step they used ANN for combining a _sufficient_ handful of such sine patterns so that the ClimateMonster®’s wiggling tail appears at the ANN’s output (third step omitted here for brevity). In essence the ANN “learned, from throwing dice, ad nauseam” which sine patterns (yes / no) to combine for fitting their overall ClimateMonster® data curve within tolerable (academically preconcerted, yet physically unmeasurable) error bars.

But this approach is merely a typical application of tools for operational research, every decent ILP (etc) program does that for you (if you understand how to do that). It simply does not matter what’s the magician’s name of the tool, what matters is that any two of such tools are Turing equivalent (can be reduced to each other depending on your cleverness and patience). It also does not matter that you might substitute patience by ultra fast computer – no way.

Speaking of tolerable error bars: search the internet for cheddar and room: you’ll fool yourself if you believe the search engine knows the difference between a) cheese, b) village and c) hotel; the trick here is that hordes of internet users can have the same interest as you, and therefore the search results look as if they fit your vaguely specified expectation.
John Morris

September 20, 2017, 6:26 pm

Sounds great, the “AI” people get to build useful things with no risk of accidentally getting SkyNet by accident. Where is the downside?
Tony S

September 20, 2017, 8:14 pm

Please post on “emergence”. Seems to be the answer du jour for the secularist who can’t explain a feature of reality within their materialist mindset.
Chaeremon

September 20, 2017, 10:52 pm

@John Morris, Re “Where is the downside?”
The current++ state of affair/s is that the / an “A.I.” singularity is expected to emerge within 10 years from now. So, gimme the grant money!

IMO “A.I.” is a tool for fooling naive competitors into investing the wrong way.

Would Google open-source their search algo? No way. But they open-source their “A.I.” algo. Give me a break.
Chaeremon

September 20, 2017, 11:16 pm

@Tony S, Re: “emergence” and “A.I.”:

There are two extremes: Marvin L. Minsky proposed to dump a brain on a CD and then re-emerge it (for whatever purpose).

The other side was stated by Alan C. Kay in an interview recently (09.15.17, fastcompany.com): “There’s all sorts of pre-processing you can do with the computing we have now to put a lot more semantics in there, and look at the s_h_t you’re retrieving.” and “Why the f_c_k can’t we type in a question and get a decent answer?”.
Let me add to this: why the f_c_k does the machine not check back on non-trivial questions. Isn’t it curious? (Yes!).

[2nd attempt to overcome the spam folder]