What Neural Nets Really Are: Or, Artificial Intelligence Pioneer Says Start Over

There is ancient wisdom in the phrase never believe your own press that computer scientists have laid aside in their quest to discover “artificial” “intelligence”.

In the Axios article “Artificial intelligence pioneer says we need to start over” Steve LeVine writes:

In 1986, Geoffrey Hinton co-authored a paper that, three decades later, is central to the explosion of artificial intelligence. But Hinton says his breakthrough method should be dispensed with, and a new path to AI found…

Speaking with Axios on the sidelines of an AI conference in Toronto on Wednesday, Hinton, a professor emeritus at the University of Toronto and a Google researcher, said he is now “deeply suspicious” of back-propagation, the workhorse method that underlies most of the advances we are seeing in the AI field today, including the capacity to sort through photos and talk to Siri. “My view is throw it all away and start again,” he said…

In back propagation, labels or “weights” are used to represent a photo or voice within a brain-like neural layer. The weights are then adjusted and readjusted, layer by layer, until the network can perform an intelligent function with the fewest possible errors.

But Hinton suggested that, to get to where neural networks are able to become intelligent on their own, what is known as “unsupervised learning,” “I suspect that means getting rid of back-propagation.”

If you have not read the series on how an abacus cannot be a brain, nor considered “artificial intelligence”, please do so first (Part I, Part II, Unsupervised learning digression).

Back propagation is only a technique, meaning there are others, to create weights in an “artificial neural network”. Not for the first time do I praise, with genuine enthusiasm, the marketers of computer science for creating wonderful names.

Here is the world’s simplest ANN:

y –> w*y –> z

Some value of y is input into the “network”, and it is then “hit” by a weight, to produce the outcome z. So that if y = 7 and the weight is 2, then (brace yourselves!) z = 14.
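For those who like to see it run, here is a minimal sketch of that one-weight network in Python. The function name simplest_ann is my own invention for illustration; nothing about it comes from any library.

```python
def simplest_ann(y, w):
    """The one-weight 'network': the input y is hit by the weight w."""
    return w * y


# The example from the text: y = 7 and a weight of 2.
z = simplest_ann(7, 2)
print(z)  # 14
```

That is the whole of it: one multiplication.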

It does not matter where the weight w came from, whether from back propagation or from the Lord Himself. It is the weight.

Now this is an ANN. The only thing that separates it from the over-hyped versions marketed in “deep learning” and similar-sounding programs is the complexity, by which is meant the number of possible inputs, layers (the “w*y” is a “layer”), and outputs. Some ANNs can be a tangled mess, with lines connecting layers here and there and everywhere, with weights aplenty. But none differs in any essential sense from our simple network.

In short, an ANN is just like our wooden abacus. It is not alive. It is completely dumb. It is a machine which takes inputs, applies definite operations to them, and produces an output. Your sewing machine and typewriter do the same. And so does an abacus. This is not intelligence, though these are all artefacts.

The proof is complete, but it is doubtful it will be convincing to those who have for too long believed their own press. So let’s press the example.

Suppose we add more complexity to our ANN, as in the picture above. The topmost “hidden” node takes inputs from three input nodes, and produces, after weighting these three inputs, two inputs to output nodes, which in those output nodes are hit with other weights to produce z_1 (the topmost output node). The weights are not drawn, but they are there, as described.

Well, this is simply a mechanical process, once the weights are specified. Barring malfunction, it is entirely deterministic. It is a dry process. There is no mystery to it. Adding layers and complexity just makes it bigger and more expensive to run. It will never make it alive, or intelligent. A pipe organ is not more alive than a flute because it has more gears and levers.
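The mechanical nature of the thing can be sketched in a few lines. This is only an illustration under assumptions: the weights below are made up, the layer sizes (three inputs, two hidden nodes, two outputs) are hypothetical, and each node does nothing but a plain weighted sum. Real networks usually pass each sum through a fixed function as well, but that is just one more definite operation.

```python
def layer(inputs, weights):
    """Each node outputs the weighted sum of every input feeding it.

    weights[j][i] is the weight on input i going into node j.
    """
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def forward(inputs, hidden_weights, output_weights):
    """Push the inputs through the hidden layer, then the output layer."""
    hidden = layer(inputs, hidden_weights)
    return layer(hidden, output_weights)


# Hypothetical, made-up weights: 3 inputs -> 2 hidden nodes -> 2 outputs.
hidden_weights = [[1.0, 0.5, -1.0],
                  [0.0, 2.0, 1.0]]
output_weights = [[1.0, 1.0],
                  [0.5, -2.0]]

z = forward([1.0, 2.0, 3.0], hidden_weights, output_weights)
print(z)  # [6.0, -14.5]

# Same inputs, same weights, same answer, every time.
assert forward([1.0, 2.0, 3.0], hidden_weights, output_weights) == z
```

More nodes and more layers only make the lists longer. The operations stay the same: multiply, add, pass along.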

It does not matter where the weights come from, via back propagation or something else. A weight is a weight. Changing from w = 2 to w = π does not make our simple ANN alive or intelligent because the second weight is more complex. Playing only the black keys on piano does not make the tune closer to an intelligence than playing only Jingle Bells.

Since the origin of the weights does not matter, it does not matter if they are recreated on some regular basis, perhaps as a function of how far the output nodes are from some eventual reality. That is, making the mechanism “dynamic” does not make it alive, or intelligent. In fact, as was explained in the abacus example, it makes no difference whatsoever. It just makes it more complex.
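A “dynamic” weight can be sketched just as easily. What follows is not back propagation proper, only a simple error-correction rule of my own choosing for the one-weight network: nudge the weight in proportion to how far the output is from the target. The function name, the learning rate of 0.01, and the 50 repetitions are all assumptions made for illustration.

```python
def update_weight(w, y, target, rate=0.01):
    """Nudge w by an amount proportional to the miss between target and w*y."""
    error = target - w * y
    return w + rate * error * y


# Start with a weight of 0 and keep recreating it from the error.
w = 0.0
for _ in range(50):
    w = update_weight(w, y=7, target=14)

print(w)  # very close to 2
```

The weight ends up near 2, but it got there by the same kind of arithmetic as before. The update rule is as deterministic as the network it feeds.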

Too, speeding up the calculations only makes the thing run faster; speed is not intelligence. And there is nothing that will “emerge” from the structure as complexity grows—not supported by any physical process, that is. (The topic of “emergence” needs its own article.)

There is no hope of creating intelligence from artificial neural networks, or anything that works in a similar fashion to them.

I have more on this topic in Uncertainty: The Soul of Modeling, Probability & Statistics.

I learned of the Axios article from Christos Argyropoulos.

Signal + Noise vs. Signal — Important Update

If we imagine these are atmospheric concentrations or stock price anomalies, this is a terrific example of reification, or replacing what did happen with what did not.

Update I see that I failed below to demonstrate the ubiquity of the problem. So your homework is to search “testing trend time series” and similar terms and discover for yourself. Any kind of hypothesis test used on a time series counts.

My impetus was in reading an article about a paper some colleagues and I wrote about atmospheric ammonia. The author wrote, “The statistical correlation between hourly ammonia concentrations between measurement stations is weak due to large variability in local agricultural practice and in weather conditions. If data are aggregated to longer time scales, correlations between stations clearly increase due to the removal of noise at the hourly timescale.”

There’s the belief in “noise”, which does not exist, and there’s also the second (bigger) mistake, which is measuring correlation of time series after smoothing, which increases (in absolute value) the correlation (as has been proved here and in Uncertainty many, many, many times). This happens even for two strings of absolutely unrelated, made-up numbers. Try it yourself.
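One way to try it, as a rough sketch only: generate many pairs of unrelated random series, compute the correlation of each pair before and after a running mean, and compare the average absolute correlations. The series length (200), window (20), number of trials (200), and seed are arbitrary assumptions of mine, and the running mean stands in for whatever smoother you prefer. Any single pair may go either way; the sketch looks at the average.

```python
import random


def pearson(a, b):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def running_mean(series, window):
    """Simple running mean; the result is shorter than the input."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]


def average_abs_correlation(trials=200, length=200, window=20, seed=1):
    """Average |correlation| of unrelated pairs, raw and after smoothing."""
    rng = random.Random(seed)
    raw_total = 0.0
    smooth_total = 0.0
    for _ in range(trials):
        # Two strings of unrelated, made-up numbers.
        a = [rng.gauss(0, 1) for _ in range(length)]
        b = [rng.gauss(0, 1) for _ in range(length)]
        raw_total += abs(pearson(a, b))
        smooth_total += abs(pearson(running_mean(a, window),
                                    running_mean(b, window)))
    return raw_total / trials, smooth_total / trials


raw, smoothed = average_abs_correlation()
print("average |correlation|, raw:     ", raw)
print("average |correlation|, smoothed:", smoothed)
```

Widen the window and run it again to see how the smoothed figure responds.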

So you just look for mentions of “noise” in stock prices, and so on and see if I’m right about the scale of the problem.

Original article

Two weeks ago the high temperature on the wee island upon which I live was 82F (given my extreme sloth, I am making all details up).

Now for the non-trick question: What was the high temperature experienced by those who went out and about on that day?

If you are a subscriber to the signal+noise form of time series modeling, then your answer might be 78F, or perhaps 85F, or even some other figure altogether. But if you endorse the signal form of time series modeling, you will say 82F.

Switch examples. Three days back, the price of the Briggs Empire stock closed at $52 (there is only one share). Query: what was the cost of the stock at the close of the day?

Signal+noise folks might say $42.50, whereas signal people will say $52.

Another example. I was sitting at the radio AM DXing, pulling in a station from Claxton, Georgia, WCLA 1470 AM. The announcer came on and through the heavy static I thought I heard him give the final digit of a phone number as “scquatch”, or perhaps it was “hixsith”.

Here are two questions: (1) What number did I hear? (2) What number did the announcer say?

The signal+noise folks will hear question (1) but give the answer to (2) (they will answer (2) twice), whereas the signal folks will answer (1) with “scquatch or hixsith”, and answer (2) by saying, “Hey signal+noise guys, a little hand here?”

We have three different “time series”: temperature, stock price, radio audio. It should be obvious that everybody experiences the “numbers” or “values” of each of these series as they happen. If it is 82F outside, you feel the 82F and not another number (and don’t give me grief about fictional “heat indexes”); if the price is $52, that is what you will pay; if you hear “scquatch”, that is what you hear. You do not experience some other value to which ignorable noise has been added.

For any time series (and “any” includes our three), some thing or things caused each value. A whole host of physical states caused the 82 degrees; the mental and monetary states of a host of individuals caused the $52; a man’s voice plus antenna plus myriad other physical states (ionization of certain layers of the atmosphere, etc.) caused “scquatch” to emerge from the radio’s speakers.

In each case, if we knew—really knew—what these causes were, we would not only know the values, which we already knew because we experienced them, but we could predict with certainty what the coming values would be. Yet this list of causes will really only be available in artificial circumstances, such as simulations.

Of the three examples, there was only one in which there was a true signal hidden by “noise”, where noise is defined as that which is not signal. Temperature and stock price were pure signal. But all three are routinely treated in time series analysis as if they were composed of signal+noise. This mistake is caused by the Deadly Sin of Reification.

No model of any kind is needed for temperature and stock price; yet models are often introduced. You will see, indeed it is vanishingly rare not to see, a graph of temperature or price over-plotted with a model, perhaps a running-mean or some other kind of smoother, like a regression line. Funny thing about these graphs, the values will be fuzzed out or printed in light ink, while the model appears as bold, bright, and thick. The implication is always that the model is reality and values a corrupted form of reality. Whereas the opposite is true.

The radio audio needs a model to guess what the underlying reality was given the observed value. We do not pretend in these models to have identified the causes of the reality (of the values), only that the model is conditionally useful in putting probabilities on possible real values. These models are seen as correlational, and nobody is confused. (Actual models, depending on the level of sophistication, may have causal components, but since the number of causes will be great in most applications, these models are still mostly correlational.)

We agreed there will be many causes of temperature and stock price values. One of the causes of temperature is not season—how could the word “autumn” cause a temperature?—though we may condition on season (or date) to help us quantify our uncertainty in values. Season is not a cause, because we know there are causes of season, and that putting “season” (or date) into a model is only a crude proxy for knowledge of these causes.

Given an interest in season, we might display a model which characterizes the average (or some other measure) of uncertainty we might have in temperature values by season (or date), and from this various things might be learned. We could certainly use such a model to predict temperature. We could even say that our 82F was a value so many degrees higher or lower than some seasonal measure. But that will not make the 82F less real.

That 82F was not some “real” seasonal value corrupted by “noise”. It cannot be because season is not a cause: amount of solar insolation, atmospheric moisture content, entrainment of surrounding air, and on and on are causes, but not season.

Meteorologists do attempt a run at causes in their dynamic models, measuring some causes directly and others by proxy and still others by gross parameterization, but these dynamical models do not make the mistake of speaking of signal+noise. They will say the temperature was 82F because of this-and-such. But this will never be because some pure signal was overridden by polluting noise.

The gist is this. We do not need statistical models to tell us what happened, to tell us what values were experienced, because we already know these. Statistical models are almost always nothing but gross parameterization and are thus only useful in making predictions, thus they should only be used to guess the unknown. We certainly do not need them to tell us what happened, and this includes saying whether a “trend” was observed. We need only define “trend” and then just look.
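To make “define and look” concrete, here is a small sketch. Both definitions of “trend” are hypothetical choices of mine, and the daily highs are made up (in keeping with my sloth). The point is only that once a definition is fixed, checking it against the observed values needs no model.

```python
def went_up(values):
    """One possible definition of 'trend': the last value is above the first."""
    return values[-1] > values[0]


def mostly_increasing(values):
    """Another definition: more step-to-step rises than falls."""
    steps = [later - earlier for earlier, later in zip(values, values[1:])]
    rises = sum(1 for s in steps if s > 0)
    falls = sum(1 for s in steps if s < 0)
    return rises > falls


# Made-up daily highs (F) for one week.
highs = [79, 82, 80, 81, 84, 83, 85]

print(went_up(highs))            # True
print(mostly_increasing(highs))  # True
```

Different definitions can disagree on the same values, which is why the definition has to be stated before looking.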

Why carp about this? Because the signal+noise view brings in the Deadly Sin of Reification (especially in stock prices, where everybody is an after-the-fact expert), and that sin leads to the worse sin of over-certainty. And we all know where that leads.

Addendum

“But, Briggs. What if we measured temperature with error?”

Great question. Then we are in the radio audio case, where we want to guess what the real values were given our observation. There will be uncertainty in these guesses, some plus-or-minus to every supposed value. This uncertainty must always be carried “downstream” in all analyses of the values, though it very often isn’t. Guessing temperatures by proxy is a good example.

Summary Against Modern Thought: Our Intellects Are United To Our Bodies

This may be proved in three ways. The first...
See the first post in this series for an explanation and guide of our tour of Summa Contra Gentiles. All posts are under the category SAMT.

Previous post.

We’ve finally done enough groundwork to get to some juicy details.

Chapter 90 That an intellectual substance is united only to a human body as its form (alternate translation) We’re still using the alternate translation.

1 Having shown that a certain intellectual substance—the human soul—is united to a body as its form, we must now inquire whether any intellectual substance is united to any other body as its form. As to the heavenly bodies, we have, indeed, already presented Aristotle’s opinion on the question of their being animated by an intellectual soul, and have observed that Augustine leaves the matter in doubt. Bodies composed of elements, then, should be the focal point of the present inquiry.

2 Now, it is quite clear that an intellectual substance is not united as form to such a body except a human one. For, were it united to a body other than the human, the latter would be either mixed or simple. But it cannot be united to a mixed body, because that body would have to be the most symmetrically structured one of its genus; and it is a fact of observation that mixed bodies have forms so much the more noble, the nearer they come to possessing an equable blending of their constituent parts.

Thus, if the subject of a form of the noblest type, such as an intellectual substance, is a mixed body, it must possess that harmonious quality in the highest degree. And this explains why we find that flesh of fine texture and a keen sense of touch, which reveal evenness of bodily temperament, are signs of mental acuteness. Now, the most evenly tempered body is the human, so that, if an intellectual substance is united to a mixed body, the latter must be of the same nature as the human body; and its form, too, would be of the same nature as the human soul, if it were an intellectual substance. Hence, there would be no specific difference between the animal so constituted and man.

Notes Thus Bushmen and Europeans are all men.

3 It is likewise impossible for an intellectual substance to be united as form to a simple body, such as air, water, fire, or earth. For each of these bodies is of uniform character in the whole and in the parts; a part of air is of the same nature and species as the whole air, having, indeed, the same motion; and so it is with the other simple bodies.

Like movers, however, must have like forms. Therefore, if any part of any one of those bodies—air, for example—is animated by an intellectual soul, then for that very reason the whole air and all its parts will be animated. But this manifestly is not so; for there is no evidence of vital operation in the parts of the air or of other simple bodies. Therefore, a substance of intellectual type is not united as form to any part of the air or of similar bodies.

4 Moreover, if an intellectual substance is united as form to one of the simple bodies, it will either be endowed with an intellect only, or will have other powers such as those that belong to the sensitive or to the nutritive part, as in man. In the first case, there would be no point in its being united to a body. For every corporeal form has some operation proper to itself which is exercised through the body; whereas the intellect has no operation pertaining to the body, except by way of moving it; because understanding is not an operation that can be exercised through any bodily organ, and, for the same reason, neither is the act of the will.

The movements of the elements, moreover, are derived from natural movers, namely, from generators; the elements do not move themselves. Hence, the mere possession of movement on their part does not imply that they are animated. But, if the intellectual substance, hypothetically united to an element or a part of an element, is endowed with other psychic parts, then, since these parts are parts of certain organs, a diversity of organs will necessarily be found in the body of the element. But this is incompatible with its simplicity. An intellectual substance, therefore, cannot possibly be united as form to an element or to a part thereof.

Notes Not for the first, and not for the last, time, we remind the reader that our intellects are not material.

5 There is also the fact that the nearer a body is to prime matter, the less noble it is, being more in potentiality and less in complete act. The elements, however, are nearer than mixed bodies to prime matter, since they are the proximate matter of mixed bodies. Hence, the bodies of the elements are less noble in their specific nature than mixed bodies. Since, then, the nobler form belongs to the nobler body, it is impossible that the noblest form, namely, the intellective soul, should be united to bodies of the elements.

Notes How complex a material body must be to be united to an intellect is, of course, an open question. Our complexity is enough, as observation proves, but was it enough in Neanderthals?

6 Furthermore, if such bodies or any of their parts were animated by souls of the noblest type—the intellective—then the more closely bodies are annexed to the elements, the nearer they must be to life. Yet this evidently is not so, but rather the contrary; for plants have life in a lesser degree than animals, yet they are nearer to earth; and minerals, which are nearer still, have no life at all. Therefore, an intellectual substance is not united as form to an element or to a part thereof.

7 Then, too, extreme contrariety is destructive of life in all corruptible agents; excessive heat or cold, wet or dryness, are fatal to animals and plants. Now, it is in the bodies of the elements especially that we find the extremes of these contraries. So, life cannot possibly exist in them. It is, therefore, impossible for an intellectual substance to be united to them as their form.

Notes This proof won’t be completely satisfactory, but I suppose it does rule out clouds of gas possessing rationality. Sorry, Trekkies.

8 Again, although the elements are incorruptible as a whole, each of their parts is corruptible as having contrariety. So, if some of their parts have cognitive substances united to them, it seems that the power of discerning things corruptive of them will be attributed to them in the highest degree. Now, this power is the sense of touch, which discriminates between hot and cold, and similar contraries; and for this reason, all animals possess that sense, as something necessary for preservation from corruption. But the sense of touch cannot possibly be present in a simple body, since the organ of touch must not contain contraries actually but only potentially; and this is true of mixed and tempered bodies alone. It is, therefore, impossible that any parts of the elements should be animated by an intellective soul.

9 And again, every living body has local motion of some kind through its soul; thus, the heavenly bodies—if in fact they are animated—have circular movement; perfect animals, a progressive movement; shell fish, a movement of expansion and contraction; plants, a movement of increase and decrease; and all these are in some way movements in respect of place. Yet in the elements there is no evidence of any motion deriving from a soul, but only of natural movements. Therefore, the elements are not living bodies.

10 There is, however, another hypothesis, namely, that although an intellectual substance be not united to a body of an element, or to a part thereof, as its form, nevertheless it is united to it as its mover. Now, the former cannot be said of the air; for, since a part of air is not terminable through itself, no determinate part of it can have its own proper movement, by reason of which an intellectual substance may be united to it.

11 Moreover, if an intellectual substance is naturally united to a body as a mover to its proper movable, then the motive power of that substance must be limited to the movable body to which it is united naturally; for in no case does the exercise of the power of a proper mover exceed its proper movable. But it seems ridiculous to say that the power of an intellectual substance does not, in discharging its function of moving, exceed a determinate part of an element, or some mixed body. Seemingly, then, it must not be said that an intellectual substance is in a natural fashion united to an elemental body as its mover, unless it is also united to it as its form.

12 Furthermore, principles other than the intellectual substance can cause the movement of a body composed of elements. Therefore, intellectual substances would not need to be naturally united to such bodies so as to account for this movement.

13 This rules out the opinion of Apuleius and of certain Platonists, who said that “the demons are animals ethereal in body, endowed with reason, passive in soul, and of eternal duration”; as well as the theory of certain heathen thinkers, who, supposing the elements to be animated, instituted divine worship in their honor. Likewise set aside is the opinion of those who say that angels and demons have bodies naturally united to them—bodies of the nature of the higher or lower elements.

Notes We see the same sort of thing these days in people worshipping technology.

Doubting The EM Drive

Figure 19, from the paper.

This article originally appeared on 5 December 2016, but I’m reposting it because interest in the EM Drive has renewed. RT reports, “China’s announcement would put them ahead of NASA in the race for an EM Drive.” Yahoo reports, “A video put out by a Chinese propaganda channel claims that Chinese scientists have a working prototype — something NASA has failed to achieve.” The doubts I express below have increased.

Heard of the EM Drive? EM is for electromagnetic. The idea is that, in an enclosed cone, some microwaves are bounced around, and that this bouncing somehow propels the cone, and presumably whatever is attached to it, forward.

Nothing comes out of the cone, mind. There is no propellant. The cone is sealed tight.

So how does it push, when nothing pushes back against it? As one popular article put it, the EM Drive appears to violate Newton’s third law, which is for every action, there is an equal and opposite reaction. In the EM Drive, there is an action but no apparent reaction. Conservation of momentum is no more. Apparently.

Another name for the machine is the RF resonant cavity thruster. A version of it was put to the test recently by NASA. And it seemed to work.

I have doubts.

The paper (which is free to read) is “Measurement of Impulsive Thrust from a Closed Radio-Frequency Cavity in Vacuum” by Harold White, Paul March, James Lawrence, Jerry Vera, Andre Sylvester, David Brady, and Paul Bailey in the Journal of Propulsion and Power.

You have to read the paper for the introduction and apparatus and experimental description (there is no reason to repeat it here). Many readers of this blog won’t have trouble understanding the gist.

Wrong regression

Finished reading the paper? Let’s jump to the end.

Figure 19 is the summary of results of the tests of the forward and reverse thrust vacuum testing. The large, original version should be consulted instead of the smaller image which leads this post.

Power was varied and force estimated. The red circles are the results of the estimates of force from individual experiments at the given power levels. The purple circles/lines are only averages and can be ignored. A (dashed gray) line was over-plotted, the result of a linear regression of power and estimated force. Technically (and you can ignore this point), the uncertainty in the estimated force should be used in the regression, but it’s not clear they did this. That means the gray line, and subsequent equation Force = 1.16843 x Power will be too certain.

Skip the technicalities and notice something more important. The red circles at powers of about 40 W are tightly clustered and indicate a low level of estimated force. The red circles at powers of about 60 W are much more variable, but do indicate some (not all) higher levels of estimated force. But the red circles at 80-85 W look to be about the same as, with a tad less variability than, the estimated forces at 60 W.

In other words, it appears as if the estimated force tails off, or plateaus, after 60 W. Might the estimated force jump or increase again at, say, 200 W or greater? Sure. It might do anything. But all we have is the data in front of us. And from that, it looks like it levels off.

If that is so—and I emphasize I am only guessing—then there are two things to consider. The first is that the equation reported by the authors of 1.2 ± 0.1 mN/kW isn’t quite right and is far too optimistic. If the force plateaus, then the better statistical estimate of force is roughly 100 micro-Newtons for powers greater than 60 W (with some plus and minus), which is at best more than 10 times smaller than the forces estimated by the regression. (The regression is also optimistic, because power levels didn’t even reach 100 W, let alone kilo-watts.)
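The arithmetic behind that comparison can be sketched as follows. The slope of 1.2 mN/kW is the authors’ reported figure and the 100 micro-Newton plateau is my rough guess from above; the function and constant names are my own, and treating the regression as a straight line through the origin is an assumption based on the reported equation.

```python
SLOPE_MN_PER_KW = 1.2   # slope reported by the authors
PLATEAU_UN = 100.0      # rough plateau guessed at above, in micro-Newtons


def regression_force_un(power_w):
    """Force in micro-Newtons implied by the straight line at a given power in W.

    1 mN/kW is the same as 1 micro-Newton per watt, so the slope carries over.
    """
    return SLOPE_MN_PER_KW * power_w


for power_w in (80, 1000):
    line = regression_force_un(power_w)
    print(power_w, "W:", line, "micro-N from the line,",
          line / PLATEAU_UN, "times the plateau")
```

At 80 W, near the top of the tested range, the line gives about 96 micro-Newtons, close to the plateau. At a kilowatt it gives about 1,200 micro-Newtons, some 12 times the plateau, which is where the two readings of the data part ways.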

But so what. 100 micro-Newtons is still greater than 0 Newtons, and any force north of 0 proves the concept of the EM drive.

That brings us to the second consideration. That leveling off casts suspicion that a form of energy leakage has not been identified. We’d expect greater thrust with greater power levels, but we didn’t see it, which is evidence, but far from conclusive evidence, that something has been missed. We’re talking estimated micro-Newtons here, so it wouldn’t take much leakage to provide the thrusts seen.

Now the force is estimated (via a chain of inference) at first via aluminum electrostatic fins, so it looks like leakage of magnetic field from the cone wouldn’t affect these; but where the aluminum connects, at the circuitry, there could have been induced fields (or maybe the aluminum was dirty or dusty?). And that’s just one of many places to look. But there’s no point me going over possibilities. Let somebody who is better do it. (Perhaps you?)

Error sources

I found this to be the most fascinating part of the paper. The authors took leakage seriously and went to great pains to measure potential errors. But from the language describing some of the sources, you have to wonder if a wee bit of over-confidence snuck in.

The second error is RF interaction with the surrounding environment, which has the potential for possible RF patch charging on the walls of the vacuum chamber interacting with the test article to cause displacement of torsion pendulum. Leaking RF fields are kept very low by ensuring RF connections are tight and confirmed by measuring with an RF leakage meter (levels are kept below a cell phone RF leakage level). Any wall interaction needs to be a well-formed resonance coupling and, because of the high frequency, will be highly sensitive to geometry.

Keeping the RF test article on resonance inside of the frustum volume requires a phase-locked loop system to maintain resonance as the test article expands during operation, so it is not likely that the RF test article can establish and maintain an effective external RF resonance. [Paragraph break mine.]

Well, while it’s true that at these frequencies geometry is important (ask anybody who builds antennas for gigahertz signals), it’s also true these same RF signals show up in the damnedest places. I merely mention this as one example of how one can fool oneself.

To the stars!

I’d be thrilled to learn my doubts were baseless, or were quibbles, and the EM drive worked. But there are more doubts about how the EM drive is supposed to work. Quantum mechanical pilot-waves. If you don’t know about these, you’ll have to wait for another day. But the authors appear to mix up, as most do, what is with our knowledge of what is: the ontic with the epistemic, ontology with epistemology. I’ll save those criticisms, because what is above is enough for now.

Update Forbes is talking about it, and so are others.