Shoutout to Tom English: How much of the animus you display against Marks and Dembski is scholarly?
October 13, 2014 | Posted by News under Conservation of Information, Informatics, Intelligent Design
Why am I asking?
Recently, I wrote a piece for Salvo discussing the Law of Conservation of Information, in which I traced its history:
Dembski [author of Being as Communion, 2014] did not invent the underlying idea of conservation of information. Biologist Peter Medawar (1980s) and computer scientist Tom English (1996) advanced the view that information is not created from scratch but rather is redistributed from existing sources. Robert J. Marks II and his students at Baylor University in Texas have developed the idea in terms of “search,” and their approach has profound consequences for plausible ideas of how evolution occurs, especially when vast claims are made for WEASEL and other “evolution” computer programs.
Not long after, Tom English contacted Salvo to say,
Ms. O’Leary, my 1996 formulation of “search” was needlessly complicated. With simplification, “search” is clearly a process of sampling a set of alternatives (which Dembski and Marks refer to as the sample space). To my huge embarrassment, “conservation of information” turns out to be nothing but obfuscation of statistical independence — a concept that undergraduates encounter early in introductory courses on probability and statistics. There can be no conservation of information in random selection of a sample because there is no information whatsoever. It is absurd to speak of conserving what does not exist.
If samplers have no information about the samples they draw, then how do we account for the fact that sampler (“search”) A is more likely than sampler B to select a sample that includes at least one element of the target (to “hit the target”)? There is not the least mystery here. Samplers differ in their biases. That is the gist of why I was wrong to indicate in 1996 that information somehow resides in samplers, and why Dembski and Marks are wrong to do so today.
The following includes a technical correction of my own errors, but ends with exposition that should make sense to everyone who is able to follow you:
The errors of Dembski and Marks apparently derive from a misunderstanding of the “no free lunch” theorem for search. The following links to an interview in which Marks attempts to explain the theorem in layperson’s terms, and provides an accessible discussion of how he goes awry: Bob Marks grossly misunderstands “no free lunch”
He added a P.S.:
P.S.–Note that much of the misunderstanding is attributable to misnaming. I know that Ms. O’Leary appreciates the powerful impact of language upon thought. If you refer to the process of sample selection as “search,” designate a particular subset of the sample space as the “target,” and say that the selection process “hits the target” when the sample includes an element of the subset, then you will have a very hard time thinking straight about sampling.
As readers will see, I have taken the liberty of copying the whole for the convenience of addressing the issues raised, embedding the links.
Who is Dr. Thomas English?
Now, as Dr. English implies, lay readers such as your post author and perhaps some others are not well placed to understand the theoretical background to the discussion. However, we may be able to identify and appreciate some of the critical issues.
So I went to the links provided. I learned that Tom English is an Oklahoma-based consultant computer scientist, on Blogger since 2011, at Bounded Theoretics. He also has a business page, Bounded Theoretics and a Facebook page. Here is a list of his publications in the relevant discipline, evolutionary computation.
In his biosketch, he notes that
After earning bachelor’s and master’s degrees in psychology and English, respectively, at Mississippi College, and master’s and doctoral degrees in computer science at Mississippi State University, Tom English began investigating evolution in computational processes. He independently proved what came to be known as the “no free lunch” theorem for optimization, and subsequently published six papers related to it. In empirical research, he obtained by computational evolution a predictor of annual sunspots activity that was far more accurate than any previously reported. …
English was an Assistant Professor at Texas Tech University in Lubbock, Texas (1990-98). He did not proceed to tenure, but went into business as the Thomas English Project, now here. He lists a number of professional links.
But at the blog, as opposed to the “professional introduction” Web site, we encounter a different type of information:
I was a teenage creationist. And science was not the silver bullet. What put an end to my howling was a scholarly survey of the Bible and an introduction to philosophy of science, both in my freshman year at a Baptist college. Fifteen years later, I began researching evolutionary computation. Six of my published papers relate to the “no free lunch” theorems for optimization. I became interested in the “intelligent design” variety of creationism (IDC) when one of its leading proponents, William A. Dembski, referred to the theorems, and also bashed evolutionary computation, in No Free Lunch (2002). My peer-reviewed critique of IDC, coauthored by Garry Greenwood, is the opening chapter of Design by Evolution. I have explained here why IDC is bad theology and bad science.
Okay, so what of the accusations against their approach to the Law of Conservation of Information (COI)?
Here is English’s 1996 paper, Evaluation of Evolutionary and Genetic Optimizers: No Free Lunch, published in Evolutionary Programming. At the time, he was at the Computer Science Department at Texas Tech University (Lubbock, TX).
Abstract—The recent “no free lunch” theorems of Wolpert and Macready indicate the need to reassess empirical methods for evaluation of evolutionary and genetic optimizers. Their main theorem states, loosely, that the average performance of all optimizers is identical if the distribution of functions is average. The present work generalizes the result to an uncountable set of distributions. The focus is upon the conservation of information as an optimizer evaluates points. It is shown that the information an optimizer gains about unobserved values is ultimately due to its prior information of value distributions. Inasmuch as information about one distribution is misinformation about another, there is no generally superior function optimizer. Empirical studies are best regarded as attempts to infer the prior information optimizers have about distributions–i.e., to determine which tools are good for which tasks.
The paper was updated as late as 2004. Readers are told to see http://www.TomEnglishProject.com for current information, but the site does not seem to be currently online; the information, as noted above, is now here.
English appears to have more or less retracted his own paper since 1996, as this amended version at his Bounded Theoretics site shows. The Abstract is now heavily edited. A number of pages feature crossouts of the text (explanation appended in the notes below).
That perhaps is the context for his comment at Salvo, “conservation of information” turns out to be nothing but obfuscation of statistical independence—a concept that undergraduates encounter early in introductory courses on probability and statistics.
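Dr. English's point about sampler bias can be illustrated with a toy calculation. What follows is a minimal sketch and is not taken from his post: the ten-element sample space, the weights, and the two targets are hypothetical choices made only for illustration. It computes the exact chance that a single draw lands in the target for an unbiased sampler and for one that favors low-numbered elements.

```python
from fractions import Fraction


def hit_probability(weights, target):
    """Probability that one weighted draw lands on an element of the target."""
    total = sum(weights)
    return Fraction(sum(weights[i] for i in target), total)


# Hypothetical sample space: the integers 0..9.
space_size = 10
uniform = [1] * space_size                              # unbiased sampler
low_biased = [space_size - i for i in range(space_size)]  # favors low indices

target_low = {0, 1}    # hypothetical target among the favored elements
target_high = {8, 9}   # hypothetical target among the disfavored elements

p_uniform_low = hit_probability(uniform, target_low)
p_biased_low = hit_probability(low_biased, target_low)
p_biased_high = hit_probability(low_biased, target_high)

print(p_uniform_low, p_biased_low, p_biased_high)  # 1/5 19/55 3/55
```

The same bias that raises the chance of a hit on the low target (19/55 against 1/5) lowers it on the high target (3/55). Whether that difference is better described as bias or as information is the question in dispute.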
But, whatever the fate of English’s paper, the sense in which Marks and Dembski have used the phrase conservation of information (COI) is well supported by the work of others in the literature.
Here is a similar statement from Harvard mathematician Yu-Chi Ho about the related No Free Lunch theorem (NFLT):
… unless you can make prior assumptions about the … [problems] you are working on, then no search strategy, no matter how sophisticated, can be expected to perform better than any other. 
The term “No Free Lunch” itself was coined by Wolpert and Macready (1997), who write that search can be improved only by “…incorporating problem-specific knowledge into the behavior of the [optimization or search] algorithm.”
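The claim can be checked by brute force on a very small problem. The sketch below is an illustration under stated assumptions, not code from any of the cited papers: the three-point domain, the 0/1 values, the three search strategies, and the "best value found within the budget" score are all hypothetical choices. It averages each strategy's score over every possible function, then repeats the comparison on a restricted set of functions that stands in for prior assumptions about the problem.

```python
from fractions import Fraction
from itertools import product

POINTS = (0, 1, 2)
# Every function from POINTS to {0, 1}, stored as a tuple of values (8 in total).
ALL_FUNCTIONS = list(product((0, 1), repeat=len(POINTS)))


def fixed_order(order):
    """Build a search that evaluates points in a fixed, non-repeating order."""
    def search(f, budget):
        return [f[x] for x in order[:budget]]
    return search


def adaptive(f, budget):
    """Evaluate point 0 first, then pick the next point based on the value seen."""
    seen = [f[0]]
    remaining = (1, 2) if f[0] == 1 else (2, 1)
    for x in remaining:
        if len(seen) >= budget:
            break
        seen.append(f[x])
    return seen


def average_best(search, functions, budget):
    """Average, over the given functions, of the best value found within budget."""
    total = sum(max(search(f, budget)) for f in functions)
    return Fraction(total, len(functions))


# Averaged over all functions, every strategy scores the same.
no_prior = [
    average_best(fixed_order((0, 1, 2)), ALL_FUNCTIONS, budget=2),
    average_best(fixed_order((2, 0, 1)), ALL_FUNCTIONS, budget=2),
    average_best(adaptive, ALL_FUNCTIONS, budget=2),
]

# Restrict to functions known to take the value 1 at point 2 (a prior assumption).
restricted = [f for f in ALL_FUNCTIONS if f[2] == 1]
informed = average_best(fixed_order((2, 0, 1)), restricted, budget=1)
uninformed = average_best(fixed_order((0, 1, 2)), restricted, budget=1)

print(no_prior, informed, uninformed)
```

Over all eight functions, each strategy averages 3/4. Once the set of functions is restricted, the strategy that looks at point 2 first averages 1 while the other averages 1/2, which is the sense in which only assumptions about the problem separate one search from another.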
Carnegie-Mellon computer scientist Tom Mitchell seems to have originated the COI model in 1980, though he did not call it that. He noted that, in order for computer programs to learn, programmers must insert their own bias:
If consistency with the training instances is taken as the sole determiner of appropriate generalizations, then a program can never make the inductive leap necessary to classify instances beyond those it has observed. Only if the program has other sources of information, or biases for choosing one generalization over the other, can it non-arbitrarily classify instances beyond those in the training set. 
Cullen Schaffer (1994) later made a similar observation, a principle he called a “conservation law for generalization performance.” He compared a learning program that learns well regardless of circumstances to a perpetual motion machine (a machine that is impossible under the law of conservation of energy): “… a learner [without prior knowledge] … that achieves at least mildly better-than-chance performance … is like a perpetual motion machine.”
What Marks & Dembski did in “Conservation of Information in Search” is to quantify the degree of information infused into a search algorithm when “problem-specific knowledge” (Wolpert and Macready, 1997) is used. To take a simple example, a kid at an Easter egg hunt has a better chance of finding an egg if an adult is shouting “Warmer. You’re getting warmer!” than if the adult were silent. The probabilities of success with problem-specific knowledge and no problem-specific knowledge are combined into a measure Dembski & Marks call active information. The more problem-specific knowledge that is successfully applied, the greater the active information. And if someone imposes faulty knowledge (yelling “Warmer!” when the kid is getting colder), the active information is negative. This, basically, is how they use the law of conservation of information when assessing evolutionary search programs.
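For readers who want to see the arithmetic, here is a minimal sketch. It assumes the definition as I understand it from the cited paper, namely that active information is the base-2 logarithm of the ratio of the assisted success probability to the unassisted one; the post above does not state the formula, so treat that as an assumption. The function name and the Easter egg probabilities are hypothetical.

```python
import math


def active_information(p_unassisted, q_assisted):
    """Active information in bits, assumed here to be log2(q / p).

    p_unassisted: chance of success without problem-specific knowledge.
    q_assisted:   chance of success with it.
    """
    if not (0 < p_unassisted <= 1 and 0 < q_assisted <= 1):
        raise ValueError("probabilities must lie in (0, 1]")
    return math.log2(q_assisted / p_unassisted)


# Hypothetical egg hunt: a silent adult leaves the kid a 1-in-16 chance.
p_silent = 1 / 16

helpful = active_information(p_silent, 1 / 2)      # good hints raise the odds
useless = active_information(p_silent, 1 / 16)     # hints that change nothing
misleading = active_information(p_silent, 1 / 64)  # "Warmer!" when colder

print(helpful, useless, misleading)  # 3.0 0.0 -2.0
```

On these made-up numbers, good hints are worth 3 bits, hints that change nothing are worth 0 bits, and misleading hints come out at -2 bits, matching the description of negative active information above.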
With respect to Dr. English’s P.S., don’t Lenski et al. use a search model in the computer program AVIDA (see their Nature article “The evolutionary origin of complex features”)? What about the computer evolution simulation EV? I am not sure the word choices have so much to do with the power of language (rhetoric?) as with the conventional use of terminology.
But now, about that Wikipedia entry…
Now I must raise a sensitive topic: It’s not clear that all the animus here is scholarly. Apart from the tone of the blog intro noted above, two other issues are worth noting:
Dr. English wrote a chapter with G. W. Greenwood, “Intelligent Design and Evolutionary Computation,” the first chapter in Design by Evolution: Advances in Evolutionary Design, edited by Philip F. Hingston, Luigi C. Barone, and Zbigniew Michalewicz (2008). There, his evolutionary computation arguments are introduced by a frankly political argument. Here’s the first paragraph:
In the United States, a succession of lost legal battles forced opponents of public education in evolution to downgrade their goals repeatedly. By the 1980’s, evolution was ensconced in the biology curricula of public schools, and references to the creator of life were illegal. The question of the day was whether instruction in creation, without reference to the creator, as an alternative explanation of life violated the constitutional separation of church and state. In 1987, the U.S. Supreme Court decided that it did, and intelligent design (ID) rose from the ashes of creation science. ID may be seen as a downgraded form of creation. While the creation science movement sought to have biology students introduced to the notion that creation is evident in the complexity of living things, the ID movement sought to have students introduced to the notion that design, intelligence, and purpose are evident. ID preserves everything in the notion of creation but the making.
Suppose all of this is true. It is nonetheless irrelevant to the question of whether Marks & Dembski are correct about the limits COI imposes on Darwinian evolution.
While the intent may have been to apprise readers of why the subject of the chapter is important to them, the net effect is to raise the question of whether the material will be handled in an intellectually responsible way. For that reason, most scholars avoid mingling their political opposition to a social movement with computational reasoning as to why some of its assertions are incorrect.
The second issue is that Dr. English has been subject to a number of disciplinary actions at Wikipedia for attempted edits to the bio entry for Marks.
In this context, it is perhaps relevant that he says in one post at his blog, “I’ve come to see Marks as the quintessential late-career jerk,” also admitting (July 29, 2010),
The reason I come off as a nasty bastard on this blog is that I harbor quite a bit of anger toward the creationist bastards who duped me as a teenager. The earliest stage of overcoming my upbringing was the worst time of my life. I wanted to die. Consequently, I am deadly serious in my opposition to “science-done-right proves the Bible true” mythology. William A. Dembski provokes me especially with his prevarication and manipulation. He evidently believes that such behavior is moral if it serves higher ends in the “culture war.” My take is, shall we say, more traditional.
This is hardly the most neutral or, dare we say, productive mindset for an editor of a biography, even at Wikipedia.
One hopes that further critical review of Marks and Dembski’s papers (and Dembski’s Being as Communion) focuses on the issues at hand and takes into account the usual use of terminology in the field.
 Peter Medawar (1915–1987) was a Nobelist (Physiology or Medicine, 1960). Here is a statement of COI from a book by Jan Kahre, The Mathematical Theory of Information, which attributes the concept to Medawar. To judge from his 1988 book, The Limits of Science, Medawar was very much a believer in science as conventionally understood, and in the fact of evolution. He also chaired the mathematics-heavy Wistar Conference in 1966, at which he spoke about mathematical challenges to current Darwinian theory:
“[T]he immediate cause of this conference is a pretty widespread sense of dissatisfaction about what has come to be thought as the accepted evolutionary theory in the English-speaking world, the so-called neo-Darwinian Theory. … There are objections made by fellow scientists who feel that, in the current theory, something is missing … These objections to current neo-Darwinian theory are very widely held among biologists generally; and we must on no account, I think, make light of them. The very fact that we are having this conference is evidence that we are not making light of them.” (Sir Peter Medawar, “Remarks by the Chairman,” in Mathematical Challenges to the Neo-Darwinian Interpretation of Evolution (Wistar Institute Press, 1966, No. 5), p. xi)
 Thomas Milford English “Evaluation of Evolutionary and Genetic Optimizers: No Free Lunch.” In Evolutionary Programming, pp. 163-169. 1996. Available online:
I was not able to reproduce it here visually, but the bracketed material has replaced material in the Abstract published in the book. So this is how the Abstract now reads:
The recent “no free lunch” theorems of Wolpert and Macready indicate the need to reassess empirical methods for evaluation of evolutionary and genetic optimizers. Their main theorem states, loosely, that the average performance of all optimizers is identical if the distribution of functions is average. [An “optimizer” selects a sample of the values of the objective function. Its “performance” is a statistic of the sample.] The present work generalizes the result to an uncountable set of distributions. The focus is upon the conservation of information as an optimizer evaluates points [statistical independence of the selection process and the selected values]. It is shown that the information an optimizer gains about unobserved values is ultimately due to its prior information of value distributions. [The paper mistakes selection bias for prior information of the objective function.] Inasmuch as information about one distribution is misinformation about another, there is no generally superior function optimizer. Empirical studies are best regarded as attempts to infer the prior information optimizers have about distributions [match selection biases to constrained problems]–i.e., to determine which tools are good for which tasks.
 William A. Dembski and Robert J. Marks II, “Conservation of Information in Search: Measuring the Cost of Success,” IEEE Transactions on Systems, Man and Cybernetics A, Systems & Humans, vol. 39, no. 5, September 2009, pp. 1051-1061.
 Yu-Chi Ho, D.L. Pepyne; “Simple explanation of the no free lunch theorem of optimization,” Proceedings of the 40th IEEE Conference on Decision and Control, 2001. pp.4409 – 4414 ; Yu-Chi Ho, Qian-Chuan Zhao, Pepyne, D.L., “The no free lunch theorems: complexity and security,” IEEE Transactions on Automatic Control, Volume 48, Issue 5, May 2003 pp. 783 – 793.
 David H. Wolpert, William G. Macready, “No free lunch theorems for optimization,” IEEE Trans. Evolutionary Computation 1(1): 67-82 (1997).
 T. M. Mitchell, “The need for biases in learning generalizations,” Technical Report CBM-TR-117, Department of Computer Science, Rutgers University (1980), p. 59. Reprinted in Readings in Machine Learning, edited by J.W. Shavlik and T.G. Dietterich, Morgan Kaufmann Series in Machine Learning, 1990, pp. 184-190.
 Cullen Schaffer, “A conservation law for generalization performance,” in Proc. Eleventh International Conference on Machine Learning, W. W. Cohen and H. Hirsh, eds. San Francisco: Morgan Kaufmann, 1994, pp. 259-265.
See also: William Dembski, Responding to My Talk at the University of Chicago, Joe Felsenstein’s Argument by Misdirection, for a related discussion.