Was Darwin Wrong?

Or have critics – and some fans – missed the point?

Christopher Booker is a contrarian English journalist who writes extensively on science-related issues.  He has produced possibly the best available critical review of the anthropogenic global warming hypothesis. He has cast justifiable doubt on the alleged ill effects of low-level pollutants like airborne asbestos and second-hand tobacco smoke.

Booker has also lobbed a few hand-grenades at Darwin’s theory of evolution.  He identifies a real problem, but his criticism misses a point which is also missed even by some Darwin fans.

Is anti-Darwin ‘politically incorrect’?

In a 2010 article, Booker was reacting to a seminar of Darwin skeptics, many very distinguished in their own fields.  These folk had faced hostility from the scientific establishment which seemed to Booker excessive or at least unfair. Their discussion provided all the ingredients for a conspiracy novel:

[T]hey had come up against a wall of hostility from the scientific establishment. Even to raise such questions was just not permissible. One had been fired as editor of a major scientific journal because he dared publish a paper sceptical of Darwin’s theory. Another, the leading expert on his subject, had only come lately to his dissenting view and had not yet worked out how to admit this to his fellow academics for fear that he too might lose his post.

The problem was raised at an earlier conference:

[A] number of expert scientists came together in America to share their conviction that, in light of the astonishing intricacies of construction revealed by molecular biology, Darwin’s gradualism could not possibly account for them. So organizationally complex, for instance, are the structures of DNA and cell reproduction that they could not conceivably have evolved just through minute, random variations. Some other unknown factor must have been responsible for the appearance of these ‘irreducibly complex’ micromechanisms, to which they gave the name ‘intelligent design’. [my emphasis]

I am a big fan of Darwin. I also have respect for Booker’s skepticism.  The contradiction can be resolved if we look more carefully at what we know now – and at what Darwin actually said.

The logic of evolution

There are three parts to the theory of evolution:

  1. The fact of evolution itself. The fact that the human species shares common ancestors with the great apes.  The fact that there is a phylogenetic “tree of life” which connects all species, beginning with one or a few ancestors who successively subdivided or became extinct in favor of a growing variety of descendants.  Small divergences became large ones as one species gave rise to two and so on.
  2. Variation: the fact that individual organisms vary – have different phenotypes, different physical bodies and behaviors – and that some of these individual differences are caused by different genotypes, so are passed on to descendants.
  3. Selection: the fact that individual variants in a population will also vary in the number of viable offspring to which they give rise. If number of offspring is correlated with some heritable characteristic – if particular genes are carried by a fitter phenotype – then the next generation may differ phenotypically from the preceding one.
    Notice that in order for selection to work, at every stage the new variant must be more successful than the old.

An example: Rosemary and Peter Grant looked at birds on the Galapagos Islands.  They studied populations of finches, and noticed surprisingly rapid increases in beak size from year to year. The cause was weather changes which changed the available food for a few years from easy- to hard-to-crack nuts.  Birds with larger beaks were more successful in getting food and in leaving descendants.  Natural selection operated amazingly quickly, leading to larger average beak size within just a few years.  Bernard Kettlewell observed a similar change, over a slightly longer term, in the color of the peppered moth in England.  As tree bark changed from light to dark to light again as industrial pollution waxed and waned over the years, so did the color of the moths. There are several other “natural experiments” that make this same point.

None of the serious critics of Darwinian evolution seems to question evolution itself, the fact that organisms are all related and that the living world has developed over many millions of years.  The idea of evolution preceded Darwin. His contribution was to suggest a mechanism, a process – natural selection – by which evolution comes about.  It is the supposed inadequacy of this process that exercises Booker and other critics.

Looked at from one point of view, Darwin’s theory is almost a tautology, like a theorem in mathematics:

  1. Organisms vary (have different phenotypes).
  2. Some of this variation is heritable, passed from one generation to the next (have different genotypes).
  3. Some heritable variations (phenotypes) are fitter (produce more offspring) than others because they are better adapted to their environment.
  4. Ergo, each generation will be better adapted than the preceding one. Organisms will evolve.

Expressed in this way, Darwin’s idea seems self-evidently true.  But the simplicity is only apparent.

The direction of evolution

Darwinian evolution depends on not one but two forces: selection, the gradual improvement from generation to generation as better-adapted phenotypes are selected; and variation: the set of heritable characteristics that are offered up for selection in each generation.  This joint process can be progressive or stabilizing, depending on the pattern of variation.  Selection/variation does not necessarily produce progressive change.  This should have been obvious, for a reason I describe in a moment.

The usual assumption is that among the heritable variants in each generation will be some that fare better than average.  If these are selected, then the average must improve, the species will change – adapt better – from one generation to the next.

But what if variation only offers up individuals that fare worse than the modal individual?  These will all be selected against and there will be no shift in the average; adaptation will remain as before.  This is called stabilizing selection and is perhaps the usual pattern.  Stabilizing selection is why many species in the geological record have remained unchanged for many hundreds of thousands, even millions, of years.  Indeed, a forerunner of Darwin, the Scot James Hutton (1726-1797), the ‘father of geology’, came up with the idea of natural selection as an explanation for the constancy of species.  The difference – progress or stasis – depends not just on selection but on the range and type of variation.
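The contrast between progressive and stabilizing selection can be made concrete with a toy model.  The sketch below is an illustration only, not a model taken from the literature.  It assumes that fitness is a single integer trait, that variation offers a fixed set of offsets from the current type in each generation, and that selection simply keeps the fittest candidate.  The function name `evolve` and the particular offsets are hypothetical.

```python
def evolve(start, offsets, generations):
    """Toy selection/variation loop (assumed model, for illustration).

    start       -- fitness of the initial modal type (an integer)
    offsets     -- the variation on offer: each variant differs from
                   the current type by one of these amounts
    generations -- number of rounds of variation followed by selection
    """
    current = start
    history = [current]
    for _ in range(generations):
        # Variation proposes: variants around the current type.
        variants = [current + d for d in offsets]
        # Selection disposes: keep the fittest of parent and variants.
        current = max([current] + variants)
        history.append(current)
    return history


# Variation that includes at least one improvement: progressive change.
progressive = evolve(10, [-2, -1, 1], 5)
# Variation that offers only worse types: stasis.
stabilizing = evolve(10, [-2, -1], 5)

print(progressive)  # [10, 11, 12, 13, 14, 15]
print(stabilizing)  # [10, 10, 10, 10, 10, 10]
```

The selection rule is identical in the two runs; only the variation on offer differs, and that alone decides between progress and stasis.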

The structure of variation

Darwin’s process has two parts: variation is just as important as selection.  Indeed, without variation, there is nothing to select. But, like many others, Richard Dawkins, a Darwinian fundamentalist, puts all weight on selection: “natural selection is the force that drives evolution on,” says Dawkins in one of his many TV shows.  Variation represents “random mistakes” and the effect of selection is like “modelling clay”.  Like Christopher Booker, he seems to believe that natural selection operates on small, random variations.

Critics of evolution simply find it hard to believe that the complexity of the living world can all be explained by selection from small, random variations.  Darwin was very well aware of the problem: “If it could be demonstrated that any complex organ existed which could not possibly have been formed by numerous, successive, slight modifications, my theory would absolutely break down.” [Origin]  But he was being either naïve or disingenuous here.  He should surely have known that outside the realm of logic, proving a negative, proving that you can’t do something, is next to impossible.  Poverty of imagination is not disproof!

Darwin was concerned about the evolution of the vertebrate eye: focusing lens, sensitive retina and so on.  How could the bits of an eye evolve and be useful before the whole perfect structure has evolved?  He justified his argument by pointing to the wide variety of primitive eyes in a range of species that lack many of the elements of the fully-formed vertebrate eye but are nevertheless better than the structures that preceded them.

There is general agreement that the focusing eye could have evolved in just the way that Darwin proposed.  But there is some skepticism about many other extravagances of evolution: all that useless patterning and behavior associated with sexual reproduction in bower birds and birds of paradise, the unnecessary ornamentation of the male peacock and many other examples of apparently maladaptive behavior associated with reproduction, even human super-intelligence – we seem to be much smarter than we needed to be as hunter-gatherers.  The theory of sexual selection was developed to deal with cases like these, but it must be admitted that many details are still missing.

The fundamental error in Booker’s criticism of Darwin, as well as in Dawkins’ celebration of him, is the claim that evolution always occurred “just through [selection of] minute, random variations”.  Selection, natural or otherwise, is just a filter.  It creates nothing.  Variation proposes, selection just disposes.  All the creation is supplied by the processes of variation.  If variation is not totally random or always small in extent, if it is creating complex structures, not just tiny variations in existing structures, then it is doing the work, not selection.

Non-random variation

In Darwin’s day, nothing was known about genetics.  He saw no easy pattern in variation, but was impressed by the power of selection, which was demonstrated in artificial selection of animals and crops.  It was therefore reasonable and parsimonious for him to assume as little structure in variation as possible.  But he also discussed many cases where variation is neither small nor random.  So-called “sporting” plants are  examples of quite large changes from one generation to the next, “that is, of plants which have suddenly produced a single bud with a new and sometimes widely different character from that of the other buds on the same plant.” What Darwin called correlated variation is an example of linked, hence non-random, characteristics.  He quotes another distinguished naturalist writing that “Breeders believe that long limbs are almost always accompanied by an elongated head” and “Colour and constitutional peculiarities go together, of which many remarkable cases could be given among animals and plants.”  Darwin’s observation about correlated variation has been strikingly confirmed by a long-term Russian experiment with silver foxes selectively bred for their friendliness to humans.  After several generations, the now-friendly animals began to show many of the features of domestic dogs, like floppy ears and wagging tails.

“Monster” fetuses and infants with characters much different from normal have been known for centuries.  Most are mutants and they show large effects.  But again, they are not random.  It is well known that some inherited deformities, like extra fingers and limbs or two heads, are relatively common, but others – a partial finger or half a head – are rare to non-existent.

Most monsters die before or soon after birth.  But once in a very long while such a non-random variant may turn out to succeed better than the normal organism, perhaps lighting the fuse to a huge jump in evolution like the Cambrian explosion.  Stephen Jay Gould publicized George Gaylord Simpson’s “tempo and mode in evolution” as punctuated equilibrium, to describe the sometimes sudden shift from stasis to change in the history of species evolution.  Sometimes these jumps  may result from a change in selection pressures.  But some may be triggered by an occasional large monster-like change in phenotype with no change in the selection environment.

The kinds of phenotypic (observed form) variation that can occur depend on the way the genetic instructions in the fertilized egg are translated into the growing organism.  Genetic errors (mutations) may be random, but the phenotypes to which they give rise are most certainly not.  It is the phenotypes that are selected not the genes themselves.  So selection operates on a pool of (phenotypic) variation that is not always “small and random”.

Even mutations themselves do not in fact occur at random.  Recurrent mutations occur more frequently than others, so would resist any attempt to select them out.  There are sometimes links between mutations so that mutation A is more likely to be accompanied by mutation B (“hitchhiking”) and so on.

Is there structure to variation?

An underlying mystery remains: just how is the information in the genes translated during development into the adult organism?  How might one or two modest mutations sometimes result in large structured changes in the phenotype?  Is there any directionality to such changes?  Is there a pattern?  Some recent studies of the evolution of African lake fish suggest that there may be a pre-determined pattern. Genetically different cichlid fish in different lakes have evolved to look almost identical.  “In other words, the ‘tape’ of cichlid evolution has been run twice. And both times, the outcome has been much the same.” There is room, in other words, for the hypothesis that natural selection is not the sole “driving force” in evolution.  Some of the process, at least, may be pre-determined.

The laws of development (ontogenesis), if laws there be, still elude discovery. But the origin of species (phylogenesis) surely depends as much on them as on selection.  Perhaps these largely unknown laws are what Darwin’s critics mean by ‘intelligent design’?  But if so, the term is deeply unfortunate because it implies that evolution is guided by intention, by an inscrutable agent, not by impersonal laws.  As a hypothesis it is untestable.  Darwin’s critics are right to see a problem with “small, random variation” Darwinism.  But they are wrong to insert an intelligent agent as a solution and still claim they are doing science. Appealing to intelligent design just begs the question of how development actually works. It is not science, but faith.

Darwin’s theory is not wrong. As he knew, but many of his fans do not, it is incomplete.  Instead of paying attention to the gaps, and seeking to fill them, these enthusiasts have provided a straw man for opponents to attack.  Emboldened by its imperfections, those opponents have proposed as an alternative ‘intelligent design’: an untestable non-solution that blocks further advance.  Darwin was closer to the truth than his critics – and closer than some simple-minded supporters.

––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––

John Staddon is James B. Duke Professor of Psychology and Professor of Biology, Emeritus, at Duke University. Recent books are Adaptive Behavior and Learning (2nd edition, Cambridge University Press, 2016) and Scientific Method: How science works, fails to work or pretends to work (Routledge, 2017).

A study in perception: Feelings cause…feelings

Statistical correlations and thousands of subjects are not enough

The #MeToo movement has taken off and so have the bad effects attributed to anything from mildly disagreeable or misperceived ‘microaggressions’ to physical assault.  Naturally, there is a desire among socially concerned scientists to study the issue. Unfortunately, it is tough to study the effects of a bad social environment. You can’t do experiments – vary the environment and look at the effect – and feelings are not the same thing as verifiable data. But the pressure to demonstrate scientifically what many ‘know’ to be true is irresistible. The result is a plethora of supposedly scientific studies, using methods that pretend to prove what they in fact cannot. Here is a recent example.

“Recent social movements such as the Women’s March, #MeToo, [etc.] draw attention to the broad spectrum gender-related violence that is pervasive in the United States and around the world”, the authors claim in a May 5 op-ed in the Raleigh News and Observer. The title of their study is: “Discrimination, Harassment, and Gendered Health Inequalities: Do Perceptions of Workplace Mistreatment Contribute to the Gender Gap in Self-reported Health?”  It captures in one place some of the worst errors that have crept into social science in recent decades: correlations treated as causes, and subjective judgement treated as objective data.  This study even manages to combine the two: subjective judgments are treated as causes of…subjective judgments.

The article, in the Journal of Health and Social Behavior, is based on reports from 5579 respondents collected in three surveys in 2006, 2010 and 2014. The report applies a battery of statistical tests (whose assumptions are never discussed) to people’s answers to questions about how they feel about mental and physical health, gender, age and racial discrimination, sexual and other harassment.  The large number of subjects just about guarantees that some ‘statistically significant’ correlations will be found.

The study looks at two sets of subjective variables – self-reports – and associates them in a way that will look like cause-effect to most readers.  But the link between these two sets is not causal – no experiment was done or could be done – but a statistical correlation.

Did the authors check to see if self-reports (by “economically active respondents” healthy enough to answer a survey) are reliable predictors of actual, physical health? No, they did not. Their claim that self-reports give an accurate picture of health is inconsistent even with data they do report: “In general, studies show that men report better self-rated health than women…[self-report] is nonetheless an important dimension of individuals’ well-being and is strongly correlated with more ‘objective’ indicators of health, including mortality.” Er, really, given that women live longer than men but (according to the authors) report more ill-health? And why the ‘scare’ quotes around ‘objective’?

The authors’ long, statistics-stuffed report is full of statements like “Taken together, these studies suggest that perceptions of gender discrimination, sexual harassment, and other forms [of] workplace mistreatment adversely affect multiple dimensions of women’s health.[my emphasis]” So, now perceptions (of gender discrimination) affect [i.e., cause] not mere perceptions but “multiple dimensions” of women’s health.  Unfortunately, these “multiple dimensions” include no actual, objective measures of health.  In other words, this study has found nothing – because finding a causal relation between one ‘perception’ and another is essentially impossible, and because a health study should be about reality, not perceived reality.

The main problem with this and countless similar studies is that although they usually avoid saying so directly, the authors treat a correlation between A and B as the same as A causes B.  Many, perhaps most, readers of the report will conclude that women’s bad experiences are a cause of their bad mental and physical health.  That may well be true, but not because of this study. We have absolutely no reason to believe either that people’s self-reports are accurate reflections of reality or, more importantly, that a correlation is guaranteed to be a cause. Even if these self-reports are accurate, it is impossible to conclude that one causes the other: either that feeling harassed causes sickness, or that feeling sick makes you feel harassed.

Studies like this are nothing but “noise” tuned to prevailing opinion. They overwhelm the reader with impressive-sounding statistics which are never discussed. They mislead and muddle.

The periodical The Week has a column called “Health Scare of the Week”; that is where items like this belong, not on the editorial pages – or in a scientific journal.

Is this why so many NHST studies fail to replicate?

Most ‘significant’ results occur on the first try

Leif Nelson has a fascinating blog on the NHST method and statistical significance and the chance of a false positive.  The question can be posed in the following way: Suppose 100 labs begin the same bad study, i.e., a study involving variables that in fact have no effect. Once a lab gets a “hit”, it stops trying. If the chosen significance level is p (commonly p = 0.05), then approximately 5 of the 100 labs will, by chance, get a “hit”, a significant result, on the first try.  If the remaining 95 labs attempt to replicate, again a fraction between 4 and 5 will “hit” – and so on.  So, the number of ‘hits’ is a declining (exponential) function of the number of trials – even though the chance of a hit is constant, trial-by-trial.

The reason for the trial-by-trial decline, of course, is that every lab has an opportunity for a hit on trial 1, but only a smaller fraction, 1-p = 0.95, has a chance at a second trial, and so on. The hit probability per opportunity remains constant at p.  The average number of trials per hit is 1/p = 20 in this case. But the modal number is just one, because the opportunity is maximal on the first trial.
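The arithmetic can be checked with a few lines of code.  This is a minimal sketch, assuming that each lab stops at its first hit and that trials are independent with the same hit probability p, so the chance that a lab’s first hit falls on trial k is (1-p) to the power k-1, times p.  The function name and the cut-off of 2000 trials are arbitrary choices made for the illustration.

```python
def first_hit_distribution(p, max_trials):
    """Probability that a lab's first 'hit' falls on trial k, k = 1..max_trials.

    A lab reaches trial k only after k-1 misses, so the probability is
    (1 - p) ** (k - 1) * p.
    """
    return [(1 - p) ** (k - 1) * p for k in range(1, max_trials + 1)]


p = 0.05
dist = first_hit_distribution(p, 2000)  # 2000 trials: the tail beyond is negligible

# Expected number of the 100 labs whose first hit is on trials 1, 2 and 3.
expected_labs = [round(100 * q, 2) for q in dist[:3]]
# The trial on which a first hit is most likely (the mode).
mode_trial = dist.index(max(dist)) + 1
# The average number of trials needed to get a hit (the mean).
mean_trial = sum(k * q for k, q in enumerate(dist, start=1))

print(expected_labs)         # [5.0, 4.75, 4.51]
print(mode_trial)            # 1
print(round(mean_trial, 3))  # 20.0
```

The expected counts fall from trial to trial even though p never changes, and the mode (trial 1) sits far below the mean (20 trials).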

On the other hand, the more trials are carried out, the more likely that there will be a ‘hit’ – this even though the maximum number (but not probability) of hits is on the first trial.  To see this, imagine running the hundred experiments for, say, 10 repeats each. The probability of non-significance on trial 1 is $1-p = 1-0.05 = 0.95$; on both trials 1 and 2, $(1-p)^2$; on all of trials 1 to 3, $(1-p)^3$; and so on.  These trials are independent, so the probability of failure, no ‘hit’ from trials 1 through N, is $(1-p)^N$. The probability of success, a ‘hit’ somewhere from trial 1 to trial N, is the complement of that:

$$P(\text{‘hit’} \mid N) = 1-(1-p)^N,$$

which is an increasing, not a decreasing, function of N. In other words, even though most false positives occur on the first trial (because opportunities are then at a maximum), it is also true that the more trials are run, the more likely one of them will be a false positive.
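A short numerical check of this point, as a sketch under the same assumption of independent trials with a constant false-positive probability p.  The function name is made up for the illustration.

```python
def prob_at_least_one_hit(p, n):
    """Probability of one or more false positives in n independent trials."""
    return 1 - (1 - p) ** n


p = 0.05
for n in (1, 5, 10, 20):
    # Chance that a lab running n trials sees at least one 'hit'.
    print(n, round(prob_at_least_one_hit(p, n), 3))
```

With p = 0.05, ten tries give a chance of about 0.40 of at least one false positive, and twenty tries about 0.64.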

But Leif Nelson is undoubtedly correct that it is those 5% that turned up ‘heads’ on the very first try that are so persuasive, both to the researcher who gets the result and the reviewer who judges it.

Adaptive Behavior and Learning

This site is about behaviorism, a philosophical movement critical of the idea that the contents of consciousness are the causes of behavior.  The vast, inaccessible ‘dark matter’ of the unconscious is responsible for recollection, creativity and that ‘secret planner’ whose hidden motives sometimes overshadow conscious will.   But early behaviorism went too far in its attempts to simplify.  ‘Thought’ is not just covert speech.  B. F. Skinner’s claim that “Theories of learning are [not] necessary” is absurd.   The new behaviorism proposes simple, testable processes that can summarize the learned and instinctive adaptive behavior of animals and human beings.

Sourcebooks:

The New Behaviorism

Adaptive Behavior and Learning

Where operant conditioning went wrong

Operant conditioning is B. F. Skinner’s name for instrumental learning, for learning by consequences.  Not a new idea, of course.  Humanity has always known how to teach children and animals by means of reward and punishment.  What gave Skinner’s label the edge was his invention of a brilliant method of studying this kind of learning in individual organisms.  The Skinner box and the cumulative recorder were an unbeatable duo.

Three things have prevented the study of operant conditioning from developing as it might have: a limitation of the method, over-valuing order and distrust of theory.

The method.  The cumulative record was a fantastic breakthrough in one respect: it allowed the behavior of a single animal to be studied in real time.  Until Skinner, the data of animal psychology consisted largely of group averages – how many animals in group X or Y turned left vs. right in a maze, for example.  And not only were individual animals lost in the group, so were the actual times – how long did the rat in the maze take to decide, how fast did it run?  What did it explore before deciding?

But the Skinner-box setup is also limited – to a single response and to changes in its rate of occurrence.  Operant conditioning involves selection from a repertoire of activities: the trial bit of trial-and-error.  The Skinner-box method encourages the study of just one or two already-learned responses.  Of the repertoire, that set of possible responses emitted for “other reasons” – of all those possible modes of behavior lurking below threshold but available to be selected – of those covert responses, so essential to instrumental learning, there is no mention.

Too much order? The second problem is an unexamined respect for what might be called “order at any price”.  Fred Skinner frequently quoted Pavlov: “control your conditions and you will see order.”   But he never said just why “order” in and of itself is desirable.

The easiest way to get order, to reduce variation, is, of course, to take an average.  Skinnerian experiments involve single animals, so the method discourages averaging across animals.  But why not average all those pecks?  Averaging responses was further encouraged by Skinner’s emphasis on probability of response as the proper dependent variable for psychology.  So the most widely used datum in operant psychology is response rate, the number of responses that occur over a time period of minutes or hours.

Another way to reduce variability is negative feedback.  A thermostatically controlled HVAC system reduces the variation in house temperature.  Any kind of negative feedback will reduce variation in the controlled variable.  Operant conditioning, almost by definition, involves feedback.  The more the organism responds, the more reward it gets – subject to the constraints of whatever reinforcement schedule is in effect.  This is positive feedback.  But the most-studied operant choice procedure – the concurrent variable-interval schedule – also involves negative feedback.  When the choice is between two variable-interval schedules, the more time is spent on one choice the higher the payoff probability for switching to the other.  So no matter the difference in payoff rates for the choices, the organism will never just fixate on one.
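The negative feedback in the concurrent schedule can be illustrated numerically.  The sketch below rests on assumptions the text does not spell out: that each variable-interval timer keeps running while the animal is responding on the other key, that a reward, once set up, is held until collected, and that the intervals are exponentially distributed (a “random-interval” approximation to VI).  Under those assumptions the probability that a reward is waiting on the neglected key after t seconds away is 1 - exp(-t / mean interval).  The function name is hypothetical.

```python
import math


def payoff_probability_on_switch(time_away_s, mean_interval_s):
    """Chance that a reward is waiting on the neglected key.

    Assumes exponentially distributed intervals (an idealized VI schedule)
    and that a scheduled reward is held until the animal collects it.
    """
    return 1 - math.exp(-time_away_s / mean_interval_s)


# The longer the animal stays away from a VI 30-s key, the more likely
# a switch back to it will pay off at once.
for t in (1, 5, 15, 30, 60):
    print(t, round(payoff_probability_on_switch(t, 30), 3))
```

Even a lean schedule becomes a near-certain payoff if neglected for long enough, which is why exclusive preference for the richer key is never the best policy.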

As technology advanced, these two things converged: the desire for order, enabled by averaging and negative feedback, and Skinner’s idea that response probability is an appropriate – the appropriate – dependent variable.  Variable-interval schedules, either singly or in two-choice situations, became a kind of measuring device.  Response rate on VI is steady – no waits, pauses or sudden spikes.  It seemed to offer a simple and direct way to measure response probability.  From response rate as response probability to the theoretical idea of rate as somehow equivalent to response strength was but a short step.

Theory.  Response strength is a theoretical construct.  It goes well beyond response rate or indeed any other directly measurable quantity.  Unfortunately, most people think they know what they mean by “strength”.  The Skinnerian tradition made it difficult to see that more is needed.

A landmark 1961 study by George Reynolds illustrates the problem (although George never saw it in this way).  Here is a simplified version:  Imagine two experimental conditions and two identical pigeons.  Each condition runs for several daily sessions.  In Condition A, pigeon A pecks a red key for food reward delivered on a VI 30-s schedule.  In Condition B, pigeon B pecks a green key for food reward delivered on a VI 15-s schedule.  Because both food rates are relatively high, after lengthy exposure to the procedure, the pigeons will be pecking at a high rate in both cases: response rates – hence ‘strengths’ – will be roughly the same.  Now change the procedure for both pigeons.  Instead of a single schedule, two schedules alternate, for a minute or so each, across a one-hour experimental session.  The added, second schedule is the same for both pigeons: VI 15 s, signaled by a yellow key (alternating two signaled schedules in this way is called a multiple schedule).  Thus, pigeon A is on a mult VI 30 VI 15 (red and yellow stimuli) and pigeon B on a mult VI 15 VI 15 (green and yellow stimuli).  In summary, the two experimental conditions are (stimulus colors in parentheses):

Experiment A:  VI 30 (Red), mult VI 30 (Red) VI 15 (Yellow)

Experiment B:   VI 15 (Green), mult VI 15 (Green) VI 15 (Yellow)

Now look at the second condition for each pigeon.  Unsurprisingly, B’s response rate in green will not change.  All that has changed for him is the key color – from green all the time to green and yellow alternating, both with the same payoff.  But A’s response rate in red, the VI 30 stimulus, will be much depressed, and response rate in yellow for A will be considerably higher than B’s yellow response rate, even though the VI 15-s schedule is the same in both.  The effect on responding in the yellow stimulus by pigeon A, an increase in response rate when a given schedule is alternated with a leaner one, is called positive behavioral contrast and the rate decrease in the leaner schedule for pigeon A is negative contrast.

The obvious conclusion is that response rate alone is inadequate as a description of the ‘strength’ of an operant response.  The steady rate maintained by VI schedules is misleading.  It looks like a simple measure of strength.  Because of Skinner’s emphasis on order, because the  averaged-response and feedback-rich variable-interval schedule seemed to provide it and because it was easy to equate response probability with response rate, the idea took root.  Yet even in the 1950s, it was well known that response rate can itself be manipulated – by so-called differential-reinforcement-of-low-rate (DRL) schedules, for example.

Conclusion: response rate does not equal response strength; hence our emphasis on rate may be a mistake.  If the strength idea is to survive the demise of rate as its best measure, something more is needed: a theory about the factors that control an operant response.  But because Skinner had successfully proclaimed that theories of learning are not necessary, real theory was not forthcoming for many years.