Survivorship bias

From Wikipedia, the free encyclopedia

Survivorship bias or survival bias is the logical error of concentrating on entities that passed a selection process while overlooking those that did not. This can lead to incorrect conclusions because of incomplete data.

Survivorship bias is a form of selection bias that can lead to overly optimistic beliefs because multiple failures are overlooked, such as when companies that no longer exist are excluded from analyses of financial performance. It can also lead to the false belief that the successes in a group have some special property rather than being mere coincidence, as when correlation is taken to "prove" causality.

Another kind of survivorship bias involves believing that an incident happened in a particular way because the only people involved who can speak about it are those who survived it. Even if one knows that some of those involved died, the dead cannot add their voices to the conversation, so the account remains biased.

As a general experimental flaw

The parapsychology researcher Joseph Banks Rhine believed he had identified the few individuals from hundreds of potential subjects who had powers of extra-sensory perception (ESP). His calculations were based on the improbability of these few subjects guessing the Zener cards shown to a partner by chance.[1] A major criticism that surfaced against his calculations was the possibility of unconscious survivorship bias in subject selections. He was accused of failing to take into account the large effective size of his sample (all the people he rejected as not being "strong telepaths" because they failed at an earlier testing stage). Had he done this he might have seen that, from the large sample, one or two individuals would probably achieve purely by chance the track record of success he had found.

Writing about the Rhine case in Fads and Fallacies in the Name of Science, Martin Gardner explained that he did not think the experimenters had made such obvious mistakes out of statistical naivety, but as a result of subtly disregarding some poor subjects. He said that, without trickery of any kind, there would always be some people who had improbable success, if a large enough sample were taken. To illustrate this, he speculated about what would happen if one hundred professors of psychology read Rhine's work and decided to make their own tests; he said that survivor bias would winnow out the typically failed experiments, but encourage the lucky successes to continue testing. He thought that the common null hypothesis (of no result) would not be reported, but "[e]ventually, one experimenter remains whose subject has made high scores for six or seven successive sessions. Neither experimenter nor subject is aware of the other ninety-nine projects, and so both have a strong delusion that ESP is operating." He concluded: "The experimenter writes an enthusiastic paper, sends it to Rhine who publishes it in his magazine, and the readers are greatly impressed."[2]
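Gardner's thought experiment can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that sessions are independent and that each has a 50% chance of producing a "high" score when no ESP is operating; neither assumption comes from Gardner, and the function name is hypothetical.

```python
def chance_of_lucky_streak(n_experimenters: int, n_sessions: int,
                           p_high: float = 0.5) -> float:
    """Probability that at least one experimenter's subject scores "high"
    in every one of n_sessions sessions purely by chance.

    p_high is an assumed per-session probability of a high score
    under the null hypothesis (no ESP); it is illustrative only.
    """
    p_streak = p_high ** n_sessions                 # one subject, unbroken streak
    p_nobody = (1.0 - p_streak) ** n_experimenters  # no subject has a streak
    return 1.0 - p_nobody


if __name__ == "__main__":
    for sessions in (6, 7):
        p = chance_of_lucky_streak(100, sessions)
        print(f"{sessions} successive sessions: {p:.2f}")
```

Under these assumptions, the chance that at least one of the hundred experimenters sees an unbroken streak is roughly 0.79 for six sessions and roughly 0.54 for seven, so a "remarkable" subject is more likely than not to appear somewhere.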

If sufficiently many scientists study a phenomenon, some will find statistically significant results by chance, and these are the experiments submitted for publication. Additionally, papers showing positive results may be more appealing to editors.[3] This problem is known as positive results bias, a type of publication bias. To combat this, some editors now call for the submission of "negative" scientific findings, where "nothing happened".[4]

Survivorship bias is one of the research issues brought up in the provocative 2005 paper "Why Most Published Research Findings Are False", which shows that a large number of published medical research papers contain results that cannot be replicated.[3]

One famous example of immortal time bias was discovered in a study by Redelmeier and Singh in the Annals of Internal Medicine that reported that Academy Award-winning actors and actresses lived almost four years longer than their less successful peers.[5] The statistical method used to derive this statistically significant difference, however, gave winners an unfair advantage, because it credited an Academy Award winner's years of life before winning toward survival subsequent to winning. When the data was reanalyzed using methods that avoided this "immortal time" bias, the survival advantage was closer to one year and was not statistically significant.[6]
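The effect of crediting pre-award years to winners can be illustrated with a toy cohort. The sketch below uses entirely hypothetical data (one person dying at each age from 40 to 91, and a single award age of 60) and an award rule deliberately balanced so that winning has no relation to lifespan among those still alive at the award age. It contrasts a naive comparison with a simple landmark comparison; the landmark approach is a simpler stand-in chosen for illustration, not the method used in the reanalysis cited above.

```python
from statistics import mean

AWARD_AGE = 60  # hypothetical age at which the award is handed out

# Hypothetical cohort: exactly one person dies at each age from 40 to 91.
ages_at_death = list(range(40, 92))


def wins_award(age_at_death: int) -> bool:
    """Illustrative award rule: nobody who dies before AWARD_AGE can win;
    among the rest, winners are picked in a balanced pattern so that
    winners and non-winners have the same mean lifespan."""
    if age_at_death < AWARD_AGE:
        return False
    return (age_at_death - AWARD_AGE) % 4 in (0, 3)


winners = [a for a in ages_at_death if wins_award(a)]
non_winners = [a for a in ages_at_death if not wins_award(a)]

# Naive comparison: winners against everyone else, including people who
# died before they could ever have won.
naive_gap = mean(winners) - mean(non_winners)

# Landmark comparison: only people still alive at AWARD_AGE are compared.
eligible_non_winners = [a for a in non_winners if a >= AWARD_AGE]
landmark_gap = mean(winners) - mean(eligible_non_winners)

if __name__ == "__main__":
    print(f"naive gap:    {naive_gap:.1f} years")
    print(f"landmark gap: {landmark_gap:.1f} years")
```

In this toy cohort the naive comparison shows winners living more than fourteen years longer, while the landmark comparison shows no difference at all, because the apparent advantage comes only from winners having had to survive long enough to win.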

Examples

Business, finance, and economics

In finance, survivorship bias is the tendency for failed companies to be excluded from performance studies because they no longer exist. It often causes the results of studies to skew higher because only companies that were successful enough to survive until the end of the period are included. For example, a mutual fund company's selection of funds today will include only those that are successful now. Many losing funds are closed and merged into other funds to hide poor performance. In theory, 70% of extant funds could truthfully claim to have performance in the first quartile of their peers, if the peer group includes funds that have closed.[7]
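One way such a figure could arise is shown by the arithmetic below. It is a sketch under a strong assumption that is not in the source: every fund in the top quartile of the original peer group is assumed to have survived. The function name and the fund counts are hypothetical.

```python
def max_share_in_top_quartile(n_started: int, n_survived: int) -> float:
    """Largest possible share of surviving funds that sit in the top
    quartile of the full peer group (survivors plus closed funds),
    assuming every top-quartile fund is among the survivors."""
    if n_survived <= 0 or n_survived > n_started:
        raise ValueError("n_survived must be between 1 and n_started")
    top_quartile = n_started / 4  # best 25% of all funds that ever existed
    return min(1.0, top_quartile / n_survived)


if __name__ == "__main__":
    # Hypothetical peer group: 1000 funds launched, 357 still exist.
    print(f"{max_share_in_top_quartile(1000, 357):.0%}")
    # With no closures the share cannot exceed 25%.
    print(f"{max_share_in_top_quartile(1000, 1000):.0%}")
```

With these hypothetical counts, 250 of the 357 surviving funds, about 70%, can truthfully claim first-quartile performance against the full peer group.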

In 1996, Elton, Gruber, and Blake showed that survivorship bias is larger in the small-fund sector than in large mutual funds (presumably because small funds have a high probability of folding).[8] They estimate the size of the bias across the U.S. mutual fund industry as 0.9% per annum, where the bias is defined and measured as:

"Bias is defined as average α for surviving funds minus average α for all funds"

(where α is the risk-adjusted return over the S&P 500, the standard measure of mutual fund out-performance).
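The definition quoted above translates directly into a short calculation. The sketch below applies it to a made-up set of four funds; the fund names, alpha values (in percent per annum), and function name are all hypothetical and are not taken from Elton, Gruber, and Blake.

```python
from statistics import mean


def survivorship_bias(alpha_by_fund: dict[str, float],
                      survivors: set[str]) -> float:
    """Average alpha of surviving funds minus average alpha of all funds."""
    all_alpha = mean(alpha_by_fund.values())
    surviving_alpha = mean(
        alpha for fund, alpha in alpha_by_fund.items() if fund in survivors
    )
    return surviving_alpha - all_alpha


# Hypothetical alphas in percent per annum; funds C and D later closed.
ALPHAS = {"A": 1.0, "B": 0.5, "C": -2.0, "D": -3.5}
SURVIVORS = {"A", "B"}

if __name__ == "__main__":
    print(f"bias: {survivorship_bias(ALPHAS, SURVIVORS):.2f}% per annum")
```

In this example the survivors average 0.75% while all four funds average −1.00%, so a study restricted to survivors would overstate performance by 1.75 percentage points per annum.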

Additionally, in quantitative backtesting of market performance or other characteristics, survivorship bias arises from using the current index membership rather than the actual constituent changes over time. Consider a backtest to 1990 to find the average performance (total return) of S&P 500 members that paid dividends within the previous year. Using only the current 500 members to build a historical equity line of the total return of the companies that met the criteria would add survivorship bias to the results. S&P maintains an index of healthy companies, removing those that no longer meet its criteria for representing the large-cap U.S. stock market. Companies that had healthy growth on their way to inclusion in the S&P 500 would be counted as if they were in the index during that growth period, which they were not. Meanwhile, a company that was in the index at the time but was losing market capitalization, was destined for the S&P 600 Small-cap Index, and was later removed would not be counted in the results at all. Using the actual membership of the index, and applying entry and exit dates so that each company's return is counted only while it was in the index, would allow for a bias-free output.
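A backtest can avoid this by looking up index membership as of each historical date. The following is a minimal sketch of such a point-in-time lookup; the tickers, dates, class, and function names are hypothetical and do not describe any real index data feed.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Membership:
    ticker: str
    entered: date
    exited: Optional[date]  # None means the company is still in the index


def members_on(history: list[Membership], day: date) -> set[str]:
    """Tickers that were actually in the index on the given day."""
    return {
        m.ticker
        for m in history
        if m.entered <= day and (m.exited is None or day < m.exited)
    }


# Hypothetical constituent history.
HISTORY = [
    Membership("OLDCO", date(1985, 1, 1), date(1998, 6, 30)),  # later removed
    Membership("STEADY", date(1980, 1, 1), None),
    Membership("NEWCO", date(2005, 3, 1), None),               # added later
]

if __name__ == "__main__":
    print(sorted(members_on(HISTORY, date(1990, 1, 2))))
    print(sorted(members_on(HISTORY, date(2020, 1, 2))))
```

For a 1990 date the lookup returns OLDCO and STEADY, whereas the current membership is NEWCO and STEADY; a backtest built on the current list would wrongly include NEWCO's pre-inclusion growth and omit OLDCO's decline.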

Framed quotes of successful CEOs in a public library

Michael Shermer in Scientific American[9] and Larry Smith of the University of Waterloo[10] have described how advice about commercial success distorts perceptions of it by ignoring all of the businesses and college dropouts that failed.[11] Journalist and author David McRaney observes that the "advice business is a monopoly run by survivors. When something becomes a non-survivor, it is either completely eliminated, or whatever voice it has is muted to zero".[12] Alec Liu wrote in Vice that "for every Mark Zuckerberg, there's thousands of also-rans, who had parties no one ever attended, obsolete before we ever knew they existed."[13]

In his book The Black Swan, financial writer Nassim Taleb called the data obscured by survivorship bias "silent evidence".[14]

History

Diagoras of Melos was asked concerning paintings of those who had escaped shipwreck: "Look, you who think the gods have no care of human things, what do you say to so many persons preserved from death by their especial favour?", to which Diagoras replied: "Why, I say that their pictures are not here who were cast away, who are by much the greater number."[15]

Susan Mumm has described how survival bias leads historians to study organisations that are still in existence more than those that have closed. This means large, successful organisations such as the Women's Institute, which were well organised and still have accessible archives for historians to work from, are studied more than smaller charitable organisations, even though these may have done a great deal of work.[16]

Architecture and construction

Just as new buildings are being built every day and older structures are constantly torn down, the story of most civil and urban architecture involves a process of constant renewal, renovation, and revolution. Only the most (subjectively, but popularly determined) beautiful, most useful, and most structurally sound buildings survive from one generation to the next. This creates another selection effect where the ugliest and weakest buildings of history have long been eradicated from existence and thus the public view, and so it leaves the visible impression, seemingly correct but factually flawed, that all buildings in the past were both more beautiful and better built.

Highly competitive careers

Whether it be movie stars, athletes, musicians, or CEOs of multibillion-dollar corporations who dropped out of school, popular media often tells the story of the determined individual who pursues their dreams and beats the odds. There is much less focus on the many people who may be similarly skilled and determined but never find success because of factors beyond their control or other (seemingly) random events. There is also a tendency to overlook the resources and events that helped enable such success and that those who failed did not have.[17]

This creates a false public perception that anyone can achieve great things if they have the ability and make the effort. The overwhelming majority of failures are not visible to the public eye, and only those who survive the selective pressures of their competitive environment are seen regularly.

Military

This hypothetical pattern of damage of surviving aircraft shows locations where they can sustain damage and still return home. If the aircraft was reinforced in the most commonly hit areas, this would be a result of survivorship bias because crucial data from fatally damaged planes was being ignored; those hit in other places did not survive.

During World War II, the statistician Abraham Wald took survivorship bias into account in his calculations when considering how to minimize bomber losses to enemy fire.[18] The Statistical Research Group (SRG) at Columbia University, of which Wald was a part, examined the damage done to aircraft that had returned from missions and recommended adding armor to the areas that showed the least damage.[19][20][21] The bullet holes in the returning aircraft represented areas where a bomber could take damage and still fly well enough to return safely to base. Therefore, Wald proposed that the Navy reinforce areas where the returning aircraft were unscathed,[18]: 88 inferring that planes hit in those areas were the ones most likely to be lost. His work is considered seminal in the then nascent discipline of operational research.[22]
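The reasoning can be illustrated with a deliberately simplified model, which is not Wald's actual method. The sketch assumes that each aircraft takes exactly one hit, that hits land on each section equally often, and that the chance of losing the aircraft depends only on the section hit; all section names and probabilities are hypothetical.

```python
# Hypothetical share of hits landing on each section (across ALL aircraft).
HIT_SHARE = {"engine": 0.25, "fuselage": 0.25, "wings": 0.25, "tail": 0.25}

# Hypothetical probability that a hit on the section brings the aircraft down.
LOSS_IF_HIT = {"engine": 0.8, "fuselage": 0.2, "wings": 0.1, "tail": 0.3}


def observed_hit_share(hit_share: dict[str, float],
                       loss_if_hit: dict[str, float]) -> dict[str, float]:
    """Share of hits per section as seen on aircraft that returned,
    assuming one hit per aircraft."""
    returned = {s: hit_share[s] * (1.0 - loss_if_hit[s]) for s in hit_share}
    total = sum(returned.values())
    return {s: share / total for s, share in returned.items()}


if __name__ == "__main__":
    observed = observed_hit_share(HIT_SHARE, LOSS_IF_HIT)
    for section, share in sorted(observed.items(), key=lambda kv: kv[1]):
        print(f"{section:9s} {share:.0%} of holes on returning aircraft")
```

Although every section is hit equally often in this model, the engine shows the fewest holes among returning aircraft precisely because hits there are the most likely to be fatal, which is why the least-damaged areas are the ones to reinforce.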

Cats

In a study performed in 1987, it was reported that cats who fall from less than six stories, and are still alive, have greater injuries than cats who fall from higher than six stories.[23][24] It has been proposed that this might happen because cats reach terminal velocity after righting themselves at about five stories, and after this point they relax, leading to less severe injuries in cats who have fallen from six or more stories.[25] In 1996, The Straight Dope newspaper column proposed that another possible explanation for this phenomenon would be survivorship bias. Cats that die in falls are less likely to be brought to a veterinarian than injured cats, and thus many of the cats killed in falls from higher buildings are not reported in studies of the subject.[26]

Tropical trees

Tropical vines and lianas are often viewed as macro-parasites of trees that reduce host tree survival. The proportion of trees infested with lianas was observed to be much greater in shade-tolerant, heavy-wooded, slow-growing tree species, while light-demanding, lighter-wooded, fast-growing species are often liana-free. Such observations led to the expectation that lianas have stronger negative effects on shade-tolerant species.[27] Further investigations, however, revealed that liana infestation is far more harmful to light-demanding, fast-growing tree species: infestation greatly decreases their survival, so the observable sample of these species is biased towards trees that survived and are liana-free.[28] Hence, the observable sample of trees with lianas in their crown is skewed due to survivorship bias.

Studies of evolution

Large groups of organisms called clades that survive a long time are subject to various survivorship biases such as the "push of the past", generating the illusion that clades in general tend to originate with a high rate of diversification that then slows through time.[29]

Business law

Survivorship bias can raise truth-in-advertising issues when the success rate advertised for a product or service is measured by reference to a population whose makeup differs from that of the target audience for the advertisement. This is especially important when

  1. the advertisement either fails to disclose the relevant differences between the two populations, or describes them in insufficient detail; and
  2. these differences result from the company's deliberate "pre-screening" of prospective customers to ensure that only customers with traits increasing their likelihood of success are allowed to purchase the product or service, especially when the company's selection procedures or evaluation standards are kept secret; and
  3. the company offering the product or service charges a fee, especially one that is non-refundable or not disclosed in the advertisement, for the privilege of attempting to become a customer.

For example, the advertisements of online dating service eHarmony.com pass this test because, although the first two conditions apply to them, the third does not:

  1. they claim a success rate significantly higher than that of competing services while generally not disclosing that the rate is calculated with respect to a subset of their audience, namely individuals who possess traits that increase their likelihood of finding and maintaining relationships and lack traits that pose obstacles to their doing so, and
  2. the company deliberately selects for these traits by administering a lengthy pre-screening process designed to reject prospective customers who lack the former traits or possess the latter ones, but
  3. the company does not charge a fee for administration of its pre-screening test; thus its prospective customers face no "downside risk" other than wasting their time, expending the effort involved in completing the pre-screening process,[30] and suffering disappointment.

References

  1. ^ Shermer, Michael (3 August 2011). "Deviations: A Skeptical Investigation of Edgar Cayce's Association for Research and Enlightenment". Skeptic.com. Retrieved 21 May 2018.
  2. ^ Fads and Fallacies in the Name of Science, Martin Gardner, p. 303, 1957, Dover Publications Inc.
  3. ^ a b Ioannidis, J. P. A. (2005). "Why Most Published Research Findings Are False". PLoS Med. 2 (8): e124. doi:10.1371/journal.pmed.0020124. PMC 1182327. PMID 16060722.
  4. ^ Goodchild van Hilten, Lucy. "Why it's time to publish research "failures"". Elsevier Connect. Elsevier. Archived from the original on 2 June 2021. Retrieved 31 May 2021.
  5. ^ Redelmeier, Donald A.; Singh, Sheldon M. (2001-05-15). "Survival in Academy Award–Winning Actors and Actresses". Annals of Internal Medicine. 134 (10): 955–62. doi:10.7326/0003-4819-134-10-200105150-00009. ISSN 0003-4819. PMID 11352696.
  6. ^ Sylvestre, Marie-Pierre; Huszti, Ella; Hanley, James A. (2006-09-05). "Do Oscar Winners Live Longer than Less Successful Peers? A Reanalysis of the Evidence". Annals of Internal Medicine. 145 (5): 361–3, discussion 392. doi:10.7326/0003-4819-145-5-200609050-00009. ISSN 0003-4819. PMID 16954361. S2CID 13724567.
  7. ^ "How to think about diversification - Page 3 - Bogleheads.org". www.bogleheads.org. Archived from the original on 2020-10-29. Retrieved 2020-10-26.
  8. ^ Elton; Gruber; Blake (1996). "Survivorship Bias and Mutual Fund Performance". Review of Financial Studies. 9 (4): 1097–1120. doi:10.1093/rfs/9.4.1097. S2CID 154097782. In this paper the researchers eliminate survivorship bias by following the returns on all funds extant at the end of 1976. They show that other researchers have drawn spurious conclusions by failing to include the bias in regressions on fund performance.
  9. ^ Michael Shermer (2014-08-19). "How the Survivor Bias Distorts Reality". Scientific American. Archived from the original on 2019-04-29. Retrieved 2015-07-20.
  10. ^ Carmine Gallo (2012-12-07). "High-Tech Dropouts Misinterpret Steve Jobs' Advice". Forbes. Archived from the original on 2019-04-02. Retrieved 2017-09-15.
  11. ^ Robert J Zimmer (2013-03-01). "The Myth of the Successful College Dropout: Why It Could Make Millions of Young Americans Poorer". The Atlantic. Archived from the original on 2019-04-09. Retrieved 2017-03-11.
  12. ^ Karen E. Klein. "How Survivorship Bias Tricks Entrepreneurs". Bloomberg. Archived from the original on 2016-03-07. Retrieved 2017-03-11.
  13. ^ Alec Liu (2012-10-03). "What Happened to the Facebook Killer? It's Complicated". Motherboard - Tech by Vice.
  14. ^ Taleb, Nassim Nicholas (2010). The Black Swan: The Impact of the Highly Improbable (2nd ed.). New York: Random House. p. 101. ISBN 9780679604181.
  15. ^ Cicero, De Natura Deor., iii. 37.
  16. ^ Mumm, Susan (2010). Women and Philanthropic Cultures, in Women, Gender and Religious Cultures in Britain, 1800-1940, Eds Sue Morgan and Jacqueline deVries . London: Routledge.
  17. ^ Karen E. Klein (2014-08-11). "How Survivorship Bias Tricks Entrepreneurs". Bloomberg Business. Archived from the original on 2017-05-23. Retrieved 2017-04-11.
  18. ^ a b Wald, Abraham. (1943). A Method of Estimating Plane Vulnerability Based on Damage of Survivors. Statistical Research Group, Columbia University. CRC 432 — reprint from July 1980 Archived 2019-07-13 at the Wayback Machine. Center for Naval Analyses.
  19. ^ Wallis, W. Allen (1980). "The Statistical Research Group, 1942-1945: Rejoinder". Journal of the American Statistical Association. 75 (370): 334–335. doi:10.2307/2287454. JSTOR 2287454.
  20. ^ "Bullet Holes & Bias: The Story of Abraham Wald". mcdreeamie-musings. Archived from the original on 2020-04-16. Retrieved 2020-05-29.
  21. ^ "AMS :: Feature Column :: The Legend of Abraham Wald". American Mathematical Society. Archived from the original on 2020-05-27. Retrieved 2020-05-29.
  22. ^ Mangel, Marc; Samaniego, Francisco (June 1984). "Abraham Wald's work on aircraft survivability". Journal of the American Statistical Association. 79 (386): 259–267. doi:10.2307/2288257. JSTOR 2288257. Reprint on author's web site Archived 2019-08-17 at the Wayback Machine
  23. ^ Whitney, WO; Mehlhaff, CJ (1987). "High-rise syndrome in cats". Journal of the American Veterinary Medical Association. 191 (11): 1399–403. PMID 3692980.
  24. ^ "Highrise Syndrome in Cats". Archived from the original on 2015-02-21. Retrieved 2014-06-13.
  25. ^ Falling Cats Archived 2007-07-12 at the Wayback Machine
  26. ^ "Do cats always land unharmed on their feet, no matter how far they fall?". The Straight Dope. July 19, 1996. Archived from the original on 2012-04-16. Retrieved 2008-03-13.
  27. ^ Schnitzer, S (2002). "The ecology of lianas and their role in forests". Trends in Ecology & Evolution. 17 (5): 223–230. doi:10.1016/S0169-5347(02)02491-6. ISSN 0169-5347. Archived from the original on 2020-05-06. Retrieved 2020-08-29.
  28. ^ Visser, Marco D.; Schnitzer, Stefan A.; Muller-Landau, Helene C.; Jongejans, Eelke; de Kroon, Hans; Comita, Liza S.; Hubbell, Stephen P.; Wright, S. Joseph; Zuidema, Pieter (2018). "Tree species vary widely in their tolerance for liana infestation: A case study of differential host response to generalist parasites". Journal of Ecology. 106 (2): 781–794. doi:10.1111/1365-2745.12815. hdl:2066/176867. ISSN 0022-0477.
  29. ^ Budd, G. E.; Mann, R. P. (2018). "History is written by the victors: the effect of the push of the past on the fossil record" (PDF). Evolution. 72 (11): 2276–2291. doi:10.1111/evo.13593. PMC 6282550. PMID 30257040. Archived (PDF) from the original on 2019-04-28. Retrieved 2018-12-12.
  30. ^ Farhi, Paul (May 13, 2007). "They Met Online, but Definitely Didn't Click". Washington Post. Archived from the original on May 16, 2017. Retrieved September 15, 2017.
  31. ^ "Archived copy" (PDF). Archived (PDF) from the original on 2019-07-05. Retrieved 2019-07-05.