Part 5
Upon analyzing the lottery win data, Rosenthal uncovered an unusual pattern of wins by retail store insiders, much too unusual to have been produced by chance. With similar logic, some people stopped flying after the EgyptAir crash because to them, four crashes in four years seemed like an unusual pattern of disasters in the same region, too many to have happened completely at random. Did such behavior constitute a "personality disorder"?
The facts on the ground were immutable: the four flights, the location, the accident times, and the number of casualties were there for all to see. Many rejected random chance as an explanation for the pattern of crashes. Yet, to Professor Barnett, four in four looked just like the work of chance. He even used the same tool of statistical testing but arrived at a different conclusion. The difference lay in how he assimilated the data.
Statisticians are a curious lot: when given a vertical set of numbers, they like to look sideways. They look into the nooks and crannies; they look underneath the cracks; they turn over every pebble. From decades of experience, they learn that what is hidden is just as important as what is in front of their eyes. No one ever gets to see the whole picture, so the key is to know what you don't know. When you read the table of fatalities presented earlier, you may have visualized four black dots over the "Nantucket Triangle" and connected the dots; Barnett, by contrast, saw four black dots plus millions of white dots. Each white dot stood for one flight that safely traversed the airspace during those four years. Seen in this light, we would hardly find the black dots, let alone connect them. Then, taking it further, Barnett envisioned ten, even twenty, years of flights over the Nantucket Triangle, bringing millions more white dots into the picture, and only dozens of additional black dots. This method creates a new picture, one altogether different from the list of worst disasters frequently displayed in postcrash news reports. Listed separately, the four accidents stuck out like stars in the night sky; however, they became all but invisible when buried within a sea of whiteness (see Figure 5-2). Considering that the Northeast Corridor is one of the busiest airways in the world, it would follow that this area would see a larger number of fatal accidents.
As to whether fear of flying could be considered a "personality disorder," one esteemed statistician answered firmly in the negative during a lecture to an audience at Boeing. He suggested that as the airline industry has fended off systematic causes of jet crashes such as equipment failure, new types of risks are rising to the surface. He cited three "menaces that caused scant fatalities in the 1990s but which could cause more deaths in forthcoming years": sabotage, runway collisions, and midair collisions. The lecture, titled "Airline Safety: End of a Golden Age?" could not have been more aptly timed; it was delivered on September 11, 2001. The future he had anticipated arrived early.
Figure 5-2 The Statistician's Worldview [image]
Who was this professor with such impressive foresight? None other than Arnold Barnett, who has been studying airline safety data for more than thirty years at the MIT Sloan School of Management. In the 1970s, he initiated a remarkably productive research program that has continuously tracked the safety record of airlines worldwide. Before he arrived on the scene, people considered it impossible to measure airline safety accurately, because the contributing factors could not be directly observed. How could one appraise the attitudes of corporate managers toward safety? How could one compare the efficacy of different training programs? How could one take into account disparate flight routes, airports, flight lengths, and age of airlines? Barnett the statistician made an end run around these obstacles, realizing he did not need any of those unknowns. When a passenger boards a plane, his or her fear is solely of dying in a fatal crash; it is thus sufficient to merely track the frequency of fatal accidents and the subsequent survival rates. Similarly, universities rely on SAT scores and school ranks to evaluate applicants because they cannot possibly visit every family, every home, and every school. How to compare Mary's parents to Julia's? How to rank Michael's gymnasium against Joseph's? So, instead of measuring such specific influences on student achievement as parental upbringing and quality of education, educators merely track the actual scholastic ability as represented by SAT scores and school ranks.
Under Barnett's watch, airlines in the developed world saw the risk of death drop from 1 in 700,000 in the 1960s to 1 in 10 million in the 1990s, a fourteen-fold improvement in three decades. He was the first to prove that American carriers were the safest in the world, and by 1990, he was telling everyone about a golden age of air safety. The rest of the developed world has since caught up, while the developing world still lags by two decades. Barnett believes that fatal air crashes have essentially become random events with a minuscule strike rate. In other words, it is no longer possible to find any systematic cause of an air disaster, like mechanical failure or turbulence. Air crashes today are practically freak accidents.
What does the visionary Barnett say about two of our biggest fears?
1. Don't choose between U.S. national airlines based on safety. Jet crashes occur randomly, so the carrier that suffers a recent crash would have been merely unlucky. Between 1987 and 1996, USAir happened to be the unlucky airline. It operated 20 percent of domestic flights but accounted for 50 percent of all crash fatalities, by far the worst record among the seven major airlines in the United States (see Figure 5-3). Barnett asked what the chance was that such a lopsided allocation of deaths could have hit any one of the seven carriers. The chance was 11 percent; it was quite likely to happen, and if not USAir, another airline would have borne the brunt. In another study, Barnett found that no U.S. airline has sustained an advantage in safety: the top-ranked airline in one period frequently came in last in the next period, giving further proof that all operators were materially equal in safety. It is just not possible to predict which airline will suffer the next fatal crash. Passengers have nowhere to run for air safety.
Figure 5-3 Relative Proportion of Flights and Deaths for USAir and Six Other U.S. Carriers, 1987-1996: Evidence That USAir Was Less Safe?
[image]
2. Don't avoid foreign airlines, even after one of their planes has crashed. Flights operated by developing-world airlines are just as safe as those run by U.S. airlines on routes where they directly compete with one another, typically those between the developed and developing worlds. Where they do not overlap, foreign airlines suffer many more crashes, for unknown reasons. (Some speculate that they may be assigning better crews to international flights.) Because of their poor domestic record, the overall risk of death associated with developing-world carriers was eight times worse than for their developed-world peers. But Barnett found no difference between these two groups of operators on competing routes: the risk was about 1 in 1.5 million during 2000-2005. The once-a-day frequent flier could expect to die in a jet crash in 4,100 years, on any of the operators that offer service on these routes. Moreover, while the worldwide risk of aviation fatality has been more than halved since the 1980s, the risk differential between developing-world and developed-world operators has stayed minute. Thus, we can trust these supposedly hulking, inefficient state enterprises with old planes, undertrained pilots, and unmotivated staff to take us overseas safely.
Like Rosenthal, Barnett used statistical testing to prove his point. For the decade leading up to 1996, developing-world airlines operated 62 percent of competitive flights. If they were just as safe as U.S. airlines, they should have caused about 62 percent of passenger deaths, or well over 62 percent if they were more prone to disasters. In those ten years, developing-world carriers caused only 55 percent of the fatalities, indicating that they did no worse (see Figure 5-4).
Figure 5-4 Relative Proportion of Flights and Deaths for Developed-World and Developing-World Carriers, 1987-1996: No Evidence That Developing-World Carriers Were Less Safe on Comparable Routes [image]
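Barnett's question about USAir, namely how likely it is that some carrier ends up with a lopsided share of a handful of random crashes, can be sketched with a small Monte Carlo simulation. Every input below (ten crashes in the decade, crash-level rather than death-level accounting) is an assumption for illustration, so the result will not reproduce his 11 percent figure exactly; it only shows the flavor of the calculation.

```python
import random

def chance_of_lopsided_record(shares, n_crashes, threshold, trials=50_000, seed=7):
    """Estimate the probability that at least one carrier suffers at least
    `threshold` of `n_crashes` fatal crashes, when each crash independently
    hits a carrier in proportion to its share of flights."""
    rng = random.Random(seed)
    carriers = range(len(shares))
    hits = 0
    for _ in range(trials):
        counts = [0] * len(shares)
        for c in rng.choices(carriers, weights=shares, k=n_crashes):
            counts[c] += 1
        if max(counts) >= threshold:
            hits += 1
    return hits / trials

# Assumed inputs: one carrier flies 20 percent of flights, six others split
# the rest evenly, and ten fatal crashes occur in the decade.
shares = [0.20] + [0.80 / 6] * 6
p = chance_of_lopsided_record(shares, n_crashes=10, threshold=5)
```

Even under these made-up inputs, the chance that pure luck hands some carrier half the crashes comes out to several percent, which is far from negligible and is the heart of Barnett's argument.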
The news about the Ontario lottery investigation spread all over Canada, and in every province, the lottery corporations were overrun by phone calls and e-mails from concerned citizens.
British Columbia's ombudsman, in reviewing past winners, unmasked dozens of extraordinarily lucky store owners, including one who took home CDN$300,000 over five years, winning eleven times. When the president of the British Columbia Lottery Corporation, which runs the province's lottery, was fired, his buddy, himself a former president, came to his defense: "Of course, it's possible retailers cheated players of their prize money, but only if you're a fool."
In New Brunswick, the Atlantic Lottery Corporation, which runs lotteries in four provinces, attempted to reshape the publicity by hiring an external consultant to audit past wins using the same method as Rosenthal. The analysis, however, showed that between 2001 and 2006, store owners claimed 37 out of 1,293 prizes of CDN$25,000 or more, when they were expected to have won fewer than 4 of those. It was inconceivable that this group of players could have won so many prizes if each ticket had an equal chance of winning.
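The statistical test behind a claim like "37 wins when fewer than 4 were expected" can be sketched by modeling insider wins as a Poisson count and summing the upper tail directly. This is a minimal illustration, assuming a mean of exactly 4 fair-play wins; the consultant's actual method is not described in detail here.

```python
import math

def poisson_tail(lam, k_min, k_max=500):
    """P(X >= k_min) for X ~ Poisson(lam), summed term by term so that
    a tiny tail is not lost to floating-point rounding in 1 - cdf."""
    return sum(
        math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
        for k in range(k_min, k_max)
    )

# Insiders claimed 37 big prizes when roughly 4 were expected under fairness.
p = poisson_tail(lam=4.0, k_min=37)
```

The tail probability is astronomically small, far below one in a billion billion, which is why "inconceivable" is the right word: no reasonable person would attribute such a count to luck.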
Meanwhile, the CBC hired Rosenthal again, this time to examine the pattern of wins in the lotteries in the Western provinces from November 2003 to October 2006. The professor found that insiders earned sixty-seven wins of CDN$10,000 or more, twice as many as could be expected if the lotteries were fair to all players. Just how lucky were these insiders? Using statistical testing, Rosenthal further explained that the chance was 1 in 2.3 million that insiders could have racked up so many wins under a fair lottery system. While not as extreme as in Ontario, these odds were still negligible. Again, Rosenthal could hardly believe that the store owners were that much luckier than the rest of the ticket holders, so he suspected fraud. (Unlike in Ontario, neither the Atlantic nor the Western Lottery Corporation has been able to catch any individual cheater.) To restore the public's confidence, the lottery authorities announced a series of measures to protect customers, including installation of self-service scanning machines, reconfiguration of monitors to face out to the customers, improvement in win-tracking technology, background checks for retailers, and the requirement of winners to sign the back of their winning tickets. It remains to be seen whether these policies will succeed at lifting the cloud of suspicion.
Both statisticians grappled with real-life data, noticed unusual patterns, and asked whether they could occur by chance. Rosenthal's answer was an unequivocal no, and his result raised myriad doubts about insider wins in Ontario lotteries. Employing the same type of logic, Barnett alleviated our fear of flying by showing why air travelers have nowhere to run, because freak accidents can hit any unlucky carrier, anywhere.
You may still be wondering why statisticians willingly accept the risk of death while they show little appetite for playing with chance. Why do they behave differently from most people? We know it is not the tools at their disposal that affect their behavior; we all use the same sort of statistical testing to weigh the situational evidence against chance, whether we realize it or not. The first difference lies in the way that statisticians perceive data: most people tend to home in on unexpected patterns, but statisticians like to evaluate these against the background. For Barnett, the background is the complete flight schedule, not just a list of the worst disasters, while for Rosenthal, it includes all lottery players, not just retailers with major wins.
Moreover, in the worldview of statisticians, rare is impossible: jackpots are for dreamers, and jet crashes for paranoids. For Rosenthal to believe that all retail store insiders acted with honor, he would have had to accept that an extremely rare event had taken place. That would require disavowing his statistical roots. Barnett keeps on flying, twice a week, as he believes air disasters are nigh extinct. Had he stopped out of fear at any point, he would have had to admit that an incredibly unlikely incident could occur. That, too, would contravene his statistical instinct.
Rather than failing at risk assessment, as many have alleged, the people who avoid flying after air crashes are also reasoning like statisticians. Faced with a raft of recent fatal crashes, they rule out the possibility of chance. What leads them to draw different conclusions is the limited slice of data available to them. There are many everyday situations in which we run statistical tests without realizing it. The first time our bags get searched at the airport, we might rue our luck. If it happens twice, we might start to wonder about the odds of being picked again. Three or four times, and we might seriously doubt whether selection has been random at all. Rare is impossible.
At the request of two senators in 1996, the Federal Aviation Administration acted to close the information gap between the experts and the public by releasing limited air safety data on its website. How have we done since? Poorly, unfortunately. As of 2006, anyone can find black dots (the disasters) in those databases but not the white dots (the safe arrivals). Every incident from total loss to no damage is recorded with ample details, making it difficult to focus on the relevant events. Clearly, weak execution has run afoul of good intentions. It is time we started turning over those pebbles! As the professors showed us, a few well-chosen numbers paint a far richer picture than hundreds of thousands of disorganized data points.
Conclusion
"Statistical thinking is hard," the Nobel Prize winner Daniel Kahneman told a gathering of mathematicians in New York City in 2009. A revered figure in the world of behavioral economics, Professor Kahneman spoke about his renewed interest in this topic, which he first broached in the 1970s with his frequent collaborator Amos Tversky. The subject matter is not inherently difficult, but our brains are wired in such a way that it requires a conscious effort to switch away from the default mode of reasoning, which is not statistical. Psychologists found that when research subjects were properly trained, and if they recognized the statistical nature of the task at hand, they were much likelier to make the correct judgment.
Statistical thinking is distinct from everyday thinking. It is a skill that is learned. What better way to master it than to look at positive examples of what others have accomplished? Although they rarely make the headlines, many applied scientists routinely use statistical thinking on the job. The stories in this book demonstrate how these practitioners make smart decisions and how their work benefits society.
In concluding, I review the five aspects of statistical thinking:
1. The discontent of being averaged: Always ask about variability.
2. The virtue of being wrong: Pick useful over true.
3. The dilemma of being together: Compare like with like.
4. The sway of being asymmetric: Heed the give-and-take of two errors.
5. The power of being impossible: Don't believe what is too rare to be true.
Some technical language is introduced in these pages; it can be used as guideposts for those wanting to explore the domain of statistical thinking further. The interstitial sections called "Crossovers" take another look at the same stories, the second time around revealing another aspect of statistical thinking.
The Discontent of Being Averaged
Averages are like sleeping pills: they put you in a state of stupor, and if you overdose, they may kill you.
That must have been how the investors in Bernie Madoff's hedge fund felt in 2008, when they learned the ugly truth about the streak of stable monthly returns they'd been receiving up until then. In the dream world they took as real, each month was an average month; variability was conquered-nothing to worry about. Greed was the root cause of their financial ruin. Those who doubted the absence of variability in the reported returns could have saved themselves; instead, most placed blind faith in the average.
The overuse of averages pervades our society. In the business world, the popular notion of an annualized growth metric, also called "compound annual growth rate," is born of erasing all year-to-year variations. A company that is expanding at 5 percent per year, every year, has the same annualized growth rate as one that is growing at 5 percent per year on average but operates in a volatile market, so that the actual growth can range from 15 percent in one year to -10 percent in another. The financing requirements of these two businesses could not be more different. While the compound annual growth rate provides a useful basic summary of the past, it conveys a false sense of stability when used to estimate the future. The statistical average simply carries no information about variability.
Statistical thinking begins with noticing and understanding variability. What gets commuters upset? Not the average travel time to work, to which they can adjust. They complain about unexpected delays, occasioned by unpredictable accidents and weather emergencies. Such variability leads to uncertainty, which creates anxiety. Julie Cross, the Minnesota commuter in Chapter 1, was surely not the only driver who found "picking the fastest route" to be "a daily gamble."
It is therefore no surprise that effective measures to control congestion attack the problem of variability. For Disney guests arriving during busy hours, FastPass lines eliminate the uncertainty of waiting time by spacing out spikes in demand. Similarly, metered ramps on highways regulate the inflow of traffic, promising commuters smoother trips once they enter.
The Disney "Imagineers" and the highway engineers demonstrated impressive skills in putting theoretical science into practice. Their seminal achievements were in emphasizing the behavioral aspect of decision making. The Disney scientists learned to focus their attention on reducing perceived wait times, as distinct from actual wait times. In advocating perception management, they subordinated the well-established research program in queuing theory, a branch of applied mathematics that has produced a set of sophisticated tools for minimizing actual average wait times in queues. As with traditional economics, queuing theory makes an assumption about rational human behavior that does not match reality. For example, in putting up signs showing inflated estimates of waiting time, the Disney engineers counted on irrationality, and customer surveys consistently confirmed their judgment. For further exploration of the irrational mind, see the seminal work of Daniel Kahneman, starting with his 2003 overview article "Maps of Bounded Rationality: Psychology for Behavioral Economics" in American Economic Review, and Predictably Irrational by Dan Ariely.
Political considerations often intrude on the work of applied scientists. For instance, Minnesota state senator Dick Day seized upon the highway congestion issue to score easy points with his constituents, some of whom blamed the ramp-metering policy for prolonging their commute times. A huge commotion ensued, at the end of which the highway engineers were vindicated. The Minnesota Department of Transportation and the senator agreed to a compromise solution, making small changes to how the meters were operated. For applied scientists, this episode conveyed the valuable lesson that the technical good (reducing actual travel time) need not agree with the social good (managing the public's perception). Before the "meters shutoff" experiment, engineers doggedly pursued the goal of delaying the onset of congestion, which preserves the carrying capacity of highways and sustains traffic flow. The experiment verified the technical merit of this policy: the benefits of smoother traffic on the highway outweighed the drawback of waiting at on-ramps. Nevertheless, commuters disliked having to sit and stew at the ramps even more than they disliked the stop-and-go traffic on jam-packed highways.
Statisticians run experiments to collect data in a systematic way to help make better decisions. In the Minnesota experiment, the consultants performed a form of pre-post analysis. They measured traffic flow, trip time, and other metrics at preselected sections of the highways before the experiment and again at its conclusion. Any difference between the pre period and post period was attributed to shutting off the ramp meters.
But note that there is a hidden assumption of "all else being equal." The analysts were at the mercy of what they did not, or could not, know: was all else really equal? For this reason, statisticians take great caution in interpreting pre-post studies, especially when opining on why the difference was observed during the experiment. The book Statistics for Experimenters by George Box, Stuart Hunter, and Bill Hunter is the classic reference for proper design and analysis of experiments. (The Minnesota experiment could have benefited from more sophisticated statistical expertise.)
Crossovers
Insurance is a smart way to exploit variability, in this case, the ebb and flow of claims filed by customers. If all policyholders required payout concurrently, their total losses would swallow the cumulative surplus collected from premiums, rendering insurers insolvent. By combining a large number of risks acting independently, actuaries can reliably predict average future losses and thus set annual premiums so as to avoid financial ruin. This classic theory works well for automotive insurance but applies poorly to catastrophe insurance, as Tampa businessman Bill Poe painfully discovered.
For auto insurers, the level of total claims is relatively stable from year to year, even though individual claims are dispersed over time. By contrast, catastrophe insurance is a "negative black swan" business, to follow Nassim Taleb's terminology. In Taleb's view, business managers can be lulled into ignoring certain extremely unlikely events ("black swans") just because of the remote chance of occurrence, even though the rare events have the ability to destroy their businesses. Hurricane insurers hum along merrily, racking up healthy profits, until the big one ravages the Atlantic coast, something that has little chance of happening but wreaks extreme damage when it does happen. A mega-hurricane could cause $100 billion in losses, fifty to a hundred times higher than the damage from a normal storm. The classic theory of insurance, which invokes the bell curve, breaks down at this point because of extreme variability and severe spatial concentration of this risk. When the black swan appears, a large portion of customers makes claims simultaneously, overwhelming insurers. These firms might still be solvent on average, meaning that over the long run, their premiums would cover all claims, but the moment cash balances turn negative, they implode. Indeed, catastrophe insurers who fail to plan for the variability of claims invariably find themselves watching in horror as one ill wind razes their entire surplus.
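The gap between "solvent on average" and "solvent in fact" can be illustrated with a toy simulation. All figures here are invented: premiums exceed expected claims (110 versus 79.5 per year), yet a rare catastrophic claim can still wipe out the insurer's cash before the long run arrives.

```python
import random

def ruin_probability(years=50, capital=500.0, premium=110.0,
                     normal_claims=50.0, cat_prob=0.01, cat_claims=3000.0,
                     trials=20_000, seed=11):
    """Fraction of simulated insurers that go broke within `years`,
    even though premiums exceed expected annual claims
    (110 vs. 0.99 * 50 + 0.01 * 3000 = 79.5 in the defaults)."""
    rng = random.Random(seed)
    ruins = 0
    for _ in range(trials):
        cash = capital
        for _ in range(years):
            claims = cat_claims if rng.random() < cat_prob else normal_claims
            cash += premium - claims
            if cash < 0:  # the moment the balance turns negative, the firm implodes
                ruins += 1
                break
    return ruins / trials

p_ruin = ruin_probability()
```

With these assumed numbers, roughly a third of the simulated insurers fail within fifty years despite being profitable "on average": one early catastrophe erases decades of accumulated surplus, which is exactly the negative-black-swan trap.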
Statisticians not only notice variability but also recognize its type. The more moderate type of variability forms the foundation of the automotive insurance business, while the extreme type threatens the hurricane insurers. This is why the government ”take-out” policy, in which the state of Florida subsidizes entrepreneurs to take over policies from failed insurers, made no sense; the concentrated risks and thin capital bases of these start-up firms render them singularly vulnerable to extreme events.
Variability is the reason why a steroid test can never be perfectly accurate. When the International Cycling Union (UCI), the governing body for cycling, instituted the hematocrit test as a makeshift method for catching EPO dopers, it did not designate a positive finding as a doping violation; rather, it set a threshold of 50 percent as the legally permissible hematocrit level for participation in the sport. This decision reflected UCI's desire to ameliorate the effect of any false-positive errors, at the expense of letting some dopers escape detection. If all normal men were to have red blood cells amounting to precisely 46 percent of their blood volume (and all dopers were to exceed 50 percent), then a perfect test could be devised, marking up all samples with hematocrit levels over 46 percent as positive, and those below 46 percent as negative. In reality, it is the proverbial "average male" who comes in at 46 percent; the "normal" hematocrit level for men varies from 42 to 50 percent. This variability complicates the tester's job: someone with red cell density of, say, 52 percent can be a blood doper but can also be a "natural high," such as a highlander who, by virtue of habitat, has a higher hematocrit level than normal.
UCI has since instituted a proper urine test for EPO, the hormone abused by some endurance athletes to enhance the circulation of oxygen in their blood. Synthetic EPO, typically harvested from ovary cells of Chinese hamsters, is prescribed to treat anemia induced by kidney failure or cancer. (Researchers noted a portion of the annual sales of EPO could not be attributed to proper clinical use.) Because EPO is also naturally secreted by the kidneys, testers must distinguish between "natural highs" and "doping highs." Utilizing a technique known as isoelectric focusing, the urine test establishes the acidity profiles of EPO and its synthetic version, which are known to be different. Samples with a basic area percentage (BAP), an inverse measure of acidity, exceeding 80 percent were declared positive, and these results were attributed to illegal doping (see Figure C-1).
To minimize false-positive errors, timid testers set the threshold BAP to pass virtually all clean samples including "natural highs," which had the effect of also passing some "doping highs." This led Danish physician Rasmus Damsgaard to assert that many EPO-positive urine samples were idling in World Anti-Doping Agency (WADA) labs, their illicit contents undetected. If testers would lower the threshold, more dopers would get caught, but a few clean athletes would be falsely accused of doping. This trade-off is as undesirable as it is unavoidable. The inevitability stems from variability between urine samples: the wider the range of BAP, the harder it is to draw a line between natural and doping highs.
Figure C-1 Drawing a Line Between Natural and Doping Highs [image]
[image]
Because the anti-doping laboratories face bad publicity for false positives (while false negatives are invisible unless the dopers confess), they calibrate the tests to minimize false accusations, which allows some athletes to get away with doping.
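The threshold trade-off can be made concrete with two overlapping bell curves. The distributions below are purely illustrative assumptions (clean samples centered at a BAP of 70, doped samples at 85, both with the same spread); only the 80 percent threshold comes from the text. Moving the line down catches more dopers but falsely accuses more clean athletes, and no placement eliminates both errors.

```python
from statistics import NormalDist

# Assumed, illustrative distributions of basic area percentage (BAP):
clean = NormalDist(mu=70, sigma=5)  # clean athletes, including "natural highs"
doper = NormalDist(mu=85, sigma=5)  # EPO users

def error_rates(threshold):
    false_positive = 1 - clean.cdf(threshold)  # clean sample flagged as doped
    false_negative = doper.cdf(threshold)      # doped sample passed as clean
    return false_positive, false_negative

fp_hi, fn_hi = error_rates(80)  # the conservative threshold in the text
fp_lo, fn_lo = error_rates(75)  # a lower line catches more dopers...
```

Because the two curves overlap, any threshold trades one error for the other: lowering it from 80 to 75 shrinks the false-negative rate while inflating the false-positive rate, which is exactly the dilemma the timid testers resolved in favor of the clean athletes.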
The Virtue of Being Wrong
The subject matter of statistics is variability, and statistical models are tools that examine why things vary. A disease outbreak model links causes to effects to tell us why some people fall ill while others do not; a credit-scoring model identifies correlated traits to describe which borrowers are likely to default on their loans and which will not. These two examples represent two valid modes of statistical modeling.