When the Supreme Court held that the Federal
Rules of Evidence superseded the Establishing causation in a toxic tort case is
difficult because the cause of the injury is not so directly
obvious. Critics have asserted that in deciding the
admissibility of scientific evidence, judges have collapsed legal
standards of proof into the more rigorous scientific standard,
thereby unfairly barring plaintiffs’ claims. One possible answer is to apply a statistical
method that is useful with epidemiological and other “unsupervised”
data sets, which are not under strict laboratory controls. It
can also express results in “more likely than not” terms while
retaining the customary 95% certainty. Principal components analysis (PCA) is
a decorrelation technique often used for exploratory data analysis
and is especially useful when there are multiple factors. This
analysis can sometimes be useful when there are hidden dependencies
between different object measures. PCA is commonly utilized in
fields such as astronomy, neural networks, psychology, forestry,
geochemistry and systems engineering to reduce the number of axes
in the data, thereby removing variables not of interest and
facilitating interpretation. This paper explains how principal components
analysis can be used to introduce causation evidence when the
standard scientific tests are not 95% certain of the causation
mechanism but do show with 95% confidence that a given factor is
most likely the cause of a given result. For example, if
toxicologists are 75% certain that chemical XYZ causes fish
mortality at a given exposure, this evidence may be excluded as
invalid knowledge to prove that XYZ actually killed any fish.
But a statistical test that demonstrates with a 95% certainty that
XYZ more likely than not (say 60% probability) caused the fish to
die should be admissible according to the
To prove causation, evidence must both be
admissible and support the burden of proof. Following
the Supreme Court’s Daubert decision, federal trial judges have
restricted
For most of this century American courts relied
on the
1.
In 1993, the Supreme Court changed this federal
standard and ruled that the 1975 Federal Rules of Evidence superseded the In The Supreme Court interpreted Rule 702, testimony of experts, to supersede the While expanding the types of scientific evidence
of experts, to supersede the While expanding the types of scientific evidence
that should be admitted, the Supreme Court clarified that this does
not mean that “the Rules themselves place no limits on the
admissibility of purportedly scientific evidence. Nor is the
trial judge disabled from screening such evidence. To the
contrary, under the Rules the trial judge must ensure that any and
all scientific testimony or evidence admitted is not only relevant,
but reliable.” Following
2.
The Eleventh Circuit Court of Appeals reversed,
holding “because the Federal Rules of Evidence governing expert
testimony display a preference for admissibility, we apply a
particularly stringent standard of review to the trial judge’s
exclusion of expert testimony.” On Certiorari, the Supreme Court reversed,
holding that the standard for review for evidentiary matters is
abuse of discretion.
3.
The third case in the As a matter of law, upon a de novo standard of
review, the 11 The Supreme Court unanimously rejected the 11 The Supreme Court, as in Interestingly, the Court seems to have backed
away from establishing guidelines, instead, clarifying that the
B. Scientific Evidence of Causation in Fact
Science is both a method for discovery and the
resulting body of knowledge. Although day-to-day science is
conducted in a more intuitive and less structured process,
Statistical methods use hypothesis testing to
determine if a relationship between variables is simply the result
of chance. If a relationship is established, the next step is
to determine the strength of that relationship, which is discussed
in the epidemiological measures of risk section of this paper.
Plaintiff’s experts usually testify about studies that meet both of
these elements as distinct steps. This is not the only way to
apply scientific methodologies, but it is the way it is currently
done in toxic torts.
1. Statistical Techniques and Significance Testing in Assessing Scientific Integrity
While traditional physical sciences have used
calculus and other particular mathematics, modern science
increasingly relies on statistics of data to explain phenomena.
Statistics describe data, often summarizing numerous observations,
and are commonly used in psychology, astronomy, toxicology,
climatology, economics and physics, just to name a few.
Through regression models, analyzing correlations between variables
and other multivariate techniques, statistics can be used to infer
causation by describing the likelihood that observed relationships
are random. In order to say whether a distribution of
variables is random or not, one must first select the degree of
accuracy required of this conclusion. For example, if a craps
shooter throws four straight 7s, what is the appropriate reaction of
the pit boss? How sure is he that the shooter is cheating?
Is he willing to immediately throw her out of the casino? Should he
stop the game to examine the dice? Perhaps he prefers to watch
the next few throws before he’s certain enough to risk making the
superstitious players unhappy by interrupting their lucky streak.
People intuitively make statistical decisions
every day that are weighted by the risks of each circumstance.
Statisticians use p-values to set these desired certainties,
typically requiring 95% confidence that the
observation is not random, which corresponds to a p-value threshold of 0.05.
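The pit boss's dilemma can be put in numbers. The following is a minimal sketch, assuming two fair six-sided dice and independent throws; the function names and the ALPHA cutoff variable are illustrative choices, not part of any cited method. It computes how likely four straight 7s would be by chance alone and compares that figure with the conventional 0.05 threshold.

```python
from fractions import Fraction
from itertools import product


def prob_of_sum(target: int) -> Fraction:
    """Chance that two fair six-sided dice add up to `target`."""
    rolls = list(product(range(1, 7), repeat=2))  # all 36 equally likely outcomes
    hits = sum(1 for first, second in rolls if first + second == target)
    return Fraction(hits, len(rolls))


def prob_streak(target: int, length: int) -> Fraction:
    """Chance of `length` consecutive throws of `target`, assuming independence."""
    return prob_of_sum(target) ** length


ALPHA = Fraction(5, 100)           # conventional 0.05 threshold (95% level)
chance_of_streak = prob_streak(7, 4)

print(f"P(single 7)       = {prob_of_sum(7)}")
print(f"P(four straight)  = {chance_of_streak} ~ {float(chance_of_streak):.5f}")
print(f"Below 0.05 cutoff = {chance_of_streak < ALPHA}")
```

Under these assumptions the streak has a chance probability of 1 in 1,296, far below 0.05. The sketch does not account for how many shooters the pit boss watches in a night, which is one reason he might still prefer to watch a few more throws.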
A lower p-value means the observed result would be less likely to
arise from random error alone, so the scientists would reject the null
hypothesis.
Simple linear regression models are used to infer
causation from association. A line is drawn through a set of
data so that the sum of the squared differences between each data
point and the line is minimized. The line is mathematically
described by an intercept and a slope, which relate the variables. For
example, a regression model that roughly predicts a person’s salary
can be estimated based on their experience.
Salary = $15,000 + $2,000(years of experience)
In this model, a person with no experience is
predicted to make $15,000 while a person with 10 years of experience
is predicted to make $35,000. This simple linear model would
have relatively low statistical significance since it will have a
high error associated with each prediction. If the model were
improved into a multiple regression that also accounts for the salary
differences between men and women, it would predict salaries much more
accurately and would have a higher level of confidence. This is because the
salary of men with the same number of years of experience is more
than that of women. Left unmodeled, this difference adds variability to
the simple model, thus reducing its predictive value and producing results
that appear more random. Regression model predictions are typically
described by the confidence interval of the resulting estimate.
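The least-squares idea behind the salary model can be sketched as follows. This is an illustration under stated assumptions: the six data points are hypothetical, chosen so that the fitted line matches the $15,000 intercept and $2,000 slope used in the example, and the function names are invented for this sketch.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_error(xs, ys, intercept, slope):
    """The quantity that the fitted line minimizes."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical observations: years of experience and salary in dollars.
years = [0, 2, 4, 6, 8, 10]
salaries = [16000, 18000, 23000, 27000, 30000, 36000]

intercept, slope = fit_line(years, salaries)
print(f"Salary = ${intercept:,.0f} + ${slope:,.0f}(years of experience)")
print(f"Predicted at 10 years: ${intercept + slope * 10:,.0f}")
print(f"Sum of squared errors: {sum_squared_error(years, salaries, intercept, slope):,.0f}")
```

The nonzero sum of squared errors is the scatter that the single-predictor line cannot explain; the more of that scatter a model leaves behind, the wider the uncertainty around each prediction.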
Higher confidence in an estimate is obtained when the data more
closely fit the model, there are more data to use in the model,
2. Epidemiological Measures of Risk
Epidemiologists study the incidence, distribution and etiology of
disease in populations and the influence of the environment and
lifestyle on disease patterns; for example, how tobacco has affected
the health of a particular population.
Epidemiologists focus on general causation, things that can
cause disease, rather than specific causation, such as the cause of
disease in a specific individual.
These studies are uncontrolled, meaning not conducted under laboratory
conditions, since deliberately exposing humans to suspected hazards
would be unethical.
Since relative risk is most frequently used in litigation, only this
measure will be described.
Relative risk is the
There has been much debate within the legal community whether
plaintiffs are unfairly held to a higher burden of proof when their
main evidence of causation is scientific.
Although much scientific work with results below the 95% certainty
now required in most courtrooms is considered valid and is published
in scientific journals, most lower certainty techniques are used
primarily for exploratory research, where these results are further
investigated via more robust methods, such as regression models.
After
While it is true that science is not
amenable to reduced significance standards,
There are limitless ways to work with science and its strict
methodologies without transferring an undue burden of proof onto a
plaintiff. The trick is
to keep in mind the purpose of the investigation.
Instead of aiming to unveil universal truths, science should
be applied to determine probabilities in legal, “more likely than
not”, terms.
Principal components analysis, an exploratory data analysis
technique, is useful for civil litigation because it interprets
complex environmental data more clearly than multiple regressions
and it specifies the association between each variable and the
outcome. The goal in PCA
is not always to model a phenomenon perfectly, but to quickly
understand relationships between the independent variables and the
outcome at whatever completeness the researcher wishes to discern,
including 51%.
Most commonly used in discerning environmental relationships,
Principal components analysis
If you imagine the dataset as a cloud, graphed in as
many dimensions as there are descriptive variables, the first
principal component is the line that would extend furthest through
this cloud, thereby capturing the most variability.
If you imagine that this line represents the line between the
corners of your mouth, as is done in lipreading programs, the form of
the lips can be described most efficiently from the first principal
component, the basic line across the mouth.
This is how PCA is used to compress data.
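The geometric picture above, the first principal component as the direction of greatest spread through the data cloud, can be sketched in a few lines. The example is a hedged illustration only: the five data points are hypothetical, the function names are invented for this sketch, and power iteration is just one simple way to find the dominant direction (statistical packages typically use other numerical routines).

```python
import math


def covariance_matrix(points):
    """Sample covariance matrix of a list of equal-length numeric rows."""
    n = len(points)
    dims = len(points[0])
    means = [sum(row[j] for row in points) / n for j in range(dims)]
    centered = [[row[j] - means[j] for j in range(dims)] for row in points]
    return [
        [sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(dims)]
        for i in range(dims)
    ]


def first_principal_component(points, iterations=200):
    """Return (unit direction, share of total variance) of the first component.

    Uses power iteration on the covariance matrix. Assumes the data are not
    constant and that the all-ones starting vector is not exactly
    perpendicular to the dominant direction.
    """
    cov = covariance_matrix(points)
    dims = len(cov)
    vec = [1.0] * dims
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    # Variance along the direction found (Rayleigh quotient of a unit vector).
    along = sum(
        vec[i] * sum(cov[i][j] * vec[j] for j in range(dims)) for i in range(dims)
    )
    total = sum(cov[i][i] for i in range(dims))
    return vec, along / total


# Hypothetical two-variable cloud lying close to a diagonal line.
cloud = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]
direction, share = first_principal_component(cloud)
print("Direction of first component:", [round(v, 3) for v in direction])
print(f"Share of total variance captured: {share:.3f}")
```

Because nearly all of the spread in this cloud lies along one diagonal, a single number per observation (its position along that line) describes the data almost as well as the original pair of measurements.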
PCA is also useful for removing factors that confuse the
analysis. The main
component should not be removed, but let’s say that the person
speaking is nervously tapping and this motion is causing their face
and lips to tremble, thereby interfering with the speech recognition
program. PCA can remove
this “noise,” which is how it is used in climate studies, where
numerous subtle effects distort the more easily understandable major
climate influences, like latitude, elevation and El Niño events.
While scientists usually group these to explain as much of the outcome as
possible, they could also be grouped or singled out to explain 51%
of an outcome, such as the variables that explain cancer.
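One way to read the 51% grouping described above is in terms of the share of total variance that the leading components account for. The sketch below assumes that reading, which is an interpretation rather than something the analysis itself dictates, and uses hypothetical component variances; the function name and the threshold parameter are illustrative.

```python
def components_needed(variances, threshold=0.51):
    """Smallest number of leading components whose combined share of the
    total variance reaches `threshold`; returns (count, share reached)."""
    total = sum(variances)
    cumulative = 0.0
    ordered = sorted(variances, reverse=True)  # largest component first
    for count, variance in enumerate(ordered, start=1):
        cumulative += variance
        if cumulative / total >= threshold:
            return count, cumulative / total
    return len(ordered), 1.0


# Hypothetical variances for five components (not taken from any study).
component_variances = [4.0, 2.5, 1.5, 1.2, 0.8]

count_51, share_51 = components_needed(component_variances, threshold=0.51)
count_90, share_90 = components_needed(component_variances, threshold=0.90)
print(f"51% threshold: {count_51} components cover {share_51:.0%}")
print(f"90% threshold: {count_90} components cover {share_90:.0%}")
```

With these made-up figures, two components are enough to pass the 51% mark, while four are needed to pass 90%, which illustrates how a lower threshold lets the analyst stop with fewer factors.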
Although there are many factors which contribute to lung
cancer, a PCA would be able to separate out the factor or factors
that more likely than not caused a specific cancer.
Furthermore, contributing variables that are absent from an
individual’s case, such as smoking for a person who never
smoked, could be included in the calculation via Bayes’ Theorem to add
confidence to a prediction about the cancer-causing variables that
are present, such as work exposure.
Intuitively, the chance that the work exposure caused the
cancer increases when other causes are eliminated.
Factor analysis is a commonly used statistical technique that
measures an unobservable element by combining measured ones,
using linear combinations of variables to explain sets of
observations of many variables, just as principal components analysis does. For
example, factor analysis is useful for intelligence tests, where
observed variables are various test score results.
Psychologists are interested not in the individual tests but in the
intelligence they were designed to describe.
The difference between the two is that in PCA, the observed
variables themselves are the focus of interest, whereas in factor
analysis, it is the unobservable element that is of interest.
The principal components approach simplifies the
interpretation of confounding variables, whereas factor analysis
disregards those observed variables and looks at the underlying
factor.
2. Hypothetical Application of
Principal Components Analysis to Lung Cancer
Since this paper merely aims to lay out the general concept of how
to use PCA for legal evidence, the numerous steps required to
produce a resulting table like the example below have been omitted.
Perhaps the weakest aspect of this approach is the actual
assignment of the principal components to the nearest independent
variable. Although it
involves some subjectivity, scientific studies utilizing PCA are
generally accepted in numerous fields and authoritative
publications.
Suppose a PCA is run on 20,000 people from Texas who have lung
cancer and the analysis includes data on whether they smoke, live
with someone who smokes (both measured in packs per day), smog
levels outside their home and known work exposure to asbestos and
other high risk materials (both combined indices).
Bayes’ Theorem describes how to link distinct probabilities of
specific conditions into the total combined probability given those
conditions.
Bayes’ Theorem

$$P(C \mid E) = \frac{P(C)\,P(E \mid C)}{P(C)\,P(E \mid C) + P(\text{not } C)\,P(E \mid \text{not } C)}$$

where,
P(C) = the current probability that plaintiff contracted cancer from cigarettes (0.6)
E = new piece of probabilistic evidence
P(C|E) = new probability of C, given E
P(E|C) = the probability of E given C (probability that smog causes cancer = 0.2)
P(not C) = the probability that C is not true (1 - P(C) = 0.4)
P(E|not C) = the probability of E if C were not true (since we have
evidence that the

$$P(C \mid E) = \frac{0.6 \times 0.2}{0.6 \times 0.2 + 0.4 \times P(E \mid \text{not } C)} = 0.75$$
The combined information gives a new estimate that smoking is
75% likely the cause of the cancer for a person who smokes and lives
outside a smoggy area.
This could be further improved by accounting for the missing smoking
housemate and possibly the effects of work exposure.
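The update described above can be checked with a short calculation. This is a sketch under a stated assumption: the text supplies the prior (0.6), the probability of E given C (0.2) and the result (0.75), but the probability of E if C were not true is not stated, so the value 0.1 used here is back-solved from the stated result and should be treated as hypothetical. The function and variable names are likewise illustrative.

```python
def posterior(prior, p_e_given_c, p_e_given_not_c):
    """Bayes' Theorem for a two-way split (C versus not C)."""
    numerator = prior * p_e_given_c
    denominator = numerator + (1.0 - prior) * p_e_given_not_c
    if denominator == 0:
        raise ValueError("evidence has zero probability under both hypotheses")
    return numerator / denominator


PRIOR_C = 0.6          # from the text: probability of C before the new evidence
P_E_GIVEN_C = 0.2      # from the text
P_E_GIVEN_NOT_C = 0.1  # ASSUMPTION: not stated; back-solved from the 0.75 result

updated = posterior(PRIOR_C, P_E_GIVEN_C, P_E_GIVEN_NOT_C)
print(f"Updated probability of C given E: {updated:.2f}")
```

Further evidence, such as the absence of a smoking housemate, could be folded in by treating the updated figure as the new prior and calling the function again with probabilities for that evidence.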
3. Possible Challenges to using PCA to show Causation
One problem with this analysis is that the 9
Many jurisdictions require particular evidence to show causation for the
individual plaintiff; therefore, statistics alone may be admissible
yet insufficient. Here,
a mechanism is needed, such as the etiology of the injury.
However, courts need to know that not every mechanism is
understood. The expert
should discuss possible mechanisms, based on scientific knowledge,
and leave that factual question for the jury.
Also, additional statistical tests, such as time-series
analysis, relating to the timing might sufficiently support
association evidence in proving causation.
Opposing counsel may challenge the methodology of exploratory
PCA evidence as unscientific and only suggestive of how the
scientist should proceed to conduct a meaningful investigation.
Although PCA is used in many fields and the results are
published in reputable journals, it has not yet made an appearance
in toxic tort litigation.
Hopefully this will soon change.
Principal components analysis works well in the context of civil
litigation, where a preponderance of the evidence standard of proof
is sought, since PCA can separate out the individual influence of
each independent variable.
Furthermore, it is especially well-suited for environmental
datasets, where there are many confounding influences.
Since this technique is generally accepted in the scientific
community and used in numerous scientific fields, it should be
admissible under Rule 702 when used appropriately.
Specific PCA studies, if reasonably conducted, should follow both
the law and spirit of
This combined approach of PCA and Bayes’ Theorem is useful for
environmental and toxic torts, where there is no conclusive
laboratory experimental research to support the real-world evidence
of causation, such as epidemiological associations.
PCA is commonly used to discover ecological and climate
influences and it should be applied in toxic tort litigation to
bring polluters to justice and thereby reduce disease and resource
degradation.
The Supreme Court intended to liberalize the Federal Rules of
Evidence so that cutting edge technology would be available in the
courtrooms. PCA is
precisely the type of technique that is needed in the courtrooms to
facilitate the fair resolution of conflicts.
PCA combines a scientifically rigorous method with the
flexibility to find causation at the civil burden of proof, which is
an ideal balance to ensure justice and fairness in toxic tort
litigation.
JuriSense, LLC
Seal Beach, CA
(800) 891-6592
info@jurisense.com