Professor ferrets out mysteries of biology by giving computers “intelligence”
As a computer scientist who endows machines with artificial intelligence,
Daphne Koller might seem an unlikely person to draw inspiration from
the late 19th-century naturalist John Muir. But one of his quotes describes
her fascination with the world, she says, and explains why she wades
so far and so deep into the complex world of molecular biology:
“When we try to pick out anything by itself, we find
that it is bound fast by a thousand invisible cords that cannot be
broken, to everything in the universe.”
While some would see the intricacy Muir observed as too immense to understand,
Koller sees it primarily as an opportunity for “intelligent,” even
insightful, software to enable unique contributions to knowledge.
That opportunity is especially great in the rapidly exploding realm of
quantitative biological research. Torrents of raw data are currently
pouring forth from new experimental assays in genomics, proteomics, and
other molecular biology research.
“Biology is a science in flux because it has gone from being
purely experimental on small scales to becoming an information science
on a large scale,” Koller says. “But we are currently seeing
only the tip of the iceberg in terms of the biological insights that
we can extract from these data.”
Koller’s favorite subjects are genes, proteins and metabolic pathways.
Her methods are a fusion of logic and probabilistic (statistics-based)
reasoning and machine learning that she has helped to pioneer. For that
fundamental work on artificial intelligence, the 39-year-old researcher
this week was named the first-ever recipient of the ACM-Infosys Foundation
Award in the Computing Sciences. The Association for Computing Machinery
and the Indian technology giant created the $150,000 prize last year
to recognize young researchers “whose contemporary innovations
are having a dramatic impact on the computing field.” Koller, who
in 2004 was also named a MacArthur Fellow, will formally receive the
ACM-Infosys award in June.
“Prof. Koller's work on combining ‘first-order’ logic and
probability is the most important of her many research contributions in Artificial
Intelligence and Computer Science,” reads the award citation. “It
has transformed the way people handle uncertainty in large computer systems,
such as heterogeneous databases, image understanding systems, biological and
medical models, and natural language processing systems.”
Filling in biological blanks
Computers are humanity’s best tools for handling large volumes
of information, but Koller sees their real value not only in their ability
to crunch numbers, but also in their potential to infer what’s
happening in these complex systems — to help people understand
the dynamics and relationships that the data alone do not describe.
Among the bigger biological mysteries that Koller has sought to solve
with software are the mechanisms by which individual organisms come to
have the variations that make them unique. Any two individuals within
the same species — whether they are people or yeast strains — will
still have distinct sequences of DNA called “genotypes.” But
how those genetic blueprints are translated into their unique set of
traits, called “phenotypes” and ranging from height to cancer,
is a very complex process involving many interactions
that are still unknown.
In late 2006, Koller and student Su-In Lee published a paper with Harvard
geneticists in the Proceedings of the National Academy of Sciences,
in which they unveiled software called “Geronemo” that was
aimed at understanding how these individual genetic variations can perturb
the interactions between the elements in the cells (genes and proteins)
and ultimately lead to phenotypic changes. This software was able to
identify novel ways in which differences in yeast individuals led to
differences in regulating gene expression. The software took information
about gene expression and gene regulation within the specimens’ DNA
and compared which regulatory mechanisms best predicted gene expression
profiles. Ultimately, the software revealed that much of the individuality
among the yeasts was determined by how they used proteins to unfold their
DNA strands in different ways.
A year later in Genome Biology, Koller, student Haidong Wang
and computer scientists and biologists from three other institutions
unveiled another software package, called “InSite,” that
integrated protein and sequence data to infer exactly where on their
surfaces two interacting proteins would bind. How these proteins hook
up, especially when mutations warp them, may have a lot to do, biologists
believe, with how certain diseases such as cancer develop.
Both of the papers, because they explain how individual genotypes affect
cellular pathways and result in individual phenotypes, could someday
lead to advances in personalized medicine and drug design, Koller says.
Currently, many existing drugs are not usable because they are effective
for only a subset of the population or cause severe side effects in another
subset.
“If we could identify those individuals that will respond well
to a drug, we will have access to a whole new range of therapeutic treatments,” Koller
says.
Future ideas
Not all of Koller’s research focuses on biology. Her group is currently
investigating how to get computers not only to identify objects in
pictures correctly, but also to outline those objects and determine
how they are configured. For example, her group has managed
to create software that can find a giraffe in a picture, outline it precisely
and then use the shape of the outline to determine with high accuracy
whether the giraffe is leaning down to drink or standing upright. This
may seem obvious to people, but is a major feat for a computer.
Such sophisticated “machine vision” has many applications,
including enhancing image search and endowing mobile robots
with a better “understanding” of their surroundings. But
Koller also has a medical idea in mind. She plans to launch a new project
in which she will train the software to analyze magnetic resonance scans
of the brains of psychiatric patients. The software might prove useful
in diagnosing autism and Alzheimer’s disease by spotting characteristic
abnormalities in the size and shape of the hippocampus.
Koller, together with neonatologist Dr. Anna Penn and student Suchi
Saria, is also embarking on an effort at Stanford Hospital to apply artificial
intelligence to data collected from the vital sign monitors of premature
infants while they are in neonatal intensive care units. The question
will be whether sophisticated analysis can identify the precursors of
adverse outcomes and give doctors and nurses the opportunity for early
intervention.
Koller’s milieu, mission, and methods are each a world apart from
those of Muir, but they are rooted in the same appreciation of life’s
complexity and interdependence, something Koller likes to call the “web
of influence.” By exploring that web with artificial intelligence,
she is adding her own richness to humanity’s understanding of nature.
May 2008