Friday, April 17, 2009

Science by machine....

Two rather amazing papers in Science on automating scientific discovery describe a computer program that can sift raw and imperfect data to uncover fundamental laws of nature and a robot that can not only devise a hypothesis but can also run and analyze experiments to test the hypothesis. One wonders how soon old-fashioned bench scientists like myself will become obsolete.

Schmidt and Lipson use genetic programming, which starts with random guesses at a solution and then employs an evolution-inspired algorithm to shuffle and change pieces of the equations until it finds a solution that works. They demonstrate their approach by automatically searching motion-tracking data captured from various physical systems, ranging from simple harmonic oscillators to chaotic double-pendula. Without any prior knowledge about physics, kinematics, or geometry, the algorithm discovered Hamiltonians, Lagrangians, and other laws of geometric and momentum conservation. The discovery rate accelerated as laws found for simpler systems were used to bootstrap explanations for more complex systems, gradually uncovering the "alphabet" used to describe those systems.
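For readers who have not seen genetic programming before, here is a toy sketch of the general idea: random guesses, then repeated shuffling and changing of equation pieces, keeping whatever fits best. It is not the method from the paper. It scores candidates by plain fit error on made-up (x, y) samples, and every name in it (`random_tree`, `vary`, `evolve`, the population sizes) is my own invention for illustration.

```python
import math
import random

# Binary operators the candidate equations may use (an illustrative choice).
OPS = {"+": lambda a, b: a + b,
       "-": lambda a, b: a - b,
       "*": lambda a, b: a * b}

def random_tree(rng, depth=3):
    """Build a random expression over the variable 'x' and small constants."""
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.5 else float(rng.randint(-2, 2))
    op = rng.choice(list(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, x):
    """Evaluate an expression tree at the point x."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree

def error(tree, data):
    """Mean absolute error on (x, y) samples; non-finite results count as infinite."""
    total = sum(abs(evaluate(tree, x) - y) for x, y in data)
    return total / len(data) if math.isfinite(total) else math.inf

def nodes(tree, path=()):
    """Yield the path to every node in the tree."""
    yield path
    if isinstance(tree, tuple):
        yield from nodes(tree[1], path + (1,))
        yield from nodes(tree[2], path + (2,))

def get(tree, path):
    """Return the subtree found by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the subtree at path swapped for new."""
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(parts)

def vary(rng, a, b, max_nodes=40):
    """Graft a piece of b (crossover) or a fresh random piece (mutation) into a."""
    if rng.random() < 0.7:
        donor = get(b, rng.choice(list(nodes(b))))
    else:
        donor = random_tree(rng, 2)
    child = replace(a, rng.choice(list(nodes(a))), donor)
    # Keep the parent if the child has grown too large.
    return child if sum(1 for _ in nodes(child)) <= max_nodes else a

def evolve(data, generations=30, pop_size=100, seed=0):
    """Return the best expression found and the best error at each generation."""
    rng = random.Random(seed)
    population = [random_tree(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda t: error(t, data))
        history.append(error(population[0], data))
        survivors = population[: pop_size // 4]  # the fittest quarter survives
        children = [vary(rng, rng.choice(survivors), rng.choice(survivors))
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    best = min(population, key=lambda t: error(t, data))
    return best, history + [error(best, data)]

if __name__ == "__main__":
    # Synthetic samples of y = x*x + x, standing in for real measurements.
    samples = [(x / 2, (x / 2) ** 2 + x / 2) for x in range(-6, 7)]
    best, history = evolve(samples)
    print(best, history[-1])
```

Because the fittest quarter of each generation is carried over unchanged, the best error never gets worse from one generation to the next. Whether the search lands on the exact formula depends on the random seed and the settings.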

King et al. constructed a robot scientist named Adam that used artificial intelligence to come up with a hypothesis about genes in baker’s yeast and the enzymes produced by the genes. It then designed and ran experiments to test its hypothesis. Using the results, it revised its hypothesis and ran more experiments before arriving at its conclusions. From their abstract:

Adam has autonomously generated functional genomics hypotheses about the yeast Saccharomyces cerevisiae and experimentally tested these hypotheses by using laboratory automation. We have confirmed Adam's conclusions through manual experiments. To describe Adam's research, we have developed an ontology and logical language. The resulting formalization involves over 10,000 different research units in a nested treelike structure, 10 levels deep, that relates the 6.6 million biomass measurements to their logical description. This formalization describes how a machine contributed to scientific knowledge.

1 comment:

  1. There appear to me to be a number of very important but "too hard" problems confronting science. By "too hard" I mean that they are basically beyond the capacity of the human brain in the number of variables and coupled realms involved. The granddaddy is probably meteorology, where it was recognised some time ago that the problem complexity was bigger than us, so we had to resort to numerical models that run in a computer to get forecasts out beyond a day or two. Economics is another crazy example of people relying on simplistic Stone Age heuristics to grapple with computationally immense complexity. Many biological problems are also in this category, where the components may be intelligible but the system has too many convoluted couplings and feedbacks to fit into anyone's head.

    The future is certainly with computers that can sift through complexity without resorting to manifestos, aphorisms, wild simplifications, or personal preferences.

    If they can make up the stuff to test out, so much the better. Let's hope they are also built with extensions that can spit out the kind of simple but tasty factoids that humans love to hear and repeat to others. (Obviously these need to be generated downstream after the main processing, unlike the human knowledge model.) This will not just keep the blogs pumping but will also allow us to take them seriously!