With a bit of theory and cash, we can at last play

DNA

Readers may be aware of a number of themes to which I return regularly. The top three are biology versus AI, parallel processing and object-oriented programming, but recently I've begun to see a way to combine them into one column. This one...

Last month, I described evolutionary developmental biology (Evo Devo), the recent science explaining how DNA gets turned into the multitude of forms of living creatures. To relate Evo Devo to computing, I used an analogy with 3D printing, but there's a far stronger potential connection.

The aspect of modern genetics that most laypeople are familiar with is the Human Genome Project, and in particular the claims made that it would lead to cures for genetic diseases. The cures haven't arrived, and Evo Devo tells us why: the relationship between individual genes and organisms is not as simple as was believed even 20 years ago.

DNA-to-RNA-to-protein-to-physical-effect isn't even close to what happens. Instead, there's a small group of genes shared by almost all multicellular creatures that get used over and again for many different purposes, in many different places, at many different times, controlled by a huge network of DNA switches contained in the "junk" DNA that doesn't code for proteins. In short, genes are the almost static data inputs to a complex biological computer, contained in the same DNA, which executes programs that can build a mouse or a tiger from mostly the same few genes.
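To make that last point concrete, here is a toy sketch in Python. Everything in it is invented for illustration (the gene names, the sequences, the regions and the timings are assumptions, not real biology): the genes are a static table, and two different switch lists, playing the role of the program, produce two different body plans from it.

```python
# Toy sketch: gene names, sequences, regions and timings are all invented.
# The genes are static data; the switch lists play the role of the program.
TOOLKIT = {
    "limb": "ATGGCC",
    "whisker": "ATGAAA",
    "stripe": "ATGTTT",
}


def develop(switches, steps=3):
    """Return the (time, region, gene) events fired by a list of switches.

    Each switch is a hypothetical (gene, region, time) triple saying where
    and when one of the toolkit genes is turned on.
    """
    events = []
    for t in range(steps):
        for gene, region, when in switches:
            if when == t and gene in TOOLKIT:
                events.append((t, region, gene))
    return events


# Same genes, different switches: the tiger merely adds one late switch.
MOUSE_SWITCHES = [("limb", "flank", 0), ("whisker", "snout", 1)]
TIGER_SWITCHES = MOUSE_SWITCHES + [("stripe", "flank", 2)]

mouse = develop(MOUSE_SWITCHES)
tiger = develop(TIGER_SWITCHES)
```

The only difference between the two creatures here lies in the switch lists, never in `TOOLKIT`, which is the point Evo Devo makes about real genomes, albeit on a vastly larger scale.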

The Human Genome Project relied on actual silicon computing power, not merely to store the results for each organism it sequenced (around 200GB per creature) but also to operate the guts of the automated sequencing machines that made it possible at all.

However, the data structures it worked with were fairly simple, mainly lists of pairs of the bases A, C, G and T. But let's suppose that the next-generation project ought to be to simulate Evo Devo: in other words, to mimic the way those lists of bases actually get turned into critters. Then you'd need some very fancy data structures indeed, ones that aren't merely static data but include active processes, conditional execution, spatial coordinates and evolutionary hierarchies. All of these components already exist and are understood in the world of computer science.

The first step would involve object-oriented programming: decompose those long lists (around 3 billion bases in each strand of your DNA) into individual genes and switch sequences, then put each one into an object whose data is the ACGT sequence and whose methods set up links to other genes and switches. You'd have to incorporate embryological findings about where in relative space (measured in cells within the developing embryo) and time (relative to fertilisation) each method is to be executed, and since millions of genes are doing their thing at the same, meticulously choreographed time, the description language would need to support synchronised parallelism. Having built such a description for many creatures, you could then arrange all their object trees into an inheritance hierarchy that accorded with the latest findings of evolutionary biology. If you were feeling mischievous, you could call the base class of this vast tree "God".
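A minimal sketch of what such objects might look like in today's Python follows. Every class name, attribute and sequence in it is hypothetical, and it ignores parallelism entirely: a `Gene` holds its ACGT data, a `Switch` links to a gene and fires at a given cell position and hour after fertilisation, and creatures inherit their switches from a common base class (which I've soberly called `Creature`).

```python
# Minimal sketch; every class, attribute and sequence here is hypothetical.
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    sequence: str  # the ACGT data

    def __post_init__(self):
        if set(self.sequence) - set("ACGT"):
            raise ValueError(f"{self.name}: sequence must use only A, C, G, T")


@dataclass
class Switch:
    """A regulatory sequence that turns a gene on at one place and time."""
    sequence: str
    target: Gene
    cell: tuple  # (x, y, z) position, measured in cells within the embryo
    hour: int    # time since fertilisation

    def fires(self, cell, hour):
        return cell == self.cell and hour == self.hour


class Creature:
    """Base class of the hierarchy; subclasses add genes and switches."""
    genes = {"body_axis": Gene("body_axis", "ATGCGT")}

    def switches(self):
        return [Switch("TATA", self.genes["body_axis"], (0, 0, 0), 0)]

    def active_genes(self, cell, hour):
        return [s.target.name for s in self.switches() if s.fires(cell, hour)]


class Mammal(Creature):
    genes = {**Creature.genes, "fur": Gene("fur", "ATGAAC")}

    def switches(self):
        extra = Switch("CAAT", self.genes["fur"], (1, 0, 0), 5)
        return super().switches() + [extra]


class Tiger(Mammal):
    pass  # inherits everything; a real model would add tiger-only switches
```

Asking `Tiger().active_genes((1, 0, 0), 5)` returns the inherited `fur` gene, while the same query on a bare `Creature()` returns nothing, which is the inheritance hierarchy doing its job.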

Now imagine I've been made director of the project and promised tens of billions of dollars. My first action would be to propose that a new programming language be created as a hybrid of Python (which has excellent sequence handling, in addition to objects) and Occam, whose concept of self-syncing communication channels the IT business is only just catching up with after 30 years. Naming it would be a problem, since Darwin (the obvious choice) is already taken.
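Until that hybrid exists, the channel idea can be approximated in ordinary Python. The sketch below is my own rough imitation, not Occam: the `Channel` class and the two process functions are hypothetical, and it assumes exactly one sender and one receiver per channel. It uses the standard `queue` and `threading` modules so that `send()` blocks until the receiver has actually taken the value, which is what keeps the two processes in step.

```python
import queue
import threading


class Channel:
    """Rough imitation of an Occam-style channel (hypothetical class).

    send() blocks until a receiver has taken the value. This sketch assumes
    one sender and one receiver per channel.
    """

    def __init__(self):
        self._q = queue.Queue(maxsize=1)

    def send(self, value):
        self._q.put(value)
        self._q.join()  # wait until the receiver has taken the value

    def receive(self):
        value = self._q.get()
        self._q.task_done()  # releases the blocked sender
        return value


def switch_process(out, signals):
    """Send each signal in turn, then None as an end marker."""
    for s in signals:
        out.send(s)
    out.send(None)


def gene_process(inp, log):
    """Record one expression event per signal until the end marker arrives."""
    while (signal := inp.receive()) is not None:
        log.append(f"expressed at {signal}")


chan = Channel()
log = []
threads = [
    threading.Thread(target=switch_process, args=(chan, ["hour 0", "hour 5"])),
    threading.Thread(target=gene_process, args=(chan, log)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Neither process needs a lock or a shared clock: the channel itself does the synchronising, which is why I'd want it built into the language rather than bolted on like this.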

The rest of the dosh would go towards a colossal multiprocessor computer, bigger than those used for weather forecasting, simulating nuclear explosions or the EU's Human Brain Project. Distribute the object tree for some creature over its millions of cores, connect up to a visualisation system and you'd be in the business of Virtual Creation. And once it could tell mice from tigers, you'd perhaps be in the business of curing human diseases too. Unfortunately, the people who have this sort of cash would rather spend it on one-way trips to Mars.

This article originally appeared in PC Pro