The pandemic has enabled us to study the details of how evolution happens – in real time. Scientists have generated more than two million genome sequences of the SARS-CoV-2 virus, allowing us to dissect the minutiae of evolutionary changes to a degree never previously possible for any replicating biological agent outside of the laboratory.
So what does this tell us about mutations and variants? Mutations are the ultimate engine of evolution and provide the raw material for natural selection to act on. Some mutations are helpful for an organism and can become widespread in the species. Others are harmful or have little consequence. Mutations arise from errors made when the genome is copied as a virus replicates, resulting in a single “base” (letter) being replaced with another.
The SARS-CoV-2 genome is made up of 30,000 individual bases. The rate at which mutations arise is typically expressed as the probability that any individual base will be erroneously replaced when the genome replicates. According to recent experimental evidence – which is yet to be published in a scientific journal – this is around three in a million.
Given this rate, we can ask how many mutations might arise every time someone gets infected. Multiplying 30,000 bases by the probability of 3/1,000,000 gives a total of about 0.1 mutations each time the genome replicates.
Peak infection lasts five to seven days, during which time the virus typically completes three to seven “replication cycles” (the steps from initial attachment to a host cell to the generation and release of newly synthesised virus particles). Five replication cycles would result in around 0.5 mutations (5×0.1), or one new mutation for every two people infected.
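The arithmetic above can be checked in a few lines of Python. This is only a sketch using the figures quoted in the text; the variable names are illustrative, and the choice of five replication cycles is the mid-range value used in the example above.

```python
# Figures quoted in the text above
GENOME_LENGTH = 30_000               # bases in the SARS-CoV-2 genome
ERROR_RATE_PER_BASE = 3 / 1_000_000  # chance a base is miscopied per replication
REPLICATION_CYCLES = 5               # mid-range of the three to seven quoted

# Expected mutations along a single line of descent
mutations_per_replication = GENOME_LENGTH * ERROR_RATE_PER_BASE
mutations_per_infection = mutations_per_replication * REPLICATION_CYCLES

print(f"per replication: {mutations_per_replication:.2f}")
print(f"per infection:   {mutations_per_infection:.2f}")
```

Without rounding, the figures come out at about 0.09 and 0.45, which the text rounds to 0.1 and 0.5.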
A different approach is to use genome sequence data. As each genome sequence is taken from a different infected person, this data allows us to calculate the rate at which mutations have accumulated in the global viral population, rather than within a single infection. By comparing the sequence data to an original “reference” genome (a very early virus genome) we can count how many mutations have accumulated in each genome. We can then see how quickly the number of mutations increases over time.
This tells us that the global population of viruses accumulates an average of about one mutation every two weeks – a rate similar to that within a single infected person, where around 0.5 mutations arise over the five to seven days of peak infection.
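One simple way to picture this second approach is to fit a straight line through mutation counts plotted against sampling dates and read off the slope. The sketch below does this with ordinary least squares. The data points are invented purely for illustration and chosen to reproduce the one-mutation-per-fortnight figure; they are not real sequence data, and real analyses use far more genomes and more careful statistical methods.

```python
# Toy data, invented purely for illustration (NOT real sequence data):
# (days since the reference genome was sampled,
#  number of mutations relative to that reference)
samples = [(0, 0), (14, 1), (28, 2), (42, 3), (56, 4)]


def mutations_per_day(points):
    """Ordinary least-squares slope of mutation count against time."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_m = sum(m for _, m in points) / n
    covariance = sum((t - mean_t) * (m - mean_m) for t, m in points)
    variance = sum((t - mean_t) ** 2 for t, _ in points)
    return covariance / variance


# Convert the daily slope into a rate per two weeks
rate_per_fortnight = mutations_per_day(samples) * 14
print(f"about {rate_per_fortnight:.1f} mutation(s) every two weeks")
```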
To put this mutation rate into context, human genomes experience the equivalent of around 0.05 mutations every two weeks. On the face of it, this is not so different from SARS-CoV-2 (only 20 times slower), until you consider that the human genome is 100,000 times larger. That makes the rate of mutation per base around two million times faster in the virus than in humans.
So SARS-CoV-2 has experienced roughly the same amount of mutational evolutionary change during the pandemic (proportional to genome size) as humans have since Homo habilis first walked the Earth about 2.5 million years ago.
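The comparison between the virus and humans rests on two multiplications, sketched below using only the figures quoted in the text. The variable names are illustrative. The last line works backwards from the 2.5 million year figure to show the length of pandemic it implies.

```python
# Figures quoted in the text above
VIRUS_MUTATIONS_PER_FORTNIGHT = 1.0
HUMAN_MUTATIONS_PER_FORTNIGHT = 0.05
GENOME_SIZE_RATIO = 100_000       # human genome size relative to the virus
HUMAN_TIMESPAN_YEARS = 2_500_000  # time since Homo habilis

# How much faster the virus mutates per genome, then per base
per_genome_ratio = VIRUS_MUTATIONS_PER_FORTNIGHT / HUMAN_MUTATIONS_PER_FORTNIGHT
per_base_ratio = per_genome_ratio * GENOME_SIZE_RATIO

# Viral time needed to match 2.5 million years of human change per base
implied_pandemic_years = HUMAN_TIMESPAN_YEARS / per_base_ratio

print(f"per genome: {per_genome_ratio:.0f} times faster")
print(f"per base:   {per_base_ratio:,.0f} times faster")
print(f"equivalent viral time: {implied_pandemic_years:.2f} years")
```

The implied viral time is about a year and a quarter, which is consistent with the comparison being made over the course of the pandemic.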
New variants
The calculation described above refers to the number of mutations expected within a single line of descent (lineage) from one virus particle to the next, and so on. To work out the total number of mutations arising during an infection we also need to take into account all the virus particles produced, each of which follows its own mutational path.
The total number of infectious virus particles produced over the course of an infection is somewhere between 300,000 and 300,000,000. If each lineage accrues an average of 0.5 mutations, then the total number of mutations during an infection, across all the virus particles combined, will be somewhere around 100,000 to 100,000,000 – a conservative estimate rather than an exact one.
The virus’s RNA code uses four letters (G, C, U and A), and the genome is 30,000 letters long. Mutation might change any one of these letters to any of the other three letters in the code. This gives about 100,000 possible single mutations in total.
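The two estimates above, mutations arising per infection and possible single mutations, can be set side by side. This is a rough sketch using the figures quoted in the text, with illustrative variable names.

```python
# Figures quoted in the text above
PARTICLES_LOW = 300_000        # infectious particles per infection, lower bound
PARTICLES_HIGH = 300_000_000   # upper bound
MUTATIONS_PER_LINEAGE = 0.5
GENOME_LENGTH = 30_000
ALTERNATIVE_BASES = 3          # each letter can change to one of three others

# Total mutations arising across all particles in one infection
total_low = PARTICLES_LOW * MUTATIONS_PER_LINEAGE
total_high = PARTICLES_HIGH * MUTATIONS_PER_LINEAGE

# Number of distinct single-letter changes the genome allows
possible_single_mutations = GENOME_LENGTH * ALTERNATIVE_BASES

print(f"mutations per infection: {total_low:,.0f} to {total_high:,.0f}")
print(f"possible single mutations: {possible_single_mutations:,}")
```

The unrounded values are 150,000 to 150,000,000 mutations per infection and 90,000 possible single mutations. The text rounds the first range down and the second figure up, so even the lower bound exceeds the number of possible single mutations.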
It therefore follows that all possible single mutations are likely to arise during each single infection. So why did we not see new dangerous variants emerging until several months into the pandemic?
The overwhelming majority of these mutations will not have any meaningful consequences, or will even be harmful to the virus. What’s more, only a tiny fraction of virus particles within an infected person cause further infections. Almost all of the mutations that accumulate within a host will be lost once the infection is resolved. Also, because the time between infections is short, natural selection will have little opportunity to pick the “best” mutants with which to infect new hosts.
We should be extremely thankful for these tight genetic “bottlenecks” as the virus transmits from one host to another. It is sobering to reflect that countless new dangerous variants may have emerged within infected people across the world, but apart from the half dozen or so mutants lucky enough to get passed on and subsequently spread to become variants of concern, they have been quickly consigned to evolutionary oblivion.
Evolutionary handicap?
The fact that almost all the mutations arising within a single infection never make it out into the wider world confers a major evolutionary handicap on the virus. However, this can be compensated for if the total number of infections is very large.
At the time of writing, there were about 620,000 infections a day globally. If an infection passes on an average of 0.5 mutations, this means that globally around 300,000 new mutations are passed from one host to another each day.
Just as the overwhelming majority of mutants arising within a single infected person will never be passed on, so the vast majority of those that make it through one initial transmission event will not go on to spread more widely in the population. But recall that the maximum number of possible mutations is around 100,000. So it is conceivable that every possible single mutation in the viral genome is transmitted from one person to another every day.
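The same kind of back-of-the-envelope check works at the global scale. The sketch below uses the daily infection count and per-infection figure quoted above, with illustrative variable names, and compares the result with the number of possible single mutations.

```python
# Figures quoted in the text above
INFECTIONS_PER_DAY = 620_000
MUTATIONS_PASSED_ON_PER_INFECTION = 0.5
GENOME_LENGTH = 30_000
ALTERNATIVE_BASES = 3

# New mutations handed from one host to the next, worldwide, each day
transmitted_per_day = INFECTIONS_PER_DAY * MUTATIONS_PASSED_ON_PER_INFECTION

# Distinct single-letter changes the genome allows
possible_single_mutations = GENOME_LENGTH * ALTERNATIVE_BASES

coverage = transmitted_per_day / possible_single_mutations
print(f"transmitted per day: {transmitted_per_day:,.0f}")
print(f"ratio to possible single mutations: {coverage:.1f}")
```

Roughly three mutations are transmitted each day for every possible single mutation, which is why it is conceivable, though not guaranteed, that each one is passed on daily.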
This may give the impression, as some commentators have recently opined, that the virus may be running out of evolutionary options, and that the chance of new, dangerous types occurring is small.
However, some properties of the virus might not be determined by single mutations acting alone, but by the interaction of multiple mutations acting in concert on the same genome. For example, the effect of a specific mutation might be greatly enhanced if it happens to arise within a genome that has already been affected by other specific mutations. If such effects are common in SARS-CoV-2, then the virus may yet have some evolutionary tricks to pull.
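To get a feel for how much larger the space of combinations is, one can count the possible pairs of single-letter changes. The sketch below is a rough count under a simplifying assumption that is not from the text: it considers only two substitutions at two different positions, and ignores insertions, deletions and combinations of three or more mutations.

```python
import math

GENOME_LENGTH = 30_000
ALTERNATIVE_BASES = 3  # each letter can change to one of three others

# Single mutations: one position, one of three replacement letters
single_mutations = GENOME_LENGTH * ALTERNATIVE_BASES

# Double mutations (simplifying assumption): choose two different
# positions, then one of three replacement letters at each
double_mutations = math.comb(GENOME_LENGTH, 2) * ALTERNATIVE_BASES ** 2

print(f"single mutations: {single_mutations:,}")
print(f"double mutations: {double_mutations:,}")
print(f"ratio: {double_mutations / single_mutations:,.0f}")
```

On this count there are around four billion possible double mutations, tens of thousands of times more than the number of single mutations.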
Ed Feil does not work for, consult, own shares in or receive funding from any company or organisation that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.