If there is need to prove that selection works at multiple levels, then one need look no further than selfish genetic elements (SGEs). These are the ultimate parasites. Many have persisted for millions of years, and the most adept have evolved to the point where the host no longer recognises the element as foreign. In fact, the most pernicious / smart / long-lived are those that deliver benefit to the host. Once an element has evolved in this direction, labelling it “selfish” is an oxymoron. But for the parasite there is a fine line: if it becomes too benevolent, there is a risk that it loses the advantages that stem from non-Mendelian behaviour.
Sequencing the Pseudomonas fluorescens SBW25 genome proved challenging, even for the Sanger Centre, which undertook the project. For several years the genome remained as several hundred unassembled (and un-assemble-able) contigs. The problem was the existence of what were assumed to be repetitive elements. Fortunately, new chemistries became available and the problem was eventually solved. With resolution came recognition that the SBW25 genome is packed full of short, imperfectly palindromic elements. These elements are known from work, particularly in E. coli, as Repetitive Extragenic Palindromic (REP) sequences.
REP sequences have been recognised since the 1980s, with a paper in Nature reporting their existence and even suggesting that they might be SGEs. This idea never gained traction, presumably because the elements are simply too short (~20 bp) to encode the enzymatic machinery necessary for their own movement. Nonetheless, they received much attention, particularly because of their uniqueness (each strain seemed to have a different REP profile) and thus their value as bar codes for genotypes. There was once an entire industry devoted to documenting diversity via REP-profiling. REP-PCR was for a while all the rage. These days it has been displaced by 16S amplicon sequencing. It’s all the same: another way of collecting stamps.
In 2008 a young computational scientist, Frederic Bertels, turned up looking for a PhD opportunity. Frederic signed up, and I was interested in learning about the distribution of over-represented short sequences, irrespective of their identity as, for example, REPs. Frederic needed some persuasion: I remember suggesting that he might consider approaching the genome as a zoologist interested in plant diversity would approach a meadow full of plants. The first thing such a zoologist needs to learn is how to recognise and categorise the green leafy things growing in the meadow.
Frederic took up this challenge and built null models of the genome that allowed him to ask, for sequence classes of different lengths, whether they were over-represented in the SBW25 genome. He showed that sequences of 16 bp (and above) were vastly over-represented compared to a null model based on chance. He focused on these short repeats and showed that there were three different sequence “flavours” among the over-represented class. All three turned out to have characteristics of REP sequences.
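To give a flavour of what asking about over-representation against a null model can look like, here is a minimal sketch in Python. It is not Frederic's actual method: the null model here (shuffling the bases, which preserves only base composition) is an assumption chosen for simplicity, and the function names are hypothetical. It counts how often the most frequent k-mer occurs in a sequence and compares that with the same statistic in shuffled copies.

```python
import random
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def max_kmer_count(seq, k):
    """Occurrences of the most frequent k-mer (0 if seq is shorter than k)."""
    counts = kmer_counts(seq, k)
    return max(counts.values()) if counts else 0


def shuffled_null(seq, k, n_shuffles=100, seed=0):
    """Max k-mer count in each of n_shuffles base-shuffled copies of seq.

    Assumption: shuffling keeps base composition only; a real analysis
    would likely need a richer null model.
    """
    rng = random.Random(seed)
    bases = list(seq)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(bases)
        null.append(max_kmer_count("".join(bases), k))
    return null


def empirical_p(observed, null):
    """Fraction of null values >= observed, with a +1 pseudocount."""
    hits = sum(1 for x in null if x >= observed)
    return (hits + 1) / (len(null) + 1)


if __name__ == "__main__":
    # Toy "genome": random background with a made-up 16-mer inserted 20 times.
    rng = random.Random(1)
    background = "".join(rng.choice("ACGT") for _ in range(5000))
    repeat = "GGCTAGCCTAGGCTAG"  # hypothetical sequence, not a real REP
    toy = "".join(background[i * 250:(i + 1) * 250] + repeat for i in range(20))

    observed = max_kmer_count(toy, 16)
    null = shuffled_null(toy, 16, n_shuffles=100)
    print(observed, max(null), empirical_p(observed, null))
```

On a real genome one would repeat this for a range of k and look for the length at which the observed counts pull away from the null, which is the kind of comparison described above.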
Going further (and cutting very short a long story that you can read here), Frederic showed that REPs are just one component of a larger element comprising two REP sequences and an intervening region: together the two REPs form a hairpin-like structure of about 100 bp. Frederic provided evidence that this larger element, which we termed a REPIN, is evolutionarily alive. Moreover (and this too is a long story), Frederic showed that REPINs of different flavours are associated with transposases known as RAYTs. It seems that RAYTs are involved in REPIN movement and that the REPINs are non-autonomous genetic elements that rely on RAYTs for their dissemination.
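The two-REPs-plus-spacer structure can be illustrated with a small Python sketch. It assumes that the hairpin arises because the second REP is the reverse complement of the first, and it uses exact matching, which is a simplification given that real REPs are imperfect. The function names and the 120 bp length cut-off are hypothetical choices for illustration only, and the input is assumed to be uppercase A, C, G and T.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase ACGT string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq))


def find_all(seq, motif):
    """Start positions of every (possibly overlapping) occurrence of motif."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def candidate_repins(seq, rep, max_len=120):
    """Spans (start, end) where rep is followed by its reverse complement.

    A span is reported when the inverted copy starts after rep ends and the
    whole element (both copies plus the spacer) is at most max_len long.
    """
    rc = reverse_complement(rep)
    rc_starts = find_all(seq, rc)
    spans = []
    for s in find_all(seq, rep):
        for r in rc_starts:
            end = r + len(rc)
            if r >= s + len(rep) and end - s <= max_len:
                spans.append((s, end))
    return spans


if __name__ == "__main__":
    rep = "GGGCATTCGCCTAC"  # hypothetical REP-like sequence
    toy = "AAAA" + rep + "T" * 30 + reverse_complement(rep) + "CCCC"
    print(candidate_repins(toy, rep))
```

Anything that looks for real REPINs would need to tolerate mismatches between the two arms; exact matching is used here only to keep the idea visible.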
Currently Eric Hugoson is working on obtaining a Nobel prize: he is testing the hypothesis that the REPIN-RAYT system is pervasive and long-lived because of its ability to promote amplifications under selection.
Selfish genetic elements have also come to feature prominently in work on the kiwifruit pathogen Psa, where integrative and conjugative elements (ICEs) prove to be potent vehicles of evolutionary change. ICEs are elements somewhere between a phage and a conjugative plasmid. Elena Colombi’s work demonstrated this to spectacular effect (Colombi et al. (2017), Lindow (2017)). Elena has continued to delve into the origins of these ICEs, with particular interest in a 20 kb element found commonly within ICEs present in Psa. We had come to view this 20 kb element as a likely determinant of virulence, which would explain its widespread conservation, but recent RNA-seq work analysing a set of mutants within the 20 kb region indicates that the element is a compound transposon that, having lost its autonomous ability to jump, has hijacked ICEs and now manipulates their behaviour to its own advantage.