CRISPR: Background and Basics (Part 1)
CRISPR genome editing is one of the hottest topics in current science; we have already written about it several times on this blog, and it seems that hardly a week goes by without a new CRISPR-related story hitting the headlines. So this series of articles is intended as a reference for those who may be curious about the basics: what is meant by CRISPR, why are biologists so excited about it, and why does it have such a weird name which seemingly bears no relation to what it does?
Let’s get it out of the way: CRISPR stands for Clustered Regularly Interspaced Short Palindromic Repeats. You may wonder what this name has to do with genome editing. The answer is basically nothing, because it refers to a phenomenon which was discovered and named before its significance was understood. To explain why requires delving into some history, but to begin with, it is important to understand that, as with most tools we use in molecular biology, CRISPR is an adaptation of a feature that exists in nature; in this case, in bacteria.
A Microbial Mystery
In 1987 – the early days of DNA sequencing – scientists studying the genomes of E. coli bacteria first reported a curious set of short, repetitive sequences of unknown function; this observation was barely noted at the time. By the 1990s, other researchers had independently discovered similar genomic patterns in salt marsh microbes. These regions consist of several short repeats of the same sequence – around 20-40 nucleotides (DNA letters) in length – which are often palindromic (in DNA, this means reading the same forwards on one strand as backwards on the complementary strand), interspersed with variable “spacer” sequences of a similar length. The term “CRISPR” was agreed upon as a label for these still-mysterious genomic regions. Adjacent to them, researchers consistently found a number of sequences which had the hallmarks of actual genes (i.e. DNA segments which encode a specific, functional protein), but these too were of unknown biological purpose, and were therefore assigned the similarly nebulous name of “Cas” (CRISPR-associated) genes.
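For readers who think in code, the structure described above can be sketched in a few lines of Python. This is a toy illustration only: the sequences are made up and far shorter than real CRISPR repeats, the function names are hypothetical, and real repeats are often only partly palindromic.

```python
# Toy sketch: made-up sequences, not real CRISPR data.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the opposite strand of a DNA sequence, read in its own direction."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def is_dna_palindrome(seq):
    """True if the sequence reads the same on both strands."""
    return seq.upper() == reverse_complement(seq)

def extract_spacers(array, repeat):
    """Split a repeat/spacer array on the repeat and keep what lies between."""
    return [piece for piece in array.split(repeat) if piece]

repeat = "GAATTC"  # hypothetical repeat, much shorter than the real 20-40 letters
array = repeat + "ACGTAC" + repeat + "TTGGCA" + repeat

print(is_dna_palindrome(repeat))       # True
print(extract_spacers(array, repeat))  # ['ACGTAC', 'TTGGCA']
```

Splitting on an exact repeat only works because the toy array is perfectly regular; real arrays usually need more tolerant pattern matching.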
A Solution from Sequence-Matching
As genome sequencing boomed in the late 1990s and early 2000s, CRISPR/Cas variants were found in ever-increasing numbers of bacterial species; clearly, a feature shared by so many distantly-related groups must serve some purpose, and many hypotheses were put forward. However, it was not until 2003 that the critical observation was made: the “spacer” sequences were an exact match for DNA from something else, namely viruses, and specifically bacteriophages (viruses which infect bacteria). In other words, the bacteria were carrying fragments of their enemies' DNA in their own genomes. Like complex organisms such as ourselves, bacteria are engaged in a constant war with their own pathogenic enemies, and have evolved an adaptive immune system based on a “memory” of previous infections. CRISPR arrays in bacterial genomes are the genetic archive of this immune system, while Cas genes produce the biological weapons used by the cell to target and destroy invading foreign DNA using these sequences as a template.
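The sequence-matching step itself is simple enough to sketch in Python. The snippet below is an illustration under loud assumptions: the spacers and "phage genomes" are invented toy strings, the function name `match_spacers` is hypothetical, and only exact matches are considered, whereas real analyses search large sequence databases with dedicated alignment tools.

```python
# Toy sketch of spacer matching: all sequences below are invented.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the opposite strand of a DNA sequence, read in its own direction."""
    return seq.translate(COMPLEMENT)[::-1]

def match_spacers(spacers, genomes):
    """For each spacer, list the genomes containing an exact copy on either strand."""
    hits = {}
    for spacer in spacers:
        other_strand = reverse_complement(spacer)
        hits[spacer] = [
            name for name, genome in genomes.items()
            if spacer in genome or other_strand in genome
        ]
    return hits

spacers = ["ATGCCGTA", "GGGTTTCC", "CACACACA"]
phage_genomes = {
    "phage_A": "CCATGCCGTATT",
    "phage_B": "AAGGAAACCCAA",  # carries the second spacer on the opposite strand
    "phage_C": "TTTTTTTTTTTT",
}

for spacer, matches in match_spacers(spacers, phage_genomes).items():
    print(spacer, "->", matches or "no match")
```

A spacer with a match is, in effect, a record of a past encounter with that virus; a spacer with none may simply come from a virus nobody has sequenced yet.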
A Precision Tool for DNA
In somewhat simplified form, here is how the system works:
The Cas9 protein, one of the products of the Cas genes (there are others, but we will focus on the well-known Cas9 for now), is an enzyme which cuts DNA. It combines with an RNA molecule copied from those repeat/spacer DNA arrays, a critical part of which is a 20-nucleotide sequence called the guide RNA (gRNA), which is a precise match for a stretch of virus DNA. This allows the Cas9-RNA complex to seek out any piece of DNA within the cell matching this precise sequence, and cut it wherever a match is found – generally resulting in the destruction of the virus.
This system may have evolved as a self-defence mechanism for bacteria, but once these details were deciphered, the significance for humans became apparent. Chemically, viral DNA is no different to DNA from humans, plants or any other organism. And in this age of molecular biology, creating synthetic enzymes and custom RNA molecules is relatively straightforward; a gRNA can be any sequence we want. In other words, what bacterial evolution has invented is a universal search tool for DNA, which works in living cells; control-F for genomes.
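To make the "control-F" analogy concrete, here is a minimal Python sketch of the search-and-cut idea. It is deliberately simplified: the guide and genome are invented, the function names `find_targets` and `cut` are hypothetical, only exact matches count, and the cut position is passed in by hand. Real Cas9 also requires a short adjacent motif (the PAM) next to the target and cuts at a fixed position inside the matched site, both of which this sketch ignores.

```python
# Toy sketch of "control-F for genomes": invented sequences, exact matches only.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the opposite strand of a DNA sequence, read in its own direction."""
    return seq.translate(COMPLEMENT)[::-1]

def find_targets(genome, guide):
    """Return (start, strand) for every exact match of the guide on either strand."""
    hits = []
    for strand, pattern in (("+", guide), ("-", reverse_complement(guide))):
        start = genome.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = genome.find(pattern, start + 1)
    return sorted(hits)

def cut(genome, position):
    """Split the sequence in two at the given position, mimicking a DNA break."""
    return genome[:position], genome[position:]

guide = "ACGTACGTTAGCCTAGGATC"  # hypothetical 20-letter guide
genome = "TTTT" + guide + "CCCC" + reverse_complement(guide) + "AAAA"

targets = find_targets(genome, guide)
print(targets)  # [(4, '+'), (28, '-')]

# Cut inside the first match; the offset of 10 is an arbitrary choice for illustration.
left, right = cut(genome, targets[0][0] + 10)
print(len(left), len(right))  # 14 38
```

Changing the `guide` string is all it takes to point the search somewhere else, which is the property that makes the system so attractive as a tool.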
In the next blog post, I will talk about how this system has been co-opted by human beings, and its range of applications.
Note: Francisco Mojica first observed in 2003 that the spacer sequences match bacteriophage DNA, but his attempts to publish the finding were rejected by multiple scientific journals as lacking sufficient “novelty and importance”; it finally appeared in the Journal of Molecular Evolution in 2005.
By Robin Floyd