The Welfare and Safety of the Racehorse Summit meetings began in 2006 with the objective of improving the safety and soundness of the Thoroughbred racehorse. One of the central questions resulting from these meetings was whether the genetic structure of the population had become compromised, such that the Thoroughbred had become less sound and durable during the last few decades. Powerful genomic tools are now available to address this question.

As a first step, we used whole genome sequencing, which reads every base of DNA in an animal’s genome. We studied the genomes of two groups of U.S. Thoroughbred horses: Group 1 (82 horses born between 1965 and 1986) and Group 2 (103 horses born between 2000 and 2020). Previous genetic studies, which reported statistically significant increases in inbreeding over time, used methods that measured at most 0.03% of the DNA. “Statistically significant” is a scientific term meaning that a measured difference between groups is unlikely to be due to chance alone; it does not address the magnitude of the difference or its consequences. Using the whole genome sequence of each horse, we calculated its inbreeding coefficient, which is illustrated in the Figure below, taken from our recent publication.

[Figure: inbreeding estimates for individual horses (black dots) and yearly averages (red dots), ordered by year of birth.]

Each black dot represents the inbreeding estimate (FROH, the fraction of the genome lying in long homozygous stretches known as runs of homozygosity) for one of the 185 Thoroughbred horses, ordered by year of birth. The red dots represent the average for each year. The average inbreeding measured for Group 1 (0.266) was significantly different (P<0.001) from the average for Group 2 (0.283). However, the difference was small, such that one cannot tell whether a horse belonged to Group 1 or Group 2 from its measure of inbreeding. Indeed, an increase in inbreeding is expected in a closed stud-book population under selection. The average increase from Group 1 to Group 2 suggests that breeders are successfully removing deleterious variants while selecting for desirable variants.
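For readers who want to see what the inbreeding estimate measures, the following is a minimal sketch of the usual definition of FROH (often written F_ROH): the total length of a horse's runs of homozygosity divided by the length of genome that was examined. The function name, the segment coordinates and the genome length are invented for illustration; they are not data from the study, and the publication's actual pipeline may differ.

```python
def f_roh(roh_segments, genome_length_bp):
    """Return the fraction of the genome covered by runs of homozygosity.

    roh_segments: list of (start_bp, end_bp) pairs, assumed non-overlapping.
    genome_length_bp: length of genome scanned, in base pairs.
    """
    if genome_length_bp <= 0:
        raise ValueError("genome_length_bp must be positive")
    total_roh = sum(end - start for start, end in roh_segments)
    return total_roh / genome_length_bp


# Hypothetical toy genome of 1,000 bases with three homozygous stretches.
toy_segments = [(0, 150), (300, 400), (700, 716)]
toy_value = f_roh(toy_segments, 1000)
print(toy_value)  # 0.266
```

On this scale, a value of 0.266 means that roughly a quarter of the genome lies in long homozygous stretches, which is how the Group 1 and Group 2 averages can be read.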

The two groups were also evaluated for maternal lineage using mitochondrial DNA, as well as for the prevalence of known genetic variants for disease, performance and color. Additionally, regions of inbreeding were assessed to determine whether they arose recently (in the past five to 10 generations) or were more likely to date back closer to the foundation of the breed.
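This summary does not say how the inbred regions were dated. One widely used rule of thumb, shown here only as an assumption and not as the authors' method, is that a homozygous segment inherited from a common ancestor g generations back has an expected length of about 100/(2g) centimorgans, because recombination shortens segments in every generation. Long segments therefore point to recent inbreeding and short ones to older inbreeding. The function names below are hypothetical.

```python
def expected_roh_length_cm(generations):
    """Expected segment length (cM) for a common ancestor g generations ago.

    Uses the approximation 100 / (2 * g); real segment lengths vary widely
    around this expectation.
    """
    if generations <= 0:
        raise ValueError("generations must be positive")
    return 100.0 / (2.0 * generations)


def approx_generations(length_cm):
    """Invert the same approximation: rough age implied by a segment length."""
    if length_cm <= 0:
        raise ValueError("length_cm must be positive")
    return 100.0 / (2.0 * length_cm)


for g in (5, 10, 50):
    print(g, "generations ->", expected_roh_length_cm(g), "cM")
```

Under this approximation, segments dating to the past five to 10 generations would average roughly 5 to 10 cM, while segments from much earlier in the breed's history would be considerably shorter.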

Only two of 19 known disease variants were found among the 185 Thoroughbreds, and those were present at low frequencies (Fragile Foal Syndrome: expected to affect one in 62,500 Thoroughbreds; Hypoparathyroidism: expected to affect one in 27,778 Thoroughbreds), demonstrating that the most common Mendelian diseases of horses (diseases caused by variants in a single gene) are uncommon in the breed. We also observed a statistically significant and relatively large increase in the frequency of the MSTN (myostatin, the ‘speed gene’) variant when comparing the two groups (Group 1: 0.427; Group 2: 0.539). This MSTN variant was previously reported to be more prevalent among champions of short races and less common among champions of longer races. The increase may therefore reflect selection for performance at short distances during the last 50 years. In the future, we propose the use of whole genome sequencing to provide specific data that breeders can use in assessing and managing the genetic health of the population.
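The "one in N" figures for the two disease variants can be related to variant frequencies with a short calculation. The sketch below assumes that both conditions are recessive (a foal is affected only if it inherits the variant from both parents) and that matings are random with respect to the variant (Hardy-Weinberg proportions), so the affected fraction is the square of the variant frequency q. The frequencies it prints are back-calculated from the quoted figures under those assumptions; they are not values reported in this summary, and the function names are hypothetical.

```python
import math


def implied_variant_frequency(one_in_n):
    """Variant frequency q implied by 'one in N affected', assuming q**2 = 1/N."""
    if one_in_n <= 0:
        raise ValueError("one_in_n must be positive")
    return math.sqrt(1.0 / one_in_n)


def expected_affected_one_in(q):
    """Inverse calculation: N such that one in N is affected, given q."""
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    return 1.0 / (q * q)


def carrier_fraction(q):
    """Expected fraction of unaffected carriers (one copy), 2q(1 - q)."""
    return 2.0 * q * (1.0 - q)


for name, n in (("Fragile Foal Syndrome", 62500), ("Hypoparathyroidism", 27778)):
    q = implied_variant_frequency(n)
    print(name, "q ~", round(q, 4), "carriers ~", round(carrier_fraction(q), 4))
```

Under these assumptions the quoted figures correspond to variant frequencies of about 0.004 and 0.006, meaning that carriers are on the order of one horse in a hundred even though affected foals are expected to be very rare.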

The cost of whole genome sequencing is rapidly dropping, meaning that it can be a valuable tool for surveillance of the genetic integrity of the Thoroughbred. Furthermore, when hereditary factors are thought to negatively affect complex traits such as soundness, fertility or fetal loss, whole genome sequencing may allow for the identification of genomic regions that contribute to the problem. In summary, these tools will prove to be valuable additions to the experience, understanding and intuitive grasp of genetics by Thoroughbred breeders.

The manuscript, with more details, is freely available HERE.