RAPID IDENTIFICATION OF HIGH-QUALITY BEEF PRODUCTS THROUGH ON-SITE 
GENOMIC TESTING IN UNDER-RESEARCHED ASIAN CATTLE BREEDS 
ORIGINATING FROM THE UNITED STATES 

By 

Hanna Ostrovski 

A DISSERTATION 

Submitted to 
Michigan State University 
in partial fulfillment of the requirements   
for the degree of 

Animal Science – Doctor of Philosophy 

2023
 
ABSTRACT 

The Wagyu breed of cattle, known for its marbled and high-quality beef, commands a 

significant premium in global markets, underlining the importance of investigating the Wagyu 

population in the United States. This small population of Red Wagyu (Akaushi) and Black

Wagyu was imported to the United States from Japan in the early 1990s, and no further live

animals, semen, or embryos have become available since. The unusual introduction of this

breed to the US and the unique selection pressures on a relatively under-researched breed

demand further investigation through genomic technologies.

In the U.S. market, Wagyu beef products are becoming increasingly commonplace. 

However, there is a notable absence of standardization for these products, which carry a high

price tag due to the breed's reputation for superior quality and taste. Verifying products through

genotype is the most straightforward approach, yet sequencing methods have been largely

inaccessible, limited to specialists in molecular genetics. New mobile sequencing tools from

Oxford Nanopore Technologies (ONT) aim to make sequencing accessible to anyone. In an initial

trial, an inexperienced user conducted seven flow cell sequencing runs on the handheld MinION

sequencer to sequence a single animal's genome. The runs achieved good breadth but low depth

of coverage across the genome, with each run generating more data. While ONT promises over 50

GB per run, the highest-yielding run achieved ~6 GB, signaling a significant gap between expected

and actual output. Despite this gap, the technology's novelty and the user's inexperience did not

prevent successful sequencing. This emphasizes the potential of ONT's products for mobile

sequencing, particularly for newcomers, extending beyond traditional lab settings. 

Enabling the mobility of this sequencer for on-site product verification necessitated 

developing a mobile genomic sequencing kit for field use. Establishing an out-of-lab protocol 

was essential to swiftly identify breeds, with a specific focus on identifying Wagyu animals in 

this study. Sam Red (Akaushi) Wagyu and Black Wagyu animals were sequenced using the 

mobile kit. Breed verification of all animals was initially done with principal component analysis

(PCA), but due to low output and sporadic coverage of the genome, PCA proved to be a poor

way to identify breed of origin. A second method, directly matching haplotypes to a reference

population, was successful, achieving a correlation of 0.55 and a concordance rate of 0.94

between sample haplotypes and the correct reference breed.
 
These identification methods successfully verified Wagyu samples and hold potential for 

broader application in field product verification. However, the genomic landscape of US Wagyu 

largely remains unexplored. While the traditional Wagyu breeds from Japan are well-

documented, the genetic composition of American Wagyu is not yet fully grasped. Initial 

explorations into this breed revealed inbreeding and extensive linkage disequilibrium within the 

genome. A particularly intriguing finding emerged in Akaushi animals: they exhibited a close 

genetic relationship to the Korean Hanwoo breed, as evidenced by principal component analysis

(PCA). This relationship is not entirely surprising, given the historical understanding that Japanese

animals have roots tracing back to inland Korea. This connection sheds light on the genetic 

affinity between Red Wagyu and the Hanwoo breed, offering insights into their shared ancestry. 

Further connections among Black Wagyu, Red Wagyu, the RedBlack cross between

the two Wagyu groups, and Korean Hanwoo were tested by estimating the accuracy of

predicting phenotype across breeds. The accuracy between Red and Black Wagyu was low,

approximately 0.10, and increased when using the RedBlack or the Korean Hanwoo, ranging

from 0.23 to 0.27. To address unbalanced breed group sizes (~150 Black Wagyu versus ~5000 

Red Wagyu), the total population was divided into 10 balanced groups based on animal 

relatedness via the first principal component. Testing prediction accuracies within these splits 

revealed higher accuracies, especially between closely related splits, reaching up to 0.45. 

Notably, the split involving Red Wagyu (1st PC split) and Korean Hanwoo (10th PC split) 

demonstrated this heightened accuracy, reinforcing the close genetic relationship between these 

breeds. Lastly, a comprehensive genome-wide association study across all breeds identified new 

genomic regions on chromosomes 6, 10, and 14 associated with growth in Asian cattle.  

The uncovered genomic architecture of US Wagyu can aid in understanding how the unusual

introduction of Wagyu to the US and the small number of animals available have shaped the

Wagyu population today. This understanding will pave the way for enhanced breeding programs,

enabling producers to further refine and optimize the desirable traits of Wagyu beef, ultimately 

improving its quality and consistency. The exploration of US Wagyu's genomic landscape 

further contributes to the authenticity and traceability of Wagyu products in the market. By 

establishing a comprehensive genetic profile, it is easier to verify and certify the quality of 

Wagyu beef, thereby ensuring transparency and trust for consumers.

 
 
 
 
 
To my family, the one I was born into: my Mom and Dad, Megan and Claire, and to the one I 
was blessed with: my husband Alan, my son Isaak, and my emotional support, May. Your love, 
reassurance and encouragement are unwavering, and I will forever be grateful. 

ACKNOWLEDGEMENTS 

Firstly, I would like to acknowledge the many mentors that I have had in my life, 

especially those that have had a direct impact on my path in university, Dr. Phil Miller, Dr. Ron 

Lewis, Dr. Christian Maltecca, Dr. Francesco Tiezzi, and Dr. Cedric Gondro. I have been 

permanently shaped by your mentorship, and think of myself as a better researcher, 

communicator, and overall person because of you all. I would also like to thank my PhD 

committee members, Dr. Wen Huang, Dr. Juan Steibel, Dr. Robert Tempelman and Dr. Ana 

Vazquez, who have guided me in my journey and have given me invaluable insight and 

encouragement in my program. I thank you for your time and counsel in my growth at Michigan 

State and look forward to future collaboration and partnership.  

I would like to thank and acknowledge the many fellow graduate students and post-docs 

that I have learned from at the University of Nebraska – Lincoln, North Carolina State University 

and Michigan State University. I am very lucky to call many brilliant professionals my friends.  

I would like to thank the American Wagyu Association, its staff and members, who fuel

my passion for Wagyu in America. This thesis is just the first step into exploring, expanding and 

researching Wagyu. I would also like to thank the American Akaushi Association, who 

welcomed me in Texas and supported my research. I am looking forward to the endless 

possibilities in this breed and the positive impact new discoveries will have on cattle producers.  

Lastly, I would like to acknowledge my family and friends, who many times saw what I

could not, and helped me to be the best version of myself. I would specifically like to 

acknowledge my Grandad, as my curiosity in life and learning is the result of listening to and 

watching you. I could not have accomplished anything in this journey without my husband Alan. 

You are my source of the most unwavering support and love; I could not have accomplished

much without you. 

TABLE OF CONTENTS 

CHAPTER 1: Introduction

CHAPTER 2: Literature Review
LITERATURE CITED

CHAPTER 3: Investigating New Technologies for On-Site Real-Time sequencing for any
Animal Scientist
LITERATURE CITED

CHAPTER 4: Mobile, Rapid Beef Product Identification through 3rd generation Sequencing
Methods
LITERATURE CITED

CHAPTER 5: Genetic Characterization of the Akaushi Breed in the United States
LITERATURE CITED

CHAPTER 6: Estimation of Within and Across Breed Prediction Accuracies in the Wagyu
Population in the United States and the Korean Hanwoo
LITERATURE CITED

CHAPTER 7: Conclusions

APPENDIX A

APPENDIX B

APPENDIX C

CHAPTER 1: Introduction 

The recent boom of high-quality beef in America is largely due to the expanded use of the

Japanese breeds Black Wagyu (Figure 1.1) and Akaushi (also known as Red or Brown Wagyu,

Figure 1.2). 

Figure 1.1 Black Wagyu bull from the American Wagyu Association 

Figure 1.2 Red/Brown Wagyu bull from the American Wagyu Association  

The first introduction of these breeds in the United States was in the 1970s, when four bulls

were imported into Texas (American Wagyu Association). From these first males, some

crossbred Wagyu started to emerge throughout the country until a larger group of Wagyu cattle

was imported in the early 1990s, which provided females for producing fullblood animals. This herd was

made up of the two “sub-types” of Wagyu, the Japanese Black and the Japanese Red/Brown or

Akaushi (Gotoh et al., 2018). After these animals were imported, Japan closed its borders to

further exports of live animals, semen or embryos of Wagyu cattle. All Wagyu cattle that exist 

today in the United States have come from this initial group of animals and many can be traced 

back to Japan through pedigree and genotype. 

The initial scarcity of available Wagyu animals outside of Japan for breeding purposes 

raises concerns about the breed's long-term sustainability. A limited genetic pool can have 

adverse effects on the current Wagyu population, potentially resulting in reduced variation and 

increased inbreeding, which poses a threat of population collapse. Over time, selective breeding

and possible genetic drift have influenced this population, making it crucial to examine its

genomic architecture. Studying the US Wagyu population can reveal how forces of

selection and genetic drift have shaped the breed's characteristics and diversity. By making 

informed decisions that promote genetic diversity, breeders can safeguard the health and 

resilience of the Wagyu breeding stock available beyond Japan.  

Further research on this breed, especially on the animals in the United States, is needed, as

no published research has used the whole American Wagyu population. This contrasts with

other beef breeds, such as Angus or Hereford, which have thousands of research articles

exploring the many facets of the breed. A quick search on Google Scholar shows ~914,000 results

when searching “angus”, while a search for “wagyu” results in ~9,290 hits. This large

discrepancy in research articles highlights the need for exploration into Wagyu and the

genomic pressures the US population has been under.

Uncovering Wagyu population architecture is crucial as demand for Wagyu has

exploded, and consumers have started seeking out a higher-quality beef product. The number

of animals and products has increased without proper monitoring of this rapid expansion. The

Wagyu breed is also gaining ground among F1 commercial producers, as a cross with a

Fullblood Wagyu to produce a ~50% Wagyu animal can increase the grade of the animal.

Quality grading in the US tops out at Prime, which is considered 12% intramuscular fat (IMF)

(Lonergan et al., 2019). International grading scales are much finer, with the top grades in

Japan scoring up to 60% IMF (Gotoh et al., 2018). Japanese grading standards compared to

IMF% can be seen in Fig. 1.3 (Horii, 2009).

Figure 1.3 Intramuscular fat percent per Japanese marbling standards over time (Horii, 2009) 

This large discrepancy in marbling grades has made crossbred Wagyu animals

sought after in the US, as many of these cattle grade Prime here but may not grade

as high in other countries. As a result, 50% F1 Wagyu crosses have

become a major force in the US beef market. These products have flooded the market

because they make “Wagyu influenced” beef available for any budget.

The name “Wagyu” has now become commonplace and is synonymous with a beef product

that has high marbling and a buttery texture. This has driven up the value of this breed in the US,

which is rapidly growing, as the demand for higher quality beef has expanded (Forristall et al., 

2002; Gonzalez & Phelps, 2018; Kempster, 1989). A higher quality beef product does come at a 

higher price, so correct breed identification of products is necessary to give consumers peace of

mind when purchasing more expensive products that claim to be of higher quality. The 

identification of these animals in the US has now reached a tipping point, as many products now 

boast a “Wagyu”-type name in their advertising. Verification testing can be done most rapidly

through genotype, and new sequencing techniques now offer out-of-lab protocols.

Identification of Wagyu products via these protocols can be an avenue for rapid verification of 

these products. 

The beef breeding industry outlook for Wagyu products is very positive, with many

producers of Wagyu animals for breeding purposes commanding high prices. The demand

for this breed is apparent, and the costs associated with purchasing these animals are very high.

A private sale may sell Wagyu cows at an average of US$10,000 (see Appendix

A), well above the US average of $2,000 for other breeds (see Appendix

B). Estimating accurate breeding values for these expensive animals is crucial to maintain and

increase the value of these animals. More accurate breeding values can aid in selection decisions 

in Wagyu, especially targeting the highly coveted high-marbling animals.  

Wagyu cattle have swiftly garnered acclaim, coveted for their high-quality beef, noted for 

their unparalleled tenderness and rich marbling. However, with their rise in popularity, ensuring 

the authenticity of Wagyu products at the consumer level has become pivotal. Rapid and 

accurate identification of these products is crucial to preserve the integrity and reputation of 

authentic Wagyu. Deciphering the US Wagyu population structure will shed light on its origins, 

divergences, and unique attributes. These insights not only bolster the breed's authenticity and 

reliability but also pave the way for improved breeding practices and the preservation of Wagyu's 

distinguished qualities. Ultimately, this comprehensive understanding serves as a cornerstone in 

safeguarding the legacy and future of Wagyu in the United States. 

CHAPTER 2: Literature Review 

Feasibility of Wagyu Product Identification 

The first hurdle to tackle in identification of animal products is how to obtain genomic 

information. There have been many attempts at tracing animal products without a genomic trail 

(Aung & Chang, 2014; Bosona & Gebresenbet, 2013; Schroeder & Tonsor, 2012; Souza-

Monteiro & Caswell, 2004), which can only lead to uncertainty due to possible human error in 

processing or packaging of these products. Since the 1990s, the US has operated Process Verified

Programs (PVP) that aim to set a standard for safe food products and traceability.

Tracing cattle through the production system in the US can be difficult, as an animal can 

pass through many different hands before it is processed. A reliable traceability system from

cow/calf producer to meat processor has yet to be developed, but implementing such a system

could help reduce the cost of disease outbreak problems (Blakebrough-Hall et al., 2020) and

identify fraudulent labeling (Jo et al., 2021; Spink & Moyer, 2011). Both issues can weigh

heavily on the beef market in the US if not identified. For example, bovine respiratory disease

from 2011-2015 cost the cow/calf industry $165 million annually (M. Wang et al., 2018).

Quick identification of outbreaks before they reach a full-blown endemic level can save the 

industry millions of dollars. Food fraud also costs the US industry millions of dollars per year 

(FDA), with the top fraudulent activities being dilution, unapproved additives, mislabeling and 

counterfeiting (Johnson, 2014).  

Figure 2.1 Food fraud as reported by the National Center for Food Protection and Defense 
(Johnson, 2014) 

The impact that product identification would have on the Wagyu market in the US would 

be large, as labeling of “Wagyu” product in the US is largely done on “Wagyu Influenced” 

cattle. Live animal specifications set by the USDA state that the animal must have at least one

registered parent of 15/16 Wagyu breed (see Appendix C). This does not prevent other marketing

schemes from using these high-dollar definitions to label beef products. This is largely because the

name “Wagyu” has no protections in the United States, whereas its use is closely regulated in

Japan, where the word “Wagyu” is protected.

 The identification of these products beyond the label can be most accurate if the products 

are identified through genomic sequence, as the sequence at the start of an animal’s life does not 

change. The simplest identification that would impact US Wagyu is determining

how much “Wagyu” is in products that claim to be Wagyu. This would be a

simple breed identification that can be done through sequencing meat samples. The challenge

lies in the cost of sequencing these samples and the time it takes to obtain the sequence. If the

sample is sent to a third-party lab, the time it takes to get a sequence back could be weeks. By then,

the sample of meat would have already been consumed, and the identification not known. The 

sequencing done for a sample must be quick, accurate and the protocol must be user friendly to 

obtain correct identification during the time of consumption. 

Genomic Sequencing and Breed Identification 

  Fred Sanger was one of the first to crack the code of DNA sequencing (Sanger et al., 

1965), with his method becoming one of the most common ways to sequence DNA in the early 

years of first-generation sequencing. Allan Maxam and Walter Gilbert also pioneered their own

method, the Maxam-Gilbert method (Heather & Chain, 2016), which was commonly used but was

not as easy to perform as Sanger sequencing. These methods were chemical and mechanical in

nature, with a slow sequencing pace. The second generation of sequencing methods utilized

parallelization of DNA sequencing, with many strands of DNA being sequenced in one run. 454

Life Sciences (Gupta & Gupta, 2014) produced some of the first of these parallel sequencing

machines, which led to quicker turnaround times from DNA to fully sequenced output. Quickly

after, Solexa, later acquired by Illumina, produced their own high throughput sequencing 

machines that improved accuracy and read depth in sequencing (History of Illumina Sequencing 

& Solexa Technology, n.d.). To date, Illumina sequencing is the front runner in the DNA 

sequencing world, with most industry and research efforts relying on these technologies.

The traditional methods of sequencing are great for projects that have extended deadlines, 

but rapid identification needs a real-time, out-of-lab sequencing protocol. The most interesting 

technology in this space has surfaced from Oxford Nanopore Technologies (ONT). A small, 

easy-to-use machine was introduced in 2016 called the MinION (Lu et al., 2016). It is touted as a 

mobile sequencing device that has the capability to take sequencing out of the lab. Figure 2.2 

shows the scale of this sequencer, which can fit into the palm of a human hand. 

Figure 2.2 Oxford Nanopore Technology’s MinION in a human hand. 

Oxford Nanopore has been active in the sequencing world for many years. Its

products include the GridION and PromethION (Deamer et al., 2016; PromethION | Oxford

Nanopore Technologies, n.d.), which are large sequencers used in a lab setting that

can produce large amounts of sequence.

The sequencing method that ONT is known for is long-read sequencing: taking raw

DNA without amplification and reading it through the company's namesake, the “nanopore” (Clamer et al.,

2014; M. Jain et al., 2016). The MinION outputs sequence through the flow cell that contains

these nanopores. This flow cell is considered a consumable, as it can only be used 2-3 times

when washed and stored correctly (Michael et al., 2018). The flow cell is inserted into the MinION,

which applies a voltage across the flow cell that houses the nanopores. When a nucleotide passes

through a nanopore, the electrical current flowing through that pore is disrupted.

This disruption is recorded and is referred to as a “squiggle”. See Figure 2.3 for a detailed flow of

these steps (Bhattaru et al., 2019), where A is the library preparation that includes attaching

an adapter, B is the nanopore reading the string of DNA, C is the MinION itself and D shows the 

output squiggles. 

Figure 2.3 Flow of sequencing with the MinION (Bhattaru et al., 2019)

These squiggles are then basecalled through the ONT program “guppy” (Wick et al.,

2019), which uses a machine learning algorithm to interpret these electrical signals. The output is

then in a form that can be aligned, sorted, and indexed, and variants can be identified, using free

third-party bioinformatic tools. The finalized sequence is then usable for population analysis, breed

identification, and more in-depth genomic analyses.
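
The post-basecalling steps described above can be sketched as a short command list. This is a minimal sketch, not the pipeline used in this work: the text does not name the third-party tools, so minimap2, samtools and bcftools are assumed here as common free choices, and the file names are hypothetical. The function only builds the command strings; it does not run them.

```python
def build_pipeline(reference, reads, prefix):
    """Return shell commands for alignment, sorting, indexing and variant calling.

    Tool choice (minimap2, samtools, bcftools) is an assumption for illustration;
    the text only refers to free third-party bioinformatic tools.
    """
    bam = f"{prefix}.sorted.bam"
    vcf = f"{prefix}.vcf.gz"
    return [
        # Align basecalled nanopore reads to the reference and sort the alignments.
        f"minimap2 -ax map-ont {reference} {reads} | samtools sort -o {bam} -",
        # Index the sorted BAM so that genomic regions can be accessed quickly.
        f"samtools index {bam}",
        # Pile up reads against the reference and call variants.
        f"bcftools mpileup -f {reference} {bam} | bcftools call -mv -Oz -o {vcf}",
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for command in build_pipeline("reference.fa", "sample_reads.fastq", "sample"):
        print(command)
```

Sorting comes before indexing in practice, because an alignment file can only be indexed once it is coordinate-sorted.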

These finalized sequences can be filtered by depth, quality, and coverage at each

position sequenced. This filtering can lead to differing accuracy and differing numbers of

variants retained. The cutoff depth for high accuracy is usually 40x for whole

genome sequencing. Previous studies utilizing MinION output have used a quality cutoff

of 7 (Delahaye & Nicolas, 2021), which lowers the error rate in the reads. In most cases, the

more stringent the filtering, the less sequence will be available for analysis, but it

can be assumed that this sequence is the most accurate.
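
The trade-off between filtering stringency and remaining sequence can be illustrated with a small sketch. It assumes reads are held in memory as (sequence, per-base Phred scores) pairs, that read quality is averaged on the error-probability scale, and that a bovine genome is roughly 2.7 billion bases; the function names and the toy reads are hypothetical and are not taken from the analyses in this work.

```python
import math


def mean_read_quality(phred_scores):
    """Mean read quality, averaged on the error-probability scale."""
    if not phred_scores:
        return 0.0
    mean_error = sum(10 ** (-q / 10) for q in phred_scores) / len(phred_scores)
    return -10 * math.log10(mean_error)


def filter_reads(reads, min_quality=7.0):
    """Keep (sequence, phred_scores) pairs with mean quality >= min_quality."""
    return [read for read in reads if mean_read_quality(read[1]) >= min_quality]


def mean_depth(total_bases, genome_size):
    """Average depth of coverage: sequenced bases divided by genome length."""
    return total_bases / genome_size


if __name__ == "__main__":
    # Toy reads: one passes the quality cutoff of 7, one does not.
    reads = [("ACGT", [10, 10, 10, 10]), ("ACGT", [3, 3, 3, 3])]
    kept = filter_reads(reads, min_quality=7.0)
    print(f"{len(kept)} of {len(reads)} reads kept")

    # Assumed genome size of 2.7e9 bases; 6e9 sequenced bases is a made-up yield.
    print(f"mean depth: {mean_depth(6e9, 2.7e9):.1f}x")
```

A stricter cutoff shrinks the list returned by `filter_reads`, which lowers the total bases passed to `mean_depth`; this is the loss of usable sequence described above.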

When trying to identify breed composition with these sequencing methods, an efficient

bioinformatic pipeline is needed, along with an established reference population of the target

breeds and other common cattle breeds. This is needed to understand the genomic sample

that was sequenced, as obtaining the sequence in an out-of-lab setting in real time is only the first 

hurdle. Without a genomic reference, there is no way to compare the sequence obtained, and thus 

no way to understand the composition of the sample.  
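
The role of the reference population can be sketched with a simple concordance check. This is an illustrative assumption, not the method used in later chapters: each reference breed is reduced to one representative allele per site, the sample is compared only at sites it shares with the reference, and the breed names, site positions and alleles are invented toy values.

```python
def concordance(sample, reference):
    """Fraction of shared sites where the sample allele matches the reference."""
    shared = [site for site in sample if site in reference]
    if not shared:
        return 0.0
    matches = sum(1 for site in shared if sample[site] == reference[site])
    return matches / len(shared)


def best_matching_breed(sample, reference_panel):
    """Score the sample against every reference breed and return the best one."""
    scores = {breed: concordance(sample, ref) for breed, ref in reference_panel.items()}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Toy reference panel: site position -> representative allele per breed.
    panel = {
        "Wagyu": {1: "A", 2: "G", 3: "T"},
        "Angus": {1: "C", 2: "G", 3: "C"},
    }
    # Sparse sample: only some sites were sequenced, and site 4 is not in the panel.
    sample = {1: "A", 3: "T", 4: "G"}
    breed, scores = best_matching_breed(sample, panel)
    print(breed, scores)
```

Without the `panel` dictionary there is nothing to score the sample against, which is the point made above: the sequence alone cannot reveal breed composition.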

Traditional imputation with short-read sequence data has been well researched (Yun et 

al., 2009) and is used to infer areas of the genome that were not genotyped. Imputation software 

is quite accurate when provided with short-read sequences, to the point where it is expected that

low-pass sequencing coupled with imputation will soon replace traditional array genotyping as

the standard genotyping method. Imputation of long-read sequencing data is, however, more

complicated, as the reads consist of longer uninterrupted stretches of the genome. In effect, this

means that for the same depth of coverage as short-read data, there will be less

breadth of coverage (i.e., reads will be more concentrated in some areas and consequently the

distance between sequenced regions will be longer). This hinders imputation, as the methods rely

on the linkage disequilibrium (LD) structure of the population: the weaker the LD, the lower the

imputation accuracy will be. This is detrimental to understanding breed composition, as

accurate genotypes are necessary to create genomic relationship matrices, which are the

backbone for examining the genetic relationships between animals. QUILT, a recent imputation

software (Davies et al., 2021), has been developed for long-read sequencing platforms

such as ONT and could help mitigate some of these problems, although a higher depth of

coverage in comparison to low-pass sequencing with short reads will probably still be needed.
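
The distinction between depth and breadth of coverage can be made concrete with a toy calculation. This is a minimal sketch under simplifying assumptions: a single 100-base sequence, reads given as half-open (start, end) intervals, and invented read placements chosen only to show that two read sets with the same mean depth can differ in breadth.

```python
def coverage_stats(reads, genome_length):
    """Return (mean depth, breadth) for reads given as half-open (start, end) intervals."""
    depth = [0] * genome_length
    for start, end in reads:
        for pos in range(max(start, 0), min(end, genome_length)):
            depth[pos] += 1
    covered = sum(1 for d in depth if d > 0)
    return sum(depth) / genome_length, covered / genome_length


if __name__ == "__main__":
    genome_length = 100

    # Ten short reads of 10 bases each, spread evenly across the sequence.
    short_reads = [(i, i + 10) for i in range(0, 100, 10)]

    # Two long reads of 50 bases each that happen to fall on the same region.
    long_reads = [(0, 50), (0, 50)]

    for label, reads in (("short", short_reads), ("long", long_reads)):
        mean_depth, breadth = coverage_stats(reads, genome_length)
        print(f"{label}: mean depth {mean_depth:.1f}x, breadth {breadth:.0%}")
```

Both read sets contain 100 sequenced bases, so the mean depth is identical, but the overlapping long reads leave half of the toy sequence unobserved. Gaps of this kind are what imputation then has to fill.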

Future prospects of ONT products are positive, as the teething issues of the new

technology are being addressed. Specifically, the introduction of the VolTRAX (Oxford

Nanopore Technologies) will be crucial in eliminating human error in creation of the sequencing

library. This product is designed to remove human errors in mixing and pipetting reagents by

automating library preparation. Other sequencing products aim for even smaller

applications, such as a sequencer that can plug into a phone (Oxford Nanopore Technologies, 

n.d.-a). The practical application for out-of-lab sequencing that anyone can use will be through 

development of products that have easy protocols and less need for traditional lab requirements. 

Application of new advanced technologies can be difficult to integrate, as scientists and 

researchers tend to be skeptical about the application of new tech, the technologies are usually 

high in price (Schaller, 1997), and changes in the status quo may be harder to adopt across the 

discipline. The technology should not be dismissed because of cost or difficulty, as real-time sequencing

can have huge effects if we get to the point where product certification and identification of 

breed composition in animals can be achieved in real-time during active consumption of the 

product in question.  

Wagyu Origins and Population Structure  

Modern cattle breeds, Bos taurus and Bos indicus, stemmed from the Middle East, with

traditional European breeds arising from the Bos taurus lineage. These animals have been

used for meat consumption for thousands of years in Europe and have been specifically reared for

beef production during the last few centuries. Asian breeds only more recently started to be

raised for beef production. The Wagyu breed was originally bred for work, as they were farming

animals that helped with heavy plowing in Japan (Motoyama et al., 2016) at a time when it was

illegal to consume meat in Japan. For many years, the Wagyu breed was not known as a food

source, until meat became commonplace in Japan and the high quality of Wagyu beef

was recognized.

The country of origin of the Wagyu breed is somewhat debated, but

many studies (Chen et al., 2018; Sasazaki et al., 2006) have shown that the Wagyu population in 

Japan originated from modern-day Korea. These animals were brought to Japan and crossed with 

traditional European breeds (Namikawa, 1980) to create the population today which consists of 

many sub-types of Wagyu: the Japanese Black, the Japanese Red/Brown or Akaushi, the 

Japanese Shorthorn and the Japanese Polled. The Wagyu available for production of meat 

products are the Japanese Black and the Japanese Red/Brown (MAFF, 2020). See the crossbred 

animals in Table 2.1 from Namikawa’s paper. 

Table 2.1. Composite breeds that make up the modern Wagyu breed per Japanese prefecture 
(Namikawa, 1980) 

It wasn’t until the Japanese government allowed the consumption of meat that the Wagyu 

breed burst onto the food scene. After many years without artificial selection for meat production

traits, these animals had developed into something quite different from their European 

counterparts, with increased marbling and fineness of fat strands throughout the meat. This type 

of fat is also very different from that of traditional meat breeds and boasts a profile that is

characterized by “healthy fats”, or monounsaturated fats (Kohama et al., 2021). The meat also

comes with a boost of omega-3 and omega-6 fatty acids, which are recognized as healthy fats as

well (Shahidi & Ambigaipalan, 2018).

The relationships between these breeds have been previously researched (Honda et al.,

2004, 2006; Nomura et al., 2001), with many studies reporting large variation between the

European breeds and Asian breeds, even with the previous crossbreeding of native Japanese

strains. This is to be expected, as differing human and environmental selection

pressures have driven different phenotypic outcomes for these breeds. More recent studies show

Wagyu, Akaushi and Hanwoo (Lee et al., 2014) to be closely related to one another relative to

European breeds, such as Angus, or dairy breeds, such as Jersey or Holstein, which can be

seen in Fig. 2.4 from Lee et al. (2014).

Figure 2.4 Principal Component Analysis of Wagyu, Hanwoo, Angus and Holstein; where the 
Asian breeds group most closely together (Lee et al., 2014)

The Wagyu breed outside of Japan is known to have originated from only a few animals 

that were exported to the United States in 1970 and in the 1990s. After the initial export, the 

Japanese government banned further export of Wagyu live animals, semen or embryos declaring 

the breed a “national treasure”. All Wagyu animals that exist outside of Japan are from the small 

number of animals that were allowed out of country. This is of some concern, as the genetic pool 

of animals to grow a large herd outside of Japan is limited. This has increased the level of 

inbreeding in the American population (Heffernan et al., 2021), which can lead to recessive 

diseases without careful consideration of breeding decisions. A strong selection in the Wagyu 

population has contributed to this genomic architecture in the US but can also be seen in other 

Asian cattle breeds (Z. Wang et al., 2019). Previous population studies of Wagyu in Japan have

also shown a large decrease in variability and effective population size (Mukai et al., 1989;

Uemoto et al., 2021).

Population analyses help define the state of certain populations and are important within 

the animal industry, as artificial selection (Flori et al., 2009; Seo et al., 2022) and genetic drift 

(Brüniche-Olsen et al., 2012; Kidd & Cavalli-Sforza, 1974) can affect populations in dramatic 

ways. In cattle, the changes brought about by selection are seen in growth and performance,

specifically the quality of the carcass, fertility traits for consistent calving, and efficient growth

per animal. These changes affect the phenotype of the animal but can also change the genomic

architecture of populations, as animals that are favorable in production traits are selected more

often as parents than those that are not.

Population studies in Wagyu have previously tried to understand the changes in the

population in the US. In particular, uncovering the structure of the relationships within the breed,

as well as the relationship to other breeds, is crucial for maintaining the breed with enough

genomic variation and keeping inbreeding levels low. Previous studies used the numerator

relationship matrix built from the pedigree, or A-matrix, which holds the expected relationships

between animals (i.e., the relationship between a parent and offspring is 0.50). These relationship

matrices are built from the founders in a population, with all relationships estimated from those

original animals onward. This relationship matrix only identifies those relationships that are

Identity by Descent (IBD), or relationships that are assumed through recorded parent-offspring

links. Genotypic data additionally capture Identity by State (IBS), or areas of

the genome that may be the same between animals that are not related in the pedigree due to the

genomic architecture of breeds. These relationships can be identified from genotypic data through the

genomic relationship matrix, or G-matrix (VanRaden, 2008a):

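\[
G = \frac{ZZ'}{2\sum_{i} p_i(1 - p_i)}
\]

where Z is the matrix of genotypes centered by allele frequency (animals in rows, markers in

columns) and \(p_i\) is the allele frequency at marker i.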
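
As an illustration of the G-matrix formula, the following minimal sketch builds a genomic
relationship matrix with NumPy. It assumes genotypes are coded as 0/1/2 allele counts with
animals in rows and markers in columns, so that the cross-product of the centered genotypes
over markers gives an animal-by-animal matrix. The function name `vanraden_g` and the toy
genotypes are hypothetical and are not taken from this study.

```python
import numpy as np

def vanraden_g(genotypes):
    """Genomic relationship matrix from 0/1/2 allele counts.

    genotypes: (n_animals, n_markers) array; this coding and orientation
    are assumptions of the sketch.
    """
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0                 # allele frequency per marker
    Z = M - 2.0 * p                          # center each marker column by 2p
    scale = 2.0 * np.sum(p * (1.0 - p))      # 2 * sum of p(1 - p)
    return Z @ Z.T / scale

# Hypothetical genotypes for four animals at four markers
genotypes = np.array([
    [0, 1, 2, 1],
    [1, 1, 2, 0],
    [2, 0, 1, 1],
    [1, 2, 0, 2],
])
G = vanraden_g(genotypes)
print(np.round(G, 3))
```

Because allele frequencies are estimated from the same animals, each row of this G sums to
zero, so the matrix is singular; evaluations that need its inverse typically blend or
regularize it first.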

 Inbreeding in a population can give the researcher a good sense of the breeding trends 

that have been occurring over the years. An increase in inbreeding could mean that this 

population was heavily selected in a line-breeding scheme, or this population has been isolated 

and was only able to breed among themselves. Previous studies on the American Wagyu 

population have been published (Heffernan et al., 2021; Scraggs et al., 2014), reporting a very

low effective population size (14) and long runs of homozygosity, both of which point to a small

gene pool and high levels of inbreeding.
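
The pedigree-based A-matrix and the inbreeding coefficients derived from it can be sketched
with the tabular method. This is a minimal illustration, assuming animals are indexed so that
parents always appear before their offspring and unknown parents are marked with `None`; the
function name and the five-animal pedigree are hypothetical. The pedigree inbreeding
coefficient of an animal is its diagonal element minus one.

```python
import numpy as np

def numerator_relationship(pedigree):
    """Tabular method for the A-matrix.

    pedigree: list of (sire, dam) index pairs, ordered so parents come
    before offspring; None marks an unknown (founder) parent.
    """
    n = len(pedigree)
    A = np.zeros((n, n))
    for i, (s, d) in enumerate(pedigree):
        # Diagonal: 1 plus half the relationship between the two parents
        both_known = s is not None and d is not None
        A[i, i] = 1.0 + (0.5 * A[s, d] if both_known else 0.0)
        # Off-diagonal: average of the relationships with each parent
        for j in range(i):
            a_js = A[j, s] if s is not None else 0.0
            a_jd = A[j, d] if d is not None else 0.0
            A[i, j] = A[j, i] = 0.5 * (a_js + a_jd)
    return A

# Hypothetical pedigree: 0 and 1 are founders, 2 and 3 are full sibs,
# and 4 is the offspring of a full-sib mating (2 x 3).
pedigree = [(None, None), (None, None), (0, 1), (0, 1), (2, 3)]
A = numerator_relationship(pedigree)
F = np.diag(A) - 1.0   # pedigree inbreeding coefficients
print(F)
```

In this toy pedigree the parent-offspring relationship is 0.50, as stated in the text, and the
animal from the full-sib mating has an inbreeding coefficient of 0.25.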

Intensive selection can lead to a bottleneck in a population, as only a few animals will be 

selected for continuing generations. This will lead to a loss in genomic variability which can lead 

to inbreeding depression wreaking havoc on a population (Charlesworth & Charlesworth, 1987).  

Inbreeding depression can have an impact via low fertility, low fitness, or deficient performance 

phenotype (Brüniche-Olsen et al., 2012; González-Recio et al., 2007). Understanding the

average inbreeding in a population is necessary to keep these negative effects at

bay and to maintain genetic variability in the population. Identifying highly inbred animals is also

of importance, as utilizing them as breeding animals must be balanced with the need to decrease

the inbreeding coefficient in future generations.

Another good indicator of inbreeding and genetic variability is the estimate of the

effective population size (Ne). Effective population size reflects the amount of genetic drift and

inbreeding that has accumulated in the population. In general, a small Ne is consistent with a

population that has been selected intensively, as a restricted number of animals are available for

each generation as breeding animals. Effective population size gives insight into the genetic

makeup of a population by estimating the number of animals in an idealized, randomly mating

population that would show the same loss of variability as the population being analyzed.
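
One common way to put a number on effective population size, not necessarily the method used
in the studies cited here, is through the rate of inbreeding per generation, with
Ne = 1 / (2ΔF) and ΔF = (F_t - F_{t-1}) / (1 - F_{t-1}). The sketch below assumes average
inbreeding coefficients are available for two consecutive generations; the function name and
the input values are hypothetical.

```python
def effective_population_size(f_prev, f_curr):
    """Ne from the rate of inbreeding between two consecutive generations."""
    delta_f = (f_curr - f_prev) / (1.0 - f_prev)   # rate of inbreeding
    if delta_f <= 0:
        raise ValueError("inbreeding did not increase; Ne is undefined here")
    return 1.0 / (2.0 * delta_f)

# Hypothetical averages: inbreeding rises from 0.00 to 0.05 in one generation
ne = effective_population_size(0.00, 0.05)
print(ne)
```

The faster average inbreeding rises from one generation to the next, the smaller the
resulting Ne.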

Opposing homozygotes are also utilized in population analysis to explore

inconsistencies within the pedigree (Hayes, 2011). They can help identify animals that may be

incorrectly recorded as related in the pedigree or identify unrecorded relationships. Opposing

homozygotes between a recorded parent and offspring (i.e., the parent is AA and the offspring is BB)

cannot arise through Mendelian inheritance, so counting them across markers exposes errors in the pedigree. This is

important in populations that are pedigree dependent, such as breed associations, which register

animals based on parentage verification.
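
The opposing-homozygote check can be sketched as a simple count over markers. The example
assumes genotypes coded as 0 (AA), 1 (AB) and 2 (BB), with -1 marking a missing call; the
coding, the function name and the genotypes are hypothetical. How many conflicts are
tolerated before a parentage record is rejected depends on the genotyping error rate and is
left to the user.

```python
import numpy as np

def count_opposing_homozygotes(parent, offspring):
    """Return (number of opposing homozygotes, number of markers compared)."""
    parent = np.asarray(parent)
    offspring = np.asarray(offspring)
    called = (parent >= 0) & (offspring >= 0)          # both genotypes present
    # AA vs BB (0 vs 2) is the only pair whose codes differ by exactly 2
    opposing = called & (np.abs(parent - offspring) == 2)
    return int(opposing.sum()), int(called.sum())

# Hypothetical recorded parent and offspring at six markers
parent = [0, 2, 1, 2, 0, -1]
offspring = [0, 0, 1, 2, 2, 2]
n_opposing, n_compared = count_opposing_homozygotes(parent, offspring)
print(n_opposing, n_compared)
```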

All previous population analyses can identify structures within populations, but

comparing different populations within one analysis is most often done with principal component

analysis (PCA). This is one of the most common ways to understand breed composition using

genomic data, as it establishes breed grouping through eigenvectors and eigenvalues. The

eigendecomposition of the genomic relationship matrix uncovers the variation in a

population that is attributed to breed (McVean, 2009; Patterson et al., 2006). The eigenvectors,

which define the principal components, show the grouping of animals per breed, while the eigenvalues

give the amount of variance captured by each principal component, which describes how strongly

the breeds are separated (Karamizadeh et al., 2013). The top principal components that account for most

of the variation between the breeds in the genomic matrix, usually the first and second, are then

plotted against each other to visualize groupings of the breeds studied.

For breed identification, inclusion of multiple populations and breeds is crucial for

true identification of each sample. Without a good base representation of many breeds, a sample

may not be grouped within any breed or may be incorrectly grouped with another

breed, and identification is then not possible.
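
The eigendecomposition described above can be sketched with NumPy. The example assumes a
G-matrix built from centered 0/1/2 genotypes and uses two hypothetical "breeds" of three
animals each that are fixed for opposite alleles at twelve markers and vary within breed at
two more; the data and the function name `pca_from_g` are illustrative only.

```python
import numpy as np

def pca_from_g(G, n_components=2):
    """Principal component scores and variance explained from a G-matrix."""
    eigvals, eigvecs = np.linalg.eigh(G)          # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]             # sort largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()           # share of variance per PC
    top = np.clip(eigvals[:n_components], 0.0, None)
    scores = eigvecs[:, :n_components] * np.sqrt(top)
    return scores, explained[:n_components]

# Hypothetical genotypes: rows 0-2 are "breed A", rows 3-5 are "breed B"
breed_a = [[2] * 6 + [0] * 6 + tail for tail in ([1, 0], [0, 2], [2, 1])]
breed_b = [[0] * 6 + [2] * 6 + tail for tail in ([1, 2], [0, 1], [2, 0])]
M = np.array(breed_a + breed_b, dtype=float)

p = M.mean(axis=0) / 2.0
Z = M - 2.0 * p
G = Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

scores, explained = pca_from_g(G)
print(np.round(scores, 3))
print(np.round(explained, 3))
```

In this toy data the first principal component carries 90% of the variation and places the
two groups on opposite sides of zero, which is the pattern a breed-level PCA plot relies on.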

PCA can also be used within populations to identify family lines. Family lines within 

Wagyu are known by prefecture that the exported animal originated from in Japan. Previous 

studies have outlined these prefecture lines and the phenotypic differences that are associated 

with these lines (Oikawa, 2018; Smith et al., 2001). The main differences lie in the expression of

marbling and fatty acid profile. The most well-known example is Kobe beef from Hyogo

prefecture, which is cherished for its extremely high marbling and for an authentication process

that comes with its own certificate (Kobe Beef Marketing & Distribution Promotion Association,

2023).

The worldwide status of Wagyu shows an increasing number of countries utilizing Wagyu as a

prominent source of beef (Fortune Business Insights, 2023). The population is still most highly

concentrated in Japan, with the US and Australia (Gokey, 2018; Rouse et al., 2000) producing

many of the Wagyu raised outside of Japan. Many of these animals are Fullblood animals, or animals that are

considered 100% Wagyu, which trace their lineage back to the Japanese founders through

pedigree and genomic analysis.

The recent large growth of Fullblood animals present in the United States is due to an 

explosion in embryo transfer use (Elsden et al., 1976; Tanabe et al., 1985). These reproductive 

technologies paved the way for a rapid expansion of this population without the need for many 

Fullblood cows. The growing population has been under selection pressures that are different 

from their Japanese counterparts, which may have contributed to genetic drift in this population. 

This phenomenon may lead to differing Wagyu populations worldwide not overlapping in a 

principal component analysis. Addressing all population measures in the Wagyu population in 

the US before further analysis can help identify the genetic structure of the current population. 

This can help identify parent-offspring relationships, help understand breed or family 

composition through PCA or even understand if this population has a high inbreeding coefficient 

and define selection decisions to reduce inbreeding. 

Significant QTL in Wagyu identified through GWAS  

The introduction of genotyping animals through SNP chip technology opened the door 

for producers to utilize genomic information in a new way. Phenotypes could now be connected 

to QTL (Quantitative Trait Loci), more complex traits could be explored and explained, and 

fitting random genetic effects with a genomic relationship matrix was now possible in modeling. 

The utilization of genomics in the beef industry has allowed for selection of young bulls without

the need for progeny testing. This has increased selection intensity, which can increase the rate of

genetic gain, but can also have adverse effects (S. K. Jain & Allard, 1966; T. Meuwissen et al.,

2016). This was seen in the Holstein population in the US in the early 2000s, as the introduction of

genomics helped tremendously in milk output, but decreased genomic variability, which led to a

decrease in fertility rates (Lucy, 2007). The Wagyu population may be in danger of the same

issues, as only a small population is available for breeding outside of Japan, and the effective 

population size is very small (Scraggs et al., 2014). 

Identification of related animals in Wagyu was previously done with pedigree records 

only, but the introduction of genomic information in the form of SNPs transformed how animals 

are identified in a breeding population. The creation of the genomic relationship matrix (Gianola

& de los Campos, 2008; González-Recio et al., 2008; VanRaden, 2008a) was a large step in 

utilizing genomic information in animal models. The G-matrix is best described as the 

relationship between animals based on the allelic frequency in the population being analyzed. 

These genetic markers explain the random genetic variation that occurs in each population. This 

can also be classified as the additive effect, or the purely “SNP” based effect throughout the 

genome that is contributing to phenotypic variation.  

The G-matrix on its own does not include older animals that have only

pedigree information and from which a DNA sample can no longer be collected (death, culling,

harvest, etc.). A single-step approach (H-matrix) to combine the pedigree and genomic

information in these populations was introduced by Legarra et al. (2009) and is the current

standard for many animal industries (J. C. Dekkers, 2004; Hutchison et al., 2014; Knol et al.,

2016; Wolc et al., 2016).

Genome-wide association studies (GWAS) are one of the most common ways to identify

significant SNP in a population of genotyped animals. Identifying these SNP can be done through

a BLUP model, or GBLUP if genomic information is being utilized. Obtaining SNP effects

through this model is done through backsolving (Strandén & Garrick, 2009; VanRaden, 2008b;

H. Wang et al., 2012). A more specific SNP-based model estimates the genomic EBVs through a

snpBLUP, which estimates the value of each SNP with fixed effects considered:

\[
y = Xb + Wa + e
\]

Where y is the phenotype, X is the incidence matrix for fixed effects, b is the vector of

fixed effects, W is the genotype matrix, a is the vector of regression coefficients for random SNP

effects, where we assume \(a \sim N(0, G\sigma_a^2)\), and e is the vector of residual effects, where we

assume \(e \sim N(0, I\sigma_e^2)\). To obtain p-values of each SNP effect we can utilize output from the

snpBLUP in this equation:

\[
pval_i = 2\left(1 - \Phi\left(\left|\frac{\hat{a}_i}{sd(\hat{a}_i)}\right|\right)\right)
\]

This gives the significance of the SNP effect \(\hat{a}_i\) as \(pval_i\), where \(\Phi\) is the cumulative

distribution function of the standard normal distribution. Significant SNP throughout a genome are

traditionally visualized by plotting these p-values in Manhattan plots, famously

named after the skyline of Manhattan. The highest peaks are those with the most significant

association with the phenotype, whether the effect on the phenotype is positive or negative.

The SNP whose \(-\log_{10}\) p-values cross the threshold of \(-\log_{10}(5 \times 10^{-8})\) are

considered the SNP with the most influence on a phenotype. This log transformation is used because

the very small p-values that are obtained become easier to visualize.
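
The p-value calculation and the significance threshold can be sketched with the Python
standard library alone. The example assumes that Φ denotes the standard normal cumulative
distribution function, in which case 2(1 - Φ(z)) equals the complementary error function
erfc(z / √2). The function name and the SNP effect and standard deviation used here are
hypothetical.

```python
import math

def snp_p_value(effect, sd):
    """Two-sided p-value of a SNP effect from its standardized value."""
    z = abs(effect / sd)
    # 2 * (1 - Phi(z)) == erfc(z / sqrt(2)) for the standard normal CDF Phi
    return math.erfc(z / math.sqrt(2.0))

threshold = -math.log10(5e-8)          # height of the line on a Manhattan plot

# Hypothetical SNP effect estimate and its standard deviation
p = snp_p_value(effect=0.30, sd=0.05)
height = -math.log10(p)                # value plotted for this SNP
print(p, height, height > threshold)
```

A SNP is drawn above the threshold line only when its standardized effect is large enough
for the p-value to fall below 5 × 10^-8.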

 Utilizing whole-genome sequence (WGS) in GWAS can help tease out these significant 

peaks and identify more rare variants (Onteru et al., 2012; Wu et al., 2017) but may also 

introduce more noise into these association studies. Most traits that are being explored via

GWAS are complex, in that they are controlled by many areas of the genome. There are, however,

important production traits, such as polled/horned or coat color, that are controlled by one locus

and follow simple dominance, also known as Mendelian inheritance, i.e.,

polled animals or black-hided animals have at least one dominant copy of the gene to express the

phenotype.

Some of the previous work done on Wagyu animals has been on the Japanese and 

Chinese populations (An et al., 2019; Mizoguchi et al., 2006; Mizoshita et al., 2004; Takasuga et 

al., 2007; Zhang et al., 2019), which have identified significant QTL in both Chinese and

Japanese Wagyu. Specific QTL related to growth traits in Wagyu identified by An et al. (2019) can be

found in Table 2.2.

Table 2.2 Significant QTL found in Chinese Wagyu for growth traits from An et al. (2019)

Trait         SNP Name                 BTA   MAF    Gene Name   P-Value
Body Height   Hapmap46986-BTA-34282    14    0.42   PENK        4.19E-06
Body Height   BovineHD1400007323       14    0.42   PENK        5.69E-06
Body Height   BovineHD4100011295       14    0.42   PENK        6.97E-06
Body Height   BTB-00557532             14    0.4    XKR4        9.48E-06
Body Height   BovineHD1400007377       14    0.48   IMPAD1      9.58E-06
Body Length   BovineHD1400015419       14    0.2    CSMD3       6.69E-06
Hip Height    Hapmap46986-BTA-34282    14    0.42   PENK        1.34E-07
Hip Height    BovineHD1400007259       14    0.45   PLAG1       4.41E-07
Hip Height    BovineHD4100011295       14    0.42   PENK        5.36E-07
Hip Height    BovineHD1400007323       14    0.42   PENK        6.26E-07
Hip Height    BTB-00557532             14    0.4    XKR4        7.41E-07
Hip Height    ARS-BFGL-NGS-98420       5     0.1    CCND2       1.40E-06
Hip Height    BovineHD1400006445       14    0.27   SNTG1       2.86E-06
Hip Height    BovineHD1400007333       14    0.41   PENK        3.05E-06
Hip Height    BTB-01530836             14    0.4    XKR4        3.12E-06
Hip Height    BovineHD1400007377       14    0.48   IMPAD1      3.63E-06
Hip Height    Hapmap32552-BTA-129045   14    0.26   SNTG1       3.83E-06
Hip Height    BovineHD1400007314       14    0.34   PENK        4.07E-06
Hip Height    Hapmap26308-BTC-057761   6     0.1    LAP3        5.10E-06
Hip Height    BovineHD1400007375       14    0.47   IMPAD1      5.52E-06
Hip Height    BovineHD0500034451       5     0.34   FAM19A5     6.14E-06
Hip Height    BovineHD0500020210       5     0.19   SYN3        7.02E-06
Hip Height    BovineHD0500020213       5     0.19   TIMP3       7.02E-06
Hip Height    BovineHD1400007373       14    0.48   IMPAD1      9.18E-06
Multi-Trait   Hapmap46986-BTA-34282    14    0.42   PENK        1.63E-06
Multi-Trait   BovineHD0500026837       5     0.49   STRAP       2.96E-06
Multi-Trait   BTB-00557532             14    0.4    XKR4        4.73E-06
Multi-Trait   ARS-BFGL-NGS-98420       5     0.1    CCND2       4.93E-06
Multi-Trait   Hapmap26308-BTC-057761   6     0.1    LAP3        6.54E-06
Multi-Trait   BovineHD4100011295       14    0.42   PENK        6.64E-06
Multi-Trait   BovineHD1400007259       14    0.45   PLAG1       8.19E-06
Multi-Trait   BovineHD1400007323       14    0.42   PENK        8.22E-06

Utilizing the most significant SNP identified through GWAS in genomic prediction has 

been proposed, but addition of these SNP seems to have little effect (Moser et al., 2009). In 

general, when more information is added to estimate breeding values, the more accurate the 

estimate will be. Adding these significant SNP may increase accuracy, but it may just be due to 

the addition of new information. Identifying significant areas of the genome for genomic 

prediction between breeds can be done (T. Meuwissen et al., 2021), but it will require high 

density genotypes, which may not be available to industry applications due to cost and 

computing space. Artificial intelligence has also been explored for adding significant SNP to

genomic prediction (Li et al., 2018), using learning algorithms to find the subset of SNP that gives

the most accurate genomic prediction.

Epistatic effects should also be taken into account, as these effects may

introduce a “double-dipping” effect of including multiple SNP that are influencing each other (S.

K. Jain & Allard, 1966; Phillips, 2008). This can inflate or deflate the significance of SNP that

are directly influencing the phenotype.

Genomic Prediction in Wagyu 

Traditional modeling first used sire models to understand and estimate random genetic 

effects, while still including fixed effects. Use of these methods in the animal industry has shown 

large leaps in efficiency (Lourenco et al., 2020; Pocrnic et al., 2019), as selection of animals for

increased production traits has led to fewer animals being needed to produce the same

amount or more of animal products.

The model used to obtain current breeding values is a BLUP, or best linear unbiased

prediction. This model utilizes mixed effects (random and fixed) to estimate the effect of a random

genotype on a phenotype given fixed effects. The model is as follows:

\[
y = Xb + Zu + e
\]

Where y is the phenotype, X is the incidence matrix for fixed effects, relating each animal

to its fixed effect in b, which is the vector of fixed effects, Z is the incidence matrix relating each

record to an animal, u is the vector of random breeding values, where we assume

\(u \sim N(0, G\sigma_u^2)\), and e is the vector of residual effects, where we assume

\(e \sim N(0, I\sigma_e^2)\). This is the most traditional and widely used model to

obtain breeding values for animals that combines production information and animal

relationships via genotype. Breeding values are the backbone of the cattle industry, as selection

on EPDs (Expected Progeny Differences) is the most efficient way to make significant change

within a cattle population. Solutions can be obtained through the mixed model equations:

\[
\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z + G^{-1}\alpha \end{bmatrix}
\begin{bmatrix} \hat{b} \\ \hat{u} \end{bmatrix}
=
\begin{bmatrix} X'y \\ Z'y \end{bmatrix}
\]

Where \(\alpha\) is defined as \(\sigma_e^2/\sigma_u^2\).
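
The mixed model equations can be solved directly for small problems. The following minimal
sketch assumes one record per animal, an overall mean as the only fixed effect, and known
variance components; the function name `solve_mme`, the toy G-matrix and the phenotypes are
hypothetical, and production software uses far more efficient solvers than a dense solve.

```python
import numpy as np

def solve_mme(y, X, Z, G, sigma_e2, sigma_u2):
    """Solve the mixed model equations for fixed effects and breeding values."""
    alpha = sigma_e2 / sigma_u2                    # variance ratio
    lhs = np.block([
        [X.T @ X, X.T @ Z],
        [Z.T @ X, Z.T @ Z + np.linalg.inv(G) * alpha],
    ])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    solution = np.linalg.solve(lhs, rhs)
    n_fixed = X.shape[1]
    return solution[:n_fixed], solution[n_fixed:]

# Hypothetical data: four animals, one record each, overall mean as fixed effect
y = np.array([1.0, 2.0, 3.0, 4.0])
X = np.ones((4, 1))
Z = np.eye(4)
G = np.array([
    [1.00, 0.50, 0.00, 0.00],
    [0.50, 1.00, 0.00, 0.00],
    [0.00, 0.00, 1.00, 0.25],
    [0.00, 0.00, 0.25, 1.00],
])
b_hat, u_hat = solve_mme(y, X, Z, G, sigma_e2=1.0, sigma_u2=1.0)
print(b_hat, u_hat)
```

A positive-definite G is assumed here because the equations require its inverse.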

To fully understand the power of genomic prediction, a training and testing group of 

animals is used to test the accuracy of prediction. In most prediction studies, the training group is 

much larger than the testing group as it needs to capture the variation in the population sample 

that is being tested. This means that there needs to be a large sample of animals in this group to 

get a good understanding of the allele frequencies that are present in a specific population. In 

many cases, if there is a good representation of the population in the training group, the accuracy 

of prediction in the testing group will be high (Berry et al., 2016; T. H. E. Meuwissen et al., 

2001). In this case, the underlying genomic connections between these two groups would be 

similar enough to obtain high genomic prediction accuracy of the testing group.  

All Wagyu animals outside of Japan can be traced back to a handful of animals exported 

in the 1970s and 1990s. The commonality of ancestors in Wagyu is apparent, as all US Wagyu,

red or black, originate from Japan. There are also some Red/Black crossed animals present in the 

American population, which may aid in more accurate genomic prediction between the Red and 

Black populations (Esfandyari et al., 2015; Moghaddar et al., 2014; Van Grevenhof et al., 2019).  

If there are common ancestors that are present between training and testing groups, the accuracy 

of prediction may increase between the two populations. This is due to the two groups having 

genomic connections or linkage, which aids in more accurate prediction of a breeding value. 

Re-estimating breeding values on a regular basis is crucial to realize higher accuracies, as

the per-generation breakdown of linkage within the genome can contribute to lower accuracies of

breeding values (Cuyabano et al., 2019). In an industry setting, the accuracy reported for

animals with predicted breeding values is given by this equation:

\[
ACC = 1 - \sqrt{\frac{\text{Prediction Error Variance}}{\text{Additive Genetic Variance}}}
\]

Where both the prediction error variance and the additive genetic variance can be 

estimated by solving the mixed model equation. Prediction error variance can be estimated by 

the inverse of the coefficient matrix (Harris & Johnson, 1998; Misztal et al., 2013), but more

computationally efficient methods have been proposed, such as an MCMC sampler to estimate

these posterior effects (Hickey et al., 2009).  
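
For a small example, the prediction error variance can be read from the inverse of the
coefficient matrix and converted into the reported accuracy. The sketch below assumes the
coefficient matrix is built with the variance ratio σ_e²/σ_u², so that the prediction error
variance of each animal is the corresponding diagonal element of the inverse multiplied by
σ_e². The function name and the inputs (four unrelated animals with one record each) are
hypothetical.

```python
import numpy as np

def reported_accuracy(X, Z, G, sigma_e2, sigma_u2):
    """Accuracy = 1 - sqrt(PEV / additive genetic variance) per animal."""
    alpha = sigma_e2 / sigma_u2
    lhs = np.block([
        [X.T @ X, X.T @ Z],
        [Z.T @ X, Z.T @ Z + np.linalg.inv(G) * alpha],
    ])
    c_inv = np.linalg.inv(lhs)                     # direct inversion, small cases only
    n_fixed = X.shape[1]
    pev = np.diag(c_inv)[n_fixed:] * sigma_e2      # prediction error variances
    return 1.0 - np.sqrt(pev / sigma_u2)

# Hypothetical setting: four unrelated animals, one record each
X = np.ones((4, 1))
Z = np.eye(4)
G = np.eye(4)
acc = reported_accuracy(X, Z, G, sigma_e2=1.0, sigma_u2=1.0)
print(np.round(acc, 3))
```

Direct inversion scales poorly with the number of animals, which is the motivation for the
approximate and sampling-based methods cited above.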

In research settings, where the model used or the population used is the focus of genomic

prediction, the accuracy of the breeding value is just the Pearson correlation between the

estimated breeding value and the actual phenotype recorded. The higher this correlation, the

more accurate the prediction is. A standardized accuracy of prediction can be found by (J. C. M.

Dekkers, 2007):

\[
ACC = \frac{\rho(y, \hat{y})}{\sqrt{h^2}}
\]

Where the correlation between the recorded phenotype and the estimated breeding value is in the

numerator and the square root of the heritability is in the denominator. Testing the accuracy of

prediction of a model is a large focus in research and industry settings, as increases in accuracy

can directly affect the economic return from selection for a trait.
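
The standardized accuracy can be computed in a few lines once predictions are available for
the testing group. This sketch assumes the heritability is known from a separate variance
component analysis; the function name, phenotypes, predictions and heritability value are all
hypothetical.

```python
import numpy as np

def prediction_accuracy(y, y_hat, h2):
    """Pearson correlation of phenotype and prediction, scaled by sqrt(h2)."""
    r = np.corrcoef(y, y_hat)[0, 1]
    return r / np.sqrt(h2)

# Hypothetical phenotypes and predicted breeding values for a testing group
y = np.array([4.1, 5.3, 6.0, 4.8, 5.6, 6.4])
y_hat = np.array([0.1, 0.3, 0.2, -0.2, 0.4, 0.5])
acc = prediction_accuracy(y, y_hat, h2=0.40)
print(round(acc, 3))
```

Dividing by the square root of the heritability accounts for the fact that the phenotype is
only a noisy measure of the true breeding value, so the raw correlation understates accuracy.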

Genomic prediction has changed the landscape of selection in the animal industry due to

the shortening of the generation interval (García-Ruiz et al., 2016). Choosing animals with the

greatest genetic potential before they are proven by progeny has increased the rate of genetic gain in

most animal industries. Identifying those animals before they are proven via their own phenotype

or by progeny has been the main driver of selection in all animal industries, as this decreases the

generation interval and increases genetic gain, as seen in the selection response equation:

\[
\Delta G = \frac{i r \sigma_A}{L}
\]

Where \(i\) is the selection intensity, \(r\) is the accuracy, \(\sigma_A\) is the genetic standard deviation

and \(L\) is the generation interval. The response we expect to see per generation, or the difference

between the mean phenotype of the offspring of selected parents and the whole parental

generation before the selection occurred, can be identified through:

\[
R = h^2 S
\]

Where \(h^2\) is the narrow sense heritability (estimated by \(h^2 = \sigma_a^2 / \sigma_p^2\), or the proportion of

phenotypic variability we can attribute to additive genetic effects) and \(S\) is the selection differential,

measured as the difference between the mean of the selected parents and the population mean, which is

usually inferior to that of the selected parents (Falconer & Mackay, 1996).
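
A short worked example makes the effect of the generation interval concrete. The values
below are hypothetical and chosen only to show the arithmetic of the two equations; the
function names are likewise illustrative.

```python
def genetic_gain(i, r, sigma_a, L):
    """Selection response per unit of time: i * r * sigma_A / L."""
    return i * r * sigma_a / L

def response_to_selection(h2, S):
    """Response per generation from the breeder's equation: R = h2 * S."""
    return h2 * S

# Hypothetical values: same intensity, accuracy and genetic standard
# deviation, with the generation interval halved from 5.0 to 2.5 years
gain_progeny_test = genetic_gain(i=1.4, r=0.7, sigma_a=0.5, L=5.0)
gain_genomic = genetic_gain(i=1.4, r=0.7, sigma_a=0.5, L=2.5)
print(gain_progeny_test, gain_genomic)

# Hypothetical heritability of 0.4 and selection differential of 10 units
print(response_to_selection(h2=0.4, S=10.0))
```

With everything else held constant, halving the generation interval doubles the gain per
year, which is why selecting young animals on genomic predictions has such a large effect.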

The effort of increasing accuracies may not be at the forefront of the average cattleman's mind

but must be a consistent drum in the heartbeat of the beef industry. Without high accuracies, the

selection decisions made by industry professionals may not be realized, and significant genetic 

change will not occur. Increasing accuracy can be done by including more performance data on 

related animals (Quaas & Pollak, 1980), genotyping more animals within the population (Hayes, 

Visscher, et al., 2009), or improving the pedigree by identifying correct parentage (Geldermann 

et al., 1986). This is especially important to keep at the forefront of the American Wagyu 

population, as the pool of animals to select from is much smaller than other traditional breeds. 

Striving for higher accuracies through phenotype reporting and continuous genotype collection

will be crucial for improving Wagyu in the US through genetic change.  

Wagyu in America and Future Outlooks 

Utilization of Wagyu animals, semen and embryos in the US has taken off due to 

crossing Wagyu with traditional breeds, such as Angus, Hereford, or dairy breeds, such as 

Holstein. This is due to the large discrepancy in marbling between Wagyu and traditional beef 

breeds, which yields a high-marbling F1 cross. By using a high-marbling breed, these

progeny can achieve high carcass grades, as many of these animals will hit the “prime”

grade that is highly sought after in the United States. As of August 11, 2023, an animal graded

prime earned an average premium of $18 (USDA Livestock Poultry & Grain Market News,

2023). This increased payout per prime graded animal has driven US producers towards Wagyu

for a higher grading product.

Genomics is the most effective tool to identify these specialty products. This is because 

the genotype will not change, and identification of fraudulent labeling of products can be 

achieved if these genotyping methods can be brought out of the lab. Eventually, bringing 

genomic methodology into any farmer's hand will be possible. The advent of more mobile 

sequencing methods (Deamer et al., 2016; Delahaye & Nicolas, 2021; Lamb et al., 2021; Tyler et 

al., 2018) could lead to identification of animals without the wait of collecting and submitting 

samples to a third-party lab. This is especially important when discussing Wagyu animals, as these

animals can fetch up to $100,000 in private sales. If an easy-to-use, pocket-sized sequencer could

show that an animal is not a Fullblood Wagyu, the purchase of a fraudulent animal could be

avoided. Issues may still arise in the bioinformatic pathways, as obtaining usable data from these

sequencers in a straightforward way can be difficult. Some

recent advancements in this area include Dusselpore (Vogeley et al., 2021) and QUILT (Davies

et al., 2021), which introduce new bioinformatic platforms that boast easy-to-use results from

ONT sequencers for human DNA. This area of research, bioinformatic platforms that are easy to

use, should be in stronger focus if this sequencer is to be used by an average consumer. The jump

from sequence to breeding value has many steps in between that may not be feasible for an

everyday user, but it is a driving force for rapid selection tools in beef.

Many producers are looking for cutting edge research that will be able to quickly 

influence their herds, either looking for positive production traits, such as high marbling in 

Wagyu or for identifying and straying from negative ones, such as recessive diseases. This is 

especially prominent in the Wagyu breed, as many producers are introducing this “new” breed 

into their herd. Understanding the breed history, composition and relatedness to other breeds can 

help industry professionals make more informed decisions in breeding their animals. This can 

also aid breed associations, as more accurate information on genetic architecture of the breed 

being evaluated will lead to more accurate breeding values.  

Agricultural practices have been around for thousands of years, with very crude

“selection” practices starting with choosing the best animals just through physical or behavioral 

attributes. Many of these selection decisions, whether direct or indirect, led to the creation of our 

modern-day cattle breeds. Specifically, indirect selection in East Asian breeds, such as

Wagyu, has created a distinctive product for consumption today. Wagyu animals selected for

work, such as pulling plows, were unknowingly being developed into the high-quality marbled product that

we cherish and strive for today.

The evolution from the primitive selection of our very distant ancestors to the modern 

breeding program is a large jump. Breeding programs today are complex, taking many scientific 

disciplines to understand the “best” selection decisions for a herd. The many facets of creating a 

modern breeding program can lead to a disconnect between all the specialized areas (molecular,

bioinformatic, statistical, functional, etc.), and interpreting the output from each field can involve a

rather steep learning curve. All these fields must be considered to create a comprehensive

breeding program, yet many overlook how important each step is in making an overarching

decision. A study into each specialized area and how they impact and interact with each other is

needed to better understand the outwardly simple process of “selection”. An exploration

into obtaining genomic information, creating bioinformatic pipelines, identification of animals,

understanding population structure and how this structure can influence genomic prediction,

prediction between breeds, and identification of SNP that influence industry traits is needed to

fully understand the best modern breeding techniques for breeding programs.

Modern breeding programs in relatively new and high-quality breeds such as

Wagyu in the US market should be explored as deeply as possible. The information that needs

to be available includes the breed composition of these animals, their population structure

compared to “traditional” European breeds, and whether these Asian breeds can be

grouped together in genetic evaluations. This is important to the current state of Wagyu outside

of Japan, as evaluations are done with Red (Brown Wagyu or Akaushi), Black, and Red/Black

crossed animals combined in one single evaluation. Including all animals may help give the

model more genomic information on the Wagyu population, but there must be enough animals

from all groups (brown/red, black, cross) that are related to the animals being predicted

to have accurate estimates. Accurate estimates of breeding values are of utmost importance in an

industry setting, as these estimates directly relate to the monetary value of the animal and its 

products, whether that be meat or future semen or embryos for breeding. 

The Wagyu population in the United States is still in its infancy, with the first Fullblood 

animals arriving in the 1990s. This has paved a pathway for the demand of high-marbling beef in 

the western world, which can be achieved even with a 50% Wagyu cross. Identification of

Wagyu products in the market should also be well established, as fraudulent labeling may

otherwise wash out and tarnish the Wagyu brand. Flushing these

products from the marketplace may be in the hands of out-of-lab sequencing with Oxford Nanopore’s

MinION. This technology is not foolproof yet, but these are the first steps of product

identification through sequence. The importance of keeping genetic diversity in this population 

should be at the forefront of most selection decisions, as the inbreeding in this population is very 

high compared to other beef breeds in the US. Increasing accuracy of breeding value estimates as 

well as keeping inbreeding low will be the critical selection decisions for establishing US Wagyu 

as a permanent market frontrunner.  

LITERATURE CITED 

An, B., Xia, J., Chang, T., Wang, X., Xu, L., Zhang, L., Gao, X., Chen, Y., Li, J., & Gao, H. 

(2019). Genome-wide association study reveals candidate genes associated with body 
measurement traits in Chinese Wagyu beef cattle. Animal Genetics, 50(4), 386–390.  

Aung, M. M., & Chang, Y. S. (2014). Traceability in a food supply chain: Safety and quality 

perspectives. In Food Control (Vol. 39, Issue 1, pp. 172–184). Elsevier BV.  

Berry, D. P., Garcia, J. F., & Garrick, D. J. (2016). Development and implementation of genomic 

predictions in beef cattle. Animal Frontiers, 6(1), 32–38.  

Bhattaru, S., Tani, J., Saboda, K., Borowsky, J., Ruvkun, G., Zuber, M., & Carr, C. (2019). 

Development of a Nucleic Acid-Based Life Detection Instrument Testbed. IEEE Aerospace 
Conference. 

Blakebrough-Hall, C., Mcmeniman, J. P., & González, L. A. (2020). An evaluation of the 

economic effects of bovine respiratory disease on animal performance, carcass traits, and 
economic outcomes in feedlot cattle defined using four BRD diagnosis methods. Journal of 
Animal Science, 1–11.  

Bosona, T., & Gebresenbet, G. (2013). Food traceability as an integral part of logistics 

management in food and agricultural supply chain. In Food Control (Vol. 33, Issue 1, pp. 
32–48). Elsevier.  

Brüniche-Olsen, A., Gravlund, P., & Lorenzen, E. D. (2012). Impacts of genetic drift and 

restricted gene flow in indigenous cattle breeds: evidence from the Jutland breed. Animal 
Genetic Resources/Resources Génétiques Animales/Recursos Genéticos Animales, 50, 75–
85.  

Charlesworth, D., & Charlesworth, B. (1987). Inbreeding Depression And Its Evolutionary 

Consequences. In Ann. Rev. Ecol. Syst (Vol. 18).  

Chen, N., Cai, Y., Chen, Q., Li, R., Wang, K., Huang, Y., Hu, S., Huang, S., Zhang, H., Zheng, 
Z., Song, W., Ma, Z., Ma, Y., Dang, R., Zhang, Z., Xu, L., Jia, Y., Liu, S., Yue, X., … Lei, 
C. (2018). Whole-genome resequencing reveals world-wide ancestry and adaptive 
introgression events of domesticated cattle in East Asia. Nature Communications, 9(1).  

Christensen, O. F., & Lund, M. S. (2010). Genomic prediction when some animals are not 

genotyped. Genetics Selection Evolution, 42(3).  

Clamer, M., Höfler, L., Mikhailova, E., Viero, G., & Bayley, H. (2014). Detection of 3′-end 

RNA uridylation with a protein nanopore. ACS Nano, 8(2), 1364–1374.  

Cuyabano, B. C. D., Wackel, H., Shin, D., & Gondro, C. (2019). A study of genomic prediction 

across generations of two Korean pig populations. Animals, 9(9).  

Davies, R. W., Kucka, M., Su, D., Shi, S., Flanagan, M., Cunniff, C. M., Chan, Y. F., & Myers, 
S. (2021). Rapid genotype imputation from sequence with reference panels. Nature 
Genetics, 53(7), 1104.  

Deamer, D., Akeson, M., & Branton, D. (2016). Three decades of nanopore sequencing. Nature 

Biotechnology, 34(5), 518–524. 

Dekkers, J. C. (2004). Commercial application of marker- and gene-assisted selection in 

livestock: Strategies and lessons. Journal of Animal Science, 82(suppl_13), E313–E328.  

Dekkers, J. C. M. (2007). Prediction of response to marker-assisted and genomic selection using 
selection index theory. Journal of Animal Breeding and Genetics, 124(6), 331–341.  

Delahaye, C., & Nicolas, J. (2021). Sequencing DNA with nanopores: Troubles and biases. PLoS 

ONE, 16(10).  

Elsden, R. P., Hasler, J. F., & Seidel, G. E. (1976). Non-surgical recovery of bovine eggs. 

Theriogenology, 6(5), 523–532.  

Esfandyari, H., Sørensen, A. C., & Bijma, P. (2015). A crossbred reference population can 

improve the response to genomic selection for crossbred performance. Genetics Selection 
Evolution, 47(1), 1–12.  

Falconer, D., & Mackay, T. F. C. (1996). Introduction to quantitative genetics (4th ed.). Prentice 

Hall. 

Flori, L., Fritz, S., Jaffrézic, F., Boussaha, M., Gut, I., Heath, S., Foulley, J. L., & Gautier, M. 

(2009). The Genome Response to Artificial Selection: A Case Study in Dairy Cattle. PLOS 
ONE, 4(8), e6595. 

Forristall, C., May, G. J., & Lawrence, J. D. (2002). Assessing the Cost of Beef Quality. 

Fortune Business Insights. (2023). Wagyu Beef Market Share. 

https://www.fortunebusinessinsights.com/wagyu-beef-market-106905 

García-Ruiz, A., Cole, J. B., VanRaden, P. M., Wiggans, G. R., Ruiz-López, F. J., & Van 

Tassell, C. P. (2016). Changes in genetic selection differentials and generation intervals in 
US Holstein dairy cattle as a result of genomic selection. Proceedings of the National 
Academy of Sciences of the United States of America, 113(28), E3995–E4004.  

Geldermann, H., Pieper, U., & Weber, W. E. (1986). Effect of Misidentification on the 

Estimation of Breeding Value and Heritability in Cattle. Journal of Animal Science, 63(6), 
1759–1768.  

Gianola, D., & de los Campos, G. (2008). Inferring genetic values for quantitative traits non-

parametrically. Genetics Research, 90(6), 525–540.  

Gokey, M. (2018). Japan’s obsession with marbling seeps into U.S. Progressive Cattle. 

https://www.progressivecattle.com/topics/beef-quality/japan-s-obsession-with-marbling-
seeps-into-u-s 

Gonzalez, J. M., & Phelps, K. J. (2018). United States beef quality as chronicled by the National 

Beef Quality Audits, Beef Consumer Satisfaction Projects, and National Beef Tenderness 
Surveys — A review. Asian-Australasian Journal of Animal Sciences, 31(7), 1036.  

González-Recio, O., Gianola, D., Long, N., Weigel, K. A., Rosa, G. J. M., & Avendaño, S. 
(2008). Nonparametric Methods for Incorporating Genomic Information Into Genetic 
Evaluations: An Application to Mortality in Broilers. Genetics, 178(4), 2305–2313.  

González-Recio, O., López De Maturana, E., & Gutiérrez, J. P. (2007). Inbreeding Depression 
on Female Fertility and Calving Ease in Spanish Dairy Cattle. Journal of Dairy Science, 
90(12), 5744–5752.  

Gotoh, T., Nishimura, T., Kuchida, K., & Mannen, H. (2018). The Japanese Wagyu beef 

industry: Current situation and future prospects - A review. In Asian-Australasian Journal 
of Animal Sciences (Vol. 31, Issue 7, pp. 933–950). Asian-Australasian Association of 
Animal Production Societies.  

Gupta, A. K., & Gupta, U. D. (2014). Next Generation Sequencing and Its Applications. Animal 

Biotechnology: Models in Discovery and Translation, 345–367.  

Harris, B., & Johnson, D. (1998). Approximate Reliability of Genetic Evaluations Under an 

Animal Model. Journal of Dairy Science, 81, 2723–2728.  

Hayes, B. J. (2011). Technical note: Efficient parentage assignment and pedigree reconstruction 
with dense single nucleotide polymorphism data. Journal of Dairy Science, 94(4), 2114–
2117.  

Hayes, B. J., Bowman, P. J., Chamberlain, A. J., & Goddard, M. E. (2009). Invited review: 

Genomic selection in dairy cattle: Progress and challenges. Journal of Dairy Science, 92(2), 
433–443.  

Hayes, B. J., Visscher, P. M., & Goddard, M. E. (2009). Increased accuracy of artificial selection 

by using the realized relationship matrix. Genetics Research, 91(1), 47–60.  

Heather, J. M., & Chain, B. (2016). The sequence of sequencers: The history of sequencing 

DNA. Genomics, 107(1), 1.  

Heffernan, K. R., Enns, R. M., Blackburn, H. D., Speidel, S. E., Wilson, C. S., & Thomas, M. G. 

(2021). Case study of inbreeding within Japanese Black cattle using resources of the 
American Wagyu Association, National Animal Germplasm Program, and a cooperator 
breeding program in Wyoming. Translational Animal Science, 5(Supplement_S1), S170–
S174. 

Hickey, J. M., Veerkamp, R. F., Calus, M. P., Mulder, H. A., & Thompson, R. (2009). 

Estimation of prediction error variances via Monte Carlo sampling methods using different 
formulations of the prediction error variance. Genetics Selection Evolution, 41(1), 1–9.  

History of Illumina Sequencing & Solexa Technology. (n.d.). Retrieved September 21, 2023, 

from https://www.illumina.com/science/technology/next-generation-sequencing/illumina-
sequencing-history.html 

Honda, T., Fujii, T., Nomura, T., & Mukai, F. (2006). Evaluation of genetic diversity in Japanese 
Brown cattle population by pedigree analysis. Journal of Animal Breeding and Genetics, 
123(3), 172–179.  

Honda, T., Nomura, T., Yamaguchi, Y., & Mukai, F. (2004). Monitoring of genetic diversity in 
the Japanese Black cattle population by the use of pedigree information. Journal of Animal 
Breeding and Genetics, 121(4), 242–252.  

Horii, M. (2009). Relationship between Japanese Beef Marbling Standard numbers and 

intramuscular lipid in M. longissimus thoracis of Japanese Black steers from 1994 to 2004. 
Nihon Chikusan Gakkaiho, 80, 55–61.  

Hutchison, J. L., Cole, J. B., & Bickhart, D. M. (2014). Short communication: Use of young 

bulls in the United States. Journal of Dairy Science, 97(5), 3213–3220.  

Jain, M., Olsen, H. E., Paten, B., & Akeson, M. (2016). The Oxford Nanopore MinION: 

Delivery of nanopore sequencing to the genomics community. Genome Biology, 17(1).  

Jain, S. K., & Allard, R. W. (1966). The Effects of Linkage, Epistasis, and Inbreeding on 

Population Changes under Selection. Genetics, 53(4), 633–659.  

Jo, M., Garc, J., Almeida, J. M. M. M. De, & Saraiva, C. (2021). Consumer Knowledge about 

Food Labeling and Fraud. Foods, 10, 1–12. 

Johnson, R. (2014). Food Fraud and “Economically Motivated Adulteration” of Food and Food 

Ingredients. Congressional Research Service Report R43358. www.crs.gov 

Karamizadeh, S., Abdullah, S. M., Manaf, A. A., Zamani, M., & Hooman, A. (2013). An 

Overview of Principal Component Analysis. Journal of Signal and Information Processing, 
04(03), 173–175.  

Kempster, A. J. (1989). Carcass and meat quality research to meet market needs. Animal 

Production, 48(3), 483–496.  

Kidd, K. K., & Cavalli-Sforza, L. L. (1974). The Role of Genetic Drift in the Differentiation of 

Icelandic and Norwegian Cattle. Evolution, 28(3), 381.  

Knol, E. F., Nielsen, B., & Knap, P. W. (2016). Genomic selection in commercial pig breeding. 

Animal Frontiers, 6(1), 15–22.  

Kobe Beef Marketing & Distribution Promotion Association. (2023). Kobe Beef. 

https://www.kobe-niku.jp/en/contents/council/index.html 

Kohama, N., Yoshida, E., Masaki, T., Iwamoto, E., Fukushima, M., Honda, T., & Oyama, K. 
(2021). Estimation of genetic parameters for carcass grading traits, image analysis traits, 
and monounsaturated fatty acids in Japanese Black cattle from Hyogo Prefecture. Animal 
Science Journal, 92(1), e13664.  

Lamb, H. J., Hayes, B. J., Randhawa, I. A. S., Nguyen, L. T., & Ross, E. M. (2021). Genomic 
prediction using low-coverage portable Nanopore sequencing. PLOS ONE, 16(12), 
e0261274.  

Lande, R., & Thompson, R. (1990). Efficiency of marker-assisted selection in the improvement 

of quantitative traits. Genetics, 124(3), 743–756. 

Lee, S.-H., Park, B.-H., Sharma, A., Dang, C.-G., Lee, S.-S., Choi, T.-J., Choy, Y.-H., Kim, H.-
C., Jeon, K.-J., Kim, S.-D., Yeon, S.-H., Park, S.-B., & Kang, H.-S. (2014). Hanwoo cattle: 
origin, domestication, breeding strategies and genomic selection. Journal of Animal Science 
and Technology, 56(1), 2. 

Legarra, A., Aguilar, I., & Misztal, I. (2009). A relationship matrix including full pedigree and 

genomic information. Journal of Dairy Science, 92.  

Li, B., Zhang, N., Wang, Y. G., George, A. W., Reverter, A., & Li, Y. (2018). Genomic 

prediction of breeding values using a subset of SNPs identified by three machine learning 
methods. Frontiers in Genetics, 9(JUL), 377541.  

Lonergan, S. M., Topel, D. G., & Marple, D. N. (2019). Fat and fat cells in domestic animals. 

The Science of Animal Growth and Meat Technology, 51–69.  

Lourenco, D., Legarra, A., Tsuruta, S., Masuda, Y., Aguilar, I., & Misztal, I. (2020). Single-Step 
Genomic Evaluations from Theory to Practice: Using SNP Chips and Sequence Data in 
BLUPF90. Genes, 11(170).  

Lu, H., Giordano, F., & Ning, Z. (2016). Oxford Nanopore MinION Sequencing and Genome 

Assembly. In Genomics, Proteomics and Bioinformatics (Vol. 14, Issue 5).  

Lucy, M. C. (2007). Fertility in high-producing dairy cows: reasons for decline and corrective 
strategies for sustainable improvement. Society of Reproduction and Fertility Supplement, 
64, 237–254.  

MAFF. (2020). Targets of domestic animal improvement.  

McVean, G. (2009). A genealogical interpretation of principal components analysis. PLoS 

Genetics, 5(10).  

Meuwissen, T. H. E., Hayes, B. J., & Goddard, M. E. (2001). Prediction of Total Genetic Value 

Using Genome-Wide Dense Marker Maps. Genetics, 157(4), 1819–1829.  

Meuwissen, T., Hayes, B., & Goddard, M. (2016). Genomic selection: A paradigm shift in 

animal breeding. Animal Frontiers, 6(1), 6–14.  

Meuwissen, T., Van Den Berg, I., & Goddard, M. (2021). On the use of whole-genome sequence 

data for across-breed genomic prediction and fine-scale mapping of QTL. Genet Sel Evol, 
53, 19.  

Michael, T. P., Jupe, F., Bemm, F., Motley, S. T., Sandoval, J. P., Lanz, C., Loudet, O., Weigel, 

D., & Ecker, J. R. (2018). High contiguity Arabidopsis thaliana genome assembly with a 
single nanopore flow cell. Nature Communications, 9(1), 1–8.  

Misztal, I., Tsuruta, S., Aguilar, I., Legarra, A., VanRaden, P. M., & Lawlor, T. J. (2013). 

Methods to approximate reliabilities in single-step genomic evaluation. Journal of Dairy 
Science, 96(1), 647–654.  

Mizoguchi, Y., Watanabe, T., Fujinaka, K., Iwamoto, E., & Sugimoto, Y. (2006). Mapping of 
quantitative trait loci for carcass traits in a Japanese Black (Wagyu) cattle population. 
Animal Genetics, 37(1), 51–54.  

Mizoshita, K., Watanabe, T., Hayashi, H., Kubota, C., Yamakuchi, H., Todoroki, J., & 

Sugimoto, Y. (2004). Quantitative trait loci analysis for growth and carcass traits in a half-sib 
family of purebred Japanese Black (Wagyu) cattle. Journal of Animal Science, 82(12), 
3415–3420.  

Moghaddar, N., Swan, A. A., & Van Der Werf, J. H. J. (2014). Comparing genomic prediction 
accuracy from purebred, crossbred and combined purebred and crossbred reference 
populations in sheep. Genetics Selection Evolution, 46(1).  

Moser, G., Tier, B., Crump, R., Khatkar, M., & Raadsma, H. (2009). A comparison of five 

methods to predict genomic breeding values of dairy bulls from genome-wide SNP markers. 
Genetics Selection Evolution, 41(1), 1–16.  

Motoyama, M., Sasaki, K., & Watanabe, A. (2016). Wagyu and the factors contributing to its 

beef quality: A Japanese industry overview. Meat Science, 120, 10–18.  

Mukai, F., Tsuji, S., Fukazawa, K., Ohtagaki, S., & Nambu, Y. (1989). History and population 
structure of a closed strain of Japanese Black Cattle. Journal of Animal Breeding and 
Genetics, 106(1–6), 254–264.  

Namikawa, K. (1980). Breeding History Of Japanese Beef Cattle And Preservation Of Genetic 

Resources As Economic Farm Animals. 

Nomura, T., Honda, T., & Mukai, F. (2001). Inbreeding and effective population size of 

Japanese Black cattle. Journal of Animal Science, 79(2), 366–370.  

Oikawa, T. (2018). Improvement of indigenous cattle to modern Japanese Black (Wagyu) cattle. 

IOP Conference Series: Earth and Environmental Science, 119(1).  

Onteru, S. K., Fan, B., Du, Z.-Q., Garrick, D. J., Stalder, K. J., & Rothschild, M. F. (2012). A 

whole-genome association study for pig reproductive traits. Animal Genetics, 43(1), 18–26.  

Oxford Nanopore Technologies. (n.d.-a). SmidgION. 
https://nanoporetech.com/products/smidgion 

Oxford Nanopore Technologies. (n.d.-b). VolTRAX. Retrieved February 6, 2021, from 

https://nanoporetech.com/products/voltrax 

Patterson, N., Price, A. L., & Reich, D. (2006). Population Structure and Eigenanalysis. PLoS 

Genetics, 2(12), 2074–2093.  

Phillips, P. C. (2008). Epistasis--the essential role of gene interactions in the structure and 

evolution of genetic systems. Nature Reviews. Genetics, 9(11), 855–867.  

Pocrnic, I., Lourenco, D. A. L., Masuda, Y., & Misztal, I. (2019). Accuracy of genomic BLUP 
when considering a genomic relationship matrix based on the number of the largest 
eigenvalues: A simulation study. Genetics Selection Evolution, 51(1), 1–10.  

PromethION | Oxford Nanopore Technologies. (n.d.). Retrieved April 7, 2021, from 

https://nanoporetech.com/products/promethion 

Quaas, R. L., & Pollak, E. J. (1980). Mixed Model Methodology For Farm And Ranch Beef 

Cattle Testing Programs. Journal of Animal Science, 51(6).  

Rouse, G. H., Ruble, M., Greiner, S., Tait, R. G., Hays, C. L., & Wilson, D. E. (2000). Growth 
and Development of Angus-Wagyu Crossbred Steers. Iowa State University Animal 
Industry Report, 1(1).  

Sanger, F., Brownlee, G. G., & Barrell, B. G. (1965). A two-dimensional fractionation procedure 

for radioactive nucleotides. Journal of Molecular Biology, 13(2).  

Sasazaki, S., Odahara, S., Hiura, C., Mukai, F., & Mannen, H. (2006). Mitochondrial DNA 

Variation and Genetic Relationships in Japanese and Korean Cattle. Asian-Australasian 
Journal of Animal Sciences, 19(10), 1394–1398.  

Schaller, R. R. (1997). Moore’s law: past, present, and future. IEEE Spectrum, 34(6), 52–55, 57.  

Schroeder, T. C., & Tonsor, G. T. (2012). International cattle ID and traceability: Competitive 

implications for the US. Food Policy, 37(1), 31–40.  

Scraggs, E., Zanella, R., Wojtowicz, A., Taylor, J. F., Gaskins, C. T., Reeves, J. J., de Avila, J. 

M., & Neibergs, H. L. (2014). Estimation of inbreeding and effective population size of 
full-blood wagyu cattle registered with the American Wagyu Cattle Association. Journal of 
Animal Breeding and Genetics, 131(1), 3–10.  

Seo, D., Lee, D. H., Jin, S., Won, J. Il, Lim, D., Park, M., Kim, T. H., Lee, H. K., Kim, S., Choi, 
I., Lee, J. H., Gondro, C., & Lee, S. H. (2022). Long-term artificial selection of Hanwoo 
(Korean) cattle left genetic signatures for the breeding traits and has altered the genomic 
structure. Scientific Reports, 12(1), 1–15.  

Shahidi, F., & Ambigaipalan, P. (2018). Omega-3 Polyunsaturated Fatty Acids and Their Health 

Benefits. Annual Review of Food Science and Technology, 9(1), 345–381.  

Smith, S. B., Zembayashi, M., Lunt, D. K., Sanders, J. O., & Gilbert, C. D. (2001). Carcass traits 
and microsatellite distributions in offspring of sires from three geographical regions of 
Japan. Journal of Animal Science, 79(12), 3041–3051.  

Souza-Monteiro, D. M., & Caswell, J. A. (2004). The Economics of Implementing Traceability in 

Beef Supply Chains: Trends in Major Producing and Trading Countries. 
http://www.umass.edu/resec/workingpapers 

Spink, J., & Moyer, D. C. (2011). Defining the Public Health Threat of Food Fraud. Journal of 

Food Science, 76(9), R157–R163.  

Strandén, I., & Garrick, D. J. (2009). Technical note: Derivation of equivalent computing 
algorithms for genomic predictions and reliabilities of animal merit. Journal of Dairy 
Science, 92(6), 2971–2975.  

Takasuga, A., Watanabe, T., Mizoguchi, Y., Hirano, T., Ihara, N., Takano, A., Yokouchi, K., 
Fujikawa, A., Chiba, K., Kobayashi, N., Tatsuda, K., Oe, T., Furukawa-Kuroiwa, M., 
Nishimura-Abe, A., Fujita, T., Inoue, K., Mizoshita, K., Ogino, A., & Sugimoto, Y. (2007). 
Identification of bovine QTL for growth and carcass traits in Japanese Black cattle by 
replication and identical-by-descent mapping. Mammalian Genome, 18(2), 125–136.  

Tanabe, T. Y., Hawk, H. W., & Hasler, J. F. (1985). Comparative fertility of normal and repeat-

breeding cows as embryo recipients. Theriogenology, 23(4), 687–696.  

Tyler, A. D., Mataseje, L., Urfano, C. J., Schmidt, L., Antonation, K. S., Mulvey, M. R., & 

Corbett, C. R. (2018). Evaluation of Oxford Nanopore’s MinION Sequencing Device for 
Microbial Whole Genome Sequencing Applications. Scientific Reports, 8(1), 10931.  

Uemoto, Y., Suzuki, K., Yasuda, J., Roh, S., & Satoh, M. (2021). Evaluation of inbreeding and 

genetic diversity in Japanese Shorthorn cattle by pedigree analysis. Animal Science Journal, 
92(1), e13643.  

USDA Livestock Poultry & Grain Market News. (2023). National Weekly Cattle and Beef 

Summary. 

Van Grevenhof, E. M., Vandenplas, J., & Calus, M. P. L. (2019). Genomic prediction for 

crossbred performance using metafounders. Journal of Animal Science, 97(2), 548–558.  

VanRaden, P. M. (2008a). Efficient Methods to Compute Genomic Predictions. Journal of Dairy 

Science, 91(11), 4414–4423.  

VanRaden, P. M. (2008b). Efficient Methods to Compute Genomic Predictions. Journal of Dairy 

Science, 91(11), 4414–4423.  

Vogeley, C., Nguyen, T., Woeste, S., Krutmann, J., Haarmann-Stemmann, T., & Rossi, A. 

(2021). Duesselpore™: a full-stack local web server for rapid and simple analysis of Oxford 
Nanopore Sequencing data. BioRxiv, 2021.11.15.468670.  

Wang, H., Misztal, I., Aguilar, I., Legarra, A., & Muir, W. M. (2012). Genome-wide association 
mapping including phenotypes from relatives without genotypes. Genetics Research, 94(2), 
73–83.  

Wang, M., Schneider, L. G., Hubbard, K. J., & Smith, D. R. (2018). Cost of bovine respiratory 
disease in preweaned calves on US beef cow-calf operations (2011-2015). Journal of the 
American Veterinary Medical Association, 253(5), 624–631.  

Wang, Z., Ma, H., Xu, L., Zhu, B., Liu, Y., Bordbar, F., Chen, Y., Zhang, L., Gao, X., Gao, H., 
Zhang, S., Xu, L., & Li, J. (2019). Genome-wide scan identifies selection signatures in 
Chinese Wagyu cattle using a high-density SNP array. Animals, 9(6).  

Wick, R. R., Judd, L. M., & Holt, K. E. (2019). Performance of neural network basecalling tools 

for Oxford Nanopore sequencing. Genome Biology, 20(1), 1–10.  

Wolc, A., Kranis, A., Arango, J., Settar, P., Fulton, J. E., O’Sullivan, N. P., Avendano, A., 

Watson, K. A., Hickey, J. M., de los Campos, G., Fernando, R. L., Garrick, D. J., & 
Dekkers, J. C. M. (2016). Implementation of genomic selection in the poultry industry. 
Animal Frontiers, 6(1), 23–31.  

Wu, Y., Zheng, Z., Visscher, P. M., & Yang, J. (2017). Quantifying the mapping precision of 

genome-wide association studies using whole-genome sequencing data. Genome Biology, 
18(1), 1–10.  

Yun, L., Willer, C., Sanna, S., & Abecasis, G. (2009). Genotype Imputation. 

Annual Review of Genomics and Human Genetics, 10, 387–406. https://doi.org/10.1146/annurev.genom.9.081307.164242 

Zhang, R., Miao, J., Song, Y., Zhang, W., Xu, L., Chen, Y., Zhang, L., Gao, H., Zhu, B., Li, J., 
& Gao, X. (2019). Genome-wide association study identifies the PLAG1-OXR1 region on 
BTA14 for carcass meat yield in cattle. Physiological Genomics, 51(5), 137–144.  

CHAPTER 3: Investigating New Technologies for On-Site Real-Time Sequencing for any Animal Scientist 

Hanna Ostrovski, Rodrigo P Savegnago, Wen Huang and Cedric Gondro 

Abstract 

Animal breeding has been significantly impacted by genomic sequencing technologies, which 

have traditionally been accessible only to lab professionals because they require 

specialized laboratories and trained personnel. However, the latest generation of sequencing 

instruments includes small, portable, real-time devices designed so that inexperienced 

users can obtain genomic data. These new sequencing technologies can open a wealth of 

opportunities for livestock production systems by bringing testing directly to the 

sampling site. This study is an initial exploration of the use of a mobile sequencing device 

to obtain genomic information by a novice with no previous molecular experience. Sequencing 

was done with the MinION from Oxford Nanopore Technologies (ONT), a small portable 

sequencing device that can be taken out of the lab and has a protocol intended for 

researchers of any level. Whole-genome sequence of one animal was obtained from multiple flow 

cell runs, with each run producing more data than the last, which points to an improvement in 

the user's laboratory skills. The maximum output achieved in a single run was 5GB, far 

below the 50GB per run that ONT states is possible. The bioinformatic pipeline combined 

all flow cell outputs into one aligned sequence with a high breadth of coverage, above 97%, 

but a low depth of coverage at ~8x. This is due to the nature of Nanopore long-read 

sequencing, as the protocol does not require amplification of the DNA and reads each strand 

directly through the flow cell nanopores. The results of this initial exploration into the 

ease of use of ONT technologies are the first steps toward a roadmap for practical adoption 

of on-site sequencing applications in agricultural production systems, which can 

improve traceability and livestock production efficiency.  
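
The relationship between run output, depth of coverage and breadth of coverage can be made concrete with a short calculation. The following is a minimal Python sketch and not the pipeline used in this study: the genome size of roughly 2.7 Gb for cattle, the function names and the toy input values are all assumptions introduced for illustration, and in practice the per-position depths would come from an alignment tool.

```python
from typing import Sequence, Tuple

# Assumption: approximate length of the bovine reference genome, in bases.
BOVINE_GENOME_BP = 2_700_000_000


def expected_mean_depth(run_yields_bp: Sequence[int],
                        genome_bp: int = BOVINE_GENOME_BP) -> float:
    """Upper-bound mean depth if every sequenced base aligned to the genome."""
    if genome_bp <= 0:
        raise ValueError("genome_bp must be positive")
    return sum(run_yields_bp) / genome_bp


def breadth_and_depth(per_base_depth: Sequence[int],
                      min_depth: int = 1) -> Tuple[float, float]:
    """Return (breadth, mean depth) for a list of per-position read depths.

    Breadth is the fraction of positions covered by at least `min_depth` reads.
    """
    n = len(per_base_depth)
    if n == 0:
        raise ValueError("per_base_depth is empty")
    covered = sum(1 for d in per_base_depth if d >= min_depth)
    return covered / n, sum(per_base_depth) / n


if __name__ == "__main__":
    # Hypothetical toy values, not results from this study.
    print(expected_mean_depth([2_700_000_000, 5_400_000_000]))
    print(breadth_and_depth([0, 2, 4, 10]))
```

Summing the yields of several flow cells before dividing by genome size mirrors the idea of combining all runs into one alignment; the true depth will be lower than this bound whenever reads fail quality filters or do not align.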

 Introduction 

Advances in genomics have made it easy to obtain sequence information from any animal 

at a reasonable price. The most recent generation of sequencers, the third generation, is 

defined by real-time sequencing of single molecules without PCR amplification 

(Dijk et al., 2018). Sequencing has become a widespread practice in most biology 

disciplines and has defined the past two decades of animal breeding and genetics. With 

sequence information, identifying important quantitative trait loci (QTL), connecting 

genomic variation to phenotypic variation, and genomic prediction have become 

commonplace in research and industry animal applications.  

One of the setbacks of obtaining any type of sequence information is the reliance on an 

external lab. The current generation of sequencers is the most user-friendly, with Oxford 

Nanopore Technologies (ONT) offering a small sequencer and a straightforward protocol that 

can be used even by untrained personnel. This sequencer, the MinION from ONT (Jain et al., 

2016; Lu et al., 2016), is the focus of this study, which aims to understand the ins and 

outs of the device and to lay the groundwork for future work with this sequencing 

platform.  

Previous studies have started to explore the sequencing possibilities of this device, as it 

has an easy protocol and setup and relies on common lab practices (Lamb et al., 2020). The 

MinION is well suited to in-field sequencing, as it is light and small enough to fit in a 

hand. Field work with this device has already been done in disease diagnostics, e.g., to 

characterize outbreaks of Ebola (Quick et al., 2016), Zika (Faria et al., 2016) and, most 

recently, COVID-19 (Bull et al., 2020). These studies successfully outlined protocols for 

mobile sequencing with the MinION, identifying samples for diagnosis and tracing the 

disease to specific strains or regions of origin. They show that rapid, mobile analysis of 

the genome is possible, and the approach should be applied to the field of animal 

genetics and genomics.  

Establishing usable sequencing protocols for users of genomic information with no training 

in molecular biology can rapidly increase the use of this technology in the animal industry, 

and with it the use of genomic data. Recent studies have outlined ways to use 

this technology for genomic prediction (Lamb et al., 2021), which is the backbone of many 

animal production groups. Understanding the full capabilities of the MinION is crucial to 

bringing genomic sequencing to the forefront of diagnosis, rapid testing, and traceability in the 

animal industry. 

Materials and Methods 

Development of Protocol  

The focus of this study was to establish a protocol for obtaining the whole genome sequence 

of an Akaushi bull with the MinION from ONT (Fig. 3.1). Most importantly, this study aimed to 

determine whether genomic information could be obtained with this device by someone with no 

training in a traditional lab setting. The methods for preparing a DNA library and 

sequencing with the MinION are the focus, as this is a recent technology and the lab 

protocol still requires some troubleshooting. This protocol should serve as a blueprint for 

other newcomers to sequencing and should outline the pros and cons of the device 

in this context. 

Figure 3.1 ONT MinION size relative to an average human hand. 

Illumina Sequencing  

The most laborious part of obtaining Illumina data was sending the blood sample to a 

service provider. This is a hands-off way to get high-coverage, accurate data, and it suits 

cases where the goal is simply to obtain the sequence rather than to change how sequencing 

is done. The two sequencing platforms take different approaches: ONT's MinION is a 

low-throughput portable device suited to a hands-on approach that does not require 

sending samples to external labs, while Illumina technologies are high-throughput devices 

that require a dedicated lab setup.  

MinION Initial Costs and Materials 

The MinION was purchased along with flow cells and the third-party reagents required for 

sequencing (NEBNext® Companion Module for Oxford Nanopore Technologies® Ligation 

Sequencing). All other lab materials needed for the ligation protocol, specifically mixers, 

pipettes, thermocyclers, computing systems, common reagents and AMPure beads, were 

already available. 

The MinION device along with 12 flow cells cost $4,500, and the required third-party 

materials cost upwards of $1,000 for 24 reactions. Additional flow cells cost $900 each, 

but ONT provides cheaper package options when more than one flow cell is purchased at 

a time.  
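
The prices quoted above can be combined into a rough budget estimate. The sketch below is illustrative only: it uses the figures from the text as defaults, treats one reaction as one library, ignores package discounts and general lab consumables, and its function and parameter names are assumptions rather than anything defined by ONT.

```python
import math


def sequencing_cost(n_flow_cells: int,
                    n_libraries: int,
                    starter_cost: float = 4500.0,        # MinION plus 12 flow cells
                    starter_flow_cells: int = 12,
                    extra_flow_cell_cost: float = 900.0,  # list price per extra flow cell
                    reagent_kit_cost: float = 1000.0,     # third-party reagents (lower bound)
                    reactions_per_kit: int = 24) -> float:
    """Estimate the minimum spend for a number of flow cells and libraries."""
    if n_flow_cells < 0 or n_libraries < 0:
        raise ValueError("counts must be non-negative")
    # Flow cells beyond those bundled with the device are bought individually.
    extra_cells = max(0, n_flow_cells - starter_flow_cells)
    # Reagent kits are bought whole, so round up to the next full kit.
    kits = math.ceil(n_libraries / reactions_per_kit)
    return (starter_cost
            + extra_cells * extra_flow_cell_cost
            + kits * reagent_kit_cost)


if __name__ == "__main__":
    # Hypothetical scenario: seven libraries run on seven flow cells.
    print(sequencing_cost(n_flow_cells=7, n_libraries=7))
```

Because the reagent price is stated as "upwards of" $1,000, the result should be read as a lower bound on the real outlay.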

This system is quite time sensitive, as the flow cells have a shelf life of only 3 months. 

This window can be tight for some experiments and means the user has to purchase flow 

cells as they are needed. Buying a large number of flow cells at the beginning of an 

experiment runs the risk of some “going bad” before use, which can be a huge blow 

to the cost of an experiment. In this experiment, all flow cells were used within 3-5 months of 

receiving the materials from ONT. No issues were found with the integrity of flow cells 

used past their use-by date, but differences in output could have been due to differences 

in flow cell construction by ONT.  

Library Preparation Protocol 

All experiments run on the MinION used the LSK-109 protocol, which is the ligation 

protocol provided by ONT. Other experimental needs can be met with the wide selection of 

protocols and reagent kits that ONT offers with the purchase of a flow cell. The ligation 

protocol includes a control experiment, as do all protocols from ONT. The control 

experiment lets the user practice creating a library and loading it into the flow 

cell without wasting DNA from the experimental source of interest. It is useful for 

catching initial hiccups in the protocol, but it can also consume a large amount of 

reagents, time, and flow cells. Moreover, the issues that arise in the actual experiment 

will not be the same as those that arise in the control experiment. These issues could 

range from low-quality DNA to too little or too much extracted DNA used in library prep. 

The resulting problems may be irreversible and can cost the experimenter time and 

money. 

The subject of interest for obtaining whole-genome sequence (WGS) was a 6-month-old 

Akaushi (Red/Brown Wagyu) bull. Whole blood was collected from this calf and DNA was 

extracted using the QIAamp blood kit (QIAamp DNA Blood Kits, n.d.). The LSK-109 protocol 

is laid out as an easy-to-use process with three steps: 1) DNA repair and end-prep, 

2) adapter ligation and clean-up and 3) priming and loading the SpotON flow cell. The first 

step was done with the reagents included in the NEBNext Companion Module for ONT 

ligation sequencing, specifically the DNA repair buffer and mix and the end-prep reaction 

buffer and enzyme mix. This step was the most difficult, as most of these reagents are 

added in small volumes (1-3 µl) that are hard to pipette precisely without experience. The 

most important part of this step is the DNA sample itself, of which the protocol calls for 

only 1 µl if the sample is highly concentrated (100-200 ng). The DNA used for this study 

was measured in the range of 130-140 fmol with the Qubit fluorometer. Each run consisted of 

creating a library, loading it onto a flow cell and running the flow cell for about 72 hours. 

After each run was finished, the flow cell was washed to remove the library and returned 

to refrigeration. 

The first runs were all done with 1 µl of DNA, but due to low sequencing output, the amount of DNA was increased to 2 µl for the last 2 runs. Increasing the amount of extracted DNA had a positive effect on the amount of DNA sequenced, as the final runs produced the most sequence, although this could also reflect the growing experience of the user. Increasing the amount of DNA carries a risk, however: it can increase sequence output because more DNA is available in the library, or it can clog the nanopores in the flow cell array (Kubota et al., 2019) with an overabundance of DNA and result in low output.

Library preparation is designed to take around 60 minutes, but without proper training it can take up to 2 hours. The kit provided by ONT contains most of the reagents needed after the first step of DNA end-repair, except for the AMPure beads, which are magnetic beads used to clean the sample throughout the protocol. Most steps in the protocol are easy to follow, but some are time sensitive, and many require a steady hand. These are all hurdles that a beginner will overcome with practice, but slight missteps can mean having to restart the preparation. Some troubleshooting steps can be found in Table 3.1.

Troubleshooting Table

Step | Problem | Possible Reason | Possible Solution
Ethanol wash | Low amount of sequenced DNA | Evidence of ethanol still in sample | Use higher ethanol % in wash (75-80%) to dry quicker
Loading flow cell | Bubble formation | Incorrect pipetting, introduction of air in nanopores | More stable/advanced pipetting practice
Pipetting DNA into sample | Low amount of sequenced DNA | Not enough DNA used in library | Increased amount of DNA used (2 µl)
Thermal cycler preparation | Low amount of sequenced DNA | Not enough time in thermal cycler; not hot enough in thermal cycler | Increased time at higher temperature in first end-prep step, which includes use of thermal cycler
Use of AMPure beads | Low amount of sequenced DNA | Need longer amount of time binding and un-binding; need more beads for clean-up when increased DNA sample | Incubated for 15 min at 37 °C to allow for binding (or unbinding); increased beads introduced by 1.5x when using larger initial DNA sample

Table 3.1 Problems that arose in sequencing analysis with the ONT MinION and solutions that 
were explored to troubleshoot 

There are two stopping points within the library preparation: one after the DNA repair and end-prep step and one after the last preparation step before flow cell loading. At each stopping point, the protocol states that the library can be stored and should be used within the next 12 to 24 hours for best results. Neither stopping point was used in this study, but they could be vital to other experiments.

After library preparation, the DNA library is loaded into the flow cell within the MinION through the SpotON port on the flow cell. Loading the SpotON port is the step most likely to cause damage if done incorrectly. The library must be loaded along with a priming solution without introducing air bubbles, which can permanently damage the sequencing array. This array contains the “nanopores”, set in a lipid bilayer, which do all the genome sequencing.

Basecalling 

Sequencing is done by the nanopores, which record the change in electrical signal from the baseline as each base passes through the pore. These raw signals output by the MinION are nicknamed “squiggles” (Rang et al., 2018). A basecaller translates the changes in electrical current into bases; the most common one, “guppy”, comes from Nanopore itself. The guppy software is fast and efficient but is not free for anyone to use, as it is only available to Nanopore customers. Other 3rd party software is available for download and has been shown to produce high-quality, accurate data comparable to the output from guppy. Basecalling converts raw squiggles into fastq format, which is then used in the alignment procedure.

Bioinformatics 

Post-basecalling procedures process these reads into aligned sequences and depend on bioinformatic programs run in a series of steps. These steps include initial quality control, adapter trimming, alignment of the sequence, sorting of all files, conversion of files into bam format, variant calling, and finally calculation of alignment statistics such as depth of coverage, percent of genome covered, and total number of variants called. These steps were carried out with established bioinformatic software, including porechop (Wick et al., 2019), samtools and bcftools (Li et al., 2009; Li & Barrett, 2011), longshot (Edge & Bansal, 2019) and minimap2 (Li, 2018). Many other programs exist that perform similar analyses. Here, we outline a proposed pipeline for analysis of nanopore reads with common bioinformatic tools.

The first program used for these reads is the R-script based program MinIONQC 

(Lanfear et al., 2019), which outputs plots of diagnostics of each run on a flow cell. Such 

diagnostics include the q-score of the reads, the length of all the reads, how many reads were 

produced, the number of reads produced over time, and output diagnostics per nanopore in each flow cell. These outputs are all important for understanding each run on the MinION itself. Although many tools focus on the output fast5 files, it is still crucial to compare runs against one another in order to improve future runs on the MinION.
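The read-level diagnostics described above (number of reads, read length and q-score) can be illustrated with a minimal Python sketch. This is not MinIONQC itself; the function names and the two example reads are hypothetical, the input is assumed to be a FASTQ string with exactly four lines per read and Phred+33 quality encoding, and the mean q-score is computed by averaging error probabilities, which is one common convention and an assumption here.

```python
import math

def phred_scores(quality_string):
    """Decode a FASTQ quality string (Phred+33 encoding) into integer scores."""
    return [ord(ch) - 33 for ch in quality_string]

def mean_qscore(quality_string):
    """Mean q-score of one read, averaged on the error-probability scale."""
    probs = [10 ** (-q / 10) for q in phred_scores(quality_string)]
    return -10 * math.log10(sum(probs) / len(probs))

def summarize_reads(fastq_text):
    """Summarize a FASTQ string that uses exactly four lines per read."""
    lines = fastq_text.strip().splitlines()
    records = [lines[i:i + 4] for i in range(0, len(lines), 4)]
    lengths = [len(rec[1]) for rec in records]  # line 2 of each record is the sequence
    return {
        "n_reads": len(records),
        "mean_length": sum(lengths) / len(lengths),
        "max_length": max(lengths),
        "mean_qscores": [mean_qscore(rec[3]) for rec in records],  # line 4 is quality
    }

# Two made-up reads, for illustration only (not data from this study).
EXAMPLE_FASTQ = "@read1\nACGTACGT\n+\nIIIIIIII\n@read2\nACGT\n+\nII!!\n"
summary = summarize_reads(EXAMPLE_FASTQ)
print(summary)
```

Comparing summaries like this across flow cells is one simple way to see whether a change in the protocol improved read length or quality from one run to the next.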

The bioinformatic pipeline continues with adapter trimming by porechop, which removes adapters from the reads. Alignment was done with minimap2 against the cattle reference assembly. Sorting the reads, indexing the reference genome and reads (creating a lookup that lets tools access the genome and alignments more efficiently), and merging all reads together were done with several samtools subcommands. Variant calling was done with Longshot, a variant caller designed for long-read sequences. Alignment statistics were produced with bcftools subcommands that operate on the final vcf file.
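The order of the steps described above can be sketched as a list of command lines built in Python. This is a minimal sketch under stated assumptions, not the exact commands used in this study: the file names and the helper `build_pipeline_commands` are hypothetical, and only basic, widely documented options of porechop, minimap2, samtools and longshot are used. The function only builds the commands; it does not run them.

```python
def build_pipeline_commands(run_fastqs, reference_fasta, out_prefix):
    """Build per-run trim/align/sort commands, then merge, index and call variants."""
    commands = []
    sorted_bams = []
    for fastq in run_fastqs:
        stem = fastq.rsplit(".", 1)[0]
        trimmed = f"{stem}.trimmed.fastq"
        sam = f"{stem}.sam"
        bam = f"{stem}.sorted.bam"
        # Adapter trimming of the basecalled reads
        commands.append(["porechop", "-i", fastq, "-o", trimmed])
        # Alignment to the reference with the Nanopore preset, SAM output
        commands.append(["minimap2", "-a", "-x", "map-ont", "-o", sam,
                         reference_fasta, trimmed])
        # Coordinate-sort and convert to bam
        commands.append(["samtools", "sort", "-o", bam, sam])
        sorted_bams.append(bam)
    merged = f"{out_prefix}.merged.bam"
    # Combine all runs into one alignment file and index it
    commands.append(["samtools", "merge", merged] + sorted_bams)
    commands.append(["samtools", "index", merged])
    # Index the reference, which the variant caller needs
    commands.append(["samtools", "faidx", reference_fasta])
    # Long-read variant calling on the merged alignments
    commands.append(["longshot", "--bam", merged, "--ref", reference_fasta,
                     "--out", f"{out_prefix}.vcf"])
    return commands

# Hypothetical file names, for illustration only.
commands = build_pipeline_commands(["run1.fastq", "run2.fastq"], "cattle_ref.fa", "akaushi")
for cmd in commands:
    print(" ".join(cmd))
```

Each command could then be executed with `subprocess.run(cmd, check=True)` on a machine where the tools are installed. Building the list first makes it easy to inspect the whole workflow before committing days of compute time.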

This pipeline aligns sequence from multiple MinION runs to a reference genome and combines all runs to obtain one whole genome sequence for the animal of interest. The pipeline is similar to many bioinformatic pipelines proposed for other sequence information (Zhou et al., 2019) but is specifically tailored to reads from the ONT MinION. In this study the technology was used to sequence the whole genome, which may not be the best use of this sequencer. The MinION is best suited to obtaining a small amount of genomic information very quickly while also being small enough to be mobile.

The bioinformatic pipeline to align and call variants for the Illumina output was run using IVDP (https://github.com/rodrigopsav/IVDP). The programs in this pipeline are all commonly used for Illumina outputs, as Illumina genotypes are the most commonly used within the genomic space.

Results 

Application of Protocol 

The final whole genome sequence obtained from this method had 5.2x coverage depth and 2,329,110 variants called. These variants were called by longshot from only one sample, which may explain the low number of variants called. Of these variants, there were 856,924 homozygous (1/1) calls and 926,780 heterozygous (0/1) calls. Output from each run can be found in Table 3.2.

Run Number | Total Gb | Total Reads | Mean Length (bp) | Max Length (bp) | Ultra-long reads
1 | 0.13 | 12,000 | 11,011 | 66,867 | 0
2 | 1.08 | 136,000 | 7,993 | 117,208 | 4
3 | 2.61 | 262,516 | 9,945 | 126,941 | 6
4 | 1.79 | 244,000 | 7,340 | 107,872 | 4
5 | 3.36 | 372,000 | 9,035 | 112,098 | 10
6 | 0.93 | 112,000 | 8,334 | 99,962 | 0
7 | 5.53 | 644,000 | 8,597 | 129,442 | 11

Table 3.2 Output from each run at 72 hours from the MinION. 
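As a rough check on how the per-run output in Table 3.2 relates to genome-wide coverage, the expected depth can be approximated as total sequenced bases divided by genome size. The sketch below uses the Total Gb values from Table 3.2; the genome size of 2.7 Gb is an assumed approximate size for the cattle genome and is not a value reported in this study.

```python
# Total Gb per run, taken from Table 3.2 (runs 1-7).
RUN_OUTPUT_GB = [0.13, 1.08, 2.61, 1.79, 3.36, 0.93, 5.53]

# Assumption: approximate cattle genome size in gigabases.
ASSUMED_GENOME_SIZE_GB = 2.7

def expected_depth(total_gb, genome_size_gb):
    """Approximate mean depth if every sequenced base mapped to the genome."""
    return total_gb / genome_size_gb

total_gb = sum(RUN_OUTPUT_GB)
depth = expected_depth(total_gb, ASSUMED_GENOME_SIZE_GB)
print(f"Total output: {total_gb:.2f} Gb, expected depth: {depth:.1f}x")
```

Under this assumption the seven runs together would give an upper bound of a little under 6x, since adapter trimming and unmapped reads can only reduce the depth that is actually achieved.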

The first runs had the worst outcomes because of the learning curve of the protocol techniques, which required some troubleshooting. By the final run, the protocol was working more efficiently, and the researcher had more practice with the laboratory techniques. The combined coverage is still low for producing a full genome sequence, for which a more acceptable coverage is in the range of 10-30x. Low-pass imputation techniques may be able to bypass this issue by filling in the areas of lower coverage using reference genomes (Snelling et al., 2020).

The sequencing protocol that emerged from the runs in this experiment differed from the original protocol taken from the Nanopore website. The general path taken to acquire WGS can be found in Figure 3.2. This divergence was expected, as not all experiments are alike, and the author had little to no experience at the start. Most of the protocol changes came from the Nanopore community page, which filled in the gaps where the protocol was lacking. The protocol outlined in the Methods section reflects many of the suggestions shared online, as well as trial and error from the runs done.

Figure 3.2 Workflow for sequencing with an ONT MinION in this study. 

The bioinformatic pipeline produced for alignment of these reads was the easiest part of the workflow to troubleshoot, as data can be manipulated much more easily than a physical DNA sample. This pipeline, like most, consists of 3rd party software that is free to use, with the exception of the guppy basecalling software. This could cause problems for future users who use ONT products but do not have access to download guppy. Aligning all of the sequence data takes a couple of days, as some of these programs perform complex operations on very large files. The user must also be aware of memory and storage requirements, as these files are large, and alignment time scales with the size of the genome being used. The longest step is basecalling, which, without GPU capabilities, can take many days for a single run that produces over 4 GB. The success of this process and the time it takes depend on the experience of the user, as this study aimed to use this technology through the eyes of a beginner.

Comparison of Sequencing Platforms  

Two methods of genomic sequencing were considered for this study: the Illumina HiSeq and Oxford Nanopore’s MinION with flow cells. These technologies come at different price points: the MinION cost around $5,000 for the device, 3rd party reagents and flow cells needed to run experiments, whereas the HiSeq cost around $2,500 for WGS of one animal. The difference in price is large when considering the one-time “start-up” costs of the MinION, but future runs will cost the user only the price of the consumables needed.

The Illumina HiSeq produced roughly 40x coverage with 98% of the genome covered, compared with the Nanopore WGS, which had a much lower coverage depth but still covered 97% of the genome (Table 3.3).

Coverage Depth (bam files)

Platform | Coverage Depth | Breadth | Depth of Covered Positions
Illumina HiSeq | 39.16x | 98.62% | 39.71x
Nanopore Flow Cells | 8.21x | 97.18% | 8.44x

Table 3.3. Coverage depth and breadth of the sequencing platforms used for WGS on one 
Akaushi bull. 
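The three columns of Table 3.3 appear to be linked: dividing the genome-wide coverage depth by the breadth (as a fraction) gives approximately the depth of covered positions. This relationship is an interpretation of the table rather than something stated in the text, and the function name below is hypothetical. A minimal sketch using the values from Table 3.3:

```python
def depth_of_covered_positions(mean_depth, breadth_percent):
    """Mean depth restricted to positions covered by at least one read."""
    return mean_depth / (breadth_percent / 100.0)

# Values from Table 3.3: (coverage depth, breadth in percent).
illumina = depth_of_covered_positions(39.16, 98.62)
nanopore = depth_of_covered_positions(8.21, 97.18)
print(f"Illumina: {illumina:.2f}x, Nanopore: {nanopore:.2f}x")
```

Both results agree with the reported values (39.71x and 8.44x) to within rounding, which suggests that the small gap between the two depth columns reflects only the 1-3% of positions with no coverage.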

The two methods yielded comparable breadth but differed substantially in coverage depth. This is partly because the Nanopore protocol does not require amplification of the DNA sample, which makes library preparation much quicker. DNA amplification does have its benefits, most clearly the increase in coverage depth seen in the Illumina output, which relies on amplification. Because amplification can take many hours, removing it from the protocol can bring sequence to the user much faster.

Conclusions 

The new frontier in animal sequencing is to get it out of the lab, speed up the results collected, and make it available to all types of researchers and producers. The MinION has the potential to do all these things, as the device is small and the protocol can be picked up and adjusted to any type of experiment by an experimenter of any skill level. The protocol provided by ONT for producing sequence could be more in-depth, as many steps were simply glossed over, and inexperienced lab users could easily be led astray. Producing good results from each experiment run on the MinION depends on the user tweaking the amount of reagent used, the time spent incubating, or even the time and speed of mixing. For a beginner, understanding ONT protocols for a specific experiment requires reliance on problem solving with the Nanopore community, which can make the sequencing process more labor intensive, with constant “debugging” of the MinION and library preparation. The bioinformatic pipeline produced for alignment of these

reads consists of 3rd party software that is free to use, except for the guppy basecalling software and the MinKNOW sequencing software. This could cause problems for future users who use ONT products but do not have access to download this software.

Future technologies from ONT could help remove human error and produce more accurate sequences. The VolTRAX (Oxford Nanopore Technologies, n.d.) is a recent launch from ONT that takes the human out of library preparation altogether, as the device performs the library preparation itself. Other library kits also exist, such as the Rapid kit, which can take a high-quality DNA sample from raw DNA to library in ~20 minutes. Depending on the objectives of the research, the technologies and chemistry can be “mixed-and-matched” to create a user-friendly protocol.

The MinION has a long list of pros and cons, but the novelty of this technology is important for use in the animal industry: mobile sequencing that can be used by a person of any skill level is unheard of in the sequencing space. This is the only sequencer on the market that can be taken out of the laboratory, which opens the door to on-farm sequencing at the right price. Practical applications include on-site diagnosis of disease, rapid genomic sequencing of animals for parentage testing, and even traceability of animal products.

LITERATURE CITED 

Bull, R. A., Adikari, T. N., Ferguson, J. M., Hammond, J. M., Stevanovski, I., Beukers, A. G., 
Naing, Z., Yeang, M., Verich, A., Gamaarachchi, H., Kim, K. W., Luciani, F., Stelzer-
Braid, S., Eden, J. S., Rawlinson, W. D., van Hal, S. J., & Deveson, I. W. (2020). 
Analytical validity of nanopore sequencing for rapid SARS-CoV-2 genome analysis. 
Nature Communications, 11(1), 1–8.

Dijk, E. van, Jaszczyszyn, Y., Naquin, D., & Thermes, C. (2018). The third revolution in 

sequencing technology. Trends in Genetics, 34(9), 666–681.  

Edge, P., & Bansal, V. (2019). Longshot enables accurate variant calling in diploid genomes 

from single-molecule long read sequencing. Nature Communications, 10(1),
1–10.

Faria, N. R., Sabino, E. C., Nunes, M. R. T., Alcantara, L. C. J., Loman, N. J., & Pybus, O. G. 

(2016). Mobile real-time surveillance of Zika virus in Brazil. Genome Medicine, 8(1), 97.  

Hayes, B. J., & Daetwyler, H. D. (2019). 1000 Bull Genomes Project to Map Simple and 

Complex Genetic Traits in Cattle: Applications and Outcomes. Annual Review of Animal 
Biosciences, 7, 89–102.  

Jain, M., Olsen, H. E., Paten, B., & Akeson, M. (2016). The Oxford Nanopore MinION: 

Delivery of nanopore sequencing to the genomics community. Genome Biology, 17(1).  

Kubota, T., Lloyd, K., Sakashita, N., Minato, S., Ishida, K., & Mitsui, T. (2019). Clog and 

Release, and Reverse Motions of DNA in a Nanopore. Polymers, 11(1).  

Lamb, H. J., Hayes, B. J., Nguyen, L. T., & Ross, E. M. (2020). The Future of Livestock 

Management: A Review of Real-Time Portable Sequencing Applied to Livestock. Genes, 
11(12), 1478.  

Lamb, H. J., Hayes, B. J., Randhawa, I. A. S., Nguyen, L. T., & Ross, E. M. (2021). Genomic 

prediction using low-coverage portable Nanopore sequencing. PLOS ONE, 16(12), 
e0261274.  

Lanfear, R., Schalamun, M., Kainer, D., Wang, W., & Schwessinger, B. (2019). MinIONQC: 

fast and simple quality control for MinION sequencing data. Bioinformatics, 35(3), 523–525.

Li, H. (2018). Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics, 34(18), 

3094–3100.  

Li, H., & Barrett, J. (2011). A statistical framework for SNP calling, mutation discovery, 

association mapping and population genetical parameter estimation from sequencing 
data. Bioinformatics, 27(21), 2987–2993.  

Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., 
Durbin, R., & Subgroup, 1000 Genome Project Data Processing. (2009). The Sequence 
Alignment/Map format and SAMtools. Bioinformatics, 25(16), 2078–2079.  

Lu, H., Giordano, F., & Ning, Z. (2016). Oxford Nanopore MinION Sequencing and Genome 

Assembly. In Genomics, Proteomics and Bioinformatics (Vol. 14, Issue 5).  

NEBNext® Companion Module for Oxford Nanopore Technologies® Ligation Sequencing | 
NEB. (n.d.). Retrieved January 9, 2022, from https://www.neb.com/products/e7180-
nebnext-companion-module-for-oxford-nanopore-technologies-ligation-sequencing 

Oxford Nanopore Technologies. (n.d.). VolTRAX. Retrieved February 6, 2021, from 

https://nanoporetech.com/products/voltrax 

QIAamp DNA Blood Kits. (n.d.). Retrieved January 9, 2022, from 

https://www.qiagen.com/us/products/discovery-and-translational-research/dna-rna-
purification/dna-purification/genomic-dna/qiaamp-dna-blood-kits/ 

Quick, J., Loman, N. J., Duraffour, S., Simpson, J. T., Severi, E., Cowley, L., Bore, J. A., 

Koundouno, R., Dudas, G., Mikhail, A., Ouédraogo, N., Afrough, B., Bah, A., Baum, J. 
H. J., Becker-Ziaja, B., Boettcher, J. P., Cabeza-Cabrerizo, M., Camino-Sánchez, Á., 
Carter, L. L., … Carroll, M. W. (2016). Real-time, portable genome sequencing for Ebola 
surveillance. Nature, 530(7589), 228–232.  

Rang, F. J., Kloosterman, W. P., & de Ridder, J. (2018). From squiggle to basepair: 

Computational approaches for improving nanopore sequencing read accuracy. In Genome 
Biology (Vol. 19, Issue 1, pp. 1–11). BioMed Central Ltd.  

Snelling, W. M., Hoff, J. L., Li, J. H., Kuehn, L. A., Keel, B. N., Lindholm-Perry, A. K., & 

Pickrell, J. K. (2020). Assessment of Imputation from Low-Pass Sequencing to Predict 
Merit of Beef Steers. Genes, 11(11), 1312.  

Wick, R. R., Judd, L. M., & Holt, K. E. (2019). Performance of neural network basecalling tools 

for Oxford Nanopore sequencing. Genome Biology, 20(1), 1–10.  

Zhou, A., Lin, T., & Xing, J. (2019). Evaluating nanopore sequencing data processing pipelines 

for structural variation identification. Genome Biology, 20(1), 237.  

CHAPTER 4: Mobile, Rapid Beef Product Identification through 3rd generation 

Sequencing Methods 

Hanna Ostrovski, Yasir Nawaz, Rodrigo P Savegnago, Wen Huang, and Cedric Gondro 

Abstract 

The highly sought-after Wagyu cattle breed, celebrated for its exceptional beef quality, originated in Japan, yet access to these animals has been restricted globally since the 1990s, with no live-animal Wagyu genetics becoming available since. After slow early adoption, Wagyu and their high-quality beef products have recently become more common in the US. This surge in the availability of Wagyu in the United States has coincided with a notable increase in consumer demand for high-marbling beef, for which Wagyu are prized. It has also brought concerns about product authenticity, as the label "Wagyu" has been liberally applied to products without stringent verification processes, raising doubts about their true origin. Quick verification of product is needed and can be achieved through genotype. Recent sequencing technologies have made rapid, on-site genomic traceability a reality. This study aims to create a protocol to quickly identify product breed composition through mobile sequencing with Oxford Nanopore’s MinION. The mobile protocol is aimed at the everyday user, with low cost and wide accessibility. Wagyu samples were genotyped with the MinION and the 100K Bovine SNP chip, then compared via principal component analysis (PCA) for initial identification. Because of low output from the MinION, PCA proved a poor identifier, and more precise breed classification was needed. This was accomplished through haplotype correlation and concordance rate against a large whole-genome sequence reference of many breeds. The MinION output identified these animals through haplotype matching with high correlation (0.55) and concordance rates (0.94). Product identification and certification of Wagyu breed claims through genotype was accomplished with this rapid sequencing kit. Further exploration of Nanopore products will pave the way to putting the power of high-quality beef product verification in the hands of the consumer.

Introduction 

Identifying products from producer to consumer has proven difficult in the modern era of high production and consumption of millions of pounds of animal goods. The term traceability has become commonplace in research, and many studies have proposed approaches to tackle this problem (Aung & Chang, 2014; Bosona & Gebresenbet, 2013; Schroeder & Tonsor, 2012). One of the most varied products in the United States is Wagyu beef. The labeling of Wagyu products is not heavily protected and encompasses all animals from 50% to 100% Wagyu breed composition.

Many tagging or electronic identification practices do not provide a permanent identifier: tags can fall off, be misread by scanners, or be installed incorrectly through human error. The most permanent identifier of animal-based products is DNA. Identifying an animal by sequencing, from the beginning of its life through to the final product, could be the answer to product identification without relying on physical or electronic tagging. Although every animal has its own genotype, it is not always easy to sequence and connect that sequence to a product. The process of DNA sequencing relies on molecular biologists, bioinformatic scientists, geneticists, animal scientists and even computer scientists. The demand for genomic information obtained out of the lab has grown, as technologies have opened doors for scientists, researchers, and students to obtain genotypic information by themselves for their applications (Gupta & Gupta, 2014).

Third-generation sequencing technologies have paved the way for mobile sequencing, with the MinION being introduced in 2014 (Jain et al., 2016). Technologies for quick and easy DNA extraction have also been introduced (QIAamp DNA Blood Kits). No technology company has established a mobile DNA sequencing kit, but other studies have introduced ways to sequence DNA out of the lab (Lamb et al., 2020). Nanopore’s MinION reduces the complexity of wet lab work through a protocol that requires less time and fewer lab materials than traditional sequencing methods. The outputs of these protocols are genomic sequences obtained through long-read technology. These long reads can identify structural variation that is not detectable with traditional short-read sequencers (Nguyen et al., 2023; Zhou et al., 2019).

Comparisons of multiple sequencing techniques using Nanopore technologies have been documented (Bowden et al., 2019; Stefan et al., 2022; Tyler et al., 2018), but the use of the “flongle” has yet to be explored. The flongle (R9.4.1) is among the most cost-effective technologies for sequencing with the MinION, with the cost of a single flongle starting at $90 (Oxford Nanopore Technology, n.d.). Previous studies with the MinION have described obtaining WGS of smaller genomes with ease and sufficient coverage (King et al., 2021; McNaughton et al., 2019; Taylor et al., 2019), most notably obtaining substantial and accurate sequence of pathogens and bacteria. To date, no studies have attempted sequencing for breed composition in an animal population. Rapid, mobile sequencing using flongle technology could pave the way for animal identification in an out-of-lab setting at a lower price.

This study aims to create an out-of-lab sequencing protocol that is accessible to the everyday scientist for sample identification. Initial exploration of the feasibility of obtaining sequence from the MinION flongle will provide the baseline for a low-cost mobile sequencing kit to identify breed composition, especially in Wagyu-labeled products. The “gold standard” of genotyping technologies in the animal sciences, the Illumina 100k Bovine chip, was used in this study as a baseline for breed identification. Animals were then placed in their respective breeds using PCA. Previous studies have described the use of PCA for breed identification (Destefanis et al., 2000) to establish a sample’s positive match to a breed. Rapid identification of sample breed composition, especially in the Wagyu breed, can give consumers peace of mind when purchasing these high-end beef products. The ease of use of these sequencing products can open the door to verification by any beef enthusiast.

Material and Methods 

Sample Collection 

Initial blood collection was done on 14 Wagyu animals and 15 Akaushi animals at the Michigan State UPREC research center in Chatham, MI. The blood was sent to Neogen in Lincoln, NE, for genotyping on the Illumina 100k Bovine chip and was also used for sequencing with the out-of-lab protocol.

DNA extraction for Flongle Methods 

DNA extraction was done with the QIAamp mini blood kit (QIAamp DNA Blood Kits), but the protocol was adapted for out-of-lab use. A mini centrifuge at 6,000 RPM (Mini Centrifuge, 6,000 RPM, White | Southern Labware) and a mobile heating block (One-Block Digital Dry Bath 115V | Southern Labware) were crucial for creating a protocol that is mobile while also keeping costs low. Increasing the DNA concentration of each sample for use in the flongles was integral to the out-of-lab protocol, so the amount of buffer in the final elution step of the DNA extraction was halved. Sample concentration was tested with the NanoDrop ND-1000 in a lab setting, solely to determine whether this protocol was viable outside of the lab. The samples ranged in concentration from 72 to 89 ng/µl, which is within the requirements of the MinION flongle flow cells (Lu et al., 2016). All samples were extracted using these tested methods and labware that can be used in an out-of-lab setting.

Nanopore Flongle Methods and Mobile Sequencing Kit 

Not all animals were included in the flongle analysis because of time and cost limitations. In total, 9 animals per breed were chosen and run separately on individual flongles. Animals from each breed were chosen at random with the “random” function in R. This initial total of 18 animals was used to establish the flongle pipeline for out-of-lab sequencing.

The technology used to obtain genomic information was ONT’s MinION, along with the flongle flow cell and flongle adapter. The kit used to make the DNA library for the flongles was the Rapid Sequencing Kit (SQK-RAD004), which boasts a very quick library preparation time. After initial practice, the protocol can be done within 30 minutes. Third party materials (AMPure beads) were needed for this protocol, which substantially increases the cost.

The protocol was designed as a mobile lab that a scientist of any level could run at a reasonable cost. The goal of this study was to create a complete mobile kit covering sample collection, DNA extraction, flongle library preparation and the final bioinformatic pipeline, all of which could be completed outside the wet lab. This was done by purchasing mobile lab equipment, such as a heating block, mini centrifuge and pipettes, specifically for out-of-lab use. Consumables such as pipette tips, nuclease-free water, ethanol and tubes were also needed. All these elements and their approximate prices can be found in Table 4.1.

Mobile Lab Component | Price (USD)
MinION Device | $1,000
Library Preparation |
Flongles (x12) and Flongle Adapter | $1,460
Pipettes | $477
Pipette Tips | $150 per box
Nuclease Free Water | $40 for 500mL
Omega XP Beads | $107 for 5mL
Ethanol | $38 for 500mL
Heating Block + Adapters | $381 + $39.51 & $80
Mini Centrifuge | $150
Magnetic Rack | $59
Computer (Dell g15 Gaming Laptop) | $1,200

Table 4.1 Price breakdown of mobile lab for MinION using flongles. 

The initial runs followed the exact protocol laid out by Nanopore. After the first runs, the protocol was modified to best fit the input sample and the goals of this study. Increasing the amount of DNA used in the preparation steps and increasing the amount of Rapid Adapter (RAP) both increased total output.

Output from some flongle runs was very low, while others were quite successful. Troubleshooting at the beginning of the analysis was necessary, as this was a new technology in the lab. Most problems arose at flongle quality control, which is run through the MinKNOW program before library preparation. Many flongles were received defective, defined as having fewer than 60 functional pores out of the 126 total pores (Rang et al., 2018). In this case, Nanopore will send a new flongle to replace the defective one. However, receiving the replacement can take 2-3 weeks after quality control, so timing issues can arise if a batch of flongles is found to be defective.

Bioinformatic Analysis 

The pipeline for the flongle samples, from the fast5 output files of the MinION to the final VCF files, was as follows: guppy (Wick et al., 2019) was used for basecalling and alignment to the ARS-UCD1.2 cattle genome, samtools (Li et al., 2009) was used for merging, sorting and indexing BAM files, and bcftools (Danecek et al., 2021) was used for final variant calling. Bcftools was chosen because the output from the flongles was minimal compared with that of the initial whole-genome comparison between Illumina and Nanopore.
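The steps above can be sketched as the command sequence below. This is a minimal sketch and not the exact commands used in this study: the file paths, the Guppy configuration file and the particular options are assumptions for illustration, and the helper `build_pipeline_commands` is hypothetical. It only assembles the command strings so they can be inspected before being run.

```python
import shlex


def build_pipeline_commands(fast5_dir, reference_fasta, bam_files, out_prefix, guppy_config):
    """Assemble (but do not run) the shell commands for one flongle sample.

    bam_files is the list of aligned BAM files written by guppy; all paths
    and the guppy configuration name are supplied by the caller (assumptions).
    """
    q = shlex.quote
    guppy_out = out_prefix + "_guppy"
    merged = out_prefix + ".merged.bam"
    sorted_bam = out_prefix + ".sorted.bam"
    vcf = out_prefix + ".vcf.gz"
    return [
        # Basecalling and alignment to the reference in a single guppy call.
        "guppy_basecaller --input_path {} --save_path {} --config {} --align_ref {} --bam_out".format(
            q(fast5_dir), q(guppy_out), q(guppy_config), q(reference_fasta)),
        # Merge the per-batch BAM files, then sort and index the result.
        "samtools merge {} {}".format(q(merged), " ".join(q(b) for b in bam_files)),
        "samtools sort -o {} {}".format(q(sorted_bam), q(merged)),
        "samtools index {}".format(q(sorted_bam)),
        # Pile up the reads and call variants into a compressed VCF.
        "bcftools mpileup -Ou -f {} {} | bcftools call -mv -Oz -o {}".format(
            q(reference_fasta), q(sorted_bam), q(vcf)),
    ]


if __name__ == "__main__":
    # Hypothetical sample name, paths and configuration file.
    for command in build_pipeline_commands(
            "J103/fast5", "reference.fa", ["J103_0.bam", "J103_1.bam"],
            "results/J103", "dna_r9.4.1_450bps_hac.cfg"):
        print(command)
```

Keeping the commands as plain strings makes it easy to log exactly what was run for each sample when the protocol is repeated outside the lab.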

Since output from the flongles was low, the VCF outputs were compared at different numbers of calls per position and at different call quality thresholds. This was done by filtering on the DP and QUAL values of the VCF, where DP is the depth of coverage at a given call and QUAL is the quality of that call. Lower quality and depth thresholds retained more variants but sacrificed the accuracy of those calls. A common depth for high accuracy is around 40x with quality above 20 (De La Cerda et al., 2023; Delahaye & Nicolas, 2021).
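The DP and QUAL filtering described above can be illustrated with a short sketch. It assumes an uncompressed, tab-separated VCF in which depth is stored as `DP=` in the INFO column (as bcftools writes it); the function names and the example records are hypothetical and are not data from this study.

```python
def parse_info(info_field):
    """Turn an INFO string such as 'DP=3;MQ=60' into a dictionary."""
    info = {}
    for item in info_field.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            info[key] = value
        elif item:
            info[item] = True  # flag without a value
    return info


def filter_vcf_lines(lines, min_depth, min_qual):
    """Keep header lines and records with DP >= min_depth and QUAL >= min_qual."""
    kept = []
    for line in lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        fields = line.rstrip("\n").split("\t")
        qual = fields[5]                  # column 6 of a VCF record is QUAL
        info = parse_info(fields[7])      # column 8 is INFO
        if qual == "." or "DP" not in info:
            continue                      # skip records that cannot be judged
        if float(qual) >= min_qual and int(info["DP"]) >= min_depth:
            kept.append(line)
    return kept


if __name__ == "__main__":
    example = [
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
        "1\t100\t.\tA\tG\t30\t.\tDP=3;MQ=60\n",
        "1\t200\t.\tC\tT\t10\t.\tDP=5;MQ=60\n",
        "1\t300\t.\tG\tA\t50\t.\tDP=1;MQ=60\n",
    ]
    for depth in (1, 2, 3):
        records = [l for l in filter_vcf_lines(example, depth, 15) if not l.startswith("#")]
        print(depth, len(records))
```

Running the same filter at several depth thresholds shows the trade-off directly: each increase in the threshold removes records, leaving fewer but better-supported calls.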

Breed Identification through PCA 

Originally, 14 animals from the Akaushi and Wagyu herds were sampled, but not all animals were used in the protocol (Table 4.2). This was due to time and material availability, as not all flongles received had the required number of available pores. Some samples (H002, 732, J101 and H006) were also run multiple times, as the protocol was not well established at the beginning of this study and is highly prone to human error. Those re-runs are marked by “redo” in the sample name.

BREED
AKAUSHI (RED WAGYU)    BLACK WAGYU
A101                   19
G902                   20
H002                   199
H002redo               408
H004                   732
H006                   732redo
H006redo               D808
H007                   J101
H009                   J101redo
H010                   J102
                       J103

Table 4.2 Breed identification of all samples used. Samples with "redo" were sequenced with 
more than 1 flongle due to low initial output. 

Initial exploration into breed identification utilized principal component analysis. Since each of these samples had a different output, with different calls, one large matrix containing all animals could not be created. Instead, each sample was run through a bioinformatic analysis that subset the calls available for that sample against the calls available in the 1000 bulls directory. This subset of calls yielded one PCA per sample. To obtain principal components, a genetic relationship matrix was created using the method of VanRaden (2008):

$$G = \frac{Z'Z}{2 \sum_{i} p_i (1 - p_i)}$$

where Z is a matrix of centered allele counts and p_i is the allele frequency at locus i. The resulting G matrix was used to obtain the eigenvalues and eigenvectors for PCA. This analysis was done in R using the eigen function, and the first 2 principal components were plotted. The dispersal of these values per animal around the mean of 0 was then plotted by breed, with the percent of variation explained calculated from the eigenvalues (McVean, 2009; Patterson et al., 2006).
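The construction of G and the extraction of the first principal component can be illustrated on a toy example. The sketch below is written in Python, not the R used in this study, and rests on stated assumptions: genotypes are coded as 0/1/2 allele counts, animals are stored as rows (so the product is written as Z times its transpose), allele frequencies are estimated from the same animals, and the leading eigenpair is found by power iteration in place of a full eigen decomposition. The share of variation explained is taken as the leading eigenvalue divided by the trace of G, which equals the sum of all eigenvalues. The genotypes are invented for illustration.

```python
import math


def vanraden_grm(genotypes):
    """Genomic relationship matrix from rows of 0/1/2 allele counts (one row per animal)."""
    n_animals = len(genotypes)
    n_loci = len(genotypes[0])
    # Allele frequency p_i at each locus, estimated from the genotypes themselves.
    p = [sum(row[i] for row in genotypes) / (2.0 * n_animals) for i in range(n_loci)]
    # Center each allele count by its expectation 2 * p_i.
    z = [[row[i] - 2.0 * p[i] for i in range(n_loci)] for row in genotypes]
    denom = 2.0 * sum(pi * (1.0 - pi) for pi in p)
    return [[sum(z[a][i] * z[b][i] for i in range(n_loci)) / denom
             for b in range(n_animals)] for a in range(n_animals)]


def leading_eigenpair(matrix, iterations=200):
    """Largest eigenvalue and its eigenvector of a symmetric PSD matrix (power iteration)."""
    n = len(matrix)
    # A non-constant start vector: the constant vector lies in the null space of G.
    vec = [float(i + 1) for i in range(n)]
    value = 0.0
    for _ in range(iterations):
        product = [sum(matrix[r][c] * vec[c] for c in range(n)) for r in range(n)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            raise ValueError("start vector lies in the null space of the matrix")
        vec = [x / norm for x in product]
        value = norm
    return value, vec


if __name__ == "__main__":
    # Four hypothetical animals, three loci; the first two resemble each other.
    toy = [[2, 2, 0], [2, 1, 0], [0, 0, 2], [0, 1, 2]]
    g = vanraden_grm(toy)
    eigenvalue, pc1 = leading_eigenpair(g)
    trace = sum(g[i][i] for i in range(len(g)))
    print("PC1 scores:", [round(x, 3) for x in pc1])
    print("share of variation on PC1:", round(eigenvalue / trace, 3))
```

In the toy data the first two animals receive scores of one sign and the last two of the opposite sign, which is the separation a breed-level PCA plot relies on.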

Breed Identification Method through Haplotype Blocks

Low-pass sequencing with the MinION yielded results that are not fit for imputation because of the long-read structure of the reads obtained, so resolving the low coverage through imputation of the dataset was not feasible. This is a drawback of long-read sequencing: in this case it is not a targeted sequencing method, and it therefore outputs large chunks of the genome

which are usually not evenly distributed throughout the genome. A more effective method of identifying these animals was therefore developed that does not require imputation.

Identifying haplotype blocks from the long-read sequences in the flongle output proved to be a straightforward approach for breed identification. The initial haplotype from the flongle was used without the need for a variant caller. The haplotype blocks were then compared to haplotypes from all breeds in the 1000 bulls directory as well as to Wagyu and Akaushi samples with whole-genome sequence.

Samtools mpileup was used to obtain all nucleotides and positions for reads at a mapping quality threshold of 60. Insertions were removed from the data, and positions were matched to the 1000 bulls genome data so that only known variant positions were kept. The total number of variants sequenced, the number of reads obtained with the flongle, the number of those reads that aligned, and the number that passed quality control can be found in Table 4.3. Samples with a low output of reads consequently had a low number of variants and aligned reads available for haplotype correlation. The sample nucleotides were treated as a haplotype and converted into 0 and 1 based on counts of the reference allele.
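The conversion from pileup output to a 0/1 haplotype can be sketched as follows. This is a minimal sketch under stated assumptions and not the code used in this study: it reads the standard samtools mpileup text columns (chromosome, position, reference base, depth, read bases, qualities), strips insertion and deletion annotations from the read-base string, and codes a position as 1 when the most frequently observed base is the reference allele and 0 otherwise. The majority rule is an assumed reading of "counts of the reference allele", and all function names and example lines are hypothetical.

```python
from collections import Counter


def observed_bases(ref_base, base_string):
    """Return the nucleotides observed at one pileup position, without indel annotations."""
    bases = []
    i = 0
    while i < len(base_string):
        ch = base_string[i]
        if ch == "^":                 # read start, followed by a mapping-quality character
            i += 2
        elif ch in "+-":              # indel annotation: sign, length, then that many bases
            j = i + 1
            while j < len(base_string) and base_string[j].isdigit():
                j += 1
            i = j + int(base_string[i + 1:j])
        elif ch in ".,":              # match to the reference on either strand
            bases.append(ref_base.upper())
            i += 1
        elif ch.upper() in "ACGTN":   # mismatch on either strand
            bases.append(ch.upper())
            i += 1
        else:                         # '$', '*', '#', '<', '>' carry no base call here
            i += 1
    return bases


def encode_position(ref_base, base_string):
    """1 if the majority observed base is the reference allele, 0 if not, None if no call."""
    bases = [b for b in observed_bases(ref_base, base_string) if b != "N"]
    if not bases:
        return None
    majority = Counter(bases).most_common(1)[0][0]
    return 1 if majority == ref_base.upper() else 0


def haplotype_from_pileup(lines, known_positions):
    """Map (chromosome, position) to a 0/1 code, keeping only known variant positions."""
    haplotype = {}
    for line in lines:
        chrom, pos, ref, _depth, base_string = line.rstrip("\n").split("\t")[:5]
        key = (chrom, int(pos))
        if key not in known_positions:
            continue
        code = encode_position(ref, base_string)
        if code is not None:
            haplotype[key] = code
    return haplotype


if __name__ == "__main__":
    pileup = ["1\t100\tA\t2\t^].+2GG,\tII", "1\t101\tC\t1\tg\tI", "1\t102\tG\t1\t.\tI"]
    reference_positions = {("1", 100), ("1", 101)}   # hypothetical known variant sites
    print(haplotype_from_pileup(pileup, reference_positions))
```

Because most flongle positions are covered by a single read, the majority rule usually reduces to asking whether that one read carries the reference allele.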

Sample ID    Breed      Number of Variants    Total Reads    Aligned Reads    QC Reads
19           Wagyu      107,726               10,816         6,895            5,219
199          Wagyu      254,558               10,538         6,831            4,834
20           Wagyu      761,962               28,902         18,840           14,823
408          Wagyu      765,020               39,206         25,653           20,506
732          Wagyu      348,650               29,789         21,386           16,815
732redo      Wagyu      4,014,254             206,208        135,512          108,950
A101         Akaushi    741,301               29,402         19,209           14,304
D808         Wagyu      2,884,050             120,262        76,594           61,300
G902         Akaushi    1,444,057             49,817         31,815           24,950
H002         Akaushi    39,616                1,973          1,349            1,025

Table 4.3 Number of variants obtained for haplotype matching procedure. This includes total 
number of variants obtained, total number of DNA strands read, number of those strands that 
were aligned and number of those strands that passed quality control. 

Table 4.3 (cont’d)

Sample ID    Breed      Number of Variants    Total Reads    Aligned Reads    QC Reads
H002redo     Akaushi    1,727,485             111,330        70,094           53,510
H004         Akaushi    2,598,474             116,639        71,686           56,608
H006         Akaushi    31,082                1,810          1,261            1,040
H006redo     Akaushi    3,207,971             160,989        99,679           78,601
H007         Akaushi    3,707,136             146,562        89,832           70,287
H009         Akaushi    997,026               50,891         28,674           21,176
H010         Akaushi    1,196,141             66,197         41,036           30,719
J101         Wagyu      157,110               13,583         8,684            6,713
J101redo     Wagyu      1,305,754             59,603         38,517           31,426
J102         Wagyu      699,505               21,716         12,464           9,611
J103         Wagyu      4,711,443             163,471        98,634           76,461

The reference data, which consisted of many cattle breeds, was subset to the positions of the haplotypes obtained from the flongle output and converted into haplotype format. The correlations of sample haplotypes with reference haplotypes were calculated using Pearson’s correlation.

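$$r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \sum_i (y_i - \bar{y})^2}}$$

where x_i and y_i are the 0/1 codes of the sample and reference haplotypes at position i.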

Concordance percentages were also calculated for every pairwise comparison between sample and reference haplotypes. This was done across the genome for each animal of each breed in the reference population. The correlations and concordances against each breed in the reference population were used to identify the breed with the highest match to the sample. Each sample was then assigned to the breed of the reference animal with which its haplotypes had the largest concordance and correlation.
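The matching step can be illustrated with a short sketch that computes Pearson’s correlation and the concordance between 0/1 haplotype vectors and then picks the best-matching breed. It is an illustration under assumptions and not the code used in this study: the haplotypes are taken to be already restricted to shared positions, the best match is ranked by correlation with concordance as a tie-breaker (the text does not state how the two were combined), and the function names and toy haplotypes are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric vectors (nan if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def concordance(x, y):
    """Proportion of positions at which the two haplotypes carry the same code."""
    return sum(a == b for a, b in zip(x, y)) / len(x)


def assign_breed(sample, reference):
    """Return the breed of the reference haplotype that best matches the sample.

    reference maps a breed name to a list of 0/1 haplotypes, each the same
    length as the sample. Ranking: correlation first, concordance second.
    """
    best = None
    for breed, haplotypes in reference.items():
        for haplotype in haplotypes:
            r = pearson(sample, haplotype)
            if math.isnan(r):
                continue
            score = (r, concordance(sample, haplotype))
            if best is None or score > best[0]:
                best = (score, breed)
    return None if best is None else best[1]


if __name__ == "__main__":
    sample = [1, 0, 1, 1, 0, 1]
    reference = {
        "Wagyu": [[1, 0, 1, 1, 0, 0]],
        "Brahman": [[0, 1, 0, 1, 1, 1]],
    }
    print(assign_breed(sample, reference))
```

Concordance alone tends to be high for every breed because most positions carry the reference allele, which is one reason to read it alongside the correlation.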

Results 

Flongle Output 

The output from each flongle run can be found in Table 4.4. Three scenarios were considered, in which a position had a depth of 1x, 2x, or 3x or greater, with the quality filter set to 15 in all cases. Differences in the number of variants from Table 4.3 are due to differences in the quality threshold, as the quality threshold used for PCA analysis was set very low to obtain as many variants as

possible, without consideration of accuracy. Output of the flongles from the mobile sequencing kit showed promising results, as some sample runs had a high number of variants sequenced (see 732 redo and J103 in Table 4.4). The coverage depth and breadth of the flongles may not be enough for an accurate whole-genome sequence but can give some insight into the sequence of the animal. The issue lies in the reliability of the sequence, as low depth and breadth may introduce inaccurate calls.

                 Position Coverage Depth
Sample ID        Depth at 1x      Depth at 2x     Depth at 3x
19               4,930,124        34,765          3,625
20               32,767,833       825,457         40,237
199              10,730,608       61,254          6,788
408              33,942,193       883,729         55,992
732              15,907,514       281,956         8,347
732 redo         174,803,776      16,473,840      1,138,012
A101             32,014,043       973,671         58,633
D808             126,877,137      8,928,299       605,251
G902             618,447,212      2,332,279       163,267
H002             1,704,607        19,969          4,143
H002 REDO        76,524,848       3,538,080       223,932
H006             1,347,878        8,897           0
H006 REDO        139,912,708      10,554,241      693,717
H007             162,263,656      17,100,294      1,385,234
H009             43,039,064       1,308,600       95,175
H010             52,202,351       2,005,568       158,992
J101             7,102,697        84,323          2,022
J101 REDO        57,663,228       1,845,338       97,239
J102             29,885,656       828,256         28,632
J103             202,218,709      22,203,105      1,713,119

Table 4.4 Output of each sample at quality filtering of 15 at different depth 
coverages. 

Many flongles did not run correctly due to low pore capacity or a low amount of sequence in the library preparation (see the H002 and H006 samples in Table 4.4). The minimum number of flongle pores that must pass quality control for a run is 50 (Delahaye & Nicolas, 2021) out of the 126 pores available per flongle. A flongle under that threshold can be replaced with a new one if it is within the 4-week warranty window. Due to the variability in the quality of the flongles received and the initial protocol troubleshooting, only 7 Akaushi and 9 Wagyu were used, as multiple flongles were run for a single animal if sequencing had failed.

Due to low coverage and the nature of long-read sequencing, traditional imputation of the flongle outputs was difficult. Long-read sequencing is not conducive to imputation because it outputs large chunks of the genome that may not provide adequate coverage of the genome as a whole. Short-read sequencing is more likely to yield many small segments that together cover a larger range of the genome (Whiteford et al., n.d.). Imputation programs are written for the latter scenario, in which the program can take those smaller sections en masse and infer the larger sections. When given large sections with large missing sections in between, imputation software can falter.

Breed Identification with 100k 

As a baseline for the greater sample size of animals used, the blood samples were not only sequenced with the flongle mobile setup but were also genotyped at 100k. The animals were genotyped to establish what the PCA should look like with the flongle output and whether the sampled animals would group correctly within their respective breeds. In theory, the PCA plots should look identical if enough data were collected on the flongles. The PCA of the animals used can be found in Figure 4.1. Samples of the two breeds, Akaushi and Wagyu, are shown in the legend in salmon (akaushi_test) and in red (wagyu_test), respectively. The samples collected for this study followed the expected pattern and grouped into the correct breeds. This baseline confirms that these animals are in fact either Wagyu or Akaushi and group within the breed to which they were assigned.

Figure 4.1 Principal Component Analysis with 100k genotypes of Wagyu and Akaushi samples 
with other cattle breeds to identify breed composition. 

Breed Identification using Flongle Mobile Kit via PCA 

The goal for the flongle output is to mimic the grouping in Figure 4.1. To do this, the VCF output for each of the 9 Wagyu and 7 Akaushi was filtered at low depth and quality at each position to compensate for low output. Each DNA strand in the library preparation is read through a nanopore as one long read (Clamer et al., 2014), unlike the 100k genotype, which consists of small reads that cover each position at a large depth. Long-read sequencing may not achieve considerable depth at each position but can span large swaths of the genome in single reads. Figures 4.2, 4.3 and 4.4 show each depth-filtering scenario for 3 examples chosen based on output: a high amount of output (J103), an average amount of output (H002redo), and a low amount of output (19).

Figure 4.2 Principal component analysis with coverage depth filtering set to 1 and quality of 15. 
Sample 19 had low sequence output, sample H002 had an average sequence output and sample 
J103 had a high sequence output. Each sample can be identified as a purple diamond and should 
have been grouped in the Wagyu breed. 

Figure 4.3 Principal component analysis with coverage depth filtering set to 2 and quality of 15. 
Sample 19 had low sequence output, sample H002 had an average sequence output and sample 
J103 had a high sequence output. Each sample can be identified as a purple diamond and should 
have been grouped in the Wagyu breed.

Figure 4.4 Principal component analysis with coverage depth filtering set to 3 and quality of 15. 
Sample 19 had low sequence output, sample H002 had an average sequence output and sample 
J103 had a high sequence output. Each sample can be identified as a purple diamond and should 
have been grouped in the Wagyu breed.

The overall trend shows that the more positions are sequenced, the better the prediction of the breed of the animal sequenced with the flongle may be. Yet retaining the most positions by filtering at low quality and depth risks those positions being incorrect. The lower the accuracy of the calls, the lower the chance of correct breed identification, which may not be achievable at all when little sequence is available, as in Sample 19. The possibility of inaccurate placement of an animal in a breed becomes very high, especially

in animals that belong to closely related groups, such as the Akaushi (Red Wagyu) and Black Wagyu.

Breed Identification through Haplotype Correlation 

The number of reads obtained from this procedure was low, so traditional imputation strategies could not be employed without the possibility of imputation “toward the mean” of a reference population. As seen in the PCA results of Figures 4.2, 4.3, and 4.4, it was not possible to subset the positions obtained against a reference and run principal component analyses for placement in a breed group. Another method of breed identification was therefore employed, matching haplotypes with the reference population. Haplotype correlations ranged from 0.18 to 0.55, while concordance ranged from 0.79 to 0.94, across all animals in the reference population.

A plot of haplotype correlations per breed with each sample group (Wagyu or Akaushi) 

can be seen in Figure 4.5 and concordance of samples with reference haplotypes can be seen in 

Figure 4.6. 

Figure 4.5 Correlation of Wagyu and Akaushi haplotypes with reference population haplotypes. 
Each breed considered can be seen on the X-axis, and the correlation of haplotype blocks in red 
(Akaushi) or green (Wagyu). Highest overall correlation can be seen between the sample’s 
realized breed and their respective reference breed group.  

Figure 4.6 Concordance rates for Wagyu and Akaushi haplotype blocks with reference 
haplotypes. Highest overall concordance can be seen between the sample’s realized breed and
their respective reference breed group. 

The mean concordance and correlations of sample haplotypes were highest for their respective breeds (i.e., Wagyu and Akaushi). The lowest correlations and concordances were observed for distantly related breeds such as Brahman and Nelore, which are indicine cattle.

Conclusions 

Overall, the mobile kit proved to be a viable option for an out-of-lab sequencing protocol. The initial setup is costly, but most consumables, excluding the Nanopore flongles, are cheap to replace. The output from flongle sequencing via the mobile kit may not be sufficient for principal component analysis, but enough sequence was obtained to identify animals through another method using haplotypes. Breed identification using PCA sacrifices specificity and accuracy, as dense variant arrays that span the genome are crucial for building a genetic relationship matrix that can identify breed composition. Many variants may be sequenced, but only a few may contribute to breed identification (O’Brien et al., 2020). This process, which includes all breeds that are to be considered, demands some sort of phasing or imputation, as the breed database is built on genotyped animals with a standard SNP map. To solve this issue, a new approach was established to better estimate breed composition without the need to phase to a certain SNP density.

Recent imputation programs have been created specifically for long-read nanopore data. The QUILT program (Davies et al., 2021) uses these long reads in haplotype imputation via Gibbs sampling. Using this program for low-coverage, whole-genome data will be crucial in identifying breeds from a group of samples. Accurate imputation from low-density genotype samples opens the door to more intensive breed identification, from principal component analysis to STRUCTURE (Porras-Hurtado et al., 2013) analysis. The drawback of this method is its reliance on good databases: a breed cannot be identified without first having many breeds to compare against.

Bringing efficient and accurate genomic sequencing out of the lab has proven to be a difficult task. It is even harder to bring sequencing to the masses at a low cost with an easy-to-follow protocol. Setups with higher up-front costs can achieve greater output (Lamb et al., 2021), which can be a solution if up-front cost is not an issue. Many logistical issues, such as sample collection, processing time, faulty materials, and insufficient data for bioinformatic analysis, have stood in the way of real-time sequencing success. These are all hurdles that were faced while creating this protocol and sequencing kit. This technology is still in its infancy, with new flongle chemistry and engineering being produced annually. The cost of this technology and the difficulty of use reflect its novelty, and the protocol is still out of reach for an everyday

farmer or scientist. Identification of these samples was achieved, but through a novel approach that required specialist knowledge, which is not commonplace. Even with all the setbacks, this new frontier of sequencing has proved able to identify breed composition through genotype in an out-of-lab setting. This achievement was not feasible 25 years ago, and upcoming technologies from Nanopore aim to make sequencing more efficient and user friendly.

The successful identification and certification of Wagyu beef through genotype analysis 

using rapid sequencing kits marks a significant stride in ensuring the authenticity of this high-

quality breed. These advancements not only offer a reliable means of verifying breed claims but 

also emphasize the potential for wider consumer trust in US Wagyu. Nanopore products have the 

potential to enhance transparency and trust within the food industry, ensuring that premium 

products like Wagyu maintain their integrity and value throughout the supply chain. 

LITERATURE CITED 

Aung, M. M., & Chang, Y. S. (2014). Traceability in a food supply chain: Safety and quality 
perspectives. In Food Control (Vol. 39, Issue 1, pp. 172–184). Elsevier BV.  

Bosona, T., & Gebresenbet, G. (2013). Food traceability as an integral part of logistics 

management in food and agricultural supply chain. In Food Control (Vol. 33, Issue 1, pp. 
32–48). Elsevier.  

Bowden, R., Davies, R. W., Heger, A., Pagnamenta, A. T., de Cesare, M., Oikkonen, L. E., 

Parkes, D., Freeman, C., Dhalla, F., Patel, S. Y., Popitsch, N., Ip, C. L. C., Roberts, H. E., 
Salatino, S., Lockstone, H., Lunter, G., Taylor, J. C., Buck, D., Simpson, M. A., & 
Donnelly, P. (2019). Sequencing of human genomes with nanopore technology. Nature
Communications, 10(1), 1–9.  

Clamer, M., Höfler, L., Mikhailova, E., Viero, G., & Bayley, H. (2014). Detection of 3′-end 

RNA uridylation with a protein nanopore. ACS Nano, 8(2), 1364–1374.  

Danecek, P., Bonfield, J. K., Liddle, J., Marshall, J., Ohan, V., Pollard, M. O., Whitwham, A., 

Keane, T., McCarthy, S. A., & Davies, R. M. (2021). Twelve years of SAMtools and 
BCFtools. GigaScience, 10(2), 1–4.  

Davies, R. W., Kucka, M., Su, D., Shi, S., Flanagan, M., Cunniff, C. M., Chan, Y. F., & Myers, 

S. (2021). Rapid genotype imputation from sequence with reference panels. Nature 
Genetics, 53(7), 1104.  

De La Cerda, G. Y., Landis, J. B., Eifler, E., Hernandez, A. I., Li, F. W., Zhang, J., Tribble, C. 

M., Karimi, N., Chan, P., Givnish, T., Strickler, S. R., & Specht, C. D. (2023). Balancing 
read length and sequencing depth: Optimizing Nanopore long‐read sequencing for 
monocots with an emphasis on the Liliales. Applications in Plant Sciences, 11(3).  

Delahaye, C., & Nicolas, J. (2021). Sequencing DNA with nanopores: Troubles and biases. PLoS 

ONE, 16(10).  

Destefanis, G., Barge, M. T., Brugiapaglia, A., & Tassone, S. (2000). The use of principal 

component analysis (PCA) to characterize beef. Meat Science, 56(3), 255–259.  

Gupta, A. K., & Gupta, U. D. (2014). Next Generation Sequencing and Its Applications. Animal 

Biotechnology: Models in Discovery and Translation, 345–367.  

Jain, M., Olsen, H. E., Paten, B., & Akeson, M. (2016). The Oxford Nanopore MinION: 

Delivery of nanopore sequencing to the genomics community. Genome Biology, 17(1).  

King, J., Pohlmann, A., Dziadek, K., Beer, M., & Wernike, K. (2021). Cattle connection: 

molecular epidemiology of BVDV outbreaks via rapid nanopore whole-genome 
sequencing of clinical samples. BMC Veterinary Research, 17(1).  

Lamb, H. J., Hayes, B. J., Nguyen, L. T., & Ross, E. M. (2020). The Future of Livestock 

Management: A Review of Real-Time Portable Sequencing Applied to Livestock. Genes, 
11(12), 1478.  

Lamb, H. J., Hayes, B. J., Randhawa, I. A. S., Nguyen, L. T., & Ross, E. M. (2021). Genomic 

prediction using low-coverage portable Nanopore sequencing. PLOS ONE, 16(12), 
e0261274.  

Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., & 
Durbin, R. (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics 
(Oxford, England), 25(16), 2078–2079.  

Lu, H., Giordano, F., & Ning, Z. (2016). Oxford Nanopore MinION Sequencing and Genome 

Assembly. In Genomics, Proteomics and Bioinformatics (Vol. 14, Issue 5).  

McNaughton, A. L., Roberts, H. E., Bonsall, D., de Cesare, M., Mokaya, J., Lumley, S. F., 

Golubchik, T., Piazza, P., Martin, J. B., de Lara, C., Brown, A., Ansari, M. A., Bowden, 
R., Barnes, E., & Matthews, P. C. (2019). Illumina and Nanopore methods for whole 
genome sequencing of hepatitis B virus (HBV). Scientific Reports, 9(1).  

McVean, G. (2009). A genealogical interpretation of principal components analysis. PLoS 

Genetics, 5(10).  

Mini Centrifuge, 6,000 RPM, White | Southern Labware. (n.d.). Retrieved September 7, 2023, 

from https://www.southernlabware.com/mini-centrifuge 

Nguyen, T. V., Vander Jagt, C. J., Wang, J., Daetwyler, H. D., Xiang, R., Goddard, M. E., 

Nguyen, L. T., Ross, E. M., Hayes, B. J., Chamberlain, A. J., & MacLeod, I. M. (2023). 
In it for the long run: perspectives on exploiting long-read sequencing in livestock for 
population scale studies of structural variants. Genetics Selection Evolution,
55(1), 1–15.  

O’Brien, A. C., Purfield, D. C., Judge, M. M., Long, C., Fair, S., & Berry, D. P. (2020). 

Population structure and breed composition prediction in a multi-breed sheep population 
using genome-wide single nucleotide polymorphism genotypes. Animal, 14(3), 464–474.  

One-Block Digital Dry Bath 115V | Southern Labware. (n.d.). Retrieved September 7, 2023, 

from https://www.southernlabware.com/one-block-digital-dry-bath-115v 

Oxford Nanopore Technologies. (n.d.). Flongle. https://nanoporetech.com/products/flongle

Patterson, N., Price, A. L., & Reich, D. (2006). Population structure and eigenanalysis. PLoS 

Genetics, 2(12), 2074–2093.  

Porras-Hurtado, L., Ruiz, Y., Santos, C., Phillips, C., Carracedo, Á., & Lareu, M. V. (2013). An 

overview of STRUCTURE: Applications, parameter settings, and supporting software. 
Frontiers in Genetics, 4(MAY), 48396.  

QIAamp DNA Blood Kits. (n.d.). Retrieved January 9, 2022, from 

https://www.qiagen.com/us/products/discovery-and-translational-research/dna-rna-
purification/dna-purification/genomic-dna/qiaamp-dna-blood-kits/ 

Rang, F. J., Kloosterman, W. P., & de Ridder, J. (2018). From squiggle to basepair: 

Computational approaches for improving nanopore sequencing read accuracy. In Genome 
Biology (Vol. 19, Issue 1, pp. 1–11). BioMed Central Ltd.  

Schroeder, T. C., & Tonsor, G. T. (2012). International cattle ID and traceability: Competitive 

implications for the US. Food Policy, 37(1), 31–40.  

Stefan, C. P., Hall, A. T., Graham, A. S., & Minogue, T. D. (2022). Comparison of Illumina and 
Oxford Nanopore Sequencing Technologies for Pathogen Detection from Clinical 
Matrices Using Molecular Inversion Probes. Journal of Molecular Diagnostics, 24(4), 
395–405.  

Taylor, T. L., Volkening, J. D., DeJesus, E., Simmons, M., Dimitrov, K. M., Tillman, G. E., 

Suarez, D. L., & Afonso, C. L. (2019). Rapid, multiplexed, whole genome and plasmid 
sequencing of foodborne pathogens using long-read nanopore technology. Scientific 
Reports, 9(1).  

Tyler, A. D., Mataseje, L., Urfano, C. J., Schmidt, L., Antonation, K. S., Mulvey, M. R., & 

Corbett, C. R. (2018). Evaluation of Oxford Nanopore’s MinION Sequencing Device for 
Microbial Whole Genome Sequencing Applications. Scientific Reports, 8(1), 10931.  

VanRaden, P. M. (2008). Efficient Methods to Compute Genomic Predictions. Journal of Dairy 

Science, 91(11), 4414–4423.  

Whiteford, N., Haslam, N., Weber, G., Prügel-Bennett, A., Essex, J. W., Roach, P. L., Bradley, 
M., & Neylon, C. (n.d.). An analysis of the feasibility of short read sequencing.  

Wick, R. R., Judd, L. M., & Holt, K. E. (2019). Performance of neural network basecalling tools 

for Oxford Nanopore sequencing. Genome Biology, 20(1), 1–10.  

Zhou, A., Lin, T., & Xing, J. (2019). Evaluating nanopore sequencing data processing pipelines 

for structural variation identification. Genome Biology, 20(1), 237.  

CHAPTER 5: Genetic Characterization of the Akaushi Breed in the United States 

Hanna Ostrovski and Cedric Gondro 

Abstract 

Understanding the relationships between cattle breeds is important when considering crossbreeding in an industry setting or assessing the feasibility of multi-breed genomic prediction. This is especially important for Japanese Wagyu breeds, which are not well characterized at the population level. Today, in the United States (US), Akaushi (also known as Japanese Brown or Red) and Japanese Black cattle are rising in prevalence due to their high-quality beef products. The population structure of the American Akaushi has not been explored since the arrival of these animals in the 1990s. This study aims to better understand the Akaushi population by analyzing population parameters such as inbreeding, heterozygosity, opposing homozygotes, linkage disequilibrium (LD), and effective population size. The Akaushi were compared with other Asian and European cattle breeds using principal component analysis to identify the relatedness of the US Akaushi to other cattle breeds. Across the 43 animals studied, the genetic variation within the Akaushi population was comparable to that of other Asian breeds, but a large amount of LD was found within the genome. Analysis of relatedness to other cattle breeds showed that the Akaushi are most closely related to the Korean Hanwoo and Chosun. In summary, large LD blocks are apparent within the American Akaushi population, and this group of cattle is most closely related to Hanwoo cattle along with other Japanese breeds. Further exploration of the American Japanese breeds is needed to fully understand the selection pressures that have occurred since the 1990s.

Introduction  

The first cattle domestication event, that of Bos taurus taurus, occurred around 8,500 B.C. in the Fertile Crescent (Bruford et al., 2003). Soon after this first domestication, a separate domestication event of Bos taurus indicus occurred in the Indus Valley (Loftus et al., 1994), which led to the two subspecies that are present today. Pinpointing the dissemination of these two subspecies throughout the world is important to understand the genetic

makeup of our current cattle breeds, specifically, for this study, the high-quality, high-marbling breeds originating from the Asian continent. Evidence places the arrival of Bos taurus taurus on the Asian continent around 2,500-1,900 B.C. (Cai et al., 2014), and these domesticated cattle then spread from the continent to Japan via Korea (Sasazaki et al., 2006b).

All modern cattle in Japan can be genetically classified as a cross between an imported 

European cattle breed, such as Simmental, and the imported Bos taurus taurus from Korea 

(Gotoh et al., 2018). The animals of focus for this study are a group of American Akaushi cattle 

that have been reproducing in the United States for ~30 years. The Akaushi population in 

America is known to have originated from the Japanese Brown (also sometimes known as Red) 

Wagyu cattle that were imported into the United States from Japan in 1994. Before this small 

nucleus, there existed two Wagyu bulls in the United States which were brought in 1976 and had 

already been used to cross with other breeds. This population has grown from around a dozen 

animals to a large herd of Fullblood and Purebred American Akaushi and Wagyu in the United 

States (Beeman, 2019. Interview).  

The Akaushi breed is known to be classified as a Japanese breed, but this study aims to 

identify relatedness between American Akaushi and other Asian breeds through genomic 

analysis using population parameters. A wide variety of cattle breeds were analyzed and 

compared to the Akaushi to understand the breed in relation to cattle used around the world. The 

methods used to define the American Akaushi population included calculating the level of 

inbreeding, the observed heterozygosity, the number of opposing homozygotes, the measure of 

linkage disequilibrium (LD), the effective population size (Ne), and the genetic relationship 

between the Akaushi population and other traditionally Eastern and Western cattle breeds. 

Identifying the origin of this breed could lead to a better understanding of its breed composition and support better management decisions based on evaluations of other comparable breeds.

Materials and Methods  

Population measures were calculated to better understand the Akaushi population in the 

United States. A principal component analysis (PCA) was first used to understand the breeds 

that were most like the Akaushi based on genotype. After PCA, the breeds that most resembled 

the Akaushi genetically were included with the Akaushi in population parameter analysis. The 

measures used in this study included the level of inbreeding, the observed heterozygosity, the 

number of opposing homozygotes, the measure of linkage disequilibrium, and the effective 

population size.  

Data 

Original animal counts for each breed can be found in Table 5.1. The sample size from 

each breed ranged from 12 to ~1500 animals.  

Breed                  Number of Animals Analyzed
Angus                  112
Charolais              12
Hereford               79
Holstein               978
Limousin               14
Murray Grey            36
Shorthorn              77
Jersey                 688
Hanwoo                 1492
Wagyu                  119
Yeonbyun               63
Akaushi                43
Akaushi Crossbred      56
Brindle                20
Jeju Black             20
Chosun                 20
Hanwoo Population 2    239

Table 5.1 Original animal counts per breed considered in population analysis. 

Imputation 

All genotyped animals were imputed up to a subset 700k chip that contained 507,261 

SNP. Subsetting the 700k SNP panel was necessary to remove those SNP calls that purely 

contained NAs across most breeds. The ~507k SNP subset was chosen because it allows a 

maximum of 10% missing calls among the animals genotyped on the 700k chip.  

The imputation from 50k to 507k SNP chip was done using the phasing program EAGLE 

(Loh et al., 2016) and the imputation program MINIMAC (Das et al., 2016). The phasing and 

imputation protocol that was followed is described by Al-Mamun et al. (2017). 

Genetic relationship to other breeds 

 To understand the genetic background of the Akaushi cattle and how related it was to 

other Asian cattle breeds, principal components were calculated using the Akaushi genetic data 

as well as data from Angus, Charolais, Hanwoo, Hereford, Holstein, Jersey, Limousin, Murray 

Grey, Shorthorn, Wagyu, Yeonbyun, Akaushi crossbreed, Brindle, Chosun, and Jeju Black 

cattle.  

All animals from all breeds mentioned above were imputed up to the subset 700k chip 

which contained 507k SNP. After imputation, a G matrix was constructed with all animals and 

was centered around the mean of the matrix. A singular value decomposition was employed to 

calculate the principal components of the centered G. The first two principal components were 

plotted against each other to understand the genetic variability between all breeds. 

After analysis using PCA for all breeds, the East Asian cattle populations were chosen to 

compare population structure to the American Akaushi breed. These breeds include the Hanwoo, 

Chosun, Jeju Black, Yeonbyun, an Akaushi cross, and Wagyu. These breeds would be the most 

interesting to analyze, as the Akaushi in America has not been thoroughly studied while other 

similar cattle breeds have existed for many years and have been studied extensively.  

Similarities in population parameters between breeds would support conclusions about the 

origins of the Akaushi breed. The comparison used common population parameters: the 

measure of inbreeding, level of heterozygosity, analysis of opposing homozygotes, level of 

linkage disequilibrium (LD), and estimation of the effective population size. 

Level of Inbreeding  

The level of inbreeding in a population can give insight into the population structure 

itself. The inbreeding coefficient explains how related the parents of an animal are. The increase 

in inbreeding can lead to inbreeding depression, which can cause fitness problems in the 

population (Falconer, 1960).  

To calculate inbreeding in the population, a G matrix was constructed by use of the R 

package “snpReady” (Granato & Fritsche-Neto, 2018). The VanRaden method (VanRaden, 

2008) for calculating a G matrix was implemented in this study and is as follows: 

$$G = \frac{ZZ'}{2 \sum_i p_i (1 - p_i)}$$

Where $G$ is the genotype relationship matrix, $Z$ is the matrix of allele counts centered by 

twice the allele frequency, and $p_i$ is the allele frequency at SNP $i$.  

This matrix is the relationship covariance matrix using the genotypes from the SNP chip. 

The diagonal of this matrix contains the inbreeding coefficient for each individual. 
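
The construction above can be sketched in a few lines. This is a minimal illustration in plain Python, not the snpReady implementation used in the study; the function name `vanraden_g` and the toy genotypes are hypothetical, and genotypes are assumed to be coded as 0/1/2 allele counts with no missing calls. In VanRaden's formulation the diagonal element of an animal is commonly read as 1 plus its genomic inbreeding coefficient, so the sketch subtracts 1 from the diagonal.

```python
def vanraden_g(genotypes):
    """Return G = ZZ' / (2 * sum(p_i * (1 - p_i))) for 0/1/2 genotypes.

    genotypes: list of rows, one row of allele counts per animal.
    """
    n_animals = len(genotypes)
    n_snps = len(genotypes[0])
    # Frequency of the counted allele at each SNP.
    p = [sum(row[j] for row in genotypes) / (2.0 * n_animals)
         for j in range(n_snps)]
    # Z: allele counts centered by their expectation 2p.
    z = [[row[j] - 2.0 * p[j] for j in range(n_snps)] for row in genotypes]
    scale = 2.0 * sum(pj * (1.0 - pj) for pj in p)
    return [[sum(z[a][j] * z[b][j] for j in range(n_snps)) / scale
             for b in range(n_animals)]
            for a in range(n_animals)]


# Hypothetical toy data: 4 animals by 4 SNP.
toy_genotypes = [
    [0, 1, 2, 1],
    [2, 1, 0, 1],
    [1, 1, 1, 2],
    [1, 1, 1, 0],
]
G = vanraden_g(toy_genotypes)
# Assumption: genomic inbreeding is taken as diag(G) - 1.
inbreeding = [G[a][a] - 1.0 for a in range(len(G))]
```

With real data the same calculation would normally be done with matrix libraries; the nested loops here only keep the sketch free of dependencies.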

The observed heterozygosity  

Heterozygosity can indicate the amount of genetic variability found in a population. The 

heterozygosity per animal was estimated by: 

$$H_E = f_{AB}$$

Where $f_{AB}$ is the frequency of the heterozygous loci and $H_E$ is the resulting per-animal 

measure of heterozygosity, expressed as a proportion of loci. Heterozygosity was measured per animal per breed. 
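
As an illustration of the per-animal measure, the sketch below computes the proportion of called loci that are heterozygous. It assumes genotypes coded 0/1/2 with `None` for a missing call; the function name and the two example animals are hypothetical.

```python
def observed_heterozygosity(genotype_row):
    """Proportion of called SNP that are heterozygous (coded 1)."""
    called = [g for g in genotype_row if g is not None]
    if not called:
        raise ValueError("animal has no called genotypes")
    return sum(1 for g in called if g == 1) / len(called)


# Hypothetical animals; None marks a missing call.
animals = {
    "A1": [0, 1, 2, 1, None, 1],
    "A2": [0, 0, 2, 2, 1, 0],
}
heterozygosity = {name: observed_heterozygosity(row)
                  for name, row in animals.items()}
```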

The number of opposing homozygotes  

The number of opposing homozygotes between animals checks for Mendelian 

inconsistencies and aids in understanding relationships in the genetic information. If two animals 

have opposing homozygotes at a locus, then, barring genotyping error, they cannot be a 

parent-offspring pair. On the other hand, if animals share an allele at a locus, then there is a 

chance that they might be related (Calus et al., 2011).  

Opposing homozygotes were calculated using methods published by Calus, Mulder, & 

Bastiaansen, 2011.  

$$Op = M_0 M_2'$$

Where $Op$ is the matrix of opposing homozygotes, $M_0$ is a square matrix of 0 and 1 

which correspond to those SNP which are coded as “0”, and $M_2'$ is another square matrix of 0 and 

1 corresponding to those SNP which are coded “2”. 
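
A small sketch of this count follows, assuming 0/1/2 genotype coding with one row per animal. The function name and toy animals are hypothetical. The product of the two indicator matrices counts loci where the row animal is coded 0 and the column animal is coded 2; adding the transpose to obtain a symmetric pairwise count is an assumption made here for illustration, not something stated in the text.

```python
def opposing_homozygotes(genotypes):
    """Return (op, symmetric) counts of opposing homozygotes.

    op[a][b] counts loci where animal a is coded 0 and animal b is coded 2.
    symmetric[a][b] adds the reverse direction (assumed for illustration).
    """
    n = len(genotypes)
    m = len(genotypes[0])
    m0 = [[1 if g == 0 else 0 for g in row] for row in genotypes]
    m2 = [[1 if g == 2 else 0 for g in row] for row in genotypes]
    op = [[sum(m0[a][j] * m2[b][j] for j in range(m)) for b in range(n)]
          for a in range(n)]
    symmetric = [[op[a][b] + op[b][a] for b in range(n)] for a in range(n)]
    return op, symmetric


# Hypothetical trio: a sire, its calf, and an unrelated animal.
toy = [
    [0, 2, 1, 0, 2],  # sire
    [0, 1, 1, 1, 2],  # calf
    [2, 0, 1, 2, 0],  # unrelated
]
op, symmetric = opposing_homozygotes(toy)
```

In this toy example the sire and calf share no opposing homozygotes, while the sire and the unrelated animal differ at four loci, which is the kind of separation the analysis looks for.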

Linkage Disequilibrium  

Linkage disequilibrium (LD) was measured by the rate of LD decay to account for the 

physical distance between SNPs (Single Nucleotide Polymorphisms) that are included in the 

dataset. The decay of LD over distance shows how strongly linked SNPs are in the 

population, which is an important parameter for understanding the history of populations and for 

association studies (Vos et al., 2017). 

To measure LD decay, the average LD was estimated for SNP pairs separated by a given 

distance. This was done using the “snpStats” package in R, which estimates the r² value of LD. 

The LD was then plotted against distances on the genome from 1,000 base pairs to 4,000,000 base pairs. 
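
The binning step can be sketched as follows. This is an illustration only: it uses the squared correlation of 0/1/2 allele counts as r², which is a simplifying assumption and not necessarily the estimator used by snpStats, and the function names, SNP positions, and bin edges are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two SNP genotype vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # monomorphic SNP: r2 is undefined
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def ld_decay(snp_columns, positions, bin_edges):
    """Average r2 per distance bin; bin k covers [edges[k], edges[k + 1])."""
    n_bins = len(bin_edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            distance = abs(positions[j] - positions[i])
            r2 = r_squared(snp_columns[i], snp_columns[j])
            if r2 is None:
                continue
            for k in range(n_bins):
                if bin_edges[k] <= distance < bin_edges[k + 1]:
                    sums[k] += r2
                    counts[k] += 1
                    break
    return [s / c if c else None for s, c in zip(sums, counts)]


# Hypothetical data: three SNP genotyped in six animals.
snps = [
    [0, 0, 1, 1, 2, 2],
    [0, 0, 1, 1, 2, 2],
    [2, 0, 1, 1, 0, 2],
]
positions_bp = [1000, 3000, 2000000]
decay = ld_decay(snps, positions_bp, [1000, 100000, 4000000])
```

In the toy data the close pair is in complete LD and the distant pairs are not, so the average r² drops from the first bin to the second.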

Effective Population Size  

The effective population size parameter explains the rate of change in a population due to 

e.g., genetic drift, selection, bottlenecks and other evolutionary factors. Effective population size 

explains how many animals would be needed in an idealized population to create the same 

amount of genetic variation (Charlesworth, 2009). The larger the effective population size, the 

more variable the population was and vice versa.  

The effective population size T generations ago, $N_{eT}$, was calculated by sampling 30 random 

animals from each breed except for the Brindle, Chosun, and Jeju Black breeds due to the small 

number of total animals in the dataset. After correcting for population size, r² was obtained to 

get the pairwise LD across the genome. The r² was measured between each pair of SNP within each 

breed sample. The Ne was then calculated by using the average r² over a specific distance, which 

corresponds to $N_{eT}$ at a certain time in the past. The equation used to calculate $N_{eT}$ was from de Roos et al. (2008): 

$$N_{eT} = \frac{1}{4c}\left(\frac{1}{\bar{r}^2} - 1\right)$$

where $\bar{r}^2$ is the average r² at marker distance $c$, and $c$ is the marker distance in Morgans 

related to the population size T generations ago (de Roos et al., 2008).  
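
A worked example may help. The sketch below assumes the standard form of this estimator, Ne = (1/(4c))(1/r² − 1), and additionally assumes the commonly used relation T = 1/(2c) between marker distance and generations ago, which the text does not state explicitly. Function names and input values are hypothetical.

```python
def effective_population_size(mean_r2, c_morgans):
    """Ne = (1 / (4c)) * (1 / r2 - 1) for average r2 at distance c (Morgans)."""
    if not 0.0 < mean_r2 <= 1.0:
        raise ValueError("mean r2 must be in (0, 1]")
    if c_morgans <= 0.0:
        raise ValueError("marker distance must be positive")
    return (1.0 / (4.0 * c_morgans)) * (1.0 / mean_r2 - 1.0)


def generations_ago(c_morgans):
    """Assumed relation T = 1 / (2c) linking distance to generations ago."""
    return 1.0 / (2.0 * c_morgans)


# Hypothetical input: average r2 of 0.25 between markers 0.05 Morgans apart.
ne = effective_population_size(0.25, 0.05)
t = generations_ago(0.05)
```

For these inputs the estimate is an Ne of about 15 at roughly 10 generations ago: a higher average r² at the same distance gives a smaller Ne.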

Results  

The principal component analysis plot between the Akaushi population and other 

previously mentioned cattle breeds can be seen in Figure 5.1.  

Figure 5.1 Principal component analysis of all breeds considered in the Akaushi population 
analysis. 

This plot considers all the European and Asian breeds that were considered in this study. 

The principal components which account for variation between each of the breeds are on either 

the x- or y- axes. The largest variation was found between the Asian and European breeds, 

around 50% of the total variation. There was also separation of animals within the continental 

breeds, with the most variation on the y-axis occurring between the Shorthorn and Jersey 

cattle at about 10%.  

The Akaushi population within this figure shows little to no separation from many of the 

other Asian cattle breeds. A distinct separation can be seen within the Asian cattle breeds, with 

one cluster containing Akaushi cattle and the other, Wagyu cattle. This bolsters the claim that 

these cattle are genetically more like non-Japanese cattle. 

The inbreeding within the Akaushi (Fig. 5.2) wavers around 0 (when using a centered G) 

with one outlier around 0.4. The other breeds analyzed also showed the same trend, with some 

outliers in the Hanwoo population (Fig. 5.3) as well.  

Figure 5.2 Inbreeding coefficient of the Akaushi population from the genotypic relationship 
matrix. 

Figure 5.3 Inbreeding coefficient of the Hanwoo population from the genotypic relationship 
matrix 

The level of inbreeding within these breeds could vary depending on the animals that are 

contained within the sample. Larger sample sizes of those breeds that were sampled with a small 

number of animals within this study could shed light on the greater population inbreeding trends. 

The observed heterozygosity within all breeds wavers around 0.2-0.5. This is also seen in 

the Akaushi (Fig. 5.4), which can show that the Akaushi has some variation within the 

population sampled.  

Figure 5.4 Observed heterozygosity in the Akaushi population. 

As in most populations, variation within each breed does occur, with some outliers at 

either extreme. In the Akaushi population, one animal has a lower 

observed heterozygosity than the average, which indicates reduced genetic variation in that animal.  

The opposing homozygote estimation is employed to find animals that have similar 

genetics within a population so as to identify possible related pairs, such as siblings or 

parent-offspring pairs. This analysis may bring those pairs to light and could resolve pedigree discrepancies. 

Within the Akaushi (Fig. 5.5), genetic relatedness is clearly observed between a 

few pairs of animals. Among the other breeds analyzed, the only other distinct separation 

can be found in the Yeonbyun population (Fig. 5.6). 

Figure 5.5 Opposing homozygotes in the Akaushi population.  

Figure 5.6 Opposing homozygotes in the Yeonbyun population. 

LD decay is the measure of LD (on the x-axis) over the distance between SNPs in base 

pairs (y-axis). The LD structure in the Akaushi population (Fig. 5.7) follows the same trend as 

other breeds, such as the Hanwoo (Fig. 5.8), with a tight clustering of points at shorter 

distances between SNPs.  

Figure 5.7 Linkage disequilibrium decay in the Akaushi population. 

Figure 5.8 Linkage disequilibrium decay in the Hanwoo population. 

As the distance between the SNPs grows, the LD becomes more sporadic. The one 

notable difference between the Akaushi population and other populations is that its LD measure is 

larger. The initial cluster starts around 0.3 and many LD points land above 0.7, while the 

highest values from other breeds only reach just above 0.6. 

The effective population size at 1, 5, 10 and 20 generations ago of each breed can be 

found in Table 5.2 and the Ne across multiple generations in all breeds can be visualized in 

Figure 5.9.  

                 Previous Number of Generations
Breeds           1        5        10       20
Angus            3.50     10.87    13.92    17.45
Hanwoo           7.47     22.32    25.05    28.44
Hereford         3.24     9.98     12.63    16.47
Holstein         5.07     15.08    18.16    21.41
Jersey           2.77     10.44    14.00    17.72
Murray Grey      2.29     8.54     11.62    15.14
Shorthorn        2.15     9.17     12.76    16.65
Wagyu            4.29     16.16    21.59    27.58
Yeonbyun         11.18    24.33    26.66    29.59
Akaushi          2.88     9.53     13.57    17.65
Akaushi Cross    6.41     18.13    22.49    26.59
Brindle          5.92     16.69    21.14    25.73
Chosun           6.36     20.76    25.23    28.78
Hanwoo Pop2      7.34     18.97    22.70    26.70
Jeju Black       6.27     19.08    24.07    28.43

Table 5.2 Effective population size at 1, 5, 10 and 20 generations ago of each breed considered. 
Most inbred breeds show very low Ne across generations. 

[Figure: Effective Population Size over Generations. y-axis: Effective Population Size (Ne), 0.00 to 35.00; x-axis: Previous Generations; one series per breed: Angus, Hanwoo, Hereford, Holstein, Jersey, Murray Grey, Shorthorn, Wagyu, Yeonbyun, Akaushi, Akaushi Cross, Brindle, Chosun, Hanwoo Pop2, Jeju Black]

Figure 5.9 Effective population size per 1, 5, 10, and 20 generations ago per breed considered. 

The Akaushi sample was found to have an Ne of about 14 at 10 generations ago, which is one of 

the lower sizes compared to the other breeds analyzed. At 10 generations, the Yeonbyun breed 

was seen to have the highest Ne and the Shorthorn and Jersey breeds were seen to have the 

lowest with values ranging from 11 to 27. Low Ne values are common within breeds which are 

genetically similar and may have smaller population sizes. The Akaushi sample comes from an 

exceedingly small population that does not have a large amount of variation due to the limited 

number of animals, which leads to small Ne estimations.  

Conclusions 

The obvious separation in Figure 5.1 of the Akaushi (and other Asian breeds) from 

European breeds was to be expected. The relationship of the Akaushi to the other Asian breeds 

was consistent with the breed background, as it was genetically most like the Hanwoo. This 

analysis confirms the relatedness to the Hanwoo, as the Akaushi and Hanwoo seem to have a 

large overlap in the PC analysis. This was also to be expected, as the lineage of the Akaushi 

population, as well as that of many Japanese breeds, is said to trace back to cattle imported from Korea 

(Sasazaki et al., 2006). Previous studies have also outlined the similarities between the Japanese 

breeds and Hanwoo (Kawaguchi et al., 2022) and showed overlap between Japanese cattle, 

which were raised in Japan, and Hanwoo through PCA. This study upholds these 

mirrored findings between the American Akaushi and Hanwoo populations and helps to visualize 

the association of underlying population structure within the PC analysis.  

The results of the analysis of population parameters within all Asian breeds show that the 

Akaushi population was very closely related within itself, but not unlike other Asian breeds. The inbreeding 

within the Akaushi showed that some animals in the population do have a large inbreeding 

coefficient, which could be due to the small number of Akaushi currently present in the United 

States. The similarities within the Akaushi population can also be seen in the measure of 

opposing homozygotes, which shows a very clear separation of unrelated from related pairs. This 

separation shows that some animals are very closely related, including a parent-offspring pair.  

The measure of LD decay specifically shows that the Akaushi population has a high 

amount of LD, even with respect to other Asian breeds. With high LD in this population, we can 

assume large segments of the genome are linked together, so that these large 

blocks (or haplotype blocks) are inherited together (Slatkin, 2008). This contributes to the measure of 

homozygosity in the population. This increased homozygosity can lead to recessive homozygous 

traits becoming common within the population. The sample of Akaushi cattle used here is shown 

to be very related within itself, but there was still some genetic variation found. The 

heterozygosity per animal shows some variation within this group of animals, even if it was still 

quite low. Identifying variation in this population and capitalizing on such genetic variation can 

curb the possible negative effects of breeding animals that are genetically similar (Curik et al., 

2014), the most common of these effects being inbreeding depression.  

The effective population size of the Akaushi compared to other breeds shows that this 

population was similar because of the estimated small Ne. This was to be expected because of 

the background of the Akaushi breed in the United States as well as the small number 

of animals analyzed. Other studies in Korean Hanwoo cattle also show a decline in the effective 

population size in recent years (Li & Kim, 2015).  

The analysis of the American Akaushi breed in comparison to Eastern and Western 

breeds shows that the relatedness between the Akaushi and the Hanwoo is strong, which is to be 

expected due to the breed origin background. When American Akaushi population parameters 

were compared to the population parameters of other Asian breeds, many similarities arose, such 

as the similar amount of heterozygosity and inbreeding. The main difference when comparing 

the Akaushi with other Asian breeds was the measure of LD decay within the populations. The 

Akaushi was shown to have more LD within the genome, which suggests that the Akaushi 

population has not varied much through generations and is becoming more genetically related. 

These population parameters have shed light on the American Akaushi population. Most 

importantly, this population is most like Korean breeds and may lack genetic diversity due to the 

breeding structure. 

LITERATURE CITED 

Al-Mamun, H. A., Bernardes, P. A., Lim, D., Park, B., & Gondro, C. (2017). A guide to 

imputation of low density single nucleotide polymorphism data up to sequence level. 
Journal of Animal Breeding and Genomics.  

Bruford, M. W., Bradley, D. G., & Luikart, G. (2003). DNA markers reveal the complexity of 

livestock domestication. Nature Reviews Genetics, 4(11), 900–910.  

Cai, D., Sun, Y., Tang, Z., Hu, S., Li, W., Zhao, X., Xiang, H., & Zhou, H. (2014). The origins 

of Chinese domestic cattle as revealed by ancient DNA analysis. Journal of 
Archaeological Science, 41, 423–434.  

Calus, M. P., Mulder, H. A., & Bastiaansen, J. W. (2011). Identification of Mendelian 

inconsistencies between SNP and pedigree information of sibs. Genetics Selection 
Evolution, 43(1), 34.  

Charlesworth, B. (2009). Effective population size and patterns of molecular evolution and 

variation.  

Curik, I., Ferenčaković, M., & Sölkner, J. (2014). Inbreeding and runs of homozygosity: A 

possible solution to an old problem. Livestock Science.  

Das, S., Forer, L., Schönherr, S., Sidore, C., Locke, A. E., Kwong, A., Vrieze, S. I., Chew, E. Y., 
Levy, S., McGue, M., Schlessinger, D., Stambolian, D., Loh, P.-R., Iacono, W. G., 
Swaroop, A., Scott, L. J., Cucca, F., Kronenberg, F., Boehnke, M., … Fuchsberger, C. 
(2016). Next-generation genotype imputation service and methods. Nature Genetics, 
48(10), 1284–1287.  

de Roos, A. P. W., Hayes, B. J., Spelman, R. J., & Goddard, M. E. (2008). Linkage 

Disequilibrium and Persistence of Phase in Holstein–Friesian, Jersey and Angus Cattle. 
Genetics, 179(3), 1503–1512.  

Falconer, D. (1960). Introduction to quantitative genetics. 

Gotoh, T., Nishimura, T., Kuchida, K., & Mannen, H. (2018). The Japanese Wagyu beef 

industry: current situation and future prospects - A review. Asian-Australasian Journal of 
Animal Sciences, 31(7), 933–950.  

Granato, I., & Fritsche-Neto, R. (2018). snpReady: Preparing Genotypic Datasets in Order to 

Run Genomic Analysis. R package version 0.9.6. 

Kawaguchi, F., Nakamura, M., Kobayashi, E., Yonezawa, T., Sasazaki, S., & Mannen, H. 

(2022). Comprehensive assessment of genetic diversity, structure, and relationship in four 
Japanese cattle breeds by Illumina 50 K SNP array analysis. Animal Science Journal, 
93(1), e13770.  

Li, Y., & Kim, J. J. (2015). Effective population size and signatures of selection using bovine 
50K SNP chips in Korean native cattle (Hanwoo). Evolutionary Bioinformatics.  

Loftus, R. T., MacHugh, D. E., Bradley, D. G., Sharp, P. M., & Cunningham, P. (1994). 

Evidence for two independent domestications of cattle. Proceedings of the National 
Academy of Sciences of the United States of America, 91(7), 2757–2761.  

Loh, P.-R., Danecek, P., Palamara, P. F., Fuchsberger, C., A Reshef, Y., K Finucane, H., 

Schoenherr, S., Forer, L., McCarthy, S., Abecasis, G. R., Durbin, R., & L Price, A. 
(2016). Reference-based phasing using the Haplotype Reference Consortium panel. 
Nature Genetics, 48(11), 1443–1448.  

Sasazaki, S., Odahara, S., Hiura, C., Mukai, F., & Mannen, H. (2006). Mitochondrial DNA 

Variation and Genetic Relationships in Japanese and Korean Cattle. Asian-Australasian 
Journal of Animal Sciences, 19(10), 1394–1398.  

Slatkin, M. (2008). Linkage disequilibrium--understanding the evolutionary past and mapping 

the medical future. Nature Reviews. Genetics, 9(6), 477–485.  

VanRaden, P. M. (2008). Efficient Methods to Compute Genomic Predictions. Journal of Dairy 

Science, 91(11), 4414–4423.  

Vos, P. G., Paulo, M. J., Voorrips, R. E., Visser, R. G. F., van Eck, H. J., & van Eeuwijk, F. A. 
(2017). Evaluation of LD decay and various LD-decay estimators in simulated and SNP-
array data of tetraploid potato. Theoretical and Applied Genetics, 130(1), 123–135.  

CHAPTER 6: Estimation of Within and Across Breed Prediction Accuracies in the Wagyu 

Population in the United States and the Korean Hanwoo 

Hanna Ostrovski, Daniela Lourenco, Andra Nelson, Cedric Gondro 

Abstract 

Understanding the population structure among the various Wagyu subtypes outside of Japan, 

specifically the Red and Black varieties, is essential due to their high value and unique marbling 

characteristics. The comparison between these groups aims to uncover the extent of genomic 

relatedness which can directly impact the accuracy of genomic prediction of breeding values. 

The relationship between the Korean Hanwoo and US Wagyu is also 

explored, due to historical accounts that suggest that Japanese animals may have originated from 

Korea. Relationships between all breeds were assessed through Principal Component Analysis 

(PCA) and through estimating genomic prediction accuracies between breeds. Further 

investigation into the shared genetic elements between all cattle breeds was done through 

genome-wide association (GWA). Genomic prediction accuracies obtained through training and 

testing groups utilizing GBLUP using weaning weight as a phenotype for growth showed low 

accuracy between Red and Black Wagyu, around 0.10. A RedBlack population of crossed Red 

and Black Wagyu and the Hanwoo were able to predict other breeds with moderate accuracy, 

ranging from 0.23 to 0.27. To address unbalanced breed group sizes (~150 Black Wagyu versus 

~5000 Red Wagyu), the total population was divided into 10 balanced groups based on animal 

relatedness via the first principal component. Testing prediction accuracies within these splits 

revealed higher accuracies between closely related splits, up to 0.45. Notably, the split involving 

Red Wagyu (1st PC split) and Korean Hanwoo (10th PC split) demonstrated the highest 

accuracy, reinforcing the close genetic relationship between these breeds. The GWA identified 

new genomic regions on chromosomes 6, 10, and 14 associated with growth. These findings 

signify the early stages of unraveling the intricate relationships between different subsets of US 

Wagyu. Utilizing this knowledge in estimating breeding values within Wagyu will impact 

breeding practices, enhancing the selection of desirable traits. 

Introduction 

Genomic prediction between breeds has not yielded very promising results because 

breeds differ in their genomic architecture. This is especially 

important in the Wagyu breed in the United States, as Red and Black Wagyu are registered in the 

same Association and combined in the genomic evaluation. Low accuracy of prediction may 

occur between the two groups due to differences in origin as both Red and Black American 

Wagyu originate from Japan but resided in different prefectures in Japan (Namikawa, 1980). 

Further investigation into the population structure between these two subtypes in the US has not 

been thoroughly explored, but initial papers suggest both groups, and the population, are highly 

inbred (Heffernan, 2022; Scraggs et al., 2014). This is to be expected, as this group of animals 

originated from a couple of bulls that were exported to the US in the 1970s and a larger herd of 

animals brought over in the early 1990s. No other outside genetics have been utilized, so current 

genomic variability relies on the initial structure of animals that were imported. The relationships 

between these animals from the original group of Wagyu imported to the US are now used as the 

base generation, as further genomic information of previous Japanese generations is not available 

for analysis on current US Wagyu. This leaves the American population with low genomic 

variability, as well as a tight genomic population due to low effective population size. 

Genomic prediction within breed has been seen to result in high accuracy (Hayes et al., 

2009; Karaman et al., 2016) due to the underlying genomic architecture of related animals 

between testing and training groups. This predictive power has given those in the cattle industry 

an edge on selection, as animals can be culled from programs before needing to be proved by 

progeny. The accuracy of prediction has been proven to degrade when predicting the 

performance of an animal from a training population that is not related to the animal of interest. 

This usually occurs in prediction between breeds that may not be tightly related. Low accuracy 

of prediction has been reported, with bumps in accuracy attributed to inclusion of crossbred 

animals (Lund et al., 2014; Misztal et al., 2022; Olson et al., 2012), different modeling methods 

(Khansefid et al., 2020) or utilization of whole-genome sequence (Nawaz et al., 2022; Raymond 

et al., 2018).  

Utilization of whole genome sequence data can be advantageous when predicting 

between breeds or groups that are not closely related (Druet et al., 2013; Meuwissen & Goddard, 

2010). This slight bump in accuracy can have rippling effects when considering how heavily 

breeding values are utilized in the cattle industry. Whole genome sequence can uncover more 

similar areas in the genome than a standard SNP chip, as more variants are available to create 

genomic relationships between animals. While increasing the amount of sequence data may 

show improvement in accuracies, other additions in the training group, inclusion of related 

animals, and utilization of other prediction models have also been shown to increase accuracy 

(Meuwissen et al., 2021). The use of whole genome sequence should be explored as an avenue to 

increase accuracy between breeds when all other methods are exhausted, as the cost of obtaining 

WGS has decreased significantly over time.  

Degradation of accuracy also happens per generation, as linkages between the testing and 

training populations can break down if the generations between the two grow farther apart 

(Castro Dias Cuyabano et al., 2019). This is due to linkage disequilibrium breaking down 

through generations through recombination. Differences in breed composition between the 

training and testing populations can decrease accuracy as well (van den Berg et al., 2020), 

especially if one breed is overrepresented in a training group. All these potential breaks in 

connection between the testing and training groups can result in lower prediction accuracies.  

Yet, crossing between the Red and the Black Wagyu in the US population occurs and can 

link these two breed groups together. Across-breed prediction accuracy has been seen to increase 

when links between the two breeds are included, such as these crossbred animals in the US 

Wagyu population. Increases in accuracy have been shown to be directly related to an increase in 

profit (Thomasen et al., 2014), as more accurate breeding values can be assigned to breeding 

animals. 

The selection and origins of many Asian breeds are similar, as they are high marbling 

with a selection pattern of being used as work animals before being selected for beef 

consumption (Gotoh et al., 2018; Motoyama et al., 2016). Further selection within breed to 

create modern breeds, such as the Japanese Black Wagyu or the Korean Hanwoo, would create 

larger genomic differences in modern cattle populations. This study aims at exploring the 

structure of the American Wagyu and Korean Hanwoo population today through genome-wide 

association and estimating the accuracy of prediction between the Red and Black populations, 

including the crossed Red/Black animals.  

Materials and Methods 

Data 

Wagyu data from the American Wagyu Association were utilized. These data include 

animals that are categorized as Black Wagyu, Red Wagyu (also known as Brown or Akaushi 

animals), and Red/Black crosses. Hanwoo data originated from Korea and include animals that 

are 100% Hanwoo. All animals have been genotyped with a bovine SNP chip, the Wagyu 

animals being genotyped with either the 50K Illumina Bovine chip or the 100k Illumina Bovine 

chip. All Hanwoo animals were initially genotyped with a 50K Illumina Bovine chip. All animals 

were phased and subset to 70k after quality control using the Eagle and IMPUTE5 software (Loh et al., 

2016; Rubinacci et al., 2019). Each animal was then imputed up to whole genome sequence (WGS) 

with IMPUTE5 (Rubinacci et al., 2019), using the 1000 bulls as a reference population, 

including Hanwoo and Wagyu animals that were previously sequenced at WGS level. After quality 

control, the 70K chip contained 70,343 SNP and animals with WGS contained 14,454,093 SNP.  

The phenotype used to test the predictive ability between these populations was 

weaning weight. All animals considered within this study had to be genotyped and 

had to include a phenotype of weaning weight in pounds. To scale these two datasets, a linear 

model was employed, with the phenotype of all animals as the dependent variable and the 

variable of breed fit as the independent variable. Residuals from this model were then utilized as 

the adjusted phenotypes for genomic prediction.  
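
The adjustment can be illustrated with a short sketch. When breed is the only term in the linear model, the residuals are the deviations of each record from its breed mean, which is what the code computes. The function name, breed labels, and weights are hypothetical, and the study's actual model may have been fitted with standard statistical software.

```python
from collections import defaultdict


def breed_adjusted_phenotypes(phenotypes, breeds):
    """Residuals from a model with breed as the only fixed effect.

    With a single categorical effect, each residual equals the record
    minus the mean of its breed.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, breed in zip(phenotypes, breeds):
        totals[breed] += value
        counts[breed] += 1
    means = {breed: totals[breed] / counts[breed] for breed in totals}
    return [value - means[breed] for value, breed in zip(phenotypes, breeds)]


# Hypothetical weaning weights (pounds) for two breed groups.
weights = [500.0, 520.0, 540.0, 400.0, 420.0]
labels = ["Wagyu", "Wagyu", "Wagyu", "Hanwoo", "Hanwoo"]
adjusted = breed_adjusted_phenotypes(weights, labels)
```

After adjustment both groups have mean zero, so differences in scale between the two datasets no longer drive the prediction.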

Principal Component Analysis  

Principal component analysis was first used to understand the population structure that 

exists between the Red Wagyu, Black Wagyu, and the Hanwoo. Eigenvalues and 

eigenvectors were obtained from the genetic relationship matrix of all animals, which was created 

following VanRaden (2008): 

$$G = \frac{ZZ'}{2 \sum_i p_i (1 - p_i)}$$

The function ‘eigen’ in R was utilized to obtain the eigenvectors and eigenvalues to plot 

each animal by the principal components that explained most of the variation within the dataset. 

The eigen decomposition of the genomic relationship matrix uncovers population structures that 

are only available through genotype (McVean, 2009; Patterson et al., 2006). Eigenvectors are 

used to understand the grouping of animals per breed, the principal components, while 

eigenvalues explain the variance between the principal components (Karamizadeh et al., 2013). 

The top principal components that explained the most variation were then plotted against each other 

to visualize how these animals grouped together. 
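As a small illustration of the steps above, the sketch below builds a VanRaden-style relationship matrix from 0/1/2 genotype codes and extracts its leading principal component. Several things here are assumptions rather than the procedure used in this study: animals are stored in rows, so the animal-by-animal product is formed from the centred genotype rows; power iteration stands in for the full decomposition returned by R's eigen; and the genotypes are hypothetical.

```python
def genomic_relationship_matrix(genotypes):
    """VanRaden (2008) style G from 0/1/2 genotypes (rows = animals, columns = SNP)."""
    n_animals = len(genotypes)
    n_snp = len(genotypes[0])
    # Allele frequency p_i of each SNP.
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n_animals) for j in range(n_snp)]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    # Centre each genotype by 2 * p_i to obtain Z.
    z = [[row[j] - 2.0 * freqs[j] for j in range(n_snp)] for row in genotypes]
    return [
        [sum(a * b for a, b in zip(z[i], z[k])) / scale for k in range(n_animals)]
        for i in range(n_animals)
    ]


def leading_eigenpair(matrix, iterations=200):
    """Largest eigenvalue and its unit eigenvector, found by power iteration."""
    n = len(matrix)
    vec = [float(i + 1) for i in range(n)]  # arbitrary non-constant start vector
    for _ in range(iterations):
        prod = [sum(matrix[i][k] * vec[k] for k in range(n)) for i in range(n)]
        norm = sum(v * v for v in prod) ** 0.5
        vec = [v / norm for v in prod]
    prod = [sum(matrix[i][k] * vec[k] for k in range(n)) for i in range(n)]
    eigenvalue = sum(v * p for v, p in zip(vec, prod))
    return eigenvalue, vec


# Hypothetical genotypes: two animals from each of two divergent groups.
genotypes = [[2, 2, 0, 0], [2, 1, 0, 0], [0, 0, 2, 2], [0, 0, 1, 2]]
G = genomic_relationship_matrix(genotypes)
eigenvalue, pc1 = leading_eigenpair(G)
variance_explained = eigenvalue / sum(G[i][i] for i in range(len(G)))
```

In this toy example the two groups receive PC1 values of opposite sign, which is the kind of separation a PCA plot makes visible. The sign of an eigenvector is arbitrary, so only the relative positions of animals are meaningful.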

Obtaining Accuracy of Prediction 

Training and testing groups were used to assess the accuracy of prediction in each
prediction scenario. The number of animals in each breed used for the within- and across-breed
prediction scenarios is given in Table 6.1:

Breed Type 

Wagyu Red/Black 

Wagyu Red 

Wagyu Black 

Cross 

Hanwoo 

4766 

147 

598 

3096 

Number of 

Animals  

Table 6.1. Number of animals considered for each breed group; Wagyu and Hanwoo. 

Prediction scenarios were run within each breed with an 80/20 split for training and
testing groups. Across-breed scenarios were also run, with each breed combination considered
for training and testing. Additional scenarios were considered because the breed groups were
unbalanced; principal components were employed to create balanced training and testing groups
based on genotypic relationships. These groups were created by sorting all animals' scores
on the first and second principal components, after which the animals were split across
PC1. See Fig. 6.1 for a visualization of the splits using the PCA scores:

Figure 6.1. Principal Component splits over all animals using the first principal component 
scores as a baseline. Equal number of animals were represented in each split. 
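The balanced splits can be sketched as below. This is a minimal illustration, assuming the animals are ordered by their PC1 score alone and cut into consecutive groups of (near) equal size; the function name and the scores are hypothetical.

```python
def split_by_pc1(pc1_scores, n_splits):
    """Sort animals by PC1 score and cut the ordering into near-equal groups.

    Returns a list of groups, each holding the indices of its animals.
    """
    order = sorted(range(len(pc1_scores)), key=lambda i: pc1_scores[i])
    base, extra = divmod(len(order), n_splits)
    groups = []
    start = 0
    for g in range(n_splits):
        size = base + (1 if g < extra else 0)  # spread any remainder over the first groups
        groups.append(order[start:start + size])
        start += size
    return groups


# Hypothetical PC1 scores for six animals, split three ways.
scores = [0.9, -0.5, 0.1, -0.9, 0.4, 0.0]
groups = split_by_pc1(scores, 3)
```

Because neighbouring groups hold animals with similar PC1 scores, groups that are far apart in the ordering are also the least related, which is what the split-by-split comparison relies on.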

All scenarios were run at 70K density and WGS density. Both Hanwoo and Wagyu
populations used phenotypes pre-adjusted for farm, sex, and date of birth as fixed
effects. Further centering between the Hanwoo and Wagyu populations was done by
fitting breed as a fixed effect to the adjusted phenotype. All breeding values were predicted
through GBLUP:

𝑦 = 𝑋𝑎 + 𝑒 

Where $y$ is the adjusted phenotype, $X$ is the design matrix connecting phenotypes to
genetic values, $a$ is the vector of random animal effects, $a \sim N(0, G\sigma_a^2)$, and $e$ is
the vector of residual effects, where we assume $e \sim N(0, I\sigma_e^2)$. Accuracy of prediction
was calculated as the Pearson correlation of predicted EBV and adjusted phenotype within the training group.
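The accuracy measure can be written out directly. The sketch below computes a Pearson correlation between two equal-length lists; the EBV and phenotype values are hypothetical and serve only to show the calculation.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical predicted EBV and adjusted phenotypes for five animals.
predicted_ebv = [1.2, -0.4, 0.3, 2.1, -1.5]
adjusted_phenotype = [10.0, -6.0, 8.0, 15.0, -20.0]
accuracy = pearson_correlation(predicted_ebv, adjusted_phenotype)
```

A value near 1 means the ranking of animals by predicted EBV closely follows their adjusted phenotypes, while a value near 0 means the prediction carries little information about the phenotype.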

Genome-Wide Association Study 

A final step in understanding the significant areas on the genome was a genome-wide
association study. This was done by backsolving the SNP effects, then obtaining
the p-value of each SNP effect. This model estimates the genomic EBVs ($\hat{a} = Z\hat{g}$) through a
snpGBLUP:

$$y = Mg + e$$

Where $y$ is the adjusted phenotype, $M$ is a matrix of dimension $n \times m$, where $n$ is the
number of animals and $m$ is the number of markers, $g$ is the vector of random SNP effects,
assumed identically distributed as $g \sim N(0, I\sigma_g^2)$, and $e$ is the vector of residual
effects, where we assume $e \sim N(0, I\sigma_e^2)$.

Backsolving of SNP effects was done through this equation:

$$EFF = \frac{1}{d} \cdot Z \cdot G \cdot \hat{a}$$

Where $d = 2 \sum p_i (1 - p_i)$. To obtain p-values of each SNP effect we can use:

$$pval_i = 2\left(1 - \Phi\left(\left|\frac{\widehat{EFF}_i}{sd(\widehat{EFF}_i)}\right|\right)\right)$$

Which gives the significance of the SNP effect $\widehat{EFF}_i$ as $pval_i$ through the density
distribution (t-distribution). All p-values were corrected using the Bonferroni correction to
account for multiple comparisons, dividing by the number of SNPs. Significant SNP across the
genome were identified visually by way of Manhattan plots. SNP whose $-\log_{10}$ p-values
crossed the threshold of $-\log_{10}(5 \times 10^{-8})$ were considered significant.
Locations of significant SNP were evaluated within the Ensembl database for candidate genes.
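The p-value and threshold calculations can be illustrated with a short sketch. It assumes that $\Phi$ is the standard normal cumulative distribution function, so the two-sided p-value equals erfc(|z| / sqrt(2)); the function names are hypothetical, and the Bonferroni helper shows the usual form of dividing a significance level by the number of tests.

```python
import math


def two_sided_p_value(effect, sd):
    """Two-sided p-value 2 * (1 - Phi(|effect / sd|)), with Phi the standard normal CDF."""
    z = abs(effect / sd)
    return math.erfc(z / math.sqrt(2.0))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after dividing alpha by the number of tests."""
    return alpha / n_tests


def is_significant(p_value, threshold=5e-8):
    """True when -log10(p) lies above -log10(threshold) on a Manhattan plot."""
    if p_value == 0.0:
        return True  # p underflowed to zero, so it is below any positive threshold
    return -math.log10(p_value) > -math.log10(threshold)


# Hypothetical standardized SNP effects, for illustration only.
strong = two_sided_p_value(6.0, 1.0)
weak = two_sided_p_value(3.0, 1.0)
wgs_level = bonferroni_threshold(0.05, 14454093)
```

With the 14,454,093 SNP retained at WGS density, a Bonferroni-adjusted level of 0.05 works out to roughly 3.5 x 10^-9, which is stricter than the 5 x 10^-8 line.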

Results 

A breakdown of population structure between all groups can be seen in the principal 

component analysis in Figure 6.2.  

Figure 6.2. Principal Component Analysis of all animals considered. PC1, the largest component,
accounted for 90% of the variation between the Hanwoo and Wagyu populations.

It is apparent that each breed type is distinct enough to separate into groups, while the
Red/Black animals, as expected, connect the Red and Black Wagyu groups. Some animals within
that group may be misidentified, as some crosses are closer to Red Wagyu and some Black
animals gravitate towards the mean. Crossing and backcrossing of animals is common within
the breed, so some crosses may be genetically more like the Fullblood Red Wagyu than a Black Wagyu.

The prediction accuracies for each scenario can be found in Table 6.2 and Table 6.3. The
accuracies increased when utilizing whole-genome sequence, especially in the across-breed
scenarios. Prediction accuracies did not increase in the RedBlack population when
using the WGS. This could be due to the nature of imputation to WGS, as no animals in the
reference population for imputation were RedBlack Wagyu crosses.

70K        Red        RedBlack   Black      Hanwoo     All
Red        0.374594   0.335652   -0.11648   0.085771   0.154744
RedBlack   0.217795   0.383666   0.198072   0.086485   0.235117
Black      0.12328    0.302299   0.11055    0.077759   0.122974
Hanwoo     0.148088   0.273783   0.054902   0.19351    0.223874
All        0.229882   0.413217   0.133042   0.101837   0.38774

Table 6.2. Prediction accuracies using GBLUP analysis between all breed types at 70K density
chip. The training groups are in the rows, and testing groups in columns. Diagonal, within-breed
accuracies were split 80/20.

WGS        Red        RedBlack   Black      Hanwoo     All
Red        0.371206   0.341941   -0.03513   0.083345   0.160455
RedBlack   0.214048   0.385192   0.186228   0.064573   0.224814
Black      0.133806   0.300887   0.142109   0.072744   0.126242
Hanwoo     0.182419   0.260344   0.128508   0.192171   0.23864
All        0.233802   0.408325   0.167007   0.095279   0.389136

Table 6.3. Prediction accuracies using GBLUP analysis between all breed types at whole genome
sequence. The training groups are in the rows, and testing groups in columns. Diagonal,
within-breed accuracies were split 80/20.

These accuracies reflect the structure seen within the PCA, with random sampling of 

animals from all breed groups having the highest accuracy. The lowest accuracies occurred 

between the Red and Black Wagyu.  

Creating training and testing groups with the first principal component yielded results 

found in Table 6.4.  

PC Split  1         2         3         4         5         6         7         8         9         10
1         0.741538  0.22561   0.274736  0.274841  0.211956  0.148241  -0.0684   0.092799  0.080853  0.124792
2         0.232632  0.757092  0.310467  0.313882  0.232305  0.254334  0.140827  0.075368  0.067799  0.058163
3         0.246209  0.287265  0.771826  0.296464  0.255255  0.288262  -0.01909  0.052816  0.056025  0.096405
4         0.253963  0.285115  0.292298  0.760715  0.257011  0.256342  0.153722  0.049149  0.045374  0.077921
5         0.182446  0.204958  0.260903  0.256986  0.786462  0.297649  0.328171  0.096377  0.067579  0.043934
6         0.162094  0.254469  0.276273  0.257071  0.28735   0.766839  0.457719  0.051433  0.078018  0.059165
7         0.120543  0.115265  0.108513  0.020819  0.100567  0.161845  0.762403  0.118151  0.086604  0.060939
8         0.090184  0.098663  0.098948  0.083739  0.130673  0.060615  0.180494  0.891996  0.164283  0.09978
9         0.087728  0.070019  0.056425  0.050179  0.108749  0.111891  0.242234  0.194645  0.884068  0.140601
10        0.140437  0.058859  0.114395  0.108622  0.052981  0.076539  0.271762  0.062811  0.108292  0.832933

Table 6.4. Prediction accuracies using balanced principal component splits, utilizing the first PC split 10 ways. Most related groups
showed the highest prediction accuracies. The training groups are in the rows, and testing groups in columns.

The within-PC-split accuracies on the diagonal were larger than the within-breed
accuracies found in Tables 6.2 and 6.3. This can be attributed to the balance of genomic
relatedness within these PC splits, as well as the balance of samples per PC split. On average,
the farther apart the PC split groups are, the lower the prediction accuracy. In many scenarios,
using split 10 (which included all Hanwoo animals) as the training group yielded higher
accuracies than other splits.

The GWAS for the 70k and WGS of all animals can be found in Fig 6.3 and Fig 6.4 

respectively.  

Figure 6.3. Genome Wide Association Study for all animals at 70K for the phenotype of weaning 
weight.  

Figure 6.4. Genome Wide Association Study for all animals at whole genome sequence for the 
phenotype of weaning weight. 

In Fig 6.3, a clear peak can be seen on chromosome 14. When compared against
QTL previously reported in cattle, the peak corresponded to the PLAG1 locus, which has been
previously identified as a growth locus on the genome (see all significant SNP in Table 6.5).

In Fig 6.4, whole-genome analysis identified more areas on the genome, specifically
areas on chromosomes 6, 10 and 14. All significant SNP associated with weaning weight (cattle
growth) can be seen in the supplemental information. The areas identified included the PLAG1
locus on chromosome 14, PTGR2 on chromosome 10 (which is part of the fatty-acid
metabolism pathway), and NPBWR1 on chromosome 14 (which is involved in the food regulation
pathway). The significant runs identified on chromosome 6 did not include any significant SNP
that are available within the Ensembl browser for Bos taurus. Previous studies have identified
QTL on chromosome 6 associated with growth in Hanwoo and Piedmontese cattle (Bongiorni et al.,
2012; Naserkheil et al., 2021): LCORL, NCAPG and LAP3. These previously identified QTL are
located between the 2 significant peaks identified in this study, which span
chromosome 6 positions 35550018-35560448 and 38559100-38569258.

SNP Name          CHR  POSITION  QTL
14:20640612_G_A   14   20640612  PLAG1
14:20642540_A_G   14   20642540  PLAG1
14:20646499_G_A   14   20646499  PLAG1
14:23343150_A_G   14   23343150  PLAG1
14:23446328_A_C   14   23446328  PLAG1
14:23630896_T_C   14   23630896  PLAG1
14:24026168_A_G   14   24026168  PLAG1
14:25839968_G_A   14   25839968  PLAG1

Table 6.5. All significant SNP in the genome wide association study at 70K with weaning
weight as the phenotype.

Conclusions 

The US Wagyu breed sub-types separated distinctly, as illustrated through Principal
Component Analysis (PCA), which offered intriguing insights into the relationship between the
Wagyu and the Hanwoo. Distinct separations among the Wagyu and Hanwoo populations can
be seen, underlining their genetic divergence. Yet, when between-breed genomic prediction
was done between the Red Wagyu and the Hanwoo, it showed a moderate prediction accuracy.
This result supports the ancestral links between the Hanwoo and Red (Akaushi) Wagyu
and points to possible inconsistencies within the PCA in this study. Inclusion of other cattle
breeds may aid in uncovering the true relationships in the PCA, as the prediction accuracy
suggested the Hanwoo and Red Wagyu should group closer together.

Interestingly, a significant separation was evident between Red and Black Wagyu. This
distinct divide was further emphasized in the prediction accuracies between the two breeds,
which were the lowest of any scenario. In practice, all Wagyu are considered together when
estimating breeding values. It is necessary to include the RedBlack cross animals to increase
accuracy within the whole US Wagyu population, as this group of crossbred animals showed
high prediction accuracy when predicting both the Red and Black Wagyu populations.
Whole-genome sequence did increase prediction accuracy, but its implementation in industry
settings might be hindered by cost and the computational demands of handling extensive
data.

Genome-Wide Association Studies (GWAS) uncovered notable regions on the genome 

within this population for growth. While some regions corresponded to previously identified 

Quantitative Trait Loci (QTL), specifically PLAG1, others unveiled unreported areas in Asian 

populations. These newfound genomic peaks could uncover QTL or represent novel regions in 

growth traits in Asian cattle breeds. 

Moving forward, deeper analyses within these groups should prioritize carcass-based 

phenotypes, aligning with the breeds' renowned high-marbling qualities. With prediction 

accuracy observed to be low between Wagyu sub-breeds, the inclusion of RedBlack animals will 

be pivotal in keeping and improving accuracy levels in a whole population analysis of the US 

Wagyu. Balancing the differing breed structure of the US Wagyu population will be crucial in 

guiding future research and estimating breeding values within the industry. 

LITERATURE CITED 

Bongiorni, S., Mancini, G., Chillemi, G., Pariset, L., & Valentini, A. (2012). Identification of a 

short region in chromosome 6 affecting direct calving ease in Piedmontese cattle breed-
Supplementary material. PLoS One, 7(12). 

Castro Dias Cuyabano, B. C. D., Wackel, H., Shin, D., & Gondro, C. (2019). A study of 

Genomic Prediction across Generations of Two Korean Pig Populations. Animals, 9(9), 
672.  

Druet, T., Macleod, I. M., & Hayes, B. J. (2013). Toward genomic prediction from whole-genome
sequence data: impact of sequencing design on genotype imputation and accuracy
of predictions. Heredity, 112(1), 39–47.

Gotoh, T., Nishimura, T., Kuchida, K., & Mannen, H. (2018). The Japanese Wagyu beef 

industry: Current situation and future prospects - A review. In Asian-Australasian Journal 
of Animal Sciences (Vol. 31, Issue 7, pp. 933–950). Asian-Australasian Association of 
Animal Production Societies.  

Hayes, B. J., Visscher, P. M., & Goddard, M. E. (2009). Increased accuracy of artificial selection 

by using the realized relationship matrix. Genetics Research, 91(1), 47–60.  

Heffernan, K. (2022). Evaluating the Genetic Architecture of the Japanese Wagyu Breed Within 

the United States. 

Karaman, E., Cheng, H., Firat, M. Z., Garrick, D. J., & Fernando, R. L. (2016). An Upper Bound 

for Accuracy of Prediction Using GBLUP. PLOS ONE, 11(8), e0161054.  

Karamizadeh, S., Abdullah, S. M., Manaf, A. A., Zamani, M., & Hooman, A. (2013). An 

Overview of Principal Component Analysis. Journal of Signal and Information Processing, 
04(03), 173–175.  

Khansefid, M., Goddard, M. E., Haile-Mariam, M., Konstantinov, K. V., Schrooten, C., de Jong, 

G., Jewell, E. G., O’Connor, E., Pryce, J. E., Daetwyler, H. D., & MacLeod, I. M. (2020). 
Improving Genomic Prediction of Crossbred and Purebred Dairy Cattle. Frontiers in 
Genetics, 11.  

Li, H. (2018). Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics, 34(18), 

3094–3100.  

Loh, P.-R., Danecek, P., Palamara, P. F., Fuchsberger, C., Reshef, Y. A., Finucane, H. K.,
Schoenherr, S., Forer, L., McCarthy, S., Abecasis, G. R., Durbin, R., & Price, A. L. (2016).
Reference-based phasing using the Haplotype Reference Consortium panel. Nature
Genetics, 48(11), 1443–1448.

Lund, M. S., Su, G., Janss, L., Guldbrandtsen, B., & Brøndum, R. F. (2014). Genomic evaluation 

of cattle in a multi-breed context. Livestock Science, 166(1), 101–110.  

McVean, G. (2009). A genealogical interpretation of principal components analysis. PLoS 

Genetics, 5(10).  

Meuwissen, T., & Goddard, M. (2010). Accurate Prediction of Genetic Values for Complex 

Traits by Whole-Genome Resequencing. Genetics, 185(2), 623–631.  

Meuwissen, T., van den Berg, I., & Goddard, M. (2021). On the use of whole-genome sequence 
data for across-breed genomic prediction and fine-scale mapping of QTL. Genetics 
Selection Evolution, 53(1).  

Misztal, I., Steyn, Y., & Lourenco, D. A. L. (2022). Genomic evaluation with multibreed and 

crossbred data. JDS Communications, 3(2), 156–159.  

Motoyama, M., Sasaki, K., & Watanabe, A. (2016). Wagyu and the factors contributing to its 

beef quality: A Japanese industry overview. Meat Science, 120, 10–18.  

Namikawa, K. (1980). Breeding History Of Japanese Beef Cattle And Preservation Of Genetic 

Resources As Economic Farm Animals. 

Naserkheil, M., Mehrban, H., Lee, D., & Park, M. N. (2021). Genome-wide Association Study 

for Carcass Primal Cut Yields Using Single-step Bayesian Approach in Hanwoo Cattle. 
Frontiers in Genetics, 12, 752424.  

Nawaz, M. Y., Bernardes, P. A., Savegnago, R. P., Lim, D., Lee, S. H., & Gondro, C. (2022). 

Evaluation of Whole-Genome Sequence Imputation Strategies in Korean Hanwoo Cattle. 
Animals, 12(17), 2265.  

Olson, K. M., VanRaden, P. M., & Tooker, M. E. (2012). Multibreed genomic evaluations using 
purebred Holsteins, Jerseys, and Brown Swiss. Journal of Dairy Science, 95(9), 5378–
5383.  

Patterson, N., Price, A. L., & Reich, D. (2006). Population Structure and Eigenanalysis. PLoS 

Genetics, 2(12), 2074–2093.  

Raymond, B., Bouwman, A. C., Schrooten, C., Houwing-Duistermaat, J., & Veerkamp, R. F. 

(2018). Utility of whole-genome sequence data for across-breed genomic prediction. 
Genetics Selection Evolution, 50(1), 1–12. 

Scraggs, E., Zanella, R., Wojtowicz, A., Taylor, J. F., Gaskins, C. T., Reeves, J. J., de Avila, J. 

M., & Neibergs, H. L. (2014). Estimation of inbreeding and effective population size of 
full-blood wagyu cattle registered with the American Wagyu Cattle Association. Journal of 
Animal Breeding and Genetics, 131(1), 3–10.  

Thomasen, J. R., Egger-Danner, C., Willam, A., Guldbrandtsen, B., Lund, M. S., & Sørensen, A. 
C. (2014). Genomic selection strategies in a small dairy cattle population evaluated for 
genetic gain and profit. Journal of Dairy Science, 97(1), 458–470.  

van den Berg, I., MacLeod, I. M., Reich, C. M., Breen, E. J., & Pryce, J. E. (2020). Optimizing 

genomic prediction for Australian Red dairy cattle. Journal of Dairy Science, 103(7), 6276–
6298.  

VanRaden, P. M. (2008). Efficient Methods to Compute Genomic Predictions. Journal of Dairy 

Science, 91(11), 4414–4423.  

Supplemental Tables 

SNP Name  CHR  POSITION
1:85082264_C_T  1  85082264
1:85082469_C_T  1  85082469
2:32148926_G_A  2  32148926
2:32152141_T_A  2  32152141
2:32153462_G_A  2  32153462
2:32157167_G_A  2  32157167
2:32160234_A_G  2  32160234
2:32163031_A_G  2  32163031
3:119087112_T_C  3  119087112
3:120616386_T_G  3  120616386
4:61336290_T_C  4  61336290
4:95309987_G_A  4  95309987
4:95319999_T_G  4  95319999
4:95339993_C_T  4  95339993
4:105609727_T_A  4  105609727
6:35550018_T_A  6  35550018
6:35550066_T_A  6  35550066
6:35550067_T_A  6  35550067
6:35550287_C_G  6  35550287
6:35556613_T_C  6  35556613
6:35557555_C_T  6  35557555
6:35557917_G_A  6  35557917
6:35557983_T_G  6  35557983
6:35558042_C_T  6  35558042
6:35558523_T_C  6  35558523
6:35558737_A_G  6  35558737
6:35558908_T_C  6  35558908
6:35559215_G_A  6  35559215
6:35559492_A_C  6  35559492
6:35559558_G_A  6  35559558
6:35559635_C_T  6  35559635
6:35559640_A_T  6  35559640
6:35559668_G_A  6  35559668
6:35559795_G_A  6  35559795
6:35559801_G_A  6  35559801
6:35559804_T_C  6  35559804
6:35559823_A_G  6  35559823
6:35559874_C_G  6  35559874
6:35559930_A_G  6  35559930
6:35559958_C_T  6  35559958
6:35559987_G_A  6  35559987
6:35559989_C_A  6  35559989
6:35560103_G_T  6  35560103
6:35560136_G_A  6  35560136
6:35560262_A_T  6  35560262
6:35560448_G_T  6  35560448
6:38559100_T_A  6  38559100
6:38559119_A_G  6  38559119
6:38559838_C_T  6  38559838
6:38559871_A_G  6  38559871
6:38559901_G_A  6  38559901
6:38559927_G_A  6  38559927
6:38560042_C_T  6  38560042
6:38560045_T_C  6  38560045
6:38560268_A_G  6  38560268
6:38560977_T_C  6  38560977
6:38560980_T_C  6  38560980
6:38561113_T_C  6  38561113
6:38561154_C_G  6  38561154
6:38561156_G_A  6  38561156
6:38561261_A_G  6  38561261
6:38561900_G_A  6  38561900
6:38562229_A_G  6  38562229
6:38562559_C_T  6  38562559
6:38562675_A_G  6  38562675
6:38563338_T_G  6  38563338
6:38563658_C_G  6  38563658
6:38563833_A_T  6  38563833
6:38564037_T_G  6  38564037
6:38564080_T_C  6  38564080

Supplemental Table 6.6: Name, chromosome, and position of each significant SNP 
related to growth in Asian cattle identified through genome-wide association with whole-
genome sequence.  

Supplemental Table 6.6 (cont’d) 
6:38564419_T_C  6  38564419
6:38565084_T_C  6  38565084
6:38565365_C_G  6  38565365
6:38566242_A_G  6  38566242
6:38566447_C_T  6  38566447
6:38568231_C_T  6  38568231
6:38569258_T_C  6  38569258
7:93070914_G_A  7  93070914
8:31787017_G_T  8  31787017
9:89474378_C_T  9  89474378
9:89480926_C_T  9  89480926
9:89480968_G_A  9  89480968
10:74881199_A_G  10  74881199 
10:85206757_G_A  10  85206757 
10:85207181_A_G  10  85207181 
10:85212383_C_T  10  85212383 
10:85216887_T_C  10  85216887 
10:85221410_T_G  10  85221410 
10:85228665_C_T  10  85228665 
10:85230003_C_T  10  85230003 
10:85231678_C_T  10  85231678 
10:85244139_A_C  10  85244139 
10:85336209_A_G  10  85336209 
10:85336554_C_T  10  85336554 
10:85338651_C_A  10  85338651 
14:20089453_T_C  14  20089453 
14:20222988_C_A  14  20222988 
14:20223163_C_T  14  20223163 
14:20269421_C_T  14  20269421 
14:21609324_C_T  14  21609324 
14:21609334_C_T  14  21609334 
14:21609399_G_C  14  21609399 
14:21609456_C_T  14  21609456 
14:21609557_C_T  14  21609557 
14:21609685_G_T  14  21609685 
14:21609731_C_T  14  21609731 
14:21609861_G_A  14  21609861 
14:21610015_G_A  14  21610015 
14:21610069_G_A  14  21610069 
14:21610423_A_T  14  21610423 
14:21610510_C_T  14  21610510 
14:21610668_T_C  14  21610668 

14:21610669_T_G  14  21610669 
14:21610861_G_A  14  21610861 
14:21610871_C_T  14  21610871 
14:21610876_G_T  14  21610876 
14:21610895_A_C  14  21610895 
14:21610975_T_C  14  21610975 
14:21611515_A_G  14  21611515 
14:21611539_A_G  14  21611539 
14:21612303_A_G  14  21612303 
14:21612536_G_A  14  21612536 
14:21612729_T_A  14  21612729 
14:21612745_G_T  14  21612745 
14:21612768_C_T  14  21612768 
14:21612786_C_A  14  21612786 
14:21612829_C_A  14  21612829 
14:21612930_T_C  14  21612930 
14:21612980_A_C  14  21612980 
14:21613086_T_C  14  21613086 
14:21613099_T_G  14  21613099 
14:21613232_T_C  14  21613232 
14:21613332_G_A  14  21613332 
14:21613344_G_A  14  21613344 
14:21613362_G_A  14  21613362 
14:21613382_T_G  14  21613382 
14:21613383_T_G  14  21613383 
14:21613409_A_C  14  21613409 
14:21613413_A_C  14  21613413 
14:21613427_C_T  14  21613427 
14:21613441_T_C  14  21613441 
14:21613480_A_G  14  21613480 
14:21613585_G_A  14  21613585 
14:21613587_T_C  14  21613587 
14:21613637_T_C  14  21613637 
14:21613664_A_T  14  21613664 
14:21613690_T_C  14  21613690 
14:21613761_G_A  14  21613761 
14:21613818_C_T  14  21613818 
14:21613953_C_T  14  21613953 
14:21613960_C_T  14  21613960 
14:21613962_T_C  14  21613962 
14:21613964_T_G  14  21613964 
14:21614346_C_T  14  21614346 

14:21614387_T_A  14  21614387 
14:21614411_A_T  14  21614411 
14:21614491_C_T  14  21614491 
14:21614565_A_G  14  21614565 
14:21614587_A_G  14  21614587 
14:21614646_A_C  14  21614646 
14:21614907_T_C  14  21614907 
14:21614946_A_G  14  21614946 
14:21615021_G_A  14  21615021 
14:21615254_G_A  14  21615254 
14:21615342_C_T  14  21615342 
14:21615348_A_G  14  21615348 
14:21615363_C_T  14  21615363 
14:21615367_G_A  14  21615367 
14:21615406_C_G  14  21615406 
14:21615508_G_A  14  21615508 
14:21615676_T_A  14  21615676 
14:21616096_A_G  14  21616096 
14:21616168_G_A  14  21616168 
14:21616723_T_A  14  21616723 
14:21618368_A_G  14  21618368 
14:21618606_C_A  14  21618606 
14:21619114_A_G  14  21619114 
14:21619120_G_A  14  21619120 
14:21619285_G_A  14  21619285 
14:21619764_C_T  14  21619764 
14:21620605_G_A  14  21620605 
14:21621311_A_C  14  21621311 
14:21621477_C_A  14  21621477 
14:21623073_G_C  14  21623073 
14:21623991_G_C  14  21623991 
14:21625776_A_G  14  21625776 
14:22670309_G_A  14  22670309 
14:22805657_G_A  14  22805657 
14:22809167_G_A  14  22809167 
14:22813456_C_T  14  22813456 
14:22814269_T_C  14  22814269 
14:22814595_C_T  14  22814595 
14:22815875_G_C  14  22815875 
14:22817795_C_T  14  22817795 
14:22840845_G_A  14  22840845 
14:22867326_T_G  14  22867326 

14:22977521_G_A  14  22977521 
14:22982222_T_C  14  22982222 
14:22983440_C_T  14  22983440 
14:22989917_G_A  14  22989917 
14:22993121_C_T  14  22993121 
14:22995861_C_G  14  22995861 
14:22997098_A_T  14  22997098 
14:22997286_C_A  14  22997286 
14:22997290_G_A  14  22997290 
14:22999212_T_G  14  22999212 
14:22999280_G_A  14  22999280 
14:23001109_G_T  14  23001109 
14:23003295_T_C  14  23003295 
14:23004135_A_C  14  23004135 
14:23004758_T_C  14  23004758 
14:23009266_C_T  14  23009266 
14:23029146_A_G  14  23029146 
14:23128784_T_C  14  23128784 
14:23155902_T_G  14  23155902 
14:23157224_C_T  14  23157224 
14:23165825_C_T  14  23165825 
14:23166431_A_T  14  23166431 
14:23168386_G_A  14  23168386 
14:23168918_C_G  14  23168918 
14:23170297_G_A  14  23170297 
14:23172658_T_C  14  23172658 
14:23172995_A_G  14  23172995 
14:23173588_C_G  14  23173588 
14:23173674_T_C  14  23173674 
14:23175099_T_C  14  23175099 
14:23176131_A_C  14  23176131 
14:23178091_A_G  14  23178091 
14:23179478_C_T  14  23179478 
14:23179846_G_C  14  23179846 
14:23182232_G_C  14  23182232 
14:23183709_C_A  14  23183709 
14:23183729_G_A  14  23183729 
14:23184342_A_C  14  23184342 
14:23184460_T_C  14  23184460 
14:23184735_T_A  14  23184735 
14:23185438_C_T  14  23185438 
14:23188778_G_A  14  23188778 

14:23189327_T_C  14  23189327 
14:23189446_C_T  14  23189446 
14:23189453_C_A  14  23189453 
14:23189633_G_T  14  23189633 
14:23189695_G_A  14  23189695 
14:23189757_A_G  14  23189757 
14:23189783_G_C  14  23189783 
14:23189891_T_C  14  23189891 
14:23190227_T_G  14  23190227 
14:23190282_A_G  14  23190282 
14:23190369_A_T  14  23190369 
14:23190575_C_A  14  23190575 
14:23190736_C_A  14  23190736 
14:23190805_G_A  14  23190805 
14:23190818_G_A  14  23190818 
14:23190885_A_G  14  23190885 
14:23190918_A_G  14  23190918 
14:23190948_A_G  14  23190948 
14:23191049_A_C  14  23191049 
14:23191219_C_T  14  23191219 
14:23191296_G_A  14  23191296 
14:23191494_T_C  14  23191494 
14:23191567_A_G  14  23191567 
14:23191628_A_G  14  23191628 
14:23191765_T_C  14  23191765 
14:23191781_A_G  14  23191781 
14:23191800_A_G  14  23191800 
14:23191895_C_T  14  23191895 
14:23192069_C_G  14  23192069 
14:23192247_T_C  14  23192247 
14:23192277_G_A  14  23192277 
14:23192311_G_A  14  23192311 
14:23192569_G_C  14  23192569 
14:23192583_C_T  14  23192583 
14:23192592_G_A  14  23192592 
14:23192602_A_C  14  23192602 
14:23192794_A_G  14  23192794 
14:23192837_G_A  14  23192837 
14:23192858_T_G  14  23192858 
14:23192959_T_C  14  23192959 
14:23192973_A_G  14  23192973 
14:23193080_A_G  14  23193080 

14:23193261_T_C  14  23193261 
14:23193270_T_C  14  23193270 
14:23194095_T_C  14  23194095 
14:23194344_T_C  14  23194344 
14:23194350_T_C  14  23194350 
14:23194604_G_A  14  23194604 
14:23194808_A_T  14  23194808 
14:23194809_A_C  14  23194809 
14:23196362_C_G  14  23196362 
14:23196417_T_G  14  23196417 
14:23196725_C_T  14  23196725 
14:23197858_C_T  14  23197858 
14:23199580_A_G  14  23199580 
14:23199714_G_T  14  23199714 
14:23201028_G_A  14  23201028 
14:23201789_C_T  14  23201789 
14:23212612_T_C  14  23212612 
14:23214574_G_C  14  23214574 
14:23215630_T_G  14  23215630 
14:23253786_A_G  14  23253786 
14:23264575_A_G  14  23264575 
14:23296927_G_C  14  23296927 
14:23297204_T_C  14  23297204 
14:23297472_A_G  14  23297472 
14:23298062_A_G  14  23298062 
14:23300304_G_C  14  23300304 
14:23314460_T_C  14  23314460 
14:23314730_C_T  14  23314730 
14:23314761_T_C  14  23314761 
14:23326588_C_G  14  23326588 
14:23329375_T_C  14  23329375 
14:23338890_G_T  14  23338890 
14:23343150_A_G  14  23343150 
14:23346065_C_G  14  23346065 
14:23354569_A_G  14  23354569 
14:23383008_A_G  14  23383008 
14:23384445_T_C  14  23384445 
14:23392938_T_G  14  23392938 
14:23414524_A_G  14  23414524 
14:23416592_C_G  14  23416592 
14:23418801_T_C  14  23418801 
14:23423823_T_C  14  23423823 

14:23427548_T_C  14  23427548 
14:23434525_A_G  14  23434525 
14:23438738_C_T  14  23438738 
14:23440914_A_G  14  23440914 
14:23443706_C_T  14  23443706 
14:23471725_C_T  14  23471725 
14:23473697_C_T  14  23473697 
14:23478900_A_G  14  23478900 
14:23479427_C_T  14  23479427 
14:23495771_A_T  14  23495771 
14:23529103_C_G  14  23529103 
14:23552239_G_A  14  23552239 
14:23552679_C_T  14  23552679 
14:23553197_A_G  14  23553197 
14:23554044_C_G  14  23554044 
14:23554490_A_C  14  23554490 
14:23555313_T_C  14  23555313 
14:23555320_A_C  14  23555320 
14:23555430_A_G  14  23555430 
14:23555460_G_A  14  23555460 
14:23555549_A_G  14  23555549 
14:23555550_A_G  14  23555550 
14:23555551_C_A  14  23555551 
14:23555570_A_T  14  23555570 
14:23559882_G_C  14  23559882 
14:23564369_C_A  14  23564369 
14:23565143_T_C  14  23565143 
14:23566885_C_T  14  23566885 
14:23578196_G_A  14  23578196 
14:23583466_A_G  14  23583466 
14:23583576_A_G  14  23583576 
14:23583640_T_C  14  23583640 
14:23585785_G_A  14  23585785 
14:23587335_C_T  14  23587335 
14:23587652_G_A  14  23587652 
14:23587653_A_G  14  23587653 
14:23587737_T_G  14  23587737 
14:23590382_A_G  14  23590382 
14:23590403_C_T  14  23590403 
14:23590438_C_G  14  23590438 
14:23590493_T_A  14  23590493 
14:23590931_C_T  14  23590931 

14:23590939_G_A  14  23590939 
14:23590941_T_C  14  23590941 
14:23591033_A_G  14  23591033 
14:23591827_T_C  14  23591827 
14:23592917_A_G  14  23592917 
14:23593940_C_T  14  23593940 
14:23594517_C_T  14  23594517 
14:23596025_C_T  14  23596025 
14:23596251_G_A  14  23596251 
14:23596422_A_G  14  23596422 
14:23596429_A_T  14  23596429 
14:23600267_G_A  14  23600267 
14:23601668_G_T  14  23601668 
14:23605155_T_G  14  23605155 
14:23607944_A_C  14  23607944 
14:23608832_A_G  14  23608832 
14:23612458_C_T  14  23612458 
14:23612481_A_G  14  23612481 
14:23616675_A_G  14  23616675 
14:23617053_G_A  14  23617053 
14:23618229_T_G  14  23618229 
14:23618495_A_G  14  23618495 
14:23618614_A_G  14  23618614 
14:23618734_G_T  14  23618734 
14:23618798_G_T  14  23618798 
14:23618800_T_A  14  23618800 
14:23618814_T_G  14  23618814 
14:23619269_G_A  14  23619269 
14:23619297_T_C  14  23619297 
14:23620105_C_T  14  23620105 
14:23620227_A_T  14  23620227 
14:23620640_G_C  14  23620640 
14:23621082_T_C  14  23621082 
14:23621083_G_A  14  23621083 
14:23621256_G_A  14  23621256 
14:23630842_T_C  14  23630842 
14:23630896_T_C  14  23630896 
14:23633052_G_A  14  23633052 
14:23634452_A_T  14  23634452 
14:23637052_G_T  14  23637052 
14:23638330_C_G  14  23638330 
14:23638895_A_G  14  23638895 


14:23639458_G_A  14  23639458 
14:23639633_T_C  14  23639633 
14:23639701_C_A  14  23639701 
14:23639932_T_A  14  23639932 
14:23640016_A_G  14  23640016 
14:23644290_A_C  14  23644290 
14:23645199_G_A  14  23645199 
14:23646689_G_A  14  23646689 
14:23647559_C_T  14  23647559 
14:23652025_C_T  14  23652025 
14:23652804_G_A  14  23652804 
14:23653071_T_A  14  23653071 
14:23864771_A_G  14  23864771 
14:23888150_C_T  14  23888150 
14:23888591_A_T  14  23888591 
14:23888612_C_T  14  23888612 
14:23890981_G_T  14  23890981 
14:23892930_C_T  14  23892930 
14:23896609_G_A  14  23896609 
14:23903530_C_T  14  23903530 
14:23908621_T_A  14  23908621 
14:23912123_C_T  14  23912123 
14:23912998_G_T  14  23912998 
14:23915419_C_T  14  23915419 
14:23916478_G_T  14  23916478 
14:25446205_G_T  14  25446205 
15:13337618_G_A  15  13337618 
20:67275290_C_G  20  67275290 
20:67276662_G_A  20  67276662 
20:67276917_A_G  20  67276917 
20:71900962_G_A  20  71900962 
20:71904894_A_G  20  71904894 
20:71905923_C_A  20  71905923 
21:285066_T_G  21  285066 
21:285235_G_A  21  285235 
23:12495116_C_T  23  12495116 
23:15370628_A_T  23  15370628 
23:18775299_A_G  23  18775299 
23:21751972_A_G  23  21751972 
29:2392400_G_C  29  2392400 
29:45964941_G_A  29  45964941 

CHAPTER 7: Conclusions 

The high-quality beef revolution in the US has already begun: consumers are demanding highly marbled beef products, along with validation of the breed composition of these high-cost products. Rapid identification of US Wagyu products is needed, and this can be accomplished through genomic sequencing. New equipment from Oxford Nanopore Technologies, designed specifically for out-of-lab sequencing, was shown to be capable of this task by matching the genotypic output of each sample to reference haplotypes. This out-of-lab protocol was also carried out at low cost and showed high correlation and concordance between sample and reference, even at low depth and coverage. The technique showed that, even with low genomic output and relatively little knowledge of laboratory procedures, a sample from an animal can be traced back to its breed of origin. The out-of-lab possibilities for these technologies are broad, as genomic information can be used for breed composition, disease testing, or even quick identification of parentage with the correct bioinformatic tools.
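The matching step summarized above can be illustrated with a minimal sketch. It is not the pipeline used in this dissertation: the genotype coding (alternate-allele counts of 0, 1, or 2, with None for uncalled sites), the function names, and the toy reference vectors are all assumptions made for illustration. The sketch computes genotype concordance over the sites called in both the sample and a reference, then reports the reference with the highest concordance.

```python
from typing import Dict, Optional, Sequence

# Assumed coding: alternate-allele count (0, 1, 2); None marks an uncalled site,
# which is frequent when sequencing depth is low.
Genotype = Optional[int]


def concordance(sample: Sequence[Genotype], reference: Sequence[Genotype]) -> float:
    """Fraction of sites called in both vectors at which the genotypes agree."""
    shared = [(s, r) for s, r in zip(sample, reference)
              if s is not None and r is not None]
    if not shared:
        # Design choice for this sketch: no overlapping calls scores as 0.0.
        return 0.0
    return sum(1 for s, r in shared if s == r) / len(shared)


def best_matching_breed(sample: Sequence[Genotype],
                        references: Dict[str, Sequence[Genotype]]) -> str:
    """Name of the reference with the highest concordance to the sample."""
    scores = {breed: concordance(sample, ref) for breed, ref in references.items()}
    return max(scores, key=scores.get)


# Toy data, invented for illustration only (not results from this study).
toy_references = {
    "Black Wagyu": [0, 0, 1, 2, 2, 0],
    "Red Wagyu": [2, 1, 1, 0, 0, 2],
}
toy_sample = [0, None, 1, 2, None, 0]  # two sites left uncalled

print(best_matching_breed(toy_sample, toy_references))
```

Restricting the comparison to sites called in both vectors is what allows a low-depth sample to be scored at all; the cost is that the concordance rests on fewer sites, so a real application would also report how many sites were shared.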

Although product verification is crucial for the consistency of Wagyu products in the market, understanding the population structure of the Wagyu breed in the United States is of equal importance. The US population originates from a small number of animals imported under unusual circumstances, and the selection pressures on it are unique to this group of animals outside of Japan. Genomic analyses revealed a population that is inbred and shows signals of low genomic variability. This was expected and confirms the suspected status of the population in the US. Keeping a close watch on inbreeding and on the variation available to Wagyu producers will help ensure a healthy breeding stock for years to come.
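One common way to monitor inbreeding from genotype data is a homozygosity-based coefficient, F = (O - E) / (N - E), where O is the observed number of homozygous sites in an animal, E is the number expected from population allele frequencies under random mating, and N is the number of sites. The sketch below is a simplified version of that estimator with no small-sample correction. Whether it corresponds to the estimator used in the earlier chapters is not assumed here, and the function name and toy inputs are hypothetical.

```python
from typing import Sequence


def inbreeding_f(genotypes: Sequence[int], alt_freqs: Sequence[float]) -> float:
    """Simplified homozygosity-based inbreeding coefficient F = (O - E) / (N - E).

    genotypes: alternate-allele counts (0, 1, 2) for one animal, no missing calls.
    alt_freqs: population alternate-allele frequency at the same sites.
    """
    if len(genotypes) != len(alt_freqs):
        raise ValueError("genotypes and alt_freqs must have the same length")
    n_sites = len(genotypes)
    # Observed homozygous sites: genotype 0 or 2.
    observed_hom = sum(1 for g in genotypes if g != 1)
    # Expected homozygosity per site under random mating is 1 - 2p(1 - p).
    expected_hom = sum(1.0 - 2.0 * p * (1.0 - p) for p in alt_freqs)
    if n_sites == expected_hom:
        raise ValueError("all sites are monomorphic; F is undefined")
    return (observed_hom - expected_hom) / (n_sites - expected_hom)


# Toy inputs, invented for illustration: four sites with allele frequency 0.5.
toy_freqs = [0.5, 0.5, 0.5, 0.5]
print(inbreeding_f([0, 2, 0, 2], toy_freqs))  # fully homozygous animal
print(inbreeding_f([0, 1, 2, 1], toy_freqs))  # heterozygosity as expected
```

Values near 0 indicate homozygosity in line with random mating, positive values indicate excess homozygosity, and negative values indicate excess heterozygosity.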

The Wagyu breed in the US is treated as a single large population, yet it comprises three primary subgroups: Red/Brown Wagyu (Akaushi), Black Wagyu, and the RedBlack crossbreed. A principal component analysis (PCA) revealed distinct relationships among these groups, with the Black and Red Wagyu groups separating and the RedBlack animals bridging them. When the Korean Hanwoo was included in this analysis, it stood apart from the Wagyu animals, forming a separate cluster. However, predictive accuracy indicated a moderate relationship between Red Wagyu and Korean Hanwoo. Notably, prediction accuracies were lowest when attempting to predict Black Wagyu solely from Red Wagyu, possibly because of the substantial imbalance in population sizes (~150 Black Wagyu compared to ~5000 Red). Further investigation will require a larger representation of Black Wagyu and other breed groups to uncover genomic structure not captured in this study. Exploring the connections between US Wagyu sub-breeds and their Asian relatives holds promise for understanding the evolutionary trajectories of this understudied breed and potentially for leveraging breeding practices across Asian breeds. Specifically, this study illuminated the relationships among the subtypes within the US Wagyu population and with the Korean Hanwoo, indicating that Akaushi (Red Wagyu) shares a closer genetic link with Hanwoo than with Black Wagyu. Future work will examine the population structure of US Wagyu further, with the aim of increasing the accuracy of breeding values estimated within the whole population. Inclusion of crossbred RedBlack animals will be a crucial piece in linking the Red and Black populations together.
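The bridging pattern described for the PCA can be illustrated with a small, self-contained sketch. It is not the software or data used in this study: the genotype matrix is invented, the function name is hypothetical, and only the first principal component is extracted, by power iteration on the animal-by-animal Gram matrix of marker-centered genotypes.

```python
from typing import List


def first_pc_scores(genotypes: List[List[float]], n_iter: int = 200) -> List[float]:
    """Scores of each animal (row) on the first principal component."""
    n, m = len(genotypes), len(genotypes[0])
    # Center every marker (column) on its mean across animals.
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in genotypes]
    # Gram matrix X X^T; its leading eigenvector gives the PC1 direction.
    gram = [[sum(a * b for a, b in zip(ri, rk)) for rk in centered]
            for ri in centered]
    # Power iteration from a fixed, deterministic starting vector.
    vec = [float(i + 1) for i in range(n)]
    for _ in range(n_iter):
        nxt = [sum(gram[i][k] * vec[k] for k in range(n)) for i in range(n)]
        norm = sum(x * x for x in nxt) ** 0.5
        if norm == 0.0:
            raise ValueError("genotype matrix has no variation")
        vec = [x / norm for x in nxt]
    # Rayleigh quotient gives the eigenvalue; scores = eigenvector * sqrt(eigenvalue).
    eigval = sum(vec[i] * sum(gram[i][k] * vec[k] for k in range(n))
                 for i in range(n))
    return [x * eigval ** 0.5 for x in vec]


# Toy genotypes (alternate-allele counts), invented for illustration only.
toy_genotypes = [
    [0, 0, 0, 2, 2],  # Black-like animal 1
    [0, 0, 1, 2, 2],  # Black-like animal 2
    [2, 2, 2, 0, 0],  # Red-like animal 1
    [2, 2, 1, 0, 0],  # Red-like animal 2
    [1, 1, 1, 1, 1],  # crossbred-like animal, heterozygous at every marker
]
toy_scores = first_pc_scores(toy_genotypes)
print([round(s, 3) for s in toy_scores])
```

In this toy example the two parental groups fall on opposite sides of the first component and the crossbred-like animal sits between them, which is the qualitative pattern described above. The sign of a principal component is arbitrary, so only the relative positions are meaningful.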

Including all of the Asian breeds studied in the GWAS uncovered regions of the genome that had not previously been investigated for growth traits in Asian cattle. This finding offers a new look into the selection patterns of Asian breeds for growth and may point to previously unknown regions of the genome that are highly associated with these phenotypes. Although this is an exciting development in understanding genomic architecture, growth traits are not a major selection target in Wagyu cattle. Further genome-wide association analyses of marbling phenotypes, such as intramuscular fat percentage or marbling fineness, would be the most beneficial for Wagyu producers and researchers. New QTL controlling these sought-after qualities would shed further light on the selection pressures acting on the most desired traits in Wagyu.

As Wagyu emerges as a prominent breed in the United States, regulation and verification of animals and products through genotype can be done with on-site, rapid sequencing technologies. The outlook for the Wagyu and Akaushi populations in the United States is positive if inbreeding is kept at bay and genomic variation is deliberately maintained. Connections between the Asian breeds in the United States and abroad should be explored further, as differing selection pressures may cause these groups to drift farther apart. Improving the accuracy of breeding values in Wagyu will start with understanding the population currently available in the United States and then working to improve those accuracies for all industry professionals by including all Wagyu-type animals in the reference population. This breed offers producers an avenue for raising the quality of beef on the market, and selection for high-quality animals therefore becomes crucial for the everyday rancher.

APPENDIX A: Sale report for Wagyu in March 2023. 

Triangle B Ranch – 15th Annual Production Sale  
March 18th, 2023  
Stigler, Oklahoma  

Sale Manager: James Danekas & Associates Inc.  
Live Broadcast & Bidding: LiveAuctions.tv 
Averages:  
20 Fullblood Females  $26,525.00  
12 Fullblood Bulls  $33,471.00  
3 Pregnancies  $8,667.00  
1 Flush  $8,500.00  
6 Embryos  $1,025.00/embryo  
60 Units of Semen  $167/unit  
TOPS:  
Females:  
Lot 1: TBR HIKOKURA 035 7702K, 8/23/2022 sired by Arubial United; $400,000 to Eden 
Valley Wagyu, Eden Valley, MN.  
Lot 5: TBR HIKOHIME B438 2-1 7637, 3/13/2021 sired by TBR MITSUITOFUKU 2149Y; 
$13,500 to G. W.  Wagyu Farm, Jay, OK.  
Lot 7: TBR HIKOFUKU 3-9-3 7148H, 5/06/2020 sired by TBR KIKUTNAMI 4051A; 
$10,000 to Perpetua Wagyu, Tulsa, OK.  
Lot 3: TBR CHIYOTAKE B487 7040G, 12/2/2019 sired by BLACKMORE YASUCHIYO 
C058; $9,000 to G. W.  Wagyu Farm, Jay, OK.  
Bulls:  
Lot 32: TBR SHIGEFUKUNAMI 7704K, 8/26/2022 sired by Arubial United; $325,000 to 
Flying A Wagyu, La Salle, CO.  
Lot 39: TBR KIKUTNAMI 7569K, 5/01/2022 sired by TBR KIKUTNAMI 4051A; $9,000 to 
Philip Parish, Eddyville, KY.  
Lot 44: TBR ITOZURUNAMI 7626K, 3/19/2022 sired by TBR SHIGENAMINAMI 3024Z; 
$8,250 to Philip Parish, Eddyville, KY.  
Pregnancy:  
Lot 46:  ARUBIAL BOND Q007 X TBR HIKOKURA 035 5 7232J; $15,000 to Wilders 
Wagyu, Turkey, NC.  
With a shot of rain the day before the sale, the pastures started to pop with fresh, bright green 
grass on sale day. With the sun shining and a large crowd gathering, the 15th Annual Triangle B 
Ranch sale was a huge success. The guests enjoyed a fullblood Wagyu lunch and a very exciting 
sale. Bidding was very active and to start the auction off, a record high seller was sold. The 80+ 
people online and the full loft at the sale witnessed history. It didn’t stop there as halfway 
through the sale yet another record was shattered to make this sale hold the top selling fullblood 
Wagyu bull and female ever sold in North America. To top it off, in the crowd to witness all this 
go down was “Bubbles”, the cutest little girl pet monkey. The day couldn’t have gone any better.

APPENDIX B: Sale report for Angus sale in April 2023. 

APPENDIX C: Live animal specifications for Wagyu in the United States from the USDA. 
