This is to certify that the dissertation entitled

CASTING FOR GENES REGULATED BY PAX3 AND PAX3/FKHR IN MAMMALIAN DEVELOPMENT AND ALVEOLAR RHABDOMYOSARCOMA

presented by Thomas David Barber has been accepted towards fulfillment of the requirements for Ph.D. degree in Genetics

CASTING FOR GENES REGULATED BY PAX3 AND PAX3/FKHR IN MAMMALIAN DEVELOPMENT AND ALVEOLAR RHABDOMYOSARCOMA

Thomas David Barber

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

Genetics Graduate Program

2000

ABSTRACT

CASTING FOR GENES REGULATED BY PAX3 AND PAX3/FKHR IN MAMMALIAN DEVELOPMENT AND ALVEOLAR RHABDOMYOSARCOMA

Thomas David Barber

PAX3 is a transcription factor important for neural, muscle, facial, and auditory development in vertebrates. Mutations in PAX3 cause Waardenburg syndrome types 1 and 3 and Craniofacial-deafness-hand syndrome. Mutations in the murine orthologue of PAX3 result in the Splotch phenotype. A translocation involving PAX3 and FKHR produces a chimeric fusion protein, PAX3/FKHR, and is responsible for the majority of alveolar rhabdomyosarcoma tumors. PAX3 is presumed to regulate the expression of many genes during embryogenesis and cancer epigenesis, but very few endogenous target genes have been identified. In this study, the complete genomic structure of PAX3 has been resolved, including the identification of novel coding sequences and the complete 3' UTR. Alternate transcripts of PAX3 were observed in various tissues including human adult skeletal muscle and mouse embryos.
One of the novel alternate transcripts has been conserved since the divergence of birds and mammals and is shown to transactivate a reporter construct containing the mouse c-met promoter. The PAX3 complete genomic structure and alternate transcripts reported herein extend our understanding of the function and evolution of PAX3 in vertebrates and enable a comprehensive mutation screen for individuals with Waardenburg syndrome.

In order to identify genes regulated by PAX3 we employed a Cyclic Amplification and Selection of Targets (CASTing) strategy to isolate cis-regulatory elements bound by PAX3. Three CASTing libraries were generated with mouse and human genomic DNA fragments bound by mouse Pax3, human PAX3, and human PAX3/FKHR. Approximately 1000 clones were sequenced from each library, and over 200 putative targets of PAX3 and PAX3/FKHR were identified. Many of these putative PAX3 target genes have expression patterns and predicted functions consistent with regulation by PAX3. Putative targets were evaluated for in vitro binding and transactivation by PAX3 and for PAX3-dependent expression patterns in Splotch mice and other model systems. Some of the most interesting genes identified by CASTing that are likely to be regulated by PAX3 are Itm2A, mFAT, VEGFR, TGFα, and engrailed-2. Further characterization of the genes regulated by PAX3 will provide a better understanding of the genetic pathways through which mutations in PAX3 cause alveolar rhabdomyosarcoma, deafness, and other developmental abnormalities.

Copyright by Thomas David Barber 2000

DEDICATION

This thesis is dedicated to the individuals whose lives have been affected by Waardenburg syndrome or alveolar rhabdomyosarcoma.

ACKNOWLEDGMENTS

This work could not have been completed without the exceptional guidance, insight, and support of Dr. Thomas Friedman. Tom has been a superb mentor and a great friend.
I would also like to thank the following people for helpful discussions and suggestions: Dennis Drayna, Susan Sullivan, Jim Battey, Bob Fridell, Edwardo Sainz, Robert Wenthold, David Anderson, and Tamar Ben-Yosef. Technical assistance was provided by Robert Singletary, Maki Saitoh, Timothy Cloutier, and Erich Boger. Sequencing was performed at the NIH Intramural Sequencing Center by Jeff Touchman, Gerard Bouffard, and Nicole Dietrich. Valuable materials were contributed by Clinton Baldwin, Andrew Read, Walter Nance, Frederic Barr, and Dana Tomescu. I am grateful to my committee members, Steve Triezenberg, Karen Friderici, Zachary Burton, and Will Kopachik, for their valuable suggestions and interest in this project. I would like to thank my parents, Tom and Connie Barber, for their encouragement, love, and understanding, and the Careys for welcoming me into their family. Finally, I would like to thank my wife, Melisa Carey Barber, for her critical contributions to the science presented herein as well as her love and support.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
Introduction
  Pax gene family
  PAX3 gene structure and alternate isoforms
  Waardenburg syndrome
  Alveolar rhabdomyosarcoma
  Splotch mice
  Putative targets of Pax3
  Cyclic Amplification and Selection of Targets
  Overview
Chapter One: Identification and characterization of novel Pax3 coding sequences
  PAX3 gene structure
  Identification of novel Pax3 isoforms
  Human PAX3 BAC isolation and characterization
  Sequence analysis of PAX3 exon 9
  Assembly of PAX3 ESTs in GenBank
  Amplification of alternate PAX3 cDNAs
  Evolutionary conservation of Pax3 sequences
  Waardenburg syndrome sequencing project
  A PAX3 exon 8 mutation in a WS1 proband
  PAX3 transcribed polymorphisms
  Cloning mouse and human Pax3 cDNAs
  Primer and experimental design
  Cloning and sequencing results
  Design of the negative controls
  In vitro synthesis of Pax3 protein
  Functional properties of Pax3 isoforms
  DNA-binding activity
  Transactivation potentials of Pax3 isoforms
Chapter Two: CASTing for Targets of Pax3
  Pax3 antibodies
  Design of Pax3 peptide antigens
  Evaluation of antibodies against Pax3
  Optimizations
  Buffer selection
  Immunoprecipitation conditions
  CASTing Methodology
  WGPCR strategy and rationale
  Preparation of genomic DNA
  Generation of Pax3 CASTing library
  Pilot analyses of clones from Pax3 library
  Generation of PAX3 and PAX3/FKHR libraries
  CASTing Results
  Clone sequence analysis
  Repetitive elements
  Genes identified
  Clones in multiple libraries
  Clone redundancy
Chapter Three: Evaluation of putative targets of Pax3
  Gene selection
  Criteria for selecting genes for study
  Putative targets of Pax3 identified by CASTing
  Analysis of Pax3 binding activity in CASTing clones
  Localization of Pax3 binding sites in CASTing clones
  Pax3 consensus binding sequence
  Luciferase reporter assays
  Expression analyses
  Model systems to study Pax3 transactivation
  Northern blot hybridization analyses
  Summary
Discussion
Appendices
  Appendix A: Tables
  Appendix B: Figures
  Appendix C: Materials and Methods
Bibliography

LIST OF TABLES

Table 1: Oligonucleotides
Table 2: PAX3 polymorphisms
Table 3: Pax3 Expression Constructs
Table 4: Pax3 antibodies
Table 5: Buffers for Pax3 EMSAs and IPs
Table 6: Summary of CASTing libraries
Table 7: Repetitive elements in CASTing library cq
Table 8: Repetitive elements in CASTing library ev
Table 9: Repetitive elements in CASTing library ew
Table 10: BLAST results from CASTing library cq
Table 11: BLAST results from CASTing library ev
Table 12: BLAST results from CASTing library ew
Table 13: Sequences in multiple CASTing Libraries
Table 14: Putative targets of Pax3 identified by CASTing

LIST OF FIGURES

Figure 1: Comparison of intron 8 to consensus motifs
Figure 2: PAX3 BAC Contig
Figure 3: PAX3 Sequence Exons 8, 9, and 10
Figure 4: Sequence of Quail Pax3 intron 8
Figure 5: Pax3 protein synthesized in vitro
Figure 6: EMSAs with Pax3c and Pax3d
Figure 7: Pax3c and Pax3d activation of cMET
Figure 8: Western blot analysis with antibody PB33
Figure 9: Western blot analysis with antibody PB35
Figure 10: Immunoprecipitation of Pax3-DNA complexes
Figure 11: Immunoprecipitation of Pax3 CASTing clones
Figure 12: mFAT clones in CASTing library cq
Figure 13: EMSAs with 4 CASTing clones
Figure 14: Pax3 binding site in mFAT
Figure 15: Pax3 consensus binding sequence
Figure 16: Transactivation of reporter constructs
Figure 17: Strategy used to genotype Splotch embryos
Figure 18: Splotch-delayed embryo genotypes
Figure 19: Western blot of transient transfections
Figure 20: Western blot of stable cell lines
Figure 21: Itm2A is induced by PAX3/FKHR
Figure 22: Itm2A expression in mouse embryos
Figure 23: BVES expression correlates with Pax3
Figure 24: VEGFR expression correlates with PAX3/FKHR
LIST OF ABBREVIATIONS

aa, amino acid(s); ARMS, alveolar rhabdomyosarcoma; *, translation stop codon; BAC, bacterial artificial chromosome; β-gal, beta-galactosidase; BLAST, basic local alignment search tool; bp, base pair(s); BSA, bovine serum albumin; CASTing, cyclic amplification and selection of targets; CDHS, Craniofacial-deafness-hand syndrome; cDNA, DNA complementary to mRNA; cq, Pax3 CASTing library; dbEST, GenBank EST database; ds, double-stranded; E, Expect value in BLAST analyses; e5, ds oligonucleotide derived from the even-skipped promoter; EthBr, ethidium bromide; EMSA, electrophoretic mobility shift assay; EST, expressed sequence tag; ev, PAX3 CASTing library; ew, PAX3/FKHR CASTing library; FKHR, gene encoding the fork head in rhabdomyosarcoma protein; LINE, long interspersed nuclear element; LTR, long terminal repeat; MIM, Mendelian Inheritance in Man catalog number; NISC, NIH Intramural Sequencing Center; nr, GenBank non-redundant database; ORF, open reading frame; PAX3, gene encoding the human paired box protein 3; Pax3, gene encoding the mouse paired box protein 3; p.c., post coitus; PCR, polymerase chain reaction; Q, glutamine; rt, room temperature; RT-PCR, reverse transcription-polymerase chain reaction; SINE, short interspersed nuclear element; Sp, Splotch; UTR, untranslated region; V, volts; WGPCR, whole genome polymerase chain reaction; WS1, Waardenburg syndrome type 1.

Introduction

Pax gene family

The paired box is a 128 amino acid DNA-binding domain originally identified in the Drosophila melanogaster paired gene (Bopp et al., 1986; Frigerio et al., 1986). Other members of the paired box gene family in Drosophila include gooseberry-distal, gooseberry-proximal, Pox-meso, and Pox-neuro (Baumgartner et al., 1987; Bopp et al., 1989).
Nine paired box genes have been identified in mice and humans, Pax1-Pax9 and PAX1-PAX9, respectively (Burri et al., 1989; Chalepakis et al., 1993; Deutsch et al., 1988; Walther et al., 1991). All of the Pax genes exhibit spatially and temporally restricted expression patterns and have important roles in embryogenesis (Dahl et al., 1997). The nine Pax genes have been divided into four groups based on sequence and structural similarity. Pax3 and Pax7 are most similar to one another and constitute Group 3. Pax3 and Pax7 contain two DNA-binding domains, the paired domain and a homeodomain. Pax3 and Pax7 also contain C-terminal transactivation and N-terminal repression domains (Bennicelli et al., 1999; Bennicelli et al., 1996; Bennicelli et al., 1995; Chalepakis et al., 1994b).

PAX3 gene structure and alternate isoforms

Murine Pax3 mRNA includes an open reading frame of 1437 bp and encodes a 479 amino acid, 56 kDa protein (Goulding et al., 1991). The genomic structure for mouse Pax3 has not been resolved, but the human gene was reported to have eight exons (Lalwani et al., 1995; Macina et al., 1995). In 1994, two novel alternate transcripts of PAX3 were identified (Tsukamoto et al., 1994). Both of the novel isoforms, PAX3A and PAX3B, include the first four exons of PAX3 and the paired domain, but lack the homeodomain and transactivation domain. PAX3A is generated by the failure to recognize the intron 4 donor site; transcription continues for 491 bp into intron 4 and encodes a novel C-terminus (KRWRLGRRTCWVTWRASAS*). PAX3B is generated by the use of an alternate intron acceptor in intron 4 and also encodes a novel C-terminus (KALVSGVSSH*). Functional studies have not been performed on either the PAX3A or PAX3B proteins, but their DNA-binding and transactivation properties are likely to be very different from those of the longer isoforms. Another alternate isoform of PAX3 was reported that makes use of an alternate intron acceptor site in intron 3 (Vogan et al., 1996).
The result of this alternative splicing event is a 3 bp insertion in the paired domain encoding a glutamine residue. The DNA-binding specificity of PAX3 is significantly influenced by this change (Vogan and Gros, 1997). The presence or absence of the glutamine residue is designated by adding "Q+" or "Q-" after the gene name (e.g., PAX3AQ+ or PAX3AQ-) (Barber et al., 1999). Zebrafish also have both the (Q+) and (Q-) isoforms (Seo et al., 1998). Recently a quail cDNA was submitted to GenBank (AF000673) that includes the alternatively spliced glutamine. It is not known if quail also produce the (Q-) isoform.

Alternative splicing has also been observed in other Pax genes: Pax1 (Ogasawara et al., 1999), Pax2 (Heller and Brandli, 1997; Sanyanusin et al., 1996; Tavassoli et al., 1997; Ward et al., 1994), Pax4 (Tokuyama et al., 1998), Pax5 (Zwollo et al., 1997), Pax6 (Carriere et al., 1995; Epstein et al., 1994; Jaworski et al., 1997; Kozmik et al., 1997; Okladnova et al., 1998; Plaza et al., 1995), Pax7 (Barr et al., 1999; Kay and Ziman, 1999; Ziman et al., 1997; Ziman and Kay, 1998), Pax8 (Kozmik et al., 1997; Kozmik et al., 1993; Poleev et al., 1995), and Pax9 (Nornes et al., 1996). The alternate isoforms in Pax8 exhibit distinct DNA-binding and/or transactivation properties (Kozmik et al., 1997; Kozmik et al., 1993). The functions of the alternate transcripts in other Pax genes have not been determined.

Waardenburg syndrome

Waardenburg syndrome (WS) was first described 50 years ago by the Dutch ophthalmologist Petrus Johannes Waardenburg (Waardenburg, 1951). Dr. Waardenburg described an autosomal dominant syndrome characterized by the variable penetrance and expressivity of congenital unilateral or bilateral sensorineural hearing loss, dystopia canthorum (an eye phenotype caused by a lateral displacement of the inner canthi), hypopigmentation and hyperpigmentation of the skin, white forelock, and heterochromia irides (different color irises).
The original syndrome has since been separated into four clinical subtypes: WS1 (MIM 193500), WS2 (MIM 193510), WS3 (MIM 148820), and WS4 (MIM 277580). WS1 includes the phenotypes described by Waardenburg and other less common abnormalities including synophrys, broad nasal root, spina bifida, heart abnormalities, and premature graying. WS2 is similar to WS1, but lacks dystopia canthorum. WS3 is similar to WS1, but includes upper limb abnormalities, probably due to abnormal muscle development. WS4, also known as Hirschsprung disease, is similar to WS1 but includes aganglionic megacolon. The diagnostic criteria for WS1 have been redefined to permit accurate diagnosis of WS1 and distinguish it from the other WS subtypes (Farrer et al., 1994; Liu et al., 1995a; Liu et al., 1995b).

Waardenburg syndrome type 1 was hypothesized to have conserved synteny to one of the genes causing the mouse phenotypes Splotch, microphthalmia, patch, or piebald (Asher and Friedman, 1990). WS1 was mapped to chromosome 2q35, a region with conserved synteny to the Splotch locus, supporting the hypothesis that Splotch was a good model for Waardenburg syndrome (Asher et al., 1991; Read et al., 1991). The first mutations identified in WS1 probands were found in the paired domain of PAX3 (Baldwin et al., 1992; Morell et al., 1992; Tassabehji et al., 1992). Approximately 100 mutations have now been reported in WS1 probands (Attaie et al., 1997; Baldwin et al., 1995; Carey, 1996; Carey et al., 1998; Chalepakis et al., 1994a; Lalwani et al., 1995; Lalwani et al., 1996; Sotirova et al., 2000; Tassabehji et al., 1994a; Tassabehji et al., 1993; Wildhardt et al., 1996). In addition to WS1, a mutation in the paired domain of PAX3 also causes Craniofacial-deafness-hand syndrome (MIM 122880) (Asher et al., 1996b). A deletion of PAX3 was reported to cause the more severe form of Waardenburg syndrome, WS3, which includes upper limb abnormalities with arthromyodysplasia (Pasteris et al., 1993).
Mutations were reported in PAX3 in a family with WS2, but the report was subsequently revised when the phenotype segregating in the family was reevaluated and determined to be WS1 (Tassabehji et al., 1993).

Alveolar rhabdomyosarcoma

Alveolar rhabdomyosarcoma (ARMS) is a soft tissue tumor with striated muscle differentiation arising primarily in the trunk and extremities of adolescents and young adults (Barr, 1997a; Barr, 1997b; Raney et al., 1993). A spontaneous translocation, t(2;13)(q35;q14), is observed in approximately 75% of ARMS tumors (Barr, 1997b; Douglass et al., 1987; Lizard-Nacol et al., 1987; Rowe et al., 1987; Turc-Carel et al., 1986). The breakpoints of this translocation were cloned, and a novel chimeric fusion protein, PAX3/FKHR, was identified (Barr et al., 1993; Galili et al., 1993; Shapiro et al., 1993). This fusion protein contains the PAX3 DNA-binding domains and the FKHR transactivation domain. FKHR, fork head in rhabdomyosarcoma, is a member of the fork head gene family of transcription factors that contain a conserved DNA-binding domain related to the Drosophila fork head gene (Clevidence et al., 1993; Weigel and Jackle, 1990). A less common translocation, t(1;13)(p36;q14), found in some ARMS tumors creates a similar PAX7/FKHR fusion protein (Davis et al., 1994). The PAX7/FKHR fusion protein contains the PAX7 DNA-binding domains and the FKHR transactivation domain. PAX3/FKHR and PAX7/FKHR may share a common mechanism of action in generating ARMS tumors (Bennicelli et al., 1999).

The DNA-binding and transactivation properties of PAX3/FKHR have been compared to wild-type PAX3. PAX3/FKHR is able to bind artificial sequences also recognized by the wild-type PAX3, but is a more potent transcription activator than wild-type PAX3 (Bennicelli et al., 1996; Bennicelli et al., 1995; Fredericks et al., 1995; Sublett et al., 1995).
This led to the hypothesis that PAX3/FKHR binds the same enhancer elements as PAX3, but results in higher levels of transactivation. However, recent data from domain swapping experiments show that the specificities of the PAX3 DNA-binding domains are influenced by the PAX3 and FKHR transactivation domains, suggesting that PAX3/FKHR may bind and transactivate genes not normally regulated by PAX3 (Cao and Wang, 2000).

Splotch mice

Splotch is a semidominant lethal mutation in mice originally described in 1947 (Auerbach, 1954; Russell, 1947). Heterozygotes display a white belly spot and occasionally a curly tail. Homozygotes are more severely affected and die in utero at different developmental stages depending on the Splotch allele. Splotch homozygotes develop spina bifida, exencephaly, persistent truncus arteriosus, and abnormal skeletal muscle (Chalepakis et al., 1993; Conway et al., 1997b; Franz, 1989; Franz, 1990; Franz, 1992; Franz et al., 1993). Unlike individuals with WS, Splotch mice have normal hearing (Steel and Smith, 1992).

The Splotch locus maps to the C4 band of murine chromosome 1, in a region with conserved synteny to human 2q35. Mutations in Pax3 have been identified in five Splotch alleles (Epstein et al., 1991; Epstein et al., 1993; Goulding et al., 1993; Vogan et al., 1993). The Sp and Spd Splotch alleles are caused by a mutation in an intron splice site and a missense mutation in the paired domain, respectively. Embryos homozygous for the Sp allele die at day 13.5, most likely from cardiac defects (Conway et al., 1997b; Epstein et al., 2000). Embryos homozygous for the Spd allele survive until birth and are the least severely affected of all Splotch mutants.

The variable penetrance of the phenotype observed in both Splotch mice and individuals with Waardenburg syndrome might be due to the presence of modifiers (Asher et al., 1996a).
For example, the frequency of neural tube defects observed in Splotch heterozygotes is significantly increased in compound heterozygotes with mutations in the gene encoding neurofibromin (Lakkis et al., 1999).

Putative targets of Pax3

Pax3 is a transcription factor with important functions during embryogenesis and tumor formation. Presumably Pax3 acts by regulating the expression of downstream genes in these processes. A thorough screen for genes regulated by Pax3 has not been performed, but a handful of putative target genes have been described, including microphthalmia-associated transcription factor (MITF), platelet-derived growth factor alpha receptor (PDGFαR), and c-MET.

Mutations in MITF were identified in individuals with Waardenburg syndrome type 2 (Tassabehji et al., 1994b). Due to the overlapping expression patterns between Pax3 and MITF and the similar phenotypes observed in individuals with mutations in both genes, it was predicted that MITF was a target of PAX3. Examination of the MITF promoter identified a putative PAX3 binding site that could be bound in vitro by PAX3 (Watanabe et al., 1998). Co-transfection experiments with a PAX3 expression vector and an MITF-luciferase reporter construct demonstrated that PAX3 could transactivate the MITF promoter (Watanabe et al., 1998). However, no expression studies have been reported to support the hypothesis that MITF is regulated by PAX3 in vivo.

PDGFαR is a protein tyrosine kinase receptor specific to platelet-derived growth factor (PDGF) (Heldin and Westermark, 1999). Overlapping expression patterns between Pax3 and PDGFαR during embryogenesis and the role of PDGF in oncogenesis suggested that PDGFαR might be a target of PAX3/FKHR in ARMS (Epstein et al., 1998; Heldin and Westermark, 1999). In transient co-transfections, a PDGFαR promoter luciferase construct was transactivated by PAX3/FKHR (Epstein et al., 1998).
Similarly, endogenous PDGFαR mRNA was induced in a murine embryonal carcinoma cell line following transfection with PAX3/FKHR. Furthermore, PDGFαR expression patterns are disrupted in Splotch embryos, providing evidence that PDGFαR is a target of Pax3 in vivo (Dickman et al., 1999; Henderson et al., 1999).

Another putative target of Pax3 is c-met, a tyrosine kinase receptor required for proper limb muscle development (Bladt et al., 1995; Epstein et al., 1996; Park et al., 1987). Epstein et al. provided four pieces of evidence to show that Pax3 is likely to regulate c-met expression (Epstein et al., 1996). First, a Pax3 binding site was identified in the c-met promoter and bound by Pax3 in vitro. Second, in transient transfections, Pax3 and PAX3/FKHR transactivated a c-met promoter luciferase construct. Third, Pax3 overexpression in C2C12 myoblasts and NIH/3T3 cell lines increased endogenous c-met mRNA levels. Finally, c-met expression is disrupted in the lateral dermomyotome of Splotch mice. Expression levels of c-MET also correlate with PAX3/FKHR levels in ARMS tumors (Ginsberg et al., 1998).

In addition to MITF, PDGFαR, and c-met, several genes have expression patterns in Splotch embryos consistent with regulation by Pax3, including Lbx1, p57Kip2, cdc46, MyoD, and Myf5 (Dietrich, 1999; Hill et al., 1998; Kochilas et al., 1999; Maroto et al., 1997; Mennerich et al., 1998; Schafer and Braun, 1999; Tajbakhsh et al., 1997; Uchiyama et al., 2000). A cDNA microarray hybridization strategy to identify genes activated by PAX3/FKHR in NIH/3T3 cells has also identified several putative targets including myogenin, Igf2, Six1, and Slug (Khan et al., 1999). These expression studies are valuable in identifying genes with expression patterns influenced by Pax3, but do not provide evidence that the genes are directly regulated by Pax3.
Cyclic Amplification and Selection of Targets

In 1989, a novel strategy was described to identify genomic DNA fragments bound by a specific transcription factor. The strategy, Whole Genome PCR (WGPCR), involved separation of DNA fragments bound by a transcription factor in vitro from unbound fragments, recovery of the bound fragments, and amplification of the fragments using the polymerase chain reaction (Kinzler and Vogelstein, 1989). To test the strategy, targets of Xenopus transcription factor IIIA (TFIIIA) were isolated (Kinzler and Vogelstein, 1989). The WGPCR strategy resulted in a several thousand fold enrichment of unique DNA fragments that contained TFIIIA binding sites.

The WGPCR strategy, also called Cyclic Amplification and Selection of Targets (CASTing), was modified to select specific DNA fragments from a pool of random oligonucleotides that have the highest affinity for a transcription factor (Wright et al., 1991). This application of the CASTing strategy is frequently used to identify a consensus binding sequence for a DNA-binding domain or transcription factor. CASTing has been used to identify putative targets of many transcription factors including p53, T3R, GLI, Gcn4, SRF, and Sp1 (Caubin et al., 1994; Eldeiry et al., 1992; Kinzler and Vogelstein, 1990; Mavrothalassitis et al., 1990; Pollock and Treisman, 1990; Thiesen and Bach, 1990).

Overview

The experimental data collected from this project are presented in the following three chapters. The first chapter includes a description of the genomic structure and alternative isoforms of PAX3. Novel PAX3 isoforms were identified and evaluated in terms of functional and evolutionary significance. In Chapter 2, I describe the generation of Pax3 antibodies and conditions to allow for the co-immunoprecipitation of DNA fragments bound by Pax3. A strategy called CASTing was used to create three libraries containing putative Pax3 binding sites.
In Chapter 3, a set of genes identified by the CASTing strategy are examined as putative targets of Pax3.

Chapter 1: Identification and characterization of novel Pax3 coding sequences

PAX3 gene structure

Identification of novel Pax3 isoforms

This project was initiated when a novel isoform of Pax3 was amplified while attempting to generate a full-length Pax3 cDNA clone. Primers were designed from sequences in the 5' UTR and 3' UTR of Pax3 (Table 1: Oligonucleotides, TF180 and TF181, respectively) and used to amplify first-strand cDNA from a 9.5 day mouse embryo. A product of the expected size was amplified, along with an additional product approximately 500 bp shorter. Both products were cloned, sequenced, and compared to the published Pax3 cDNA sequence. With the exception of several PCR-induced point mutations, the longer product was identical to the published sequence for mouse Pax3 (Goulding et al., 1991). The shorter product was also identical to the reported sequence, except that it lacked about 500 bp of contiguous sequence. The ends of the deleted region are similar to mammalian intron consensus sequences for intron donor and acceptor sites, suggesting that the deleted region was an intron that was retained in the longer cDNA (Figure 1: Comparison of intron 8 to consensus motifs; Senapathy et al., 1990; Shapiro and Senapathy, 1987). Figure 1 also includes comparisons between the intron consensus sequences and the homologous human and quail introns, which are discussed in later sections.

Intron retention is a rarely reported mode of alternative splicing in vertebrates. A well characterized example of intron retention in a vertebrate system is the bovine growth hormone (bGH), in which a small fraction of the bGH transcripts in vivo contain the terminal intron D (Hampson et al., 1989; Hampson and Rottman, 1987).

Based on the human PAX3 gene structure, the alternatively retained intron was located at the extreme 3' end of the gene (Macina et al., 1995).
PAX3 was previously reported to have 8 exons (Macina et al., 1995). The alternatively retained intron begins about 250 bp downstream from the start of exon 8 and 15 nucleotides upstream of the reported translation termination codon. Splicing of the alternatively retained intron would result in the removal of the last 15 nucleotides along with the putative translation termination codon. The sequence immediately downstream of the alternatively retained intron would be translated until a stop codon or intron splice site is encountered. Therefore, we hypothesized that PAX3 contained additional coding sequences not included in the 8 published exons.

The amino acid sequences of the two Pax3 isoforms generated by the retention or removal of the alternate intron are identical except at their carboxy-termini. If intron 8 is present in a mature mRNA, translation will proceed from exon 8 for five amino acids (KPWTF) into intron 8 before reaching the first stop codon (Goulding et al., 1991). Intron 8 is removed from the novel alternate isoform and translation proceeds from exon 8 to exon 9, encoding a different C-terminus (AFHYLKPDIA). The original Pax3 isoform has been designated Pax3c and the novel isoform Pax3d (Barber et al., 1999).

The overall goal of my thesis project was to identify genes regulated by the most biologically relevant form(s) of Pax3. It is possible that different isoforms interact with different affinities to different enhancer elements. Therefore, it was necessary to more carefully examine the Pax3d isoform before deciding which isoform should be used to isolate Pax3 targets. Experiments were designed to compare the evolutionary conservation, expression pattern, DNA-binding, and transactivation properties of Pax3d and Pax3c. This chapter discusses the results obtained from those experiments.

Human PAX3 BAC isolation and characterization

As described previously, a novel coding exon was identified downstream of exon 8.
Published PAX3 sequence extended only about 400 bp into intron 8 (Macina et al., 1995). Macina and coworkers hypothesized that exon 8 was the terminal coding exon of PAX3 and that a putative poly(A) signal located about 200 bp downstream of the termination codon represented the end of the PAX3 3' UTR (Macina et al., 1995). With the preliminary identification of coding sequences further downstream, we sought genomic clones extending further into the 3' end of PAX3 in order to obtain additional DNA sequence information of novel coding domains as well as the 3' UTR. A contig of genomic clones spanning PAX3 also served as a resource for other PAX3 projects ongoing in our laboratory.

With these goals in mind, BAC clones were isolated for human PAX3 exons 1, 4, 5 and 8. BAC clones are circular plasmids propagated in E. coli, allowing for ease in handling and purification. BAC clones are also more stable and less likely to be chimeric than YAC clones (Monaco and Larin, 1994; Rouquier et al., 1994). BAC clones from the Research Genetics library range in size from 90-300 kb and average about 130 kb. BAC pools (Research Genetics) were screened by PCR with primers for each of these exons. Several overlapping BAC clones were obtained.

After identifying BAC clones containing each exon, BAC DNA was purified and amplified with primer pairs from exons 1, 4, 5 and 8. The exons present in each clone were roughly determined (Figure 2: PAX3 BAC Contig). Each exon of PAX3 was represented in one or more of the BAC clones isolated.

Sequence analysis of PAX3 exon 9

A human BAC clone, 81-L-8 as designated by the Research Genetics plate-row-column address, was identified by PCR with primers from PAX3 exon 8 and used as template to sequence the genomic region downstream of exon 8 (Figure 2: PAX3 BAC Contig). Primers were designed to extend the previously reported sequence. Approximately 1000 bp of sequence was generated directly from BAC 81-L-8.
Analysis of this sequence showed that intron 8 is 497 bp, followed by exon 9 encoding the amino acid sequence AFHYLKPDIA, which is identical to that encoded by the murine Pax3d alternate transcript. Intron donor and acceptor sites and a putative branchpoint sequence are also present in PAX3 intron 8 (Figure 1: Comparison of intron 8 to consensus motifs).

Approximately 800 bp of DNA sequence downstream from the start of exon 9 was also obtained from BAC clone 81-L-8. Analysis of this sequence using the Baylor Genome Center's Gene-Finder software (http://www.hgsc.bcm.tmc.edu/SearchLauncher) predicted the presence of at least one additional exon. The putative 5' intron donor site for intron 9 begins one base pair upstream of the translation stop codon utilized in the Pax3d isoform (see Amplification of alternate PAX3 cDNAs).

Assembly of PAX3 ESTs in GenBank

The DNA sequence obtained from 81-L-8 was used as a query sequence in a BLAST search of the dbEST database, a non-redundant EST database containing all EST entries in GenBank. This database represents the compilation of EST sequence reads available to the public. Several EST clones were identified using the query sequence obtained from 81-L-8. Clones that appeared to be from the same region, but extending further in the 3' direction, were used as query sequences for additional BLAST searches. In this way, ESTs from the PAX3 3' UTR were assembled.

The EST entries identified in this process were assembled into a contig. The consensus sequence of this contig (Figure 3: PAX3 Sequence Exons 8, 9, and 10) was analyzed using the Baylor College of Medicine Search Launcher software (http://www.hgsc.bcm.tmc.edu/SearchLauncher), and several motifs were identified, including putative intron donor and acceptor sites (Solovyev and Salamov, 1997). These may or may not be of functional significance in generating PAX3 alternate transcripts with modified function or ability.

Six ATTTA motifs were identified in the 3' UTR.
These motifs are thought to be involved in modulating mRNA half-life (Akashi et al., 1994; Gay and Babajko, 2000). The presence of several of these motifs is consistent with the rapid turnover expected of a spatially and temporally restricted transcription factor.

Several putative poly(A) signals were identified. Although several EST clones appear to end in a similar region (Figure 3: PAX3 Sequence Exons 8, 9, and 10, nucleotides 2308-2316), the Baylor College of Medicine Gene-Finder POLYAH software (http://www.hgsc.bcm.tmc.edu/SearchLauncher) did not identify the AATAAA motif at position 2281 as a putative poly(A) site. The POLYAH software includes several criteria for the identification of a poly(A) signal, including the presence of a GT-rich element located approximately 50 nucleotides downstream of the cleavage site (Salamov and Solovyev, 1997). Since this GT element could not have been present in the EST clones, as it would have been cleaved off during mRNA processing, BAC clone 81-L-8 was sequenced with a primer near the putative poly(A) signal. When approximately 00 bp of additional genomic sequence downstream from the putative poly(A) signal was included in the query sequence, POLYAH identified the AATAAA motif at position 2281 as a putative poly-adenylation signal. This prediction, together with the thirteen EST clones ending at nucleotides 2308-2316 and the predicted size of the Pax3 mRNA based on northern blotting, suggests that this site represents the end of PAX3 transcription.

Since it was possible that additional introns existed between the 5' end of exon 10 and the 3' end of the cDNA contig, two primers (TB39 and TB40) were designed in the far end of the cDNA contig. These primers were used in conjunction with an exon 8 primer (TB16) to amplify the corresponding genomic fragments from BAC clone 81-L-8 and total human genomic DNA.
The PCR products generated from genomic DNA and BAC 81-L-8 were the same size as predicted by the cDNA contig, suggesting that no additional introns are present in the 3' UTR of PAX3.

Another goal of the BLAST analyses of PAX3 sequences was to identify the relative abundance in the EST database of the PAX3c and PAX3d isoforms. Seven human EST clones and one mouse EST clone are present in GenBank that contain both exon 8 and sequences downstream of exon 8 (Accession numbers: AI332917, AI805971, H82467, AI268224, H97691, N33148, AI084479, and AI084479). All of these clones lack intron 8 and are of the PAX3d type. Although this type of BLAST analysis may not produce an accurate quantitative estimate of the relative abundance of the two isoforms, the presence of eight PAX3d EST clones in GenBank suggests that the PAX3d isoform is not an artifact of RT-PCR. It is also interesting to note that the PAX3c isoform is not represented in the GenBank EST database, especially since this appears to be the only isoform used by all other groups investigating the functional properties of Pax3 (Bennicelli et al., 1996; Chalepakis and Gruss, 1995; Chalepakis et al., 1994b; Goulding et al., 1993).

Amplification of alternate PAX3 cDNAs

Although unlikely, it was possible that the cDNA contig generated by the BLAST searches described above included transcribed sequences from a gene immediately downstream of PAX3. In order to demonstrate that the 3' end of the cDNA contig generated by the BLAST searches is included in mature PAX3 mRNA, RT-PCR products were generated using an upstream primer in PAX3 exon 6 (TF268) and downstream primers (TF256, TB22, TB30, TB39, and TB40) located at various positions throughout the cDNA contig. Total RNA from a human lymphoblast cell line was isolated, and first strand cDNA was prepared and amplified by PCR. PCR products were cloned into pGEM-T Easy vector (Promega). Sequence analysis of clones confirmed that the RT-PCR products generated were derived from PAX3.
Interestingly, one clone contained exons 8, 9 and 10 together without any of the intervening introns. This suggested that intron 9 could also be removed in a mature transcript, as predicted by the Baylor sequence analysis software.

Both PAX3c and PAX3d isoforms were identified in the RT-PCR products, although no quantitative estimates were made. Several clones containing novel rearrangements were also obtained; these generally had small deletions flanked by repeated sequences. These clones are likely to be artifacts produced by improper alignment and looping out of the template strand during the elongation phase of the PCR.

Evolutionary conservation of Pax3 sequences

A cDNA sequence of the Coturnix coturnix (quail) homologue of Pax3 (GenBank accession number AF000673) lacks the methionine start codon and the entire 5' UTR. The overall amino acid sequence is 95% identical to murine and human Pax3. It is noteworthy that the quail Pax3 isoform is of the Pax3d type, lacking intron 8. The amino acids encoded by quail Pax3 exon 9 are identical to those of mouse and man. The nucleotide sequences are also well conserved. There is only one base (a third base change) in quail exon 9 that differs from the murine sequence. There are two positions (also third base changes) where the human sequence differs from the mouse and quail sequences. One of these positions is at the putative intron 9 donor site. In humans, the GT nucleotides of the intron donor site are present and intron 9 can be spliced. However, in mouse and quail, the GT nucleotides in the donor site are not present, making it unlikely that it is a functional intron in either mouse or quail.

Little is known about the human PAX3 isoform containing exons 8-9-10, but since that isoform is probably not produced in either mouse or quail, it is unlikely to have any evolutionarily conserved function.

The identification of a quail Pax3 cDNA is valuable for several reasons.
First, the isoform identified does not contain intron 8 sequence, further supporting the evolutionary and perhaps functional significance of the PAX3d isoform. Second, the overall amino acid sequence identity between human and quail of 95% throughout the entire coding region is remarkably high and suggests the importance of maintaining the integrity of Pax3 in vertebrates. Finally, although the quail cDNA lacks any 5' UTR, it does contain about 1500 bp of 3' UTR. Comparison of mammalian and avian 3' UTR sequences may identify domains that are of functional significance, perhaps in regulating mRNA stability.

Having found that the Pax3d isoform is conserved in quail, the next question to answer was whether the Pax3c isoform has also been conserved. Quail genomic DNA, kindly provided by Dr. Karen Friderici, was amplified with primers flanking the putative intron 8 sequence. A PCR product of the expected size was amplified, cloned into pGEM-T Easy vector (Promega) and sequenced (Figure 4: Sequence of Quail Pax3 intron 8). Quail Pax3 intron 8 is 517 bp, as compared to 497 bp for human PAX3 intron 8. Quail intron 8 contains putative intron donor, acceptor, and branchpoint sequences as expected (Figure 1: Comparison of intron 8 to consensus motifs).

If a quail Pax3c isoform is generated, the deduced amino acid sequence (EYCNAIVIDQLYVFLSATKKITGIDPNG*) encoded by the retention of intron 8 shares no sequence similarity to the corresponding mouse and human sequences (KPWTF*). The lack of evolutionary conservation of the amino acids unique to the PAX3c isoform suggests that the Pax3c isoform has no evolutionarily conserved function. BLAST analyses of the deduced amino acids encoded by quail intron 8 retrieve no significant hits, and no prediction can be made about its contribution to the function of Pax3.

Further characterization of the evolutionary conservation of Pax3 exon 9 was performed by amplifying genomic DNA from several divergent vertebrates with primers in the 3' end of Pax3.
PCR products were amplified, cloned and sequenced from rat, cow, dog, rooster, and rhesus monkey. The amino acids encoded by exon 9 are identical in each of these species. Also, the only species containing a predicted intron 9 donor site is human; even the rhesus monkey does not contain a sequence likely to be functional. This suggests that intron 9 is a new intron, created sometime after the divergence of new and old world monkeys about 40 million years ago (Friedman et al., 1985; Sibley and Ahlquist, 1984).

From these evolutionary studies, I conclude that exon 9 has been conserved throughout the evolution of vertebrates and is likely to be functionally significant.

Waardenburg syndrome sequencing project

Waardenburg syndrome type 1 (WS1) is caused by mutations in PAX3 (Baldwin et al., 1992; Morell et al., 1992; Tassabehji et al., 1992). Approximately 100 mutations have been reported in PAX3 in WS1 patients over the past eight years (Attaie et al., 1997; Baldwin et al., 1995; Carey et al., 1998; Chalepakis et al., 1994a; Lalwani et al., 1995; Pasteris et al., 1993; Sotirova et al., 2000; Tassabehji et al., 1994a; Tassabehji et al., 1993; Wildhardt et al., 1996). Most mutations appear to be unique, with only a couple having been found in more than one presumably unrelated family (Baldwin et al., 1995). Mutations have been identified in each of the first eight exons. Although a great deal of effort has been made to identify all mutations, there are a number of WS1 families in which no mutation can be found in the first eight exons. I screened 39 WS families for mutations in exons 9 and 10 and the flanking splice sites. DNA samples were kindly provided by Dr. Andrew Read (Manchester), Dr. Clinton Baldwin (Boston) and Dr. Walter Nance (Virginia).

A 1.8 kb PCR product encompassing exons 8, 9 and 10 and about 1 kb of 3' UTR was amplified from the genomic DNA of WS1 probands for which no mutation had been identified in the known coding sequences.
This fragment was purified and electrophoresed to check for large rearrangements within the region. No large rearrangements were identified. The PCR product was sequenced directly with primers flanking exons 8, 9 and 10.

A PAX3 exon 8 mutation in a WS1 proband

One individual was found to have a 5 bp insertion in the middle of exon 8. This insertion changes the reading frame and results in a premature stop codon a few amino acids later. The mutation introduces an EcoRII site, allowing for easy genotyping of other members of the family as well as normal control individuals. This information was forwarded to Dr. Clinton Baldwin, the principal investigator who provided this particular WS1 sample. He will verify the presence of the mutation in other family members as well as reevaluate the clinical features of affected individuals. To my knowledge, this work has not yet been completed.

No other mutations were identified that are likely to cause WS1 in any of the other 38 samples screened. There are several possible explanations for this observation. First, since exons 9 and 10 are very small, we may not have looked at enough probands to identify a mutation in these exons. Second, it is also possible that mutations in exons 9 and 10 do not cause any gross phenotypic abnormalities. This would suggest that perhaps the extreme COOH terminus of the PAX3 protein has no important function. Third, mutations in the 3' end of PAX3 may not cause WS1, but rather a less obvious phenotype that is not classified as WS1. It would be very interesting to screen a large set of patients that have phenotypes similar to, but not classified as, WS1. Perhaps individuals with mutations in exon 9 exhibit only some of the traits commonly associated with WS1; for example, they may present with nonsyndromal deafness, pigmentation anomalies, or subtle craniofacial irregularities.
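The consequence of the 5 bp exon 8 insertion described in this section can be illustrated with a short sketch. The sequences below are invented for illustration and are not PAX3 exon 8; the function names are likewise hypothetical. The sketch assumes the standard genetic code and the EcoRII recognition sequence CCWGG (W = A or T), and shows how an insertion whose length is not a multiple of three shifts the reading frame, exposes a premature stop codon, and can create a restriction site usable for genotyping.

```python
import re

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate from the first base; stop after the first stop codon (*)."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        protein.append(residue)
        if residue == "*":
            break
    return "".join(protein)


def insert_bases(dna, position, inserted):
    """Return dna with `inserted` placed before the 0-based `position`."""
    return dna[:position] + inserted + dna[position:]


def ecorii_sites(dna):
    """0-based start positions of EcoRII sites (CCWGG, W = A or T)."""
    return [m.start() for m in re.finditer(r"(?=CC[AT]GG)", dna)]


# Hypothetical toy coding sequence (NOT the real PAX3 exon 8 sequence).
WILD_TYPE = "ATGGCTAAAGATTACCTGAAACCTGACATTGCATAA"
# Hypothetical 5 bp insertion after the third codon.
MUTANT = insert_bases(WILD_TYPE, 9, "CCAGG")

if __name__ == "__main__":
    print("wild type:", translate(WILD_TYPE), ecorii_sites(WILD_TYPE))
    print("mutant:   ", translate(MUTANT), ecorii_sites(MUTANT))
```

With these toy sequences, the wild-type translation runs to the final stop codon, whereas the mutant terminates four residues after the insertion point and gains a single EcoRII site. A digest of the PCR product would therefore distinguish the two alleles in this illustrative case.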
PAX3 transcribed polymorphisms

While sequencing the WS1 probands, several polymorphisms were identified (Table 2: PAX3 polymorphisms). These polymorphisms will be valuable for the study of allelic differences in PAX3 expression patterns. The polymorphisms include (T)n (intron 8), C/T (intron 8), (GT)n (exon 10), (G)n (intron 9) and G/C (exon 10). This information has been distributed to investigators studying inter-allelic expression patterns of PAX3. These investigations might reveal the mechanism of reduced penetrance of the individual traits associated with WS1.

Cloning mouse and human Pax3 cDNAs

Primer and experimental design

Several Pax3 expression constructs were needed for various experiments (Table 3: Pax3 Expression Constructs). In order to synthesize proteins in vitro, full-length Pax3 cDNAs were cloned into pGEM-7Zf(+) vector (Promega). This vector contains a multiple cloning site flanked by both T7 and T3 RNA polymerase promoter sites. For in vivo expression of Pax3, constructs were made using the pcDNA3.1/Zeo(+) vector (Invitrogen). This vector contains the cytomegalovirus immediate-early (CMV) promoter for expression in vivo as well as zeocin resistance and ampicillin resistance genes.

We wanted to have both mouse and human expression constructs so that we could identify and characterize the binding and transactivation of both human and mouse enhancer elements. Since we were not sure which isoform (Pax3c or Pax3d) was the more functionally significant form of the gene, expression constructs were prepared for both. We also wanted each construct to be represented by both forward and reverse orientations of the cDNA. The reverse orientation served as a negative control for various experiments.

RT-PCR was used to generate cDNA fragments that included all of the ORF, some 5' UTR, and a mutated splice site at either the intron 8 or intron 9 donor site.
These mutations were necessary in order to ensure that aberrant mRNA splicing linking Pax3 coding sequences to vector sequences would not occur. Although it may have been better to include native 3' UTR, the cloning strategy would have been much more difficult, so the 3' UTR was omitted. The RNA source for mouse RT-PCR was a 9.5 day p.c. mouse embryo. The RNA used to generate human PAX3 clones was isolated from human skeletal muscle (Clontech) and/or a lymphoblast cell line.

Cloning and sequencing results

Mouse and human PAX3 expression constructs were generated as described in the previous section (Table 3: Pax3 Expression Constructs). Plasmid DNA was prepared for each construct, and the inserts were sequenced with both vector and internal primers. Many mutations were identified, and only a few clones contained long stretches of coding sequence free from mutations predicted to change the amino acid sequence. Two reasons could explain the high frequency of mutations observed in the expression constructs. First, a standard Taq DNA polymerase was used instead of an enzyme with higher fidelity. Second, the PCR may have been performed for too many cycles, resulting in an accumulation of mutations in the final products.

Due to the high mutation frequency in the clones, a second strategy was used to produce the expression constructs. Large restriction fragments from regions in the existing clones that were free from mutations were ligated together to form full-length mutation-free constructs. Fewer mutations are generally introduced during restriction digestion and ligation than with PCR, so this strategy was less likely to introduce mutations. Completed constructs were sequence verified over the entire coding region. No mutations were introduced during the shuffling of fragments, and the Pax3 expression constructs were obtained.
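The two explanations offered in this section for the high mutation frequency, polymerase fidelity and cycle number, can be made concrete with a rough calculation. The sketch below assumes a simple model that is not taken from this work: each copying event introduces errors independently at a fixed per-base rate e, so a product of length L that descends through n copying events is error-free with probability (1 - e)^(L x n). The error rates, insert length, and numbers of copying events used here are illustrative placeholders, not measured values.

```python
def fraction_error_free(error_rate, length_bp, copy_events):
    """Probability that a PCR product carries no polymerase-introduced error.

    Assumes (as a simplification) independent errors at a fixed per-base,
    per-copy rate, and ignores template damage and selection during cloning.
    """
    return (1.0 - error_rate) ** (length_bp * copy_events)


if __name__ == "__main__":
    insert_length = 1500  # hypothetical insert length in bp
    # Hypothetical (label, per-base error rate) pairs for comparison only.
    polymerases = [("lower-fidelity enzyme", 1e-4), ("higher-fidelity enzyme", 1e-5)]
    for label, rate in polymerases:
        for copies in (15, 30):  # hypothetical numbers of copying events
            clean = fraction_error_free(rate, insert_length, copies)
            print(f"{label}, {copies} copying events: {clean:.3f} error-free")
```

Under this model both factors act in the exponent, so halving the number of copying events or raising fidelity tenfold each increases the share of error-free clones substantially. This is consistent with the decision to assemble the final constructs from sequence-verified restriction fragments rather than by further amplification.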
Design of the negative controls

In order to demonstrate that results of several experiments were due to the functional properties of Pax3, it was important to design a set of negative controls for each experiment. For EMSAs, western blot analyses, immunoprecipitations and CASTing procedures, a negative control was prepared using the rabbit reticulocyte lysate system (Promega) and an expression construct containing a Pax3 or PAX3 cDNA in the reverse orientation such that RNA synthesis from the T7 promoter generated the antisense mRNA molecule and no Pax3 protein. The negative control cDNAs also contained several point mutations that would prevent synthesis of Pax3 protein in the event that mRNA was made in the correct sense. Use of a clone in the reverse orientation allowed us to control all other conditions including RNA polymerase.

For co-transfection experiments with Pax3 and luciferase reporter constructs, an empty vector or a vector containing a mutant and reversed Pax3 cDNA was used. Frequently, co-transfections are compared against an empty expression vector, so this was probably sufficient for our purposes as well. However, mutant and reverse clones were prepared and used in some experiments, producing results indistinguishable from an empty vector.

In vitro synthesis of Pax3 protein

Once Pax3 expression constructs were generated that were mutation free, a cell-free system was used to synthesize Pax3 protein. The transcription and translation (TNT) coupled rabbit reticulocyte system (Promega) offers several advantages over other methods of protein production. First, since it is a eukaryotic-based system, most post-translational modifications can occur, which would not have been true for a prokaryotic system (Martin, 1999). Second, since the transcription and translation steps are performed concomitantly, RNA purification steps are avoided. Finally, the protocol is technically simple and results appeared reproducible.
Proteins were synthesized using rabbit reticulocyte lysate (Promega) in the presence of 35S-methionine and electrophoresed on a 4-20% gradient Tris-Glycine-SDS gel (Novex). A prominent band of approximately 56 kDa was observed (Figure 5: Pax3 protein synthesized in vitro), consistent with the predicted and reported size of Pax3 protein (Goulding et al., 1991). Several additional bands were also observed when lower concentrations of SDS were included in the loading buffer.

The rabbit reticulocyte lysate contains approximately 100-200 mg/ml of endogenous protein. For a standard synthesis, 25 µl of that lysate is used in each TNT reaction (2.5-5 mg of protein/reaction). Under the manufacturer's conditions, luciferase control vectors are reported to produce between 100 and 500 ng of protein per reaction. Assuming that a similar amount of Pax3 is produced, there is on the order of a 10,000-fold excess (roughly 5,000- to 50,000-fold) of endogenous protein over Pax3 protein in each reaction. It seems likely that the high concentration of endogenous rabbit reticulocyte protein present in the reaction inhibits complete denaturation of the Pax3 protein unless high concentrations of SDS are present. This should be considered for other proteins expressed using the rabbit reticulocyte lysate system.

In conclusion, constructs were obtained for the in vitro and in vivo expression of both human and mouse Pax3c and Pax3d isoforms. The in vitro synthesis of Pax3 protein using these constructs produces a protein of the approximate predicted size that is likely to represent the desired Pax3 protein.

Functional properties of Pax3 isoforms

DNA-binding activity

Pax3 is a transcription factor whose function is presumably to bind enhancer elements and stimulate transcription. Pax3 and other Pax family members are related to the paired gene in Drosophila and share similar DNA-binding attributes (Bopp et al., 1986; Chalepakis and Gruss, 1995; Chalepakis et al., 1994c; Frigerio et al., 1986).
A number of short oligonucleotides have been shown to interact with Pax3 in vitro (Chalepakis and Gruss, 1995; Chalepakis et al., 1994c; Goulding et al., 1991; Phelan and Loeken, 1998; Vogan et al., 1996). Before attempting to identify the genes regulated by Pax3, the DNA-binding specificities of the novel Pax3d isoform and the previously described Pax3c were compared.

The first experiments aimed at determining the relative ability of Pax3c and Pax3d to bind a short oligonucleotide called e5, which is derived from a sequence upstream of the Drosophila even-skipped gene and bound by the Paired protein (Goulding et al., 1991; Hoey and Levine, 1988; Hoey et al., 1988; Levine and Hoey, 1988). The Pax3c isoform had previously been shown to bind to the e5 oligonucleotide under similar conditions (Goulding et al., 1991). No differences were observed in the ability of Pax3c or Pax3d to bind to the e5 oligonucleotide in gel-shift assays (Figure 6: EMSAs with Pax3c and Pax3d). In later studies with several different genomic DNA fragments identified by CASTing (discussed in Chapter 2), no differences in the DNA-binding specificities of Pax3c and Pax3d were observed. This does not mean that there are no differences in vivo. My data and observations only apply to the in vitro conditions described in more detail in the next chapter.

Transactivation potentials of Pax3 isoforms

Pax3c and Pax3d differ from one another only at their carboxy-termini. Previous work suggested that the terminal one-third of the Pax3 protein is important for the transactivation properties (Chalepakis et al., 1994b). The DNA-binding elements are located in the first two-thirds of the protein (Goulding et al., 1991). We therefore hypothesized that if Pax3d was going to have a different function than Pax3c, it was likely that it would be in the transactivation potential of the protein, rather than in DNA-binding.

A recent paper described the regulation of cMET by Pax3 (Epstein et al., 1996).
One of the experiments the authors presented was the transactivation of a cMET promoter-luciferase reporter construct in vivo by Pax3 (Epstein et al., 1996). We used a similar strategy to compare the transactivation potentials of Pax3c and Pax3d. cMET reporter constructs and control Pax3 expression constructs were provided by Dr. Jonathan Epstein (University of Pennsylvania). In order to use a more convenient vector in future transactivation experiments, constructs were generated similar to those used by Epstein and coworkers, but instead using the pGL3 luciferase vectors (Promega). Clones were sequenced to confirm the expected sequence and orientation of the cMET promoter.

P19 embryonal carcinoma cells are an undifferentiated mouse cell line that can be induced to differentiate into neuronal cell types (neurons and glia) upon addition of retinoic acid (McBurney et al., 1982). When cultured in the presence of low concentrations of DMSO, P19 cells differentiate into cardiac and skeletal muscle (McBurney et al., 1982). Pax3 is induced upon addition of either retinoic acid or DMSO to P19 cells (Natoli et al., 1997; Pruitt, 1992). Pax3 is also involved in both neurogenesis and myogenesis (Bennett et al., 1998; Bober et al., 1994; Epstein et al., 1995; Franz et al., 1993; Wada et al., 1997). Therefore, P19 cells were chosen for initial Pax3 transactivation experiments.

Trial experiments were performed using the Tfx-10, Tfx-20, and Tfx-50 reagents (Promega) to determine the proper transfection conditions for P19 cells. Tfx-50 produced the highest transfection efficiency with a control β-gal construct and was used in subsequent experiments. The optimal amount of DNA per transfection was also determined empirically to be 1 µg per well in a 6-well dish.
Co-transfections were performed with a ratio of 2:1 expression vector to reporter construct and later a ratio of 5:2:1 expression:reporter:galactosidase vectors in order to normalize luciferase activity to galactosidase activity. This allowed for normalization of transfection efficiencies between samples.

Experiments normalized to either total protein or β-gal activity suggested that Pax3c and Pax3d have very similar transactivation potentials in this system (Figure 7: Pax3c and Pax3d activation of cMET). No significant differences in the transactivation potentials of Pax3c and Pax3d were observed. The results from control vectors provided by Dr. Jonathan Epstein were similar to those previously reported (Epstein et al., 1996).

In conclusion, a novel isoform of Pax3, named Pax3d, has been characterized. Pax3d has been conserved since the divergence of birds and humans. Pax3c is conserved between mouse and man, but not birds. Pax3d is represented by 8 EST clones in GenBank. Pax3c is not represented in the public EST databases. Pax3d and Pax3c both bind and transactivate a cMET luciferase reporter construct. No difference in function was observed. For these reasons, Pax3d was used for additional studies of Pax3 function.

Chapter 2: CASTing for Targets of Pax3

Pax3 antibodies

Design of Pax3 peptide antigens

In order to co-immunoprecipitate genomic DNA fragments bound by Pax3, we needed to generate antibodies against Pax3. We first considered fusing Pax3 to GST, polyhistidine, or some other epitope tag and using antibodies against the tag for the immunoprecipitations. However, we were concerned that addition of an epitope tag might alter the DNA-binding activity and result in the immunoprecipitation of DNA fragments not bound by Pax3. Therefore, this idea was rejected in favor of an approach that would not require the addition of any extraneous domains to the Pax3 protein.

We decided to generate polyclonal rabbit antibodies against Pax3.
There are several ways to prepare an antigen to which antibodies can be generated. For example, a fusion protein can be produced in E. coli that includes a portion of the desired protein and an epitope tag. The fusion protein is purified via the epitope tag and used as antigen for injection into a rabbit host. This approach is generally very successful, but requires the creation of an expression construct(s). The approach is also limited by the size and solubility of the fusion protein.

A second strategy, and the one we used to generate polyclonal antibodies against Pax3, was to synthesize a short polypeptide to a region of interest within the protein, conjugate the polypeptide to a carrier and inject the complex into a rabbit. This method is faster than creating a fusion protein and has the added benefit of allowing a very small stretch of amino acids to be chosen as the antigen. We decided to generate antibodies that would specifically recognize Pax3c or Pax3d as well as antibodies that would interact with either isoform.

In designing the peptides, I consulted Dr. Robert Wenthold (National Institute on Deafness and Other Communication Disorders) and Dr. Michael Shiui (Princeton Biomolecules). Both investigators had a history of success in designing peptides for antibody generation. The main considerations we used in choosing the peptides were the length, hydrophobicity, number of charged residues, proline content, and absence of internal cysteines. Specific to the needs of our project, we also chose peptides away from the DNA-binding domains, peptides unique to the Pax3c and Pax3d isoforms, and peptides that were conserved between mouse and human Pax3, but not present in the most closely related Pax protein, Pax7.

The peptides synthesized were: Pax3c-specific [NH2]-CKPWTF-[COOH], Pax3d-specific [NH2]-CAFHYLKPDIA-[COOH] and Pax3 [NH2]-CSYQPTSIPQAVSD-[CONH2] (Table 4: Pax3 antibodies).
The first peptide was designed so that antibodies against it would specifically interact with both human and mouse Pax3c. The second peptide was designed so that antibodies against it would specifically interact with both human and mouse Pax3d. The third peptide was designed so that antibodies against it would interact with both human and mouse Pax3c, Pax3d and human PAX3/FKHR proteins. Cysteine residues (underlined) not encoded by Pax3 were added to the amino termini of the peptides to allow conjugation via the sulfhydryl group to the carrier, Keyhole Limpet Hemocyanin (KLH).

Each conjugated peptide was injected into two New Zealand white rabbits. Rabbits PB30/PB31, PB32/PB33, and PB34/PB35 were injected with the Pax3c-specific, Pax3d-specific, and Pax3 epitopes, respectively. Blood was drawn at multiple timepoints, and antisera were collected and tested against Pax3 peptides and in vitro synthesized proteins as discussed below.

Evaluation of antibodies against Pax3

Antisera were first tested against the peptides used to generate the antibodies. Non-conjugated peptides were resuspended in a 1:10 dilution series and spotted on nitrocellulose membranes. Antisera from all six rabbits were tested against the three peptides. The secondary antibodies used were donkey anti-rabbit coupled to horseradish peroxidase and, in later experiments, goat anti-rabbit coupled to horseradish peroxidase. Signals were detected with SuperSignal Chemiluminescent Substrate (Pierce). In each case, the antisera recognized the peptide to which it was generated. Cross-reactivity to the other peptides was not observed. I therefore concluded that antibodies had been generated by the rabbits against the KLH-coupled Pax3 peptides.

The next step was to test the ability of the antibodies to bind Pax3 protein. Pax3 protein was synthesized using rabbit reticulocyte lysate (Promega) and electrophoresed on a 4-20% gradient Tris-Glycine-SDS gel (Novex).
The proteins were transferred to nitrocellulose membranes and western blotted using antisera from the six rabbits. Each of the antisera recognized the respective Pax3 proteins as predicted. For example, antisera generated against the peptide specific to the Pax3d isoform interacted with Pax3d, but not Pax3c (Figure 8: Western blot analysis with antibody PB33). Antisera generated against the peptide specific to the Pax3c isoform interacted with Pax3c, but not Pax3d. Antisera generated against the peptide in common to both Pax3c and Pax3d recognized Pax3c, Pax3d, and PAX3/FKHR (Figure 9: Western blot analysis with antibody PB35). The construct used to synthesize the Pax3-Spd protein shown in Figure 9 was generated by site-directed mutagenesis of a Pax3d-pGEM plasmid. The resulting protein is identical to the Pax3d isoform except that it contains the missense mutation in the paired domain that is responsible for the Spd mutant mouse. The PAX3/FKHR expression construct was kindly provided by Dr. Frederic Barr (University of Pennsylvania). For all antisera, similar results were obtained for both human and mouse Pax3 protein.

Although all antisera recognized the expected proteins, there were slight differences between individual antisera in the signal intensity visualized on the western blots. This could have been due to differences in either antibody titer or affinity. The antisera that produced the most intense signals on the western blots were selected for additional experiments. IgG was purified from the antisera using the Econo-Pac Serum IgG Purification system (Bio-Rad) and used for co-immunoprecipitations and EMSAs.

Optimizations

Buffer selection

Before performing co-immunoprecipitations with Pax3, it was necessary to develop conditions that would enable Pax3 to interact specifically with its target DNA sequences.
Many buffers have been described that promote the binding of transcription factors to specific DNA sequences (Caubin et al., 1994; Chodosh et al., 1988; Vortkamp et al., 1995). After reviewing the EMSA literature for buffers that worked well in other systems, a set of 8 buffers was prepared and tested for use with Pax3 (Table 5: Buffers for Pax3 EMSAs and IPs). The buffers varied greatly in content, particularly in KCl concentration, which ranged from 10 to 100 mM. EMSAs were performed to test each buffer condition using Pax3 protein synthesized in vitro, a radiolabelled e5 oligonucleotide, and antibodies against Pax3. Reactions were performed in parallel and electrophoresed on one gel. Results indicated that several buffers promote Pax3 interaction with the e5 oligonucleotide. Buffer B7, which produced the best results, was chosen. The most important component of the buffers in influencing the binding of Pax3 to the e5 oligonucleotide appeared to be the KCl concentration. Increasing the amount of KCl decreased the amount of radiolabelled probe bound by Pax3. I assume that decreasing the concentration of KCl also resulted in a decrease in the specificity of the Pax3-DNA interaction, so an important control is to test the buffer with an oligonucleotide to which Pax3 does not bind. A second set of EMSAs was performed to test the binding activity of Pax3 to the Ie oligonucleotide. Pax3 was previously shown not to bind to the Ie oligonucleotide under conditions that permitted binding to e5 (Chalepakis et al., 1994b; Goulding et al., 1991). The positive control lanes using a radiolabelled e5 oligonucleotide were indistinguishable from previous results. The Ie oligonucleotide failed to produce any Pax3-dependent shift. It must be noted, however, that a protein in the rabbit reticulocyte lysate binds to and shifts both the e5 and Ie oligonucleotides and forms a complex that migrates to nearly the same position as the Pax3 complexes.
This Pax3-independent shift has been observed by others performing similar assays using rabbit reticulocyte lysate to express Pax3 protein (Watanabe et al., 1998). Although the Pax3-independent complex migrates to nearly the same position as the Pax3-dependent complex, Pax3-dependent binding can be distinguished by super-shifting the complex with Pax3 antibodies. The Pax3-independent complex is not super-shifted with Pax3 antibodies. This is an essential control for all EMSAs using Pax3 protein synthesized in rabbit reticulocyte lysate.

[...] specific Pax3-DNA targets. Radiolabelled e5 oligonucleotide was selected as the positive control to develop conditions for the co-immunoprecipitations. Pax3 protein and e5 oligonucleotide were incubated as determined by the EMSA optimization experiments using buffer B7 (Table 5: Buffers for Pax3 EMSAs and IPs). Following incubation with Pax3 antibody, GammaBind™ G Sepharose™ (Amersham Pharmacia Biotech) was added to the reaction. The reaction was incubated for an additional hour at room temperature with constant rotation. The complex was washed 4 times with 1 ml of reaction buffer. After the final wash, the immunoprecipitated DNA was recovered and the amount of radiolabelled probe was measured in a scintillation counter. The amount of radiolabelled probe recovered from a reaction containing Pax3 protein, e5 oligonucleotide, and Pax3 antibody was compared to that from a reaction that contained e5 oligonucleotide and Pax3 antibody, but lacked Pax3 protein (Figure 10: Immunoprecipitation of Pax3-DNA complexes). Similar reactions were performed using different Pax3 proteins and antibodies as well as the Ie negative control DNA probe. Although low levels of counts were observed in negative Pax3 controls, co-immunoprecipitations including Pax3 always resulted in the recovery of significantly greater counts with the e5 oligonucleotide. Results using antibodies PB31, PB33, and PB35 were comparable.
No differences were observed between Pax3c, Pax3d, PAX3c, PAX3d, and PAX3/FKHR proteins. Low counts, similar to the counts recovered in the negative protein controls with the e5 oligonucleotide, were recovered from immunoprecipitations using the Ie control probe, independent of Pax3 protein. We concluded that conditions were established for specific co-immunoprecipitation of DNA fragments bound by Pax3 using the PB31, PB33, and PB35 antibodies and Pax3c, Pax3d, PAX3c, PAX3d, and PAX3/FKHR proteins.

CASTing Methodology

WGPCR strategy and rationale

The polymerase chain reaction (PCR) was invented in the mid-1980s and, like the mechanism of PCR itself, an amplification of methodologies based on the technique followed in the late 1980s and early 1990s (Erlich et al., 1991; Templeton, 1992). One such method, Whole Genome PCR (WGPCR), was designed to identify the DNA fragments bound by a specific transcription factor (Kinzler and Vogelstein, 1989). The term WGPCR has since been used to describe a divergent set of techniques ranging from the amplification of genomic DNA from a single cell to PCR-based comparisons of prokaryotic genomes (Harper and Wells, 1999; Miteva et al., 1998; Zhang et al., 1992). A more suitable name for the identification of DNA fragments bound by a transcription factor using the PCR is Cyclic Amplification and Selection of Targets (CASTing) (Wright et al., 1991; Wright and Funk, 1993). The basic premise of CASTing is that DNA fragments bound to a DNA-binding protein can be separated from unbound DNA and amplified by PCR. Several rounds of selection for binding to the protein and amplification by the PCR result in an enrichment of DNA fragments bound by the protein. We used a CASTing strategy to isolate genomic DNA fragments bound by three proteins: murine Pax3d, human PAX3d, and human PAX3/FKHR.
We isolated targets of murine Pax3d because the mouse model Splotch has been widely used for the study of neural tube defects and abnormal muscle and heart development (Bober et al., 1994; Conway et al., 1997a; Conway et al., 1997b; Dickman et al., 1999; Goulding et al., 1994; Tajbakhsh et al., 1997; Tremblay et al., 1998). Identification of putative targets of Pax3 will allow researchers to elucidate the genetic pathways in which Pax3 functions and how these pathways are perturbed in animals with mutations in Pax3. Targets of human PAX3d were isolated for several reasons. First, we were not sure if mouse and human Pax3 would regulate the same genes. The amino acid sequences of Pax3 and many other transcription factors have been conserved for millions of years, but humans are by definition distinct from other species. Perhaps these differences are due in part to differences in the genes regulated by transcription factors. Second, we selected targets of PAX3 with the hope of identifying an overlapping set of homologous genes that are putative targets of Pax3 in both species. Finally, the in vitro approach we used has advantages over other experimental approaches like differential display, in that use of primary tissues such as embryos is not required. CASTing is therefore ideal for the study of human gene regulation. Targets of PAX3/FKHR were isolated because understanding the cause of alveolar rhabdomyosarcoma (ARMS) is key to the treatment of this deadly pediatric cancer. Since both PAX3 DNA-binding domains are intact in the PAX3/FKHR fusion protein, it has been suggested that the fusion protein aberrantly regulates the same genes as normal PAX3 (Fredericks et al., 1995; Sublett et al., 1995). The formation of tumors may therefore be due to the activation of transcription of targets of PAX3 at inappropriate times, in different tissues, or to varying degrees.
It is also possible that the DNA-binding properties of the PAX3/FKHR fusion protein are altered, resulting in the regulation of genes not normally under the control of PAX3 (Cao and Wang, 2000). Although most in vitro and cell culture experiments suggest that the PAX3/FKHR fusion protein binds sequences similar to those bound by endogenous PAX3 and activates them with increased potency, no evidence is available regarding the DNA-binding properties of the fusion protein in vivo in ARMS cells. Identification of the targets of PAX3/FKHR and comparison of the regulation of these targets by PAX3 and PAX3/FKHR might explain how the creation of the PAX3/FKHR fusion protein results in ARMS.

Preparation of genomic DNA

High molecular weight genomic DNA from human and from C57BL/6J mice was partially digested with Sau3AI to an average size of approximately 1500 bp as determined by agarose gel electrophoresis. Following digestion, the DNA was phenol extracted and ethanol precipitated. Double-stranded linkers (TB97/TB98, Table 1) with overhangs complementary to the 5' GATC overhang produced by the Sau3AI digestion were ligated to the genomic DNA. The linkers were tested prior to ligation to ensure that PCR amplification of genomic DNA with the primers did not generate spurious fragments. The partially digested, linker-ligated DNA was electrophoresed on a 1.5% agarose gel. DNA was purified from the gel in two size classes, 250-1000 bp and 1000-3000 bp. Before initiating the CASTing strategy, aliquots of the genomic DNA were amplified with primers matching the linkers. A series of PCRs was performed with varying numbers of cycles. PCR products were electrophoresed on an agarose gel and visualized by ethidium bromide staining. After 10 cycles of PCR amplification, the size distribution of the DNA products was approximately equal to that of the input DNA for both size classes. Amplification for additional cycles produced products with a decreased average size.
This is probably the result of an amplification bias for smaller PCR products. Therefore, it is critical to control the number of PCR cycles during the CASTing strategy in order to preserve the size distribution of DNA molecules. This extra precaution to avoid overamplification is frequently omitted from CASTing strategies, and its omission may explain the reduced success of the process (Harris et al., 2000). The linker that was ligated to the genomic DNA fragments for the Pax3 CASTing library was generated by annealing primers TB97/TB98. I detected no problems when clones from the Pax3 CASTing library were sequenced (section CASTing Methodology: Description of WGPCR strategy). However, when the NIH Intramural Sequencing Center (NISC) attempted to sequence these clones, an artifact arose around the linker that resulted in a decrease in sequence quality. The sequencing difficulties were partially overcome by changing from ABI Prism® BigDye™ Terminator reagents to ABI Prism® BigDye™ Primer reagents (PE Applied Biosystems). However, Dr. Jeff Touchman (NISC) suggested that we redesign the linkers before preparing additional CASTing libraries to avoid future difficulties. Three new sets of linkers were designed and tested as described above. The linker generated by annealing primers TB323/TB324 was selected and ligated to freshly prepared Sau3AI partially digested genomic DNA as described above. This DNA was used to generate the next three CASTing libraries. Sequence analyses of these libraries performed by the NISC were again sub-optimal with the BigDye™ Terminator reagents, but were satisfactory with BigDye™ Primer reagents. It therefore seems likely that the DNA sequencing problems encountered by the NISC were not due to the sequence of the linkers as had been suggested. Since the sequencing was satisfactory using the BigDye™ Primer reagents, no additional optimization of linker sequence was performed.
Generation of Pax3 CASTing library

The first CASTing library was generated using mouse genomic DNA, mouse Pax3d protein, and antibody PB33. The library was created by incubating approximately 250-500 ng of linker-ligated, size-selected genomic DNA with Pax3d under conditions described above. The Pax3/DNA complexes were immunoprecipitated with antibody PB33, and the DNA was recovered in 100 µl of H2O. An aliquot of this DNA was used as template for PCR amplification with a primer hybridizing to the linker. A cocktail of the PCR mixture was prepared and aliquoted to separate reactions; each reaction was amplified for a different number of cycles (e.g., 10, 15, 20...). The reaction products were electrophoresed and stained with ethidium bromide (EthBr). After a rough estimate was made of the number of cycles needed to detect products by EthBr staining, additional reactions were amplified for a narrower range of cycles (e.g., 15, 16, 17, 18, and 19). Aliquots of these products were electrophoresed and stained. The products were purified from the reaction that had undergone two cycles fewer than the first reaction in which products could be visualized by EthBr staining. For example, if products were first visible after 18 cycles, the products from the reaction undergoing 16 cycles of amplification were purified and used for the next round of selection. This was done to reduce the amplification bias for smaller products that occurs with increased cycles.

After the first round of selection and amplification using the smaller size class of DNA fragments (250-1000 bp), the products displayed a size distribution similar to that of the starting material. However, the size distribution of the products from the larger size class did not resemble the input DNA. Instead, a smear extended from the expected size all the way up to the well of the gel.
I hypothesized that the smear was the result of chimeric DNA fragments being formed during the PCR. I size selected the products in the 1000-3000 bp range by gel purification. However, upon amplification of the gel-purified products by PCR, a large smear was again observed. The high molecular weight smear was observed as soon as any products were visible. Concerned that this smear was due to the formation of chimeric products, I did not generate CASTing libraries with the larger size class of fragments. I did not want to select chimeric fragments that contained a Pax3 binding site from one part of the genome and a gene from another part of the genome. I was concerned that this might result in the recovery of too many false positive genes that would need to be evaluated in later, more time-consuming steps. Elimination of the larger size class of DNA fragments reduced the probability of detecting a gene whose Pax3 binding site is further than a few hundred nucleotides from a transcribed portion of the gene (see section Genes identified). Three rounds of selection and amplification were performed to make the first CASTing library, subsequently designated the Pax3 or "cq" CASTing library. Library cq was cloned into the pGEM®-T Easy vector (Promega) and transformed into DH10B cells. Transformation of 2 µl of the ligation reaction produced thousands of ampicillin-resistant colonies. The majority of these, estimated at greater than 95%, contained inserts.

Pilot analyses of clones from Pax3 library

Before proceeding with large scale analyses of the cq library or the preparation of additional libraries, a small set of random colonies was examined. Eight white colonies were randomly selected and amplified directly with primers TB44/TB45. These primers hybridize to sequences within the pGEM®-T Easy vector (Promega), flanking the insert. The PCR products were purified and sequenced.
The vector and linker sequences were removed and the inserts were MASKed for repetitive elements using RepeatMasker (Smit, AFA & Green, P RepeatMasker at http://ftp.genome.washington.edu/RM/RepeatMasker.html). The masked sequences were compared to the GenBank nr and dbEST databases using BLAST v.2.0. BLAST results indicated that three of the eight clones were similar to previously described genes. The remaining five sequences were not similar to any GenBank entries. None of the eight clones were redundant. One reason for initially characterizing a few random clones from the library was to test if they contained Pax3 binding sites. DNA fragments could have been carried over during the immunoprecipitations that did not interact with Pax3. This type of background was expected and was the reason for multiple rounds of selection and amplification (Kinzler and Vogelstein, 1989; Kinzler and Vogelstein, 1990). The question is how many rounds of selection and amplification are sufficient to adequately enrich for Pax3 binding sites. Presumably if too few rounds are used, clones will be recovered that do not contain Pax3 binding sites. If too many rounds are included, a selection for the fragments that amplify the most efficiently or are bound by Pax3 the most tenaciously may result. I decided to perform three rounds of selection and amplification and then determine the percentage of clones that contain Pax3 binding sites under the in vitro conditions used in the CASTing selection. The eight random inserts amplified and sequenced above were radiolabelled and immunoprecipitated with Pax3. The counts recovered were compared to immunoprecipitations with the negative Pax3 protein control (Figure 11: Immunoprecipitation of Pax3 CASTing clones). All eight of the random clones were immunoprecipitated with Pax3, but not with the negative Pax3 protein controls. Therefore, three rounds of selection were sufficient to remove most of the fragments not bound by Pax3.
Furthermore, since none of the products were immunoprecipitated with the negative protein control, the selection of these clones in the CASTing strategy was due to a specific interaction with Pax3 and not the result of an interaction with a protein in the reticulocyte lysate, IgG, or GammaBind™ G Sepharose™, which were also present in the co-immunoprecipitations. Two PCR products from an unrelated project were also labelled and immunoprecipitated as negative DNA controls. The two negative control DNA fragments were not co-immunoprecipitated with Pax3. Another pilot experiment was conducted to determine if the eight clones were derived from contiguous sequence in the mouse genome. Primer pairs were designed to sequences at the ends of the DNA fragments and used to amplify mouse genomic DNA. In each case, a PCR product of the expected size was amplified. We concluded that the eight clones were derived from contiguous mouse genomic sequence. In conclusion, analysis of the eight clones provided evidence that the cq library consisted of clones that contained Pax3 binding sites, suggesting that three rounds of selection and amplification were sufficient to enrich for putative Pax3 binding sites. Furthermore, the eight clones appeared to be derived from contiguous regions of genomic DNA, were balanced in GC content, and in some cases matched sequences designated as genes in GenBank. The library was therefore ready to be analyzed on a larger scale.

Generation of PAX3 and PAX3/FKHR libraries

The linkers used to generate the Pax3 CASTing library were replaced with linkers made with primers TB323/TB324. Other than that substitution, the strategy for creating the next three CASTing libraries was the same as was used to generate the cq library. The libraries were prepared with human genomic DNA size-selected to be 250-1000 bp and protein synthesized in vitro using the rabbit reticulocyte system.
The proteins included PAX3d, PAX3/FKHR and the same negative control described above for EMSAs and IPs. The number of cycles of PCR for each round of amplification was again determined empirically as two cycles before the products could be visualized by EthBr staining (Table 6: Summary of CASTing libraries). A total of three rounds of selection and amplification were performed.

CASTing Results

Clone sequence analysis

Clones from the CASTing libraries were sequenced en masse. PCR products from approximately 500 clones were sequenced using ABI Prism® BigDye™ Terminator Cycle Sequencing Ready Reaction reagents (PE Applied Biosystems). Another 3000 clones were sequenced using plasmid miniprep DNA and ABI Prism® BigDye™ Primer Cycle Sequencing Ready Reaction reagents (PE Applied Biosystems). In all, high quality sequence data were obtained for 1260 clones from the Pax3 (cq) library, 1203 clones from the PAX3 (ev) library and 1088 clones from the PAX3/FKHR (ew) library (Table 6: Summary of CASTing libraries). The vector and linker sequences were trimmed, and the sequences were assembled in a text document in FASTA format. The total number of bases of high quality sequence generated from the cq, ev, and ew libraries after removal of vector and linker sequences was approximately 1,300,000 bp. The average sequence read lengths generated from clones in the Pax3, PAX3, and PAX3/FKHR CASTing libraries were 353 bp, 355 bp, and 357 bp, respectively. This estimate does not necessarily represent the insert size since only a single sequencing run was generated for most clones. The sequence of the entire insert from many of the shorter clones was obtained, including the flanking Sau3AI sites. A few clones of interest were later sequenced in the reverse orientation. A large number of clones, however, were not completely sequenced. Therefore the average insert size of the clones is greater than the average sequence read.
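As a quick consistency check on the figures in this section, the clone counts and average read lengths can be multiplied out and summed. The sketch below is illustrative only: it uses the numbers stated in the text, and the variable and function names are hypothetical. Because the average read lengths are rounded, the product is an approximation of the total rather than an exact count.

```python
# Clone counts and average read lengths (bp) as reported in the text.
# The dictionary keys and function name are illustrative, not from the thesis.
libraries = {
    "cq (Pax3)": (1260, 353),
    "ev (PAX3)": (1203, 355),
    "ew (PAX3/FKHR)": (1088, 357),
}


def estimated_total_bases(libs):
    """Approximate total sequence as the sum of clones x mean read length."""
    return sum(n_clones * mean_len for n_clones, mean_len in libs.values())


total_clones = sum(n_clones for n_clones, _ in libraries.values())
total_bases = estimated_total_bases(libraries)

print(f"clones sequenced: {total_clones}")
print(f"estimated bases:  {total_bases:,} bp")
```

The estimate comes to roughly 1.26 million bases, in line with the approximately 1,300,000 bp of high quality sequence stated above.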
Repetitive elements

The sequences of the CASTing clones were scanned for interspersed repetitive elements using RepeatMasker version 4/21/99 (Smit, AFA & Green, P RepeatMasker at http://ftp.genome.washington.edu/RM/RepeatMasker.html). The clones were assembled in FASTA format and submitted through the RepeatMasker web page. The clones were analyzed by the RepeatMasker program and returned with the sequence of repetitive elements replaced with N's. A table of the repeat elements as well as a list of the type of repeat identified in each clone were also reported. A summary of the repeats identified in the cq, ev, and ew libraries is shown in Tables 7, 8, and 9, respectively.

The cq library contained 444,813 bp of genomic sequence. Of that, 108,790 bp (24.46%) were repetitive sequences MASKed by RepeatMasker (Table 7: Repetitive elements in CASTing library cq). The repeats consisted primarily of LTR elements (41%), LINEs (34%), and SINEs (18%). DNA elements, simple repeats, low complexity sequences, small RNAs, and other unclassified repeats contributed the remaining 6% of repeat sequence. No satellite DNA was identified in the cq library.

The ev library contained 426,527 bp of genomic sequence. Of that, 157,962 bp (37.03%) were repetitive sequences MASKed by RepeatMasker (Table 8: Repetitive elements in CASTing library ev). Again the majority of the repeats were derived from LTR elements (24%), LINEs (10%), and SINEs (58%). SINEs were much more common in this library than in the cq library. Similar results were found for the ew library, which contained 388,527 bp of genomic sequence, 152,121 bp (39.15%) of which were repetitive sequences MASKed by RepeatMasker (Table 9: Repetitive elements in CASTing library ew). As in the ev library, the majority of the repeats were derived from LTR elements (25%), LINEs (21%), and SINEs (48%).
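The repeat fractions quoted above follow directly from the base counts. The short sketch below recomputes them and expresses each human library's repeat content relative to the mouse library. It uses only the numbers given in the text; the dictionary layout and function name are hypothetical choices made for illustration, and Tables 7 to 9 themselves are not reproduced.

```python
# Total and MASKed base counts as reported in the text for each library.
LIBRARIES = {
    "cq": {"species": "mouse", "total_bp": 444_813, "masked_bp": 108_790},
    "ev": {"species": "human", "total_bp": 426_527, "masked_bp": 157_962},
    "ew": {"species": "human", "total_bp": 388_527, "masked_bp": 152_121},
}


def repeat_percent(lib):
    """Percentage of library sequence MASKed as repetitive."""
    return 100.0 * lib["masked_bp"] / lib["total_bp"]


percents = {name: repeat_percent(lib) for name, lib in LIBRARIES.items()}

# Repeat content of each human library relative to the mouse (cq) library.
relative_to_mouse = {
    name: percents[name] / percents["cq"]
    for name, lib in LIBRARIES.items()
    if lib["species"] == "human"
}

for name, pct in percents.items():
    print(f"{name}: {pct:.2f}% repetitive")
for name, ratio in relative_to_mouse.items():
    print(f"{name} / cq: {ratio:.2f}x")
```

Under these numbers the human libraries carry roughly one and a half times the repeat fraction of the mouse library.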
The overall percentage of repeats detected in the human libraries is about 50% greater than was detected in the mouse library. The reason for this is unknown. The human genome project has produced much more data than the mouse genome project, so perhaps the matrices used by the RepeatMasker software to detect repeats are more comprehensive for human sequences and thus detect repetitive elements more efficiently. The differences may also reflect a variation in the occurrence of repetitive elements contained in the human and mouse CASTing libraries.

The MASKed sequences were assembled into FASTA format and BLASTed against the GenBank nr and dbEST databases. Many repetitive elements were detected. LTRs in particular were not MASKed well. Often, the ends of LTRs were not MASKed, resulting in the false identification of GenBank entries that contained LTRs. Perhaps the ends of LTRs are not well defined in the RepeatMasker matrix, causing incomplete detection and MASKing of these sequences. In order to avoid these false positive BLAST results, all MASKed clones were edited prior to additional BLAST analyses. Short (less than 100 bp) stretches of DNA sequence flanking repeat elements were deleted. The MASKed repeats and approximately 20-50 bp of flanking sequence were also deleted. This extra step resulted in far fewer LTRs being identified in the BLAST analyses. It is possible that the deletion of these sequences prior to the BLAST analyses resulted in a failure to identify some targets.

Genes identified

The MASKed and genomic sequences were BLASTed against the GenBank nr and dbEST databases. A threshold of E = 1e-10 was used as the lower limit for a "hit" to be returned in the results. This was done to eliminate the detection of entries with low sequence similarity.
For the initial BLAST analyses, I was primarily interested in identifying transcribed units that included long stretches (>50 bp) of identical or nearly identical sequence. The genes identified by these BLAST analyses for the cq, ev, and ew libraries are shown in Tables 10, 11, and 12, respectively. Each row in Tables 10-12 lists a clone number and the best match from either the GenBank nr or dbEST databases. Genes identified in these BLAST analyses are discussed in detail in Chapter 3.

Clones in multiple libraries

One of the hypotheses we wanted to test was whether the same target genes were identified in the cq, ev, and ew libraries. If the same DNA fragments were found in multiple libraries, it would provide evidence that enrichment for Pax3 binding sites occurred. If the libraries were not enriched and still consisted of random genomic DNA sequence, then finding the same DNA fragment multiple times in different libraries would be extremely unlikely. The libraries were BLASTed against one another. The minimum thresholds were E = 1e-20 for human to human library comparisons and E = 1e-6 for mouse to human comparisons. Several clones were identified whose sequence was represented in multiple libraries (Table 13: Sequences in multiple CASTing Libraries). Only clones with nearly identical sequences were considered positive for comparisons between human libraries. Less stringent criteria were used for human and mouse comparisons to allow for changes that have occurred in homologous DNA fragments since the divergence of mouse and man. A few mismatches were also permitted in human to human comparisons to allow for mutations generated during the multiple rounds of PCR amplification in the CASTing process. Clones with similarity to rRNA were excluded from this list. The identification of the same or homologous DNA fragments in more than one library provided evidence that an enrichment had occurred.
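The cross-library comparison described above amounts to filtering BLAST hits with a species-dependent E-value cutoff. The following is a minimal sketch of that filter, not the procedure actually used: the hit tuples and clone identifiers are invented for illustration, the function names are hypothetical, and the cutoffs of 1e-20 and 1e-6 are the values as read from the text, interpreted here as maximum permitted E-values. The additional exclusion of rRNA-like clones is not modeled.

```python
# Species of origin for each CASTing library, as described in the text.
LIBRARY_SPECIES = {"cq": "mouse", "ev": "human", "ew": "human"}

# Maximum E-values for accepting a match between clones of two libraries
# (assumed reading of the thresholds stated in the text).
MAX_EVALUE_SAME_SPECIES = 1e-20   # human to human comparisons
MAX_EVALUE_CROSS_SPECIES = 1e-6   # mouse to human comparisons


def max_evalue(lib_a, lib_b):
    """Return the E-value cutoff appropriate for a pair of libraries."""
    same_species = LIBRARY_SPECIES[lib_a] == LIBRARY_SPECIES[lib_b]
    return MAX_EVALUE_SAME_SPECIES if same_species else MAX_EVALUE_CROSS_SPECIES


def shared_hits(hits):
    """Keep hits between different libraries that pass the cutoff.

    Each hit is (query_lib, query_clone, subject_lib, subject_clone, evalue).
    """
    kept = []
    for q_lib, q_clone, s_lib, s_clone, evalue in hits:
        if q_lib == s_lib:
            continue  # redundancy within one library is a separate question
        if evalue <= max_evalue(q_lib, s_lib):
            kept.append((q_lib, q_clone, s_lib, s_clone, evalue))
    return kept


# Hypothetical hits, for illustration only.
example_hits = [
    ("ev", "ev-0001", "ew", "ew-0042", 1e-35),  # human/human, passes
    ("ev", "ev-0002", "ew", "ew-0100", 1e-12),  # human/human, too weak
    ("cq", "cq-0007", "ev", "ev-0300", 1e-8),   # mouse/human, passes
    ("cq", "cq-0009", "ew", "ew-0200", 1e-3),   # mouse/human, too weak
    ("cq", "cq-0001", "cq", "cq-0002", 1e-50),  # same library, skipped
]

shared = shared_hits(example_hits)
for hit in shared:
    print(hit)
```

Using a looser cutoff for mouse to human pairs reflects the reasoning in the text: homologous fragments have diverged between species, whereas fragments from two human libraries should be nearly identical.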
Clone redundancy

Sequence analyses of approximately 1000 clones from each library resulted in some sequences being identified multiple times within the same library. In order to estimate the redundancy within a library, a fragment from the second clone sequenced in the cq library was used as a probe to screen additional clones from the cq library. Clone-2 was chosen because it was identical to an interesting gene, mFAT, identified by BLAST analysis. The insert from clone 2 was amplified, radiolabelled, and hybridized to filters containing approximately 5000 bacterial colonies from the cq CASTing library. Twelve positive clones were picked and verified in a secondary screen. Positives from the second screen were sequenced and compared to the sequence of Clone-2 and other mFAT clones in the cq library. Three types of inserts were identified. Each insert contained a central 444 bp fragment that included a Sau3AI site at each end (Figure 12: mFAT clones in CASTing library cq). Some clones also included 49 bp of additional sequence at the 3' end. A third type of insert included the central 444 bp fragment and a different sequence at the 3' end. Primer pairs were designed to identify which of these fragments are present in genomic DNA. Mouse genomic DNA was amplified with primer pairs TB191/TB320, TB191/TB321, and TB191/TB322. Products of the expected size were observed with primer pairs TB191/TB320 and TB191/TB322, but not with TB191/TB321. The products were cloned and sequenced to confirm identity. To summarize the results, three types of mFAT clones were found in the cq library. The first contains 444 bp and flanking Sau3AI sites. This fragment is entirely derived from the mFAT genomic locus. The second fragment contains the same 444 bp fragment and an additional 49 bp. An internal Sau3AI site is present, and the approximately 500 bp fragment can be amplified from genomic DNA. Therefore, it seems likely that this fragment was derived entirely from sequence at the mFAT locus.
Recall that the first step in the preparation of the genomic DNA was a partial Sau3AI digest. Apparently, the Sau3AI site at the mFAT locus was cut in some fragments, but not in others, resulting in two types of clones derived entirely from the mFAT locus. The third type of insert contained the central 444 bp fragment and an additional stretch of sequence that was not from the mFAT locus. This type of clone is likely to represent a chimeric fragment that arose during the ligation of linkers to the partially digested genomic DNA. Since three different types of clones were found, each containing a central fragment, it is likely that the central fragment contains a Pax3 binding site and that this site has been enriched in the cq library. The putative binding site contained within this fragment was probably bound by Pax3 and selected at least three times (once for each type of clone) during each round of selection in the CASTing process. The presence of three different types of mFAT clones suggests that enrichment of the binding site in the central 444 bp fragment is probably not due to a bias in amplification or cloning efficiency. The second major conclusion drawn was a rough estimate of the frequency of mFAT clones in the cq library. A total of 12 mFAT clones were identified in approximately 6000 colonies. An estimate of the redundancy of the library based on these data is approximately 0.2%. This is not surprising given that very few genes were identified multiple times by different clones in the BLAST analyses of the CASTing libraries.

Chapter 3: Evaluation of putative targets of Pax3

A large number of putative targets of Pax3 were identified by CASTing (Tables 10, 11, and 12). The final goal of this project was to evaluate the CASTing strategy and to provide evidence that the CASTing libraries contain genes that are likely to be regulated by Pax3.
The genes listed in Tables 10, 11, and 12 were reevaluated and a smaller set of genes meeting several criteria was selected. Gel-shift analyses were performed on some of these to determine if Pax3 binds directly to the genomic DNA fragments represented in the CASTing clones. The Pax3-binding site was identified in several clones and compared to other putative Pax3 binding sites. Northern blot analyses were performed to show that the expression levels of putative targets of Pax3 were influenced by levels of Pax3 protein in both mouse and human model systems.

Gene selection

Criteria for selecting genes for study

Tables 10, 11, and 12 contain the BLAST results for clones recovered from the three Pax3 CASTing libraries. Several criteria were considered in order to reduce the number of genes for further characterization, including the extent of the target gene's sequence present in GenBank, the predicted protein function, the expression pattern of the gene, and the relatedness of the gene to other putative targets of Pax3. The complete mouse and human cDNA sequences were not available in GenBank for the majority of the genes identified in the CASTing libraries. In order to study the expression of putative targets in both human and mouse model systems, probes specific to the mRNA of the mouse and human genes were required. The probes were obtained by RT-PCR, provided that the cDNA sequence was available. For some genes, sufficient transcribed sequence was available directly from the CASTing clone to design a probe. However, for the majority of the genes, only a small region of transcribed sequence was present in the CASTing clone, which was not long enough to design a probe. Therefore, preference was given to genes that had full-length mouse and human cDNA sequences in GenBank. The predicted function of the protein encoded by the putative Pax3 target gene was considered in selecting genes for further analyses.
Pax3 is involved in several aspects of development as well as cancer. Presumably, genes that are regulated by Pax3 will also have functions important in these processes. Therefore, a greater emphasis was placed on genes in the CASTing libraries that encoded proteins having predicted functions in tissues altered by mutations of Pax3. For example, one of the CASTing clones contained the promoter of the TGFα gene. TGFα is expressed in several types of tumors, and is likely to have an important role in tumorigenesis (Kumar et al., 1995; Untawale et al., 1993). Since PAX3/FKHR causes ARMS, TGFα is a good candidate for regulation by PAX3/FKHR.

In order for a gene to be directly regulated by Pax3, the target gene must have an expression pattern that overlaps with that of Pax3. Extensive expression studies are time and labor intensive, and are generally not performed until a gene has been shown to be of importance for other reasons. As a result, the expression patterns of only a few genes identified in the BLAST analyses have been reported. For genes with available expression data, I placed greater emphasis on genes transcribed in skeletal muscle, heart, or the neural tube as well as in ARMS cell lines.

The final criterion used to pick genes from the CASTing libraries for further study was their relatedness to previously reported putative targets of Pax3. An example of this is the engrailed-2 gene. The Drosophila homologue of Pax3 is paired. Paired protein regulates engrailed mRNA expression (Bertuccioli et al., 1996; Miskiewicz et al., 1996). One of the mouse homologues of engrailed was identified in the mouse CASTing library. Perhaps the regulation of engrailed by Paired has been conserved throughout evolution and is important for mammalian development.

I have been asked repeatedly during the course of this project if the fragments identified by CASTing showed a bias toward being derived from the 5' end of the gene.
The underlying rationale for this question is the belief that more weight should be given to a putative Pax3 target if the Pax3 binding site is located slightly upstream of a transcription start site. Historically, there has been a bias in the ascertainment of the location of transcription factor binding sites. The first place that investigators usually look for regulatory elements is immediately upstream of the transcription start site. However, there are many examples of cis-regulatory elements located downstream of a transcription start site (Botquin et al., 1998; Kawamoto et al., 1988; Maekawa et al., 1989; Rotheneder et al., 1991; Vergeer et al., 2000). Although the location of the binding site with respect to a transcription start site was not a criterion used to select genes for further analyses, an attempt was made to determine the location of the binding site for genes that looked interesting by the other criteria.

Putative targets of Pax3 identified by CASTing

Using the criteria described in the previous section, a refined list of putative targets identified in the CASTing libraries was created (Table 14: Putative targets of Pax3 identified by CASTing). Table 14 includes the clone in the CASTing library and the name of the gene identified by BLAST analysis of the clone against the GenBank nr and dbEST databases. Redundant CASTing clones were not included in Table 14. A brief description of some of the most interesting of these genes is provided below.

mFAT

mFAT is represented by several clones in the Pax3 CASTing library. mFAT is a member of the cadherin gene family, and has a predicted role in cell adhesion and developmental processes (Dunne et al., 1995; Mahoney et al., 1991). mFAT is expressed along the neural tube and in developing skeletal muscle in a temporal and spatial pattern similar to Pax3.
Itm2A

When the mouse CASTing clone a01a04 was first BLASTed against the GenBank nr database, a weak similarity (E = 1e-[illegible]) was detected to the first intron of the human Itm2A gene. The homologous mouse genomic sequence has since been deposited in GenBank and is identical to clone a01a04. Due to the significant conservation of this sequence between mouse and human, Pittois and coworkers suggested that this portion of Itm2A intron 1 contained gene regulatory elements (Pittois et al., 1999). Itm2A is a type II integral membrane protein with a predicted function in osteo-chondrogenic differentiation (Deleersnijder et al., 1996; Pittois et al., 1998).

p57Kip2

Mouse CASTing clone a01e04 contains exon 13 of the nucleosome assembly protein 2 gene, Nap2. More importantly, clone a01e04 is derived from genomic sequence that is approximately 50 kb upstream from p57Kip2 (Hu et al., 1996). p57Kip2 is listed in the Introduction as a putative target of Pax3 because its expression is altered in Splotch embryos (Kochilas et al., 1999). Perhaps the Pax3 binding site in clone a01e04 is important for the regulation of p57Kip2, not Nap2.

Celsr1

Clone a01e09 is derived from mouse genomic DNA in the middle of a very large gene called Celsr1, Cadherin EGF LAG seven-pass G-type receptor. Celsr1, also called MEGF2, is expressed in the developing neural tube and facial structures and has a putative role in cell-cell interactions or cell adhesion (Hadjantonakis et al., 1997).

OL-protocadherin

OL-protocadherin (mouse CASTing clone a03e01) is a member of the protocadherin gene family with a putative function in cell-cell interactions or cell adhesion. OL-protocadherin is abundantly expressed in the brain and to a lesser extent in heart and skeletal muscle (Hirano et al., 1999).

BVES

BVES, blood vessel/epicardial substance, was identified in the BLAST analysis of mouse CASTing clone cq05a07.
BVES is a highly conserved gene expressed in heart and skeletal muscle and is essential for proper formation of blood vessels in the heart (Reese and Bader, 1999; Reese et al., 1999). Aberrant regulation of BVES by Pax3 might explain the occurrence of persistent truncus arteriosus observed in Splotch mice. Induction of BVES by PAX3/FKHR might also result in a stimulation of vasculogenesis during ARMS formation.

Engrailed-2

Clone cq03g09 is derived from mouse genomic DNA in the second exon of the mouse engrailed-2 gene. Drosophila Paired protein regulates engrailed (Bertuccioli et al., 1996; Miskiewicz et al., 1996). I suggest that the regulation of engrailed by Paired has been conserved throughout the evolution of eukaryotes, perhaps through the putative binding site located in exon 2. Coincidentally, putative binding sites have been identified in the mouse engrailed-2 promoter for other members of the Pax gene family including Pax2, Pax5, and Pax8 (Song et al., 1996). These binding sites were not identified in the Pax3 CASTing libraries.

TGFα

The promoter of transforming growth factor alpha, TGFα, was recovered in the PAX3/FKHR CASTing library. TGFα is closely related to epidermal growth factor, EGF (Marquardt et al., 1984). Both TGFα and EGF bind and activate the EGF receptor. TGFα is abundantly expressed in developing embryos and many adult tissues and is believed to be a major regulator of normal growth and development (Kumar et al., 1995). TGFα is also activated in a wide range of tumors and is likely to have an important role in tumorigenesis. Due to its important roles in development and cancer, the TGFα promoter has been well characterized (Raja et al., 1991). Several transcription factors regulate TGFα expression, including p53, AP-2, and TEF1 (Shin et al., 1995; Wang and Kudlow, 1999; Wang et al., 1997). TGFα is a strong candidate for regulation by PAX3/FKHR during the formation of ARMS tumors.
VEGFR

Finally, vascular endothelial growth factor receptor 1, VEGFR-1 (also called flt-1), was identified in the BLAST analysis of the PAX3/FKHR CASTing clone ew02d06. VEGFR is a tyrosine-kinase receptor specific to vascular endothelial growth factor, VEGF (Neufeld et al., 1999). VEGF and its receptors have a significant role in both normal and tumor-associated blood vessel growth (Jakeman et al., 1993; Kim et al., 1993; Shweiki et al., 1993). VEGFR is also required for proper blood vessel development in the heart (Partanen et al., 1999). VEGFR is a candidate both for regulation by PAX3/FKHR during the formation of ARMS tumors, and for regulation by Pax3 in the heart during embryogenesis.

Analysis of Pax3 binding activity in CASTing clones

In Chapter 2, a pilot study was described in which eight randomly selected clones from CASTing library cq were bound by Pax3 and co-immunoprecipitated with Pax3 antibodies. That experiment demonstrated that clones in the cq library contained Pax3 binding sites. It was important to extend that study to additional clones in the cq library as well as clones in the ev and ew libraries. EMSAs or immunoprecipitations were performed on many of the clones described in the previous section.

An example of the EMSAs on these fragments is shown in Figure 13: EMSAs with 4 CASTing clones. PCR products were amplified with primers designed to sequences at the ends of the clones a01a04 (Itm2A), a01e04 (Nap2 or p57Kip2), a01e09 (Celsr1), and a02a08 (nAchr-β3). The PCR products were end-labelled with 32P and incubated with Pax3 protein or with a negative Pax3 protein control. Purified IgG from antibody PB33 was added to the Pax3-DNA complexes. The complexes were electrophoresed on a non-denaturing (40:1 acrylamide:bis-acrylamide) gel. Each of the four CASTing clones was bound by Pax3. To date, every CASTing clone (25-30 clones) that has been examined by EMSA or IP has displayed Pax3-dependent binding.
This observation is consistent with the pilot analyses of eight clones from the cq library and suggests that most if not all of the clones in the CASTing libraries contain Pax3 binding sites. If this is correct, and if the in vitro conditions used to generate the CASTing libraries are similar to conditions in vivo, then thousands of Pax3 binding sites are present in the mouse and human genomes.

In order to support the hypothesis that the binding sites in the CASTing clones are functionally significant, the human sequences homologous to mouse CASTing clones a01a04 (Itm2A) and Clone-2 (mFAT) were amplified from human genomic DNA. The human DNA fragments were end-labelled and analyzed by EMSAs. Pax3-dependent binding was observed for each of the human fragments. If CASTing clones a01a04 and Clone-2 had been selected on the basis of spurious interactions with random genomic fragments, Pax3 would not have been predicted to bind the homologous fragments from a distantly related species. Since Pax3 bound the homologous human fragments, it is likely that the Pax3 binding sites in these genes have been conserved since the divergence of mouse and man.

Localization of Pax3 binding sites in CASTing clones

Some of the CASTing clones were characterized further to identify the binding site within the larger fragment. This was done by two methods. The first approach was to scan the sequence of the CASTing clone for sequences that resembled Pax3 binding sites reported by others (Chalepakis and Gruss, 1995; Epstein et al., 1996; Epstein et al., 1998; Phelan and Loeken, 1998; Watanabe et al., 1998). This approach was successful in locating the Pax3 binding site in the mFAT CASTing clones.
Briefly, an oligonucleotide was designed to contain a sequence in the mFAT clones that resembled the Pax3 paired domain and homeodomain recognition sequences (Chalepakis and Gruss, 1995; Epstein et al., 1996; Epstein et al., 1998; Phelan and Loeken, 1998; Watanabe et al., 1998). The oligonucleotide was shown by EMSA to be bound by Pax3 (Figure 14: Pax3 binding site in mFAT). A second oligonucleotide was generated that was identical to the first oligonucleotide except for a few base changes in the predicted paired domain and homeodomain binding sites. The mutant oligonucleotide was not bound by Pax3.

For other CASTing clones, a putative Pax3 binding site was not apparent. A second approach was used to find the binding sites in these clones. Briefly, a set of overlapping PCR products was generated spanning the genomic region contained within the CASTing clone. The fragments were analyzed by EMSA to determine which fragment(s) included the Pax3 binding site. By comparing the results of overlapping PCR products, the region containing the Pax3 binding sequence was reduced until oligonucleotides could be generated spanning the region. Once oligonucleotides were identified that were bound by Pax3, mutations were created to demonstrate that the interaction with Pax3 was specific. This approach was used to identify the Pax3 binding sites in the Celsr1, TGFα, and Itm2A CASTing clones.

Pax3 consensus binding sequence

After identifying the binding site in several clones, the sites were compared to consensus Pax3 paired domain binding sequences (Figure 15: Pax3 consensus binding sequence). The binding sites identified in the CASTing clones mFAT, Celsr1, and TGFα resemble reported Pax3 binding sites (Chalepakis and Gruss, 1995; Epstein et al., 1996; Epstein et al., 1998; Phelan and Loeken, 1998; Watanabe et al., 1998). The reported consensus binding sites were obtained by CASTing using Pax3 paired domain fusion proteins and random oligonucleotides.
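To make the comparison of aligned binding sites concrete, the following is a minimal sketch of how a consensus sequence and a per-position information content (in bits) could be computed from a set of aligned sites. The sequences in `example_sites` are hypothetical placeholders and are not binding sites from this study; the function name `position_information` is likewise an assumption made for illustration. The calculation uses the simple form 2 - H for four bases and omits the small-sample correction that a full information-theory treatment would apply.

```python
import math
from collections import Counter

def position_information(sites):
    """Return (consensus, bits) for equal-length, pre-aligned DNA sites.

    bits[i] is 2 - H(i), where H(i) is the Shannon entropy (base 2) of the
    base frequencies at position i. No small-sample correction is applied.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    consensus = []
    bits = []
    for i in range(length):
        counts = Counter(s[i].upper() for s in sites)
        total = sum(counts.values())
        # Shannon entropy of the observed base distribution at this position
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        bits.append(2.0 - entropy)  # 2 bits is the maximum for four bases
        consensus.append(counts.most_common(1)[0][0])  # most frequent base
    return "".join(consensus), bits

# Hypothetical aligned sites, for illustration only (not data from this study)
example_sites = ["GTCACGCT", "GTCACGGT", "GTCATGCT", "GTTACGCA"]
consensus, bits = position_information(example_sites)
```

A position where all sites agree scores 2 bits, while a position with a 3:1 split scores about 1.19 bits; the consensus string alone would report the same base in both cases.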
One limitation in describing a consensus binding sequence is that the relative contribution of all four nucleotides at a single position is not included. A better description of a transcription factor binding site uses information theory to display a "LOGO" (Schneider, 1996; Schneider, 1997; Schneider and Stephens, 1990). A LOGO is a graphical representation of the amount of information contributed by each nucleotide at each position along a putative binding sequence. Currently, too few Pax3 binding sites have been identified in the CASTing clones to generate a LOGO. Future work will include the identification of binding sites in additional CASTing clones and a description of these sequences using LOGOs.

Luciferase reporter assays

As a final test of the binding sites, PCR products derived from the sequences of CASTing clones a01a04 (Itm2A), cq05a07 (BVES), ew04h04 (TGFα), and ew02d06 (VEGFR) were cloned into a luciferase reporter construct. The luciferase reporter constructs were co-transfected with PAX3/FKHR or negative control expression constructs and a β-gal vector (Figure 16: Transactivation of reporter constructs). Luciferase activity was normalized to β-gal activity. Higher levels of luciferase activity were detected in co-transfections including PAX3/FKHR than in negative controls for the VEGFR and BVES constructs, but not for the TGFα or Itm2A constructs. It is unclear why the VEGFR and BVES constructs were transactivated, but the TGFα and Itm2A constructs were not. Perhaps the cell type was not ideal for transactivation of these constructs; future work will include transfections into different cell lines.

Expression analyses

Model systems to study Pax3 transactivation

One criterion for establishing that a gene is regulated by a transcription factor is a correlation between the concentration of the transcription factor and the amount of target gene mRNA.
Often, however, this experiment is not performed, probably because a target gene's mRNA level is not entirely dependent on a single transcription factor. Adding a single transcription factor to a cell culture system, for example, may only result in a slight increase in target gene mRNA levels. Use of more complex tissue sources, such as whole embryos with and without the transcription factor, results in even less significant differences in the overall target gene expression levels.

An important piece of evidence that a gene identified by CASTing is a target of Pax3 would be a correlation between the gene's mRNA level and the Pax3 protein level. Three model systems were chosen: mouse embryos with normal or mutant Pax3 protein, stable human embryonal rhabdomyosarcoma cell lines expressing varying levels of PAX3/FKHR, and transient transfections of PAX3/FKHR into murine NIH/3T3 cells.

Splotch and Splotch-delayed (Sp and Spd) mouse embryos have been used by several groups to study the effect of Pax3 on development. Pax3 expression begins around embryonic day 8, peaks at 12 days p.c., and gradually decreases until very low expression levels are detected in 17 days p.c. embryos. Adult mice heterozygous for the Sp and Spd alleles were mated to animals of the same genotype. Females were removed from the crosses on the morning when vaginal plugs were observed. Sp/+ females were sacrificed and embryos were collected between 9 and 14 days of gestation. Spd/+ females were sacrificed and embryos were collected between 9 and 20 days of gestation. Embryos were staged using the criteria described by Theiler (Theiler, 1989). A total of 384 embryos were collected from 52 Sp/+ X Sp/+ crosses (7.4 embryos/cross). A total of 272 embryos were collected from 35 Spd/+ X Spd/+ crosses (7.8 embryos/cross). Embryos were removed from extra-embryonic tissues and immediately placed in tubes and frozen in liquid nitrogen. Yolk sacs were recovered for every embryo.
The yolk sac is derived from embryonic tissue and therefore represents the embryo's genotype, not the maternal genotype. Yolk sac DNA was isolated and analyzed using Bi-PASA (Bi-directional PCR Amplification of Specific Alleles) to ascertain the genotype of the embryo (Liu et al., 1997). Each PCR contained four primers: two outer primers and two inner allele-specific primers facing opposite directions (Figure 17: Strategy used to genotype Splotch embryos). The two outer primers amplify a long product in all samples. The inner allele-specific primers, with the respective outer primers, produce allele-specific products of different sizes (Figure 17: Strategy used to genotype Splotch embryos). PCR products were electrophoresed on 4% agarose gels and visualized by ethidium bromide staining. The same strategy was used to genotype Splotch-delayed embryos (Figure 18: Splotch-delayed embryo genotypes). Bi-PASA allows the genotype of a sample to be determined from a single reaction, saving time and reagents, while producing more accurate results.

Following Mendelian segregation principles, the expected ratio of +/+, Sp/+, and Sp/Sp genotypes is 1:2:1. A chi-square analysis of the observed and expected genotypes was performed (χ² = 0.48, 2 degrees of freedom) and suggested that no drop-out of homozygous embryos had occurred. Furthermore, since the observed ratio of genotypes was not significantly different from the expected 1:2:1, it is likely that the genotyping data were accurate. Similar results were observed for Spd/+ X Spd/+ crosses.

The second model system for studying Pax3 transactivation was overexpression of PAX3/FKHR in the NIH/3T3 cell line. PAX3/FKHR was transiently transfected into NIH/3T3 cells. Several plates of duplicate transfections were prepared. Total RNA was purified using the TRIzol reagent (Life Technologies). One plate of transfected cells was resuspended in 2X protein loading buffer (Novex). Resuspended cells were disrupted by passage through a QIAshredder (Qiagen).
Protein concentrations were determined with BCA reagents (Pierce). Approximately 5 µg of protein was electrophoresed on a 4-20% Tris-Glycine-SDS gel and western blotted as previously described using antibody PB35. Western blot analysis showed that PAX3/FKHR was expressed in the transiently transfected cells (Figure 19: Western blot of transient transfections). Transfections and analyses performed in parallel with an empty vector produced no PAX3/FKHR protein.

The third model system used to show the regulation of targets of Pax3 consisted of embryonal rhabdomyosarcoma cell lines expressing varying levels of PAX3/FKHR. The cell lines were prepared and generously provided by Oana Tomescu and Dr. Fred Barr (Pennsylvania). The PAX3/FKHR mRNA levels in the cell lines were determined by RNase protection assays (Figure 20 and personal communication from Dr. Fred Barr). Before using these cell lines, protein was isolated as described for the NIH/3T3 transient transfections and western blot analyses were performed to determine the relative amounts of PAX3/FKHR protein present in the cell lines. Approximately 5 µg of protein was loaded from each clone. Western blot analysis was performed with antibody PB35. A band of approximately 98 kDa was observed in cell lines reported to express PAX3/FKHR (Figure 20: Western blot of stable cell lines). A faint band of slightly greater molecular weight was also observed in lanes not reported to express PAX3/FKHR, including the clone transfected with an empty vector. The embryonal rhabdomyosarcoma line used to make the stable cell lines did not originally express PAX3/FKHR, so it is unlikely that the faint band is PAX3/FKHR. The amount of PAX3/FKHR protein in the six cell lines is consistent with the reported mRNA levels, with the possible exception of cell line 4, which appears to have more PAX3/FKHR protein than cell line 27.
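The chi-square comparison of embryo genotypes against the Mendelian 1:2:1 expectation, used earlier in this section, can be sketched as follows. This is an illustrative sketch only: the genotype counts in `hypothetical_counts` are invented placeholders, not the counts from the Sp/+ X Sp/+ crosses, and the function names are assumptions. The p-value helper relies on the fact that, for exactly 2 degrees of freedom, the chi-square survival function reduces to exp(-x/2).

```python
import math

def chi_square_mendelian(observed, ratio=(1, 2, 1)):
    """Chi-square statistic for observed genotype counts vs. an expected ratio."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_df2(statistic):
    """Upper-tail p-value for a chi-square statistic with 2 degrees of freedom."""
    # Valid only for df = 2, where the survival function is exp(-x/2)
    return math.exp(-statistic / 2.0)

# Hypothetical counts of +/+, Sp/+, and Sp/Sp embryos (not data from this study)
hypothetical_counts = (24, 52, 24)
statistic = chi_square_mendelian(hypothetical_counts)
p_hypothetical = p_value_df2(statistic)

# p-value corresponding to the statistic reported in the text (0.48, 2 df)
p_reported = p_value_df2(0.48)
```

Under these assumptions, the reported statistic of 0.48 with 2 degrees of freedom corresponds to a p-value of roughly 0.79, which is in line with the statement that the observed genotype ratio did not differ significantly from 1:2:1.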
Northern blot hybridization analyses

Northern blot hybridizations were performed using the model systems described above and many of the genes identified in the CASTing libraries. Total RNA was combined with 2X denaturing buffer (New England Biolabs) and heat denatured at 65°C for 5 minutes. Approximately 5 µg of each sample was loaded and electrophoresed on a non-denaturing 1X TBE gel. The RNA was blotted to Hybond membranes (Amersham Pharmacia Biotech) and cross-linked by UV irradiation. Probes were prepared by generating an RT-PCR product of 200-500 bp. The cDNA used to generate the mouse probes was prepared from RNA isolated from 12.5 and 15.5 day p.c. whole C57Bl/6J embryos. The cDNA used to generate the human probes was prepared from RNA isolated from human skeletal muscle and ARMS cell lines. Probes were labelled using the RediPrime II random primer labelling system (Amersham Pharmacia Biotech). Hybridizations with β-actin were used to verify equal loading between lanes.

The majority of the probes generated from genes identified in the CASTing libraries produced signals that were indistinguishable between RNA samples expressing varying amounts of Pax3 or PAX3/FKHR. This negative result may be due to the absence of regulation of the target gene by Pax3 or PAX3/FKHR in the model systems examined. Alternatively, these genes may not be targets of Pax3 at all.

Several genes were identified with expression levels that correlate with the amount of Pax3 or PAX3/FKHR protein present in the model system. The most exciting of these genes include Itm2A, BVES, and VEGFR. Itm2A is induced following transfection of PAX3/FKHR into NIH/3T3 cells (Figure 21: Itm2A is induced by PAX3/FKHR). Itm2A is also expressed at significantly higher levels in +/+ mouse embryos than in Sp/Sp or Spd/Spd mutant embryos (Figure 22: Itm2A expression in mouse embryos).
BVES is more highly expressed in a wild-type 15.5 day embryo than in a Spd/Spd littermate (Figure 23: BVES expression correlates with Pax3). Finally, VEGFR is more abundantly expressed in an embryonal rhabdomyosarcoma cell line expressing PAX3/FKHR than in one without PAX3/FKHR (Figure 24: VEGFR expression correlates with PAX3/FKHR). These results demonstrate that the expression levels of some of the genes identified in the CASTing libraries correlate with the amount of Pax3 or PAX3/FKHR protein present in these model systems.

In conclusion, I have shown that the Pax3 CASTing libraries contain genes with expression patterns and putative functions consistent with expectations of targets of Pax3. Furthermore, the CASTing clones contain evolutionarily conserved Pax3 binding sites that, in some cases, can be transactivated by PAX3/FKHR when placed upstream of a luciferase reporter. Finally, northern blot analyses demonstrated that the expression levels of some of the genes identified in the CASTing libraries are influenced by Pax3 or PAX3/FKHR. These findings suggest that the CASTing libraries contain enhancer elements that are bound by Pax3 and PAX3/FKHR and used to regulate the expression of downstream genes.

Summary

• A novel alternative transcript of Pax3, Pax3d, was identified in mice and humans. Pax3d is generated by the alternative splicing of intron 8.
• A BAC contig was assembled across PAX3. The sequence of PAX3 was extended (GenBank accession AF156931). Two novel exons and the complete 3' UTR of PAX3 were described.
• The conservation of the Pax3d alternative isoform and the sequence of Pax3 exon 9 were evaluated in quail, rat, cow, dog, rooster, and rhesus monkey. Pax3d has been conserved since the divergence of birds and mammals. Pax3c has been conserved since the divergence of mice and humans, but is not conserved in birds.
• 39 Waardenburg syndrome probands were screened for mutations in PAX3 exons 8, 9, and 10. One mutation was identified in PAX3 exon 8.
Several transcribed polymorphisms in PAX3 were characterized.
• Expression constructs were generated for Pax3c, Pax3d, PAX3c, and PAX3d. Proteins of the expected size were generated from the expression constructs.
• The DNA-binding activities of Pax3c and Pax3d were compared; no differences were observed.
• The transactivation properties of Pax3c and Pax3d were compared; no differences were observed.
• Rabbit polyclonal antibodies were generated against three peptides derived from Pax3. The antibodies recognized Pax3 proteins synthesized in vitro and in vivo.
• Buffer conditions were optimized to promote the binding of Pax3 to specific DNA sequences. Co-immunoprecipitations were optimized to recover DNA fragments bound by Pax3.
• A CASTing strategy was used to generate libraries of genomic DNA fragments bound by mouse Pax3, human PAX3, and human PAX3/FKHR.
• Inserts from more than 3000 clones in the CASTing libraries were sequenced. Repetitive elements in the sequences were masked, and the remaining sequences were BLASTed against the GenBank nr and dbEST databases. Hundreds of genes were identified.
• Pilot analyses were performed on random clones in the CASTing libraries. All clones examined contained Pax3 binding sites.
• Several clones were identified in multiple independent CASTing libraries, suggesting that these DNA fragments had been selected in separate experiments with different Pax3 proteins.
• Three types of mFAT clones were identified in the mouse CASTing library; each contained a central fragment with a Pax3 binding site. This fragment was bound by Pax3 at least three times during each round of the CASTing process, suggesting that enrichment for Pax3 binding sites had occurred.
• The long list of ESTs, genomic clones, and full-length genes identified by BLAST analyses of the CASTing libraries was condensed into a shorter list of putative targets.
This list contains the genes most likely to be regulated by Pax3.
• All CASTing clones examined contain Pax3 binding sites.
• The Pax3 binding sites identified in several CASTing clones were compared to Pax3 binding site consensus sequences. The binding sites identified in the CASTing clones were similar to the binding sites reported by others.
• Luciferase constructs containing the inserts from two CASTing clones were transactivated by PAX3/FKHR.
• Mouse embryos from Sp/+ X Sp/+ and Spd/+ X Spd/+ crosses were collected and genotyped using Bi-PASA.
• PAX3/FKHR was transiently expressed in NIH/3T3 cells.
• Northern blot analyses identified at least three genes with expression patterns correlated to levels of Pax3 or PAX3/FKHR.

Discussion

This work has produced a few important insights into the function of Pax3, and has raised many more questions. Several novel isoforms of Pax3 were identified and evidence was presented to suggest that the most biologically relevant form of Pax3 is the isoform designated Pax3dQ+. This isoform appears to be the most abundantly expressed and has been conserved since the divergence of birds and mammals. Other isoforms of Pax3 are much less abundantly expressed and cannot be found in divergent species. It is unclear if the less abundant and less well conserved transcripts (i.e. PAX3A, PAX3B, Pax3e, and Pax3f) have a role in the function of Pax3 in vivo or are merely artifacts of the highly sensitive methods, such as RT-PCR, that were used to identify the alternative transcripts.

Although no evidence was presented to distinguish the function of Pax3c and Pax3d, identification of many putative targets of Pax3 will allow the binding affinities and transactivation potentials of Pax3c and Pax3d to be compared with more targets than were previously available. Future studies using targets identified by CASTing could include EMSAs and co-transfection studies similar to those described in Chapter 1 for the e5 oligonucleotide and cMET reporter construct, respectively.
It is possible that examination of the binding and transactivation of these targets by Pax3c and Pax3d will identify enhancers bound with different affinities or transactivated to different degrees by the alternate isoforms.

The in vivo binding properties of Pax3c and Pax3d could also be compared by performing chromatin immunoprecipitation (ChIP) assays with antibodies specific to each of the Pax3 isoforms in mouse embryos or cell culture systems expressing endogenous Pax3 protein. It is possible that Pax3c and Pax3d bind different enhancer elements in vivo. If this is true, alternate enhancers would be immunoprecipitated depending on the use of antibodies specific to either the Pax3c or Pax3d isoforms.

Although these questions are worth studying, I suggest the Pax3c and Pax3d isoforms are functionally equivalent. I hypothesize that the most ancient form of Pax3 is Pax3d, but that over time a lack of selective pressure for complete and proper splicing of intron 8 has led to the emergence of the alternate transcript Pax3c. Alternative splicing is thought to occur when the intron donor, acceptor, or branchpoint sequence motifs are not ideal. If there are no functional differences between the Pax3c and Pax3d isoforms, mutations occurring in the intron 8 splice site motifs do not alter the function of Pax3. These mutations would therefore not have been detrimental to the fitness of the organism and thus were not selected against in subsequent generations.

The central focus of my research project was to identify genes regulated by Pax3 using a whole-genome PCR-based strategy called Cyclic Amplification and Selection of Targets (Kinzler and Vogelstein, 1989; Wright et al., 1991). I believe this strategy is the best approach to identify genes directly regulated by a transcription factor, but it is not without limitations.
The shortcomings of CASTing can be divided into general limitations of the strategy and limitations specific to the adaptations used to identify targets of Pax3. One limitation of CASTing is the artificial state of the naked DNA fragments bound under the in vitro conditions. Very different conditions exist in vivo, where much of the genome is not accessible to binding. This difference may result in the recovery of DNA fragments bound by Pax3 in vitro that are not accessible to Pax3 in vivo and thus are not bound by Pax3 in vivo.

Another limitation of CASTing is the lack of interaction with other proteins at or near the enhancer element that promote binding by the transcription factor. I believe that transcription factors are likely to interact with DNA as components of large multi-unit complexes. Absence of other components of the complexes in the in vitro binding reactions might result in a selection against enhancer elements that interact weakly with the transcription factor. Post-translational modifications of the transcription factor that occur in vivo but not in vitro may also influence the DNA-binding properties.

In addition to these limitations, the antibodies used to identify targets of Pax3 may have reduced the DNA-binding affinity of Pax3 for some targets. This was observed in EMSAs as a reduced signal intensity of DNA/Pax3/antibody complexes as compared to the intensity of the DNA/Pax3 complexes. This may have resulted in a selection against some enhancer elements. It is also possible that the conditions used to bind Pax3 to DNA were not stringent enough to eliminate weak interactions that do not occur in vivo, thereby increasing the number of false positive clones.

Several interesting genes were identified by CASTing but not evaluated in the present study.
The first such class includes fragments identified in multiple CASTing libraries (Table 13: Sequences found in multiple CASTing libraries). When BLASTed against GenBank, these sequences were not similar to any known genes, which complicated the synthesis of probes suitable for expression studies. Future work could involve a more detailed characterization of these sites, including expression analyses of genes located near these putative enhancers. Complete assembly and annotation of the raw human and mouse genomic DNA sequence from the Human Genome Project will simplify the identification of genes near these fragments. Another interesting class of genes not evaluated in the present study includes fragments overlapping anonymous EST clones. The EST clones were usually too short to generate probes. Also, since no putative function or expression pattern was known, these genes were not as high a priority in this study as other known genes.

One question that arose from this research project concerns the large number of putative binding sites identified. Does Pax3 really have thousands of binding sites in the mouse and human genomes? If these sites are bound by Pax3 in vivo, are they all used as enhancers for regulating downstream gene expression? Perhaps these sites are not used for gene regulation, but rather as parking places for Pax3 protein. The parking places would be occupied by Pax3 in cell types or at times when Pax3 protein was not required. Having "parked" transcription factors in the nucleus that could be quickly made available would be advantageous over the time-consuming steps of mRNA synthesis and translation.

Several areas of study remain to be investigated as a result of the observations reported here. Since the sequence analyses of the CASTing libraries have identified little redundancy, sequencing more clones would likely identify additional putative targets of Pax3.
Perhaps some of these putative targets could be eliminated by additional sequencing of clones from the negative control CASTing library. Any clone contained in the negative control CASTing library is likely to have been selected during the CASTing strategy either through an interaction with a protein in the rabbit reticulocyte lysate or by directly interacting with the Pax3 antibodies or protein G-sepharose. These interactions may have contributed to the large number of clones observed in the Pax3, PAX3, and PAX3/FKHR CASTing libraries.

The Human Genome Project is still generating DNA sequence from large portions of the human genome, and the Mouse Genome Project will soon add large amounts of mouse DNA sequence. Since the majority of the CASTing clones were BLASTed against GenBank at a time when the Genome Project was incomplete, one could BLAST the sequences from the CASTing clones against GenBank again and identify genes that have been submitted since. This is likely to produce valuable information for several years, until the complete annotations of the human and mouse genomes are available.

The CASTing libraries described in this project are likely to contain many targets of Pax3. They may also contain DNA fragments that are not bound by Pax3 in vivo. Additional experiments need to be performed to distinguish between these possibilities for large numbers of genes. The approaches I used were to find the binding site, demonstrate transactivation from the binding site, and investigate regulation of the gene by Pax3 in a model system by northern blot analyses. These approaches are time-consuming and not easily adaptable to high-throughput analyses. Perhaps a better strategy would be to show binding of Pax3 in vivo to the sequence contained in the CASTing clone using a chromatin immunoprecipitation (ChIP) strategy.
This approach has been used successfully to study binding sites of transcription factors in Drosophila and a few mammalian cell culture systems, but has not yet been adapted to study a tissue source as complex as a mouse embryo (Boyd et al., 1998; Crane-Robinson and Wolffe, 1998; Kuo and Allis, 1999; Murphy et al., 1999; Orlando et al., 1997). Perhaps coupling traditional chromatin immunoprecipitation methods with real-time relative quantitative PCR would be sufficiently sensitive to produce quantitative estimates of Pax3 binding to putative enhancer elements. If I were to spend another year or two on this project, I would pursue a ChIP analysis to test the binding of Pax3 to many of the putative enhancers identified by CASTing.

An additional area of research would be in situ hybridization analyses on mouse embryos. The in situ hybridization analyses would be used to test for co-expression of Pax3 and putative targets of Pax3 in mouse embryos. Quantitative in situ hybridization analyses of putative targets of Pax3 in normal and Pax3 mutant mice could also be performed to provide a quantitative comparison of target gene expression levels in different tissues of the embryo.

The DNA-binding domains of the nine Pax genes are highly conserved and bind related sequences in vitro. It is possible that some of the DNA fragments enriched in the Pax3 CASTing libraries are actually targets of other Pax proteins, such as Pax7 or Pax2. The conditions used to bind Pax3 protein to the DNA fragments in the CASTing strategy were artificial and may have permitted weaker interactions that do not occur in vivo. This has important implications for this research project. If targets of other Pax proteins were selected in the CASTing libraries, then this resource is of value to researchers studying other Pax proteins. It is also possible that Pax proteins bind overlapping sets of enhancer elements.
As the Pax proteins have duplicated and diverged throughout evolution, some binding sites may have co-evolved with specific Pax proteins while others contain sequences that can be recognized by multiple Pax proteins. It would be interesting to test the binding of other Pax proteins to the fragments identified in the Pax3 CASTing libraries.

What strategy would I use if I were to begin this research project today? The CASTing strategy was developed over a decade ago and is a valid approach to identifying targets of transcription factors. I believe that the availability of the complete sequences of the mouse and human genomes will make this strategy even more successful. I would not change any of the reagents, optimizations, or approaches used to create the Pax3 CASTing libraries. However, the analyses of putative targets would be done primarily by two different means. First, expression studies would be performed by microarray hybridization, followed by northern blot hybridization or RNase protection assays for genes that give positive results in the more efficient, high-throughput microarray screen. Second, final confirmation of target genes would be performed by chromatin immunoprecipitation with both cell culture systems and mouse models. Demonstration of Pax3 or PAX3/FKHR binding to the enhancer elements in vivo by ChIP analysis, coupled with a correlation of target gene expression with Pax3 or PAX3/FKHR protein levels, would provide the best evidence for having identified targets of Pax3.

APPENDICES

APPENDIX A

Tables

Table 1: Oligonucleotides

Oligo  Sequence  Notes
1  tgaccaggatccCAGGGCTCCAAGTGGACAGTTC  downstream transl. stop
2  tgaccaggatccCAAGAGAAATGAGAGCGAGACC  upstream transl. start
3  tgaccaggatccGAACTTCTCCGCCCTCAGCAAC  upstream transl. start
4  tgaccaggatccCCGCACTCGCCTTTCCGTTTCG  PAX3 cDNA cloning 5'
5  GGCAAGATTAAGCCACACATGC  PAX3 poly(T) marker
6  AATTACACAAGGAAGCCCCTGC  PAX3 poly(T) marker
7  CTCCATCTCCCCGCTGCATGGC  BAC screening primer
8  CCACGGGGTCCACGCTGTAGCC  BAC screening primer
9  TGGCAATCAGGTTTCACGTCTC  PAX3 poly(T) marker
10  GCGTGTGTTTCCTTACAGGTGC  walking primer
11  ACTGGCCCTGTTTCTGGTCTTC  walking primer
12  ACTCGGCCACCTCCATCTCAGC  BAC screening (1343-)
13  TGTAGGTGGGTGGGCAGTAGGC  BAC screening (-1423)
14  CAGCGGGCCGACTCCATCAAGC  BAC screening (1375-)
15  CGGCCACGGGGTCCACGCTGTA  BAC screening (-1456)
16  ATTACGCGCTCTCCCCTCTCAC  WS mutation screening
17  TCCTCTTCTCCACTGCTTTTGTCG  WS mutation screening
18  AACGCCGCCGAGCTGGTTGACG  Ident. of TXN start
19  TGTGGCGGATGTGGTTGGGCAG  Ident. of TXN start
20  AGAGTTGAGTTTATCTCCCTTCCA  PAX3A specific probe
21  TAATGAAAGGCACTTTGTCCATAC  PAX3B specific probe
22  GTAACCATGTGAAACCATTGCC  Ident. of TXN stop
23  TCCACTGCTTTTGTCGAACGTG  Ident. of TXN stop
24  GTCTCCTATTGGGACCACTGCC  Ident. of TXN stop
25  CGAAGACCAGAAACAGGGCCAG  Ident. of TXN stop
26  TTAAGGCAATGGTTTCACATGG  BAC sequencing
27  CAGGCTGACTTCTCCATCTCCC  BAC screening (1303-)
28  GCCGTACTGGCCGTACTGATAG  BAC screening (-1479)
29  GTGATGAGCATCTTGGGCAACC  BAC screening (1258-)
30  AGTATCAGCATCGAACATCGAC  WS screening CA repeat
31  GAACAGTCTGCTTGCCCAAACC  WS screening CA repeat
32  CGATGTTCGATGCTGATACTTC  BAC sequencing CA repeat
33  tgaccaggatccCCGCACTGTGCTCGCTTTTTCG  Pax3 cDNA cloning 5'
34  tgaccaggatccCTAGAACGTCCAAGGCTTGGATTGTCCATACTGC  Pax3A cDNA cloning 3'
35  tgaccaggatccCACTTATGCAATATCTGGCTTGAG  Pax3B cDNA cloning 3'
36  tgaccaggatccCTAAAAAGTCCAAGGCTTGGATTGTCCATACTGC  PAX3A cDNA cloning 3'
37  tgaccaggatccCACTTAGGCGATATCTGGCTTGAG  PAX3B cDNA cloning 3'
38  ATGGGCTACCAGGAAGAAGGAC  BAC sequencing
39  TTCTGTTCTTGCCCTGCCTTTC
40  ACATTGGTGGTGGTTGAGGCTG
41  CCACACAGGAAAGGGAAACAAG
42  TAGGGAGACCCAAGCTGGCTAGCG  PCR and sequencing
43  AACGGGCCCTCTAGACTCGAGCGG  PCR and sequencing
44  GCCAGGGTTTTCCCAGTCACGAC  PCR and sequencing
45  GCTATGACCATGATTACGCCAAGC  PCR and sequencing
48  CCAAGCGCCAGGAGGGGGAGACT  BAC and cloning
50  GTGCAAGATGGAGGAAACAAGC  sequencing
51  CACCTCAGGTAATGGGACTTCT  sequencing
52  TGCCCACATCTCAGCCCTATTG  sequencing
53  TCGCAGCAGGGGTGAAGGGAGC  BAC isolation
54  CTTCCAAAGGGAATCCCGTGCG  BAC isolation
55  CGGGATCCCGGGGTGACACTCGCCTCCC  5' primer for constructs
56  CACCCAATCTTCGTGTCTGTCTGCCTCGCGTGC  Cloning
57  ACTCCCTCCGCCCCTTCCCACAC  BAC and cloning
58  CGGCGGCAGCAAGCCCAAGCAG  isoform identification
59  CGGCGGCAGCAAGCCCAAGGTG  isoform identification
60  tgaccaggatccCTCAGGCAGTCTGAAGTTAGCAC  3' for 87 bp product
61  tgaccaggatccCGTGTCTGTCTGCCTCGCGTGC  3' for 297 bp product
62  tgaccaggatccAGATCCCAGTAGCACCGTCCAC  mouse RPA probe
63  tgaccaggatccCATAGTCGGTCTGAGGCTGGTG  mouse RPA probe
64  CATCCGGCCCTGCGTCATCTCG  clone sequencing
65  CGGGAAAGGTGAAGAGGAGGAG  clone sequencing
66  ACAGCGCAGAAGCCGAACCACC  clone sequencing
67  AGCCCACATCTATTCCACAAGC  clone sequencing
68  GGTACCTCATCAGCCCCAGACT  clone sequencing
69  ACCGAGCTCTTACGCGTGCTAG  pGL3 basic and promoter
70  CCAACAGTACCGGAATGCCAAG  pGL3 basic
71  GGGCGGGACTATGGTTGCTGAC  pGL3 promoter
72  GGAAGTGTCCACCCCTCTTGGC  clone sequencing
73  CGGCGGCAGCAAACCCAAGCAG  clone sequencing
74  TTCCTTCGAACGCAGACAGCAG  clone sequencing
75  AGGTTGCAGTGGGCTGAGATTG  marker, interior, 5'
76  GGAAAGAAGGCAGTCACAGAGG  marker, interior, 3'
77  GAGGCAGGAGAATCACTTGAAC  marker, exterior, 5'
78  CCAAATGTGGTGTGTCCTAAAC  marker, exterior, 3'
79  ATCTTGCTCAGGCTGGTAAAGC  marker, exterior, 5'
80  CCTGCACTTAAATTCAAATGGTTC  marker, exterior, 3'
81  TTTCTTAACATCATACCACATTTC  marker, interior, 5'
82  TTCAAGCTAGGCGGTAGTTC  marker, interior, 3'
83  AGACTGGGTGACAGGGTGAGAC  marker, exterior, 5'
84  GTGGAGGAGATGGCCATGGAG  marker, exterior, 3'
85  ACATATTATGTTTTGCATATTTTACC  marker, interior, 5'
86  GAATGGCTGGTTGTAAGGGC  marker, interior, 3'
87  GGCAGTGGTCCCAATAGGAGAC  WS sequencing
88  GGACTGTCAAGTTATTCTTTCAGC  exon 8 amplification
89  GTCCAAGGCTTACTTTGTCCATAC  exon 8 amplification
90  tgaccaggatccTCAGATCCCAGTAGCACCGTCCAC  Goulding nt. 1261-
91  tgaccaggatccCTCTGACTGCAGCTGGCTGACACC  Goulding nt. -1569
92  tgaccaggatccAAAGTAAGCCTTGGACG  Goulding nt. 1715-
93  tgaccaggatccGGCGCTGACCTCATTAAGAACATG  Goulding nt. -2002
94  TGACGTAGCGAGTTAGTGAGAC  sequence poly(A) signal
95  TCCTACTGCCCTCCGACCTACA  quail Exon 8-
96  CGAGACCGGAAAATAACACCAGC  quail -Exon 9
97  TAGCCAGGAGTTCAGCGGTCG  WGPCR pcr primer
98  GATCCGACCGCTGAACTCCTGGCTA  WGPCR linker-adaptor
99  tgaccaggatcCGTCGGAAGTGGCAGTTATTCGG
100  tgaccaggatccGTCAAACTCACTGTCAGATCAAGG
101  tgaccaggatccGAAGATGGACTGGCAAAGAGAAGG
102  TCTATTAATACTACTGGAACTAAA  Pax3 binding site-sense
103  TTTAGTTCCAGTAGTATTAATAGA  Pax3 binding site-anti
104  GGAGACTCGGTCCCGCTTATCTCCGGCTGTGC  Pax3 binding site-sense
105  GCACAGCCGGAGATAAGCGGGACCGAGTCTCC  Pax3 binding site-anti
106  CTCAGCACCGCACGATTAGCACCGTTCCGCTTC  EMSA
107  GAAGCGGAACGGTGCTAATCGTGCGGTGCTGAG  EMSA
108  AGGCAAGCTCTACTGTCATTCC  upstream Pax3 binding
109  ATTCTAGCATTTCCAGCATTTC  downstream Pax3 binding
110  TTGCCACAGACCTCACTACTCC  E
111  AGATGGACTGGCAAAGAGAAGG
112  TGCCACTAGGTAGAACACAAGG  evol. sequencing
113  TCAATCGCTCTCCTTTGTCTCC  evol. sequencing
114  GTCAATTAGAGGCATGATTAAG  evol. sequencing
115  CAGCTTGCTTCCTCCATCTTGC  -exon 6
116  TGAAATGTGATAGGTACGTTCAGG  -exon 6
117  CTTTTAGCTGAGGGCACTGAGGCAGAGCGGCCCCTAGG  EMSA
118  CCTAGGGGCCGCTCTGCCTCAGTGCCCTCAGCTAAAAG  EMSA
119  GGAGAAGAGGAGGAGGCGGATC  Goulding 790-
120  GGCATGGCGGTGGGAGGGAATC  Goulding -1196
121  TCAGCTTGGTGGGGTCTTCATC  mouse northern probe
122  GGTGGTGGTGGGGTAGGTAGAG  mouse northern probe
123  GGTCCCTGAAGCCCCTCTACTG  sequencing exon 6
124  TTGCTCAAGGACGCTGTCTGTG
125  CAGGATGCGGCTGATAGAACTC
126  GCTTGCTTCCTCCATCTTGCACA  R271C allele-specific
127  AGAGGGACAGGGACACCGTGAG  300 upstream TXN start
128  CCAAGTCCACAGGCTCCAGAGG  250 bp after TXN start
129  TAAGGGAGGAGTGTTCGCTGGC  1000 upstream TXN start
130  ATCGGGTGAGGGAGGGTGGTG  75 bp after TXN start
131  AAACTGTGATCCAGGGCTGTCC  250 3' of fusion point
132  TTAAGACCGAGAGAAGGCAGAAGC  Sp genotyping up control
133  TCGCTCACTCAGGATGCCATCG  Sp genotyping control
134  GTGTGCGCTCCTCTTTTCTCCAG  Sp genotyping wt-spec.
135  GCGGCTGATAGAACTCACACACG  Sp genotyping Sp-spec.
136 .ATTTCCCATTCGCCATTCAGG pGL3 basic vector 4524— 137 CCAGGAACCAGGGCGTATCTC pGL3 basic vector —180 l 3 8 tga C ca gga t c CATGGCGGCCGCGGGAATTCG 139 tgaccaggatccGCAGGCGGCCGCGAATTCACTAG l 4 0 C GAAGTGCCC C CAGGATGACC 141 CCCAACCACATCCGCCACAAG 142 CAAGCCCAAGCAGGTGACAAC l 4 3 GTGCAAGATGGAGGAAGCAAGC 1 4 4 GACAGCTTTGTGCCTCCGTCG 1 4 5 CCTACCACCACGGTGTCGGC l 4 6 TGAGTTCTATCAGCCGCATCC 1 4 7 AC TCTGAACCTGATTTACCGC l 4 8 AGCCGCTTCCTCCGAGCACTG 149 .AGCCAGGAGTTCAGCGGTCG.(n)25.co 65mer random mp. 150 .AGCCAGGAGTTCAGCGGTCG.(n)50.co 90mer random mp. l 5 1 GTCTATTAATACTACTGGAACTAAAG TaChibana Pax3 EMSA 152 CTTTAGTTCCAGTAGTATTAATAGAC Tachibana Pax3 EMSA 153 TGAAGGTCGGTGTGAACGGATTTGGC G3PDH upstream 154 CATGTAGGCCATGAGGTCCACCAC G3PDH downstream 155 GTGGGCCGCTCTAGGCACCAA B—actin upstream 156' CTCTTTGATGTCACGCACGATTTC B—actin downstream 157 TGCATCTTGGCTTTGCAGCTCTTCCTCAT mouse IFN—G upstream 119 .11: u n.. m Table 1 (cont'd). GGC 01 igo Sequence Notes GGC 158 TGGACCTGTGGGTTGTTGACCTCAAACTT mouse IFN—G downstream 159 ATGAAGGTCTCCACCACTGCC mouse MIP upstream 160 TCAGGCAATCAGTTCCAGGTCAGTGATGT ATTC mouse MIP downstream 161 AGCTGTTGCTCTGTGCAGACCACGAGA HESXl 1F 162 ACAAAGAATTGAAACAATTAAGCTGTGGC A HESXl 1R 163 TGGAACATAAGATTGACCATCTAAGACA HESXl 2F 164 AGCCTTTATATTATCATTATTGGGTGAA HESXl 2R 165 AGCTCATTTTTGGAGACATACTTGAATA HESXl 3F 166 TAACATTTCAACATCATGAATAACAACT HESXl 3R 167 GAATAATAAAATAATGTTTCTGAGACCTA T HESXl 4F 168 TCATGCTCTGCAATTAGAAGATAATTTCA C HESXl 4R 169 (GCTTCACATTAGCCCCGCTTGCGGACG PAX3'S PAX3 binding site ZEN) CGTCCGCAAGCGGGGCTAATGTGAAGC complement to TB169 171. 
GGAGGGAGAAGCGAGTGTGGTC PAX7 5' UTR 172 GAAGGTGGCGACGACGAGGAAG PAX7 5' UTR 173 CGTGGAGATTTAAAGTCCCCGCTTCTCAG AA Pax3's Pax3 binding site 174 TTCTGAGAAGCGGGGACTTTAAATCTCCA CG complement to TB173 175 GCTTCACACCAGCCCGACATGCGGACG Mutant TB169 176 CGTCCGCATGTCGGGCTGGTGTGAAGC complement of TB 175 177 tgaccaacgcthATTTAGAACCGCAGCT TGCC PAX3 Luc reporter upper 178 tgaccactcgagCTCGCTCAGCCTCTATT CCTC PAX3 Luc reporter lower 179 tgaccactcgagTATCCAGGTGAAGGCGA AACG PAX3 Luc reporter lower 180 tgaccaacgcgtCAACACTCCTGGCGTCA TATCC Pax3 Luc reporter upper 181 L“ tgaccactcgagTGGATTTCGCTTCGGGA Pax3 Luc reporter lower 120 Table 1 (cont'd). Oligo Sequence thes TTAC 182 tgaccactcgagGGGTGGTGACGAGGCAG GAAC Pax3 Luc reporter lower 183 GGCCGAGTCAACCAGCTCCGAGGAGTATT TATCAACG Spd mutagenesis 184 CGTTGATAAATACTCCTCGGAGCTGGTTG ACTCGGCC Spd mutagenesis 185 TTCTCCACGTCAGGCGTTGTCACC Goulding 644—621 186 CAGGGCCGAGTCAACCAGCTCC Goulding400—421 Spd. 187 tgaccaggatccATATTTATAAGGCAGCC AATGTGG Tsukamoto PAX3A/B cloning 188 tgaccaggatccTCCTAGTGTCCTCTGGT CTTTCAG Tsukamoto PAX3A cloning 189 tgaccaggatccCAGGGCCGAGTCAACCA GCTCC TF184 w/tail and Spd mut. 190 tgaccaggatccCAGGGCCGAGTCAACCA GCTCG TF184 with tail 191 AGAAGTTGCTTTGTTGAAACGTGG WGPCR clone 2 forward 192 GGCCAGTACCTTCTCAATGTCAGC WGPCR clone 2 reverse 193 CCTTCTAGTCTCATCCAGCAAC aOle03fl 194 TGAGGCCCAGGTTGAGTATAAC a01e03r1 197 tgaccaggatccCCTTCTAGTCTCATCCA GCAAC TB193 with BamHl 198 tgaccaggatccTGAGGCCCAGGTTGAGT ATAAC TB194 with BamHl 201 tgaccaggatccCTAGCTTAGCGCCATTA ACTCC WGPCR clone 3 forward 202 tgaccaggatccGCCCTGGCAGGTACTTT TAGG WGPCR clone 3 reverse 203 tgaccaggatccACTCTTATTTTGAACAC TGACACG WGPCR clone 4 forward 204 tgaccaggatccCAGTCGCATTGCATGTT CCG WGPCR clone 4 reverse 205 tgaccaggatccACAGAATGAACACATTG GACTG a01a04f1 206 tgaccaggatccTTAGGTGCGGAAATATG a01a04r1 121 IIIIIIIIIIIIIIIIIIIIIIIII---———_______ Table 1 (cont’d). 
01 :i_.g_o Sequence Notes AAATG 207 tgaccaggatccATATAGGACTGCATCAT a01d06f1 GAACC 208 tgaccaggatccAAAGCATAGGAACTTAC a01dO6r1 CTTAG 209 TGGGAGGTAGACATGTACCATAC a01e01f1 21L) GATCATGTGATGGCTAAATAATC a01e01rl 211. CGGACACTGCTCACCACCATC a01e09f1 212 CAATTTCAGTCTCGCAGTAGTCC a01e09r1 213 tgaccaggatccCAAAGACTTTCATTAAG a01f10f1 TGCATC 214 tgaccaggatccTCCTCAGATGCAAGTCA a01f10r1 CTGG 215 tgaccaggatccCAGTGTGGAGTCTGTTA a01e04f1 GAGTGC 216 tgaccaggatccATCTGACGCCTAATGTT a01e04r1 CCAG 217 TCAGCACAGACTGGGAGAAGG a02a05f1 2 18 GTCTTCGACCTGCCTGCATC a02a05r1 219 .AGCCAGGAGTTCAGCGGTCG a01e05f1 220 CTTCCAGAGAATAAGAAACAGGTC a01e05r1 221. CGCGTCTACTGCTGGAACCTG a02h01f1 222 CTTAATGTGGGTTGAAGCACTGG a02h01r1 223 TGGTTCTGTTTTTCTTCTGCC a02c03f1 224. GACGCCCTTTATGCCCTTCC a02c03r1 225 'FFACAGCAGAAACTCACAGAT a02d07f1 2 2 6 ATCCATTTCTTGTCTATTTCA a 0 2 d0 7 r l 227 GATCTACAGGGTTGTTTTAGAAA a02e07f1 228 GGGCTGGGAAACGGGTAGG a02e07r1 2 2 9 ATTTATTAATTTGGTACAAGAAGG a O 2 f O 7 f l 230 TCTGAGGGTCCCGTGCGAGC a02f07r1 231. TTCAGAAAGACTACACCAGGC a01a08f1 232 AAGAAAGTGTGATGCCTATGG a01a08r1 2 3 3 ACGGTGGTTGTGTCTGTTTTGG U2 3 5 3 6 , c 1 one 2 RT— PCR 234 .ATCTGCATCAGAGGAGCCATTG a02a05f2 _235» TGCCTTCGTGCCAGTTTTCC a02a05r2 122 IIIIIIIIIIIIIIIIIII-llll-:::——————____l Table 1 (cont'd). ‘gligo Sequence Notes 236 'TTTGCAAGTCCTAGCCCACC a02f07f2 237 CCACCGAAACTGACCCTAAG a02f07r2 238 GGAGGCGCTCCTAAAGCTGC AA049839-a02f07—f1 239 (GCCCAGGCAGTTCCCAGCAG AAO49839-a02f07—r1 24$) CTCATCGCTGTTGGTCACGG a02h01f2 2 4 l TCACTGTACATTAAGACTGCACGC a 0 2 C 0 3 f 2 242 GATCGCACTCCTGTACTGGTCAC a02c03r2 243 CGCTGCTGCCTGTGGAAGAC E25mRNAf1 244. GCAGGCTCCACCAACAATCAG E25mRNAr1 245 CTTCGAGCACCACAGGACATG mouse RAR alpha f1 24f; GTCTGGCAGGTAGTTGTGATGG mouse RAR alpha r1 247' GAACACCAGCGTACTTCATCACC a01e03f2—RT—PCR 248 CTGGAGCTGTCGTCACTTTCATC a01e03r2—RT-PCR-AA172276 249 CTCCCAATTAGCGAAAGTGCG clone 3 RT—PCR primer 250 CCTACACCTCTGGCGGGTTC reverse clone 3 251. 
GGCGCGGATTTTAGGGACTTC reverse clone 3 252 .AGTAAGTGTGATTTTGGTTTTCC reverse for a01d06 253 CGAGGGGACAAAGGCGTTCAG a01e04 for RTPCR 254 .ACTGTGCCACGCTGACTCTGC a01e04 for RTPCR 255 .AGCGTGTGCTGCCCTTTGATG a01e09 for RTPCR 256i GCAGTAGTCCCCGGCGAAACC a01e09 for RTPCR 257 CTCTGGGATCGGGGTCTGTC a02a05 for RTPCR- AA410101 258 .ATGCAAAGGGGATGAAGGGC a02a08 for RTPCR 259 (GAGGTACTCCGCAATGAGAGGG a02a08 for RTPCR 2 6 O GCGGGCTTAGGGTCAGTTTCG a O 2 f 0 7 f or RT PCR 2 6 1 AGCCAACCCCAGCCTTTCATC a 0 2 f 0 7 f or RTPCR- AA049839 262 CGAGCCAGTGCCTTGTTCAC a02h01 RTPCR—RNU75361 263 GGTTCTGTCCATCCCATTTGC a02c03 RTPCR—AI011275 264. TCTCCGTGATGGCTTGGTCTAG a03a03 265 GTGGTATAGCTTGGGCAAGGATC a03a03—AA177869 266 .ATCCAGAGGAAACAAAGGGCAG a03a04 267 CCTTTGGGGCCATGACTTTGAC a03a04 268 GCACTTCGGGCACCTCTACAG a03a05-AJ006256 123 Table 1 (cont'd). 01 igo Sequence Notes 269 TGATGAGCTGGGTAGACTGTGC a03a05-AJ006256 270 .ACCGAGACAACTACCCAAGGAC a03a06—MUSACACT 271. CATCCTGTCACCAAAGCGTAAC a03a06—MUSACACT 272 CCGCAGACTCAAACAGGACAAC a03e05—W30473 273 CTGGGTGCTGTGGGTCTCTAAC a03e05—W30473 274 .ATCCAATTCGCCACGGTCTATG a03e05 275 .ACTAGCCCTACCAGCGATGAG a03e12—MMU28656 276 TTCACAAAATTCAAGGCAGAGC a03el2—MMU28656 277 .AATTCGGGGAAACGGGACTC a03f06 278 GTTCCTTAACTTGTCCCCAAACTG a0ef06 279 CAGGAAAAGACGCAGGCAGTG a03gO7-AA015506 280 CCAGCGGGACGAAGATGATG a03g07—AA015506 281. GGCGGAAACACCTCATACATTG Dep—l 282 .AGAGAACACTTGAACAGCCTTTAC Dep—l 283 .AGAGGATAAAACGGAGACACAGC OPG 284: CAGGCAGGCTCTCCATCAAG OPG 285 GAACTCCCAATTAGCGAAAGTGC clone 3 for RT-PCR 286 (GATCAACACAGAATGAACACATTG a01a04 (E25) 287 GATCTCAGGAGACATTTTGTATAA a01a04 (E25) 288 GTTCAAATTGTTCTTCTGTCTC a01a04 (E25) 289 .AATCCTTGTTCCTCCTAAATAG a01a04 (E25) 290 .AAGGGTGCCGAACTGTCAAAC a01e09 (MEGF2) 291. 
CCGTGATGGGATGGATAGGC a01e09 (MEGF2) 292 GACAGAAGAACAATTTGAACTGC a01a04 (E25) 293 CGCTTCTTTGAAGCTTGACTGAGTTCTTT WTl PAX CON C 294 GAAAGAACTCAGTCAAGCTTCAAAGAAGC TB293 comp 295 CAGCTGCTCTATGAAGTGTGAAGAA thyroperoxidase TPO 296 TTCTTCACACTTCATAGAGCAGCTG TB295 comp 297 CCCCATTATTTACAGATGAGAAATTTATA glucagon G1—33 TTGT 298 .ACAATATAAATTTCTCATCTGTAAATAAT TB297 comp GGGG 299 TCGAAGGGCCACTGGAGCCCATCTCCGGC mb—l ACGGC ._§00 lGCCGTGCCGGAGATGGGCTCCAGTGGCCC TB299 comp 124 *7 Table 1 (cont'd). 01 igo Sequence Notes .____1 TTCGA 301 .AGTAGAAGACAATGCACAATATTGTATAG Sl—Crystallin II—A GG 3 O 2 CCCTATACAATATTGTGCATTGTCTTCTA TB3 0 1 comp CT 303 CCCAAATTCTCCAGCCTACAC Brn-3C 304 CTCCCACGGCAAGAACCATC Brn—3C 305 'TCAAGCCCGACGCCACCTAC Brn—3C 306 .AGGTGGCGTCGGGCTTGAAC Brn—3C 307 CGTGGGCGAGGTAGAAGTGC Brn—3C 3 0 8 GGCTGGATGGCGAAATAGGC Brn— 3 C 309 TGGCCGAGGGGAGTGGACAC clone sequencing_ 310 .AGATGACGCAGGGCCGGATG clone sequencing 311. CGTGTGCAGAATGAAGGAACTG PAX3/FKHR sequencing 312 GCCGAGCTGCCAAGAAGAAAG PAX3/FKHR sequencing 313 CACCCATTATGACCGAACAGG PAX3/FKHR sequencing 314. CAACCTTCTCTCATCACCAAC PAX3/FKHR sequencing_ 315 .ATGAGCCCTTTGCCCCAGATG PAX3/FKHR sequencing 316 (CATTATGACACCAGTTGATCCTG PAX3/FKHR sequencing 317 TCTGCAGTCAACGGGCGTCC PAX3/FKHR sequencigg_ 318 CAATGGCTATGGCAGAATGG PAX3/FKHR sequencing 319 CGGAATGACCTCATGGATGG PAX3/FKHR sequencing 3 2 O CATGGATGGCCTGTTCTCTGTC mFAT PCR 3 2 1 AGGCCAGCCTCTCACAGTATTC mFAT PCR 322 CACCGACCAGGACGTGTATGAC mFAT PCR 323 GCGTATGTACCTTCGTTGCCG Linker #2 324. gatcCGGCAACGAAGGTACATACGC Linker #2 3 2 5 GCACGACCTCACGCTGACTGG Linker # 3 326 gatcCCAGTCAGCGTGAGGTCGTGC Linker #3 327 .ACTGGTACGCACGACGATTGG Linker #4 328 gatCCCAATCGTCGTGCGTACCAGT Linker #4 329 .ATCCTCACGCCAATGATTTCC mFAT PCR 3 3 O CCCACATGGATGGCCTGTTC mFAT PCR 331. GCAGCAAGCCCAAGCAGGTG PAX3 sequencigg_ _§32 CAAATTACTCAAGGACGCGG PAX3 sequenciggg 125 Table 1 (cont'd). 
ATCCTG 01 igo Sequence Notes 3 3 3 GGCAACGTGAACAGGTCCAAGGC FKHR RTPCR 334 TCCAATGGCACAGTCCTTATCTAC FKHR RTPCR 335» GCACACATTGGGCAAACATCCTG FKHR RTPCR 3 3 6 TAACCCTCAGCCTGACACCCAGC FKHR RTPCR 337 tgaccaggatccTCCAATGGCACAGTCCT TB334 with BamHI TATCTAC 3 3 8 tgaccaggatccGCACACATTGGGCAAAC TB3 3 5 wi th BamHI 3 3 9 AGCCCCAAAGCGAGCGAAGC NAP2 human Nap1L4 3 4 0 CACTGTGCCATTCCGATTCCGC NAP2 human Nap1L4 3 4 1 GGGTGAAGGACGGCTGTGATG Human Celsrl orthologue 3 42 CTCTGGGGCACGGAAGGTCG Human Cel srl orthologue 3 4 3 TTCTGCTCCTGCTCCCATAAG Mouse Cel srl 344 CTGAGGTTCTCCATTCCGAGC Mouse Celsrl 3 4 5 CGGCAAAGCTCTATGGAAGTGG Human l euc ine aminopep t . 3 4 6 ACTTTGCAGCAGACACGATGGC Human leuc ine aminopept . 3 47 GCTCAAACAGGAATGGACAGAC Human nAchr—B3 3 4 8 TATTTCCCATTCTCCGTTATCG Human nAchr— B 3 3 4 9 GCGCTCGTGTTCTGGTCCTATG Human ACAT 350 .ATGAAGAGCACGAACAGCACGG Human ACAT 3 5 1 TTAATGCCACCGACCCTGATG Mouse OL—protocadherin 3 5 2 AGGAAGACTTGAGGCGGAACG Mouse OL—pro tocadherin 3 5 3 CCATCGTGGCGTCTCTACAGG Human OL—pro tocadherin 3 5 4 CCAGTCCATGCGAAAGAGGTTC Human OL-pro tocadherin 355» CGCCTCAGCCTCCTTAGTCTCC Human KIAA665 3 5 6 GGCAGACAAAGGCGCTCAGG Human KIAA6 6 5 3 5 7 C TCGACCCTCAGATGGACAACC hFAT 3 5 8 CCTCAATGTCAGTCACGGAAGAG hFAT 3 5 9 CAGATGACGCACGGATGATGG mouse c 1 one 4 3 6 0 GCTCAAATGTTCCCACTGTCCC mouse c lone 4 3 6 l CACACCTACACAAC CATTCGGG human c 1 one 4 3 6 2 AGGCAGATCTCAGTGACGACCC human c 1 one 4 3 6 3 ATCCACATCTTCACTCAAGCCG mous e STC 364 .ATTGGCGATGCACTTTAAGCTC mouse STC _36 5 CAGCAGCATCACCAGCAACAAC human STC 126 Table 1 (cont'd). Oligo Sequence Notes 3 6 6 GCCGACCTGTAGAGCACTGTTG human STC 367 CCAAAGGTCCCAATCTGAACGC mouse tektin 368 TGTACTTGGCAGAGCGGTTCAG mouse tektin 369 .ATTAGCAGACAGCATCAGGGCG human tektin 370 .ACAGCTCTTGAAAAGGCCATCC human tektin 371. CATCCCTGAGCACATCGACATC mouse ABC transporter 372 CTAGCCGCGTCTTCACATACTG mouse ABC transporter 373 .AGATCCCACCACCTGTCATTATG human ABC transporter 374. 
TGGGCTCACCTGCTGTTTCC human ABC transporter 375» TGCGGCAAACATAACCCAAGC mouse ENPEP 376 ‘TGCAGTGAATCCCAAAAGTCGG mouse ENPEP 377’ GCCTGTACAACCATTCCACCAC mouse engrailed ITHB CAGAGCCCACAGACCAAATAGG mouse engrailed 379 .ACGGAGCAGCCAGACACAAAG mouse chM3 3 8 0 GGACTTTCGAGATGGTTAGGGC mous e chM3 381. GAAGGAGCTGGCGGACATCAC human chM3 382 CTGTCATGTTCTGCTCTGTTGGTC human chM3 383 GGGACGGCCTCACCTACAATG mouse IMPD 384. GTCCGGGCAATGATGGCTACC mouse IMPD 385» GAACGAACCCGCCCTCCACTG mouse ankyrin 3 8 6 CACCTTCCCAAACAAAGCCGC mouse ankyrin 387 TCCAGGCTCATTTGCTTCCAC mouse nell—related 388 .ACCGAATCTCATCCCTCAGGC mouse nell-related 389 TTCCTTGTAAGCTGAGACTGAGCG cq01b09—AA463176 390 ACCCAGTCACAAGCAGCTCAGC cq01b09—AA463176 391 GTCGACGATGCTGATGCTGCTG cq01g11—AI604872 392 ACGCAGCTGGTTGGTGAACAG cq01g11—AI604872 393 CTTTACTAATTGAACTCACTGGCCC cq02d09-AI662207 394 .ATCTCACCAGCCGCCTTCAAG cq02d09-AI662207 395 .ATCTTCTGGTGGCATGAGTGTCAC cq02h08—AA511962 396' CCAAAACCAAACTGTAGTGAGCATG cq02h08-AA511962 397 CTGTGCTGTGTTGCGGACCTC cq02d09-AI558083 398 TGAGGTGAGAGGAGCTGGCCC cq02d09-A1558083 399 .AGCAGAGGGAATGGGGAGCAG cq04e09 400 TGCCTCAAAGAAACTGTCTGGACC cq04e09 127 Table 1 (cont'd). Oligo Sequence Notes 401 .AGAGCTCCTGAACTGCCTGATG cq04e09—human R55454 402 CAATCTCAACACAGGAACCTCCC cq04e09-human R55454 403 CCAGACGGATCCCACAAAACC cq05a07—AA869455 404. CACATCCAGGCACAGCGGTAG cq05a07—AA869455 405 CGGCCTTAGGAGAGTCGCAGAG cq05b12—AA656349 406 GGAGCTGAAGCACAGGAAGTGG cq05b12-AA656349 407 .ATGGTCACCTGGCTCTATCGC cq07e01-AA414968 408 .ACCCCAAATGTGCCTGATCTC cq07e01—AA414968 409 GCCCAACAGTTACCCAGACGG cq07f01-W13454 410 GTTGAAGGAGGAGTGGCAGCG cq07f01-W13454 411. CTCAGACCCTGGAGAGTTTGGAG cq08b09-AI425320 412 CCCGCAGCACCACGTATAAG cq08b09—AI425320 413. GATCTCAGGATATCCTCGAATTCTGCTGG pGL3 linker ATC 414. GATCCTGCAGCAGAATTCGAGGATATCCT pGL3 linker GAGATC 415 GAAACAGCTATGACCATG M13 reverse primer 416 (TEAAAACGACGGCCAGTG M13 —20 seq. 
primer 417 GTAATACGACTCACTATAG T7 sequencing primer 418 .AACGTCCGTCACGGAAGAATTAATCTTAT mFAT ds Oligo GCAGAA 419 TTCTGCATAAGATTAATTCTTCCGTGACG mFAT ds Oligo GACGTT 420 .AAAGAAATTAGCTATGATAATCAATTC Itm2A ds Oligo 421. GAATTGATTATCATAGCTAATTTCTTT Itm2A ds oligp 422 CTGAATTATTGGAAAAGAGATATGAAATT Itm2A ds oligg 423 CTGCAATTTCATATCTCTTTTCCAATAAT Itm2A ds Oligo TCAG 424. GTCATTTCATATTTCCGCACCTAAATTTA Itm2A ds Oligo ATTATACA 425 TGTATAATTAAATTTAGGTGCGGAAATAT Itm2A ds Oligo GAAATGAC 426i CTGTCCGACTTGCCCTCCTTG even—skipped oligg 427 .AGGGTGGGCGGAAGGTAAAGG cq09CO8—AA592551 428 CGCGCCAGACTCCTAAGAGCC cq09c08—AA592551 429 GATCCATGCCACAGACCAGGAC hFAT around Sau3AI 128 Table 1 (cont'd). Oligo Sequence Notes 430 .AGACGTCCAGATGTGGGTGAGG hFAT around Sau3AI 431. TGTGGGTGCGATGAGGAATAAG human Itm2A Sau3AI 432 TTAATGCAGAAGATGAAAGGACAATC human Itm2A Sau3AI 433 .AGCCGCACGGTCAGAACTCAG Human Itm2A for RTPCR 434 CACGAATTTCCTCCACAGCAAC Human Itm2A for RTPCR 435 .ACCCTTCATTAGTTCCACCACGGTGCTCT a01e09 ds oligo TCCGGCCTAT 436 .ATAGGCCGGAAGAGCACCGTGGTGGAACT a01e09 ds oligo AATGAAGGGT 437 ACCCTTCAGGAGTTCCACCACGCATATCT a01e09 ds oligo mutant TCCGGCCTAT 438 .ATAGGCCGGAAGATATGCGTGGTGGAACT a01e09 ds oligo mutant CCTGAAGGGT 439 TTCTGCTCTGGCTGCTCTTGC Ligl mouse RTPCR 440 CTGGGTGATCCTGTTTTTGCTC Ligl mouse RTPCR 441. ACACATTCACGGTGAGCCTGG Ligl human RTPCR 442 TGTGCCCACCCAGAATCACTG Ligl human RTPCR 443 CTCGGCTCGCTGGCACTTTC Human ICAM5 RTPCR 444 .AGAGTTGCGCTTGGGGATTGG Human ICAM5 RTPCR 445 .AACCTTTCTGGGCGGACCTTC Mouse ICAM5 RTPCR 446 CCTACGAAACTGCGGCGAATC Mouse ICAM5 RTPCR 447’ CAATCCAAACCAGAGACACCCG Human LAMA2 448 TCAGGCGAATATAGCGAGCGG Human LAMA2 449 .AGGAACCCTCAGTGCCGAATC Mouse LAMA2 450 'TCAGGCGAATGTAGCGAGCAG Mouse LAMA2 451. GTTTGACAGTTCGGCACCCTTCATTAGTT a01e09 ds oligo CCACC 452 GGTGGAACTAATGAAGGGTGCCGAACTGT a01e09 ds oligo CAAAC 453 GTTTGACAAATCGGCACCCTTCAGGAGTT a01e09 ds oligo mutant CCACC 454. 
GGTGGAACTCCTGAAGGGTGCCGATTTGT a01e09 ds oligo mutant CAAAC 455 .AAAAAGAAATTAGCTATGATAATCAATTC a01a04 ds oligo G 456 (CGAATTGATTATCATAGCTAATTTCTTTT aOlaO4 ds oligo 129 Table 1 (cont'd). 01 igo Sequence Notes T 457 TTTCAAATTAAGCTATTATGTGCCTTTTA a01a04 dS oligo G 458 CTAAAAGGCACATAATAGCTTAATTTGAA A a01a04 ds oligo 459 tgaccagggtccGCGTATGTACCTTCGTTGCCG 460 'TAGCCAGCGTATGTACCTTCGTTGCCG 461. CAGCTTTTCTGCCCCAACTAAC nucleoporin p54 human 462 GGGCATGCAACTATAACCTACTGC nucleoporin p54 human 463 .ACCACTCCACCACAGCCAAG engrailed 2 human RTPCR 464. GGTGAAACCCTAAGCAGCCC engrailed 2 human RTPCR 465 CGATCCAAGACACACAGGGTTACAC pheromone mouse RTPCR 466 ACATTTCCTCCTTAGACTATTGGC pheromone mouse RTPCR 467 CATGGCCGACTACCTGATTAGTG human IMPDH2 RTPCR 468 GGAGAGGAAACCAGTGGGGTC human IMPDH2 RTPCR 469 .AAGGAGAAAGCCAAGTCAGCG mouse YAF2 RTPCR 470 .ATCTAATAAATCCTGGCAAATTCG mouse YAF2 RTPCR 471. CTGGTGGAGTGGGGTGATAGC human YAF2 RTPCR 472 .AACTGCTGAGTAACCTGCTGTGC human YAF2 RTPCR 473 CCTCACCGCTCCTGTTGTTTG mouse LR3 RTPCR 474 TTCCGGGTACTGCCATCCATC mouse LR3 RTPCR 475 CCTCGCCGCTCCTGCTATTTG human LR3 RTPCR 476 CCATCCCTGCCCGCTCAATC human LR3 RTPCR 477' GGGGTGTTGCAGTTTCATTGG mouse DSGl 478 GCCTGGCGCTATTACCCTTTC mouse DSGl 479 CAACTCAAAGAGGAACCCAATCG human DSGl 480 TGAGTTCAAATTGTTCGGTTCATC human DSGl 481 .ATGCCCAGATAGTGCGGTTGC human ankyrin 482 ‘GATTTGCACCACATCGGTCTTG human ankyrin 483 CCAGTCTTTGGAGGTGGTGGTG mouse atrophin 1 related 484 CGTTAAGGGGCTCCGACAATAC mouse atrophin 1 related 485. CGTCCACACCCGTCAACACAC human atrophin 1 related 486 1CCCGCTGGCGTTTGTTACTC human atrophin 1 related 487' CAAAATTACCAGGCTTATCACTGC mouse VAPl 488 .AACAAGCTGGCATACTGACTCATC mouse VAPl 130 Table 1 (cont’d). 01 igg Sequence Notes 489 CTTCGATTTGCTTCCTCTCCAC human VAPl 490 GAAGTTGGTCTGATCGGCTGAG human VAPl 491. 
CTCCTTGGCGTTGCTCAGTTC mouse myomegalin 492 GAGCGATCGGAACAAACAAGC mouse myomegalin 493 .AGACTTGGACACAGTTGCAGGG human myomegalin 494 TTCTAGCCGACTGCGTAACACC human myomegalin 495 GCTGGTTTAAGGCGTCATTCAC mouse EST KIAA0809 496 CCGTAACCAAAGCAGCACAGC mouse EST KIAA0809 4 9 7 GTCTTCCTTGCCCCACAGTTG human KIAAO 8 0 9 498 GCCACCCTCTGACACCCAATC human KIAA0809 499 CGGGCTTAGGGTCAGTTTCG mouse SIG 500 .ACACCTCCTCAAGCAACTCTGTG mouse SIG 501. GGTGCCTTATGCCCGCTCAG human SIG 502 GCCACCATTTTCCCCTCCTG human SIG 503 TGGGCAACAATTTATCTGGGGTC human a01e03 504. CCCAGTTGGACACAAGGCTGC human a01e03 505 CCACGGCCTCTGTCATCCAAG human a02c03 506 TCTTGCCCCACTCCTTCTTCC human a02c03 507 .ACACCCGTGCCTTCTAATGAG mouse BVES 5 O 8 TTGTCCTCTGTGGCGTAAACC mous e BVES 509 GGGCCACTCTCTACCGATGTG human BVES 510 .AAGGCACAGGGGTAAATGTTATG human BVES 511. CATGTTGGTGCCTTCTCTGTGG human cq08b09 512 TTCGGTTGGCAGAGGAGCAG human cq08b09 513 .ACGCAGGCCAGAGGAGAGAAC mouse cq08e08 514. CTGACAAAGGCAACAAACAACCC mouse cq08e08 515 CTGGCGAAGTCTGGGGAGTC mouse cq09d07 5 l 6 GCAACGAGCAACTGACCATCC mous e ch 9 d0 7 517’ CTATGGCGGCATGTTTGAAGC mouse cq09f02 518 TGGCCTGGGCTACTTTCAACAC mouse cq09f02 519 .ACTCAACACGGCTGGGTCCTG human cq09f02 520 .AGTTCACCCTCAGCGACCTCAC human cq09f02 521. TTTAGGCAGCCACTTCTCTGAAAG mouse cq11a06 522’ GGTGGGATGTTGAGGCAAGTAAC mouse cq11a06 523 GTGTCACGGGTCTGCTCAAGG mouse cqllalO 131 Table 1 (cont’d). n--. 
2" 5.--;— L 01 i_go Sequence Notes 52 4 CAGAGCCTTGCGATTGAGTGC mouse cql lal 0 5 2 5 CCACGGAGCAGAATGTCAAGC human cql 1 a 1 0 KIAAO 2 5 6 5 2 6 GCTGGACAATGGGAAGACCTG human cql lal 0 KIAAO 2 5 6 5 2 7 GACAGGTGGATCAC TGCGGTG mou s e cql 1b 0 6 5 2 8 TCCTGCGGTGGTGACTTGCTC mouse cql 1b0 6 5 2 9 GTGCTTGTGCCTCTTGTTATTCC mouse cql 0 a 1 1 5 3 O AAGTGAAACCACAGGCGATGC mouse cql Gal 1 5 3 l GCTCCGGGACAGTTGCTTCAC mou s e c ql 2 e 0 5 5 3 2 AAATGCTGAGGAGGGTGGAGG mous e cql 2 e O 5 5 3 3 CACTGGTTTGTTCTGTCCTCCTG human a 0 3 a O 4 5 3 4 TACCGTGCGTGGGACTATGAC human a 0 3 a 0 4 5 3 5 TCTCTCTAATGGCAGCTTGGGAC human a O 3 f 1 0 5 3 6 ATCATTCAGCACCCATCACTCAC human a O 3 f 1 O 5 3 7 GCCTTCCGATTCCCACCTTG mouse c lone 3 5 3 8 CTCCCAATTAGCGAAAGTGCG mous e c 1 one 3 5 3 9 CAAGTTCTCTGCCTGCCTGCC human c 1 one 3 5 4 O CTCCCGAGAAGCCGAGATAATAG human C 1 one 3 5 4 1 AAAGCGGCAAAGTGCCCCAAG mous e ch 6 g0 6 5 4 2 TCAGGGATAACCGTCGCAAGC mouse ch 6 g0 6 5 43 TTCCTAACTGAGCCCAAAGAGGTG mouse ch 6e12 5 4 4 AGGGTCGAGGCGTTCTGCTTC mous e ch 6 e1 2 5 4 5 GCACTGCACTGAGGACCCG human ch 6 e1 2 5 4 6 GCAAGGGTGGGAGCATTCTG human ch 6 e 1 2 5 4 7 GGGAAGGTGAAGGTCGGAGTC human GAPDH 5 4 8 CGCTCCTGGAAGATGGTGATG human GAPDH 5 4 9 TGGTGAAGGTCGGTGTGAACG mou s e GAPDH 5 5 0 CAGAAGGGGCGGAGATGATG mous e GAPDH 5 5 1 ACAGAGCCTCGCCTTTGCCG human B- ac t in 5 5 2 C CTCGTCGCCCACATAGGAATC human B— ac t in 5 5 3 GTCCACACCCGCCACCAGTTC mous e B- ac t in 5 5 4 CCAGAGGCATACAGGGACAGC mous e B — ac t in 5 5 5 C CATGAGGGTCACCGAGGAAC human TGFA promo ter 5 5 6 CAGGCAGGCCACATCGTTAAG human TGFA promo ter 5 5 7 CCACCTCAGAGCCACAAATCC human TGFA promoter 5 5 8 C TGTGACGAATCTGGTTATATGGC human TGFA promo ter 132 Table 1 (cont'd). 
559  CCTGTTCGCTCTGGGTATTGTG  human TGFA RTPCR
560  CAAACTCCTCCTCTGGGCTCTTC  human TGFA RTPCR
561  GGGTATCCTGTTAGCTGTGTGCC  mouse TGFA RTPCR
562  CAGAGTGGCAGCAAGCAGTCC  mouse TGFA RTPCR
563  CTCGCTATTTGACTCGTGGCTAC  human VEGFR RTPCR
564  TATTATTGCCATGCGCTGAGTG  human VEGFR RTPCR
565  GGAAACCACAGCAGGAAGACG  mouse VEGFR RTPCR
566  AGGGATGCCATACACGGTGC  mouse VEGFR RTPCR
567  tgaccaggatccTGTGGGTGCGATGAGGAATAAG  Human Itm2A 431 w/tail
568  tgaccaggatccTTAATGCAGAAGATGAAAGGACAATC  Human Itm2A 432 w/tail
569  tgaccaggatccATGATTTCCTCAACGTCCGTCAC  Mouse mFAT w/tail
570  tgaccaggatccCACCGACCAGGACGTGTATGAC  Mouse mFAT 322 w/tail
571  tgaccaggatccGATCCATGCCACAGACCAGGAC  Human hFAT 429 w/tail
572  tgaccaggatccAGACGTCCAGATGTGGGTGAGG  Human hFAT 430 w/tail
573  tgaccaggatccGATCATGCCGCCTATGGAATAC  human VEGFR w/tail
574  tgaccaggatccGGGTTACCTGCGACTGAGAAATC  human VEGFR w/tail
575  ACCAAGTGTACCCCGAAAGAGG  human TGFA promoter
576  CCACCAGCTGAGGCCAAAAG  human TGFA promoter
577  CTGCCAGAACCAGAAGAAAGTATG  human VEGFR EMSA
578  TGGCAGGCAACAAATTAGTGAAC  human VEGFR EMSA
579  CCAGATTCGTCACAGAGACCCATTTTTTATCAGTC  human TGFA oligo EMSA
580  GACTGATAAAAAATGGGTCTCTGTGACGAATCTGG  human TGFA oligo EMSA
581  CCAGATTCCAGGAAGAGACCCATTTTTTATCAGTC  human TGFA mutant PD
582  GACTGATAAAAAATGGGTCTCTTCCTGGAATCTGG  human TGFA mutant PD
583  CCAGATTCGTCACAGAGACCCGCTTTCGTACAGTC  human TGFA mutant HD
584  GACTGTACGAAAGCGGGTCTCTGTGACGAATCTGG  human TGFA mutant HD
585  ATCTGCTCGCTATTTGACTCGTG  human VEGFR RTPCR
586  TCTAGAGTCAGCCACAACCAAGG  human VEGFR RTPCR
587  CAGCGCATGGCAATAATAGAAGG  human VEGFR RTPCR
588  AGGTTTCGCAGGAGGTATGGTG  human VEGFR RTPCR
589  ACATTACCTGGATTCTGCTACGG  mouse VEGFR RTPCR
590  TTCCCGGTTCTTGTTGTATTTTG  mouse VEGFR RTPCR
591  GCGTGCAGAGCCAGGAACATATAC  mouse VEGFR RTPCR
592  TAGAAGGAGCCAAAAGAGGGTCG  mouse VEGFR RTPCR
593  AGGATCTGGGAGGGCGAGTTG  human ASK
594  CAGGGCTTTGGCTCATATCTTCC  human ASK
595  AAGAACCAGCGGAACTCAAAAC  human cellubrevin
596  TTTTCTGGAAGGCATAAGTTGG  human cellubrevin
597  GGAGAACGAGGAGGACGGTGAG  human STM2
598  CAGCAAGAGGGTGGGGTAGTCC  human STM2
599  AAAGGGAGTCAAAGGGCTAACAG  human Hs_cul-3
600  GCACCTCCAACACCAACTTCAG  human Hs_cul-3
601  TTCAGGGCCTCCAACACCAAG  human ew15b11
602  CCCACGCAAGGCACTTACTCC  human ew15b11
603  GTCTCAGCCTCGCCACCTTC  human PCCMT
604  TCTCCGAAGACCACCATCAGC  human PCCMT
605  GACCGGGAGGAAAAGGCTGAG  human ev01a06
606  TGGCCTTCACCCTTCCTATTGC  human ev01a06
607  GAAGAAGCTGACTGCGGAGGTG  mouse ev01a06
608  TGGTCTGTTGGTGTCCCTGTCC  mouse ev01a06
609  CCATCGAGAGGGTCAAACTGC  human ANT1
610  CAGCATCCCCTTGGCAGTATC  human ANT1
611  CTTCTGCTCTGGCTGCTCTTGC  mouse GA3-43
612  CGGATGCTGATGCGGTTGCTC  mouse GA3-43
613  CGGGCCGTTATTTTCTTGACTC  human KIAA0339
614  CTGGGAGGCTCGGAAGTTGC  human KIAA0339
615  GTATCAGCGGCAGGGAAGGAG  human DAG1
616  GGCTTCTTATTGGCGATGTGC  human DAG1
617  GTCCCCAGATTCCCACACTCAG  human TGFA RTPCR
618  CAAACTCCTCCTCTGGGCTCTTC  human TGFA RTPCR
619  GTGCCACAGACCTTCCTACTTGG  human TGFA RTPCR
620  CTTCCCCAGTAGGCAAATGACAG  human TGFA RTPCR
621  AGCCAGAAGAAGCAAGCCATCAC  mouse TGFA RTPCR
622  CTCGGTGTGGGTTAGCAAGAAGG  mouse TGFA RTPCR
623  TGGGGACAAGAGGACAAAAGAGC  mouse TGFA RTPCR
624  AGAGGGGTGGAGAGGCTTGGTAG  mouse TGFA RTPCR
625  TCAACTACACCGAATCTCACAAAG  mItm2A EMSA
626  TGCGGAAATATGAAATGACTTGC  mItm2A EMSA
627  ATTGGAAAAGAGATATGAAATTGC  mItm2A EMSA
628  GTCCCTATTTAGGAGGAACAAGG  mItm2A EMSA
629  CACCAACCTAAAGTCCAATAAATG  hItm2A EMSA
630  TGAGACATGAGAAGAACAGTTTGC  hItm2A EMSA
631  GAAAGTCAGCAACGCCAAAGTC  mBVES EMSA
632  CTTGGTTCTGTGGAGTGTAGGC  mBVES EMSA
633  CCCCACCATGCACTTCAAATAG  mBVES EMSA
634  ATCATGGGCTAAGGTTGTTTTG  mBVES EMSA
635  ATCTCGCACCTTCCTACCTCAC  mBVES EMSA
636  TGTCATTATTTAATGCCTCAAGGCAGTGAAGCCAA  mBVES EMSA ds oligo
637  TTGGCTTCACTGCCTTGAGGCATTAAATAATGACA  comp to 636
638  CAGCGTAACAACATCAAAGATAGC  hVEGFR EMSA
639  GTTCATACTTTCTTCTGGTTCTGG  hVEGFR EMSA
640  CATAAATGTTCACTAATTTGTTGCCTGCCAGAACC  hVEGFR EMSA ds oligo
641  GGTTCTGGCAGGCAACAAATTAGTGAACATTTATG  640 comp.
TF180  tgaccaggatcCTGGAGGCCGGAAACAGGGCTCC
TF181  tgaccaggatccCTGCCCCGCCTCCCTCTCTGGC
TF182  TTGGGTTTGCTGCCGCCGATGG
TF183  GGCCTGCCGTTGATAAATACTCCTCG  Spd specific
TF184  CAGGGCCGAGTCAACCAGCTCG  wt specific

Table 2: PAX3 polymorphisms

Polymorphism   Location   Position
(T)n           Intron 8   311-321
(G)n           Intron 9   883-895
(GT)n          Exon 10    1135-1154
C/T            Intron 8   671
G/C            Exon 10    1038

Table 2: PAX3 polymorphisms. Five polymorphisms were detected in PAX3. Three of the polymorphisms are variable numbers of mononucleotide or dinucleotide repeats; two are single nucleotide substitutions. The locations of the polymorphisms in PAX3 are shown. Numbers correspond to the nucleotide positions shown in Figure 3: PAX3 sequence exons 8, 9, and 10.

Table 3: Pax3 expression constructs

Clone Name    Species   Isoform   Vector
Pax3c-pGEM    mouse     Pax3cQ+   pGEM-7Zf(+)
Pax3d-pGEM    mouse     Pax3dQ+   pGEM-7Zf(+)
PAX3c-pGEM    human     PAX3cQ+   pGEM-7Zf(+)
PAX3d-pGEM    human     PAX3dQ+   pGEM-7Zf(+)
Pax3c-pcDNA   mouse     Pax3cQ+   pcDNA3.1/Zeo(+)
Pax3d-pcDNA   mouse     Pax3dQ+   pcDNA3.1/Zeo(+)
PAX3c-pcDNA   human     PAX3cQ+   pcDNA3.1/Zeo(+)
PAX3d-pcDNA   human     PAX3dQ+   pcDNA3.1/Zeo(+)

Table 3: Pax3 expression constructs. Eight types of Pax3 expression constructs were generated. Constructs were prepared for Pax3cQ+, Pax3dQ+, PAX3cQ+, and PAX3dQ+ isoforms in pGEM-7Zf(+) and pcDNA3.1/Zeo(+) vectors.
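Table 1 annotates oligo 641 as the complement of oligo 640 (the two strands of the double-stranded hVEGFR EMSA probe). The relationship can be checked with a short reverse-complement routine. This is a minimal sketch in Python, not part of the original methods; the function name `reverse_complement` and the variable names are illustrative assumptions, and the two sequences are copied from Table 1.

```python
# Illustrative check (not from the original methods): confirm that oligo 641
# is the reverse complement of oligo 640, as annotated in Table 1.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


# Sequences as listed in Table 1 (hVEGFR EMSA ds oligo and its complement).
oligo_640 = "CATAAATGTTCACTAATTTGTTGCCTGCCAGAACC"
oligo_641 = "GGTTCTGGCAGGCAACAAATTAGTGAACATTTATG"

is_complementary = reverse_complement(oligo_640) == oligo_641
print("641 is reverse complement of 640:", is_complementary)
```

The same routine could be applied to any other annotated complementary pair in Table 1, provided both strands are transcribed in full.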
Table 4: Pax3 antibodies

            Peptide 1    Peptide 2     Peptide 3
Peptide     QKPWTF       QAFHYLKPDIA   QSYQPTSIPQAVSD
Pax3c       +            -             +
Pax3d       -            +             +
PAX3c       +            -             +
PAX3d       -            +             +
PAX3/FKHR   -            -             +
Anti-sera   PB30, PB31   PB32, PB33    PB34, PB35

Table 4: Pax3 antibodies. Three peptides (1, 2, & 3) were synthesized, conjugated to KLH and injected into rabbits. Anti-sera were collected from two rabbits for each peptide (PB30-PB35). Specificity of the anti-sera, determined by western blot analyses for the "c" and "d" isoforms of human and mouse Pax3 as well as human PAX3/FKHR, is indicated by (+) and (-) symbols.

Table 5: Buffers for Pax3 EMSAs and IPs

Component                B1    B2    B3    B4    B5    B6    B7
HEPES pH 7.9 (mM)        25    25    25    25    12    12    25
KCl (mM)                 100   50    50    10    -     100   50
DTT (mM)                 1     1     1     1     0.6   0.6   1
glycerol (%)             10    12    10    10    12    12    10
(dIdC)n (ug/Rxn)         1     0.15  1     1     2     2     1
MgCl2 (mM)               -     5     -     -     -     -     -
ZnSO4 (MM)               -     10    -     -     -     -     -
NP-40 (%)                -     0.1   -     -     -     -     -
EDTA pH 8.0 (mM)         -  -  -  1  1  -   [only six values legible; column assignment uncertain]
ZnCl2 [unit illegible]   50    -     50    50    -     -     -
Tris-HCl pH 7.9 (mM)     -     -     -     -     4     4     -
BSA (ug/Rxn)             -     -     -     -     3     3     1

Table 5: Buffers for Pax3 EMSAs and IPs. Seven buffers, B1-B7, were prepared using the reagents listed above. (-) indicates absence of a component.

Table 6: Summary of CASTing libraries

                               cq        ev        ew
DNA source                     Mouse     Human     Human
Protein                        Pax3dQ+   PAX3dQ+   PAX3/FKHR
Antibody                       PB33      PB33      PB35
PCR cycles - first round       22        26        26
PCR cycles - second round      14        26        20
PCR cycles - third round       12        18        20
Clones sequenced               1260      1203      1088
Total sequence obtained (bp)   444,813   426,527   388,527
Average size (bp)              353       355       357

Table 6: Summary of CASTing libraries. Three Pax3 CASTing libraries were prepared. The number of cycles of PCR is shown for each round of amplification in the CASTing strategy. Also shown are the number of clones sequenced and the total sequence obtained for each library. The average size reflects the average sequence read length and clone size.
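The average sizes reported in Table 6 follow from dividing the total sequence obtained by the number of clones sequenced. The arithmetic can be sketched as below. This is an illustrative Python snippet rather than anything used in the original work; the dictionary `libraries` and the function `average_size_bp` are hypothetical names, and rounding to the nearest base pair is an assumption about how the table values were derived.

```python
# Illustrative arithmetic for Table 6 (names are hypothetical):
# average size = total sequence obtained / clones sequenced.

libraries = {
    # library: (clones sequenced, total sequence obtained in bp)
    "cq": (1260, 444_813),
    "ev": (1203, 426_527),
    "ew": (1088, 388_527),
}


def average_size_bp(clones: int, total_bp: int) -> int:
    """Average sequence length per clone, rounded to the nearest bp."""
    return round(total_bp / clones)


averages = {name: average_size_bp(*values) for name, values in libraries.items()}

for name, avg in averages.items():
    print(f"{name}: {avg} bp")
```

Under that rounding assumption the computed values agree with the "Average size (bp)" row of the table.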
Table 7: Repetitive elements in CASTing library cq

Sequences:      1260
Total length:   444813 bp
GC level:       43.92 %
Bases masked:   108790 bp (24.46%)

                             number of   length       percentage
                             elements    occupied     of sequence
SINEs:                       189         18838 bp     4.24 %
    B1s                      66          5956 bp      1.34 %
    B2-B4                    102         11809 bp     2.65 %
    IDs                      5           398 bp       0.09 %
    MIRs                     6           675 bp       0.15 %
LINEs:                       162         35518 bp     7.98 %
    LINE1                    159         34925 bp     7.85 %
    LINE2                    3           593 bp       0.13 %
LTR elements:                235         42657 bp     9.52 %
    MaLRs                    135         22141 bp     4.98 %
    Retroviral               93          19268 bp     4.33 %
    MER4_group               7           33 bp        0.01 %
DNA elements:                39          6165 bp      1.39 %
    MER1_type                33          4849 bp      1.09 %
    MER2_type                6           1316 bp      0.30 %
    Mariners                 0           0 bp         0.00 %
Unclassified:                4           427 bp       0.10 %
Total interspersed repeats:              103178 bp    23.20 %
Small RNA:                   1           32 bp        0.01 %
Satellites:                  0           0 bp         0.00 %
Simple repeats:              87          4253 bp      0.96 %
Low complexity:              6           294 bp       0.07 %

Table 8: Repetitive elements in CASTing library ev

Sequences:      1203
Total length:   426527 bp
GC level:       46.80 %
Bases masked:   157962 bp (37.03%)

                             number of   length       percentage
                             elements    occupied     of sequence
SINEs:                       542         89132 bp     20.90 %
    ALUs                     504         84197 bp     19.74 %
    MIRs                     38          4935 bp      1.16 %
LINEs:                       93          15787 bp     3.70 %
    LINE1                    71          12216 bp     2.86 %
    LINE2                    22          3571 bp      0.84 %
LTR elements:                143         37528 bp     8.80 %
    MaLRs                    53          11762 bp     2.76 %
    Retrov.                  40          10785 bp     2.53 %
    MER4_group               37          11150 bp     2.61 %
DNA elements:                71          10368 bp     2.43 %
    MER1_type                46          7093 bp      1.66 %
    MER2_type                17          2347 bp      0.55 %
    Mariners                 0           0 bp         0.00 %
Unclassified:                2           642 bp       0.15 %
Total interspersed repeats:              153457 bp    35.98 %
Small RNA:                   1           103 bp       0.02 %
Satellites:                  6           982 bp       0.23 %
Simple repeats:              58          2297 bp      0.54 %
Low complexity:              37          1128 bp      0.26 %

Table 9: Repetitive elements in CASTing library ew

Sequences:      1088
Total length:   388527 bp
GC level:       42.51 %
Bases masked:   152121 bp (39.15%)

                             number of   length       percentage
                             elements    occupied     of sequence
SINEs:                       516         69635 bp     17.92 %
    ALUs                     486         65429 bp     16.84 %
    MIRs                     30          4206 bp      1.08 %
LINEs:                       141         29802 bp     7.67 %
    LINE1                    111         23603 bp     6.07 %
    LINE2                    29          6086 bp      1.57 %
LTR elements:                165         36806 bp     9.47 %
    MaLRs                    95          22353 bp     5.75 %
    Retrov.                  30          5881 bp      1.51 %
    MER4_group               29          5894 bp      1.52 %
DNA elements:                49          8255 bp      2.12 %
    MER1_type                32          5042 bp      1.30 %
    MER2_type                10          2439 bp      0.63 %
    Mariners                 0           0 bp         0.00 %
Unclassified:                1           226 bp       0.06 %
Total interspersed repeats:              144724 bp    37.25 %
Small RNA:                   4           248 bp       0.06 %
Satellites:                  17          5553 bp      1.43 %
Simple repeats:              24          794 bp       0.20 %
Low complexity:              29          885 bp       0.23 %

Table 10: BLAST results from CASTing library cq

Clone  Gene Description  Score
clone2  H.sapiens mRNA for hFat protein.  4.30E-69 nt
clone2  zo71f05.r1 Stratagene pancreas (#937..  4.50E-28 est
clone3  AV074335 Mus musculus stomach C57BL/..  7.80E-46 est
clone4  Homo sapiens cAMP-regulated guanine nu  2.70E-13 nt
clone4  an04h10.x1 Stratagene schizo brain S..  2.70E-11 est
a01a04  Human DNA sequence from PAC 696H22 o..  6.00E-20 nt
a01a08  AV041492 Mus musculus adult C57BL/6J..  4.90E-11 est
a01a12  C86399 Mouse fertilized one-cell-emb..  9.30E-17 est
a01b03  ab09d01.r1 Stratagene lung (#937210)..  3.20E-10 est
a01d04  UI-M-AL1-ahj-a-10-0-UI.s1 NIH_BMAP_M..  4.80E-26 est
a01d06  Novel human mRNA from chromosome 1,...
6.00E-20 nt
a01d06  EST179679 Cerebellum II Homo sapiens..  7.30E-19 est
a01e03  Homo sapiens, WORKING DRAFT SEQUENCE...  2.70E-18 nt
a01e04  Human chromosome 11 cosmid cSRL-87...  5.50E-15 nt
a01e09  Sequence 4 from Patent WO9707209.  5.90E-48 nt
a01f10  Homo sapiens Xq28 BACS 360 F12, GSHB...  6.40E-38 nt
a02a05  leucine aminopeptidase [cattle, kidn...  6.20E-50 nt
a02a05  ml85a08.y1 Stratagene mouse kidney (...  7.00E-74 est
a02a08  Human nicotinic acetylcholine recept...  1.00E-59 nt
a02a08  yl71f01.r1 Soares infant brain 1NIB  1.20E-27 est
a02b01  Homo sapiens, clone hRPK.14_A_1, com...  6.20E-51 nt
a02c03  Homo sapiens mRNA for KIAA1020 prote...  7.70E-20 nt
a02c03  me05f05.x1 Soares mouse embryo NbME1...  5.10E-49 est
a02d01  Homo sapiens PAC clone DJ0789N01 fro...  1.70E-26 nt
a02d03  Homo sapiens clone NH0308G20, WORKIN...  4.50E-14 nt
a02d07  pEMS730-E14 Flow Sorted Mouse Y Chro...  1.70E-43 nt
a02d07  vo05d05.y1 Stratagene mouse skin (#9...  2.70E-09 est
a02d08  UI-R-Y0-abm-e-09-0-UI.s1 UI-R-Y0 Rat...  8.10E-10 est
a02e07  AV075911 Mus musculus stomach C57BL/...  4.10E-23 est
a02f07  mj14b11.r1 Soares mouse embryo NbME1...  9.30E-19 est
a02f10  Homo sapiens clone DJ1121A15, WORKIN...  1.80E-17 nt
a02f12  Homo sapiens, WORKING DRAFT SEQUENCE...  3.10E-25 nt
a02f12  vo01f04.r1 Stratagene mouse skin (#9...  1.10E-37 est
a02h09  HS_3092_B1_C06_MF CIT_Approved Human...  7.30E-16 nt
a02h12  kflak065 Roswell Park Cancer Institu...  2.60E-19 nt
a02h12  uh62d08.r1 Soares mouse embryonic st...  1.80E-20 est
a03a02  vs70g12.x1 Stratagene mouse skin (#9...  2.10E-25 est
a03a03  mt01f08.r1 Soares mouse 3NbMS Mus mu...  3.00E-12 est
a03a04  Homo sapiens chromosome 12 clone 44N10,  2.60E-33 nt
a03a04  op47a03.s1 Soares_NFL_T_GBC_S1 Homo  1.30E-34 est
a03a06  uh87h09.r1 Soares mouse urogenital r..  2.10E-11 est
a03a11  EST229166 Normalized rat kidney, Ben...  3.10E-12 est
a03c03  Mus musculus, WORKING DRAFT SEQUENCE...  2.20E-52 nt
a03c03  mt78a01.y1 Soares mouse lymph node N...
1.90E-56 est
a03c12  Mus musculus, WORKING DRAFT SEQUENCE...  5.00E-36 nt
a03c12  vi93c10.r1 Stratagene mouse heart (#...  6.50E-27 est
a03e05  Homo sapiens mRNA for KIAA0665 prot...  1.30E-13 nt
a03e05  AV160597 Mus musculus head C57BL/6J  4.70E-12 est
a03f02  AV061833 Mus musculus small intestin...  1.80E-09 est
a03f06  ue82e01.r1 Soares mouse uterus NMPu ..  2.50E-67 est
a03g07  mh99d09.r1 Soares mouse placenta 4Nb...  6.70E-36 est
a03g10  Mus musculus clone 182_H_5, WORKING  8.30E-16 nt
a03g10  mq54e12.y1 Soares 2NbMT Mus musculus...  3.00E-19 est
cq01a04  HS_5332_B2_G06_T7A RPCI-11 Human Mal..  1.60E-12 nt
cq01b02  Homo sapiens stanniocalcin (STC) gen...  1.10E-37 nt
cq01b02  tu88a10.x1 NCI_CGAP_Gas4 Homo sapien...  8.70E-37 est
cq01b09  RPCI-11-457M11.TV RPCI-11 Homo sap...  3.70E-22 nt
cq01b09  UI-M-AP1-agm-g-02-0-UI.s1 NIH_BMAP_M...  1.50E-57 est
cq01c05  Mus musculus, WORKING DRAFT SEQUEN...  1.50E-42 nt
cq01c05  mv27f04.r1 GuayWoodford Beier mouse  1.80E-44 est
cq01d07  Homo sapiens chromosome 9q34, clone  1.30E-23 nt
cq01d12  UI-R-Y0-act-d-07-0-UI.s1 UI-R-Y0 Rat.  2.00E-12 est
cq01e12  wj37d12.x1 NCI_CGAP_Lu19 Homo sapien...  6.30E-23 est
cq01g04  Homo sapiens mRNA; cDNA DKFZp586K18...  9.30E-20 nt
cq01g04  an01g07.x1 Stratagene schizo brain S...  7.00E-22 est
cq01g11  me34d07.x1 Soares mouse embryo NbME1.  2.60E-65 est
cq01g12  ou31b08.x1 Soares_NFL_T_GBC_S1 Homo  6.90E-21 est
cq02a01  Mus musculus, WORKING DRAFT SEQUEN...  5.70E-14 nt
cq02a01  uf01b12.x1 Sugano mouse embryo mewa  1.50E-27 est
cq02a09  Homo sapiens clone NH0394E01, WORKIN...  1.10E-30 nt
cq02c05  Homo sapiens BAC clone RG041H04 from...  2.10E-41 nt
cq02d04  Human DNA sequence from PAC 93H18 on..  1.30E-43 nt
cq02d09  mz96e10.x1 Soares mouse lymph node N...  3.20E-27 est
cq02e10  MDB1091 Mouse brain, Stratagene Mus  1.80E-10 est
cq02f03  Homo sapiens BAC clone RG118D07 from..  2.70E-15 nt
cq02h06  Human Chromosome 15q26.1 PAC clone p...  9.40E-22 nt
cq02h08  vj40c01.r1 Stratagene mouse skin (#9...
1.40E-57 est
cq02h09  HS_5409_A2_G11_T7A RPCI-11 Human Mal...  1.90E-25 nt
cq03b12  Mus musculus, WORKING DRAFT SEQUENC...  2.00E-17 nt
cq03b12  mm05f03.r1 Stratagene mouse diaphrag...  7.20E-41 est
cq03c02  Mus musculus, WORKING DRAFT SEQUENCE...  2.20E-23 nt
cq03c02  vl34g03.x1 Stratagene mouse skin (#9...  8.20E-11 est
cq03d07  Human DNA sequence from clone 246H3 o..  6.10E-14 nt
cq03d09  Homo sapiens clone NH0236P02, WORKI...  7.40E-15 nt
cq03d09  vw65g11.x1 Stratagene mouse heart (#...  8.60E-14 est
cq03f03  Human DNA sequence from clone 505B13 o.  2.50E-31 nt
cq03f05  Homo sapiens chromosome 16 clone RPC...  8.20E-21 nt
cq03g04  HS_5189_A2_A01_SP6E RPCI-11 Human Ma...  9.10E-15 nt
cq03g04  vm10c09.r1 Knowles Solter mouse blas...  7.60E-21 est
cq03g09  Human engrailed protein (EN2) gene...  6.60E-35 nt
cq03g09  qa09a09.x1 NCI_CGAP_Brn23 Homo sapie...  7.40E-30 est
cq03h01  my20c07.r1 Barstead mouse heart MPLR...  4.00E-10 est
cq03h08  H.sapiens mRNA for ubiquitin conjug...  2.80E-09 nt
cq03h08  vk83d09.s1 Knowles Solter mouse 2 ce...  4.50E-13 est
cq04c02  AV075379 Mus musculus stomach C57BL/...  9.60E-19 est
cq04c03  HS_2014_A2_A07_MR CIT Approved Human...  6.30E-52 nt
cq04c06  Homo sapiens chromosome 16 clone 312E..  1.50E-40 nt
cq04e09  Homo sapiens mRNA; cDNA DKFZp434D17...  9.80E-56 nt
cq04e09  vl11f04.r1 Soares mouse mammary glan...  4.90E-54 est
cq04e10  Homo sapiens, WORKING DRAFT SEQUENCE...  1.80E-17 nt
cq04f06  Human DNA sequence from clone 90L6 .  5.00E-21 nt
cq04h01  Mus musculus clone GSMB-187H15, WORK...  9.60E-09 nt
cq04h01  mr19h06.r1 Soares mouse 3NbMS Mus mu...  6.80E-16 est
cq05a03  Mus musculus, WORKING DRAFT SEQUENC...  9.00E-41 nt
cq05a03  AU015086 Mouse two-cell stage embryo...  2.90E-50 est
cq05a07  vq08c04.r1 Barstead stromal cell lin...  9.60E-17 est
cq05b02  Homo sapiens chromosome 4 clone Bl...  2.40E-29 nt
cq05c01  Human DNA sequence from clone 67K17...  2.30E-18 nt
cq05c07  Human DNA sequence from clone 215D11...
1.60E-46 nt
cq05c08  Human DNA sequence from clone 28H20 o..  9.70E-17 nt
cq05d06  Homo sapiens 12q24.1 PAC RPCI1-71H24...  2.70E-28 nt
cq05d09  EST233022 Normalized rat ovary, Bent...  1.30E-20 est
cq05e07  Mus musculus chromosome 10 clone 595...  1.10E-30 nt
cq05e07  mo51h12.r1 Life Tech mouse embryo 10...  3.60E-32 est
cq05f12  Homo sapiens chromosome 6 clone DJ19...  3.30E-14 nt
cq05g02  Mus musculus clone 187_J_17, WORKING...  2.80E-24 nt
cq05g02  vf43b08.y1 Soares mouse NbMH Mus mus...  4.90E-29 est
cq05g03  AU022865 Mouse unfertilized egg cDNA...  1.20E-17 est
cq05g12  vv21b03.r1 Stratagene mouse heart (#...  8.10E-15 est
cq06a05  Sequence 3 from Patent WO 9001545.  1.30E-23 nt
cq06a05  vi36f12.r1 Beddington mouse embryoni...  6.00E-37 est
cq06b03  Homo sapiens chromosome 16, BAC clon...  2.50E-24 nt
cq06b04  Homo sapiens chromosome 21 PAC LLN...  1.80E-12 nt
cq06c03  Homo sapiens chromosome 16 clone CI...  4.20E-11 nt
cq06d06  Homo sapiens clone DJ1032D07, WORKIN...  3.60E-09 nt
cq06f05  RPCI-11-245C21.TV RPCI-11 Homo sapi...  1.10E-18 nt
cq06f06  Mus musculus, WORKING DRAFT SEQUENC...  6.80E-28 nt
cq06f11  Human YY1-associated factor 2 (YAF2)  3.70E-39 nt
cq06f11  ud80e04.r1 Soares mouse mammary glan...  4.40E-79 est
cq06g06  UI-R-C3-th-a-01-0-UI.s1 UI-R-C3 Ratt...  2.50E-30 est
cq06g07  Homo sapiens Xp22 BAC G8279A12 (Ge...  1.30E-15 nt
cq06h04  HS_5103_A1_D09_T7A RPCI-11 Human Mal...  1.50E-28 nt
cq07e01  mc84h10.x1 Soares mouse embryo NbME1...  1.00E-56 est
cq07f01  mc03g08.r1 Soares mouse p3NMF19.5 Mu...  3.60E-14 est
cq07f09  Homo sapiens clone NH0557N21, WORKIN...  1.80E-36 nt
cq07h12  Homo sapiens chromosome 17 clone 251...  1.60E-16 nt
cq08b09  Homo sapiens clone NH0309N08, WORKIN...  3.30E-38 nt
cq08b09  mf08b11.y1 Soares mouse p3NMF19.5 Mu...  3.30E-28 est
cq08d01  yp87b06.r1 Soares fetal liver spleen...  3.80E-11 est
cq08e02  RPCI11-50P9.TJ RPCI-11 Homo sapiens  6.40E-13 nt
cq08e05  Burkholderia pseudomallei invasion-...
7.00E-18 nt
cq08e08  Mus musculus chromosome 19 clone D19..  7.60E-18 nt
cq08e08  ui67b11.x1 Sugano mouse liver mlia M..  4.20E-28 est
cq08g08  RPCI11-105012.TJ RPCI-11 Homo sapi...  1.20E-11 nt
cq09a11  Homo sapiens chromosome X clone bWX...  2.30E-17 nt
cq09b03  Human DNA sequence from clone 1045...  1.50E-31 nt
cq09c03  Homo sapiens genomic DNA, 21q region...  1.90E-30 nt
cq09c03  zd15e02.r1 Soares_fetal_heart_NbHH19...  9.70E-29 est
cq09c08  vo24e09.r1 Barstead mouse myotubes M...  4.80E-10 est
cq09d07  uk22d06.y1 Sugano mouse embryo mewa ..  1.90E-24 est
cq09d09  HS_3121_B1_D03_T7C CIT Approved Huma...  3.90E-10 nt
cq09f02  Homo sapiens clone p5B6, genomic sur...  1.20E-24 nt
cq09f02  mv25b07.r1 GuayWoodford Beier mouse  9.80E-20 est
cq09f04  Bos taurus desmoglein mRNA, complete...  3.30E-10 nt
cq09f09  Homo sapiens chromosome 4 clone B320...  8.20E-16 nt
cq09g04  RPCI11-65M10.TJ RPCI-11 Homo sapiens.  8.60E-31 nt
cq09h05  Mus musculus, WORKING DRAFT SEQUENCE...  4.60E-09 nt
cq09h05  ma82h09.r1 Soares mouse p3NMF19.5 Mu...  5.30E-11 est
cq10a11  Human mRNA for KIAA0013 gene, compl...  1.20E-10 nt
cq10a11  vb17a01.r1 Soares mouse 3NbMS Mus mu...  1.60E-71 est
cq10b03  HS_2009_A2_A05_MR CIT Approved Human...  6.30E-21 nt
cq10c01  Mus musculus chromosome 10 clone 536...  2.30E-17 nt
cq10c01  vu95g05.x1 Stratagene mouse skin (#9.  3.20E-14 est
cq10c08  vu57d08.x1 Soares mouse mammary glan...  7.90E-36 est
cq10d03  Ovis aries glyceraldehyde-3-phosph...  2.20E-30 nt
cq10d03  C89210 Mouse early blastocyst cDNA M...  6.80E-40 est
cq10e01  mj42b12.r1 Soares mouse embryo NbME1..  2.70E-13 est
cq10f09  Human mRNA for KIAA0229 gene, partial  4.20E-09 nt
cq10f09  vd21a08.s1 Knowles Solter mouse 2 ce...  1.00E-23 est
cq10g11  Homo sapiens chromosome 1 clone 111...  1.80E-36 nt
cq10h01  Human DNA sequence from clone 998H6  5.60E-37 nt
cq10h09  AU022864 Mouse unfertilized egg cDNA...  7.20E-13 est
cq10h12  Homo sapiens mRNA for nel-related prot.  9.60E-28 nt
cq10h12  zb08g09.r1 Soares_fetal_lung_NbHL19W...
1.50E-28 est
cq11a06  vg37h03.r1 Soares mouse mammary glan...  4.20E-39 est
cq11a07  Homo sapiens chromosome 4, WORKING  1.20E-18 nt
cq11a10  ud26h04.r1 Soares 2NbMT Mus musculus...  6.10E-21 est
cq11b08  Human DNA sequence from PAC 12409 o...  2.10E-13 nt
cq11c04  vl44g02.y1 Stratagene mouse skin (#9...  2.00E-12 est
cq11d03  vu55c11.x1 Soares mouse mammary glan...  9.40E-18 est
cq11e01  ui27h04.r1 Soares mouse urogenital r...  3.60E-13 est
cq11e03  Human DNA sequence from clone 459L4  1.10E-16 nt
cq12a07  Homo sapiens chromosome 4, WORKING D...  1.20E-64 nt
cq12a08  mj63c11.r1 Soares mouse p3NMF19.5 Mu...  1.20E-73 est
cq12a09  Human DNA sequence *** SEQUENCING I...  3.90E-11 nt
cq12c04  HS-1015-B1-E04-MF.abi CIT Human Geno...  6.90E-10 nt
cq12d12  Homo sapiens mRNA for KIAA0809 prote...  2.30E-15 nt
cq12e03  Homo sapiens PAC clone DJ0170019 fro...  1.70E-28 nt
cq12e03  nv07d05.s1 NCI_CGAP_Pr22 Homo sapien...  1.50E-29 est
cq12e05  Homo sapiens chromosome 19 clone CIT...  5.30E-13 nt
cq12f12  mo60f10.x1 Stratagene mouse Tcell 93...  3.50E-37 est
cq12h03  ye42f09.r1 Soares fetal liver spleen...  3.10E-22 est

Table 11: BLAST results of CASTing library ev

Clone  Gene Description  Score
ev01a01  Human DNA sequence from clone 398F6 o..  8.20E-14 nt
ev01a06  wa90a04.x1 NCI_CGAP_GC6 Homo sapiens...  1.70E-62 est
ev01a06  Homo sapiens cadherin-8 mRNA, compl...  5.10E-58 nt
ev01a08  HS_5126_B2_E09_T7A RPCI-11 Human Ma...  2.90E-43 nt
ev01a12  Homo sapiens genomic DNA, 21q regio...  5.30E-38 nt
ev01c02  HS_5519_B2_C09_SP6E RPCI-11 Human Ma...  1.50E-12 nt
ev01c07  Human DNA sequence from cosmid B11B7 o.  1.90E-37 nt
ev01d02  Homo sapiens chromosome 17, clone 34...  2.60E-48 nt
ev01e04  Homo sapiens mRNA for KIAA0933 prot...  1.50E-31 nt
ev01e04  AJ003273 Selected chromosome 21 cDNA...  1.20E-30 est
ev01f06  Human Chromosome 15q26.1 PAC clone p...  3.80E-38 nt
ev02c01  HS_3121_B1_D03_T7C CIT Approved Huma...  7.20E-31 nt
ev02f03  Homo sapiens chromosome 11 clone DJ7...
1.20E-54 nt
ev02f06  mt54c02.y1 Stratagene mouse embryoni...  7.20E-15 est
ev02f09  HS_5219_B2_F11_SP6E RPCI-11 Human Ma...  1.30E-57 nt
ev02g05  HSBC5F092 STRATAGENE Human Skeletal ..  8.20E-14 est
ev02g08  RPCI-11-348N2.TV RPCI-11 Homo sapien...  1.10E-56 nt
ev02g09  qn20d08.x1 NCI_CGAP_Lu5 Homo sapiens...  3.90E-42 est
ev02h02  CIT-HSP-2333J18.TF CIT-HSP Homo sa...  2.10E-73 nt
ev02h02  ae82e05.s1 Stratagene schizo brain S...  5.20E-72 est
ev02h03  Human heart/skeletal muscle ATP/ADP .  3.50E-54 nt
ev03a06  Homo sapiens chromosome 4, WORKING D...  1.30E-88 nt
ev03a09  Human DNA sequence from clone 696P19...  3.00E-88 nt
ev03b01  am26g07.s1 Soares_NFL_T_GBC_S1 Homo .  5.70E-50 est
ev03b04  Homo sapiens chromosome 5 clone P1_...  2.90E-58 nt
ev03c02  cDNA GA3-43 encoding novel polypepti...  2.80E-21 nt
ev03c02  zt54b02.r1 Soares ovary tumor NbHOT .  2.90E-18 est
ev03c04  HS_5054_B2_B05_T7A RPCI-11 Human M...  2.10E-52 nt
ev03d12  wg44b02.x1 Soares_NSF_F8_9W_OT_PA_P_...  2.20E-63 est
ev03d12  HS_5176_B1_A07_T7A RPCI-11 Human Mal...  5.80E-26 nt
ev03e02  Homo sapiens chromosome 19 clone CI...  2.5E-104 nt
ev03e02  HSZ78387 Human fetal brain S. Meier-...  4.20E-68 est
ev03e07  Human DNA sequence from PAC 341110 o...  3.70E-96 nt
ev03g09  Homo sapiens clone 549M12/460I24, WO...  5.50E-72 nt
ev04a06  ye65d04.r1 Soares fetal liver spleen...  7.60E-14 est
ev04a10  Human mRNA for KIAA0123 gene, partia...  1.00E-28 nt
ev04a10  EST92780 Skin tumor I Homo sapiens c...  9.60E-28 est
ev04b01  Human DNA sequence from clone 1054C...  1.10E-25 nt
ev04b04  Homo sapiens chromosome 19, cosmid Rd..  6.30E-55 nt
ev04b11  zf87h03.s1 Soares_pineal_gland_N3HPG...  4.70E-52 est
ev04c01  Homo sapiens clone hRPK.29_A_1, WORK...  4.60E-15 nt
ev04e06  HHEA03H Atrium cDNA library Human he...  7.80E-55 est
ev04e07  ty77f01.x1 NCI_CGAP_Kid11 Homo sapie...  1.50E-21 est
ev04e07  Drosophila melanogaster, chromosom...  3.40E-10 nt
ev04e11  RPCI-11-416L2.TV RPCI-11 Homo sapien...
4.50E-32 nt
ev04g02  Human mRNA for KIAA0339 gene, comple...  1.30E-72 nt
ev04g09  Homo sapiens chromosome 5 clone CIT9...  2.90E-15 nt
ev05a01  RPCI-11-423J18.TV RPCI-11 Homo sapie...  6.90E-85 nt
ev05a06  Human DNA sequence from clone 1090E8...  1.30E-36 nt
ev05a06  vs30c05.r1 Stratagene mouse Tcell 93...  2.50E-12 est
ev05b06  Homo sapiens high-mobility group pho...  5.60E-70 nt
ev05b06  wj47c01.x1 NCI_CGAP_Lu19 Homo sapien...  3.90E-68 est
ev05b10  RPCI-11-265K5.TJ RPCI-11 Homo sapie...  1.90E-21 nt
ev05d08  Human PAC clone DJ149P21, complete...  1.90E-33 nt
ev05e04  ti75e01.x1 NCI_CGAP_Kid11 Homo sapie...  2.00E-12 est
ev05g10  Human DNA sequence from PAC 345P10 on.  2.1E-113 nt
ev05h06  zv57f01.r1 Soares_testis_NHT Homo sa...  1.30E-29 est
ev06a03  CIT-HSP-2339K11.TR CIT-HSP Homo sap...  1.10E-27 nt
ev06a11  CIT-HSP-2017012.TRB CIT-HSP Homo sap..  5.40E-16 nt
ev06a12  Human DNA sequence from clone 172B20...  1.90E-68 nt
ev06b03  Homo sapiens chromosome 14 clone b...  1.30E-44 nt
ev06b03  wj98e08.x1 NCI_CGAP_Lym12 Homo sapie...  3.10E-24 est
ev06c01  Homo sapiens chromosome 19, cosmid RJ..  1.50E-87 nt
ev06c11  Homo sapiens DNA sequence from PAC 1...  1.00E-25 nt
ev06d04  Homo sapiens chromosome 8 clone BA...  3.30E-94 nt
ev06d07  HS_3218_A2_G08_MR CIT_Approved Human...  1.50E-34 nt
ev06e12  Human Chromosome 15q26.1 PAC clone...  1.60E-51 nt
ev06f11  347G5.TVB CIT9788KA1 Homo sapiens g..  6.90E-24 nt
ev06g05  HS_5224_A2_G04_T7A RPCI-11 Human Mal...  1.20E-58 nt
ev06g07  EST179730 Cerebellum II Homo sapiens...  1.70E-28 est
ev06g10  Homo sapiens PAC clone DJ1143H19 fr...  4.80E-48 nt
ev06g12  Human DNA sequence from cosmid E132...  4.30E-93 nt
ev06g12  af71a07.r1 Soares_NhHMPu_S1 Homo sap...  2.30E-76 est
ev06h08  Homo sapiens Chromosome 22q11.2 Cosmi..  3.80E-72 nt
ev06h08  DKFZp434F1328_s1 434 (synonym: htes3...  4.50E-58 est
ev06h11  Homo sapiens chromosome 5 clone CIT9..  1.60E-15 nt
ev07a02  Homo sapiens chromosome 17, clone hd..
3.10E-83 nt
ev07b06  Human DNA sequence from clone 20B11 o..  5.70E-96 nt
ev07b08  Homo sapiens PAC clone DJ1064B22 fr...  5.00E-15 nt
ev07d07  zb67f10.s1 Soares_fetal_lung_NbHL19W...  7.20E-34 est
ev07d11  HS_5038_B2_F02_SP6E RPCI11 Human Mal...  5.30E-24 nt
ev07e05  Human laminin alpha 2 chain (LAMA2...  4.10E-18 nt
ev07e09  Homo sapiens chromosome 10 clone CIT98  7.60E-15 nt
ev07f05  53b1 Human retina cDNA randomly prim...  2.80E-18 est
ev07f05  Human mRNA fragment encoding cytopl...  5.50E-18 nt
ev07g04  Homo sapiens chromosome 5, BAC clone...  6.70E-58 nt
ev07h10  Homo sapiens, WORKING DRAFT SEQUEN...  3.70E-94 nt
ev08a09  ac71f03.s1 Stratagene fetal retina 9...  1.30E-38 est
ev08a12  Human dystroglycan (DAG1) mRNA, comp...  1.3E-131 nt
ev08b01  Homo sapiens chromosome 5 clone CI...  8.20E-73 nt
ev08b01  zx56h12.r1 Soares_fetal_liver_spleen...  4.80E-17 est
ev08d01  HS-1053-A2-E02-MR.abi CIT Human Geno...  1.40E-21 nt
ev08e10  Homo sapiens PAC clone DJ0991G20, c...  9.4E-102 nt
ev08g01  CITBI-E1-2537H3.TR CITBI-E1 Homo sap...  1.20E-61 nt
ev08g04  Homo sapiens chromosome 19 clone CI...  7.50E-66 nt
ev08h02  Homo sapiens complement component 4 .  5.90E-25 nt
ev08h02  wb27e11.x1 NCI_CGAP_GC6 Homo sapiens...  1.90E-23 est
ev08h05  Homo sapiens clone 564L17, WORKING D...  3.00E-88 nt
ev09b03  tf32e03.x5 NCI_CGAP_Brn23 Homo sapie...  2.60E-43 est
ev09b12  ys84d12.s1 Soares retina N2b4HR Homo...  1.60E-20 est
ev09b12  CIT-HSP-217308.TF CIT-HSP Homo sapi...  3.90E-15 nt
ev09c10  RPCI-11-378G12.TV RPCI-11 Homo sapie...  3.00E-43 nt
ev09c12  HS_5409_A2_G11_T7A RPCI-11 Human Mal...  4.10E-41 nt
ev09d03  yb60d02.r1 Stratagene ovary (#937217...  3.40E-42 est
ev09d09  Homo sapiens DNA from chromosome 19...  2.40E-27 nt
ev09e10  Homo sapiens, WORKING DRAFT SEQUENCE...  2.50E-25 nt
ev09e12  Human DNA sequence from clone 327J...  2.20E-16 nt
ev09e12  tm57f03.x1 NCI_CGAP_Brn25 Homo sapie...  7.50E-15 est
ev09f03  ta67d02.x1 Soares_total_fetus_Nb2HF8...
1.50E-11 est
ev09f09  Homo sapiens DNA sequence from clo...  5.90E-55 nt
ev09g09  Homo sapiens clone A-685D8, WORKIN...  2.80E-25 nt
ev09g10  Homo sapiens clone NH0178A14, WORK...  1.10E-13 nt
ev09h07  Homo sapiens clone DJ1093I16, WORKIN...  2.0E-105 nt
ev09h07  yr23b04.s1 Soares fetal liver spleen...  8.40E-37 est
ev09h08  Homo sapiens chromosome 8 clone BAC .  6.20E-58 nt
ev09h08  nk71c03.s1 NCI_CGAP_Sch1 Homo sapien...  9.20E-56 est
ev09h12  Homo sapiens gene for CC chemokine P...  1.80E-47 nt
ev10a09  nw18e05.s1 NCI_CGAP_GCB0 Homo sapien...  3.20E-28 est
ev10a09  Homo sapiens chromosome 5, PAC clon...  2.60E-27 nt
ev10c09  Homo sapiens clone DJ1058P19, WORK...  1.50E-59 nt
ev10d05  RPCI11-72G7.TK RPCI-11 Homo sapiens ..  1.00E-78 nt
ev10d06  qd53d10.x1 Soares_fetal_heart_NbHH19...  8.20E-11 est
ev10d10  tn52g10.x1 NCI_CGAP_Kid11 Homo sapie...  1.40E-94 est
ev10d10  Homo sapiens chromosome 19 clone CI...  7.30E-46 nt
ev10e02  CIT-HSP-2305L15.TF CIT-HSP Homo sapi...  2.80E-62 nt
ev10h10  Homo sapiens CD30L protein (CD30L) g...  3.90E-25 nt
ev14a06  RPCI11-15I10.TP RPCI-11 Homo sapiens...  1.90E-91 nt
ev14d03  RPCI11-43D11.TJ RPCI-11 Homo sapiens...  6.60E-78 nt
ev14e02  Homo sapiens PAC clone DJ0669B10 fr...  7.70E-30 nt
ev14e10  RPCI-11-289E14.TJ RPCI-11 Homo sapie...  1.10E-11 nt
ev14g05  Homo sapiens chromosome 17, clone  9.30E-58 nt
ev15e09  Homo sapiens chromosome 17, clone hR...  8.50E-15 nt

Table 12: BLAST results of CASTing library ew

Clone  Gene Description  Score
ew01a03  Hdll7-f Adult heart, Clontech Homo s.  1.70E-12 est
ew01a07  Homo sapiens prostate-specific membr.  2.20E-14 nt
ew01b01  zd90b12.r1 Soares_fetal_heart_NbHH19.  9.20E-48 est
ew01b01  CIT-HSP-2330E10.TF CIT-HSP Homo sapi.  6.50E-13 nt
ew01c06  HS_5340_B1_C04_SP6E RPCI-11 Human Ma.  1.20E-75 nt
ew01d02  Homo sapiens, WORKING DRAFT SEQUENC..  2.20E-71 nt
ew01e06  CITBI-E1-2511C8.TF CITBI-E1 Homo sap.  2.30E-15 nt
ew01f02  te93f04.x1 NCI_CGAP_Pr28 Homo sapien.  4.80E-36 est
ew01f03  Human mRNA for KIAA0313 gene, comple.
3.70E-19 nt
ew01f03  DKFZp434C2218_r1 434 (synonym: htes3.  1.60E-16 est
ew01f11  Homo sapiens chromosome 17 clone hRP.  1.70E-75 nt
ew01f11  qq96g10.x1 Soares_total_fetus_Nb2HF8.  1.40E-13 est
ew01g07  Human DNA sequence from PAC 528L19..  5.70E-31 nt
ew01g08  Homo sapiens chromosome 5 clone CIT..  2.70E-47 nt
ew01g09  HS_5543_A2_G08_T7A RPCI-11 Human Mal.  1.20E-30 nt
ew01g09  op97d02.x5 NCI_CGAP_Lu5 Homo sapiens.  4.80E-25 est
ew02a01  Human familial Alzheimer's disease (.  4.40E-50 nt
ew02a01  tm35e11.x1 NCI_CGAP_Kid11 Homo sapie.  5.40E-41 est
ew02a02  CITBI-E1-2603H15.TF CITBI-E1 Homo s..  9.70E-27 nt
ew02b01  Homo sapiens chromosome 17, clone 34.  4.30E-48 nt
ew02d06  Sequence 5 from patent US 5712380.  8.10E-32 nt
ew02d06  EST80611 Placenta II Homo sapiens cD.  3.20E-24 est
ew02d10  Homo sapiens Xp22 PACS RPC11-263P4  7.30E-48 nt
ew02e08  Human DNA sequence from clone 524E1..  1.90E-36 nt
ew02h02  HS_3218_B1_F12_T7 CIT Approved Hum.  9.00E-48 nt
ew02h05  Human DNA sequence from clone 20B11  9.40E-44 nt
ew03a03  Human DNA sequence from clone 503N11.  5.60E-23 nt
ew03a04  Homo sapiens genomic DNA, chromosome.  2.60E-93 nt
ew03a09  yn47d03.r1 Soares adult brain N2b5HB.  1.30E-43 est
ew03a11  RPCI-11-348123.TV RPCI-11 Homo sapi..  2.90E-12 nt
ew03b03  HS_5420_B1_C08_T7A RPCI-11 Human Mal.  4.50E-25 nt
ew03c01  Homo sapiens clone NH0368J13, WORK...  2.20E-37 nt
ew03d05  Human DNA sequence from cosmid N5H6  8.30E-60 nt
ew03d11  Human DNA sequence from clone 20J23  6.9E-110 nt
ew03e04  Homo sapiens chromosome 16, BAC clon.  1.30E-80 nt
ew03e09  yv06c08.s1 Soares fetal liver spleen.  7.40E-33 est
ew03e09  RPCI11-105K19.TJ RPCI-11 Homo sapi..  5.10E-21 nt
ew03e10  Homo sapiens chromosome 5, Bac clone.  6.30E-74 nt
ew03g01  HS_3229_B2_F08_T7 CIT Approved Human.  2.60E-15 nt
ew03g10  RPCI-11-161I18.TJ RPCI-11 Homo sapie.  3.40E-16 nt
ew03h11  Human DNA sequence from clone 467L1  1.10E-77 nt
ew03h11  zc81e06.r1 Pancreatic Islet Homo sap.
7.30E—28 est ew03h12 Human DNA sequence from clone 15D7.. 1.50E-26 nt ew04a02 EST24394 Cerebellum II Homo sapiens 1.60E-19 est ew04a08 13c4 Human retina cDNA randomly 4.90E—74 est ew04b06 Homo sapiens chromosome 1 clone DJ6.. 6.00E—27 nt ew04c08 CITBI-E1-2527F12.TR CITBI—El Homo sa. 4.60E-27 nt ew04d01 Homo sapiens chromosome 17, clone 13. 8.20E-35 nt ew04d02 Homo sapiens clone NH0092G23, WORK.. 1.30E-36 nt ew04d12 HS_3071_A2_E01_MR CIT Approved Huma.. 2.40E-48 nt ew04e06 RPCI11—108N10.TV RPCI—ll Homo sapiena 7.4OE—46 nt ew04f10«CITBI—E1—2546B6.TF CITBI—El Homo sa.. 4.20E—49 nt ew04g05 RPCI—11—196K19.TV RPCI—ll Homo sapi.. 5.40E—19 nt ew04g07 Homo sapiens Xp22 BAC GSHB-184P14 (.. 9.80E—25 nt ew04h01.Homo sapiens chromosome 19 clone CI.. 1.5E—107 nt ew04h04 Human transforming growth factor—alph. 7.90E—50 nt ew04h081qa99b10.S1 SoareS_pregnant_uteruS_Nb. 4.80E—10 est ew05b06 RPCIll—l47IlO.TV RPCI—ll Homo sapi.. 4.00E—12 nt ew05c09:nz65h04.s1 NCI_CGAP_GCBl Homo sapieni 2.7OE—23 est ew05d07 Human DNA sequence from clone 134019. 1.50E—24 nt ew05d11.Human DNA.Sequence from clone 569M23. 2.50E—98 nt ew05e03 HS_3053_A1_C08_MR CIT Approved Huma.. 1.60E—33 nt ew05e07 Homo sapiens BAC clone RG135C18 from” 1.00E—48 nt ew05e07 vd49e06.s1 Knowles Solter mouse 2 ce. 5.10E—29 est ew05e09 td12d03.x1 NCI_CGAP_C016 Homo sapien. 5.50E—16 est ew05f04 Homo sapiens PAC clone DJ0922123 fro. 3.20E-48 nt ew05f08 tc56c01.x1 Soares_NhHMPu_Sl Homo sap. 1.20E—27 est ew06b03 Homo sapiens chromosome 5, BAC clone. 2.20E—14 nt ew06c03 Homo sapiens genomic DNA, chromosome. 9.9OE—26 nt ew06c05 Human BAC clone RG126M09 from 7q21.. 1.20E-44 nt 155 Table 12 (cont'd). Clone Gene Description Score ew06c08 zr30a03.r1 Stratagene NT2 neuronal p. 3.40E—12 est ew06d05 Homo sapiens chromosome 20 clone DJ}. 2.30E-46 nt ew06d06 Homo sapiens PAC clone DJ1032B10 f.. 3.90E—49 nt ew06d11.CIT—HSP-385N2.TR CIT—HSP Homo sapiend 2.40E—16 nt ew06d12 Homo sapiens chromosome 16 clone BAC. 
1.30E—36 nt ew06e06flHomo sapiens chromosome 17, clone H.. 1.40E—78 nt ew06e06 tz45d01.x1 NCI_CGAP_Brn52 Homo sapie. 3.10E—39 est ew06e11 Homo sapiens chromosome 1 clone 9E21. 2.00E-37 nt ew06f03 Homo sapiens genomic DNA of 8p11.2 S. 2.80E—25 nt ew06f04 Homo sapiens clone NH0288C18, WORKIN. 5.00E—43 nt ew06g03 Homo sapiens clone NH0308G20, WORKI.. 9.90E—40 nt ew06h01.zv62g05.s1 Soares_testiS_NHT Homo sa. 6.20E—18 est ew07a05 te53e07.x1 Soares_NFL_T_GBC_Sl Homo 3.20E-29 est ew07a11.Human metallothionein I-B gene, exo.. 6.20E—29 nt ew07b02 Homo sapiens clone 15_A_14, WORKING.. 5.80E-61 nt ew07b02 yc34f07.sl Stratagene liver (#937224) 1.20E—46 est ew07b04 Homo sapiens chromosome 16 clone 2D4. 2.60E-71 nt ew07d03 Homo sapiens genomic DNA of 8p11.2.. 1.40E—49 nt ew07d03 te60c10.x1 Soares_NFL_T_GBC_Sl Homo 2.20E-37 est ew07d091am60b11.x1 Johnston frontal cortex H. 2.10E—10 est ew07e02 Human DNA sequence from.PAC 106C24... 8.00E-26 nt ew07e10ZRPCI—11—419I17.TV RPCI—ll Homo sapie. 5.60E-66 nt ew07e10 Zh53a03.r1 Soares_fetal_liver_spleen. 1.90E—60 est ew07f11.HS_2170_B1_E01_MR CIT Approved Hum... 4.50E-14 nt ew07g01 HS_2178_B2_D05_T7 CIT Approved Human. 2.20E-57 nt ew07gOl.ou48b04.x5 NCI_CGAP_BrZ Homo sapiens. 8.50E—39 est ew07903 Homo sapiens chromosome 8 clone PAC 2.40E-36 nt ew08c11 HS_2037_B1_F05_MR CIT Approved Human. 3.60E—46 nt ew08d12 Homo sapiens chromosome 12p13.3, WOR” 3.10E—32 nt ewO8e11§Humanfiglycophorin B gene, exon 5, 8.20E—32 nt ew08elljyp96a09.sl Soares fetal liver Spleen. 1.70E—31 est ew08g06 Human DNA sequence from cosmid L129. . 1.10E—59 nt ew09a10 Homo sapiens DNA from chromosome 1... 5.10E—52 nt ew09b04 Homo sapiens chromosome 5 clone CI... 8.10E—63 nt ew09d02 Human DNA sequence *** SEQUENCING I.. 1.00E—48 nt 156 Table 12 (cont'd). Clone Gene Descrifipt ion Score ew09d11.Homo sapiens clone DJ0562J12, WORKI.. 4.4E—108 nt ew09d12 Homo sapiens clone hRPK.64_A_1, WORK. 
7.9OE—17 nt ew09f11.344L7.TPB CIT97BSKA1 Homo sapiens 9.20E—15 nt ew09g03 Homo sapiens clone GSZO7AO4, WORKING. 1.20E—70 nt ew09g09 RPCI11-126L17.TV RPCI—ll Homo sapien. 4.40E—14 nt ew09h01 Human DNA sequence from.clone 43408.. 5.00E—58 nt ew09h06 Homo sapiens chromosome 17, clone HC. 4.00E—57 nt ew09h10 Homo sapiens, WORKING DRAFT SEQUENCE. 1.50E—26 nt ew09h11.Homo sapiens clone DJ1058P19, WORKIN. 5.20E—39 nt ew10a02 Homo sapiens, WORKING DRAFT SEQUENC.. 2.20E—15 nt ew10a11.Homo sapiens genomic DNA, chromoso... 1.60E—38 nt ew10d05 HS—1016—A2-F11—MR.abi CIT Human Ge... 5.80E—28 nt ew10d06 Homo sapiens mRNA; cDNA DKFZp586Il... 3.1E—100 nt ew10d06¢oa85f05.sl NCI_CGAP_GCBl Homo sapien. 1.00E—96 est ew10f05 Homo sapiens PAC clone DJ0537J23 fro. 3.3E-103 nt ew10903ZHS_2096_A2_G04_T7C CIT Approved BURL. 1.50E—53 nt ew10g03*wm54b04.x1 NCI_CGAP_Ut2 Homo sapiens. 1.90E—16 est ew10g08.ak07d11.sl Soares_parathyroid_tumor_, 9.60E—38 est ew10g08fiHomo sapiens ATPase homolog mRNA, 2.10E—37 nt ew10gflf>Homo sapiens chromosome 10 clone LAl. 4.4E—105 nt ew10g10 zp43g03.r1 Stratagene muscle 937209 5.60E—17 est ew10h05 RPCI11-109P18.TJ RPCI—ll Homo sapi... 3.60E—15 nt ew11a10 Human DNA.Sequence from clone 410I8 8.20E—87 nt ew11c01 Homo sapiens clone DJ0701019, WORKIN. 6.60E—12 nt ewllell Homo sapiens chromosome 19 clone CIT. 1.60E-43 nt ewlle11.EST10602 Adipose tissue, white I Homu 3.50E—33 est ew12d05 Homo sapiens PAC clone DJ0685A02 f... 7.50E—87 nt ew12f05 t087a02.x1 NCI_CGAP_Gas4 Homo sapien. 8.30E-50 est ew12f11.Homo sapiens chromosome 17, clone 3.. 1.50E—52 nt ew12g98IHS_2026_B1_H11_T7 CIT Approved Human. 5.60E—16 nt ew12g08«qe62b04.y5 Soares_fetal_1ung_NbHL19WZ 2.90E—14 est ew12h03jyt02h03.S1 Soares retina N2b5HR Homo. 4.10E—12 est ew14b04 zq88e11.r1 Stratagene hNT neuron (#9. 1.20E—20 est ewl4e03 Homo sapiens BAC clone RG317H01 from” 1.60E-59 nt ew14e07 CIT-HSP-2387El3.TR.1 CIT—HSP Homo s.. 2.00E—15 nt 157 Table 12 (cont’d). 
Clone Gene Description Score ew14f08 Homo sapiens chromosome 5 clone CIT—. 8.00E-34 nt ew14g06 HS_2198_Bl_G04_MR CIT Approved Human. 1.90E—42 nt ew14h11.Human Hs—cu1—3 mRNA, partial cdS. 1.00E-31 nt ew14h11 zi08c02.r1 Soares_fetal_liver_spleenq 7.30E—30 est ew15a0113enomic sequence from Human 17, comp. 4.60E—52 nt ew15a09 Human DNA sequence from.BAC 15El o... 5.60E—30 nt ew15b11gyr05h06.r1 Soares fetal liver spleen. 2.50E—22 est ew15b11 Human acyl—CoA dehydrogenase mRNA,... 1.80E—21 nt ew15c11.Homo sapiens prenylcysteine carboxyl. 7.5E—102 nt ew15c11gyx13e08.rl Soares melanocyte 2NbHM H. 1.20E—83 est ew15d05 HS_5121_A1_G06_SP6E RPCI—ll Human Ma. 3.00E-36 nt ew15d082H.sapienS mRNA sequence (16p11.2) 7.00E-31 nt ew15d08an001d09.S1 NCI_CGAP_Phel Homo sapien. 1.00E-29 est ew15e08 tS54f02.x1 NCI_CGAP_Kid8 Homo sapien. 3.80E—10 est ew15g92 cSRL—30g10—u cSRL flow sorted Chromo. 5.30E—14 nt 158 Table 13: Sequences in multiple CASTing Libraries cq ev ew cq02h06 ev01f06, ev01f12, ev02b03, ev05903, ev05h08, ev09e03, ev09e09, ev15a07 cq09d09 ev02c01, ev03d11, ev03h11, ev14e11 cq02h09 ev09c12, ev10h01 ev01d02, ev09d12, ew02b01, ew12f11 ev03c09, ev03h01, ev14b07 ev09f02 ew02c10, ew02h04, ew07c12, ew09f04 ev08a04, ev08a11, ew02g08, ew05g08 ev03f12, ev14f05 ev07b06 ew02h05, ew05b10, ew05c11, ew09b03 ev04f05, ev08g08, ew03c12, ew12h09 ev10a05, evl4b09 Table 13: Sequences in multiple CASTing Libraries. Each row lists the clones from the cq, ev, and ew CASTing libraries having Similar sequence aS defined in the text. 159 Table 14: Putative targets of Pax3 identified by CASTing. Clone Gene Clone2 .mEAT,:murine homologue of the Drosophila FAI’gene Clone4 cAMP—GEF, cAMP—guanine nucleotide exchange factor a01a04 Itm2A, integral membrane protein 2A a01e04 Nap2, nucleosome assembly protein 2 (p57KIP2?) 
a02a05 LAP, leucine aminopeptidase (promoter)
a02a08 nAchr-β3, acetylcholine receptor β3 (promoter)
a01e09 Celsr1, Cadherin EGF LAG seven-pass G-type receptor
a02f07 type II membrane protein (promoter)
a02h01 Munc13-3
a03a05 nucleoporin p54 (promoter)
a03a06 ACACT, acyl-coA: cholesterol acyltransferase
a03e01 OL-protocadherin
a03e12 PHAS-I, insulin-stimulated eIF-4E binding protein
cq03f04 ENPEP, type II integral membrane glycoprotein
cq03g09 engrailed-2
cq05a07 BVES, blood vessel/epicardial substance
cq06f11 YAF2, YY1-associated factor 2
cq07c09 LR3, low-density lipoprotein receptor
cq09c03 myomegalin (promoter)
cq09f04 desmoglein
cq10f09 ankyrin 1
cq31e03 LIG1, integral membrane glycoprotein
cq31g09 ICAM5, intercellular adhesion molecule 5
ev02h03 ANUT, heart/skeletal muscle ATP/ADP translocator
ev03c02 GAP-43, induced upon P19 differentiation to nerve
ev03e02 olfactomedin, extracellular matrix protein in CNS
ev07e05 LAMA2, laminin alpha 2
ev08a12 DAG1, dystroglycan, laminin binding component
ev10h10 CD30L, CD30 ligand
ew02d06 VEGFR, vascular endothelial growth factor receptor
ew03h11 cellubrevin, synaptobrevin-3
ew04h04 TGFα, transforming growth factor alpha (promoter)
ew05e07 ASK, activator of S-phase kinase
ew14h11 Hs-cul-3, cullin 3

APPENDIX B

Figures

             5' Intron      Branchpoint    3' Intron
             Splice Site    Sequence       Splice Site
Consensus    AGgtRagt       Ynctgac        YYYYYYYYYYYNcagG
Mouse Pax3   AAgtaagc       ccctaac        YGYGYYYYYYYacagG
Human PAX3   AAgtaagc       ccctgac        YGYGYYYYYYYacagG
Quail Pax3   AAgtagta       ?              YYYYYYYYAYchagG

Figure 1: Comparison of intron 8 to consensus motifs. Mouse, human, and quail Pax3 intron 8 sequences were aligned with intron consensus sequences. A (|) symbol designates a nucleotide consistent with an intron consensus motif. A branchpoint sequence in quail Pax3 intron 8 was not identified.
162 260—I—20 1 \I \1 V\ \ A 7\ A, 230—A—22 331—P—20 81—L—8 1 y x \ l\ K , Exons U1 PAX3 IIII I I I'll 1234 6 7 8 Figure 2: RAX3 BAC contig. Four BACS were isolated that Span the human PAX3 genomic locus. An "X" denotes the location of a set of PCR primers used to determine the exons contained within the four BAC clones. BACS 230—A—22 and 331—P—20 Span exons 1—4, but not 5—10. BAC 260—1—20 spans exons 1-5, but not 6—10. BAC 81—L—8 spans exons 5—10, but not 1—4. 163 agGTAATGGG ACTGATTACG GGTGTCGGCC GTCTGCCAAC TACAGTATGG taagccttgg caactcttcc ccggtatcaa cttccttgtg attaattgtt ttggccaaaa ggtcagctcc tcaaacttgt ggtattaaat gaattgtccc CTTTCATTAT ctaaaactgg ctctgaaaac gtcccaatag GGTTTCAAAT GACCTGGAGC ATGGTTACAT CTAGAACATT GTGTATTGAG TATAATAGAA TCTACAACTT TGACTCTTAC GGAAGCTAGA TATCCTCATT TAACAGGATA GGGGAGAGGT CTGGGTTGGA TCTGCTTATC CCCAAGGCGA TTTTCACAGC TGCTGGACAA ATAATTATGC AAGGCAGGGC TTATACAATG ACGTTTGTGT GTTGAGAGGA CTTGGAACCA AACTATGACG CATGCATTGG TTGCAGCCTC AAGTGCAGTA TAAACAAAAA ACTCCTGACC CGCTCTCCCC AGCTGCAGTC ATCTCAGTCC ACCCTGTCAC actttttagg ttaagaaagg tttttttttt taattatttt gagacgtgaa tgaaataatc aggatcatat ctcaggaata atgacattgt agcatgacct CTCAAGCCAG ccctgtttct aaaaaaaaat gagacaaagg CCTTTTGAAC AATAAAAGAC ATCAAAACAT CCATTTGCTT ATTTACCCAG TACCCTAAAG TAAAACTGCT ACACTGGAAA CTAGCTGTAA TGTAATCTAG TGGGCTACCA GTGTGACGTT GKAAGGGCAC AGTGTGCAAT TATAAGCAAA CCATAATTCA CCAACACCCG .TTTTAAACTG AAGAACAGAA ATATCCAATA GGTGACATTT ACAAGAAGGT CGAATGAGGT TAGCGAGTTA TCTTAGAGGG AACCAACAAC AATATTTACT AAACAC AACCACGGTG TCTCACCGGG AGAGACTAGA TACTGTCCAC AGGCTACCAA gggcaatttc tgaattagag tgcaaagcca cttaactgat acctgattgc cctgacatta gggggataat aaaatattag cagcctgtag aaaaagctgc ATATCGCgta ggtcttcgca tacccttttg agagtgattg ACGTTCGACA AAATGCAACA CCCAATTCTT GTGTGCGTGC ACTGGTTTGG TATAACATGT CATTTCATGA TTGAAATAAA AGGACCTCCT TTGTCAATAA GGAAGAAGGA TTTCCAGTTC GGTGGAGAGA ACTGTGTACC ATCAGAAGTA TAGTGATAGA GGTTGCTCCT ATTACGTTTA GTGTTCACAT TTAAAAGATT TAGAATGTCA GGAAGGCAGT AGGCACAAAT GTGAGACTAG TGATGGAACA AATGTGTTGT 
ATTTATAATG Figure 3: RAXB exons 8, 9, 3—249), 9 (nt 747-777)! and 10 164 GGGTACCTCA GGTCTGGAAC CCATATGAAG CCACCTATAG TATGGGCAGT tcctggaagg gcaagattaa gctgactgtt gtcaacaaca cactaggtaa gaaacacatg cccagggaca tctcaagcct ctgatcttgc gtgtgtttcc agtgaactgt gcctagatat ttgggggggg attttcttcc AAAGCAGTGG TTTTAAGGCA TAGTGCTGAT GTGCGTGTGT GCAAGCAGAC CGATGTTCGA AGCAAGAATG GATAAATGCA AAATAAATTA ATGGCCCGGA CTTGTTTCCC ACATTTATTT AAGTGACATG TCATGGATGC TATTTTTGCA GTGCAGTGAT TTGATTAGAG CATTCTTCTC GGCTCTCATG ATAGGAACTG ATAAATTTGC GTTATTTATT ACATTCCTTT ACATAGTACC AGAGAAGTAC AAAAATGTCT AATAAACAGT and 10. PAX3 exons 8 TCAGCCCCAG CTACCACCAC AGCTTGGACA CACCACAGGC ATGGACAAAg gagataaact gccacacatg ccagcagggg tcttgcggtt aacacaaggg ttcttaatga caaagttgtg ttgatagcac ccctgactgt ttacagGTGC ccacttggag gaagaatctg tggggcagtg tccaatagTT AGAAGAGGAA 1000 ATGGTTTCAC CATCGAGGAG 1100 GTGTGTGTGT TGTTCTAAAA 1200 TGCTGATACT GAGGAATCCT 1300 ACCATTTTTA TTTGTTACTG 1400 CATGTCTTGC TTTCCTGTGT 1500 GGTAGCCCAC CATTCACATA 1600 TTTGGCAATG GGTGCTTTTA 1700 GTTTGCCATT ACCAGATCAG 1800 TTTTCTATGA TACTTTTAAA 1900 AGGTTTACAT TTTTTGAGAT 2000 TCTTTCTATT TAGAGTTAAG 2100 GAATTTAGTT TATCTTTGTG 2200 TTCATGTAAA TATGAACATC 2300 100 200 300 400 500 600 700 800 900 (nt (948—2316) are highlighted. GIGAGTACTG GCAACAAAGA CTCCAAAATC GAATTCTGCT GTTATTAATT GGTTGTCCAA TTCTTAATGA GTAGAGTTTT TTTGATATAT CCCCATGCTG TTTTTTTTCT Figure 4: Sequence of Quail Pax3 intron 8. CAACGCCATA AAATTACAGG GAGTGGAGTG GCATAATTAT GTTGAGACAT ATCGAAATAA GGTCAGCTCA GTCAAACTTG CAATATTAAA CGAGGTTGTC ATTGCAQ intron 8 is 517 bp. Sites are underlined. 
GTCATTGATC AATAGATCCT CACTGGTGAG TTTCTTAAGT GAAACCTGAT CCACTGGCAT AGAATCATTA TCCCAGGAAT TATGACAATG ACAGCGTGAC Intron donor 165 AATTGTATGT AATGGCTGAC ATAGCTGTAG GATGTCAACA TGCCATTAGG TCCATTATTA CAGTATATAA AAAAATATTA TCAACCTGTA TTAAAAAGCT (GT) and acceptor (AG) GTTTTTGTCT 50 TGCTACTACT 100 GTTGGAGCAA 150 ACATCTTGAT 200 AAAACATAAG 250 GAAACACATG 300 TCCCAGATAC 350 GTCTTTTTCC 400 GTTGGCATGG 450 TCTTTTGTTG 500 Quail Pax3 Pax3cQ+ Pax3dQ+ 250-—- Figure 5: Pax3 protein synthesized in vitro. Pax3 protein was synthesized in vitro in the presence of 35S— methionine using a rabbit reticulocyte lysate transcription and translation coupled system (Promega). Products were electrophoresed on a 4—20% Tris-glycine-SDS gel. Molecular weight markers (left) were included to determine the approximate Size of the Pax3 protein. 166 + + + <3 <3 s s «:2 22 m m m m N (U a a: a: 3.5 ‘L “L n. a a a g 2 Anti-Pax3 lgG- - + - + _ + weH- *‘. .‘E ”5 supershift- Pax3 shift- non-speci fi c- Figure 6: EMSAs with Pax3c and Pax3d. Mouse Pax3c and Pax3d were incubated with radiolabelled e5 oligonucleotide. Lanes 5 Pax3 antibodies were added to lanes 2, 4, and 6. and 6 are negative Pax3 protein controls. 167 OOl—‘l—‘NNWWIbIb OWOUlOU‘IOUlOUl Relative transactivation Pax3cQ+ Pax3dQ+ 0 H .50 UH (U-LJ O E w 0 Z U Expression construct Figure 7: Pax3c and Pax3d activation of cMET. A cMET' luciferase reporter construct was co— transfected with Pax3cQ+ or Pax3dQ+ expression vectors or an empty vector. Luciferase activity was normalized to B—gal activity. 168 |'_I o H 4.) n: n: n: a: 5 =1 E 5: s z z z z o W m In m o Q Q Q 0 0 E9 0 o o o o w E" ‘1' ‘3‘ ‘3‘ ‘3‘ ‘1‘ fl“ ‘1‘ 5 o 0 U U ' m m m m .5: .5: :9. :2 4.: é’é :3 fi :1 .. a v: :25 m (5 rd 0) Ch 04 0.4 m D.) m m m z D ’5» 50 uni- at” “slimline 36 — 30 w w fig :43: .1» m ' ’ was". Figure 8: Western blot analysis with antibody PB33. Proteins were synthesized in vitro from eight expression constructs and one negative control. 
Western blot analysis was performed with antibody PB33, a Pax3d/PAX3d-specific antibody. A strong band of approximately 56 kDa was observed in lanes 3, 4, 7, and 8. The predicted size of Pax3 is 56 kDa.

[Blot image; molecular weight markers: 250, 98, 64, 50, 36, 30]
Figure 9: Western blot analysis with antibody PB35. Western blot analysis was performed with antibody PB35, an antibody predicted to interact with Pax3c, Pax3d, and PAX3/FKHR. A 56 kDa band was observed in lanes 2-6 and a 98 kDa band was observed in lane 7. The predicted sizes of Pax3 and PAX3/FKHR are 56 kDa and 98 kDa, respectively.

[Bar chart; y-axis: 0 to 2500; series: Pax3, No Pax3; x-axis: e5 oligo, Ie oligo]
Figure 10: Immunoprecipitation of Pax3-DNA complexes. Radiolabelled e5 and Ie oligonucleotides were incubated with Pax3 protein or with a negative Pax3 control. Complexes were immunoprecipitated with Pax3 antibodies. The amount of radioactive oligonucleotide recovered from co-immunoprecipitations of reactions containing Pax3 protein was measured and compared to similar reactions lacking Pax3.

[Bar chart; y-axis: 0 to 12000; series: Pax3, No Pax3; x-axis: 1, 2, 3, 4, 5, 6, 7, 8, N1, N2]
Figure 11: Immunoprecipitation of Pax3 CASTing clones. Eight inserts from clones in the Pax3 (cq) CASTing library were amplified, radiolabelled, and incubated with Pax3 or with a negative Pax3 control. Two negative DNA controls, N1 and N2, were included. Pax3/DNA complexes were immunoprecipitated with Pax3 antibodies, and the recovered radiolabelled DNA was measured in a scintillation counter.

[Diagram; clone types: Type 1, Type 2, Type 3]
Figure 12: mFAT clones in CASTing library cq. Twelve mFAT clones were identified in CASTing library cq and sequenced. Three types of clones were found; each contained a central 444 bp fragment from the mFAT locus.
Types 2 and 3 contained additional sequences adjacent to the 444 bp fragment. The sequence represented in the type 2 clones is contiguous in the mouse genome. Type 3 clones are chimeras, created by ligating a Sau3AI fragment from the mFAT locus to another Sau3AI fragment from elsewhere in the genome.

[Gel image; probes a01a04, a01e04, a01e09, and a02a08, lanes 1-4 for each probe; Pax3 antibody: - + - +; Pax3 protein: - - + +; bands labelled super-shift, shift, and probe]
Figure 13: EMSAs with 4 CASTing clones. The inserts from four CASTing clones in the cq library were amplified, radiolabelled, and incubated with Pax3 protein followed by Pax3 antibody. Complexes were resolved on a native polyacrylamide gel. Presence or absence of Pax3 protein and Pax3 antibodies in each lane is indicated by a "+" or "-", respectively. A Pax3-dependent shift of the PCR product was observed in lane 3 for each probe, and a super-shift was observed in lane 4 for each probe.

[Gel image; bands labelled super-shift, shift, and free probe]
Figure 14: Pax3 binding site in mFAT. An oligonucleotide was designed from sequence in CASTing library Clone-2 (mFAT). The ds oligonucleotide was end-labelled and incubated with Pax3 protein and Pax3 antibodies. Lane 3 shows the oligonucleotide shifted with Pax3 protein; lane 4 shows the oligonucleotide super-shifted with Pax3 protein and Pax3 antibodies.

Consensus 1: Epstein (1998)      CGTCACGGTT
Consensus 2: Epstein (1996)      GTCACGNTT
Consensus 3: Chalepakis (1995)   TCGTCACGC
mFAT     AACGTCCGTCACGGAAGAA TTAATCTTA
Celsr1   ACAGTTQGfiCflCATTAGTTCCAC
TGFα     CAGATTCGTCACAQAGACCCA TTI‘TTTA

Figure 15: Pax3 consensus binding sequence. The Pax3 binding sites identified in the mFAT, Celsr1, and TGFα CASTing clones were aligned with previously reported Pax3 paired domain binding site consensus sequences (Chalepakis and Gruss, 1995; Epstein et al., 1996; Epstein et al., 1998).
Nucleotides that are consistent with reported Pax3 paired domain binding site consensus sequences are shown in bold and underlined. The putative Pax3 binding sites also include sequences similar to the reported homeodomain recognition sequence ATTA (italics) (Goulding et al., 1991).

[Bar chart; y-axis: Relative Light Units; x-axis: TGFα, VEGFR, BVES, Itm2A]
Figure 16: Transactivation of reporter constructs. Luciferase reporter constructs for TGFα, VEGFR, BVES, and Itm2A were co-transfected with a PAX3/FKHR expression construct or an empty vector. The chart shows the ratios of luciferase activity measured in co-transfections with PAX3/FKHR to co-transfections with an empty vector. Luciferase activity was normalized to β-gal activity. Significant transactivation by PAX3/FKHR was observed for the VEGFR and BVES constructs, but not for the TGFα and Itm2A constructs.

[Diagram; primers OP1, OP2, mut, and wt]
Figure 17: Strategy used to genotype Splotch embryos. A. Four primers (OP1, OP2, mut, and wt) are used to amplify DNA isolated from embryonic yolk sacs. The mut and wt primers are specific to the Splotch and wild-type alleles, respectively. The OP1 and OP2 primers hybridize to both normal and mutant alleles. B. The PCR products amplified from the three possible genotypes and the corresponding sizes of the products are shown.

[Gel image; bands labelled control, Spd allele, and wt allele]
Figure 18: Splotch-delayed embryo genotypes. Bi-directional PCR Amplification of Specific Alleles was used to genotype DNA recovered from mouse embryos. Samples shown in lanes 2, 5, 10, and 11 are +/+. Samples shown in lanes 1, 4, 6, 9, 13, and 14 are Spd/+. Samples shown in lanes 3, 7, 8, and 12 are Spd/Spd.

[Blot image; molecular weight markers: 250, 98, 64, 50, 36, 30]
Figure 19: Western blot of transient transfections. A PAX3/FKHR expression vector (lane 2) and an empty vector control (lane 1) were transiently transfected into NIH/3T3 cells.
Proteins were recovered and western blotted with antibody PB35, which recognizes the PAX3/FKHR protein. The predicted size of PAX3/FKHR is 98 kDa (lane 2).

[Blot image; lanes: clones 4, 13, 19, 27, 54, 38; row label: PAX3/FKHR mRNA expression units; molecular weight markers: 98, 64, 50, 36]
Figure 20: Western blot of stable cell lines. Western blot analysis was performed using antibody PB35 on proteins recovered from six cell lines. The six clones were previously shown to express variable levels of PAX3/FKHR mRNA as determined by Ribonuclease Protection Assay (RPA). No PAX3/FKHR protein was observed in clone 13 (empty vector control) or clone 38. Clone 4 appeared to express more PAX3/FKHR protein than clone 27, which contradicts results from the RPA.

[Blot image; lanes include: Sp/Sp 12.5 day embryos, +/+ 15.5 day embryo, Spd/Spd 15.5 day embryo, NIH/3T3 + PAX3/FKHR, NIH/3T3 + empty vector]
Figure 21: Itm2A is induced by PAX3/FKHR. Northern blot hybridizations with a radiolabelled Itm2A probe were performed on total RNA from mouse embryos and on RNA from NIH/3T3 cells transiently transfected with PAX3/FKHR or an empty vector. Equal loading of RNA was verified with a β-actin probe (data not shown).

[Blot image; time points (days p.c.): 9.5, 10.5, 11.5, 12.5, 13, 13.5, 15.5, 16.5, 17.5, with + and - lanes at each; size markers in kb]
Figure 22: Itm2A expression in mouse embryos. Northern blot hybridizations were performed on total RNA from wild-type and Pax3 homozygous mutant mouse embryos at various developmental time points using a radiolabelled Itm2A probe. Wild-type (+) and homozygous mutant (-) RNA samples were loaded for 9 time points between 9.5 and 17.5 days p.c.

[Blot image; lanes include: Spd/Spd 15.5 day embryo, +/+ 15.5 day embryo]
Figure 23: BVES expression correlates with Pax3. Northern blot hybridizations were performed on total RNA from wild-type and Pax3 homozygous mutant mouse embryos using a radiolabelled BVES probe. No differences in BVES expression were observed at 12.5 days p.c., but at 15.5 days p.c.
expression was greater in +/+ embryos than in Spd/Spd embryos.

[Blot image; lanes: clones 13, 38, 27, 4, 54, 19; band: VEGFR]
Figure 24: VEGFR expression correlates with PAX3/FKHR. Northern blot hybridizations were performed on total RNA from cell lines expressing varying levels of PAX3/FKHR (Figure 20). Lanes were loaded from left to right to correspond to increasing levels of PAX3/FKHR as determined by western blot analysis. Clone 13 (far left) does not express PAX3/FKHR; clone 19 (far right) expresses the highest amount of PAX3/FKHR.

APPENDIX C

Materials and Methods

PCR amplifications

All polymerase chain reactions were performed in a buffer originally designed to amplify minisatellites and other problematic templates (Jeffreys et al., 1990). Each reaction contained a final concentration of 45 mM Tris-HCl (pH 8.8), 11 mM (NH4)2SO4, 4.5 mM MgCl2, 6.7 mM β-mercaptoethanol, 4.5 uM EDTA, 1 mM of each dNTP (dATP, dTTP, dCTP, and dGTP) (Pharmacia), 110 ug/ml bovine serum albumin (Pharmacia), 1 uM each primer, and 0.1 U/ul thermostable DNA polymerase. Primers were typically designed to be 21-24 nucleotides in length with 50-60% GC content, a G or C at the 3' base, and little or no secondary structure or primer dimer hybridization as determined by Oligo v.4.0-S (National Biosciences, Inc.). The annealing temperature for all PCRs was 65°C, unless stated otherwise. Primer elongation was carried out at 72°C for a duration that depended on the length of the PCR product (approximately 1 min/1000 bp). For amplification of long products (greater than 2 kb), 0.01 U/ul Pfu DNA polymerase (Stratagene) was also included in each reaction.

Sequencing

Three types of DNA sequencing protocols were used. The first was radioactive sequencing using the Thermo Sequenase Radiolabeled Terminator Cycle Sequencing Kit (USB). This sequencing format was used for analysis of expression constructs, direct sequencing of BAC DNA, and mutation detection in WS1 individuals.
The amount of input DNA varied with the source: approximately 100-500 ng for PCR products, 350-1000 ng for small plasmids, and 1-2 ug for BAC DNA. Reactions were performed according to the manufacturer's protocol and electrophoresed on 6 or 8% polyacrylamide gels in 1X glycerol tolerant buffer. Following electrophoresis, gels were dried and exposed to film.

The second type of sequencing, used for PCR products of clones in the CASTing libraries, was the ABI Prism® BigDye™ Terminator Cycle Sequencing Ready Reaction Kit (PE Applied Biosystems). Briefly, 2-5 ul of purified PCR products were sequenced with an internal primer using the recommended protocols. The products were electrophoresed using ABI 377 machines. DNA sequence was recovered using Sequencing Analysis version 3.3 or 3.4.1 (PE Applied Biosystems). Vector and linker sequences were trimmed manually using DNASIS version 2.0.

The National Institutes of Health Sequencing Center (NISC) did the majority of the sequencing on the CASTing clones. Overnight cultures were grown in TB media, and plasmid DNA was recovered using an alkaline lysis system. Approximately 2-4 ul (100-1000 ng) of plasmid DNA was used as template for sequencing with the ABI Prism® BigDye™ Primer Cycle Sequencing Ready Reaction Kit (PE Applied Biosystems).

Cell culture

A murine teratocarcinoma cell line (P19, ATCC CRL 1825) was used to study Pax3 expression and gene regulation. P19 cells were grown in α-MEM with 2.5% fetal bovine serum and 7.5% calf serum at 37°C and 5% CO2. Cells were split 1:5 every 3 days, and aliquots were frozen in 5% DMSO after 3 generations. Cell lines transfected with a PAX3/FKHR expression vector and stably expressing varying levels of PAX3/FKHR protein were provided by Frederic Barr (University of Pennsylvania). These cell lines were propagated in DMEM with 4500 mg/L D-glucose, 10% FBS, 50 mg/L penicillin, 50 mg/L streptomycin, and 500 ug/ml G418.
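As an illustration only, the cycling rules stated under PCR amplifications (65°C annealing unless stated otherwise, elongation at 72°C for roughly 1 min per 1000 bp, and Pfu polymerase added for products greater than 2 kb) can be written as a small Python helper. The function names and the rounding up to whole seconds are assumptions made for this sketch and are not part of the original protocol.

```python
def extension_seconds(product_bp, seconds_per_kb=60):
    """Elongation time at 72 C using the ~1 min per 1000 bp rule of thumb."""
    if product_bp <= 0:
        raise ValueError("product length must be a positive number of bp")
    # Integer ceiling division: round up to the next whole second.
    return -(-product_bp * seconds_per_kb // 1000)


def needs_pfu(product_bp, long_product_bp=2000):
    """Pfu polymerase was included only for products greater than 2 kb."""
    return product_bp > long_product_bp


def pcr_plan(product_bp, annealing_c=65):
    """Collect the cycling parameters for one expected product length."""
    return {
        "annealing_c": annealing_c,   # 65 C unless stated otherwise
        "elongation_c": 72,
        "elongation_s": extension_seconds(product_bp),
        "add_pfu": needs_pfu(product_bp),
    }


if __name__ == "__main__":
    # Hypothetical product lengths, chosen only to show both branches.
    for length in (500, 2500):
        print(length, pcr_plan(length))
```

A 2500 bp product, for example, would be given 150 seconds of elongation with Pfu included, while a 500 bp product would be given 30 seconds without it.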
Transfections

Initially, transfections were performed using the fo system (Promega) and subsequently with lipofectamine™ reagent (Life Technologies). For a transfection in a standard 6-well (35 mm) plate, the following was performed. Approximately 1 ug of plasmid DNA was incubated with 5 ul lipofectamine reagent in 200 ul 1X PBS for 30 minutes. Cells were washed twice with serum-free media. For each well, 2 ml serum-free medium was added to the DNA/lipofectamine complexes, overlaid on the cells, and incubated at 37°C and 5% CO2. The transfection media was removed 8-12 hours later and replaced with complete media.

BAC screening

BAC DNA pools were purchased from Research Genetics. PCR primers for PAX3 exons 1-2, 4, 5, and 8-9 were used to amplify the BAC Superpools. PCR products were electrophoresed on 1.5-2% agarose gels and visualized by ethidium bromide staining. Individual BACs were identified by amplifying Plate Pool Plates and Row/Column Plates for each of the positive signals detected in the Superpool plate amplifications. Individual BAC clones were ordered from Research Genetics and shipped as bacterial stabs. Each stab was streaked onto LB agar plates (12.5 ug/ml chloramphenicol) and grown overnight at 37°C.

BAC maxiprep

An overnight culture of 400 ml LB (12.5 ug/ml chloramphenicol) was inoculated with cells from a single colony using a sterile loop and grown at 37°C with shaking for approximately 15 hours. When the culture reached an OD600 > 1.0, the cells were chilled briefly on ice, then pelleted at 4000 x g for 10 minutes at 4°C. The media was decanted and the pellet frozen at -20°C overnight. The frozen pellet was thawed at room temperature for about 15 minutes, then resuspended in 10 ml of Buffer P1 (Qiagen). The plasmid DNA was purified using the QIAGEN Plasmid Maxi Kit according to the manufacturer's instructions, except for the following.
In order to elute the high molecular weight BAC DNA, elution buffer was heated to 80°C and added to the column in three 5 ml increments, so as to minimize cooling of the column during elution. Eluted DNA was precipitated as recommended by the manufacturer, resuspended in 1 ml water, and incubated at 45°C for 10 minutes in a vacuum centrifuge to remove residual ethanol. The concentration of DNA was determined using a spectrophotometer and varied from 25-60 ug/ml. The A260/A280 ratios were between 1.6 and 1.8, suggesting that the purified DNA was free from protein and RNA contamination. Approximately 1 ug of DNA was digested with BamHI and electrophoresed on a 0.8% agarose gel. Several distinct, sharp bands were apparent, and very little genomic DNA contamination was observed, suggesting that the BAC DNA was isolated intact and with little E. coli genomic DNA contamination.

Electrophoretic mobility shift assays

Double-stranded DNA probes were end-labelled with 32P-ATP using the RTS T4 Kinase labeling system (Life Technologies) and purified using the QIAquick nucleotide removal kit (Qiagen). The counts recovered were determined in a scintillation counter; approximately 10,000 cpm were used for each EMSA reaction. The reaction buffer used for all EMSAs contained 25 mM HEPES pH 7.9, 50 mM KCl, 1 mM DTT, 10% glycerol, 50 ng/ul dIdC (Pharmacia), and 50 ng/ul BSA (Pharmacia). Pax3 protein (2 ul; Materials and Methods, In vitro Pax3 protein synthesis) was combined with 10 ul 2X reaction buffer and 6 ul water and mixed by gentle pipetting. The radiolabelled probe was then added and mixed, and reactions were incubated for 30 minutes at room temperature. After 30 minutes, 5 ul of anti-Pax3 IgG was added and mixed by pipetting, and reactions were allowed to proceed for an additional 30 minutes at room temperature. Immediately prior to loading, 2 ul 0.1% bromophenol blue was added to each reaction.
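The binding reaction assembled above can be checked with a short calculation. The Python sketch below assumes a hypothetical probe stock of 5,000 cpm/ul (the text specifies only that approximately 10,000 cpm were used per reaction) and computes the probe volume and the resulting strength of the 2X reaction buffer before antibody and dye are added. The function and key names are illustrative.

```python
def emsa_reaction(stock_cpm_per_ul, target_cpm=10_000):
    """Volumes (ul) for one EMSA binding reaction, following the text above."""
    if stock_cpm_per_ul <= 0:
        raise ValueError("probe stock activity must be positive")
    protein_ul = 2.0       # in vitro synthesized Pax3 protein
    buffer_2x_ul = 10.0    # 2X reaction buffer
    water_ul = 6.0
    # Probe volume needed to deliver the target counts.
    probe_ul = target_cpm / stock_cpm_per_ul
    binding_ul = protein_ul + buffer_2x_ul + water_ul + probe_ul
    antibody_ul = 5.0      # anti-Pax3 IgG, added after the first 30 minutes
    dye_ul = 2.0           # 0.1% bromophenol blue, added before loading
    return {
        "probe_ul": probe_ul,
        "binding_ul": binding_ul,
        # Effective strength of the reaction buffer in the binding step.
        "buffer_strength_x": 2.0 * buffer_2x_ul / binding_ul,
        "final_ul": binding_ul + antibody_ul + dye_ul,
    }


if __name__ == "__main__":
    # 5,000 cpm/ul is an assumed stock activity, not a value from the text.
    print(emsa_reaction(stock_cpm_per_ul=5_000))
```

Under that assumption the probe volume is 2 ul, which brings the binding step to 20 ul and the reaction buffer to 1X; a more dilute probe stock would lower the buffer strength accordingly.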
Approximately 10 ul of each reaction was loaded per well on a 6% or 8%, 40:1 or 80:1 (acrylamide:bis-acrylamide) non-denaturing gel and electrophoresed in 1X TBE. Gels were electrophoresed at 4°C for approximately 3 hours until the bromophenol blue dye migrated approximately 2/3 of the way through the 15 cm gel. Gels were dried under vacuum at 80°C for 1-2 hours and exposed to autoradiograph film.

Luciferase assays

Luciferase reporter constructs were prepared by cloning the CASTing products from four genes [Itm2A (a01a04), VEGFR (ew02d06), TGFα (ew04h04), and BVES (cq05a07)] into the pGL3-promoter vector (Promega). The luciferase reporter constructs were co-transfected into NIH/3T3 cells with a β-galactosidase expression vector and either a PAX3/FKHR expression vector or an empty vector. Cells were plated in 6-well dishes (35 mm per well) and grown overnight so that the cells were approximately 75% confluent at the start of the transfection procedure. Cells were transfected as described above (Materials and Methods, Transfections) using the lipofectamine™ reagent system (Life Technologies). For each well, 1 ug of luciferase reporter, 1 ug of expression vector, and 500 ng of β-galactosidase plasmid DNA were transfected. Cells were lysed after 48 hours with 1X Reporter lysis buffer (Luciferase assay system, Promega). β-galactosidase activity was measured using 100 ul of the cell lysate and the β-galactosidase enzyme assay system (Promega). Luciferase activity was determined using 20 ul of cell lysate and the luciferase assay system (Promega). Luciferase activity was normalized to β-galactosidase activity. Values were displayed as the ratio of luciferase activity when co-transfected with the PAX3/FKHR expression vector to the luciferase activity when co-transfected with an empty vector.

Western blot analyses

Proteins were combined with 2X Tris-Glycine-SDS sample buffer (Novex) and heated at 95°C for 5 minutes.
Samples were loaded on a 4-20% gradient Tris-Glycine-SDS gel and electrophoresed at room temperature at 120 volts for 2 hours. After electrophoresis, proteins were transferred to 0.45 or 0.2 micron nitrocellulose membranes in 1X Tris-Glycine buffer pH 8.3 overnight at 4°C and 30 volts. Membranes were blocked with BLOTTO (50 mM Tris pH 8.0, 2 mM CaCl2, 80 mM NaCl, 0.2% NP-40, and 5% w/v dry milk) for a minimum of 1 hour with at least one change of BLOTTO. All antibody incubations and washes were performed at room temperature. Primary antibodies were diluted 1:500 in BLOTTO and incubated for 1 hour with constant rocking. Blots were washed two times for 30 minutes each in BLOTTO. The secondary antibody (goat anti-rabbit IgG coupled to horseradish peroxidase, Pierce catalog number 31460) was diluted 1:20,000 in BLOTTO and incubated with the membranes for 1 hour, followed by two 30 minute washes in BLOTTO and two 15 minute washes in 1X Buffer A (50 mM Tris pH 8.0, 2 mM CaCl2, and 80 mM NaCl). The membranes were then incubated for 5 minutes with equal volumes of SuperSignal™ West Pico Stable Peroxide Solution and SuperSignal™ West Pico Luminol/Enhancer Solution (Pierce), sealed immediately in heat sealable pouches, and exposed to film for various intervals between 5 seconds and 10 minutes. SeeBlue™ Pre-Stained standards (Novex) were electrophoresed next to the samples to determine the approximate size of the proteins detected by the Pax3 antibodies.

Northern blot hybridizations

Approximately 5 µg of total RNA was combined with an equal volume of 2X denaturing sample buffer (2X TBE pH 8.3, 13% ficoll w/v, 0.01% bromophenol blue, and 7 M urea). RNA samples were heated to 65°C for 5 minutes and chilled briefly on ice prior to loading. Samples were electrophoresed for 2-4 hours on a 1.0-1.5% native agarose gel in 1X TBE containing 200 µg/L EthBr.
Following electrophoresis, RNA was transferred overnight by capillary action to a 0.45 micron nitrocellulose membrane. Locations of molecular weight standards (New England Biolabs, catalog number 362) were visualized and marked on the membrane by placing the membrane upside down on a UV transilluminator. Probes were labelled with 32P-dCTP using the Rediprime™ II random prime labelling system (Amersham Pharmacia Biotech) and purified using the QIAquick nucleotide removal kit (Qiagen). The counts recovered were determined in a scintillation counter; approximately 1-2 x 10^6 cpm/ml were used for each hybridization. Blots were wet briefly in RNase-free water, then placed in cylindrical tubes with 5-10 ml ULTRAhyb hybridization buffer (Ambion). Blots were pre-hybridized with constant rotation at 42°C for 30-60 minutes. Probes were then added directly to the pre-hybridization buffer and incubated for 8-12 hours at 42°C. Following hybridization, blots were rinsed briefly with wash solution (0.1% SDS, 0.1X SSC), then washed in wash solution for 1 hour at 60°C with constant swirling. Blots were sealed in heat sealable pouches and exposed to autoradiograph film at -80°C with one MS screen (Kodak). Signals were also quantified by exposing the blots to a PhosphorImager screen and measuring signals with ImageQuant software version 3.3 (Molecular Dynamics).

In vitro Pax3 protein synthesis

Pax3 cDNAs were cloned into the pGEM-7Zf(+) vector (Promega) and the pcDNA3.1/Zeo(+) vector (Invitrogen). These vectors contain T7 RNA polymerase promoters. Pax3 protein was synthesized in vitro using these expression vectors and the TNT® Coupled Reticulocyte system (Promega). Briefly, 1 µg of Pax3 expression vector was combined with 2 µl reaction buffer, 1 µl amino acids (minus methionine), 1 µl amino acids (minus leucine), 1 µl RNasin (Promega), 1 µl T7 polymerase, and 12.5 µl rabbit reticulocyte lysate in a final volume of 25 µl. Reactions were incubated at 30°C for 1.5 hours.
Proteins were stored in aliquots at -20°C.

Immunoprecipitations

Co-immunoprecipitations with anti-Pax3 antibodies were performed with both radioactive and non-radioactive DNA fragments. Radioactive co-immunoprecipitations were used to demonstrate binding of DNA fragments and oligonucleotides to Pax3 protein. Non-radioactive co-immunoprecipitations were used as part of the CASTing strategy. Reactions for radioactive co-immunoprecipitations were performed exactly as described for EMSAs (Materials and Methods, Electrophoretic mobility shift assays). Following incubation with the anti-Pax3 IgG, 25-50 µl of pre-washed GammaBind™ G Sepharose™ was added. GammaBind™ G Sepharose™ was pre-washed by adding 10 volumes of 1X incubation buffer (25 mM HEPES pH 7.9, 50 mM KCl, 1 mM DTT, 10% glycerol, 50 ng/µl dIdC, and 50 ng/µl BSA) and gently inverting the tube to mix the solution. Washed GammaBind™ G Sepharose™ was collected by centrifugation at 2000 rpm (325 x g) for 1 minute, and the wash solution was removed by pipette. Following four washes, the GammaBind™ G Sepharose™ was resuspended with incubation buffer in a final volume of 25-50 µl. After addition of the pre-washed GammaBind™ G Sepharose™ to the DNA/protein/antibody reactions, the tubes were rotated at room temperature for 1 hour. Complexes were washed 4 or 5 times with 1 ml of the same incubation buffer. After the final wash, the pellet was resuspended in 100 µl water, transferred to 10 ml scintillation fluid, and measured on a scintillation counter. Non-radioactive immunoprecipitations were performed similarly, except that after the final wash, DNA was collected by addition of 100 µl water, boiling for 5 minutes, and centrifugation for 1 minute at 14000 rpm (16,000 x g). The supernatant containing the DNA was transferred to a new tube for subsequent amplification by PCR.
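The two centrifugation steps above are given both in rpm and in x g. The following minimal sketch shows how the two are related by the standard relative centrifugal force formula, RCF = 1.118 x 10^-5 x r x rpm^2, with r in cm. The rotor radius of 7.3 cm is an assumption back-calculated from the quoted values and is not stated in the text; the function names are hypothetical.

```python
RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def radius_from_rcf(rcf: float, rpm: float) -> float:
    """Rotor radius (cm) implied by a quoted rpm and x g pair."""
    return rcf / (RCF_CONSTANT * rpm ** 2)


# Assumption: rotor radius is not given in the text; 7.3 cm is inferred
# from the quoted pairs (2000 rpm, 325 x g) and (14000 rpm, 16,000 x g).
ROTOR_RADIUS_CM = 7.3

for rpm, quoted_rcf in ((2000, 325), (14000, 16000)):
    implied_radius = radius_from_rcf(quoted_rcf, rpm)
    computed_rcf = rcf_from_rpm(rpm, ROTOR_RADIUS_CM)
    print(f"{rpm} rpm: implied radius {implied_radius:.2f} cm, "
          f"RCF at {ROTOR_RADIUS_CM} cm = {computed_rcf:.0f} x g")
```

Both quoted pairs imply nearly the same radius, which suggests the same microcentrifuge rotor was used for the bead washes and the final DNA recovery. Anyone repeating the steps in a different rotor would match the x g values rather than the rpm values.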

BIBLIOGRAPHY

Akashi, M., Shaw, G., Hachiya, M., Elstner, E., Suzuki, G., and Koeffler, P. (1994). Number and location of AUUUA motifs: role in regulating transiently expressed RNAs, Blood 83, 3182-3187.
Asher, J. H., Jr., and Friedman, T. B. (1990). Mouse and hamster mutants as models for Waardenburg syndromes in humans, J Med Genet 27, 618-626.
Asher, J. H., Jr., Harrison, R. W., Morell, R., Carey, M. L., and Friedman, T. B. (1996a). Effects of Pax3 modifier genes on craniofacial morphology, pigmentation, and viability: a murine model of Waardenburg syndrome variation, Genomics 34, 285-298.
Asher, J. H., Jr., Sommer, A., Morell, R., and Friedman, T. B. (1996b). Missense mutation in the paired domain of PAX3 causes craniofacial-deafness-hand syndrome, Hum Mutat 7, 30-35.
Asher, J. J., Morell, R., and Friedman, T. B. (1991). Confirmation of the location of a Waardenburg syndrome type I mutation on human chromosome 2q. Tight linkage to FN1 and ALPP, Ann N Y Acad Sci 630, 295-297.
Attaie, A., Kim, E., Wilcox, E. R., and Lalwani, A. K. (1997). A splice-site mutation affecting the paired box of PAX3 in a three generation family with Waardenburg syndrome type I (WS1), Mol Cell Probes 11, 233-236.
Auerbach, R. (1954). Analysis of the developmental effects of a lethal mutation in the house mouse, Jour Exp Zool 127, 305-329.
Baldwin, C. T., Hoth, C. F., Amos, J. A., da-Silva, E. O., and Milunsky, A. (1992). An exonic mutation in the HuP2 paired domain gene causes Waardenburg's syndrome, Nature 355, 637-638.
Baldwin, C. T., Hoth, C. F., Macina, R. A., and Milunsky, A. (1995). Mutations in PAX3 that cause Waardenburg syndrome type I: ten new mutations and review of the literature, Am J Med Genet 58, 115-122.
Barber, T. D., Barber, M. C., Cloutier, T. E., and Friedman, T. B. (1999). PAX3 gene structure, alternative splicing and evolution, Gene 237, 311-319.
Barr, F. G. (1997a).
Chromosomal translocations involving paired box transcription factors in human cancer, Int J Biochem Cell Biol 29, 1449-1461.
Barr, F. G. (1997b). Fusions involving paired box and fork head family transcription factors in the pediatric cancer alveolar rhabdomyosarcoma, Curr Top Microbiol Immunol 220, 113-129.
Barr, F. G., Fitzgerald, J. C., Ginsberg, J. P., Vanella, M. L., Davis, R. J., and Bennicelli, J. L. (1999). Predominant expression of alternative PAX3 and PAX7 forms in myogenic and neural tumor cell lines, Cancer Res 59, 5443-5448.
Barr, F. G., Galili, N., Holick, J., Biegel, J. A., Rovera, G., and Emanuel, B. S. (1993). Rearrangement of the PAX3 paired box gene in the paediatric solid tumour alveolar rhabdomyosarcoma, Nat Genet 3, 113-117.
Baumgartner, S., Bopp, D., Burri, M., and Noll, M. (1987). Structure of two genes at the gooseberry locus related to the paired gene and their spatial expression during Drosophila embryogenesis, Genes Dev 1, 1247-1267.
Bennett, G. D., An, J., Craig, J. C., Gefrides, L. A., Calvin, J. A., and Finnell, R. H. (1998). Neurulation abnormalities secondary to altered gene expression in neural tube defect susceptible Splotch embryos, Teratology 57, 17-29.
Bennicelli, J. L., Advani, S., Schafer, B. W., and Barr, F. G. (1999). PAX3 and PAX7 exhibit conserved cis-acting transcription repression domains and utilize a common gain of function mechanism in alveolar rhabdomyosarcoma, Oncogene 18, 4348-4356.
Bennicelli, J. L., Edwards, R. H., and Barr, F. G. (1996). Mechanism for transcriptional gain of function resulting from chromosomal translocation in alveolar rhabdomyosarcoma, Proc Natl Acad Sci U S A 93, 5455-5459.
Bennicelli, J. L., Fredericks, W. J., Wilson, R. E., Rauscher, F. J., 3rd, and Barr, F. G. (1995). Wild type PAX3 protein and the PAX3-FKHR fusion protein of alveolar rhabdomyosarcoma contain potent, structurally distinct transcriptional activation domains, Oncogene 11, 119-130.
Bertuccioli, C., Fasano, L., Jun, S., Wang, S., Sheng, G., and Desplan, C. (1996). In vivo requirement for the paired domain and homeodomain of the paired segmentation gene product, Development 122, 2673-2685.
Bladt, F., Riethmacher, D., Isenmann, S., Aguzzi, A., and Birchmeier, C. (1995). Essential role for the c-met receptor in the migration of myogenic precursor cells into the limb bud, Nature 376, 768-771.
Bober, E., Franz, T., Arnold, H. H., Gruss, P., and Tremblay, P. (1994). Pax-3 is required for the development of limb muscles: a possible role for the migration of dermomyotomal muscle progenitor cells, Development 120, 603-612.
Bopp, D., Burri, M., Baumgartner, S., Frigerio, G., and Noll, M. (1986). Conservation of a large protein domain in the segmentation gene paired and in functionally related genes of Drosophila, Cell 47, 1033-1040.
Bopp, D., Jamet, E., Baumgartner, S., Burri, M., and Noll, M. (1989). Isolation of two tissue-specific Drosophila paired box genes, Pox meso and Pox neuro, Embo J 8, 3447-3457.
Botquin, V., Hess, H., Fuhrmann, G., Anastassiadis, C., Gross, M. K., Vriend, G., and Scholer, H. R. (1998). New POU dimer configuration mediates antagonistic control of an osteopontin preimplantation enhancer by Oct-4 and Sox-2, Genes Dev 12, 2073-2090.
Boyd, K. E., Wells, J., Gutman, J., Bartley, S. M., and Farnham, P. J. (1998). c-Myc target gene specificity is determined by a post-DNA binding mechanism, Proc Natl Acad Sci U S A 95, 13887-13892.
Burri, M., Tromvoukis, Y., Bopp, D., Frigerio, G., and Noll, M. (1989). Conservation of the paired domain in metazoans and its structure in three isolated human genes, Embo J 8, 1183-1190.
Cao, Y., and Wang, C. (2000). The COOH-terminal transactivation domain plays a key role in regulating the in vitro and in vivo function of Pax3 homeodomain, J Biol Chem 275, 9854-9862.
Carey, M. L.
(1996). Screening for mutations in PAX3 and MITF in Waardenburg syndrome and Waardenburg syndrome-like individuals, M.S., Michigan State University, East Lansing.
Carey, M. L., Friedman, T. B., Asher, J. H., Jr., and Innis, J. W. (1998). Septo-optic dysplasia and WS1 in the proband of a WS1 family segregating for a novel mutation in PAX3 exon 7, J Med Genet 35, 248-250.
Carriere, C., Plaza, S., Caboche, J., Dozier, C., Bailly, M., Martin, P., and Saule, S. (1995). Nuclear localization signals, DNA binding, and transactivation properties of quail Pax-6 (Pax-QNR) isoforms, Cell Growth Differ 6, 1531-1540.
Caubin, J., Iglesias, T., Bernal, J., Munoz, A., Marquez, G., Barbero, J. L., and Zaballos, A. (1994). Isolation of Genomic DNA Fragments Corresponding to Genes Modulated in-Vivo by a Transcription Factor, Nucleic Acids Research 22, 4132-4138.
Chalepakis, G., Goulding, M., Read, A., Strachan, T., and Gruss, P. (1994a). Molecular basis of Splotch and Waardenburg Pax-3 mutations, Proc Natl Acad Sci U S A 91, 3685-3689.
Chalepakis, G., and Gruss, P. (1995). Identification of DNA recognition sequences for the Pax3 paired domain, Gene 162, 267-270.
Chalepakis, G., Jones, F. S., Edelman, G. M., and Gruss, P. (1994b). Pax-3 contains domains for transcription activation and transcription inhibition, Proc Natl Acad Sci U S A 91, 12745-12749.
Chalepakis, G., Stoykova, A., Wijnholds, J., Tremblay, P., and Gruss, P. (1993). Pax: gene regulators in the developing nervous system, J Neurobiol 24, 1367-1384.
Chalepakis, G., Wijnholds, J., and Gruss, P. (1994c). Pax-3-DNA interaction: flexibility in the DNA binding and induction of DNA conformational changes by paired domains, Nucleic Acids Res 22, 3131-3137.
Chodosh, L. A., Baldwin, A. S., Carthew, R. W., and Sharp, P. A. (1988). Human CCAAT-binding proteins have heterologous subunits, Cell 53, 11-24.
Clevidence, D. E., Overdier, D. G., Tao, W., Qian, X., Pani, L., Lai, E., and Costa, R. H. (1993).
Identification of nine tissue-specific transcription factors of the hepatocyte nuclear factor 3/forkhead DNA-binding-domain family, Proc Natl Acad Sci U S A 90, 3948-3952.
Conway, S. J., Henderson, D. J., and Copp, A. J. (1997a). Pax3 is required for cardiac neural crest migration in the mouse: evidence from the splotch (Sp2H) mutant, Development 124, 505-514.
Conway, S. J., Henderson, D. J., Kirby, M. L., Anderson, R. H., and Copp, A. J. (1997b). Development of a lethal congenital heart defect in the splotch (Pax3) mutant mouse, Cardiovasc Res 36, 163-173.
Crane-Robinson, C., and Wolffe, A. P. (1998). Immunological analysis of chromatin: FIS and CHIPS, Trends Genet 14, 477-480.
Dahl, E., Koseki, H., and Balling, R. (1997). Pax genes and organogenesis, Bioessays 19, 755-765.
Davis, R. J., D'Cruz, C. M., Lovell, M. A., Biegel, J. A., and Barr, F. G. (1994). Fusion of PAX7 to FKHR by the variant t(1;13)(p36;q14) translocation in alveolar rhabdomyosarcoma, Cancer Res 54, 2869-2872.
Deleersnijder, W., Hong, G., Cortvrindt, R., Poirier, C., Tylzanowski, P., Pittois, K., Van Marck, E., and Merregaert, J. (1996). Isolation of markers for chondro-osteogenic differentiation using cDNA library subtraction. Molecular cloning and characterization of a gene belonging to a novel multigene family of integral membrane proteins, J Biol Chem 271, 19475-19482.
Deutsch, U., Dressler, G. R., and Gruss, P. (1988). Pax 1, a member of a paired box homologous murine gene family, is expressed in segmented structures during development, Cell 53, 617-625.
Dickman, E. D., Rogers, R., and Conway, S. J. (1999). Abnormal skeletogenesis occurs coincident with increased apoptosis in the Splotch (Sp2H) mutant: putative roles for Pax3 and PDGFRalpha in rib patterning, Anat Rec 255, 353-361.
Dietrich, S. (1999). Regulation of hypaxial muscle development, Cell Tissue Res 296, 175-182.
Douglass, E. C., Valentine, M., Etcubanas, E., Parham, D., Webber, B. L., Houghton, P. J., Houghton, J.
A., and Green, A. A. (1987). A specific chromosomal abnormality in rhabdomyosarcoma [published erratum appears in Cytogenet Cell Genet 1988;47(4):following 232], Cytogenet Cell Genet 45, 148-155.
Dunne, J., Hanby, A. M., Poulsom, R., Jones, T. A., Sheer, D., Chin, W. G., Da, S. M., Zhao, Q., Beverley, P. C., and Owen, M. J. (1995). Molecular cloning and tissue expression of FAT, the human homologue of the Drosophila fat gene that is located on chromosome 4q34-q35 and encodes a putative adhesion molecule, Genomics 30, 207-223.
Eldeiry, W. S., Kern, S. E., Pietenpol, J. A., Kinzler, K. W., and Vogelstein, B. (1992). Definition of a Consensus Binding-Site for P53, Nature Genetics 1, 45-49.
Epstein, D. J., Vekemans, M., and Gros, P. (1991). Splotch (Sp2H), a mutation affecting development of the mouse neural tube, shows a deletion within the paired homeodomain of Pax-3, Cell 67, 767-774.
Epstein, D. J., Vogan, K. J., Trasler, D. G., and Gros, P. (1993). A mutation within intron 3 of the Pax-3 gene produces aberrantly spliced mRNA transcripts in the Splotch (Sp) mouse mutant, Proc Natl Acad Sci U S A 90, 532-536.
Epstein, J. A., Glaser, T., Cai, J., Jepeal, L., Walton, D. S., and Maas, R. L. (1994). Two independent and interactive DNA-binding subdomains of the Pax6 paired domain are regulated by alternative splicing, Genes Dev 8, 2022-2034.
Epstein, J. A., Lam, P., Jepeal, L., Maas, R. L., and Shapiro, D. N. (1995). Pax3 inhibits myogenic differentiation of cultured myoblast cells, J Biol Chem 270, 11719-11722.
Epstein, J. A., Li, J., Lang, D., Chen, F., Brown, C. B., Jin, F., Lu, M. M., Thomas, M., Liu, E., Wessels, A., and Lo, C. W. (2000). Migration of cardiac neural crest cells in splotch embryos [In Process Citation], Development 127, 1869-1878.
Epstein, J. A., Shapiro, D. N., Cheng, J., Lam, P. Y., and Maas, R. L. (1996). Pax3 modulates expression of the c-Met receptor during limb muscle development, Proc Natl Acad Sci U S A 93, 4213-4218.
Epstein, J.
A., Song, B., Lakkis, M., and Wang, C. (1998). Tumor-specific PAX3-FKHR transcription factor, but not PAX3, activates the platelet-derived growth factor alpha receptor, Mol Cell Biol 18, 4118-4130.
Erlich, H. A., Gelfand, D., and Sninsky, J. J. (1991). Recent advances in the polymerase chain reaction, Science 252, 1643-1651.
Farrer, L. A., Arnos, K. S., Asher, J. H., Jr., Baldwin, C. T., Diehl, S. R., Friedman, T. B., Greenberg, J., Grundfast, K. M., Hoth, C., Lalwani, A. K., et al. (1994). Locus heterogeneity for Waardenburg syndrome is predictive of clinical subtypes, Am J Hum Genet 55, 728-737.
Franz, T. (1989). Persistent truncus arteriosus in the Splotch mutant mouse, Anat Embryol 180, 457-464.
Franz, T. (1990). Defective ensheathment of motoric nerves in the Splotch mutant mouse, Acta Anat 138, 246-253.
Franz, T. (1992). Neural tube defects without neural crest defects in Splotch mice, Teratology 46, 599-604.
Franz, T., Kothary, R., Surani, M. A., Halata, Z., and Grim, M. (1993). The Splotch mutation interferes with muscle development in the limbs, Anat Embryol (Berl) 187, 153-160.
Fredericks, W. J., Galili, N., Mukhopadhyay, S., Rovera, G., Bennicelli, J., Barr, F. G., and Rauscher, F. J., 3rd (1995). The PAX3-FKHR fusion protein created by the t(2;13) translocation in alveolar rhabdomyosarcomas is a more potent transcriptional activator than PAX3, Mol Cell Biol 15, 1522-1535.
Friedman, T. B., Polanco, G. E., Appold, J. C., and Mayle, J. E. (1985). On the loss of uricolytic activity during primate evolution--I. Silencing of urate oxidase in a hominoid ancestor, Comp Biochem Physiol [B] 81, 653-659.
Frigerio, G., Burri, M., Bopp, D., Baumgartner, S., and Noll, M. (1986). Structure of the segmentation gene paired and the Drosophila PRD gene set as part of a gene network, Cell 47, 735-746.
Galili, N., Davis, R. J., Fredericks, W. J., Mukhopadhyay, S., Rauscher, F. J. d., Emanuel, B. S., Rovera, G., and Barr, F. G. (1993).
Fusion of a fork head domain gene to PAX3 in the solid tumour alveolar rhabdomyosarcoma [published erratum appears in Nat Genet 1994 Feb;6(2):214], Nat Genet 5, 230-235.
Gay, E., and Babajko, S. (2000). AUUUA sequences compromise human insulin-like growth factor binding protein-1 mRNA stability, Biochem Biophys Res Commun 267, 509-515.
Ginsberg, J. P., Davis, R. J., Bennicelli, J. L., Nauta, L. E., and Barr, F. G. (1998). Up-regulation of MET but not neural cell adhesion molecule expression by the PAX3-FKHR fusion protein in alveolar rhabdomyosarcoma, Cancer Res 58, 3542-3546.
Goulding, M., Lumsden, A., and Paquette, A. J. (1994). Regulation of Pax-3 expression in the dermomyotome and its role in muscle development, Development 120, 957-971.
Goulding, M., Sterrer, S., Fleming, J., Balling, R., Nadeau, J., Moore, K. J., Brown, S. D., Steel, K. P., and Gruss, P. (1993). Analysis of the Pax-3 gene in the mouse mutant splotch, Genomics 17, 355-363.
Goulding, M. D., Chalepakis, G., Deutsch, U., Erselius, J. R., and Gruss, P. (1991). Pax-3, a novel murine DNA binding protein expressed during early neurogenesis, Embo J 10, 1135-1147.
Hadjantonakis, A. K., Sheward, W. J., Harmar, A. J., de Galan, L., Hoovers, J. M., and Little, P. F. (1997). Celsr1, a neural-specific gene encoding an unusual seven-pass transmembrane receptor, maps to mouse chromosome 15 and human chromosome 22qter, Genomics 45, 97-104.
Hampson, R. K., La Follette, L., and Rottman, F. M. (1989). Alternative processing of bovine growth hormone mRNA is influenced by downstream exon sequences, Mol Cell Biol 9, 1604-1610.
Hampson, R. K., and Rottman, F. M. (1987). Alternative processing of bovine growth hormone mRNA: nonsplicing of the final intron predicts a high molecular weight variant of bovine growth hormone, Proc Natl Acad Sci USA 84, 2673-2677.
Harper, J. C., and Wells, D. (1999). Recent advances and future developments in PGD, Prenat Diagn 19, 1193-1199.
Harris, S. E., Winchester, C.
L., and Johnson, K. J. (2000). Functional analysis of the homeodomain protein SIX5, Nucleic Acids Research 28, 1871-1878.
Heldin, C. H., and Westermark, B. (1999). Mechanism of action and in vivo role of platelet-derived growth factor, Physiol Rev 79, 1283-1316.
Heller, N., and Brandli, A. W. (1997). Xenopus Pax-2 displays multiple splice forms during embryogenesis and pronephric kidney development, Mech Dev 69, 83-104.
Henderson, D. J., Conway, S. J., and Copp, A. J. (1999). Rib truncations and fusions in the Sp2H mouse reveal a role for Pax3 in specification of the ventro-lateral and posterior parts of the somite, Dev Biol 209, 143-158.
Hill, A. L., Phelan, S. A., and Loeken, M. R. (1998). Reduced expression of pax-3 is associated with overexpression of cdc46 in the mouse embryo, Dev Genes Evol 208, 128-134.
Hirano, S., Yan, Q., and Suzuki, S. T. (1999). Expression of a novel protocadherin, OL-protocadherin, in a subset of functional systems of the developing mouse brain, J Neurosci 19, 995-1005.
Hoey, T., and Levine, M. (1988). Divergent homeo box proteins recognize similar DNA sequences in Drosophila, Nature 332, 858-861.
Hoey, T., Warrior, R., Manak, J., and Levine, M. (1988). DNA-binding activities of the Drosophila melanogaster even-skipped protein are mediated by its homeo domain and influenced by protein context, Mol Cell Biol 8, 4598-4607.
Hu, R. J., Lee, M. P., Johnson, L. A., and Feinberg, A. P. (1996). A novel human homologue of yeast nucleosome assembly protein, 65 kb centromeric to the p57KIP2 gene, is biallelically expressed in fetal and adult tissues, Hum Mol Genet 5, 1743-1748.
Jakeman, L. B., Armanini, M., Phillips, H. S., and Ferrara, N. (1993). Developmental expression of binding sites and messenger ribonucleic acid for vascular endothelial growth factor suggests a role for this protein in vasculogenesis and angiogenesis, Endocrinology 133, 848-859.
Jaworski, C., Sperbeck, S., Graham, C., and Wistow, G. (1997). Alternative splicing of Pax6 in bovine eye and evolutionary conservation of intron sequences, Biochem Biophys Res Commun 240, 196-202.
Jeffreys, A. J., Neumann, R., and Wilson, V. (1990). Repeat unit sequence variation in minisatellites: a novel source of DNA polymorphism for studying variation and mutation by single molecule analysis, Cell 60, 473-485.
Kawamoto, T., Makino, K., Niwa, H., Sugiyama, H., Kimura, S., Amemura, M., Nakata, A., and Kakunaga, T. (1988). Identification of the human beta-actin enhancer and its binding factor, Mol Cell Biol 8, 267-272.
Kay, P. H., and Ziman, M. R. (1999). Alternate Pax7 paired box transcripts which include a trinucleotide or a hexanucleotide are generated by use of alternate 3' intronic splice sites which are not utilized in the ancestral homologue, Gene 230, 55-60.
Khan, J., Bittner, M. L., Saal, L. H., Teichmann, U., Azorsa, D. O., Gooden, G. C., Pavan, W. J., Trent, J. M., and Meltzer, P. S. (1999). cDNA microarrays detect activation of a myogenic transcription program by the PAX3-FKHR fusion oncogene, Proc Natl Acad Sci U S A 96, 13264-13269.
Kim, K. J., Li, B., Winer, J., Armanini, M., Gillett, N., Phillips, H. S., and Ferrara, N. (1993). Inhibition of vascular endothelial growth factor-induced angiogenesis suppresses tumour growth in vivo, Nature 362, 841-844.
Kinzler, K. W., and Vogelstein, B. (1989). Whole genome PCR: application to the identification of sequences bound by gene regulatory proteins, Nucleic Acids Res 17, 3645-3653.
Kinzler, K. W., and Vogelstein, B. (1990). The Gli Gene Encodes a Nuclear-Protein Which Binds Specific Sequences in the Human Genome, Molecular and Cellular Biology 10, 634-642.
Kochilas, L. K., Li, J., Jin, F., Buck, C. A., and Epstein, J. A. (1999). p57Kip2 expression is enhanced during mid-cardiac murine development and is restricted to trabecular myocardium, Pediatr Res 45, 635-642.
Kozmik, Z., Czerny, T., and Busslinger, M. (1997). Alternatively spliced insertions in the paired domain restrict the DNA sequence specificity of Pax6 and Pax8, Embo J 16, 6793-6803.
Kozmik, Z., Kurzbauer, R., Dorfler, P., and Busslinger, M. (1993). Alternative splicing of Pax-8 gene transcripts is developmentally regulated and generates isoforms with different transactivation properties, Mol Cell Biol 13, 6024-6035.
Kumar, V., Bustin, S. A., and McKay, I. A. (1995). Transforming growth factor alpha, Cell Biol Int 19, 373-388.
Kuo, M. H., and Allis, C. D. (1999). In vivo cross-linking and immunoprecipitation for studying dynamic Protein:DNA associations in a chromatin environment, Methods 19, 425-433.
Lakkis, M. M., Golden, J. A., O'Shea, K. S., and Epstein, J. A. (1999). Neurofibromin deficiency in mice causes exencephaly and is a modifier for Splotch neural tube defects, Dev Biol 212, 80-92.
Lalwani, A. K., Brister, J. R., Fex, J., Grundfast, K. M., Ploplis, B., San Agustin, T. B., and Wilcox, E. R. (1995). Further elucidation of the genomic structure of PAX3, and identification of two different point mutations within the PAX3 homeobox that cause Waardenburg syndrome type 1 in two families, Am J Hum Genet 56, 75-83.
Lalwani, A. K., Mhatre, A. N., San Agustin, T. B., and Wilcox, E. R. (1996). Genotype-phenotype correlations in type 1 Waardenburg syndrome, Laryngoscope 106, 895-902.
Levine, M., and Hoey, T. (1988). Homeobox proteins as sequence-specific transcription factors, Cell 55, 537-540.
Liu, Q., Thorland, E. C., Heit, J. A., and Sommer, S. S. (1997). Overlapping PCR for bidirectional PCR amplification of specific alleles: a rapid one-tube method for simultaneously differentiating homozygotes and heterozygotes, Genome Res 7, 389-398.
Liu, X., Newton, V., and Read, A. (1995a). Hearing loss and pigmentary disturbances in Waardenburg syndrome with reference to WS type II, J Laryngol Otol 109, 96-100.
Liu, X. Z., Newton, V. E., and Read, A. P. (1995b).
Waardenburg syndrome type II: phenotypic findings and diagnostic criteria, Am J Med Genet 55, 95-100.
Lizard-Nacol, S., Mugneret, F., Volk, C., Turc-Carel, C., Favrot, M., and Philip, T. (1987). Translocation (2;13)(q37;q14) in alveolar rhabdomyosarcoma: a new case [letter], Cancer Genet Cytogenet 25, 373-374.
Macina, R. A., Barr, F. G., Galili, N., and Riethman, H. C. (1995). Genomic organization of the human PAX3 gene: DNA sequence analysis of the region disrupted in alveolar rhabdomyosarcoma, Genomics 26, 1-8.
Maekawa, T., Imamoto, F., Merlino, G. T., Pastan, I., and Ishii, S. (1989). Cooperative function of two separate enhancers of the human epidermal growth factor receptor proto-oncogene, J Biol Chem 264, 5488-5494.
Mahoney, P. A., Weber, U., Onofrechuk, P., Biessmann, H., Bryant, P. J., and Goodman, C. S. (1991). The fat tumor suppressor gene in Drosophila encodes a novel member of the cadherin gene superfamily, Cell 67, 853-868.
Maroto, M., Reshef, R., Munsterberg, A. E., Koester, S., Goulding, M., and Lassar, A. B. (1997). Ectopic Pax-3 activates MyoD and Myf-5 expression in embryonic mesoderm and neural tissue, Cell 89, 139-148.
Marquardt, H., Hunkapiller, M. W., Hood, L. E., and Todaro, G. J. (1984). Rat transforming growth factor type 1: structure and relation to epidermal growth factor, Science 223, 1079-1082.
Martin, K. (1999). Applications of Promega's In vitro Expression Systems, Promega Notes 70, 2-6.
Mavrothalassitis, G., Beal, G., and Papas, T. S. (1990). Defining Target Sequences of DNA-Binding Proteins by Random Selection and Pcr - Determination of the Gcn4 Binding Sequence Repertoire, DNA and Cell Biology 9, 783-788.
McBurney, M. W., Jones-Villeneuve, E. M., Edwards, M. K., and Anderson, P. J. (1982). Control of muscle and neuronal differentiation in a cultured embryonal carcinoma cell line, Nature 299, 165-167.
Mennerich, D., Schafer, K., and Braun, T. (1998).
Pax-3 is necessary but not sufficient for lbx1 expression in myogenic precursor cells of the limb, Mech Dev 73, 147-158.
Miskiewicz, P., Morrissey, D., Lan, Y., Raj, L., Kessler, S., Fujioka, M., Goto, T., and Weir, M. (1996). Both the paired domain and homeodomain are required for in vivo function of Drosophila Paired, Development 122, 2709-2718.
Miteva, V., Gancheva, A., Mitev, V., and Ljubenov, M. (1998). Comparative genome analysis of Bacillus sphaericus by ribotyping, M13 hybridization, and M13 polymerase chain reaction fingerprinting, Can J Microbiol 44, 175-180.
Monaco, A. P., and Larin, Z. (1994). YACs, BACs, PACs and MACs: artificial chromosomes as research tools, Trends Biotechnol 12, 280-286.
Morell, R., Friedman, T. B., Moeljopawiro, S., Hartono, Soewito, and Asher, J. H., Jr. (1992). A frameshift mutation in the HuP2 paired domain of the probable human homolog of murine Pax-3 is responsible for Waardenburg syndrome type 1 in an Indonesian family, Hum Mol Genet 1, 243-247.
Murphy, M., Ahn, J., Walker, K. K., Hoffman, W. H., Evans, R. M., Levine, A. J., and George, D. L. (1999). Transcriptional repression by wild-type p53 utilizes histone deacetylases, mediated by interaction with mSin3a, Genes Dev 13, 2490-2501.
Natoli, T. A., Ellsworth, M. K., Wu, C., Gross, K. W., and Pruitt, S. C. (1997). Positive and negative DNA sequence elements are required to establish the pattern of Pax3 expression, Development 124, 617-626.
Neufeld, G., Cohen, T., Gengrinovitch, S., and Poltorak, Z. (1999). Vascular endothelial growth factor (VEGF) and its receptors, Faseb J 13, 9-22.
Nornes, S., Mikkola, I., Krauss, S., Delghandi, M., Perander, M., and Johansen, T. (1996). Zebrafish Pax9 encodes two proteins with distinct C-terminal transactivating domains of different potency negatively regulated by adjacent N-terminal sequences, J Biol Chem 271, 26914-26923.
Ogasawara, M., Wada, H., Peters, H., and Satoh, N. (1999).
Developmental expression of Pax1/9 genes in urochordate and hemichordate gills: insight into function and evolution of the pharyngeal epithelium, Development 126, 2539-2550.
Okladnova, O., Syagailo, Y. V., Mossner, R., Riederer, P., and Lesch, K. P. (1998). Regulation of PAX-6 gene transcription: alternate promoter usage in human brain, Brain Res Mol Brain Res 60, 177-192.
Orlando, V., Strutt, H., and Paro, R. (1997). Analysis of chromatin structure by in vivo formaldehyde cross-linking, Methods 11, 205-214.
Park, M., Dean, M., Kaul, K., Braun, M. J., Gonda, M. A., and Vande Woude, G. (1987). Sequence of MET protooncogene cDNA has features characteristic of the tyrosine kinase family of growth-factor receptors, Proc Natl Acad Sci U S A 84, 6379-6383.
Partanen, T. A., Makinen, T., Arola, J., Suda, T., Weich, H. A., and Alitalo, K. (1999). Endothelial growth factor receptors in human fetal heart, Circulation 100, 583-586.
Pasteris, N. G., Trask, B. J., Sheldon, S., and Gorski, J. L. (1993). Discordant phenotype of two overlapping deletions involving the PAX3 gene in chromosome 2q35, Hum Mol Genet 2, 953-959.
Phelan, S. A., and Loeken, M. R. (1998). Identification of a new binding motif for the paired domain of Pax-3 and unusual characteristics of spacing of bipartite recognition elements on binding and transcription activation, J Biol Chem 273, 19153-19159.
Pittois, K., Deleersnijder, W., and Merregaert, J. (1998). cDNA sequence analysis, chromosomal assignment and expression pattern of the gene coding for integral membrane protein 2B, Gene 217, 141-149.
Pittois, K., Wauters, J., Bossuyt, P., Deleersnijder, W., and Merregaert, J. (1999). Genomic organization and chromosomal localization of the Itm2a gene, Mamm Genome 10, 54-56.
Plaza, S., Dozier, C., Turque, N., and Saule, S. (1995). Quail Pax-6 (Pax-QNR) mRNAs are expressed from two promoters used differentially during retina development and neuronal differentiation, Mol Cell Biol 15, 3344-3353.
Poleev, A., Wendler, F., Fickenscher, H., Zannini, M. S., Yaginuma, K., Abbott, C., and Plachov, D. (1995). Distinct functional properties of three human paired-box-protein, PAX8, isoforms generated by alternative splicing in thyroid, kidney and Wilms' tumors, Eur J Biochem 228, 899-911.
Pollock, R., and Treisman, R. (1990). A Sensitive Method for the Determination of Protein-DNA Binding Specificities, Nucleic Acids Research 18, 6197-6204.
Pruitt, S. C. (1992). Expression of Pax-3- and neuroectoderm-inducing activities during differentiation of P19 embryonal carcinoma cells, Development 116, 573-583.
Raja, R. H., Paterson, A. J., Shin, T. H., and Kudlow, J. E. (1991). Transcriptional regulation of the human transforming growth factor-alpha gene, Mol Endocrinol 5, 514-520.
Raney, R. B., Hays, D. M., Tefft, M., and Triche, T. J. (1993). Rhabdomyosarcoma and the undifferentiated sarcomas. In Principles and Practice of Pediatric Oncology, P. A. Pizzo, and D. G. Poplack, eds. (Philadelphia, JB Lippincott), pp. 769-794.
Read, A. P., Foy, C., Newton, V., and Harris, R. (1991). Localization of a gene for Waardenburg syndrome type I, Ann N Y Acad Sci 630, 143-151.
Reese, D. E., and Bader, D. M. (1999). Cloning and expression of hbves, a novel and highly conserved mRNA expressed in the developing and adult heart and skeletal muscle in the human, Mamm Genome 10, 913-915.
Reese, D. E., Zavaljevski, M., Streiff, N. L., and Bader, D. (1999). bves: A novel gene expressed during coronary blood vessel development, Dev Biol 209, 159-171.
Rotheneder, H., Grabner, M., and Wintersberger, E. (1991). Presence of regulatory sequences within intron 2 of the mouse thymidine kinase gene, Nucleic Acids Res 19, 6805-6809.
Rouquier, S., Batzer, M. A., and Giorgi, D. (1994). Application of bacterial artificial chromosomes to the generation of contiguous physical maps: a pilot study of human ryanodine receptor gene (RYR1) region, Anal Biochem 217, 205-209.
Rowe, D., Gerrard, M., Gibbons, B., and Malpas, J. S. (1987). Two further cases of t(2;13) in alveolar rhabdomyosarcoma indicating a review of the published chromosome breakpoints, Br J Cancer 56, 379-380.
Russell, W. L. (1947). Splotch, a new mutation in the house mouse Mus musculus, Genetics 32, 107.
Salamov, A. A., and Solovyev, V. V. (1997). Recognition of 3'-processing sites of human mRNA precursors, Comput Appl Biosci 13, 23-28.
Sanyanusin, P., Norrish, J. H., Ward, T. A., Nebel, A., McNoe, L. A., and Eccles, M. R. (1996). Genomic structure of the human PAX2 gene, Genomics 35, 258-261.
Schafer, K., and Braun, T. (1999). Early specification of limb muscle precursor cells by the homeobox gene Lbx1h, Nat Genet 23, 213-216.
Schneider, T. D. (1996). Reading of DNA sequence logos: prediction of major groove binding by information theory, Methods Enzymol 274, 445-455.
Schneider, T. D. (1997). Information content of individual genetic sequences, J Theor Biol 189, 427-441.
Schneider, T. D., and Stephens, R. M. (1990). Sequence logos: a new way to display consensus sequences, Nucleic Acids Res 18, 6097-6100.
Senapathy, P., Shapiro, M. B., and Harris, N. L. (1990). Splice junctions, branch point sites, and exons: sequence statistics, identification, and applications to genome project, Methods Enzymol 183, 252-278.
Seo, H. C., Saetre, B. O., Havik, B., Ellingsen, S., and Fjose, A. (1998). The zebrafish Pax3 and Pax7 homologues are highly conserved, encode multiple isoforms and show dynamic segment-like expression in the developing brain, Mech Dev 70, 49-63.
Shapiro, D. N., Sublett, J. E., Li, B., Downing, J. R., and Naeve, C. W. (1993). Fusion of PAX3 to a member of the forkhead family of transcription factors in human alveolar rhabdomyosarcoma, Cancer Res 53, 5108-5112.
Shapiro, M. B., and Senapathy, P. (1987). RNA splice junctions of different classes of eukaryotes: sequence statistics and functional implications in gene expression, Nucleic Acids Res 15, 7155-7174.
Shin, T. H., Paterson, A. J., and Kudlow, J. E. (1995). p53 stimulates transcription from the human transforming growth factor alpha promoter: a potential growth-stimulatory role for p53, Mol Cell Biol 15, 4694-4701.
Shweiki, D., Itin, A., Neufeld, G., Gitay-Goren, H., and Keshet, E. (1993). Patterns of expression of vascular endothelial growth factor (VEGF) and VEGF receptors in mice suggest a role in hormonally regulated angiogenesis, J Clin Invest 91, 2235-2243.
Sibley, C. G., and Ahlquist, J. E. (1984). The phylogeny of the hominoid primates, as indicated by DNA-DNA hybridization, J Mol Evol 20, 2-15.
Solovyev, V., and Salamov, A. (1997). The Gene-Finder computer tools for analysis of human and model organisms genome sequences, Ismb 5, 294-302.
Song, D. L., Chalepakis, G., Gruss, P., and Joyner, A. L. (1996). Two Pax-binding sites are required for early embryonic brain expression of an Engrailed-2 transgene, Development 122, 627-635.
Sotirova, V. N., Rezaie, T. M., Khoshsorour, M. M., and Sarfarazi, M. (2000). Identification of a novel mutation in the paired domain of PAX3 in an Iranian family with Waardenburg syndrome Type I, Ophthalmic Genet 21, 25-28.
Steel, K. P., and Smith, R. J. (1992). Normal hearing in Splotch (Sp/+), the mouse homologue of Waardenburg syndrome type 1, Nat Genet 2, 75-79.
Sublett, J. E., Jeon, I. S., and Shapiro, D. N. (1995). The alveolar rhabdomyosarcoma PAX3/FKHR fusion protein is a transcriptional activator, Oncogene 11, 545-552.
Tajbakhsh, S., Rocancourt, D., Cossu, G., and Buckingham, M. (1997). Redefining the genetic hierarchies controlling skeletal myogenesis: Pax-3 and Myf-5 act upstream of MyoD, Cell 89, 127-138.
Tassabehji, M., Newton, V. E., Leverton, K., Turnbull, K., Seemanova, E., Kunze, J., Sperling, K., Strachan, T., and Read, A. P. (1994a). PAX3 gene structure and mutations: close analogies between Waardenburg syndrome and the Splotch mouse, Hum Mol Genet 3, 1069-1074.
Tassabehji, M., Newton, V. E., and Read, A. P. (1994b). Waardenburg syndrome type 2 caused by mutations in the human microphthalmia (MITF) gene, Nat Genet 8, 251-255.
Tassabehji, M., Read, A. P., Newton, V. E., Harris, R., Balling, R., Gruss, P., and Strachan, T. (1992). Waardenburg's syndrome patients have mutations in the human homologue of the Pax-3 paired box gene, Nature 355, 635-636.
Tassabehji, M., Read, A. P., Newton, V. E., Patton, M., Gruss, P., Harris, R., and Strachan, T. (1993). Mutations in the PAX3 gene causing Waardenburg syndrome type 1 and type 2, Nat Genet 3, 26-30.
Tavassoli, K., Ruger, W., and Horst, J. (1997). Alternative splicing in PAX2 generates a new reading frame and an extended conserved coding region at the carboxy terminus, Hum Genet 101, 371-375.
Templeton, N. S. (1992). The polymerase chain reaction. History, methods, and applications, Diagn Mol Pathol 1, 58-72.
Theiler, K. (1989). The house mouse: atlas of embryonic development. (New York, Springer).
Thiesen, H. J., and Bach, C. (1990). Target Detection Assay (TDA) - a Versatile Procedure to Determine DNA-Binding Sites as Demonstrated on Sp1 Protein, Nucleic Acids Research 18, 3203-3209.
Tokuyama, Y., Yagui, K., Sakurai, K., Hashimoto, N., Saito, Y., and Kanatsuka, A. (1998). Molecular cloning of rat Pax4: identification of four isoforms in rat insulinoma cells, Biochem Biophys Res Commun 248, 153-156.
Tremblay, P., Dietrich, S., Mericskay, M., Schubert, F. R., Li, Z., and Paulin, D. (1998). A crucial role for Pax3 in the development of the hypaxial musculature and the long-range migration of muscle precursors, Dev Biol 203, 49-61.
Tsukamoto, K., Nakamura, Y., and Niikawa, N. (1994). Isolation of two isoforms of the PAX3 gene transcripts and their tissue-specific alternative expression in human adult tissues, Hum Genet 93, 270-274.
Turc-Carel, C., Lizard-Nacol, S., Justrabo, E., Favrot, M., Philip, T., and Tabone, E. (1986). Consistent chromosomal translocation in alveolar rhabdomyosarcoma, Cancer Genet Cytogenet 19, 361-362.
Uchiyama, K., Ishikawa, A., and Hanaoka, K. (2000). Expression of lbx1 involved in the hypaxial musculature formation of the mouse embryo, J Exp Zool 286, 270-279.
Untawale, S., Zorbas, M. A., Hodgson, C. P., Coffey, R. J., Gallick, G. E., North, S. M., Wildrick, D. M., Olive, M., Blick, M., Yeoman, L. C., et al. (1993). Transforming growth factor-alpha production and autoinduction in a colorectal carcinoma cell line (DiFi) with an amplified epidermal growth factor receptor gene, Cancer Res 53, 1630-1636.
Vergeer, W. P., Sogo, J. M., Pretorius, P. J., and de Vries, W. N. (2000). Interaction of Ap1, Ap2, and Sp1 with the regulatory regions of the human pro-alpha1(I) collagen gene, Arch Biochem Biophys 377, 69-79.
Vogan, K. J., Epstein, D. J., Trasler, D. G., and Gros, P. (1993). The splotch-delayed (Spd) mouse mutant carries a point mutation within the paired box of the Pax-3 gene, Genomics 17, 364-369.
Vogan, K. J., and Gros, P. (1997). The C-terminal subdomain makes an important contribution to the DNA binding activity of the Pax-3 paired domain, J Biol Chem 272, 28289-28295.
Vogan, K. J., Underhill, D. A., and Gros, P. (1996). An alternative splicing event in the Pax-3 paired domain identifies the linker region as a key determinant of paired domain DNA-binding activity, Mol Cell Biol 16, 6677-6686.
Vortkamp, A., Gessler, M., and Grzeschik, K. H. (1995). Identification of optimized target sequences for the GLI3 zinc finger protein, DNA Cell Biol 14, 629-634.
Waardenburg, P. J. (1951). A new syndrome combining developmental anomalies of the eyelids, eyebrows and nose root with pigmentary defects of the iris and head hair with congenital deafness, Am J Hum Genet 3, 195-253.
Wada, H., Holland, P. W., Sato, S., Yamamoto, H., and Satoh, N. (1997). Neural tube is partially dorsalized by overexpression of HrPax-37: the ascidian homologue of Pax-3 and Pax-7, Dev Biol 187, 240-252.
Walther, C., Guenet, J. L., Simon, D., Deutsch, U., Jostes, B., Goulding, M. D., Plachov, D., Balling, R., and Gruss, P. (1991). Pax: a murine multigene family of paired box-containing genes, Genomics 11, 424-434.
Wang, D., and Kudlow, J. E. (1999). Purification and characterization of TEF1, a transcription factor that controls the human transforming growth factor-alpha promoter, Biochim Biophys Acta 1449, 50-62.
Wang, D., Shin, T. H., and Kudlow, J. E. (1997). Transcription factor AP-2 controls transcription of the human transforming growth factor-alpha gene, J Biol Chem 272, 14244-14250.
Ward, T. A., Nebel, A., Reeve, A. E., and Eccles, M. R. (1994). Alternative messenger RNA forms and open reading frames within an additional conserved region of the human PAX-2 gene, Cell Growth Differ 5, 1015-1021.
Watanabe, A., Takeda, K., Ploplis, B., and Tachibana, M. (1998). Epistatic relationship between Waardenburg syndrome genes MITF and PAX3, Nat Genet 18, 283-286.
Weigel, D., and Jackle, H. (1990). The fork head domain: a novel DNA binding motif of eukaryotic transcription factors? [letter], Cell 63, 455-456.
Wildhardt, G., Winterpacht, A., Hilbert, K., Menger, H., and Zabel, B. (1996). Two different PAX3 gene mutations causing Waardenburg syndrome type I, Mol Cell Probes 10, 229-231.
Wright, W. E., Binder, M., and Funk, W. (1991). Cyclic Amplification and Selection of Targets (CASTing) for the Myogenin Consensus Binding-Site, Molecular and Cellular Biology 11, 4104-4110.
Wright, W. E., and Funk, W. D. (1993). CASTing for Multicomponent DNA-Binding Complexes, Trends in Biochemical Sciences 18, 77-80.
Zhang, L., Cui, X., Schmitt, K., Hubert, R., Navidi, W., and Arnheim, N. (1992). Whole genome amplification from a single cell: implications for genetic analysis, Proc Natl Acad Sci U S A 89, 5847-5851.
Ziman, M. R., Fletcher, S., and Kay, P. H. (1997). Alternate Pax7 transcripts are expressed specifically in skeletal muscle, brain and other organs of adult mice, Int J Biochem Cell Biol 29, 1029-1036.
Ziman, M. R., and Kay, P. H. (1998). Differential expression of four alternate Pax7 paired box transcripts is influenced by organ- and strain-specific factors in adult mice, Gene 217, 77-81.
Zwollo, P., Arrieta, H., Ede, K., Molinder, K., Desiderio, S., and Pollock, R. (1997). The Pax-5 gene is alternatively spliced during B-cell development, J Biol Chem 272, 10160-10168.