BIOL 471
Multiple sequence alignment exercise
Nucleotide and amino acid sequences are commonly used to reconstruct the phylogenetic history of organisms. Sequences obtained by researchers are deposited in various online databases and are freely available for use by others. The analysis of sequence data frequently begins by obtaining an alignment, in which multiple sequences are arranged to show positional homology. Sequences differ because of mutations (point mutations, translocations, etc.), and insertions or deletions (indels) in particular cause them to vary in length, so gaps are added to sequences to maintain positional homology. However, positional homology is not known with certainty; it must be inferred using an analytical approach. In this exercise, you will use an online program that aligns sequences by matching similar portions of sequences as closely as possible while introducing as few gaps as possible. You will use the alignment to create a preliminary tree representing the phylogeny of a group of organisms.
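To make the idea of "matching as closely as possible with the fewest gaps" concrete, here is a minimal Python sketch of a pairwise global alignment (the Needleman-Wunsch method). It is only an illustration: the function name global_align and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for this example, not the settings used by the online program, and the two short sequences are made up.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment of two strings with a linear gap cost.

    Returns (aligned_a, aligned_b, score). Scoring values are illustrative.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i, j):
        # Score for placing a[i-1] opposite b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align two bases
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


# Made-up example: the second sequence is missing one base.
top, bottom, total = global_align("GATTACA", "GATACA")
print(top)
print(bottom)
print("score:", total)
```

Making the gap penalty more negative discourages gaps, and making it less negative allows more of them. This is the trade-off the alignment program manages for you.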
Nucleotide sequence data can also be obtained from unrecognized or undescribed organisms to provide information about their identity or their proper placement among groups of organisms whose identity is already known. Analysis typically includes aligning sequences of the same gene region from several organisms to see how homologous sites compare. If the sequences are not properly aligned, valid comparisons cannot be made and identifications may be incorrect. In this exercise, you will also attempt to identify an unknown organism using preliminary phylogenetic information from related sequences in online databases.
One of the most commonly used programs for aligning sequences is ClustalW (a text-based program; ClustalX is the same program with a graphical interface), which is freely available online as source code or as an executable program. Clustal first aligns the input sequences in pairs to produce a distance matrix, which is then used to build a simple unrooted neighbor-joining (NJ) guide tree. Using this tree, Clustal gradually builds an alignment by following the branching order on the tree: the most closely related sequences at the tips are aligned first, then treated as a single unit (gaps are kept in place) and aligned with the next most closely related sequences. This continues until all sequences are aligned. The process is fast and can be used to align hundreds of input sequences.
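The first stage described above (pairwise comparisons producing a distance matrix) can be sketched in a few lines of Python. This is a simplified illustration, not Clustal's own code: it assumes the sequences are already aligned to the same length, uses the plain proportion of differing sites as the distance, and stops at finding the closest pair rather than building the NJ tree. The function names and the four toy sequences are made up for the example.

```python
from itertools import combinations


def p_distance(x, y):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ("-") are skipped.
    """
    if len(x) != len(y):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for cx, cy in zip(x, y):
        if cx == "-" or cy == "-":
            continue
        compared += 1
        if cx != cy:
            differences += 1
    return differences / compared if compared else 0.0


def distance_matrix(aligned):
    """Return {(name1, name2): distance} for every pair of sequences."""
    return {
        (a, b): p_distance(aligned[a], aligned[b])
        for a, b in combinations(aligned, 2)
    }


def closest_pair(dist):
    """The pair with the smallest distance; a progressive aligner joins it first."""
    return min(dist, key=dist.get)


# Toy, made-up aligned sequences (not taken from the bear data).
toy = {
    "seqA": "ACGTACGT",
    "seqB": "ACGTACGA",
    "seqC": "ACG-TCTA",
    "seqD": "TCGAACTA",
}
dist = distance_matrix(toy)
for pair_names, d in sorted(dist.items(), key=lambda item: item[1]):
    print(pair_names, round(d, 3))
print("closest pair:", closest_pair(dist))
```

The sorted list printed here plays the same role as a table of pairwise scores: the smaller the distance, the more similar the two sequences.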
Aligning existing sequences and creating a preliminary phylogeny:
Below are sequences of a gene amplified from the mitochondrial genome of representatives of the Ursidae (bear family), the relationships of which were studied in a series of papers by Talbot and Shields (1996a,b), and discussed in class. You will use these to create an initial multiple sequence alignment for analysis later in class. Based on simple tests of relationships you should be able to answer the questions below. Since these are published sequences, you can access information about them online at the NCBI (National Center for Biotechnology Information) website using the accession numbers (GenBank accession numbers in this case will start with letters, such as U*****, DQ******, or EF******); the nucleotide database also provides information about the collection localities, depositors, literature, etc.
Once you have a preliminary alignment, you can produce a phylogeny using available online freeware. One such program is SplitsTree, available from www.splitstree.org. This is a free program, but you must register a site license. It will take an alignment formatted in Clustal (which you will have), or one of many other common formats (NEXUS, PHYLIP, FASTA) and create trees or networks using a variety of analytical approaches (background discussed by the developers in Huson and Bryant, 2006). You will not be required to use this program for this exercise, but I will briefly introduce its use in class.
Identifying an unknown from sequence information:
You will also be given sequences obtained from an unknown organism (a fungus!), and you will be asked to find out as much as you can about this unknown. These sequences were obtained with fungal primers designed to amplify the large subunit (LSU) ribosomal RNA gene, the internal transcribed spacer (ITS) ribosomal gene region, and the second largest subunit of RNA polymerase II (RPB2). These have not yet been published.
To do these exercises, you must visit GenBank (to access deposited sequences) and the website of the European Bioinformatics Institute (EMBL-EBI), which provides access to ClustalW and will enable you to get an initial multiple alignment of your sequences.
Follow the procedure outlined below and answer the questions at the end of the exercise.
Procedure:
1. Visit GenBank and get information about the bear sequences (using either the Latin names or the accession numbers). Here is where you can find out who deposited them, what gene(s) were sequenced, if they were published and where, etc.
2. Copy and paste the sequence information below (it is already formatted in FASTA format, which can be recognized by ClustalW) into the ClustalW submission form window. Use default settings (you don't need to change anything) on the form and push "run." You will be given a multiple alignment of the sequences in two windows, one (push the "JalView" button) of which is in color and requires Java, and the other given below the alignment scores. Notice that the sequences are not all the same length, so the alignment leaves gaps where something is missing or added (indels, positions where a base or bases were inserted or deleted in one or more sequences). The process of gapping the sequences is complicated, but we would like the program to minimize this as much as possible so as to create the most likely representation of the homologies among the sequences.
3. Look first at the "Scores Table." This tells you the percentage similarity of each pair of sequences (which may or may not be evolutionarily significant). If you sort by alignment score, you can see which sequences are most similar to each other, or to unknowns. At the end of the results is a simple NJ guide tree produced from the pairwise distance measures (a cladogram or phylogram which may or may not represent actual evolutionary relationships). Clustal uses this to construct the alignment in the sequence order defined by the tree. Keep in mind that this tree is unrooted and represents only the most simplistic relationships of the very limited number of specimens examined in this exercise. A more complicated analysis requires a different program and careful consideration of the sequences included in the analysis.
4. Now produce a reduced version of your alignment by aligning only half (approximately) of each sequence below in ClustalW. Make sure you copy the front end of each sequence that includes the accession information. How does the alignment change (if at all)? How does the guide tree change?
5. What happens if you use only 10 bp sequences? Is there significant loss of signal?
6. If you want to go further with this analysis, you can access a program at www.splitstree.org and produce preliminary visualizations of trees and networks from your aligned sequences. I will do this for the class so you can see how it works, but it is not required for this exercise.
7. To find out the possible identity of the fungal unknown, use the BLAST search tool at GenBank (see below). This tool searches nucleotide databases for sequences similar to yours. These presumably represent potential close relatives of the unknown (perhaps even the same species as the unknown, if sequences have been deposited previously).
To BLAST search the sequences, visit the BLAST page of the NCBI website where GenBank is located. BLAST (Basic Local Alignment Search Tool) is a set of similarity search programs designed to explore all of the available sequence databases regardless of whether the query is protein or DNA. For a better understanding of BLAST you can refer to the BLAST Course which explains the basics of the BLAST algorithm, or to the NCBI BLAST tutorial. For our purposes, it will be sufficient to do the following:
8. For the fungal LSU search only, add the first five sequences returned by the BLAST search to the unknown and produce an alignment of the sequences in Clustal. Produce an NJ guide tree to see how these sequences cluster together.
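Steps 4 and 5 of the procedure ask you to resubmit shortened versions of each sequence while keeping the header line intact. If you would rather not trim the sequences by hand, the following Python sketch shows one way to do it. It is only a convenience sketch: the function names (read_fasta, truncate, write_fasta) and the two toy records are made up for illustration, and the 70-character line width simply matches the bear data in this handout.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def truncate(records, fraction=0.5, fixed_length=None):
    """Keep the front of each sequence: a fraction of it, or a fixed number of bases."""
    shortened = []
    for header, seq in records:
        keep = fixed_length if fixed_length is not None else int(len(seq) * fraction)
        shortened.append((header, seq[:keep]))
    return shortened


def write_fasta(records, width=70):
    """Format records as FASTA text, wrapping sequence lines at `width` characters."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        for start in range(0, len(seq), width):
            lines.append(seq[start:start + width])
    return "\n".join(lines) + "\n"


# Toy, made-up records standing in for the pasted sequence data.
demo = ">toy_one\nACGTACGTAC\nGT\n>toy_two\nACGTAC\n"
records = read_fasta(demo)
for header, seq in records:
    print(header, len(seq), "bp")        # quick check of sequence lengths
print(write_fasta(truncate(records)))    # first half of each sequence (step 4)
print(write_fasta(truncate(records, fixed_length=10)))  # first 10 bp (step 5)
```

The length report is also a quick way to check whether all of the sequences are the same length before you align them.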
Questions:
1. Which mitochondrial gene is used for this alignment? By looking at the alignment, can you tell whether the sequences all cover the same portion of the gene? Are they all the same length? What can cause the sequences to be unequal in length (why must they be gapped in the alignment)?
2. A quick look at the “Scores Table” in Clustal gives you an idea of the pairwise relationships of the sequences. Based on these scores, can you tell whether the brown bear sequences are the most similar to each other? Are there situations where brown bears are more closely related to some other species than to other brown bears?
3. The brown bear sequences represent various populations distributed throughout
4. Based on your BLAST search, what is the likely identity of the unknown (genus? species?)? Do you get different results when you use different genes to do the search? Why is this?
5. What information about the unknown is gained by creating an alignment using additional sequences from GenBank?
Bear Sequence Data:
>AY390359_A_melanoleuca_giant_panda
ATGATCAACATCCGAAAAACTCATCCATTAGTTAAAATTATCAACAACTCATTCATTGACCTTCCAACAC
CATCAAACATTTCAACATGATGGAACTTTGGGTCTCTGTTAGGAGTGTGTCTGATCTTGCAAATCTTAAC
AGGCTTATTTCTAGCCATACACTATACATCAGATACAGCTACAGCCTTTTCATCAGTCGCACACATTTGT
CGAGACGTCAACTATGGTTGATTTATCCGATATATACATGCCAATGGGGCCTCTATATTTTTTATCTGCC
TATTTATACACGTAGGGCGAGGCTTATACTATGGATCATACCTATTTCCAGAGACATGGAATATCGGAAT
TATTCTCCTACTTACAGTTATAGCCACAGCATTCATAGGGTATGTACTACCTTGAGGACAAATATCCTTC
TGAGGAGCAACCGTCATTACTAACCTACTATCAGCAATTCCTTACATTGGCACTAATCTAGTGGAGTGAG
TCTGAGGGGGTTTCTCCGTAGATAAAGCAACACTAACCCGATTTTTTGCTTTTCACTTTATCCTTCCATT
TATCATCTCAGCACTAGCAATAGTCCATCTATTATTCCTTCACGAAACAGGATCTAATAACCCCTCCGGA
ATTCCATCTGACCCAGACAAAATCCCATTTTACCCCTATCATACAATTAAAGACATCCTAGGCGTCCTAT
TTCTTGTCCTCGCCTTAATAACCCTGGCTTTATTCTCACCAGACCTGTTAGGAGACCCTGATAACTATAC
CCCTGCAAATCCACTAAGTACCCCGCCACATATTAAGCCTGAATGGTACTTTCTATTTGCCTACGCTATC
CTGCGATCTATTCCTAATAAACTAGGAGGGGTGCTAGCTCTAATCTTCTCTATTCTAATTCTAACTATTA
TTCCACTATTACATACATCCAAACAACGAAGCATGATATTCCGACCTCTAAGTCAATGCTTATTCTGACT
CCTAGTAGCAGACCTACTCACACTAACATGAATTGGAGGACAGCCAGTAGAACACCCCTTCATTATTATT
GGGCAATTGGCCTCTATTCTCTACTTTACAATTCTTCTAGTACTTATACCTATCACTAGCATTATTGAGA
ATAGCCTCTCAAAATGAAGA
>U18870_Ursus_arctos_GB01
ATGACCAACATCCGAAAAACCCACCCATTAGCTAAAATCATCAACAACTCATTTATTGACCTTCCAACAC
CATCAAACATCTCAGCATGATGAAACTTTGGATCCCTCCTTGGAGTGTGTTTAATTCTACAGATTCTAAC
AGGCCTGTTTCTAGCCATACACTATACATCAGACACAACCACAGCTTTTTCATCAGTCACCCACATTTGC
CGAGACGTTCACTACGGGTGAGTTATCCGATATGTACATGCAAATGGAGCCTCCATGTTCTTTATCTGCC
TATTCATGCACGTAGGACGGGGCCTGTACTATGGCTCATACCTATTCTCAGAAACATGAAACATTGGCAT
TATTCTCCTATTTACAGTTATAGCCACCGCATTTATAGGATACGTCCTACCCTGAGGCCAAATGTCCTTC
TGAGGAGCAACTGTCATCACCAATCTACTATCGGCCGTTCCCTATATCGGAACGGACCTGGTAGAATGAA
TCTGAGGGGGCTTTTCCGTAGATAAGGCGACTCTAACACGATTCTTTGCTTTCCACTTTATTCTCCCGTT
CATCATCCTAGCACTAGCAGCAGTCCACCTATTATTCCTACACGAAACAGGATCCAACAACCCCTCTGGA
ATCCCATCTGACTCAGACAAAATCCCATTCCACCCATACTATACAATTAAGGATATTCTAGGCGCCCTAC
TTCTCACCCTAGCCTTAGCAACCCTAGTCCTATTCTCGCCCGACTTACTAGGAGACCCTGACAACTATAT
CCCCGCAAATCCACTGAGCACCCCACCCCACATCAAACCCGAGTGGTACTTTCTATTTGCCTACGCTATC
CTACGATCCATCCCTAATAAACTAGGAGGAGTACTAGCACTAATTTTCTCCATTCTAATCCTAGCCCTCA
TTCCTCTTCTACACACGTCCAAACAACGAGGAATGATATTCCGGCCCCTAAGCCAATGCCTATTTTGACT
TCTAGTAGCAGACCTACTAACACTAACATGAATTGGAGGACAACCAGTAGAACACCCCTTCATTATTAT
GGACAACTAGCCTCCATTCTCTACTTTACAATCCTCCTAGTACTTATACCCACCGCTGGAATTATTGAAA
ACAACCTCTTAAAGTGGAGA
>U18886_Ursus_arctos_GB17
ATGACCAACATCCGAAAAACCCACCCATTAGCTAAAATCATCAACAACTCATTTATTGACCTTCCAACAC
CATCAAACATCTCAGCATGATGAAACTTTGGATCCCTCCTTGGAGTATGTTTAATTCTACAGATTCTAAC
AGGCCTGTTCCTAGCCATACACTATACACCAGACACAACCACAGCTTTTTCATCGGTCACCCACATTTGC
CGAGACGTTCACTACGGATGAGTTATCCGATATGTACATGCAAATGGAGCCTCCATCTTCTTTATCTGCC
TATTTATGCACGTAGGACGGGGCCTGTACTATGGCTCATACCTATTCTCAGAAACATGAAACATTGGCAT
TATTCTCCTATTTACAATTATAGCCACCGCATTTATAGGATACGTCCTACCCTGGGGCCAAATGTCCTTC
TGAGGAGCGACTGTCATCACCAATCTACTATCGGCCATTCCCTACATCGGAACGGACCTGGTAGAATGAA
TCTGAGGGGGCTTTTCCGTAGATAAGGCGACCCTAACACGATTCTTTGCTTTCCACTTTATTCTCCCGTT
CATCATCCTAGCACTAGCAGCAGTCCATCTATTGTTCCTACACGAAACAGGATCTAACAACCCCTCTGGA
ATCCCATCTGACTCAGACAAAATCCCATTCCATCCATACTATACAATTAAGGATATTCTAGGCGCCCTAC
TTCTCGCCCTAACCTTAGCAACCCTAGTCCTATTCTCGCCCGACTTACTAGGAGACCCTGATAACTATAC
CCCCGCAAATCCACTGAGCACTCCACCCCACATCAAACCCGAATGGTACTTTCTATTTGCCTACGCTATC
CTACGATCTATCCCTAATAAACTAGGAGGAGTACTAGCACTAATTTTCTCCATTCTAATCCTAGCCATCA
TTCCTCTTCTACACACGTCCAAACAACGAGGAATGATATTCCGACCCCTAAGCCAATGCCTATTTTGACT
TCTAGTAGCAGACCTACTAACACTAACATGAATTGGAGGACAACCAGTAGAACATCCCTTCATTATTATC
GGACAACTAGCCTCCATTCTCTACTTTACAATCCTCTTAGTACTTATACCTATCGCTGGAATTATCGAAA
ACAACCTCTTAAAGTGGAGA
>U18888_Ursus_arctos_GB19
ATGACCAACATCCGAAAAACCCACCCATTAGCTAAAATCATCAACAACTCATTTATTGACCTTCCAACAC
CATCAAACATCTCAGCATGATGAAACTTTGGATCCCTCCTTGGAGTATGTTTAATTCTACAGATTCTAAC
AGGCCTGTTCCTAGCCATACACTATACACCAGACACAACCACAGCTTTTTCATCGGTCACCCACATTTGC
CGAGACGTTCACTACGGGTGAGTTATCCGATATGTACATGCAAATGGAGCCTCCATCTTCTTTATCTGCC
TATTTATGCACGTAGGACGGGGCCTGTACTATGGCTCATACCTATTCCCAGAAACATGAAACATTGGCAT
TATTCTCCTATTTACAATTATAGCCACCGCATTTATAGGATACGTCCTACCCTGGGGCCAAATGTCCTTC
TGAGGAGCGACTGTCATCACCAACCTACTATCGGCCATTCCCTACATCGGAACGGACCTGGTAGAATGAA
TCTGAGGGGGCTTTTCCGTAGATAAGGCGACCCTAACACGATTCTTTGCTTTCCACTTTATTCTCCCGTT
CATCATCCTAGCACTAGCAGCAGTCCATCTATTGTTCCTACACGAAACAGGATCTAACAACCCCTCTGGA
ATCCCATCTGACTCAGACAAAATCCCATTCCATCCATACTATACAATTAAGGATATTCTAGGCGCCCTAC
TTCTCGCCCTAACCTTAGCAACCCTAGTCCTATTCTCGCCCGACTTACTAGGAGACCCTGATAACTATAC
CCCCGCAAATCCACTGAGCACTCCACCCCACATCAAACCCGAATGGTACTTTCTATTTGCCTACGCTATC
CTACGATCCATCCCTAATAAACTAGGAGGAGTACTAGCACTAATTTTCTCCATTCTAATCCTAGCCATCA
TTCCTCTTCTACACACGTCCAAACAACGAGGAATGATATTCCGACCCCTAAGCCAATGCCTATTTTGACT
TCTAGTAGCAGACCTACTAACACTAACATGAATTGGAGGACAACCAGTAGAACATCCCTTCATTATTATC
GGACAACTAGCCTCCATTCTCTACTTTACAATCCTCTTAGTACTTATACCTATCGCTGGAATTATCGAAA
ACAACCTCTTAAAGTGGAGA
>U18878_Ursus_arctos_GB09
ATGACCAACATCCGAAAAACCCACCCATTAGCTAAAATCATCAACAACTCATTTATTGACCTTCCAACAC
CATCAAACATCTCAGCATGATGAAACTTTGGATCCCTCCTTGGAGTATGTTTAATTCTACAGATTCTAAC
AGGCCTGTTCCTAGCCATACACTATACACCAGACACAACCACAGCTTTTTCATCGGTCACCCACATTTGC
CGAGACGTTCACTACGGATGAGTTATCCGATATGTACATGCAAATGGAGCCTCCATCTTCTTTATCTGCC
TATTTATGCACGTAGGACGGGGCCTGTACTATGGCTCATACCTATTCTCAGAAACATGAAACATTGGCAT
TATTCTCCTATTTACAATTATAGCCACCGCATTTATAGGATACGTCCTACCCTGGGGCCAAATGTCCTTC
TGAGGAGCGACTGTCATCACCAATCTACTATCGGCCATTCCCTACATCGGAACGGACCTGGTAGAATGAA
TCTGAGGGGGCTTTTCCGTAGATAAGGCGACCCTAACACGATTCTTTGCTTTCCACTTTATTCTCCCGTT
CATCATCCTAGCACTAGCAGCAGTCCATCTATTGTTCCTACACGAAACAGGATCTAACAACCCCTCTGGA
ATCCCATCTGACTCAGACAAAATCCCCTTCCATCCATACTATACAATTAAAGATATTCTAGGCGCCCTAC
TTCTCGCCCTAACCTTAGCAACCCTAGTCCTATTCTCGCCCGACTTACTAGGAGACCCTGATAACTATAC
CCCCGCAAATCCACTGAGCACTCCACCCCACATCAAACCCGAATGGTACTTTCTATTTGCCTACGCTATC
CTACGATCTATCCCTAATAAACTAGGAGGAGTACTAGCACTAATTTTCTCCATTCTAATCCTAGCCATCA
TTCCCCTTCTACACACGTCCAAACAACGAGGAATGATATTCCGACCCCTAAGCCAATGCCTATTTTGACT
TCTAGTAGCAGACCTACTAACACTAACATGAATTGGAGGACAACCAGTAGAACATCCCTTCATTATTATC
GGACAACTAGCCTCCATTCTCTACTTTACAATCCTCCTAGTACTTATACCTATCGCTGGAATTATTGAAA
ACAACCTCTTAAAGTGGAGA
>U18897_Ursus_arctos_GB28
ATGACCAACATCCGAAAAACCCACCCATTAGCTAAAATCATCAACAACTCATTTATTGACCTTCCAACAC
CATCAAACATCTCAGCATGATGAAACTTTGGATCCCTCCTTGGAGTATGTTTAATTCTACAGATTCTAAC
AGGCCTGTTCCTAGCCATACACTATACACCAGACACAACCACAGCTTTTTCATCGGTCACCCACATTTGC
CGAGACGTTCACTACGGGTGAGTTATCCGATATGTACATGCAAATGGAGCCTCCATCTTCTTTATCTGCC
TATTTATGCACGTAGGACGGGGCCTGTACTATGGCTCATACCTATTCTCAGAAACATGAAACATTGGCAT
TATTCTCCTATTTACAATTATAGCCACCGCATTTATAGGATACGTCCTACCCTGGGGCCAAATGTCCTTC
TGAGGAGCGACTGTCATCACCAATCTACTATCGGCCATTCCCTACATCGGAACGGACCTGGTAGAATGAA
TCTGAGGGGGCTTTTCCGTAGATAAGGCGACCCTAACACGATTCTTTGCTTTCCACTTTATTCTCCCGTT
CATCATCCTAGCACTAGCAGCAGTCCATCTATTGTTCCTACACGAAACAGGATCTAACAACCCCTCTGGA
ATCCCATCTGACTCAGACAAAATCCCATTCCATCCATACTATACAATTAAGGATATTCTAGGCGCCCTAC
TTCTCGCCCTAACCTTAGCAACCCTAGTCCTATTCTCGCCCGACTTACTAGGAGACCCTGACAACTATAC
CCCCGCAAATCCACTGAGCACTCCACCCCACATCAAACCCGAATGGTACTTTCTATTTGCCTACGCTATC
CTACGATCCATCCCTAATAAACTAGGAGGAGTACTAGCACTAATTTTCTCCATTCTAATCCTAGCCATCA
TTCCTCTTCTACACACGTCCAAACAACGAGGAATGATATTCCGACCCCTAAGCCAATGCCTATTCTGACT
TCTAGTAGCAGACCTACTAACACTAACATGAATTGGAGGACAACCAGTAGAACATCCCTTCATTATTATC
GGACAACTGGCCTCCATTCTCTACTTTACAATCCTCCTAGTACTTATACCCATCGCTGGAATTATCGAAA
ACAACCTCTTAAAGTGGAGA
>EU567096_U_maritimus_polar_bear
ATGACCAACATCCGAAAAACCCACCCATTAGCTAAAATCATCAACAACTCATTTATTGATCTTCCAACAC
CATCAAACATCTCAGCATGATGAAACTTTGGATCCCTCCTTGGAGTGTGTTTAATTCTACAGATTCTAAC
AGGCCTGTTTCTAGCCATACACTATACATCAGACACAACCACAGCTTTTTCATCAGTCACCCACATTTGC
CGAGACGTTCACTACGGGTGAGTTATCCGATATGTACATGCAAATGGAGCCTCCATGTTCTTTATCTGCC
TATTCATGCACGTAGGACGGGGCCTGTACTATGGCTCATACCTATTCTCAGAAACATGAAACATTGGCAT
TATTCTCCTATTTACAGTTATAGCCACCGCATTTATAGGATACGTCCTACCCTGAGGCCAAATGTCCTTC
TGAGGAGCGACTGTCATCACCAATCTACTATCGGCCATTCCCTATATCGGAACGGACCTGGTAGAATGAA
TCTGAGGGGGCTTTTCCGTAGATAAGGCGACTCTAACACGATTCTTTGCTTTCCACTTTATTCTCCCGTT
CATCATCCTAGCACTAGCAGCAGTCCACCTATTGTTCCTACACGAAACAGGATCCAACAACCCCTCTGGA
ATCCCATCTGACTCAGACAAAATCCCATTCCATCCATACTATACAATTAAGGATATTCTAGGCGCCCTAC
TTCTCACCCTAGCCCTAGCAACCCTAGTCCTATTCTCGCCCGACTTACTAGGAGACCCTGATAACTATAT
CCCCGCAAATCCACTAAGCACCCCACCCCACATCAAACCCGAGTGGTACTTTCTATTTGCCTACGCTATC
CTACGATCCATCCCTAATAAACTAGGAGGAGTACTAGCACTAATTTTCTCCATTCTAATCCTAGCCCTCA
TTCCTCTTCTACACACGTCCAAACAACGAGGAATGATATTCCGGCCCCTAAGCCAATGCCTATTTTGACT
TCTAGTAGCAGACCTACTAACACTAACATGAATTGGAGGACAACCAGTAGAACACCCCTTCATTATTATC
GGACAACTAGCCTCCATTCTCTACTTTACAATCCTCCTAGTACTCATACCCATCGCTGGAATTATTGAAA
ACAACCTCTTAAAGTGGAGA
>AF007937_U_americanus_black_bear
GAAACTTCGGATCCCTCCTCGGAGTATGTTTAGTACTACAAATTCTAACGGGCCTATTTCTAGCCATACA
CTACACATCAGATACAACTACAGCCTTTTCATCAATCACCCATATTTGCCGAGATGTTCACTACGGATGA
ATTATCCGATACATACATGCTAACGGAGCTTCCATGTTCTTTATCTGCCTGTTCATGCACGTAGGACGGG
GTCTGTACTATGGCTCATACCTACTCTCAGAAACATGAAACATTGGCATTATCCTCCTATTTACAGTTAT
AGCCACCGCATTCATAGGATATGTCCTGCCCTGAGGCCAAATATCCTTCTGAGGAGCAACTGTTATCACC
AACCTCCTATCAGCCATCCCCTATATTGGAACAGACCTAGTAGAATGGATCTGAGGGGGCTTTTCTGTGA
ATAAGGCAACTCTGACACGATTCTTTGCCTTCCACTTTATTCTTCCATTCATCATCTTGACACTAGCAGC
AGTCCACCTATTATTCCTACACGAAACAGGATCTAATAACCCCTCTGGAATCCCATCTGACTCAGACAAA
ATCCCATTTCATCCATATTATACAATTAAAGACGCCCTAGGCGCCCTACTTTTCATCCTAGCCCTAGCAA
CTCTAGTCCTATTCTCGCCTGACCTACTAGGAGATCCCGATAACTACACCCCCGCAAACCCACTGAGCAC
CCCACCCCACATCAAACCT
>U23554_Tremarctos_ornatus_spectacled_bear
ATGACCAACATCCGAAAAACTCACCCACTAGCTAAAATCATCAACAGCTCATTCATTGACCTCCCAACAC
CATCAAATATCTCAGCGTGATGAAACTTCGGGTCCCTTCTTGGGGTGTGCCTGATCCTACACATCCTAAC
GGGCCTATTCCTGGCCATACACTATACAGCAGACACGACTACAGCCTTCTCATCAGTCGCCCATATCTGT
CGAGACGTTAACTACGGATGAGTTATCCGATACATACACGCGAACGGAGCTTCAATATTCTTTATCTGCT
TGTTCATACACGTGGGACGGGGTCTGTATTACGGCTCATACCTATTCTCAGAAACATGAAACATTGGAAT
TATTCTCCTACTCACAATTATAGCCACAGCATTCATGGGGTACGTGCTGCCCTGAGGCCAAATATCCTTT
TGAGGAGCAACCGTCATCACCAATCTGCTATCAGCTATCCCCTACATTGGAACCGACCTAGTAGAATGAA
TCTGAGGTGGATTCTCAGTAGATAAAGCAACCCTTACCCGATTTTTCGCTTTTCACTTTATCCTTCCATT
CATTATTTTAGCACTAGCCATAGTCCACCTATTATTTCTTCACGAAACAGGATCCAACAATCCCTCTGGA
ATCTCATCGAACTCAGACAAAATCCCATTTCACCCTTACTATACAATTAAAGATATTCTAGGCGTCTTAC
TTCTTCTCCTAGCCCTGGTAACCCTAGTCCTATTCTCACCCGACTTACTAGGAGACCCCGACAACTACAC
CCCTGCAAACCCAGTGAGCACCCCACTACATATCAAGCCTGAATGGTACTTCTTATTTGCCTACGCCATT
CTACGATCTATTCCCAATAAATTGGGAGGAGTACTGGCCCTAATCTTCTCCATTCTAATCCTAGCTATCA
TTCCTCTGCTGCACACATCCAAACAACGAGGAATGATATTCCGACCTTTAAGCCAATGCCTTTTCTGGCT
TCTAGCAGCAGACTTACTAACACTAACATGAATCGGAGGACAACCAGTGGAACATCCTCTTGTTATCATC
GGACAGCTAGCCTCTATCCTCTACTTCACAATCCTCCTAGTACTTATACCCATCGCCGGAATCATTGAAA
ATAACCTCTCAAAGTGAAGA
>U23562_Melursus_ursinus_sloth_bear
ATGACCAACATCCGAAAAACCCACCCATTAGCTAAAATCATTAACAACTCACTCATTGACCTCCCAGCAC
CGTCAAACATCTCAGCATGATGAAACTTCGGATCCCTCCTCGGAGTGTGCTTAATTCTACAAATTCTAAC
AGGCCTATTTCTAGCCATGCACTATACATCAGACACAACCACAGCCTTTTCATCAGTCACCCATATCTGT
CGAGACGTCCACTACGGATGAATCATCCGATATATACATGCAAACGGGGCCTCCATATTCTTTATCTGCC
TATTCATGCACGTAGGACGGGGTCTGTACTATGGCTCATACCTATTCTCGGAGACATGAAACACCGGCAT
TATTCTCCTATTTACAGTCATAGCCACCGCATTCATAGGATACGTCCTACCCTGAGGCCAAATGTCCTTC
TGAGGAGCAACTGTCATCACCAATCTGCTATCGGCCATTCCCTATATTGGAGCGGACCTAGTAGAATGAA
TCTGAGGGGGGTTTTCCGTAGACAAGGCGACTCTAACACGATTCTTTGCCTTCCACTTTATCTTTCCATT
TATCATCCTAGCACTGGTAATAGTCCACCTATTGTTCCTACATGAAACAGGATCTAACAACCCCTCTGGA
ATCCCATCCAACTCAGACAAAATCCCATTTCACCCATATTATACAATTAAAGATATTATAGGCGCCTTAC
TTCTCATCCTAGCCCTGGCAACCCTAGTCCTATTCTCACCCGACTTACTAGGAGACCCCGACAACTACAC
CCCTGCAAACCCACTGAGCACCCCACCCCACATCAAACCCGAGTGGTACTTTCTATTTGCCTACGCTATC
CTACGATCCATCCCCAATAAACTAGGAGGGGTACTAGCACTAATTTTCTCCATCCTAATCCTAGCTATCA
TTCCCCTTCTACACACATCCAAACAACGAGGAATGATATTCCGGCCCCTAAGCCAATGCCTATTTTGACT
CCTAGTAGCAGACCTACTAACACTTACATGAATCGGAGGACAACCAGTAGAATATCCCTTCATCACTATT
GGACAACTAGCCTCCATCCTCTACTTCATAATCCTCCTAGTACTCATGCCCATCGCCGGAATCATTGAAA
ATAATCTCTCAAAGTGAAGA
>U23558_S_thibetanus_Asian_black_bear
ATGACCAACATCCGAAAAACCCATCCATTAGCCAAAATCATCAACAACTCACTCATTGATCTCCCAGCAC
CATCAAATATCTCAGCATGATGAAACTTTGGATCCCTCCTCGGAATATGCCTAATCCTACAGATTCTGAC
AGGCCTATTTCTAGCTATACACTACACATCAGACGCGACTACAGCCTTTTCATCAGTCGCCCATATTTGC
CGAGACGTCCATTACGGATGAATTATCCGATACATACATGCAAACGGAGCCTCCATGTTCTTCATCTGCC
TATTCATACACGTAGGACGGGGCTTGTACTATGGCTCATACCTACTCTCAGAAACATGAAACATTGGCAT
CATCCTCCTATTTACAGTTATAGCCACCGCATTCATAGGATATGTCCTACCCTGAGGCCAAATATCTTTC
TGAGGAGCGACTGTCATTACCAACCTCCTATCAGCCATTCCCTATATTGGAACGGACCTAGTAGAGTGAA
TCTGAGGGGGCTTTTCCGTAGATAAAGCAACCCTAACACGATTCTTTGCTTTCCACTTTATCCTTCCATT
TATCATCCTAGCACTAGCAGCAGTCCATCTATTGTTCCTACACGAAACAGGATCCAACAACCCCTCTGGA
ATCCCATCCGACTCGGACAAAATCCCATTCCACCCATACTATACAATTAAGGACGCCCTAGGCGCCCTAC
TTCTCATTCTAGCCCTAGCAACTCTAGTTCTATTCTCGCCCGACTTACTGGGAGACCCTGACAACTATAC
CCCCGCAAACCCACTGAGCACCCCGCCCCACATCAAGCCCGAGTGATACTTTTTATTTGCTTACGCCATC
TTACGATCCATCCCCAACAAACTAGGAGGAGTACTAGCGCTAATCTTCTCTATCCTAATCCTAGCCATTA
TCCCCCTTCTACACACATCCAAACAACGAGGAATAATGTTCCGACCCCTAAGCCAATGCCTATTTTGACT
CCTAGTAGCAGACCTACTAACACTAACATGAATCGGAGGACAACCAGTAGAACATCCCTTCATCATTATC
GGACAGCTAGCCTCCATCCTCTACTTCACAATCCTCCTGGTGCTCATGCCCATCGCTGGAATCATTGAAA
ACAATCTCTCAAAGTGAAGA
>Helarctos_malayanus_sun_bear
ATGACCAACATCCGAAAAACCCACCCATTAGCTAAAATCATTAACAACTCACTTATTGACCTCCCAGCAC
CATCAAACATCTCGGCGTGATGAAACTTCGGATCCCTCCTCGGAGTATGCTTAATCCTACAGATTATGAC
AGGCCTATTTCTAGCCATACACTATACATCAGACACAACCACAGCCTTTTCATCAATCACTCATATCTGC
CGAGACGTTCACTACGGATGAATTATCCGATATATACATGCAAACGGAGCCTCCATGTTCTTTATCTGCC
TATTCATGCACGTAGGACGGGGTCTGTACTATGGCTCGTACCTATTCTCAGAAACATGAAACATCGGTAT
TATCCTCCTATTTACAGTTATAGCCACCGCATTTATAGGATACGTCCTACCCTGAGGCCAAATGTCCTTC
TGAGGAGCAACTGTCATTACCAATCTCTTATCAGCCATCCCCTATATTGGAACGGACCTAGTAGAATGAG
TCTGAGGAGGCTTTTCCGTAGACAAGGCGACTCTAACACGATTCTTTGCCTTCCACTTTATCCTTCCGTT
CATCATCTTGGCACTAACAGCGGTCCACCTATTATTCCTACACGAAACAGGGTCCAACAATCCCTCTGGA
ATCCCATCTGACTCAGACAAAATCCCATTTCACCCGTACTATACAATTAAGGACATCCTAGGCGCCCTAC
TTCTTACCCTAGCCCTAACAACCCTAGTTCTATTCTCGCCCGACTTACTAGGAGACCCTGACAACTACAT
CCCCGCAAATCCATTGAGCACCCCACCCCACATCAAACCCGAATGGTACTTTCTATTTGCCTACGCTATC
CTACGATCCATCCCTAATAAACTAGGAGGAGTACTAGCTCTAGTCTTCTCTATCCTAATCCTAGCCATTA
TCCCCCTCTTACACACATCCAAGCAACGAGGAATGATATTCCGACCTCTGAGCCAATGCCTATTTTGACT
CCTAGTAGCAGACCTACTAACACTAACATGAATTGGAGGACAACCAGTAGAACATCCCTTTACCATTATC
GGACAACTAGCCTCCATTCTCTATTTCATAATCTTCCTAGTATTCATACCCATCGCTGGAATTATTGAAA
ATAACCTCTCAAAATGAAGA
Unknown fungal sequences:
>LSU_unknown
GGATTCCCCTAGTAACTGCGAGTGAAGCGGGAAAAGCTCAAATTTAAAATCTGGCAGGGTCCTCTCCGTCCGAGTTGTAATCTAGAGAAGCGCTATCCGT
GCCGGACCGTGTACAAGTCTCTTGGAACAGAGCGTCGCAGAGGGTGAGAATCCCGTCTTTGACACGGACTGCCGGTGCACTGTGATACGCTCTCAACGAG
TCGAGTTGTTTGGGAATGCAGCTCAAAACGGGTGGTAAACTCCATCTAAAGCTAAATATTGGCGAGAGACCGATAGCGAACAAGTACCGTGAGGGAAAGA
TGAAAAGAACTTTGGAAAGAGAGTTAAACAGTACGTGAAATTGTTGAAAGGGAAACGCTTGAAGTCAGTCGCGTCTGCCGGGGATCATCCTTCTCTCGAG
TCGGAGTACTTCCCGGTCGACGGGTCAGCATCAGTTTCGACCGTCGGATAAAAGCACGAGGAATGTGGCATCTCCGGATGTGTTATAGCCTCGGGTTGCA
TACGACGGTCGGGACTGAGGAACTCAGCACGCCGCGAGGCCGGGGTTCTCGAACCCACGTACGTGCTTAGGATGCTGGCGTAATGGCTTTAATCGACCCG
TCTTGAAACACGGACCAAGGAGTCTAACATGCCCGCGAGTGTTTGGGTGCAAAACCCGAGCGCGCAACGAAAGTGAAAGTTGAGATCTCTGTCGCGGAGA
GCATCGACGCCCAGACCAGACCTTTTGTGACGGATCTGCGGTCGAGCGTGTATGTTGGGACCCGAAAGATGGTGAACTATGCCTGAATAGGGCGAAGCCA
GAGGAAACTCTGGTGGAGGCTCGTAGCGATTCTGACGTGCAAATCGATCGTCGAATTTGGGTATAGGGGCGAAAGACTAATCGAACCATCTAGTAGCTGGTTCCTGCC
>ITS_unknown
AAGTCGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGGATCATTAATGAATTTAAACACGAGAGTTGGAAAGGGTTGCTGCTGGCCGAAAGGTATTCGTGCACGCCCCTCCTTCTCTCTGTGTTCATCTCCGAACCCCTGTGCACCCGTCGTAGGCCGAGCGATCGGCCTATGTTTTTTTCACAAACACCGTAAAGTTAAACGAATGTCATTCACGCAATGGGTCACTCCTCGAAAGAGGCCGGCGGCTCTTTGTTAAAATAAAAATAATACAACTTTCAACAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCCCTTTGGTATTCCGAAGGGCATGCCTGTTTGAGTGTCATTAAATTATCAACTCTGAAAGCTTTCGTGGCCCTTCAGAGCTTGGATCTTGGAGCGTGCCGGGGATCCCCATCCTCGGCTCCTCTTGAAATGCATAAGTGGAACCTCTACAACTGCAGATCTGTTCTGGCGTGATAATTATCTACGTCGCGGCAGAGAAGCGGGTTGGATCCGCCTACAACCGTCGCGTCTCTGCGACAAAACGGGGGCCAGTCCCCCTCATTGACAATTTGACCTCAAATCAGGTAG
>RPB2_unknown
CCGAATGGACACTATGGCCAATATCCTGTACTACCCTCAGAAACCTCTTGCGACGACACGTTCCATGGAGTACCTTAAGTTCCGGGAACTTCCAGCCGGT
CAAAACGCGATCGTTGCAATTCTTTGCTACAGCGGATACAACCAGGAAGATTCCGTTATTATGAATCAGAGCTCGATTGATCGAGGCCTTTTCCGCAGTA
TCTACTACCGCAGCTACATGGACCTCGAGAAGAAGAGTGGAATTCAACAGCTCGAGGAGTTCGAGAAGCCAACGCGGGATAACACGTTACGCATGAAACA
TGGAACGTACGACAAACTGGAGGATGATGGGTTAATCGCCCCTGGAACTGGTGTCCGAGGAGAAGACATTATCATCGGTAAAACGGCGCCGATTCCACCA
GACAGCGAAGAGCTTGGTCAACGGACACGAACCCACACGCGGAGAGATGTGTCAACACCCCTGAAGAGTACAGAAAGCGGTATAGTCGATCAGGTTCTGA
TCACGACGAATTCGGAAGGTCAAAAGTTTGTCAAGATTCGTGTTCGATCAAGTCGTGTTCCCCAAATCGGGGACAAATTTGCATCACGCCACGGTCAGAA
AGGAACTATCGGTATCACGTATCGACAAGAAGACATGCCATTCACCGCCGAAGGTATCGTTCCTGACATTATCATCAATCCCCACGCCATTCCTTCCCGC
ATGACGATCGGCCACTTGGTGGAATGCCTACTATCAAAAGTTGCAACTCTGATTGGCAACGAGGGTGATGCTACGCCCTTCACGGACCTCACCGTTGAGT
CGGTCTCAACTTTCTTGAGGCAAAAGGGGTACCAGTCACGCGGGCTGGAGGTGATGTTCCACGGGCACACGGGACGCAAGCTCCAGGCTCAGGNTTATCTCGGNCCTACGTACTACC
Literature:
Huson, D. H. and D. Bryant. 2006. Application of phylogenetic networks in evolutionary studies. Molecular Biology and Evolution 23: 254-267.
Talbot, S. L. and G. F. Shields. 1996a. A phylogeny of the bears (Ursidae) inferred from complete sequences of three mitochondrial genes. Molecular Phylogenetics and Evolution 5: 567-575.
Talbot, S. L. and G. F. Shields. 1996b. Phylogeography of brown bears (Ursus arctos) of