Atte Moilanen 1, Liselotte Sundström 2 & Jes Søe Pedersen 3


MATESOFT
A PROGRAM FOR GENETIC ANALYSIS OF MATING SYSTEMS
VERSION.0 DOCUMENTATION

Atte Moilanen 1, Liselotte Sundström 2 & Jes Søe Pedersen 3

1 Implementation and algorithms
ajmoilan@mappi.helsinki.fi
Department of Biological and Environmental Sciences, University of Helsinki

2 Concept and testing
liselotte.sundstrom@helsinki.fi
Department of Biological and Environmental Sciences, University of Helsinki

3 Algorithms, documentation and testing
JSPedersen@bi.ku.dk
Department of Population Biology, Institute of Biology, University of Copenhagen

November 2006
MateSoft.Documentation.(6).doc/JSP

Table of Contents

A. MATESOFT HELP
   Introduction
   Data Types and Analyses
   General Info on Input Format
   F Data: Deducing Queens from Known Offspring
   FQ Data: Deducing Fathers and Assigning Patrilines
   FQM Data: Mating Frequency Statistics
   QM Data: Mating Frequency Statistics
   File Configuration Menu
   Queen and Mate Deduction Menu
   Mating Frequency Statistics Menu
   Mating Frequency Statistics Output
   Troubleshooting and Special Use
   FAQ
   List of Example Files

B. DEDUCING QUEEN GENOTYPES: ALGORITHMS FOR F DATA ANALYSIS
   Power Analysis
   Deducing Putative Queen Genotypes
   Worked-out Examples

C. DEDUCING MATE GENOTYPES AND ASSIGNING PATRILINES: ALGORITHMS FOR FQ DATA ANALYSIS
   Defining Possible Fathers
   Selecting Putative Fathers (Parentage Analysis)
   Output Patriline Assignment and Putative Mates
   Some Simple Worked-out Examples

D. ESTIMATION OF MATING FREQUENCY STATISTICS
   Estimation of the Paternity Skew c
   Estimation of the Nonidentification Error f
   Estimation of the Proportion of Double-Mated Queens D_est and Effective Mate Number m_e,p
   Estimations Based on Sperm Typing
   Statistics for Data with Unlimited Number of Queen Matings
   Statistics for Data with Only Single Matings Detected

E. RELEVANT SOFTWARE AND LITERATURE
   Current Software for Parentage Analysis
   Literature

A. MATESOFT Help

1. Introduction

MATESOFT is a program for the analysis of mating systems in male-haplodiploid organisms based on co-dominant genetic marker data. It is intended for studies of hymenopteran social insects, where queens may have mated with one or several males and males may sire a variable proportion of the queen's offspring. The genetic data can be queen genotypes, genotypes of single-queen female offspring, and genotypes of sperm stored in the queen's spermatheca.

The current version of MATESOFT has the following main features:

- Deduction of possible queen genotypes from offspring data when the mother could not be analysed (so-called F data; F stands for female offspring).
- Deduction of genotypes of putative males mated with the queen from offspring and queen data (FQ data).
- Assignment of offspring to patrilines corresponding to the queen's putative mates (also FQ data).
- Estimation of mating frequency statistics from data where the queen can be either single or double mated. The statistics calculated are paternity skew, proportion of multiple mated queens, and average effective mate number, using the algorithms in Pedersen and Boomsma (1999a). The input is either genotypes of queens, their putative mates and offspring assigned to patrilines (FQM data), or genotypes of queens and sperm from their spermathecae (QM data).
- Estimation of summary mating frequency statistics from data where the queen can have any number of matings (FQM data). These include the sum of squared paternity contributions, the frequency distribution of observed mate number, and the average observed mate number.

Deduction of parental genotypes and patriline assignment can be done for any number of queen matings. There are currently no general procedures available for estimating the effective mate number and related statistics in systems where a queen may have an arbitrary number of mates, each contributing an unknown proportion to the brood. We hope, however, to include methods for this in a future version of MATESOFT.

System requirements: MATESOFT runs on any PC under a 32 bit version of Microsoft Windows and will also work under PC-Windows emulation on a Macintosh.

Availability: MATESOFT can be downloaded free of charge from the MATESOFT homepage.

Registration: Please register by using the form on the MATESOFT homepage. This ensures that we can keep you updated about bugs and new releases.

Support: If you have a problem or question, please first have a close look at the example files and check the FAQ section. If you remain puzzled, contact Atte Moilanen about possible bugs or how to make the program work, or Jes Søe Pedersen or Lotta Sundström about any other issue.

Citation as publication (preferred): Moilanen A, Sundström L & Pedersen JS (2004) MateSoft: a program for deducing parental genotypes and estimating mating system statistics in haplodiploid species. Molecular Ecology Notes, 4.

Citation as software (alternative): Moilanen A, Sundström L & Pedersen JS (2004) MateSoft: a program for genetic analysis of mating systems.0. Institute of Biology, University of Copenhagen, Copenhagen. Available from the MATESOFT homepage.

Acknowledgements: We thank the beta users for their feedback on software performance, and in particular Koos Boomsma, Elisabeth Brunner, Michael Haberl, Annette Bruun Jensen, Daniel Kronauer, Cathy Liautard, Alexandra Schrempf, Christoph Strehl, Seirian Sumner, and Palle Villesen. The development of this software was supported by the Carlsberg Foundation (J.S.P.), the Swiss National Science Foundation, the Danish National Science Research Council (J.S.P.), the Academy of Finland (L.S. and the Spatial Ecology Programme), and the FW5 EU Research-training network INSECTS (contract HPRN-CT ).

The present Documentation is divided into five chapters. Chapter A provides the information needed by most users, including the formats of input and output data and explanations of the various program menus. The three subsequent chapters give a detailed account of the algorithms and estimations performed by the program. Finally, Chapter E lists other available software for parentage and related analysis along with relevant literature. Furthermore, a set of example files is included in the MATESOFT distribution package.

2. Data Types and Analyses

To estimate mating frequency statistics from brood data you need to know the following: population allele frequencies, offspring genotypes, queen genotypes, male genotypes, and the sire of each offspring.
If the queen genotypes are not known, MATESOFT can deduce them from the offspring data, assuming the lowest number of queen mates that can explain the genotype array among offspring. Furthermore, if genotypes of putative mates have not been deduced by the user, MATESOFT can do the analysis and assign offspring to patrilines, thereby completing the information needed for estimating mating frequency statistics. The deduction and handling of parental genotypes are less trivial than often assumed in such studies, and the procedures implemented in MATESOFT replace error-prone analyses previously done by hand. When the statistics are based on sperm from the queen's spermatheca, only queen and sperm genotypes are needed for the estimations, in addition to the population allele frequencies.

In overview, MATESOFT handles four different data types:

- F data: Genotypes of female offspring sorted in groups of sisters.
- FQ data: As F data, but also including actual or deduced queen genotypes.
- FQM data: As FQ data, but also including deduced mate genotypes and assignment of offspring to putative patrilines.
- QM data: Genotypes of queens and the sperm stored in their spermathecae. Contains no offspring genotypes.

These data types are used in the following analyses, where each arrow corresponds to processing in MATESOFT with input and output files:

    F.in → F.out
    FQ.in → FQ.out
    FQM.in → FQM mating frequency statistics
    QM.in → QM mating frequency statistics

In F and FQ analyses two output files are produced. One, referred to as the extended output data file, contains the original data plus the further information deduced by the analysis. This file can usually be applied directly as the input file at the following step, but in any case it should be inspected, and modifications may be needed before proceeding. The other file, simply referred to as the output file, saves the analysis details also given in the screen output. A further description can be found in the relevant sections, and example files are provided for all cases.

Estimations done by MATESOFT are based on the following assumptions:

- Loci are neutral and unlinked.
- Population allele frequencies are the same for males and females.
- Queens are not related to their mates, i.e. there is no regular inbreeding.
- Multiple mates of the same queen are not related.

3. General Info on Input Format

This section gives information on the format of the input file that applies to all data types. Please refer to the file {Ex3-FQM.in.txt} for an example of most of the features mentioned.

To the extent possible, MATESOFT uses the common file format of RELATEDNESS and KINSHIP.3 (both by Keith Goodnight and Dave Queller, see Section E.), so that data can be transferred between these programs with only moderate file modifications. In particular, genotypes can be given in the same format, and through a configuration dialogue box the user can control how the individual data are loaded by the program.

The input file is plain text and can be prepared in any text editor, or in a spreadsheet like Excel by using the "save as text" option. The file is organised as a spreadsheet with rows and columns. Any row starting with an asterisk (*) is treated as a comment and simply disregarded by the program.

IMPORTANT: all other rows are read with any white-space (space and/or tab stop) delimiting the columns. Empty columns are only allowed at the end of a row. If other columns contain no information, they should be filled with an asterisk as place holder, e.g. Queen4<tab>*<tab>*<tab>30/36 in the case with two empty columns before the genotype. It follows that the first column cannot be empty, as the row would otherwise be interpreted as a comment line. Furthermore, no spaces are allowed in variable names, e.g. use Queen4 instead of Queen 4. Keeping these rules makes it easy to inspect the file for formatting errors and ensures that MATESOFT counts the columns correctly.
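The row rules above can be sketched in a few lines of Python. This is only an illustration of the stated conventions (comment rows, white-space delimiters, asterisk place holders), not MATESOFT's own loader; the function name `parse_rows` is hypothetical.

```python
def parse_rows(text):
    """Split input text into rows of columns, following the rules above.

    Assumptions (illustrative only): rows starting with '*' are comments,
    columns are delimited by any white-space, and a lone '*' column is a
    place holder for an empty value.
    """
    rows = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("*"):
            continue  # blank line or comment row
        columns = [None if col == "*" else col for col in line.split()]
        rows.append(columns)
    return rows


sample = "*Ind-ID K-Group Class Genotype\nQueen4\t*\t*\t30/36\n"
print(parse_rows(sample))  # [['Queen4', None, None, '30/36']]
```

Because a row whose first column is a place holder begins with an asterisk, such a row is skipped as a comment, which matches the remark that the first column cannot be empty.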

The input is divided in two sections: one for group data and one for individual genotype data.

3.1 Group Data Section

The group data section has three subsections, starting with the lines Demes, K-Groups and Loci and allele frequencies, respectively, and each concluding with the line end. IMPORTANT: The identification of these lines is case-sensitive and the text should be entered exactly as shown here. The first part of the example file {Ex3-FQ.in.txt} illustrates the general lay-out:

    *MateSoft Data File: Fictive Sample Data #3
    *The dataset has two demes, three loci analysed, single/double mating of queens
    *Group data section starts below this line
    Demes
    IslandA
    IslandB
    end
    *Group Deme Included
    K-Groups
    IAN IslandA
    IAN IslandA
    IAN3 IslandA
    IAN4 IslandA
    IAN5 IslandA
    IAN6 IslandA
    IAN7 IslandA
    IAN8 IslandA
    IBN IslandB
    IBN IslandB
    IBN3 IslandB
    IBN4 IslandB
    IBN5 IslandB
    IBN6 IslandB
    IBN7 IslandB
    end
    *
    Loci and allele frequencies
    *Locus IslandA IslandB
    L-Vsp0b
    L-Ric
    L-PGM
    f
    s
    end
    *

Demes lists the names of all populations to which the groups (see below) belong. For each group the allele frequencies of its deme will be used as the population reference in all relevant calculations. Hence, it functions the same way as the deme option in RELATEDNESS. A deme name should be given even if all groups come from the same population.

K-Groups lists all group names used by the individuals in the genotype data section. A group comprises all individuals, actual or putative, pertaining to a single queen, her male mates, and her offspring. This is the important unit for estimating the mating frequency statistics and is analogous to the group level in RELATEDNESS. The second column shows to what deme the group belongs. Groups can be included or excluded in the subsequent analysis of the data by assigning 1 (include) or 0 (exclude) in the third column.

As the name tells, Loci and allele frequencies gives the names of all loci studied and the frequency of all alleles found in each deme. For each locus, the first row starts with the locus name preceded by L-, e.g. L-Loc to indicate locus Loc. The second and subsequent columns give the sample size per deme as the number of diploid genomes analysed at this locus (i.e. diploid and haploid individuals counted as 1 and 0.5, respectively). Please note that the sample size refers to the individuals analysed for estimates of allele frequencies and not to the individuals in the present dataset. Ideally, the estimated allele frequencies for the background population should be based on a large sample of unrelated individuals different from the ones included in the data file. Data for multiple demes should be given in the same order as in the deme list above. The rows below the locus name give the names and frequencies of each allele, deme by deme. The information for all loci is given this way, one locus after another.

The variable names for demes, groups, loci, and alleles can be any combination of up to eight letters and figures (case-sensitive). Longer names will be truncated to eight characters by the program when loading the data. Spaces and punctuation characters are not allowed, i.e. F 7, F-7, F_7, and F.7 are invalid names.

3.2 Individual Genotype Data Section

The section with individual data starts with the line Individual genotype data section and concludes with the line end, both case-sensitive. The second row contains labels for the variables as column headings, and subsequent rows give the individual data. The individuals can be given in any order, as the data in each row are loaded independently and sorted by the program. The format is illustrated by a stretch of the FQ data in {Ex3-FQ.in.txt}:

    Individual genotype data section starts below this line
    Ind-ID K-Group Class AltQ/M Alt-P Vsp0b Ric45 PGM
    IANQ IAN Q.0000 /4 60/7 s/s
    IANF0 IAN F * * /0 7/8 f/s
    IANF0 IAN F * * / 63/7 f/s
    IANF03 IAN F * * 4/0 60/8 f/s
    IANF04 IAN F * * 4/0 60/8 f/s
    IANF05 IAN F * * / 60/63 f/s
    { }
    IAN3Q IAN3 Q.0000 /8 60/69 f/s
    IAN3F0 IAN3 F * * 8/8 63/69 f/s
    IAN3F0 IAN3 F * * 8/8 60/63 f/s
    IAN3F03 IAN3 F * * /8 60/63 f/f
    IAN3F04 IAN3 F * * 8/8 63/69 f/s
    IAN3F05 IAN3 F * * 8/8 60/63 f/f
    IAN3F06 IAN3 F * * ?/? 63/69 f/f
    IAN3F07 IAN3 F * * 8/8 63/69 f/s

and by some of the FQM data in {Ex-FQM.in.txt}:

    Individual genotype data section starts below this line
    * Here starts computer generated section of queen genotypes
    *Male genotypes in the below section are computer generated
    *Ind-ID K-Group Class AltQ/M Alt-P Loc
    Q G0 Q.0000 aa
    M G0 M /.0000 a
    F00 G0 F * * aa
    F00 G0 F * * aa
    F003 G0 F * * aa
    F004 G0 F * * aa
    F005 G0 F * * aa
    { }
    Q0 G08 Q ab
    Q G08 Q cd
    M4 G08 M /.0000 c
    M5 G08 M /.0000 d
    M6 G08 M /.0000 a
    M7 G08 M /.0000 b
    F085 G08 F * * ac
    F086 G08 F * * bc
    F087 G08 F * * bc
    F088 G08 F * * bc
    F089 G08 F * * ad
    F090 G08 F * * ac
    F09 G08 F * * ad
    F09 G08 F * * bc
    F093 G08 F * * ac
    F094 G08 F * * bd
    F095 G08 F * * bd
    F096 G08 F * * ac

Like the individuals, variables can be in any order, and columns may contain data not used by MATESOFT, as the user will tell the program what columns to read in and what variables they represent. For the same reason the actual column headings used in the file don't matter. In the following list of variables, R indicates that an analogous variable is used in RELATEDNESS (small r if it can be used as a demographic variable in this program), and the numbers 1 to 4 show whether the variable is relevant for the F, FQ, FQM or QM data type, respectively:

    Variable   Type     Explanation
    Ind-ID     1234 R   Individual ID. A unique label for each individual in the data set. Individuals having several alternative genotypes listed (see below) should still have a unique ID for each of their alternatives, even though the alternatives represent the same, true individual.
    K-Group    1234 R   Group. Use the same name for all individuals pertaining to the same queen. All group names should be given in the list in the group data section.
    Class      1234 r   Class. Type of individual, being either queen (Q), mate (M) or female offspring (F). Only capital letters are allowed. There is no distinction between putative male mates and sperm scored.
    AltQ/M     23 r     Alternative Queen/Mate. This is used to distinguish between possible alternative genotypes of the same, true individual. Read more about this variable in the sections on FQ and FQM data.
    Alt-P      23       Probability of Alternative. This is the weighted probability of the alternative, so that all probabilities for a true individual sum to one.

    Locus#     1234 R   Locus Name #. Loci should be given in the same order as in the list of loci and allele frequencies in the group data section.
    PatriQ#    3 r      Patrilines for Queen Alternative #. Every mate and his putative offspring (i.e. a patriline) are assigned the same number. There is one column of such paternity assignments for each alternative genotype of the queen (#). See the section on FQM data for details.

As for demes, groups, loci, and alleles, variable names for individual IDs can be any combination of up to eight letters and figures, case-sensitive and not including spaces or punctuation characters.

The variables AltQ/M and Alt-P are not relevant for offspring, and PatriQ# has no use in queens and mates. In these cases, and if the variables are not in the last columns, * should be entered as a place holder to ensure correct loading of data.

Diploid genotypes for queens and offspring are entered as the names of the two alleles scored, separated by a special character, e.g. 56/59. Haploid genotypes for male mates are entered simply as the allele scored. All alleles in the genotypic data should be defined with their name and population (deme) frequencies in the group data section. A missing genotype is indicated by ?/? or just ?, independent of the ploidy of the individual. Having only one of two alleles scored for diploids, like 56/?, is not allowed. Both the character used as allele delimiter and the missing value indicator can be defined by the user in the file configuration menu, the default characters being / and ?, respectively. Alternatively, no allele delimiter may be used if all alleles are given by single characters, as in the example above from {Ex-FQM.in.txt}.

The individuals can be given in any order, and the data rows can be separated by comment lines (preceded by *) for easier inspection of the file.
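As an illustration of the naming and genotype conventions just described, the following Python sketch checks a variable name and parses one genotype field. It is a reading of the rules in this section under stated assumptions, not the program's actual code; the helper names `normalise_name` and `parse_genotype` are hypothetical.

```python
import re

# Letters and figures only; spaces and punctuation are not allowed.
VALID_NAME = re.compile(r"[A-Za-z0-9]+")


def normalise_name(name, max_len=8):
    """Reject names with invalid characters; truncate long names to eight."""
    if not VALID_NAME.fullmatch(name):
        raise ValueError(f"invalid name: {name!r}")
    return name[:max_len]


def parse_genotype(field, ploidy, delimiter="/", missing="?"):
    """Return a tuple of alleles, or None for a missing genotype.

    Assumes the default delimiter '/' and missing indicator '?'. An empty
    delimiter means every allele is a single character.
    """
    if field == missing or field == missing + delimiter + missing:
        return None  # missing genotype, independent of ploidy
    if ploidy == 1:
        return (field,)  # haploid mate: simply the allele scored
    alleles = field.split(delimiter) if delimiter else list(field)
    if len(alleles) != 2 or missing in alleles or "" in alleles:
        raise ValueError(f"invalid diploid genotype: {field!r}")
    return tuple(alleles)


print(parse_genotype("56/59", ploidy=2))              # ('56', '59')
print(parse_genotype("ab", ploidy=2, delimiter=""))   # ('a', 'b')
print(parse_genotype("?/?", ploidy=2))                # None
```

A half-scored diploid genotype such as 56/? raises an error in this sketch, mirroring the rule that such entries are not allowed.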
The following size limitations for the input data apply:

    Number of                               Limit
    Demes                                   *
    Loci                                    00
    Alleles per locus                       00
    Groups (queens)                         *
    Offspring per group                     4000
    Queen alternatives per group            00
    Mates per group                         000
    Alternative patrilines per offspring    00
    Individuals in total                    *

    *Limited only by the available memory in your computer.

These limitations should not be of any practical importance.

4. F Data: Deducing Queens from Known Offspring

The input for F analysis is genotypes of female offspring organised in groups of sisters (F data), and MATESOFT outputs a modified data file with possible genotypes of mother queens appended. The algorithms applied are described in Chapter B.

The input data should include the following variables in the individual genotype data section: Ind-ID, K-Group, Class, and Locus#.

The deduction of queen genotypes works for any number of fathers that have sired the offspring in a brood. If the maternal genotype cannot be deduced unequivocally, alternative queen genotypes are listed in the output with associated probability weights. See the section on FQ data for more about alternative genotypes and their probabilities. In case offspring genotypes can only be explained by a combination of different mothers, the brood is excluded from further analysis and a warning about polygyny is given.

4.1 The Power of Correct Deduction of Queen Genotypes

The algorithm for deducing queen genotypes assumes that a sufficient number of offspring is analysed per brood, so that at least one copy of both queen alleles is present in the brood. MATESOFT calculates the probability that this assumption is true, both per group and for the over-all data, and outputs the values, e.g. "Your power of correctly deducing all queen genotypes is" (see section B.). Groups with low deduction power should be omitted from further analysis.

4.2 Narrow and Broad Deduction of Queen Genotypes

In the menu launching the F analysis (section 9) the user decides to apply one of two algorithms for deducing the queen genotypes: the narrow deduction option (default) or the broad deduction option. In both cases the algorithms work locus by locus to construct the maternal genotype from offspring data, but they differ in their assumption regarding the number of times the queen may have mated.

The narrow deduction option always assumes single mating of the queen when this can explain the offspring genotypes at the locus considered. However, data on other loci may show that the queen is in fact multiple mated, meaning that this assumption was too restrictive and alternative queen genotypes are possible. This option will always subsequently lead to the smallest number of mates possible and to correct assignment of patrilines, although the algorithm may give wrong queen and mate genotypes at single loci.

The broad deduction option in principle assumes that single and multiple matings are equally likely and allows for all possible queen genotypes to be deduced. If it turns out from the analysis of other loci that multiple mating was not needed to explain the data, the FQ analysis will include some queen alternatives with too high mate numbers. For parsimony, it may be preferred to consider only the queen alternative(s) with the smallest number of mates. In that case, queen alternatives with more than the minimum number of mates should be removed manually by the user from the extended output data before further analysis. Alternatively, if it turns out that the queen is in fact multiple mated, the correct queen genotype will be included among the alternative queen genotypes with associated mates. The algorithms are described in section B..

The general advice on what deduction option to apply is: if the exact genotypes of queens and fathers are of importance for the study, then use the broad deduction and be critical about the possible number of mates; if the parental genotypes are of no importance, go for the narrow deduction instead.
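The manual pruning step mentioned for the broad deduction option (keeping only the queen alternatives with the fewest mates) can be sketched as follows. The input structure, a mapping from queen alternative label to number of putative mates, and the function name are assumptions made for illustration; MATESOFT itself leaves this step to the user.

```python
def keep_most_parsimonious(mates_per_alternative):
    """Keep only the queen alternatives with the smallest mate number.

    mates_per_alternative: dict mapping a queen alternative label to the
    number of putative mates deduced for it (hypothetical structure).
    """
    fewest = min(mates_per_alternative.values())
    return {
        label: n_mates
        for label, n_mates in mates_per_alternative.items()
        if n_mates == fewest
    }


# Three queen alternatives for one group; alternative 2 needs an extra mate.
print(keep_most_parsimonious({1: 1, 2: 2, 3: 1}))  # {1: 1, 3: 1}
```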

4.3 Checking Mendelian Segregation of the Queen's Alleles

If the putative queen is heterozygous at a given locus, MATESOFT calculates the probability that the queen's alleles follow Mendelian segregation in the offspring and gives the value in the output file. Refer to section B..3 for the formulae. These probabilities should be inspected carefully before further analysis, as low values may indicate that important assumptions of the analysis are violated. One possibility is that more than one mother has contributed to the offspring (polygyny) although monogyny is assumed. Another possibility is that the queen is mated more times than the minimum number needed to explain the offspring genotypes.

An obvious example of such violation is the following: Consider ten offspring of genotypes,,,,,,,,, and 3. As monogyny is assumed, the queen genotype will be deduced to or 3 (queen type..3.), although the probability of such segregation of the queen's alleles in combination with the mate's allele is as low as $\frac{10!}{9!\,1!}(0.5)^{10} \approx 0.01$. Rather, the brood has several mothers, with the last offspring having a different mother than the first nine.

The output is in the following format, here with suggestions of what action to take if the probability is regarded as too low to be acceptable:

    Group G0, Locus Loc Queen type...a : Mendelian probability for only single mating of queen.

Interpretation and action: Single mating can explain the data. However, if the Mendelian probability is too low, then allow for more matings by applying the broad instead of the narrow deduction option in a new F analysis.

    Group G04, Locus Loc Queen type...4a : Mendelian probability for only double mating of queen.

Interpretation and action: Double mating can explain the data. However, if the Mendelian probability is too low, then allow for more matings by applying the broad instead of the narrow deduction option in a new F analysis.

    Group G08, Locus Loc Queen type.. : Mendelian probability for monogynous group.

Interpretation and action: A single queen mother can explain the data. However, if the Mendelian probability is too low, then inspect the offspring genotypes to detect the offspring of a possible alien mother. Either purge the data of these offspring or, as the safest option, skip this group in further analysis.

The Mendelian probability can't be applied as a P value in statistical hypothesis testing, where values < 5% normally would lead you to discard the null hypothesis. What to adopt as critical values for the Mendelian probabilities should be based on biological more than statistical considerations, i.e. you should ask the question: how biologically likely is it that my first assumptions about the number of mothers and fathers were correct? For example, if your experimental setup makes it biologically impossible that more queens are involved, then you should accept the analysis even when a low probability for a monogynous group is the result.
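The arithmetic behind the ten-offspring example in this section can be reproduced with a single binomial term. The sketch below assumes that each offspring inherits either maternal allele with probability 0.5; it covers only that one term, whereas the full formulae used by MATESOFT are those of Chapter B. The function name is hypothetical.

```python
from math import comb


def segregation_probability(n_offspring, n_with_first_allele, p=0.5):
    """Binomial probability of one particular split of the queen's two
    alleles among her offspring, assuming Mendelian segregation (p = 0.5)."""
    k = n_with_first_allele
    return comb(n_offspring, k) * p**k * (1 - p) ** (n_offspring - k)


# Nine offspring carry one maternal allele and a single offspring the other.
print(segregation_probability(10, 9))  # 0.009765625
```

A value this small is what prompts the suspicion, described above, that the tenth offspring has a different mother.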

4.4 The Extended Output Data File

The extended output data file can be used as the input data file for the FQ analysis. Inspect the file in advance, and modify this or the configuration file according to the subsequent analysis. Only relevant variables from the original data are included in the new genotype data section, so the position of columns may have changed. The default order of variables in the extended data file from F analysis is: Ind-ID, Group, Class, AltQ/M, Alt-P, Locus#.

The files {Ex-F.in.txt}, {Ex4-F.in.txt}, {Ex-F.out.txt} and {Ex4-F.out.txt} exemplify the input and output data from F analysis.

5. FQ Data: Deducing Fathers and Assigning Patrilines

The input for FQ analysis is genotypes of queens and their female offspring organised in groups (FQ data). MATESOFT outputs a modified data file with possible genotypes of the queens' mates and the patriline assignment of each offspring appended. The relevant algorithms can be found in Chapter C.

When both queens and offspring are analysed genetically, the FQ data file is prepared by the user. When only offspring are scored, the input file is produced by MATESOFT in the F analysis by appending putative queen genotypes.

The input data should include the following variables in the individual genotype data section: Ind-ID, K-Group, Class, AltQ/M, Alt-P, and Locus#.

When the queen genotype is deduced from the offspring, several multilocus genotypes may be compatible with the brood data. If there are no such alternatives, the only possible genotype is just labelled 1 under the variable AltQ/M (for alternative queen or mate genotypes), with the associated probability 1.0000 for the variable Alt-P. In the case of alternative queen genotypes, these are labelled by ordinals, i.e. 1, 2, 3 and so on, with their probabilities summing to unity. The probability calculation in Alt-P is based on the population frequencies of the alleles involved and/or the allele segregation among the offspring. These variables are generated by MATESOFT for the extended data file when processing F or FQ data.

MATESOFT will deduce genotypes of paternal males and assign the offspring to patrilines for any number of queen matings. If an offspring genotype is incompatible with the genotype of the queen analysed, a warning about possible polygyny is given in the output. Such offspring, or the complete group they belong to, should be removed from the data set by the user before further analysis.

The output data file can be used for estimating mating frequency statistics. Inspect the file in advance, and modify this or the configuration file according to the subsequent analysis. Only relevant variables from the original data are included in the new genotype data section, so the position of columns may have changed. The default order of variables in the extended data file from FQ analysis is: Ind-ID, Group, Class, AltQ/M, Alt-P, Locus#, PatriQ#.

The files {Ex-FQ.in.txt}, {Ex-FQ.in.txt}, {Ex4-FQ.in.txt}, {Ex-FQ.out.txt}, {Ex-FQ.out.txt}, and {Ex4-FQ.out.txt} exemplify the input and output from FQ analysis.

14 4 6. FQM Data: Mating Frequency Statistics The main data type for estimation of mating frequency statistics is genotypes of queens, their putative mates and their female offspring organised in groups (FQM data). MATESOFT outputs summary statistics for each group and over-all estimates of average paternity skew, proportion of multiple mated queens, and effective mate number. The input file can either be prepared by the user or produced by MATESOFT in the FQ analysis. The input data should include the following variables in the individual genotype data section: Ind-ID, K-Group, Class, AltQ/M, Alt-P, Locus#, and PatriQ#. As for queens in FQ data, alternative genotypes of putative mates that may occur in FQM data are indicated by use of the AltQ/M variable. All mates are labelled in the format A/B where A is the queen alternative he may have mated and B is the label used for him and his associated patriline, both given by ordinals. That is, / designates the mate that sired the offspring of patriline, given the queen is alternative. Contrary to queens, mates being alternatives to each other have identical AltQ/M values but are distinguished by the program by their different individual IDs. If not present in the input file, this variable is generated by MATESOFT for the output data when processing F or FQ data. This notation is illustrated by data from the file {Ex-FQM.in.txt}. First a simple case Individual genotype data section starts below this line * Here starts computer generated section of queen genotypes *Male genotypes in the below section are computer generated *Ind-ID K-Group Class AltQ/M Alt-P loci... Q G0 Q.0000 aa M G0 M /.0000 a where both the genotypes of the queen and her single mate are scored unambiguously, i.e. with no alternatives. Then a more complex example *Ind-ID K-Group Class AltQ/M Alt-P loci... 
{ } Q G09 Q ab Q3 G09 Q 0.57 ac Q4 G09 Q bc M8 G09 M /.0000 c M9 G09 M / a M0 G09 M / b M G09 M /.0000 b M G09 M / a M3 G09 M / c M4 G09 M 3/ b M5 G09 M 3/ c M6 G09 M 3/.0000 a where the double mated queen can be one of three genotypes (each with associated probability), and one of the queen s two putative mates has two alternative genotypes. For example, if Q3 is the true maternal genotype (queen alternative ), she is mated to M siring patriline and either M or M3 siring patriline. Finally about the patriline variable PatriQ#. All offspring possibly sired by a putative male mate (i.e. a patriline) are labelled with the same number, and ordinals are assigned to MateSoft.Documentation.(6).doc/JSP

15 5 patrilines in the order that these are encountered among the offspring, i.e. the first patriline encountered is labelled, the second, etc. This numbering is done within each alternative queen and different queen alternatives may have been mated to a different number of putative mates (e.g. see group Col03 in the example file {Ex4-FQM.in.txt}). Both the putative mates and the paternity assignment depends on what queen genotype is assumed, so each queen alternative has an associated paternity assignment for the offspring. Consequently, the number of datafile columns with patrilines corresponds to the maximum number of queen alternatives in the overall data. Patriline variables are created by MATESOFT when analysing FQ data. In some cases, an offspring cannot be exclusively assigned to a patriline but is the possible daughter of one among several candidate fathers. The relevant patrilines are then listed separated by commas, e.g. 4, as for the offspring F07 in *Ind-ID K-Group Class AltQ/M Alt-P Hal54 Pha Pha77 { } Q0 Col36 Q c/f a/k a/f Q Col36 Q c/f a/m a/f Q Col36 Q c/f a/a a/f M8 Col36 M /.0000 h a a M8 Col36 M /.0000 k a h M83 Col36 M / k a d M84 Col36 M / k k d M85 Col36 M / e a a M86 Col36 M / e m d M87 Col36 M / c a d M88 Col36 M /.0000 h a a M89 Col36 M /.0000 k a h M90 Col36 M / k k d M9 Col36 M / e a a M9 Col36 M / e a d M93 Col36 M / e m d M94 Col36 M / c a d M95 Col36 M 3/.0000 h a a M96 Col36 M 3/.0000 k a h M97 Col36 M 3/ k k d M98 Col36 M 3/ e a a M99 Col36 M 3/ e m d M00 Col36 M 3/ c a d F07 Col36 F * *?/? a/a a/f 4, 4, 4, F07 Col36 F * * c/k a/k d/f F073 Col36 F * * f/k a/k d/f F074 Col36 F * * e/f a/m a/d F075 Col36 F * * e/f a/a a/a F076 Col36 F * * c/h a/a a/f F077 Col36 F * * c/k a/a a/h F078 Col36 F * * c/c a/a a/d F079 Col36 F * * c/k a/a f/h F080 Col36 F * * f/k a/a f/h F08 Col36 F * * c/h a/a a/a taken from the example file {Ex4-FQM.in.txt}. This means that both male and 4 are possible fathers of this offspring. 
The genotype, and hence the individual ID label, of these males depends on which queen alternative is assumed: M8,M85; M88,M9; and M95,M98 for queen alternative, and 3, respectively. The files {Ex-FQM.in.txt}, {Ex-FQM.in.txt}, and {Ex4-FQM.in.txt} exemplify the input to FQM analysis.
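The labelling rule described above (ordinals assigned in order of encounter, ambiguous offspring listed with comma-separated patrilines) can be sketched as follows. This is a minimal illustration and not MATESOFT's implementation: the function name, the father keys, and the choice to number patrilines from unambiguously assigned offspring first are all assumptions made for the example.

```python
def label_patrilines(candidate_fathers):
    """Return one patriline label string per offspring.

    candidate_fathers: list with one entry per offspring; each entry is a
    list of hypothetical father keys (more than one key = ambiguous).
    """
    labels = {}
    # Pass 1 (assumption): number patrilines in the order their fathers are
    # first seen among unambiguously assigned offspring.
    for candidates in candidate_fathers:
        if len(candidates) == 1 and candidates[0] not in labels:
            labels[candidates[0]] = len(labels) + 1
    # Pass 2: fathers seen only in ambiguous offspring get the next numbers.
    for candidates in candidate_fathers:
        for father in candidates:
            if father not in labels:
                labels[father] = len(labels) + 1
    # Ambiguous offspring list all candidate patrilines, comma-separated.
    return [
        ",".join(str(n) for n in sorted(labels[f] for f in candidates))
        for candidates in candidate_fathers
    ]


# Hypothetical brood: the first offspring could be sired by "h" or "e".
brood = [["h", "e"], ["h"], ["k"], ["d"], ["e"]]
patri = label_patrilines(brood)
```

In this sketch the numbering would be repeated separately for each queen alternative, giving one patriline column per alternative.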

7. QM Data: Mating Frequency Statistics

The alternative data type for estimation of mating frequency statistics is genotypes of queens and their putative mates, based on genetic analysis of the queens and the sperm stored in their spermathecae (QM data). MATESOFT outputs summary statistics for each group and overall estimates of the proportion of multiply mated queens and the effective mate number. The current version of the program is limited to a maximum of two matings per queen.

The input data should include the following variables in the individual genotype data section: Ind-ID, K-Group, Class, AltQ/M, Alt-P, Locus#, and PatriQ#. Note that in principle all Alt-P values are 1 for this data type, as all parental alleles have been scored with certainty.

When mate genotypes are deduced by sperm typing and several alleles are scored at several loci, the multilocus genotype of each male cannot be deduced. For example, if the sperm contains alleles a and b at the first locus and c and d at the second locus, then the two males can be either ac and bd or ad and bc, respectively. However, whichever interpretation of the multilocus genotypes is used in the input data, it has no influence on the calculations. The reason is that the estimation of the error of not identifying a multiple mating is based on the genotypes of putative single mates only.

The files {Ex-QM.in.txt} and {Ex-QM.out.txt} exemplify the input and output for QM analysis.

8. File Configuration Menu

Before any analysis can be done the user has to define the configuration of the input file. This is done by opening and filling in the menu under File → File configuration. The configuration can be saved for later use and loaded from file by using the File → File configuration → File menu. The name extension for configuration files is cfg (see the example files provided). After loading the configuration file {Ex-F.cfg} the menu will look like this:

See section 3 for a description of the variables and types of allele coding. If a variable is not relevant for the subsequent analysis, enter 0 or simply leave the column number blank. Under Count give the number of loci analysed that follow in consecutive columns starting with the column indicated. The output file to be named is the general output file, which is different from the file with extended data from the F or FQ analysis. The general output file is almost identical to the screen output when running the analyses. In F and FQ analysis some, but not all, of this output is rather technical and included for testing purposes, and as such it can be ignored by the average user. In FQM and QM analysis this file contains the calculated mating frequency statistics.

9. Queen and Mate Deduction Menu

This self-explanatory menu is used to launch analysis of F or FQ data. See section 4 for a discussion of which deduction option to apply for F analysis. When ready to run the data in {Ex-F.in.txt} it looks like this:

10. Mating Frequency Statistics Menu

Use this menu to launch estimation of mating frequency statistics based on FQM or QM data. When ready to run the calculations for {Ex-FQM.in.txt} it looks like this:

10.1 Data with Single and Double Mating of Queens

The paternity skew c of a double mated queen is the proportion of the offspring sired by the mate having the largest contribution to the brood. The user can decide to let MATESOFT calculate the average paternity skew c ("c-bar") by the estimator in Pedersen and Boomsma (1999a) or to input a predefined value. When QM data are analysed, the skew cannot be estimated from the data and the value given by the user is always used (0.5 is the default). Furthermore, this option may be relevant if the user wants to examine the effect of a range of possible skews on the other statistics. See also the section about the output and the FAQ on data with no detected multiple matings.

Two methods can be used to provide measures of dispersion for the estimates: jackknifing over groups for standard errors and bootstrapping by groups for confidence limits. The statistical properties of these measures are not yet fully investigated and caution should be taken in their interpretation. We recommend that jackknifing and bootstrapping are only used for relatively large datasets, i.e. data that include at least five groups in each category of detected single and double matings, respectively. The option Maintain M/M ratios in BS restricts the bootstrap replicates to have the same number of single and double mated groups as in the original data. This is included for testing purposes and is expected to give a more accurate confidence interval for c, as this statistic is based on the sample of double mated groups only.

10.2 Data with Three or More Queen Matings

If the data contain groups where three or more matings of the queen have been detected, the calculation option FQM 3+ matings should be used.
This will produce summary mating frequency statistics that can be presented directly or used for further analysis, but not an integrated correction of sampling and detection errors. It should be mentioned that MATESOFT is able to load data with high mate numbers and perform the calculations for single-double mating systems using the FQM - matings option. In this case the program will merge all patrilines numbered 2 and above into a single patriline. If three or more queen matings are rare in the population, if additional mates contribute only little to the brood, and if the most common patriline always holds the majority of offspring, then it may be preferable to analyse the data as if only single and double matings occurred. In that case D_est should be understood as the estimated proportion of multiply mated queens in the population. Currently no procedures are implemented for calculating measures of dispersion for the 3+ matings statistics.

11. Mating Frequency Statistics Output

As the calculation procedures differ, the output also varies according to data type and the number of queen matings observed.

11.1 FQM Data with Single or Double Mating of Queens

The following overall statistics are given in the output, as exemplified by the file {Ex-FQM.out}:

Estimation of paternity skew
Average number of offspring in double mated groups n
Observed pi for double mated groups, pi
Corrected average paternity skew, c-bar
Deviation from target value [abs(pi_double-pi)]
Paternity skew directly calculated from data, c.obs
{ }
Estimation of proportion of double mating: summary
Average weighted nonidentification error f'
Observed proportion of double mated queens, D.obs
Estimated proportion of double mated queens, D.est
Average pedigree effective mate number, m.e,p .3933

The calculation procedures follow Pedersen and Boomsma (1999) and are described in more detail in sections D.1 to D.3.
Furthermore, when the user has chosen bootstrapping and jackknifing of the statistics, the following are included:

95% confidence limits
c-bar:
f':
D.obs:
D.est:
m.e,p:
{ }
Jacknifed average SD sample size
c-bar:
f':
D.est:
D.obs:
m.e,p:

The wide confidence intervals indicate that for this dataset (many) more queens have to be analysed to obtain accurate estimates of D_est and m_e,p, which is partly due to a high error in identifying double mated queens (f'). Comparing the bootstrapped confidence limits and jackknifed SDs further gives the more general message that the estimates are not likely to follow the t-distribution, and that care should be taken in statistical tests assuming that the true sample SE equals the jackknifed SD.

The files {Ex-FQM.out} and {Ex3-FQM.out} exemplify the output from FQM analysis of data with single or double mating of queens.

11.1.1 Special Advice on Estimation of Paternity Skew

When MATESOFT estimates the average paternity skew c, the method of Pedersen and Boomsma (1999a) is applied, and the average observed skew is given for comparison. Both values should be inspected, as in some extreme cases of sampling error or limited data the estimation will generate misleading results.

One case is when patrilines are more equal in frequency than expected by random sampling given c ≥ 0.5 (e.g. 3-3, -, and 5-4 offspring sampled from the first and second mate, respectively, of each of three queens). Here, c cannot get low enough for the expected values to fit, and the program outputs c close to 0.5 with a large difference abs(pi_double-pi). This is still correct, as 0.5 is then the best estimate for c.

The other extreme is when patriline 2 is so rare that it is never represented by more than one offspring in any group. This leads to an estimate of c approaching 1, since the rarer patriline 2 is, the more likely it is that just a single individual from this patriline was sampled. However, this usually leads to absurd estimates of D_est exceeding one, which would imply more hidden double matings than the number of detected single mated queens. In this case the best estimate of c is the observed skew. The user should then take this value and recalculate the mating frequency statistics by applying the Use value option.
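To make the quantities discussed above more concrete, the following sketch computes the directly observed skew of a double mated group and generic group-level jackknife and bootstrap measures. It is an illustration under stated assumptions only: it does not reproduce the Pedersen and Boomsma estimator of c-bar or MATESOFT's own resampling code, and the function names, the percentile method for the confidence limits, and the third offspring count (7-2) are hypothetical.

```python
import math
import random
from statistics import mean


def observed_skew(n_first, n_second):
    """Proportion of sampled offspring sired by the larger patriline."""
    return max(n_first, n_second) / (n_first + n_second)


def jackknife_se(values, stat=mean):
    """Leave-one-group-out jackknife standard error of stat(values)."""
    g = len(values)
    loo = [stat(values[:i] + values[i + 1:]) for i in range(g)]
    centre = mean(loo)
    return math.sqrt((g - 1) / g * sum((x - centre) ** 2 for x in loo))


def bootstrap_ci(values, stat=mean, reps=2000, alpha=0.05, seed=1):
    """Percentile confidence limits from resampling groups with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    g = len(values)
    boot = sorted(
        stat([rng.choice(values) for _ in range(g)]) for _ in range(reps)
    )
    lower = boot[int(reps * alpha / 2)]
    upper = boot[int(reps * (1 - alpha / 2)) - 1]
    return lower, upper


# Offspring counts (first mate, second mate) for three double mated groups;
# 3-3 and 5-4 appear in the text, 7-2 is a made-up value for illustration.
counts = [(3, 3), (5, 4), (7, 2)]
skews = [observed_skew(a, b) for a, b in counts]
c_obs = mean(skews)
se = jackknife_se(skews)
limits = bootstrap_ci(skews)
```

With only three groups the resulting limits are very wide, which is in line with the recommendation to use these measures only for relatively large datasets.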
11.2 QM Data with Single or Double Mating of Queens

The slightly different output from analysis of queen and sperm genotypic data is exemplified by the file {Ex-QM.out}:

Estimation of proportion of double mating: summary
Average weighted nonidentification error f'
Observed proportion of double mated queens, D.obs
Estimated proportion of double mated queens, D.est
Average pedigree effective mate number, m.e,p
{ }
95% confidence limits
c-bar:
f':
D.obs:
D.est:
m.e,p:
{ }
Jacknifed average SD sample size
f':
D.est:
D.obs:
m.e,p:

Note that there is no calculation of c, as this is fixed by the user. The specific calculation of f for queen and sperm genotypic data is given in section D.4; otherwise refer to sections D.1 to D.3.

11.3 FQM Data with Three or More Queen Matings

For each group the number of offspring (n) assigned to each patriline is given in a table, and the sum of squared paternity contributions (π, given as "pi") is calculated as in this example from the file {Ex4-FQM.out}:

Groupwise statistics
{ }
Pline Sum(w) Sum(w<) n y
Group Col36/QQ pi0.77 (n)
Group Col36, overall pi0.730

In this case a single offspring can belong to both patrilines and 4, and consequently its assignment is weighted ("Sum(w<)") according to the relative frequencies of these patrilines based on offspring with unambiguous assignments ("Sum(w)"). The groupwise π is corrected for sample size following the formula of Pamilo (1993), given here as Eqn. D.5.3. If a group has alternative queens, π for each alternative is calculated along with the overall value for the group, weighted by the probabilities of the alternative queen genotypes.

Furthermore, summary statistics are produced, including the average π over all groups (π) and the average number of matings detected (k) based on the frequency distribution of the number of patrilines per group. If alternative queens differ in the number of patrilines found, their contribution to the frequency distribution is weighted according to the probabilities of the alternatives ("weighted k"). Alternatively, "minimum k" calculates the average number of matings detected based on the smallest number of matings found per group.

Summary statistics
Average pi over all groups pi0.454
Frequency of observed mate number
k minimum weighted
Average minimum k
Average weighted k
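The groupwise calculation described above can be sketched as below. Several points are assumptions rather than statements about MATESOFT: Eqn. D.5.3 is not reproduced in this section, so the sketch uses the sample-size correction commonly attributed to Pamilo (1993), pi = (n * sum(y^2) - 1) / (n - 1); the equal split used when none of the candidate patrilines has unambiguous offspring is a fallback invented for the example; and all function names and counts are hypothetical.

```python
def patriline_weights(unambiguous, ambiguous):
    """Weighted offspring number per patriline.

    unambiguous: dict patriline -> number of exclusively assigned offspring.
    ambiguous: list of tuples of candidate patrilines, one per offspring.
    """
    weights = {p: float(n) for p, n in unambiguous.items()}
    for candidates in ambiguous:
        total = sum(unambiguous.get(p, 0) for p in candidates)
        for p in candidates:
            if total > 0:
                # Split in proportion to the unambiguous counts.
                share = unambiguous.get(p, 0) / total
            else:
                # Assumed fallback: equal split among candidates.
                share = 1 / len(candidates)
            weights[p] = weights.get(p, 0.0) + share
    return weights


def corrected_pi(weights):
    """Sum of squared paternity contributions, corrected for sample size."""
    n = sum(weights.values())
    raw = sum((w / n) ** 2 for w in weights.values())
    return (n * raw - 1) / (n - 1)


def group_pi(alternatives):
    """Overall pi for a group: mean over alternative queens, weighted by
    the probability of each alternative. alternatives: list of (prob, pi)."""
    total = sum(prob for prob, _ in alternatives)
    return sum(prob * pi for prob, pi in alternatives) / total


# Hypothetical group: ten unambiguous offspring plus one offspring that
# could belong to patriline 1 or patriline 4.
w = patriline_weights({1: 3, 2: 2, 3: 2, 4: 3}, [(1, 4)])
pi_group = corrected_pi(w)
```

Under this form of the correction, a group with a single patriline gives pi = 1 regardless of sample size, and 1/pi can be read as an effective mate number.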

See section D.5 for the application of these statistics in estimating the average effective mate number. The file {Ex4-FQM.out} exemplifies the output from FQM analysis of data with three or more queen matings.

12. Troubleshooting and Special Use FAQ

Q: It doesn't work! Why do I just get stupid error messages?
A: Hard to tell. The best place to start is to make sure that MATESOFT works with the example file that fits your data type. If it does, make a copy of the example file and modify it to contain your own data instead. Remember to modify the file configuration accordingly, if needed. Errors typically arise from non-printing characters like spaces and tab stops ending up in the wrong places in the data file, where they are hard to catch.

Q: I didn't detect any multiple matings at all. What can I do to get something interesting out of the data anyway?
A: Then have a look at the calculation methods suggested in section D.6.

Q: I don't have a large and independent sample of individuals for estimating the population allele frequencies. Actually, the individuals in this study are the only ones I've got. What can I do?
A: First you should realise that every group may represent as few as three haploid genomes (given the queen is single mated), so the basic problem is sample size and you won't get a good estimate of population allele frequencies unless many broods are analysed. The best option is simply to calculate the allele frequencies in the offspring, weighting groups equally. This provides unbiased estimates but with a large variation, as paternal alleles are counted double compared to maternal ones. However, this is preferred to other methods involving the deduction of maternal and paternal alleles, as they have the shortcoming that the frequency of common alleles is underestimated.

Q: My data are a mess: the number of offspring varies a lot between groups and many genotypes are not complete for all loci. Is this a problem?
A: No! Estimations and analyses will work correctly anyway based on the available data. Just make sure that no offspring lacks scoring at all loci, and that all groups have some offspring (minimum one) scored at each particular locus.

Q: I've analysed the queens but wasn't able to score all queens at all loci. Do I just indicate the gaps as missing genotypes in the FQ data file?
A: No, for the FQ analysis every queen should have a complete multilocus genotype. Take the offspring groups with incomplete queens and run a separate F analysis on this part of the data to deduce the missing genotypes. Then use the output to fill in the gaps in your original data set.

Q: I've scored the genotypes of queens, offspring and sperm. Can I take advantage of having both sperm and brood data?
A: Based on the sperm typing you may be able to exclude some of the alternative mate genotypes in the FQM data. Then the FQM and QM analyses should be carried out as usual, and you'll have the possibility to compare the mating frequency statistics based on


More information

Two problems to be solved. Example Use of SITATION. Here is the main menu. First step. Now. To load the data.

Two problems to be solved. Example Use of SITATION. Here is the main menu. First step. Now. To load the data. Two problems to be solved Example Use of SITATION Mark S. Daskin Department of IE/MS Northwestern U. Evanston, IL 1. Minimize the demand weighted total distance (or average distance) Using 10 facilities

More information

Proof Techniques (Review of Math 271)

Proof Techniques (Review of Math 271) Chapter 2 Proof Techniques (Review of Math 271) 2.1 Overview This chapter reviews proof techniques that were probably introduced in Math 271 and that may also have been used in a different way in Phil

More information

OECD QSAR Toolbox v.4.1. Tutorial illustrating new options for grouping with metabolism

OECD QSAR Toolbox v.4.1. Tutorial illustrating new options for grouping with metabolism OECD QSAR Toolbox v.4.1 Tutorial illustrating new options for grouping with metabolism Outlook Background Objectives Specific Aims The exercise Workflow 2 Background Grouping with metabolism is a procedure

More information

Categorical Data Analysis. The data are often just counts of how many things each category has.

Categorical Data Analysis. The data are often just counts of how many things each category has. Categorical Data Analysis So far we ve been looking at continuous data arranged into one or two groups, where each group has more than one observation. E.g., a series of measurements on one or two things.

More information

SPAGeDi 1.1. User s manual. a program for Spatial Pattern Analysis of Genetic Diversity. by Olivier HARDY and Xavier VEKEMANS

SPAGeDi 1.1. User s manual. a program for Spatial Pattern Analysis of Genetic Diversity. by Olivier HARDY and Xavier VEKEMANS SPAGeDi 1.1 a program for Spatial Pattern Analysis of Genetic Diversity by Olivier HARDY and Xavier VEKEMANS User s manual Address for correspondence: Laboratoire de Génétique et d'ecologie Végétales Université

More information

OECD QSAR Toolbox v.3.0

OECD QSAR Toolbox v.3.0 OECD QSAR Toolbox v.3.0 Step-by-step example of how to categorize an inventory by mechanistic behaviour of the chemicals which it consists Background Objectives Specific Aims Trend analysis The exercise

More information

Name Class Date. Pearson Education, Inc., publishing as Pearson Prentice Hall. 33

Name Class Date. Pearson Education, Inc., publishing as Pearson Prentice Hall. 33 Chapter 11 Introduction to Genetics Chapter Vocabulary Review Matching On the lines provided, write the letter of the definition of each term. 1. genetics a. likelihood that something will happen 2. trait

More information

Experiment 1: The Same or Not The Same?

Experiment 1: The Same or Not The Same? Experiment 1: The Same or Not The Same? Learning Goals After you finish this lab, you will be able to: 1. Use Logger Pro to collect data and calculate statistics (mean and standard deviation). 2. Explain

More information

A (Mostly) Correctly Formatted Sample Lab Report. Brett A. McGuire Lab Partner: Microsoft Windows Section AB2

A (Mostly) Correctly Formatted Sample Lab Report. Brett A. McGuire Lab Partner: Microsoft Windows Section AB2 A (Mostly) Correctly Formatted Sample Lab Report Brett A. McGuire Lab Partner: Microsoft Windows Section AB2 August 26, 2008 Abstract Your abstract should not be indented and be single-spaced. Abstracts

More information

OECD QSAR Toolbox v.3.3. Step-by-step example of how to categorize an inventory by mechanistic behaviour of the chemicals which it consists

OECD QSAR Toolbox v.3.3. Step-by-step example of how to categorize an inventory by mechanistic behaviour of the chemicals which it consists OECD QSAR Toolbox v.3.3 Step-by-step example of how to categorize an inventory by mechanistic behaviour of the chemicals which it consists Background Objectives Specific Aims Trend analysis The exercise

More information

Zetasizer Nano-ZS User Instructions

Zetasizer Nano-ZS User Instructions Zetasizer Nano-ZS User Instructions 1. Activate the instrument computer by logging in to CORAL. If needed, log in to the local instrument computer Username: zetasizer. Password: zetasizer. 2. Instrument

More information

Descriptive Statistics (And a little bit on rounding and significant digits)

Descriptive Statistics (And a little bit on rounding and significant digits) Descriptive Statistics (And a little bit on rounding and significant digits) Now that we know what our data look like, we d like to be able to describe it numerically. In other words, how can we represent

More information

ON SITE SYSTEMS Chemical Safety Assistant

ON SITE SYSTEMS Chemical Safety Assistant ON SITE SYSTEMS Chemical Safety Assistant CS ASSISTANT WEB USERS MANUAL On Site Systems 23 N. Gore Ave. Suite 200 St. Louis, MO 63119 Phone 314-963-9934 Fax 314-963-9281 Table of Contents INTRODUCTION

More information

Using SPSS for One Way Analysis of Variance

Using SPSS for One Way Analysis of Variance Using SPSS for One Way Analysis of Variance This tutorial will show you how to use SPSS version 12 to perform a one-way, between- subjects analysis of variance and related post-hoc tests. This tutorial

More information

Outline of lectures 3-6

Outline of lectures 3-6 GENOME 453 J. Felsenstein Evolutionary Genetics Autumn, 007 Population genetics Outline of lectures 3-6 1. We want to know what theory says about the reproduction of genotypes in a population. This results

More information

Two Correlated Proportions Non- Inferiority, Superiority, and Equivalence Tests

Two Correlated Proportions Non- Inferiority, Superiority, and Equivalence Tests Chapter 59 Two Correlated Proportions on- Inferiority, Superiority, and Equivalence Tests Introduction This chapter documents three closely related procedures: non-inferiority tests, superiority (by a

More information

Uta Bilow, Carsten Bittrich, Constanze Hasterok, Konrad Jende, Michael Kobel, Christian Rudolph, Felix Socher, Julia Woithe

Uta Bilow, Carsten Bittrich, Constanze Hasterok, Konrad Jende, Michael Kobel, Christian Rudolph, Felix Socher, Julia Woithe ATLAS W path Instructions for tutors Version from 2 February 2018 Uta Bilow, Carsten Bittrich, Constanze Hasterok, Konrad Jende, Michael Kobel, Christian Rudolph, Felix Socher, Julia Woithe Technische

More information

Chapter 5. Piece of Wisdom #2: A statistician drowned crossing a stream with an average depth of 6 inches. (Anonymous)

Chapter 5. Piece of Wisdom #2: A statistician drowned crossing a stream with an average depth of 6 inches. (Anonymous) Chapter 5 Deviating from the Average In This Chapter What variation is all about Variance and standard deviation Excel worksheet functions that calculate variation Workarounds for missing worksheet functions

More information

BIOLOGY LTF DIAGNOSTIC TEST MEIOSIS & MENDELIAN GENETICS

BIOLOGY LTF DIAGNOSTIC TEST MEIOSIS & MENDELIAN GENETICS 016064 BIOLOGY LTF DIAGNOSTIC TEST MEIOSIS & MENDELIAN GENETICS TEST CODE: 016064 Directions: Each of the questions or incomplete statements below is followed by five suggested answers or completions.

More information

Inference for Single Proportions and Means T.Scofield

Inference for Single Proportions and Means T.Scofield Inference for Single Proportions and Means TScofield Confidence Intervals for Single Proportions and Means A CI gives upper and lower bounds between which we hope to capture the (fixed) population parameter

More information

OECD QSAR Toolbox v.3.2. Step-by-step example of how to build and evaluate a category based on mechanism of action with protein and DNA binding

OECD QSAR Toolbox v.3.2. Step-by-step example of how to build and evaluate a category based on mechanism of action with protein and DNA binding OECD QSAR Toolbox v.3.2 Step-by-step example of how to build and evaluate a category based on mechanism of action with protein and DNA binding Outlook Background Objectives Specific Aims The exercise Workflow

More information

Lab 6. Current Balance

Lab 6. Current Balance Lab 6. Current Balance Goals To explore and verify the right-hand rule governing the force on a current-carrying wire immersed in a magnetic field. To determine how the force on a current-carrying wire

More information

Introduction to Spark

Introduction to Spark 1 As you become familiar or continue to explore the Cresset technology and software applications, we encourage you to look through the user manual. This is accessible from the Help menu. However, don t

More information

7.014 Problem Set 6 Solutions

7.014 Problem Set 6 Solutions 7.014 Problem Set 6 Solutions Question 1 a) Define the following terms: Dominant In genetics, the ability of one allelic form of a gene to determine the phenotype of a heterozygous individual, in which

More information

Fractional Polynomial Regression

Fractional Polynomial Regression Chapter 382 Fractional Polynomial Regression Introduction This program fits fractional polynomial models in situations in which there is one dependent (Y) variable and one independent (X) variable. It

More information

Preptests 59 Answers and Explanations (By Ivy Global) Section 1 Analytical Reasoning

Preptests 59 Answers and Explanations (By Ivy Global) Section 1 Analytical Reasoning Preptests 59 Answers and Explanations (By ) Section 1 Analytical Reasoning Questions 1 5 Since L occupies its own floor, the remaining two must have H in the upper and I in the lower. P and T also need

More information

Lab 10: Ballistic Pendulum

Lab 10: Ballistic Pendulum Lab Section (circle): Day: Monday Tuesday Time: 8:00 9:30 1:10 2:40 Lab 10: Ballistic Pendulum Name: Partners: Pre-Lab You are required to finish this section before coming to the lab it will be checked

More information

2. Map genetic distance between markers

2. Map genetic distance between markers Chapter 5. Linkage Analysis Linkage is an important tool for the mapping of genetic loci and a method for mapping disease loci. With the availability of numerous DNA markers throughout the human genome,

More information

OECD QSAR Toolbox v.3.3. Step-by-step example of how to build and evaluate a category based on mechanism of action with protein and DNA binding

OECD QSAR Toolbox v.3.3. Step-by-step example of how to build and evaluate a category based on mechanism of action with protein and DNA binding OECD QSAR Toolbox v.3.3 Step-by-step example of how to build and evaluate a category based on mechanism of action with protein and DNA binding Outlook Background Objectives Specific Aims The exercise Workflow

More information

genome a specific characteristic that varies from one individual to another gene the passing of traits from one generation to the next

genome a specific characteristic that varies from one individual to another gene the passing of traits from one generation to the next genetics the study of heredity heredity sequence of DNA that codes for a protein and thus determines a trait genome a specific characteristic that varies from one individual to another gene trait the passing

More information

VELA. Getting started with the VELA Versatile Laboratory Aid. Paul Vernon

VELA. Getting started with the VELA Versatile Laboratory Aid. Paul Vernon VELA Getting started with the VELA Versatile Laboratory Aid Paul Vernon Contents Preface... 3 Setting up and using VELA... 4 Introduction... 4 Setting VELA up... 5 Programming VELA... 6 Uses of the Programs...

More information

Mathematical models in population genetics II

Mathematical models in population genetics II Mathematical models in population genetics II Anand Bhaskar Evolutionary Biology and Theory of Computing Bootcamp January 1, 014 Quick recap Large discrete-time randomly mating Wright-Fisher population

More information

STEP Support Programme. Hints and Partial Solutions for Assignment 1

STEP Support Programme. Hints and Partial Solutions for Assignment 1 STEP Support Programme Hints and Partial Solutions for Assignment 1 Warm-up 1 You can check many of your answers to this question by using Wolfram Alpha. Only use this as a check though and if your answer

More information

Determination of Density 1

Determination of Density 1 Introduction Determination of Density 1 Authors: B. D. Lamp, D. L. McCurdy, V. M. Pultz and J. M. McCormick* Last Update: February 1, 2013 Not so long ago a statistical data analysis of any data set larger

More information

NVLAP Proficiency Test Round 14 Results. Rolf Bergman CORM 16 May 2016

NVLAP Proficiency Test Round 14 Results. Rolf Bergman CORM 16 May 2016 NVLAP Proficiency Test Round 14 Results Rolf Bergman CORM 16 May 2016 Outline PT 14 Structure Lamp Types Lab Participation Format for results PT 14 Analysis Average values of labs Average values of lamps

More information

Keppel, G. & Wickens, T. D. Design and Analysis Chapter 4: Analytical Comparisons Among Treatment Means

Keppel, G. & Wickens, T. D. Design and Analysis Chapter 4: Analytical Comparisons Among Treatment Means Keppel, G. & Wickens, T. D. Design and Analysis Chapter 4: Analytical Comparisons Among Treatment Means 4.1 The Need for Analytical Comparisons...the between-groups sum of squares averages the differences

More information

Introduction to Genetics

Introduction to Genetics Introduction to Genetics We ve all heard of it, but What is genetics? Genetics: the study of gene structure and action and the patterns of inheritance of traits from parent to offspring. Ancient ideas

More information

Analysis of 2x2 Cross-Over Designs using T-Tests

Analysis of 2x2 Cross-Over Designs using T-Tests Chapter 234 Analysis of 2x2 Cross-Over Designs using T-Tests Introduction This procedure analyzes data from a two-treatment, two-period (2x2) cross-over design. The response is assumed to be a continuous

More information

Overview In chapter 16 you learned how to calculate the Electric field from continuous distributions of charge; you follow four basic steps.

Overview In chapter 16 you learned how to calculate the Electric field from continuous distributions of charge; you follow four basic steps. Materials: whiteboards, computers with VPython Objectives In this lab you will do the following: Computationally model the electric field of a uniformly charged rod Computationally model the electric field

More information

Experiment 0 ~ Introduction to Statistics and Excel Tutorial. Introduction to Statistics, Error and Measurement

Experiment 0 ~ Introduction to Statistics and Excel Tutorial. Introduction to Statistics, Error and Measurement Experiment 0 ~ Introduction to Statistics and Excel Tutorial Many of you already went through the introduction to laboratory practice and excel tutorial in Physics 1011. For that reason, we aren t going

More information

NEW HOLLAND IH AUSTRALIA. Machinery Market Information and Forecasting Portal *** Dealer User Guide Released August 2013 ***

NEW HOLLAND IH AUSTRALIA. Machinery Market Information and Forecasting Portal *** Dealer User Guide Released August 2013 *** NEW HOLLAND IH AUSTRALIA Machinery Market Information and Forecasting Portal *** Dealer User Guide Released August 2013 *** www.cnhportal.agriview.com.au Contents INTRODUCTION... 5 REQUIREMENTS... 6 NAVIGATION...

More information

Physics E-1ax, Fall 2014 Experiment 3. Experiment 3: Force. 2. Find your center of mass by balancing yourself on two force plates.

Physics E-1ax, Fall 2014 Experiment 3. Experiment 3: Force. 2. Find your center of mass by balancing yourself on two force plates. Learning Goals Experiment 3: Force After you finish this lab, you will be able to: 1. Use Logger Pro to analyze video and calculate position, velocity, and acceleration. 2. Find your center of mass by

More information

Investigating Models with Two or Three Categories

Investigating Models with Two or Three Categories Ronald H. Heck and Lynn N. Tabata 1 Investigating Models with Two or Three Categories For the past few weeks we have been working with discriminant analysis. Let s now see what the same sort of model might

More information