EVOLUTION, PUBLISHED BY THE SOCIETY FOR THE STUDY OF EVOLUTION. A WEIGHTED HYBRID INDEX. WILLIAM H. HATHEWAY

EVOLUTION
INTERNATIONAL JOURNAL OF ORGANIC EVOLUTION
PUBLISHED BY THE SOCIETY FOR THE STUDY OF EVOLUTION

Vol. XVI, MARCH, 1962, No. 1

A WEIGHTED HYBRID INDEX¹

WILLIAM H. HATHEWAY
Colombian Agricultural Program of The Rockefeller Foundation, Bogota, Colombia

Received March 6, 1961

¹ Paper No. 134 of the Agricultural Journal Series of The Rockefeller Foundation. EVOLUTION 16: 1-10. March, 1962.

The use of multivariate statistical analysis in biological research should be advocated with caution. Multivariate analysis is an exact tool which can give misleading answers if applied inappropriately. Impressive results may even induce research workers to abandon serviceable but perhaps unspectacular methods. Fortunately, the latter danger is remote in the case of hybrid indices, since results produced by less specialized techniques are required before a multivariate analysis can be attempted.

The data used in the present study were obtained by Dr. Edgar Anderson (1954) for a study on stemless white violets. By means of powerful general methods he showed that introgression had occurred in his violets and predicted the exact appearance of the parental species. The analysis presented here confirms Anderson's results. It extends them in suggesting that approximate genetic relationships among members of the hybrid swarm can be determined from his data.

METROGLYPHIC ANALYSIS

Anderson's method for analyzing introgression involves the interplay of biological insight and simple graphical devices, termed by him metroglyphs (Anderson, 1957). He has found in several hybrid swarms that the plants may differ from one another in only a few conspicuous features. These over-all differences, which are generally vague, usually can be broken down into more primary characters, each of which can be measured. For example, Anderson's violets varied most conspicuously in degree of coloring of the spur petals. Closer examination showed that involved in this were number of veins, their distance from the margin of the petal, and the order of branching; wing petal venation was also measured. Other sets of differences were discovered. The most convenient of these was pubescence of the pedicel and flower, which could be broken down into five more primary components. Anderson's data, which are used in the subsequent analysis, are set forth in table 1.

TABLE 1. Characters measured by Dr. Edgar Anderson for study of introgression in stemless white violets

Plant    X1    X2    X3    X4    Y1    Y2    Y3    Y4    Y5     Z      A      B      C
    1   3.4     2     1     1    16     1   2.0     1     0     1   0.30   0.83   1.13
    2   2.9     0     1     1    15     1   1.0     1     0     1   0.12   0.84   0.96
    3   1.5    10     3     6     6     2   1.6    12     2     1   8.75   8.25  17.00
    4   1.7     6     4     3    12     1   1.5     7     1     2   5.09   4.84   9.93
    5   1.8     6     3     4     6     1   0.8     9     1     2   5.63   5.84  11.47
    6   1.6     6     2     4     7     2   2.5    10     1     4   5.74   6.38  12.12
    7   2.7     3     3     4     9     2   0.8     6     0     1   3.71   3.50   7.21
    8   1.6     8     2     5    11     1   0.5    10     2     1   7.15   7.26  14.41
    9   3.3     4     2     3    12     3   1.5     3     1     1   2.63   2.68   5.31
   10   2.0     7     2     5    15     1   2.0    10     1     4   6.41   6.53  12.94
   11   2.5     6     2     5    13     2   1.0     9     0     3   5.57   5.22  10.79
   12   1.8     5     2     5    18     2   1.6    10     1     3   5.93   6.62  12.55
   13   1.7     2     2     2    18     2   1.3     5     0     1   2.82   3.11   5.93
   14   1.8     6     3     5    19     2   1.0    11     1     3   6.36   7.21  13.57
   15   2.0     6     3     4    23     2   1.5     9     1     3   5.43   6.18  11.61
   16   2.3     4     2     2    21     2   2.0     3     1     1   2.90   2.82   5.72
   17   1.8     4     2     2    32     1   1.8     3     0     1   3.40   2.26   5.66
   18   1.2     8     2     7    11     2   1.5    13     1     3   9.01   8.14  17.15
   19   2.7     2     3     3    35     1   1.8     3     1     1   2.64   3.09   5.73
   20   2.1     6     2     2    34     3   1.5     4     1     1   3.78   3.66   7.44
   21   2.9     5     2     2    20     2   1.5     4     1     1   2.64   3.36   6.00
   22   1.8     9     2     5     9     2   1.5    10     2     1   7.29   7.22  14.51
   23   2.0     7     2     5     1     2   0.0     9     2     2   6.41   6.55  12.96
   24   3.0     4     2     4    21     2   1.5     3     1     1   3.66   2.84   6.50
   25   2.9     4     2     3    13     1   1.2     3     1     1   3.03   2.66   5.69
    U -1.00  0.34  0.09  0.73
    V                          0.02  0.02 -0.03  0.55  0.77

Ten characters are indicated by symbols:
X1 = petal margin width (mm).
X2 = wing-petal venation.
X3 = degree of vein branching.
X4 = number of branches in submidvein.
Y1 = number of hairs per wing petal.
Y2 = length of longest hair on wing petal: 1, short; 2, medium; 3, long.
Y3 = length of hairy area of wing petal (mm).
Y4 = number of hairs on the pedicel.
Y5 = position of pedicel hairs: 0, no hair or hairs very faint; 1, hairs extending up to crook; 2, hairs extending past crook of pedicel.
Z = intervenal color: 1, absent; 2, slight; 3, blotched; 4, solid.
The weights for the petal venation index are given in line "U"; weights for the pubescence index are in line "V." Column "A" is the calculated venation index for each specimen; column "B" is the calculated pubescence index. Column "C" is the weighted hybrid index or introgression scale, the sum of columns "A" and "B."

The mass of data in table 1 is overwhelming unless reduced to some more easily comprehended form. Anderson's metroglyphs are essentially a way of condensing the several measurements he made on each plant to a simple but precise pictorial form. In fig. 1 each numbered dot represents a plant specimen. Each ray attached to a dot represents a definite character. For example, the erect apical ray shows the number of hairs on the pedicel (character Y4 of table 1): a long ray indicates 10 to 13 hairs, a short ray 5 to 9 hairs, and no ray 1 to 4 hairs. A dot with many long rays thus represents a plant which scored high in many characters and is, consequently, a hairy and strongly colored plant.

FIG. 1. Pictorialized scatter diagram showing variation in eight characters in a population of stemless white violets. This diagram is identical with one presented by Anderson (1954), except that individual plants are identified by numbers. Two of the characters are indicated along the margins, five by position and length of rays; black circles indicate individuals with a heavy blotch on the spur petal. (Horizontal axis: number of hairs per wing.)

A cursory examination of fig. 1 shows two distinct types of violets as well as intermediate forms. Thus plants 1 and 2, lacking rays and located together near the top of the diagram, represent one extreme; plants 3 and 18, with many long rays, represent the other. This organization of the sample into two easily grasped complexes is far from obvious in the raw data as set forth in table 1; yet the diagram is only a pictorialized representation of these data. By comparing plants 1 and 2 with plants 3 and 18, Anderson discovered other characters useful in distinguishing the two parental types: fragrance, date of flowering, and anthocyanin in the peduncle. He definitely identified one of the parents of the hybrid swarm as Viola pallens; the other was either V. blanda or V. incognita, two very similar species just coming into flower when the study was completed.

UNWEIGHTED HYBRID INDICES

The plants represented by fig. 1 evidently intergrade from Nos. 1 and 2 to Nos. 18 and 3. Plants 17 and 13 have only three rays. Plants 9, 20, 21, and 25 have four short rays. Near the other extreme several plants have six rays, some short and some long. This intergradation of several associated characters suggests the possibility of a scale of introgression or "hybrid index." Ideally the parental species should fall at the ends of such a scale and the F1 hybrids in the middle; the backcrosses should lie between the F1 and the parents.

Assigning a plant a place along an introgression scale according to its number of rays in fig. 1 is a possible but somewhat inaccurate solution to the hybrid index problem. More satisfactory is Anderson's suggestion (1956): each plant should be given a total score, the arithmetic sum of its scores for each of the nine characters when these are measured in the same units. Units are standardized by converting the scores for each character to a three-point scale. For example, the number of branches in the submidvein (X4 in table 1) varied from one to seven. Plants having one or two branches are scored 0; those with six or seven branches are scored 2; intermediate plants are scored 1.
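The three-point scoring just described can be sketched in a few lines of Python. This is only an illustration: the function name `three_point_score`, its cut-off parameters, and the sample values are assumptions introduced here, not part of Anderson's procedure. The cut-offs follow the example given for the number of branches in the submidvein.

```python
def three_point_score(value, low_max, high_min):
    """Score a measurement 0, 1, or 2 on a three-point scale.

    Values up to low_max score 0, values from high_min upward score 2,
    and everything in between scores 1.
    """
    if value <= low_max:
        return 0
    if value >= high_min:
        return 2
    return 1


# Hypothetical branch counts covering the observed range of one to seven.
branch_counts = [1, 2, 4, 6, 7]

# Cut-offs from the text: one or two branches -> 0, six or seven -> 2.
scores = [three_point_score(n, low_max=2, high_min=6) for n in branch_counts]
print(scores)

# An unweighted index for one plant is the plain sum of its scores over
# all nine characters; these nine scores are made up for illustration.
plant_scores = [2, 1, 2, 2, 0, 1, 2, 2, 1]
unweighted_index = sum(plant_scores)
print(unweighted_index)
```

Each character would need its own pair of cut-offs, chosen so that the same parental complex scores high throughout.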
Since there are nine characters altogether, the highest possible score on the three-point scale would be 2 + 2 + 2 + 2 + 2 + 2 + 2 + 2 + 2 = 18. It is important, of course, that the extremes of one complex (species) be given high scores in all characters. Similarly, the extreme members of the other complex should receive low scores.

Alternatively, the data for each character can be converted to statistical "standard measure" by dividing each value by its standard error. This method, although more accurate than simple three-point scoring, is somewhat time consuming. As before, the index value for each specimen is obtained by summing these "standard measures." Again, the extreme members of one complex should score high for all characters (this may require the use of negative numbers). In either method each character receives equal weight in the index.

In constructing such a hybrid index, it is tacitly assumed that a composite score based on several characters is preferable to one based on only a few. This assumption is easily checked in the present case. Fig. 2 is a scatter diagram of unweighted indices of petal venation and pubescence. These indices were obtained by summing the "standard measures" of the four aspects of petal venation on the one hand and the five aspects of pubescence on the other. A scatter diagram of indices constructed from three-point scores differed from fig. 2 only in minor respects.

FIG. 2. Scatter diagram of unweighted indices of petal venation (horizontal scale) and pubescence (vertical scale). Individual specimens are identified by same numbers as in fig. 1.

The correlation between the venation and pubescence indices is disappointingly low (r = 0.64). Since the metroglyphic analysis showed that heavy pubescence is associated with strongly expressed venation, the indices of fig. 2, which fail to show this relationship clearly, may not be good measures of over-all pubescence or petal venation. Fig. 3, on the other hand, suggests a much stronger association between venation and pubescence (r = 0.90).
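The "standard measure" conversion and the correlation between two unweighted indices can be sketched as follows. The sketch assumes that "standard error" here means the sample standard deviation of a character. The arrays hold made-up values, not Anderson's measurements, and the names `to_standard_measure`, `venation`, and `pubescence` are hypothetical.

```python
import numpy as np


def to_standard_measure(data):
    """Divide each column (character) by its sample standard deviation."""
    data = np.asarray(data, dtype=float)
    return data / data.std(axis=0, ddof=1)


# Made-up scores: rows are plants, columns are characters.
venation = np.array([
    [3.4, 2, 1],
    [1.5, 10, 3],
    [1.7, 6, 4],
    [2.7, 3, 3],
    [1.2, 8, 2],
])
pubescence = np.array([
    [16, 1],
    [6, 12],
    [12, 7],
    [9, 6],
    [11, 13],
])

# Unweighted index: every character contributes with equal weight.
# A character that runs in the opposite direction would first be
# multiplied by -1 so that one complex scores high throughout.
venation_index = to_standard_measure(venation).sum(axis=1)
pubescence_index = to_standard_measure(pubescence).sum(axis=1)

# Correlation between the two unweighted indices.
r = np.corrcoef(venation_index, pubescence_index)[0, 1]
print(round(r, 2))
```

After the conversion every character has unit spread, which is what gives each one equal weight in the summed index.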
The scatter diagram of fig. 3 presents the relationship between the number of branches in the submidvein of the spur petal and the number of hairs on the pedicel. Evidently an index composed of only one or two well-selected characters can be more meaningful than one made up of nine.

FIG. 3. Scatter diagram showing the relationship between number of branches in the submidvein of the spur petal and number of hairs on the pedicel. Individual specimens are identified by same numbers as in fig. 1.

An explanation is not far to seek. The variation in certain characters may have nothing to do with introgression. Including these irrelevant characters in a hybrid index simply introduces confusion into an otherwise orderly pattern of variation. One such variable in the present example, as Anderson (1954) noted, is number of hairs per wing petal. This character entered into the pubescence index weighted equally with four other measures of hairiness. Its inclusion in the index contributed to the random appearance of fig. 2 when compared with the relatively clear pattern shown in fig. 3.

WEIGHTED HYBRID INDICES

These remarks lead to a simple criterion for the inclusion of characters in an index or for assigning weights to them: the contribution of a character to an index should be in proportion to its usefulness in demonstrating a known or suspected relationship. In the present example, according to this criterion, characters such as those used in constructing fig. 3 should be weighted heavily in their respective indices. Number of hairs per wing petal, on the other hand, which contributed to the confusion in fig. 2, should receive a low weight or be omitted altogether from the pubescence index.

It may be argued that an index which weights observations according to how well they contribute to a desired conclusion lacks objectivity. The bias, however, is more apparent than real. Selection of characters which may be of value in a morphological study always involves biological judgment.
For his study of introgression in stemless white violets, Anderson chose character complexes of petal venation and pubescence. Other character complexes were rejected because they were found useless. Anderson's deliberate choice of characters in no way invalidates his conclusions. Since weighting is directly comparable to selection and rejection, the same arguments apply. To assign a weight of zero, for example, is to reject a character because it is of no value.

When a complex of characters is studied, certain of its members may be more useful than others. In the present case, number of pedicel hairs is clearly more useful than number of wing-petal hairs. It is not always easy, however, to assign relative weights to such characters, and in some instances statistical methods may be helpful. Biological judgment, of course, must play the leading role in choice of the statistical method for determining the desired weights.

CANONICAL ANALYSIS

It is clear from the foregoing arguments that an appropriate statistical method should show clearly the relationship between petal venation and pubescence. In fact, if the weights are chosen correctly, the association between the two indices should be at least as strong as that shown in fig. 3, since that can be obtained simply by assigning weights of zero to seven of the nine characters. As degree of association is conveniently measured by the coefficient of correlation, it seems natural to seek weights which maximize the correlation between the two complexes of characters.

These requirements are met by the technique of canonical analysis, described by Hotelling (1936). Canonical analysis presupposes the existence of two sets of variates (a set of x's and a set of y's) and seeks weights which maximize the correlation coefficient between the resulting indices. Thus, if

$$A = u_1 x_1 + u_2 x_2 + \dots + u_m x_m$$

is the weighted index of the x's, and if

$$B = v_1 y_1 + v_2 y_2 + \dots + v_n y_n$$

is the weighted index of the y's, then in a canonical analysis the weights (the u's and the v's) are chosen to make the correlation coefficient between A and B ($r_{ab}$) as large as possible.
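A minimal numerical sketch of this maximization, in Python with NumPy, is given here. It follows the block-matrix device outlined in the appendix as far as that passage can be read: the weights are taken from the characteristic vector belonging to the largest characteristic root of a matrix built from the blocks of the variance-covariance matrix. The function name `canonical_weights` and the synthetic data are assumptions made for illustration; the data are random numbers, not Anderson's measurements.

```python
import numpy as np


def canonical_weights(x, y):
    """Return (rho, u, v): the largest canonical correlation and the
    weight vectors for the x-set and the y-set."""
    p = x.shape[1]
    # Joint variance-covariance matrix, partitioned into blocks.
    m = np.cov(np.hstack([x, y]), rowvar=False)
    X, W, Y = m[:p, :p], m[:p, p:], m[p:, p:]
    # E has zero diagonal blocks; off-diagonal blocks are X^-1 W and Y^-1 W'.
    e = np.zeros_like(m)
    e[:p, p:] = np.linalg.solve(X, W)
    e[p:, :p] = np.linalg.solve(Y, W.T)
    roots, vectors = np.linalg.eig(e)
    k = np.argmax(roots.real)          # largest characteristic root
    vec = vectors[:, k].real
    return roots[k].real, vec[:p], vec[p:]


# Synthetic example: 25 "plants", four x-characters and five y-characters
# that share one hidden factor (purely illustrative numbers).
rng = np.random.default_rng(0)
latent = rng.normal(size=25)
x = np.outer(latent, [1.0, 0.8, 0.3, 0.9]) + rng.normal(scale=0.5, size=(25, 4))
y = np.outer(latent, [0.2, 0.4, 0.1, 1.0, 0.7]) + rng.normal(scale=0.5, size=(25, 5))

rho, u, v = canonical_weights(x, y)
a = x @ u                              # weighted index of the x's
b = y @ v                              # weighted index of the y's
r_ab = np.corrcoef(a, b)[0, 1]

# Strongest correlation obtainable from a single x and a single y.
best_single = max(
    abs(np.corrcoef(x[:, i], y[:, j])[0, 1])
    for i in range(x.shape[1])
    for j in range(y.shape[1])
)
print(round(rho, 3), round(r_ab, 3), round(best_single, 3))
```

The correlation between the two indices equals the largest root, and it cannot fall below the best single-pair correlation, because a single pair corresponds to setting all other weights to zero.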
Given two sets of data such as those of table 1, the computations to determine the weights are straightforward and will not be described here, inasmuch as good descriptions are available elsewhere (cf. Kendall, 1957). It should be noted, however, that these computations are time consuming. Checking the results presented here, for example, required slightly more than 10 hours on a "Monroe-matic" desk calculator. Fortunately, this work is easily accomplished on high-speed electronic computers. Suggestions for programming are included in an appendix to this paper.

Weights obtained by the technique of canonical analysis are presented in table 1. Index values for each plant are obtained by multiplying the value for each character by its weight and summing. Thus, for plant 1, the venation index is

$$(-1.00)(3.4) + (0.34)(2.0) + (0.09)(1.0) + (0.73)(1.0) = -1.90$$

To avoid negative numbers, 2.20 was added to the venation index. Consequently, the corrected venation index value for plant 1 is -1.90 + 2.20 = 0.30. Index values for each plant are included in table 1.

INTERPRETATIONS

Fig. 4 is the scatter diagram of the index values for each plant, obtained by the method of canonical analysis. Since the weights have been chosen to give the highest possible linear correlation between the two sets of characters, it is not surprising that the relationship is good (r = 0.97). What is perhaps more interesting is that the spots representing the individual plants fall in clusters along the calculated regression line.

FIG. 4. Scatter diagram of indices A and B of table 1. The horizontal scale is a weighted index composed of four aspects of petal venation. The vertical scale combines five measures of hairiness. Individual plants are identified by same numbers as in fig. 1.

In studying these groups it is helpful to compare fig. 4 with Anderson's pictorialized scatter diagram (fig. 1). The two indices are over-all measures of venation and pubescence. A high pubescence index score, for example, indicates a very hairy plant. Plants 3 and 18 clearly are heavily pubescent and have strong petal venation; they are probably closely related to the coarser of the parental species, Viola blanda or V. incognita. Plants 1 and 2, on the other hand, which have weak petal venation and sparse pubescence, are close to V. pallens. Plant 4 is about midway between these extremes on the scatter diagram and may be an F1 hybrid of the two species.

Halfway between plants 1 and 4 is a cluster of 10 specimens, centered approximately on plant 13. It is tempting to suggest that these may be first backcrosses to V. pallens, the more delicate of the two parents. Similarly, a second group of 10 specimens centering on plant 10 may represent backcrosses to the other parent. In fact, the preponderance of backcrosses in the sample is striking. These backcrosses are to both parents, and each backcross type makes up about 40 per cent of the sample.

These new genetic forms may have some selective advantage. Anderson (1954) reported that the hybrid swarm was collected along a woodland road which had been repeatedly relocated and graded. Might it be possible, for example, to classify these disturbed soils into shaded and open areas corresponding to the two types of backcrosses? These considerations suggest a possible application of accurate hybrid indices. The individual members of a hybrid swarm, properly classified along an introgression scale, might serve as sensitive "indicator plants" to distinguish micro-environmental types.

The relationship between venation and pubescence indices being satisfactory, the combined hybrid index is constructed next.
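The index arithmetic described under CANONICAL ANALYSIS, together with the combined index, can be sketched as a short Python function. The weights and the measurements for plant 1 are entered as read from table 1 and from the worked example in the text; they should be treated as assumptions wherever the printed table is hard to read. The names `U`, `V`, `OFFSET`, and `weighted_indices` are hypothetical.

```python
# Weights as read from lines "U" and "V" of table 1 (assumed values).
U = [-1.00, 0.34, 0.09, 0.73]          # petal venation characters X1..X4
V = [0.02, 0.02, -0.03, 0.55, 0.77]    # pubescence characters Y1..Y5
OFFSET = 2.20                          # added to avoid negative venation indices


def weighted_indices(x, y):
    """Return (A, B, C): venation index, pubescence index, and their sum."""
    a = sum(w * value for w, value in zip(U, x)) + OFFSET
    b = sum(w * value for w, value in zip(V, y))
    return a, b, a + b


# Plant 1, with measurements as read from table 1.
a1, b1, c1 = weighted_indices([3.4, 2, 1, 1], [16, 1, 2.0, 1, 0])
print(f"A = {a1:.2f}, B = {b1:.2f}, C = {c1:.2f}")
```

For plant 1 the venation index reproduces the worked value in the text, -1.90 + 2.20 = 0.30, up to floating-point rounding.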
A simple composite index (C in table 1) is obtained by adding the pubescence and venation index scores for each plant. A more accurate index could be obtained, of course, by converting each of the two sets to statistical "standard measure" before adding.

The importance of a composite hybrid index derives from its use as a scale showing the genetic relationships among the members of the hybrid swarm. In fig. 5 the horizontal axis represents the combined hybrid index or introgression scale. Plotted against this hybrid index in a scatter diagram is a tenth character measured by Anderson: degree of coloring between petal veins, scored on a four-point scale.

FIG. 5. Variation in blotching of spur of Viola pallens (specimens 1 and 2) in relation to degree of introgression from a more deeply colored species (specimen 3). The horizontal scale is a hybrid index obtained by summing arithmetically the weighted indices of petal venation and pubescence of fig. 4. Degree of development of pigment between veins of the petal was scored: 1, none; 2, slight; 3, blotched; 4, solid. Individual plants are identified by same numbers as in fig. 1.

Plants 6 and 10, which by comparison with fig. 4 are interpreted as backcrosses to the coarser parent, were colored solidly between the veins. Plants 11, 15, 12, 14, and 18 were blotched. Color in other specimens was slight or lacking. The over-all impression is that degree of coloring rises to a peak between values of 12 and 13 units on the introgression scale and drops off abruptly in both directions. Plants 8, 22, and 3, located to the right of the peak, had no intervenal color. This strongly suggests that the coarser parent, Viola blanda or V. incognita, lacked intervenal blotching. Plant 18 is an apparent exception to this rule, but possibly it should have been classified as color grade 2 ("slight" on Dr. Anderson's original data sheet) instead of grade 3 ("blotched").
The genetic mechanism which could produce this situation is obscure, although Anderson (1954) described similar results obtained experimentally with four genera scattered in as many families. In any case, the value of the weighted hybrid index is clear. Intervenal blotching is shown to occur in the backcrosses and not in the coarser parent, as Anderson had suggested. Increased accuracy thus leads to a better description of one of the parents as well as to a new hypothesis on the origin of petal blotching in the hybrid swarm.

Another interpretation, however, is possible. Plants 3 and 18 may not be specimens of the coarser parent but instead backcrosses to it. According to this hypothesis, Viola blanda (or V. incognita) was not included in the sample studied and should in fact fall to the right of 18 units on the introgression scale of fig. 5. The cluster of specimens in fig. 4 centered around plants 6 and 10 would then be F1 hybrids, and petal blotching could be interpreted as a manifestation of simple heterosis or the interplay of complementary factors. Anderson's observations on date of flowering support this interpretation. Violets with many of the characteristics of the putative coarser parent were not in flower when the hybrid swarm was collected, and hence were not included in the sample.

These considerations do not, of course, invalidate the introgression scale. They merely indicate that it may be incomplete. Without the two solid reference points of the parental species, identification of F1 hybrids and backcrosses is obviously a hazardous procedure. A complicated mathematical procedure is thus not sufficient in itself for the solution of biological problems. In fact, figs. 4 and 5 can be considered as no more than ways of looking at the data. Sound interpretation of these results must always involve the biologist's judgment. The chief claim to be made for these graphs, then, is that they may suggest relationships for further study. They are worthwhile only insofar as they accomplish this objective more effectively than simple unweighted indices.

The type of statistical analysis chosen for this study depended completely on the biological techniques used in earlier stages of the investigation. Canonical analysis was appropriate only because Dr. Anderson noticed that in the variable population of violets degree of hairiness seemed to be associated with degree of expression of petal venation. This observation in turn led to the identification and measurement of two sets of characters. If several unrelated characters had been scored, a principal component analysis (cf. Kendall, 1957, p. 13) might have been appropriate; a canonical analysis certainly could not have been used.

Furthermore, it is obvious that inclusion of both parental species, definitely identified as such, would have improved the hybrid index by giving it two fixed reference points. This was not possible in the present study because the coarser parent was apparently not fully in flower when the study was made. As Anderson has demonstrated repeatedly, metroglyphic analysis can lead, by the method of extrapolated correlates, to the identification of the parents of a hybrid swarm. Thus, metroglyphic analyses are always recommended before the construction of weighted hybrid indices is attempted.

SUMMARY

A weighted hybrid index was computed from a sample of 25 stemless white violets previously studied by Anderson (1954). It is argued that weighting is desirable in cases where several characters are studied, since some characters are more useful than others. The statistical method appropriate for determining the weights depends on the type of data to be analyzed. In the present case a canonical analysis led to scatter diagrams which suggested genetic relationships among the members of the sample. It is emphasized that canonical analysis and other multivariate statistical methods should be applied only after more general methods, especially metroglyphic analysis, have thrown light on the nature of the problem.

ACKNOWLEDGMENTS

In the present paper a specialized mathematical method is brought to the attention of botanists. This has involved difficulties of presentation, since the worlds of botany and mathematics make only infrequent contact.
I wish to thank Drs. F. E. Egler, H. M. Raup, F. R. Fosberg, and H. F. Robinson for suggestions on how this material might be effectively presented. I am especially grateful to Dr. J. B. Tukey for valuable manuscript criticisms and stimulating discussions. My indebtedness to Dr. Edgar Anderson, for much more than the use of his excellent data, is obvious. Without his enthusiastic criticism and encouragement this study would not have been made.

APPENDIX

Determining the weights for a hybrid index by canonical analysis requires the solution of an intricate determinantal equation. Programming for high-speed computers is simplified by the following device. Suppose the variance-covariance matrix of the two sets of variates is

$$M = \begin{bmatrix} X & W \\ W' & Y \end{bmatrix}$$

Here X is the variance-covariance matrix of the x-variables (petal venation characters), and Y is that of the y-variables (pubescence characters). W is the matrix of covariances between these two groups and W' is its transpose. Then the characteristic vector corresponding to the largest characteristic root of the matrix E gives the two indices, where

$$E = \begin{bmatrix} 0 & X^{-1} W \\ Y^{-1} W' & 0 \end{bmatrix}$$

0, of course, is the zero matrix and $X^{-1}$ is the inverse of X. Programs for determining the characteristic roots of matrices are available for many computers. For desk calculators, the methods suggested by Rao (1952, p. 367) are recommended.

LITERATURE CITED

ANDERSON, E. 1954. An analysis of introgression in a population of stemless white violets. Ann. Mo. Bot. Gard., 41: 263-27.
ANDERSON, E. 1956. Character association analysis as a tool for the plant breeder. Genetics in Plant Breeding. Brookhaven Symposia in Biol., 9: 123-14.
ANDERSON, E. 1957. A semigraphical method for the analysis of complex problems. Proc. Nat. Acad. Sci., 43: 923-927.
HOTELLING, H. 1936. Relations between two sets of variables. Biometrika, 28: 321-377.
KENDALL, M. G. 1957. A Course in Multivariate Analysis. New York.
RAO, C. R. 1952. Advanced Statistical Methods in Biometric Research. New York.