The linear model assumes that the relations between two variables can be summarized by a straight line. It is customary to call the independent variable X and the dependent variable Y. The X variable is often called the predictor and Y is often called the criterion (the plural of 'criterion' is 'criteria'). It is customary to talk about the regression of Y on X, so that if we were predicting GPA from SAT we would talk about the regression of GPA on SAT.

Scores on a dependent variable can be thought of as the sum of two parts: (1) a linear function of an independent variable, and (2) random error. In symbols:

$$Y_i = a + bX_i + e_i$$

where $Y_i$ is a score on the dependent variable for the ith person, $a + bX_i$ describes a line or linear function relating X to Y, and $e_i$ is an error. Note that there is a separate score for each X, Y, and error (these are variables), but only one value of a and b, which are population parameters.

The portion of the equation denoted by $a + bX_i$ defines a line. The symbol X represents the independent variable. The symbol a represents the Y intercept, that is, the value that Y takes when X is zero. The symbol b describes the slope of a line. It denotes the number of units that Y changes when X changes 1 unit. If the slope is 2, then when X increases 1 unit, Y increases 2 units. If the slope is -.25, then as X increases 1 unit, Y decreases .25 units.

We usually have to estimate the parameters. The equation for estimates rather than parameters has the same form, with a and b now standing for sample estimates:

$$Y_i = a + bX_i + e_i \qquad (2.2)$$

If we take out the error part of equation 2.2, we have a straight line that we can use to predict values of Y from values of X, which is one of the main uses of regression. It looks like this:

$$Y'_i = a + bX_i \qquad (2.3)$$

Equation 2.3 says that the predicted value of Y (written Y' here) is equal to a linear function of X.

The slope of a line (b) is sometimes defined as rise over run. If Y is the vertical axis, then rise refers to change in Y. If X is the horizontal axis, then run refers to change in X. Therefore, rise over run is the ratio of change in Y to change in X. This means exactly the same thing as the number of units that Y changes when X changes 1 unit (e.g., 2/1 = 2, 10/12 = .83).

The idea of a linear transformation is that one variable is mapped onto another in a 1-to-1 fashion. A linear transformation allows you to multiply (or divide) the original variable and then to add (or subtract) a constant. In junior high school, you were probably shown the transformation Y = mX+b, but we use Y = a+bX instead. A linear transformation is what is permissible in the transformation of interval scale data in Stevens's taxonomy (nominal, ordinal, interval, and ratio). The value a, the Y intercept, shifts the line up or down the Y-axis. The value of b, the slope, controls how quickly the line rises as we move from left to right.

One further example may help to illustrate the notion of the linear transformation. We can convert temperature in degrees Centigrade to degrees Fahrenheit using a linear transformation. Note that the Y intercept is 32, because when X=0, Y=32. Between the freezing and boiling points of water, X runs from 0 to 100, so the run is 100. Rise over the same part of the line is 212-32, or 180. The slope is therefore 180/100 = 1.8. We can write the equation for the linear transformation Y=32+1.8X or F=32+1.8C.

The regression problems that we deal with will use a line to transform values of X to predict values of Y. In general, not all of the points will fall on the line, but we will choose our regression line so as to best summarize the relations between X and Y. Suppose we measured the height and weight of a random sample of adults in shopping malls in the U.S. We want to predict weight from height in the population.

How do we find the slope and intercept for the regression line with a single independent variable? (Either formula for the slope is acceptable.) What does it mean to test the significance of the regression sum of squares? R-square? Why does testing for the regression sum of squares turn out to have the same result as testing for R-square?
What does it mean to choose a regression line to satisfy the loss function of least squares? How do changes in the slope and intercept affect (move) the regression line? According to the regression (linear) model, what are the two parts of variance of the dependent variable? (Write an equation and state in your own words what this says.)
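As a rough illustration of the ideas above, the following Python sketch applies the linear transformation Y = a + bX to the Centigrade-to-Fahrenheit example, recovers the slope as rise over run, and splits a score into a predicted part and an error part. The function names (`linear_transform`, `rise_over_run`, `split_score`) and the numbers in the last example are hypothetical, chosen only for illustration; they do not come from the notes or from any data set.

```python
def linear_transform(x, a, b):
    """Return a + b*x: multiply x by the slope b, then add the intercept a."""
    return a + b * x


def rise_over_run(x1, y1, x2, y2):
    """Slope as change in Y (rise) divided by change in X (run)."""
    return (y2 - y1) / (x2 - x1)


def split_score(y, x, a, b):
    """Split an observed score y into a predicted part and an error part."""
    predicted = linear_transform(x, a, b)
    error = y - predicted
    return predicted, error


# Centigrade to Fahrenheit: intercept 32, slope 1.8.
freezing_f = linear_transform(0, a=32, b=1.8)
boiling_f = linear_transform(100, a=32, b=1.8)

# Slope from the freezing and boiling points of water: (212 - 32) / (100 - 0).
slope = rise_over_run(0, 32, 100, 212)

# Made-up score (x = 3, y = 10) and made-up line (a = 1, b = 2).
predicted, error = split_score(y=10, x=3, a=1, b=2)

print(freezing_f, boiling_f, slope, predicted, error)
```

Changing `a` shifts every predicted value up or down by the same amount, while changing `b` changes how much the prediction moves for each 1-unit change in X.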