A predictor variable is another name for
What is a Predictor Variable?

A predictor variable, sometimes also called an independent variable, is used to make predictions about a dependent variable. Predictor variables are extremely common in data science and the scientific method. The predictor variable is the counterpart to the dependent variable, which is often directly informed or affected by the predictor variable.

How does a Predictor Variable work?

All experiments examine or deal with some form of variable. Often a variable is not only measured but also manipulated or transformed. The main forms of variables are predictor variables, independent variables, and dependent (or outcome) variables. The predictor variable is often mistaken for the independent variable; however, the two differ slightly in definition. Whereas an independent variable may be manipulated by the experimenter during the experiment, a predictor variable is not. When changes in a predictor variable do arise, they are usually naturally occurring.

Imagine a teacher who wants to understand the effect of missed class time on grade point average (GPA). The predictor variable is the students' attendance rate, and the outcome variable is the grade point average. In theory, the most direct way to study the effect of attendance on GPA would be to take a select group of students and instruct them to skip class entirely, then compare them with students who attend as usual. However, doing so is very unlikely in practice and borderline unethical. Alternatively, the teacher can look at the naturally occurring attendance rates and infer the relationship to GPA that way. By comparing the GPAs of students with lower attendance against those of students with higher attendance, the teacher can draw general conclusions about attendance. In this situation, attendance is a predictor variable, because it was observed rather than altered during the study.
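The teacher's comparison can be sketched in a few lines of Python. This is only an illustration: the records, the 80% cut-off, and the function name `mean_gpa_by_attendance` are all invented here and do not come from any real study. The point is that attendance is read off as it naturally occurred and is never assigned by the teacher.

```python
from statistics import mean

# Hypothetical records, invented purely for illustration: (attendance_rate, gpa).
records = [
    (0.95, 3.6), (0.90, 3.4), (0.85, 3.1), (0.80, 3.0),
    (0.70, 2.7), (0.60, 2.5), (0.55, 2.2), (0.98, 3.8),
]

def mean_gpa_by_attendance(rows, threshold=0.8):
    """Split students by observed attendance and return (low_mean, high_mean)."""
    low = [gpa for rate, gpa in rows if rate < threshold]
    high = [gpa for rate, gpa in rows if rate >= threshold]
    return mean(low), mean(high)

# Attendance is the predictor variable; GPA is the outcome variable.
low_mean, high_mean = mean_gpa_by_attendance(records)
print(f"mean GPA, attendance below 80%: {low_mean:.2f}")
print(f"mean GPA, attendance 80% or above: {high_mean:.2f}")
```

A gap between the two group means suggests an association, but because attendance was not manipulated, it does not by itself show that attendance causes the difference.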
Dependent and independent variables are variables in mathematical modeling, statistical modeling and experimental sciences. Dependent variables receive this name because, in an experiment, their values are studied under the supposition or demand that they depend, by some law or rule (e.g., by a mathematical function), on the values of other variables. Independent variables, in turn, are not seen as depending on any other variable in the scope of the experiment in question.[a] In this sense, some common independent variables are time, space, density, mass, fluid flow rate,[1][2] and previous values of some observed value of interest (e.g. human population size) used to predict future values (the dependent variable).[3]

Of the two, it is always the dependent variable whose variation is being studied, by altering inputs, which are also known as regressors in a statistical context. In an experiment, any variable that can be attributed a value without attributing a value to any other variable is called an independent variable. Models and experiments test the effects that the independent variables have on the dependent variables. Sometimes, even if their influence is not of direct interest, independent variables may be included for other reasons, such as to account for their potential confounding effect.

Mathematics

In mathematics, a function is a rule for taking an input (in the simplest case, a number or set of numbers)[5] and providing an output (which may also be a number).[5] A symbol that stands for an arbitrary input is called an independent variable, while a symbol that stands for an arbitrary output is called a dependent variable.[6] The most common symbol for the input is x, and the most common symbol for the output is y; the function itself is commonly written y = f(x).[6][7] It is possible to have multiple independent variables or multiple dependent variables.
For instance, in multivariable calculus, one often encounters functions of the form z = f(x, y), where z is a dependent variable and x and y are independent variables.[8] Functions with multiple outputs are often referred to as vector-valued functions.

Modeling

In mathematical modeling, the dependent variable is studied to see if and how much it varies as the independent variables vary. In the simple stochastic linear model $y_i = a + b x_i + e_i$, the term $y_i$ is the $i$th value of the dependent variable and $x_i$ is the $i$th value of the independent variable. The term $e_i$ is known as the "error" and contains the variability of the dependent variable not explained by the independent variable.

With multiple independent variables, the model is $y_i = a + b_1 x_{i,1} + b_2 x_{i,2} + \dots + b_n x_{i,n} + e_i$, where $n$ is the number of independent variables and each independent variable has its own coefficient $b_j$.

The linear regression model is now discussed. To use linear regression, a scatter plot of data is generated with X as the independent variable and Y as the dependent variable. The data form a bivariate dataset, $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$. The simple linear regression model takes the form $Y_i = \alpha + \beta x_i + U_i$, for $i = 1, 2, \dots, n$. In this case, $U_1, \dots, U_n$ are independent random variables, which occurs when the measurements do not influence each other. Through propagation of independence, the independence of the $U_i$ implies independence of the $Y_i$, even though each $Y_i$ has a different expectation value. Each $U_i$ has an expectation value of 0 and a variance of $\sigma^2$.[9]

Expectation of $Y_i$. Proof:[9] $E[Y_i] = E[\alpha + \beta x_i + U_i] = \alpha + \beta x_i + E[U_i] = \alpha + \beta x_i$.

The line of best fit for the bivariate dataset takes the form $y = \alpha + \beta x$ and is called the regression line. $\alpha$ and $\beta$ correspond to the intercept and slope, respectively.[9]

Simulation

In simulation, the dependent variable is changed in response to changes in the independent variables.
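As a rough illustration of both the simple linear regression model and the simulation idea above, the sketch below generates a dependent variable from an independent variable plus random error, then recovers the intercept and slope with the standard least-squares formulas. The "true" parameter values, the seed, and the helper name `fit_line` are assumptions made for this example only.

```python
import random

def fit_line(xs, ys):
    """Least-squares estimates (alpha_hat, beta_hat) for y = alpha + beta * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta_hat = sxy / sxx                  # slope
    alpha_hat = y_bar - beta_hat * x_bar  # intercept
    return alpha_hat, beta_hat

# Assumed "true" parameters, chosen only for this illustration.
ALPHA, BETA, SIGMA = 2.0, 0.5, 1.0

rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [i / 10 for i in range(200)]  # independent variable
# Dependent variable: alpha + beta * x plus an error with mean 0 and variance SIGMA**2.
ys = [ALPHA + BETA * x + rng.gauss(0.0, SIGMA) for x in xs]

alpha_hat, beta_hat = fit_line(xs, ys)
print(f"estimated intercept: {alpha_hat:.3f}")
print(f"estimated slope: {beta_hat:.3f}")
```

With noise-free data the estimates match the generating line exactly; with noise they should land near the assumed values of 2.0 and 0.5 without matching them exactly.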
Statistics

In an experiment, the variable manipulated by an experimenter is called an independent variable.[10] The dependent variable is the event expected to change when the independent variable is manipulated.[11]

In data mining tools (for multivariate statistics and machine learning), the dependent variable is assigned a role as target variable (or in some tools as label attribute), while an independent variable may be assigned a role as regular variable.[12] Known values for the target variable are provided for the training data set and test data set, but should be predicted for other data. The target variable is used in supervised learning algorithms but not in unsupervised learning.

Statistics synonyms

Depending on the context, an independent variable is sometimes called a "predictor variable", "regressor", "covariate", "manipulated variable", "explanatory variable", "exposure variable" (see reliability theory), "risk factor" (see medical statistics), "feature" (in machine learning and pattern recognition) or "input variable".[13][14] In econometrics, the term "control variable" is usually used instead of "covariate".[15][16][17][18][19]

"Explanatory variable" is preferred by some authors over "independent variable" when the quantities treated as independent variables may not be statistically independent or independently manipulable by the researcher.[20][21] If the independent variable is referred to as an "explanatory variable", then the term "response variable" is preferred by some authors for the dependent variable.[14][20][21]

In economics, independent variables are also called exogenous variables.
Depending on the context, a dependent variable is sometimes called a "response variable", "regressand", "criterion", "predicted variable", "measured variable", "explained variable", "experimental variable", "responding variable", "outcome variable", "output variable", "target" or "label".[14] In economics, endogenous variables usually refer to the target.

"Explained variable" is preferred by some authors over "dependent variable" when the quantities treated as "dependent variables" may not be statistically dependent.[22] If the dependent variable is referred to as an "explained variable", then the term "predictor variable" is preferred by some authors for the independent variable.[22]

Variables may also be referred to by their form: continuous or categorical, which in turn may be binary/dichotomous, nominal categorical, and ordinal categorical, among others.

An example is provided by the analysis of trend in sea level by Woodworth (1987). Here the dependent variable (and variable of most interest) was the annual mean sea level at a given location, for which a series of yearly values was available. The primary independent variable was time. Use was made of a covariate consisting of yearly values of annual mean atmospheric pressure at sea level. The results showed that including the covariate gave improved estimates of the trend against time, compared to analyses which omitted the covariate.

Other variables

A variable may be thought to alter the dependent or independent variables, but may not actually be the focus of the experiment. In that case, the variable is kept constant or monitored to try to minimize its effect on the experiment. Such variables may be designated as a "controlled variable", "control variable", or "fixed variable".
Extraneous variables, if included in a regression analysis as independent variables, may aid a researcher with accurate response parameter estimation, prediction, and goodness of fit, but are not of substantive interest to the hypothesis under examination. For example, in a study examining the effect of post-secondary education on lifetime earnings, some extraneous variables might be gender, ethnicity, social class, genetics, intelligence, age, and so forth.

A variable is extraneous only when it can be assumed (or shown) to influence the dependent variable. If included in a regression, it can improve the fit of the model. If it is excluded from the regression and has a non-zero covariance with one or more of the independent variables of interest, its omission will bias the regression's result for the effect of that independent variable of interest. This effect is called confounding or omitted variable bias; in these situations, design changes and/or statistical control of the variable are necessary.

Extraneous variables are often classified into three types:
In modelling, variability that is not covered by the independent variable is designated by $e_i$ and is known as the "residual", "side effect", "error", "unexplained share", "residual variable", "disturbance", or "tolerance".
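The omitted variable bias described earlier for extraneous variables can be illustrated with a small simulation. The sketch below is a minimal example under assumed values: the coefficients (1.0 for `x`, 2.0 for `z`), the 0.8 link between `z` and `x`, the seed, and the helper names `slope_one` and `slopes_two` are all chosen here for illustration. The extraneous variable `z` influences `y` and is correlated with `x`, so leaving it out distorts the estimated effect of `x`.

```python
import random

def slope_one(x, y):
    """Slope from regressing y on a single regressor x (with intercept)."""
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    sxx = sum((a - xb) ** 2 for a in x)
    sxy = sum((a - xb) * (b - yb) for a, b in zip(x, y))
    return sxy / sxx

def slopes_two(x, z, y):
    """Slopes (b_x, b_z) from regressing y on x and z (with intercept)."""
    n = len(x)
    xb, zb, yb = sum(x) / n, sum(z) / n, sum(y) / n
    sxx = sum((a - xb) ** 2 for a in x)
    szz = sum((c - zb) ** 2 for c in z)
    sxz = sum((a - xb) * (c - zb) for a, c in zip(x, z))
    sxy = sum((a - xb) * (b - yb) for a, b in zip(x, y))
    szy = sum((c - zb) * (b - yb) for c, b in zip(z, y))
    det = sxx * szz - sxz ** 2  # determinant of the 2x2 normal equations
    return (szz * sxy - sxz * szy) / det, (sxx * szy - sxz * sxy) / det

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]        # extraneous variable
x = [0.8 * zi + rng.gauss(0, 1) for zi in z]   # variable of interest, correlated with z
y = [1.0 * xi + 2.0 * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]

biased = slope_one(x, y)        # z omitted from the regression
b_x, b_z = slopes_two(x, z, y)  # z included as a control
print(f"slope on x with z omitted:  {biased:.2f}")
print(f"slope on x with z included: {b_x:.2f}")
```

Under these assumptions the regression that omits `z` attributes part of the effect of `z` to `x`, while the regression that includes `z` should recover a slope close to the assumed value of 1.0.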
What is the predictor variable called?

Independent variables are also known as predictors, factors, treatment variables, explanatory variables, input variables, x-variables, and right-hand variables, because they appear on the right side of the equals sign in a regression equation.

Is a predictor variable the same as a dependent variable?

No. The outcome variable is also called the response or dependent variable, and the risk factors and confounders are called the predictors, or explanatory or independent variables. In regression analysis, the dependent variable is denoted "Y" and the independent variables are denoted "X".

Is a predictor variable an IV?

The independent variable (IV) is sometimes also called a "predictor" or "predicting variable".