The answer depends on: consistent with WHAT?
In estimating a linear relationship using ordinary least squares (OLS), the regression estimates are chosen so that the sum of the squared residuals is minimised. This method treats every residual as equally important. There may be reasons why treating all residuals in the same way is not appropriate. One possibility is that there is reason to believe there is a systematic trend in the size of the error term (residual). One way to compensate for such heteroscedasticity is to give less weight to a residual where it is expected to be larger. So, in the regression calculations, rather than minimising the sum of the squared residuals, what is minimised is their weighted sum of squares.
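A minimal sketch of this idea in SAS, assuming a hypothetical data set OBS with response y and regressor x whose error variance is taken (for illustration only) to grow with x squared:

/* Hypothetical sketch: weighted least squares with weights inversely
   proportional to the assumed error variance (here x*x). */
data weighted;
   set obs;
   w = 1 / (x*x);   /* less weight where the residual is expected to be larger */
run;

proc reg data=weighted;
   model y = x;
   weight w;        /* minimises the weighted sum of squared residuals */
run;

In practice the weights come from whatever is believed about the error variance; the 1/(x*x) choice above is only an assumption for the sketch.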
The LSMEANS statement computes least squares means (LS-means) of fixed effects. As in the GLM procedure, LS-means are predicted population margins; that is, they estimate the marginal means over a balanced population. In a sense, LS-means are to unbalanced designs as class and subclass arithmetic means are to balanced designs. The $L$ matrix constructed to compute them is the same as the $L$ matrix formed in PROC GLM; however, the standard errors are adjusted for the covariance parameters in the model.

Each LS-mean is computed as $L\widehat{\beta}$, where $L$ is the coefficient matrix associated with the least squares mean and $\widehat{\beta}$ is the estimate of the fixed-effects parameter vector (see the section Estimating Fixed and Random Effects in the Mixed Model). The approximate standard error for the LS-mean is computed as the square root of $L(X'\widehat{V}^{-1}X)^{-}L'$.

LS-means can be computed for any effect in the MODEL statement that involves CLASS variables. You can specify multiple effects in one LSMEANS statement or in multiple LSMEANS statements, and all LSMEANS statements must appear after the MODEL statement. As in the ESTIMATE statement, the $L$ matrix is tested for estimability, and if this test fails, PROC MIXED displays "Non-est" for the LS-means entries.

Assuming the LS-mean is estimable, PROC MIXED constructs an approximate t test to test the null hypothesis that the associated population quantity equals zero. By default, the denominator degrees of freedom for this test are the same as those displayed for the effect in the "Tests of Fixed Effects" table (see the section Default Output).

Table 56.5 summarizes important options in the LSMEANS statement. All LSMEANS options are subsequently discussed in alphabetical order.

Table 56.5: Summary of Important LSMEANS Statement Options

Construction and Computation of LS-Means
  AT          modifies covariate values in computing LS-means
  BYLEVEL     computes separate margins
  DIFF        requests differences of LS-means
  OM          specifies a weighting scheme for LS-mean computation
  SINGULAR=   tunes estimability checking
  SLICE=      partitions F tests (simple effects)

Degrees of Freedom and P-values
  ADJDFE=     determines whether to compute row-wise denominator degrees of freedom with DDFM=SATTERTHWAITE or DDFM=KENWARDROGER
  ADJUST=     determines the method for multiple-comparison adjustment of LS-mean differences
  ALPHA=      determines the confidence level ($1-\alpha$)
  DF=         assigns a specific value to degrees of freedom for tests and confidence limits

Statistical Output
  CL          constructs confidence limits for means and/or mean differences
  CORR        displays the correlation matrix of LS-means
  COV         displays the covariance matrix of LS-means
  E           prints the $L$ matrix

You can specify the following options in the LSMEANS statement after a slash (/).

ADJDFE=SOURCE
ADJDFE=ROW
specifies how denominator degrees of freedom are determined when p-values and confidence limits are adjusted for multiple comparisons with the ADJUST= option. When you do not specify the ADJDFE= option, or when you specify ADJDFE=SOURCE, the denominator degrees of freedom for multiplicity-adjusted results are the denominator degrees of freedom for the LS-mean effect in the "Type 3 Tests of Fixed Effects" table. When you specify ADJDFE=ROW, the denominator degrees of freedom for multiplicity-adjusted results correspond to the degrees of freedom displayed in the DF column of the "Differences of Least Squares Means" table.

The ADJDFE=ROW setting is particularly useful if you want multiplicity adjustments to take into account that denominator degrees of freedom are not constant across LS-mean differences.
This can be the case, for example, when the DDFM=SATTERTHWAITE or DDFM=KENWARDROGER degrees-of-freedom method is in effect.

In one-way models with heterogeneous variance, combining certain ADJUST= options with the ADJDFE=ROW option corresponds to particular methods of performing multiplicity adjustments in the presence of heteroscedasticity. For example, the following statements fit a heteroscedastic one-way model and perform Dunnett's T3 method (Dunnett 1980), which is based on the studentized maximum modulus (ADJUST=SMM):

proc mixed;
   class A;
   model y = A / ddfm=satterth;
   repeated / group=A;
   lsmeans A / adjust=smm adjdfe=row;
run;

If you combine the ADJDFE=ROW option with ADJUST=SIDAK, the multiplicity adjustment corresponds to the T2 method of Tamhane (1979), while ADJUST=TUKEY corresponds to the method of Games and Howell (1976). Note that ADJUST=TUKEY gives the exact results for the case of fractional degrees of freedom in the one-way model, but it does not take into account that the degrees of freedom are subject to variability. A more conservative method, such as ADJUST=SMM, might protect the overall error rate better.

Unless the ADJUST= option of the LSMEANS statement is specified, the ADJDFE= option has no effect.

ADJUST=BON
ADJUST=DUNNETT
ADJUST=SCHEFFE
ADJUST=SIDAK
ADJUST=SIMULATE<(sim-options)>
ADJUST=SMM|GT2
ADJUST=TUKEY
requests a multiple-comparison adjustment for the p-values and confidence limits for the differences of LS-means. By default, PROC MIXED adjusts all pairwise differences unless you specify ADJUST=DUNNETT, in which case PROC MIXED analyzes all differences with a control level. The ADJUST= option implies the DIFF option.

The BON (Bonferroni) and SIDAK adjustments involve correction factors described in Chapter 39, The GLM Procedure, and Chapter 58, The MULTTEST Procedure; also see Westfall and Young (1993) and Westfall et al. (1999). When you specify ADJUST=TUKEY and your data are unbalanced, PROC MIXED uses the approximation described in Kramer (1956). Similarly, when you specify ADJUST=DUNNETT and the LS-means are correlated, PROC MIXED uses the factor-analytic covariance approximation described in Hsu (1992). The preceding references also describe the SCHEFFE and SMM adjustments.

The SIMULATE adjustment computes adjusted p-values and confidence limits from the simulated distribution of the maximum or maximum absolute value of a multivariate t random vector. All covariance parameters except the residual variance are fixed at their estimated values throughout the simulation, potentially resulting in some underdispersion. The simulation estimates $q$, the true $(1-\alpha)$th quantile, where $1-\alpha$ is the confidence coefficient. The default $\alpha$ is 0.05, and you can change this value with the ALPHA= option in the LSMEANS statement.

The number of samples is set so that the tail area for the simulated $\widehat{q}$ is within $\gamma$ of $1-\alpha$ with $100(1-\epsilon)\%$ confidence. In equation form,

$$P\left(\,\left|F(\widehat{q}) - (1-\alpha)\right| \le \gamma\,\right) = 1 - \epsilon$$

where $\widehat{q}$ is the simulated quantile and $F$ is the true distribution function of the maximum; see Edwards and Berry (1987) for details. By default, $\gamma = 0.005$ and $\epsilon = 0.01$, placing the tail area of $\widehat{q}$ within 0.005 of 0.95 with 99% confidence. The ACC= and EPS= sim-options reset $\gamma$ and $\epsilon$, respectively; the NSAMP= sim-option sets the sample size directly; and the SEED= sim-option specifies an integer used to start the pseudo-random number generator for the simulation. If you do not specify a seed, or if you specify a value less than or equal to zero, the seed is generated from reading the time of day from the computer clock.
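As a hypothetical sketch (the data set MYDATA and its variables are invented here), a call that tightens the tail-area accuracy and fixes the seed for reproducibility might look as follows:

proc mixed data=mydata;
   class A;
   model y = A;
   /* ACC= tightens the tail-area tolerance; SEED= makes the simulation repeatable */
   lsmeans A / adjust=simulate(acc=0.002 seed=12345) cl;
run;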
For additional descriptions of these and other simulation options, see the section LSMEANS Statement in Chapter 39, The GLM Procedure.

ALPHA=number
requests that a t-type confidence interval be constructed for each of the LS-means with confidence level $1 - number$. The value of number must be between 0 and 1; the default is 0.05.

AT variable=value
AT (variable-list)=(value-list)
AT MEANS
enables you to modify the values of the covariates used in computing LS-means. By default, all covariate effects are set equal to their mean values for computation of standard LS-means. The AT option enables you to assign arbitrary values to the covariates. Additional columns in the output table indicate the values of the covariates.

If there is an effect containing two or more covariates, the AT option sets the effect equal to the product of the individual means rather than the mean of the product (as with standard LS-means calculations). The AT MEANS option sets covariates equal to their mean values (as with standard LS-means) and incorporates this adjustment to crossproducts of covariates.

As an example, consider the following invocation of PROC MIXED:

proc mixed;
   class A;
   model Y = A X1 X2 X1*X2;
   lsmeans A;
   lsmeans A / at means;
   lsmeans A / at X1=1.2;
   lsmeans A / at (X1 X2)=(1.2 0.3);
run;

For the first two LSMEANS statements, the LS-means coefficient for X1 is $\bar{x}_1$ (the mean of X1) and for X2 is $\bar{x}_2$ (the mean of X2). However, for the first LSMEANS statement, the coefficient for X1*X2 is $\bar{x}_1 \bar{x}_2$, but for the second LSMEANS statement, the coefficient is $\overline{x_1 x_2}$ (the mean of the product). The third LSMEANS statement sets the coefficient for X1 equal to 1.2 and leaves it at $\bar{x}_2$ for X2, and the final LSMEANS statement sets these values to 1.2 and 0.3, respectively.

If a WEIGHT variable is present, it is used in processing AT variables. Also, observations with missing dependent variables are included in computing the covariate means, unless these observations form a missing cell and the FULLX option in the MODEL statement is not in effect. You can use the E option in conjunction with the AT option to check that the modified LS-means coefficients are the ones you want.

The AT option is disabled if you specify the BYLEVEL option.

BYLEVEL
requests PROC MIXED to process the OM data set by each level of the LS-mean effect (LSMEANS effect) in question. For more details, see the OM option later in this section.

CL
requests that t-type confidence limits be constructed for each of the LS-means. The confidence level is 0.95 by default; this can be changed with the ALPHA= option.

CORR
displays the estimated correlation matrix of the least squares means as part of the "Least Squares Means" table.

COV
displays the estimated covariance matrix of the least squares means as part of the "Least Squares Means" table.

DF=number
specifies the degrees of freedom for the t test and confidence limits. The default is the denominator degrees of freedom taken from the "Tests of Fixed Effects" table corresponding to the LS-means effect, unless the DDFM=SATTERTHWAITE or DDFM=KENWARDROGER option is in effect in the MODEL statement. For these DDFM= methods, degrees of freedom are determined separately for each test; see the DDFM= option for more information.

DIFF<=difftype>
PDIFF<=difftype>
requests that differences of the LS-means be displayed. The optional difftype specifies which differences to produce, with possible values being ALL, CONTROL, CONTROLL, and CONTROLU. The difftype ALL requests all pairwise differences, and it is the default.
The difftype CONTROL requests the differences with a control, which, by default, is the first level of each of the specified LSMEANS effects.

To specify which levels of the effects are the controls, list the quoted formatted values in parentheses after the keyword CONTROL. For example, if the effects A, B, and C are classification variables, each having two levels, 1 and 2, the following LSMEANS statement specifies the (1,2) level of A*B and the (2,1) level of B*C as controls:

lsmeans A*B B*C / diff=control('1' '2' '2' '1');

For multiple effects, the results depend upon the order of the list, and so you should check the output to make sure that the controls are correct.

Two-tailed tests and confidence limits are associated with the CONTROL difftype. For one-tailed results, use either the CONTROLL or CONTROLU difftype. The CONTROLL difftype tests whether the noncontrol levels are significantly smaller than the control; the upper confidence limits for the control minus the noncontrol levels are considered to be infinity and are displayed as missing. Conversely, the CONTROLU difftype tests whether the noncontrol levels are significantly larger than the control; the upper confidence limits for the noncontrol levels minus the control are considered to be infinity and are displayed as missing.

If you want to perform multiple comparison adjustments on the differences of LS-means, you must specify the ADJUST= option.

The differences of the LS-means are displayed in a table titled "Differences of Least Squares Means." For ODS purposes, the table name is "Diffs."

E
requests that the $L$ matrix coefficients for all LSMEANS effects be displayed. For ODS purposes, the name of this "L Matrix Coefficients" table is "Coef."

OM<=OM-data-set>
OBSMARGINS<=OM-data-set>
specifies a potentially different weighting scheme for the computation of LS-means coefficients. The standard LS-means have equal coefficients across classification effects; however, the OM option changes these coefficients to be proportional to those found in OM-data-set. This adjustment is reasonable when you want your inferences to apply to a population that is not necessarily balanced but has the margins observed in OM-data-set.

By default, OM-data-set is the same as the analysis data set. You can optionally specify another data set that describes the population for which you want to make inferences. This data set must contain all model variables except for the dependent variable (which is ignored if it is present). In addition, the levels of all CLASS variables must be the same as those occurring in the analysis data set. Specifying an OM-data-set enables you to construct arbitrarily weighted LS-means.

In computing the observed margins, PROC MIXED uses all observations for which there are no missing or invalid independent variables, including those for which there are missing dependent variables. Also, if OM-data-set has a WEIGHT variable, PROC MIXED uses weighted margins to construct the LS-means coefficients. If OM-data-set is balanced, the LS-means are unchanged by the OM option.

The BYLEVEL option modifies the observed-margins LS-means. Instead of computing the margins across all of the OM-data-set, PROC MIXED computes separate margins for each level of the LSMEANS effect in question. In this case the resulting LS-means are actually equal to raw means for fixed-effects models and certain balanced random-effects models, but their estimated standard errors account for the covariance structure that you have specified.
If the AT option is specified, the BYLEVEL option disables it.

You can use the E option in conjunction with either the OM or BYLEVEL option to check that the modified LS-means coefficients are the ones you want. It is possible that the modified LS-means are not estimable when the standard ones are, or vice versa. Nonestimable LS-means are noted as "Non-est" in the output.

PDIFF
is the same as the DIFF option.

SINGULAR=number
tunes the estimability checking as documented for the SINGULAR= option in the CONTRAST statement.

SLICE=fixed-effect
SLICE=(fixed-effects)
specifies effects by which to partition interaction LSMEANS effects. This can produce what are known as tests of simple effects (Winer 1971). For example, suppose that A*B is significant and you want to test the effect of A for each level of B. The appropriate LSMEANS statement is as follows:

lsmeans A*B / slice=B;

This statement tests the simple main effects of A within each level of B, which are calculated by extracting the appropriate rows from the coefficient matrix for the A*B LS-means and using them to form an F test. See the section Inference and Test Statistics for more information about this F test.

The SLICE option produces a table titled "Tests of Effect Slices." For ODS purposes, the table name is "Slices."
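As a hypothetical sketch combining several of the preceding options (the data set MYDATA and its variables are invented here), the following requests Tukey-adjusted pairwise differences with confidence limits together with simple-effect tests of A within each level of B:

proc mixed data=mydata;
   class A B;
   model y = A B A*B;
   /* DIFF with ADJUST=TUKEY gives adjusted pairwise comparisons;
      CL adds confidence limits; SLICE=B tests A within each level of B */
   lsmeans A*B / diff adjust=tukey cl slice=B;
run;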
There are various tests for heteroscedasticity. For bivariate data the easiest is simply plotting the data as a scatter graph. If the vertical spread of the data points is broadly the same along the range, the data are homoscedastic; if not, there is evidence of heteroscedasticity. Heteroscedasticity may sometimes be removed using data transformations. The appropriate transformation depends on the data, and there is no general transformation that works in all instances.
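A minimal sketch of both checks in SAS, assuming a hypothetical data set SALES with response y and regressor x: plot residuals against predicted values, and request White's test via the SPEC option of PROC REG.

proc reg data=sales;
   model y = x / spec;              /* SPEC requests White's test of the
                                       first- and second-moment specification */
   output out=resids r=resid p=pred;
run;

proc sgplot data=resids;
   scatter x=pred y=resid;          /* a fan or funnel shape suggests heteroscedasticity */
   refline 0 / axis=y;
run;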
In regression analysis, heteroscedasticity means a situation in which the variance of the dependent variable varies across the data. Heteroscedasticity complicates analysis because many methods in regression analysis are based on the assumption of equal variance.
It is the relationship between two variables, where one is dependent and the other is independent.
The OLS estimators are still unbiased; however, they are inefficient, since the error variances are no longer constant. They are no longer the "best" estimators, as they do not have minimum variance.
Heteroscedasticity describes a sequence of variables in which each variable has a different variance. It can be used to assess the margin of error between predicted and actual data.
A. S. Hurn has written: 'In search of time-varying term premia in the London interbank market' 'Noise traders, imitation, and conditional heteroscedasticity in asset returns' 'Asset market behaviour in the presence of heterogeneous traders' 'Modelling the demand for M4 in the UK'
It can show:
- whether or not there is any relationship between two variables;
- the nature of the relationship (linear, quadratic, inverse, power, etc.);
- the precision of the relationship: the spread or scatter around the curve of best fit;
- whether the scatter is constant or changes (heteroscedasticity);
- the presence of outliers;
- clustering (e.g., heights vs. weights of adults may show one cluster of points for men and another for women; if so, gender is another relevant variable).
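As a small hypothetical illustration of the clustering point (the data set ADULTS and its variables are invented here), a grouped scatter plot in SAS makes such clusters visible directly:

proc sgplot data=adults;
   scatter x=height y=weight / group=sex;   /* separate markers per group expose clusters */
run;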
Charles M. Beach has written: 'Cyclical sensitivity of aggregate income inequality' -- subject(s): Income distribution, Mathematical models 'Exact small-sample tests for heteroscedasticity' -- subject(s): Heteroscedasticity 'The impact of recession on the distribution of annual unemployment' -- subject(s): Effect of recession on, Unemployment 'Macroeconomic fluctuations and the Lorenz curve' -- subject(s): Econometric models, Income distribution, Unemployment, Wages 'Unrestricted statistical inference with Lorenz curves and income shares' -- subject(s): Income distribution, Lorenz curve, Statistical methods 'Simultaneity and the earnings-generation process for Canadian men' -- subject(s): Employment, Human capital, Mathematical models, Men, Wages 'Are we becoming two societies?' -- subject(s): Canada, Economic conditions, Income distribution, Middle class 'The distribution of unemployment spells' -- subject(s): Unemployment 'The impact of macroeconomic conditions on the instability and long-run inequality of workers' earnings in Canada' 'Alternative maximum likelihood procedures for regression with autocorrelated disturbances' -- subject(s): Autocorrelation (Statistics), Regression analysis
Jeffrey D. Martin has written: 'Effects of combined-sewer overflows and urban runoff on the water quality of Fall Creek, Indianapolis, Indiana' -- subject(s): Combined sewer overflows, Environmental aspects of Combined sewer overflows, Environmental aspects of Urban runoff, Urban runoff, Water quality 'Variability of pesticide detections and concentrations in field replicate water samples collected for the National Water-Quality Assessment Program, 1992-97' -- subject(s): Analysis, Environmental aspects of Pesticides, Heteroscedasticity, Measurement, Pesticides, Water
When analyzing the impact of corporate performance on share prices, researchers and analysts often use various mathematical and statistical techniques. The use of natural logarithms is one such technique, and it is typically employed in financial modeling and regression analysis. Here's why natural logarithms are commonly used:

Percentage changes: Share prices and financial metrics often exhibit percentage changes rather than absolute changes. Natural logarithms transform these percentage changes into a form that is more amenable to statistical analysis. Logarithmic transformations can stabilize the variance of data, making it easier to model relationships.

Linearization: Taking the natural logarithm of a variable can sometimes linearize the relationship between variables. Linear relationships are easier to analyze and interpret in the context of regression analysis. In financial modeling, linear relationships simplify the modeling process and enhance the interpretability of coefficients.

Interpretability: When you take the natural logarithm of a variable, the coefficients obtained from regression analysis can be interpreted as elasticities. Elasticities indicate the percentage change in the dependent variable associated with a one percent change in the independent variable. This can be useful for understanding the sensitivity of share prices to changes in corporate performance.

Statistical assumptions: The use of natural logarithms may help meet the assumptions of regression analysis, such as normality and homoscedasticity (constant variance of errors). These assumptions are important for the reliability and validity of statistical inferences drawn from the model.

Data transformation: Financial data often exhibit characteristics such as skewness or heteroscedasticity. Applying natural logarithmic transformations can help address these issues, making the data more suitable for regression analysis.

It's important to note that the use of natural logarithms is just one approach among many in financial modeling and analysis. The choice of technique depends on the specific characteristics of the data and the assumptions underlying the analysis. Additionally, while natural logarithms are commonly used, other transformations, such as taking the square root or using Box-Cox transformations, may also be considered depending on the nature of the data.
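As a hypothetical sketch of the elasticity interpretation (the data set PRICES and the variables price and earnings are invented for this example), a log-log regression in SAS might look as follows, with the slope on the log of the performance measure read as an elasticity:

/* Hypothetical sketch: log-log regression of share price on an earnings measure */
data logform;
   set prices;
   log_price    = log(price);      /* natural log of share price */
   log_earnings = log(earnings);   /* natural log of the performance measure */
run;

proc reg data=logform;
   model log_price = log_earnings; /* slope estimates the price-earnings elasticity */
run;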