| Class | Analysis and data manipulation command |
| Name | regress |
| Arguments | <ytrait> = <x1>...[to]... <xN> [offset <offset>] [poisson| (exponential|weibull|fixedweibull [<censoring_trait>] [shape <shape>])] [sim] [rep <imputations>]. |
Performs linear, logistic, Poisson, exponential or Weibull regression of trait ytrait on the set of loci x1...xN. If an x variable is a marker genotype, that independent variable is the mean allele size in the genotype. The exception is the first marker locus encountered in the list, which is fully allelic effect coded.
The offset option reads an offset for the linear predictor from the specified trait.
When the regression is weibull or exponential, appending a binary trait name to the end of the keyword list declares that trait as the censoring indicator. The shape keyword declares a starting value for the solution of the Weibull distribution shape parameter. If fixedweibull is chosen as the distribution, the shape is fixed to the value given by the shape option.
The sim keyword gives a gene-dropped P-value for the first marker locus in the list. The rep keyword specifies a number of replicates for multiple imputation of the test marker locus genotypes, and is usually used when set analysis imputed has already been issued. Each imputed replicate is generated by running the MCMC genotype sampler for a number of iterations (based on the number of unobserved genotypes for that pedigree).
The models are fitted by the usual IRLS algorithms (using AS 164, [Stirling 1981]). The exponential and Weibull regressions are implemented as Poisson regressions (with log time as offset) as per Aitkin and Clayton [1980].
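The Poisson formulation mentioned above can be illustrated for the exponential case. The following is a minimal Python sketch, not Sib-pair's own code (which uses AS 164): it treats the event indicator as a Poisson response with log time as offset and fits it by a plain IRLS loop. The function names, the toy data, and the coding of status as 1 for an observed event and 0 for a censored time are all assumptions made for illustration. The Weibull case additionally needs the shape parameter to be estimated and is not covered here.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def poisson_irls(X, y, offset, iterations=50, tol=1e-10):
    """Fit a log-link Poisson regression with an offset by IRLS."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iterations):
        eta = [sum(xi[j] * beta[j] for j in range(p)) + o
               for xi, o in zip(X, offset)]
        mu = [math.exp(e) for e in eta]
        # Working response with the offset removed; IRLS weights equal mu.
        z = [e - o + (yi - m) / m for e, o, yi, m in zip(eta, offset, y, mu)]
        xtwx = [[sum(m * xi[j] * xi[k] for xi, m in zip(X, mu))
                 for k in range(p)] for j in range(p)]
        xtwz = [sum(m * xi[j] * zi for xi, m, zi in zip(X, mu, z))
                for j in range(p)]
        new_beta = solve(xtwx, xtwz)
        converged = max(abs(u - v) for u, v in zip(new_beta, beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta


def exponential_regression(times, status, covariates):
    """Exponential survival regression via the Poisson trick.

    Assumption of this sketch: status is 1 for an event, 0 for censoring.
    Returns log-hazard coefficients, intercept first.
    """
    X = [[1.0] + [float(v) for v in row] for row in covariates]
    offset = [math.log(t) for t in times]
    return poisson_irls(X, status, offset)


# Toy data (hypothetical): two groups of four subjects each.
times = [5, 8, 12, 20, 2, 3, 7, 9]
status = [1, 0, 1, 1, 1, 1, 1, 0]
group = [[0], [0], [0], [0], [1], [1], [1], [1]]
beta = exponential_regression(times, status, group)
print(beta)
```

With a single binary covariate the model is saturated, so the fitted intercept should agree with the log of events divided by total follow-up time in the first group, and the slope with the difference in log rates between the two groups. This gives a simple check on the fit.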
Note in the survival analysis example below that the Weibull parameterization is the "usual" one, not the one used by, for example, the survreg command in the R survival package (where the scale is the reciprocal of the Sib-pair shape parameter, and the regression coefficients are scaled by that value).
Example:
#
# Survival analysis: gold standard results from R
#
# survreg(formula = Surv(time, status) ~ rx, data = rats)
# Value Std. Error z p
# (Intercept) 4.983 0.0833 59.81 0.00e+00
# rx -0.239 0.0891 -2.68 7.42e-03
# Log(scale) -1.333 0.1439 -9.26 2.01e-20
#
>> loc ratsurv.in
>> regress time = rx weibull status
------------------------------------------------
Weibull regression analysis of trait "time"
------------------------------------------------
Censoring variable: status.
Variable Beta Stand Error t-Value
-----------------------------------------------------
Intercept 18.8907 0.2294 82.3425 ***
rx -0.9042 0.3166 2.8557 *
No. usable observations = 150 ( 60.0%)
No. of uncensored times = 40 ( 26.7%)
Weibull shape parameter = 3.7909
Number of iterations = 84
Model LR Chi-square = 166.4316 (df= 148)
Akaike Inf. Criterion = 170.4316
>> log 3.7909
=> 1.332603457921975
>> 18.8907/3.7909
=> 4.983170223429792
>> -0.9042/3.7909
=> -0.23851855759845947
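The arithmetic at the end of the example above can also be reproduced outside Sib-pair. The following Python sketch converts the Sib-pair point estimates to the survreg parameterization by taking the scale as the reciprocal of the shape and multiplying each coefficient by it. The function name is hypothetical, and only point estimates are converted; standard errors are not handled.

```python
import math


def sibpair_weibull_to_survreg(betas, shape):
    """Convert Sib-pair Weibull estimates to the survreg parameterization.

    betas: dict of coefficient name -> Sib-pair estimate.
    shape: Sib-pair Weibull shape parameter.
    Returns (dict of rescaled coefficients, log of the survreg scale).
    """
    scale = 1.0 / shape  # survreg scale is the reciprocal of the shape
    coefficients = {name: beta * scale for name, beta in betas.items()}
    return coefficients, math.log(scale)


# Values taken from the regress output in the example above.
coefs, log_scale = sibpair_weibull_to_survreg(
    {"Intercept": 18.8907, "rx": -0.9042}, 3.7909
)
print(coefs, log_scale)
```

To three decimal places, the results should match the Value column of the survreg output quoted at the top of the example.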
See also:
| gpe | Genotype probability estimates. |
| set analysis | Includes imputed genotypes in GLMs |