Reports basic summary statistics by a grouping variable. Useful when the grouping variable is an experimental variable and the data are to be aggregated for plotting. Partly a wrapper for by and describe.
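The "by plus describe" idea can be sketched in base R. This is only an illustration of the pattern, not the psych implementation: basic_stats is a hypothetical helper, and the built-in ToothGrowth data set stands in for real data.

```r
# Hypothetical helper: a few summary statistics for one numeric vector.
basic_stats <- function(v) {
  v <- v[!is.na(v)]
  c(n = length(v), mean = mean(v), sd = sd(v), median = median(v),
    min = min(v), max = max(v), se = sd(v) / sqrt(length(v)))
}

# by() splits the data frame on the grouping variable and applies
# the function to each piece, returning one result per group.
stats_by_group <- by(ToothGrowth, ToothGrowth$supp,
                     function(d) basic_stats(d$len))
stats_by_group
```

describeBy follows the same shape, with describe supplying a much fuller set of statistics for every column.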
describeBy(x, group=NULL, mat=FALSE, type=3, digits=15, ...)
describe.by(x, group=NULL, mat=FALSE, type=3, ...)   # deprecated
| Argument | Description |
|---|---|
| x | a data.frame or matrix. See note for statsBy. |
| group | a grouping variable or a list of grouping variables |
| mat | provide a matrix output rather than a list |
| type | Which type of skew and kurtosis should be found |
| digits | When giving matrix output, how many digits should be reported? |
| ... | parameters to be passed to describe |
To get descriptive statistics for several different grouping variables, make sure that group is a list. In the case of matrix output with multiple grouping variables, the grouping variable values are added to the output.
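A minimal sketch of the multiple-grouping case, assuming the psych package is installed and using its sat.act data set. The names of the added grouping columns (group1, group2) are what I would expect to see, but check them against your installed version.

```r
library(psych)   # assumes the psych package is installed
data(sat.act)

# Passing a list of grouping variables crosses them (education by gender here).
des.mat <- describeBy(sat.act$age,
                      group = list(sat.act$education, sat.act$gender),
                      mat = TRUE, digits = 2)

# Each row is one group combination; the group values appear as extra columns.
head(des.mat)
```

Because the result is a single data.frame, it can be passed straight to plotting or further aggregation code.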
The type parameter specifies which version of skew and kurtosis should be found. See describe for more details.
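To make the choice concrete, here is a base R sketch of three widely used sample skewness estimators, numbered 1 to 3 as in Joanes and Gill's classification. The helper skew_by_type is hypothetical, and whether it matches describe exactly is an assumption; see describe for what it actually computes.

```r
# Hypothetical helper: three common sample skewness estimators.
skew_by_type <- function(x, type = 3) {
  x <- x[!is.na(x)]
  n <- length(x)
  d <- x - mean(x)
  # Type 1: ratio of the third central moment to the 3/2 power of the second
  g1 <- (sum(d^3) / n) / (sum(d^2) / n)^(3 / 2)
  switch(as.character(type),
         "1" = g1,
         "2" = g1 * sqrt(n * (n - 1)) / (n - 2),   # small-sample adjusted
         "3" = g1 * ((n - 1) / n)^(3 / 2),         # uses the n - 1 standard deviation
         stop("type must be 1, 2, or 3"))
}

x <- c(2, 3, 5, 8, 13, 21)
sapply(1:3, function(t) skew_by_type(x, t))
```

The three values differ most in small samples and converge as n grows, so the choice of type matters mainly for small groups.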
An alternative function (statsBy) returns a list of means, n, and standard deviations for each group. This is particularly useful if finding weighted correlations of group means using cor.wt. More importantly, it does a proper within and between group decomposition of the correlation.
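A brief sketch of the statsBy route, assuming the psych package is installed. The element names used below (mean, n, rwg, rbg) and the direct call of cor.wt on the statsBy result reflect my reading of the psych documentation; confirm them against your installed version.

```r
library(psych)   # assumes the psych package is installed
data(sat.act)

# Group by the education column; statsBy takes the grouping column by name.
sb <- statsBy(sat.act, group = "education")

sb$mean   # group means, one row per education level
sb$n      # group sample sizes
sb$rwg    # pooled within-group correlations
sb$rbg    # between-group correlations

# Correlations of the group means, weighted by group size
wt <- cor.wt(sb)
wt$r
```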
A data.frame of the relevant statistics broken down by group: item name, item number, number of valid cases, mean, standard deviation, median, mad (median absolute deviation from the median), minimum, maximum, skew, and standard error.
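With the default mat=FALSE the result is a list with one such table per group. A small sketch of pulling values out of it, assuming the psych package is installed and that each list element can be indexed like an ordinary data.frame with variables as row names:

```r
library(psych)   # assumes the psych package is installed
data(sat.act)

out <- describeBy(sat.act, sat.act$gender)

# One element per group; take the table for the first group (gender == 1).
g1 <- out[[1]]

# Rows are variables and columns are statistics.
g1["ACT", c("n", "mean", "sd", "se")]
```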
describe, statsBy, densityBy and violinBy, as well as error.bars and error.bars.by for other graphical displays.
    data(sat.act)
    describeBy(sat.act,sat.act$gender) #just one grouping variable
    #>
    #>  Descriptive statistics by group
    #> group: 1
    #>           vars   n   mean     sd median trimmed    mad min max range  skew
    #> gender       1 247   1.00   0.00      1    1.00   0.00   1   1     0   NaN
    #> education    2 247   3.00   1.54      3    3.12   1.48   0   5     5 -0.54
    #> age          3 247  25.86   9.74     22   24.23   5.93  14  58    44  1.43
    #> ACT          4 247  28.79   5.06     30   29.23   4.45   3  36    33 -1.06
    #> SATV         5 247 615.11 114.16    630  622.07 118.61 200 800   600 -0.63
    #> SATQ         6 245 635.87 116.02    660  645.53  94.89 300 800   500 -0.72
    #>           kurtosis   se
    #> gender         NaN 0.00
    #> education    -0.60 0.10
    #> age           1.43 0.62
    #> ACT           1.89 0.32
    #> SATV          0.13 7.26
    #> SATQ         -0.12 7.41
    #> ------------------------------------------------------------
    #> group: 2
    #>           vars   n   mean     sd median trimmed    mad min max range  skew
    #> gender       1 453   2.00   0.00      2    2.00   0.00   2   2     0   NaN
    #> education    2 453   3.26   1.35      3    3.40   1.48   0   5     5 -0.74
    #> age          3 453  25.45   9.37     22   23.70   5.93  13  65    52  1.77
    #> ACT          4 453  28.42   4.69     29   28.63   4.45  15  36    21 -0.39
    #> SATV         5 453 610.66 112.31    620  617.91 103.78 200 800   600 -0.65
    #> SATQ         6 442 596.00 113.07    600  602.21 133.43 200 800   600 -0.58
    #>           kurtosis   se
    #> gender         NaN 0.00
    #> education     0.27 0.06
    #> age           3.03 0.44
    #> ACT          -0.42 0.22
    #> SATV          0.42 5.28
    #> SATQ          0.13 5.38
    #describeBy(sat.act,list(sat.act$gender,sat.act$education)) #two grouping variables
    des.mat <- describeBy(sat.act$age,sat.act$education,mat=TRUE) #matrix (data.frame) output
    des.mat <- describeBy(sat.act$age,list(sat.act$education,sat.act$gender),
                          mat=TRUE,digits=2) #matrix output