Versuchsplanung (Design of Experiments), SoSe 2015, R: Solution to Exercise 1 of 24.04.2015. Author: Ludwig Bothmann


Contents

Exercise 1
  b) Estimators
  c) Residual sums of squares
  d) Coefficient of determination
  e) ANOVA
  f) F-test
  g) Test A against C
  h) Using the functions lm(), aov() and anova()

Exercise 1

    # Create the data
    score <- c(10, 8, 11, 10, 9, 2, 11, 3, 2, 10, 5, 6, 6, 6, 6)
    spot <- factor(c(rep("A", 5), rep("B", 5), rep("C", 5)))
    werbung <- data.frame(score = score, spot = spot)
    head(werbung)
      score spot
    1    10    A
    2     8    A
    3    11    A
    4    10    A
    5     9    A
    6     2    B
    str(werbung)
    'data.frame':   15 obs. of  2 variables:
     $ score: num  10 8 11 10 9 2 11 3 2 10 ...
     $ spot : Factor w/ 3 levels "A","B","C": 1 1 1 1 1 2 2 2 2 2 ...

    # Group means and group sums
    tapply(score, INDEX = spot, mean)
      A   B   C
    9.6 5.6 5.8

    tapply(score, INDEX = spot, sum)
     A  B  C
    48 28 29

    # Plot the data
    plot(as.numeric(werbung$spot), werbung$score, pch = 19,
         xlab = "Spot", ylab = "Score", main = "Ratings of the Commercials",
         col = rgb(0, 0, 0, .4), axes = FALSE, ylim = c(1, 11))
    axis(side = 1, at = 1:3, labels = c("A", "B", "C"))
    axis(side = 2, at = 1:11, labels = 1:11)

[Figure: "Beurteilung der Werbespots" (ratings of the commercials). Scatter plot of Score (y-axis, 1 to 11) against Spot A, B, C (x-axis).]

b) Estimators

    y_i. <- tapply(score, INDEX = spot, mean)
    (mu_dach <- mean(y_i.))
    [1] 7
    (alpha_dach <- y_i. - mu_dach)
       A    B    C
     2.6 -1.4 -1.2

c) Residual sums of squares

    # Append a new column with the group means to simplify the
    # subsequent calculations
    werbung$spotmean <- rep(y_i., each = 5)
    # Within = unexplained variation (= SSE in Stat III NF)
    (RSS_1 <- sum((werbung$score - werbung$spotmean)^2))
    [1] 87.2
    # Total = total variation (= SST in Stat III NF)
    (RSS_0 <- sum((werbung$score - mu_dach)^2))
    [1] 138
    # Between = explained variation
    (SSR <- RSS_0 - RSS_1)
    [1] 50.8

d) Coefficient of determination

    # Share of the explained variation in the total variation
    (R2 <- SSR/RSS_0)
    [1] 0.3681159

e) ANOVA
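For reference, the quantities from c) and d) and the test statistic of e) can be written out as below. This is only a sketch in the notation of the R code: the symbols $y_{ij}$ (score $j$ in group $i$) and $\bar{y}_{i\cdot}$ (group mean) are introduced here for illustration, and $\hat{\mu}$ is the mean of the group means, which equals the overall mean because every group has 5 observations.

```latex
\begin{align*}
\mathrm{RSS}_1 &= \sum_{i=1}^{3}\sum_{j=1}^{5} \left(y_{ij} - \bar{y}_{i\cdot}\right)^2 = 87.2, \\
\mathrm{RSS}_0 &= \sum_{i=1}^{3}\sum_{j=1}^{5} \left(y_{ij} - \hat{\mu}\right)^2 = 138, \\
\mathrm{SSR}   &= \mathrm{RSS}_0 - \mathrm{RSS}_1 = 50.8,
\qquad R^2 = \frac{\mathrm{SSR}}{\mathrm{RSS}_0} = \frac{50.8}{138} \approx 0.368, \\
F &= \frac{\mathrm{SSR}/(df_0 - df_1)}{\mathrm{RSS}_1/df_1}
   = \frac{50.8/2}{87.2/12} \approx 3.50,
\qquad df_0 = 14,\ df_1 = 12.
\end{align*}
```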

    df1 <- nrow(werbung) - 3
    df0 <- nrow(werbung) - 1
    RSS_1/df1
    [1] 7.266667
    (F_statistik <- (SSR/(df0 - df1))/(RSS_1/df1))
    [1] 3.495413
    # p-value
    (p_value <- 1 - pf(F_statistik, df1 = (df0 - df1), df2 = df1))
    [1] 0.06365381

f) F-test

    # 95% quantile of the F distribution with 2 and 12 degrees of freedom
    f_quant <- qf(0.95, 2, 12)
    F_statistik > f_quant
    [1] FALSE
    # => do not reject H_0

g) Test A against C

    # Estimated residual variance
    (sigma_dach2 <- RSS_1/df1)
    [1] 7.266667
    # Test statistic
    t <- (alpha_dach[1] - alpha_dach[3])/(sqrt(sigma_dach2 * (1/5 + 1/5)))
    t
           A
    2.228876
    # Quantiles of the t distribution with 12 degrees of freedom
    qt(p = c(0.025, 0.975), df = 12)
    [1] -2.178813  2.178813

    # The test statistic lies in the rejection region => reject H_0

h) Using the functions lm(), aov() and anova()

    # LM with effect coding: use the `contrasts` argument
    lm_effekt <- lm(score ~ spot, data = werbung, contrasts = list(spot = contr.sum))
    summary(lm_effekt)

    Call:
    lm(formula = score ~ spot, data = werbung, contrasts = list(spot = contr.sum))

    Residuals:
       Min     1Q Median     3Q    Max
      -3.6   -1.2    0.2    0.4    5.4

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   7.0000     0.6960  10.057 3.37e-07 ***
    spot1         2.6000     0.9843   2.641   0.0215 *
    spot2        -1.4000     0.9843  -1.422   0.1804
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 2.696 on 12 degrees of freedom
    Multiple R-squared:  0.3681,    Adjusted R-squared:  0.2628
    F-statistic: 3.495 on 2 and 12 DF,  p-value: 0.06365

    # Equivalent computation, slightly different summary
    lm_effekt2 <- aov(score ~ spot, data = werbung, contrasts = list(spot = contr.sum))
    summary(lm_effekt2)
                Df Sum Sq Mean Sq F value Pr(>F)
    spot         2   50.8  25.400   3.495 0.0637 .
    Residuals   12   87.2   7.267
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    # LM with dummy coding
    lm_dummys <- lm(score ~ spot, data = werbung)
    summary(lm_dummys)

    Call:
    lm(formula = score ~ spot, data = werbung)

    Residuals:
       Min     1Q Median     3Q    Max
      -3.6   -1.2    0.2    0.4    5.4

    Coefficients:
    [Rows (Intercept), spotB and spotC. The numeric columns are not preserved in the source; only the exponent e-06 of the intercept p-value and the significance codes ***, * and * survive.]
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 2.696 on 12 degrees of freedom
    Multiple R-squared:  0.3681,    Adjusted R-squared:  0.2628
    F-statistic: 3.495 on 2 and 12 DF,  p-value: 0.06365

    # The ANOVAs are identical!
    anova(lm_effekt)
    Analysis of Variance Table

    Response: score
              Df Sum Sq Mean Sq F value  Pr(>F)
    spot       2   50.8 25.4000  3.4954 0.06365 .
    Residuals 12   87.2  7.2667
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    anova(lm_effekt2)
    Analysis of Variance Table

    Response: score
              Df Sum Sq Mean Sq F value  Pr(>F)
    spot       2   50.8 25.4000  3.4954 0.06365 .
    Residuals 12   87.2  7.2667
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    anova(lm_dummys)
    Analysis of Variance Table

    Response: score
              Df Sum Sq Mean Sq F value  Pr(>F)
    spot       2   50.8 25.4000  3.4954 0.06365 .
    Residuals 12   87.2  7.2667
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    # anova(lm_dummys, lm_effekt, lm_effekt2)

The contrasts argument can also be used to change the reference category: base specifies which category is the reference.
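To make the two coding schemes concrete, the following minimal sketch prints the coding matrices that contr.sum and contr.treatment produce for a factor with three levels. The object names cs, ct1, ct2 and ct3 are made up for this illustration and do not appear in the original solution.

```r
# Effect (sum-to-zero) coding for 3 levels:
# each column sums to zero, the last level is coded -1 in every column.
cs <- contr.sum(3)
print(cs)

# Dummy (treatment) coding with level 1, 2 or 3 as the reference:
# the reference level is the row that contains only zeros.
ct1 <- contr.treatment(3, base = 1)
ct2 <- contr.treatment(3, base = 2)
ct3 <- contr.treatment(3, base = 3)
print(ct1)
print(ct2)
print(ct3)
```

Under effect coding the effect of the last level is not estimated directly. It is minus the sum of the others, here -(2.6 + (-1.4)) = -1.2, which agrees with the third entry of alpha_dach in b).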

    # Effect coding
    help(contr.sum)
    # Dummy coding
    help(contr.treatment)

    lm_dummys_a <- lm(score ~ spot, data = werbung,
                      contrasts = list(spot = contr.treatment(n = c("A", "B", "C"), base = 1)))
    lm_dummys_b <- lm(score ~ spot, data = werbung,
                      contrasts = list(spot = contr.treatment(n = c("A", "B", "C"), base = 2)))
    lm_dummys_c <- lm(score ~ spot, data = werbung,
                      contrasts = list(spot = contr.treatment(n = c("A", "B", "C"), base = 3)))

    coef(lm_dummys_a)
    (Intercept)       spotB       spotC
            9.6        -4.0        -3.8
    coef(lm_dummys_b)
    (Intercept)       spotA       spotC
            5.6         4.0         0.2
    coef(lm_dummys_c)
    (Intercept)       spotA       spotB
            5.8         3.8        -0.2

    # Alternatively:
    werbung$spot_releveled <- relevel(werbung$spot, 2)
    werbung$spot_releveled
     [1] A A A A A B B B B B C C C C C
    Levels: B A C
    werbung$spot
     [1] A A A A A B B B B B C C C C C
    Levels: A B C
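The two ways of changing the reference category should lead to the same fit. The sketch below checks this for reference category B. It rebuilds the data so that it runs on its own; the names fit_contr and fit_relevel are hypothetical and chosen only for this example, and relevel() is called with ref = "B" instead of the position 2 for readability.

```r
# Recreate the data from Exercise 1
score <- c(10, 8, 11, 10, 9, 2, 11, 3, 2, 10, 5, 6, 6, 6, 6)
spot <- factor(rep(c("A", "B", "C"), each = 5))
werbung <- data.frame(score = score, spot = spot)

# Variant 1: set the reference category through the contrasts argument
fit_contr <- lm(score ~ spot, data = werbung,
                contrasts = list(spot = contr.treatment(n = c("A", "B", "C"), base = 2)))

# Variant 2: relevel the factor and keep the default dummy coding
werbung$spot_releveled <- relevel(werbung$spot, ref = "B")
fit_relevel <- lm(score ~ spot_releveled, data = werbung)

# The coefficient names differ, the values should agree
print(coef(fit_contr))
print(coef(fit_relevel))
print(all.equal(unname(coef(fit_contr)), unname(coef(fit_relevel))))
```

Releveling changes the factor stored in the data frame, so every later model that uses spot_releveled inherits the new reference. The contrasts argument only affects the single call in which it is given.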
