Statistics, Data Analysis, and Simulation SS 2017

08.128.730 Statistik, Datenanalyse und Simulation
Dr. Michael O. Distler <distler@uni-mainz.de>
Mainz, 4 May 2017

What we have learned so far
Special discrete distributions: binomial, Poisson.
Special probability densities: uniform, Gaussian (normal), chi-squared.

Gamma distribution
The goal is to compute the probability density $f(t)$ of the time difference $t$ between two events, where the events occur at random with a mean rate $\lambda$. Radioactive decay with a mean decay rate $\lambda$ can serve as an example.
The probability density of the gamma distribution is given by
\[ f(x; k) = \frac{x^{k-1}\, e^{-x}}{\Gamma(k)} \quad\text{with}\quad \Gamma(z) = \int_0^\infty t^{z-1} e^{-t}\, dt, \qquad \Gamma(z+1) = z! \]
It gives the distribution of the waiting time $t = x$ until the $k$-th event, i.e. the sum of $k$ successive intervals between events, in a Poisson process with mean $\mu = 1$. The generalization to other values of $\mu$ is
\[ f(x; k, \mu) = \frac{x^{k-1}\, \mu^{k}\, e^{-\mu x}}{\Gamma(k)} \]
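The waiting-time interpretation can be checked numerically. The following is a minimal sketch, assuming integer $k \ge 1$ and illustrative values $k = 3$, $\mu = 2$ that are not taken from the lecture; the function names are hypothetical. It evaluates the density, verifies its normalization with a crude midpoint rule, and compares the mean of simulated waiting times with $k/\mu$.

```python
import math
import random

def gamma_pdf(x, k, mu=1.0):
    """Density f(x; k, mu) = x^(k-1) mu^k exp(-mu x) / Gamma(k), assuming k >= 1."""
    if x < 0.0:
        return 0.0
    return x ** (k - 1) * mu ** k * math.exp(-mu * x) / math.gamma(k)

def simulate_waiting_time(k, mu, rng):
    """Sum of k exponentially distributed gaps between events occurring at rate mu."""
    return sum(rng.expovariate(mu) for _ in range(k))

rng = random.Random(1)
k, mu = 3, 2.0                     # illustrative values (assumption)
samples = [simulate_waiting_time(k, mu, rng) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)   # expected to be close to k / mu

# Crude midpoint-rule check that the density integrates to about 1 on [0, 40].
dx = 0.001
integral = sum(gamma_pdf((i + 0.5) * dx, k, mu) * dx for i in range(40_000))
print(sample_mean, integral)
```

For $k = 1$ the density reduces to the exponential distribution $\mu e^{-\mu x}$ of the time between two successive events.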

Gamma distribution
[Figure: plot of 1.0*exp(-1.0*x) for x from 0 to 5; vertical axis from 0 to 1.]

Characteristic function
If $x$ is a real random variable with distribution function $F(x)$ and probability density $f(x)$, its characteristic function is defined as the expectation value of $\exp(itx)$:
\[ \varphi(t) = E[\exp(itx)] \]
For a continuous variable this is a Fourier integral with its well-known transformation properties:
\[ \varphi(t) = \int_{-\infty}^{\infty} \exp(itx)\, f(x)\, dx, \qquad f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \exp(-itx)\, \varphi(t)\, dt \]
In particular, the moments about the origin follow from the derivatives of $\varphi$ at $t = 0$:
\[ \mu_n = E[x^n] = \int_{-\infty}^{\infty} x^n f(x)\, dx \]
\[ \varphi^{(n)}(t) = \frac{d^n \varphi(t)}{dt^n} = i^n \int_{-\infty}^{\infty} x^n \exp(itx)\, f(x)\, dx \]
\[ \varphi^{(n)}(0) = i^n \mu_n \]
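As a worked example of the moment relation, consider the exponential density $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$. This density is chosen here for illustration and is not part of the slide.

```latex
\begin{align*}
\varphi(t)   &= \int_0^\infty e^{itx}\, \lambda e^{-\lambda x}\, dx = \frac{\lambda}{\lambda - it} \\
\varphi'(t)  &= \frac{i\lambda}{(\lambda - it)^2}
  &\varphi'(0)  &= \frac{i}{\lambda} = i\,\mu_1
  &&\Rightarrow\ \mu_1 = \frac{1}{\lambda} \\
\varphi''(t) &= \frac{2\, i^2 \lambda}{(\lambda - it)^3}
  &\varphi''(0) &= -\frac{2}{\lambda^2} = i^2 \mu_2
  &&\Rightarrow\ \mu_2 = \frac{2}{\lambda^2}
\end{align*}
```

The variance then follows as $\mu_2 - \mu_1^2 = 1/\lambda^2$.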

1.5 Theorems
The law of large numbers
The law of large numbers is a theorem that describes the outcome when an experiment is repeated many times.
Assume that in $n$ statistically independent experiments the event $j$ has occurred $n_j$ times in total. The numbers $n_j$ follow a binomial distribution, and the ratio $h_j = n_j / n$ is the corresponding random variable. The expectation value $E[h_j]$ is the probability $p_j$ of the event $j$:
\[ p_j = E[h_j] = E[n_j / n] \]
For the variance we then have (binomial distribution!):
\[ V[h_j] = \sigma^2(h_j) = \sigma^2(n_j / n) = \frac{1}{n^2}\, \sigma^2(n_j) = \frac{1}{n^2}\, n\, p_j (1 - p_j) \]
Since the product $p_j (1 - p_j)$ is always $\le \frac{1}{4}$, the inequality
\[ \sigma^2(h_j) < \frac{1}{n} \]
holds, which is known as the law of large numbers.
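The bound can be illustrated with a small simulation. This is a sketch under assumptions: the probability $p = 0.3$, the numbers of trials and the helper names are chosen for illustration only. It repeats the $n$-trial experiment many times and compares the observed variance of $h = n_j/n$ with $p(1-p)/n$ and with the bound $1/n$.

```python
import random

def relative_frequency(n, p, rng):
    """Fraction h = n_j / n of n independent trials in which the event occurs."""
    hits = sum(1 for _ in range(n) if rng.random() < p)
    return hits / n

def variance_of_frequency(n, p, repeats, rng):
    """Empirical variance of h over many repetitions of the n-trial experiment."""
    hs = [relative_frequency(n, p, rng) for _ in range(repeats)]
    mean = sum(hs) / repeats
    return sum((h - mean) ** 2 for h in hs) / (repeats - 1)

rng = random.Random(42)
p = 0.3                               # illustrative probability (assumption)
for n in (10, 100, 1000):
    observed = variance_of_frequency(n, p, 2000, rng)
    predicted = p * (1 - p) / n       # binomial prediction p(1-p)/n
    print(n, observed, predicted, 1 / n)
```

The printed columns are $n$, the observed variance, the binomial prediction and the bound $1/n$.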

The central limit theorem
The central limit theorem (CLT) is the most important theorem in statistics. Among other things, it explains the central role of the Gaussian distribution.
The probability density of the sum $w = \sum_{i=1}^{n} x_i$ of a sample of $n$ independent random variables $x_i$, drawn from an arbitrary probability density with mean $\langle x \rangle$ and variance $\sigma^2$, approaches in the limit $n \to \infty$ a Gaussian probability density with mean $\langle w \rangle = n \langle x \rangle$ and variance $V[w] = n \sigma^2$.
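A quick numerical illustration of the theorem, written as a sketch: sums of $n$ uniform random numbers are shifted and scaled to mean 0 and variance 1 (a uniform number on $[0, 1)$ has mean $1/2$ and variance $1/12$), and the fraction falling within one standard deviation is compared with the Gaussian value. The function names and the number of draws are assumptions made for this example.

```python
import math
import random

def standardized_sum(n, rng):
    """Sum of n uniform(0, 1) numbers, shifted and scaled to mean 0 and variance 1."""
    w = sum(rng.random() for _ in range(n))
    return (w - n * 0.5) / math.sqrt(n / 12.0)

def fraction_within_one_sigma(n, draws, rng):
    """Fraction of standardized sums with |z| < 1."""
    return sum(1 for _ in range(draws) if abs(standardized_sum(n, rng)) < 1.0) / draws

rng = random.Random(2017)
gauss_value = math.erf(1.0 / math.sqrt(2.0))   # about 0.6827 for a standard normal
for n in (1, 2, 3, 10):
    print(n, fraction_within_one_sigma(n, 50_000, rng), gauss_value)
```

For $n = 1$ the fraction is $2/\sqrt{12} \approx 0.577$; it moves towards the Gaussian value as $n$ grows.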

Illustration: central limit theorem
[Figure: four panels for N = 1, 2, 3 and 10, each compared with a Gaussian; horizontal axis from -3 to 3, vertical axis from 0 to 0.5.]
Shown is the sum of uniformly distributed random numbers in comparison with the standard normal distribution.

1.6 Sampling
A sample is a random (or representative) subset of a population.
Sample consisting of 100 measurements:

l_i / cm    n_i    n_i l_i / cm    n_i l_i^2 / cm^2
  18.9        1        18.9            357.21
  19.1        1        19.1            364.81
  19.2        2        38.4            737.28
  19.3        1        19.3            372.49
  19.4        4        77.6           1505.44
  19.5        3        58.5           1140.75
  19.6        9       176.4           3457.44
  19.7        8       157.6           3104.72
  19.8       11       217.8           4312.44
  19.9        9       179.1           3564.09
  20.0        5       100.0           2000.00
  20.1        7       140.7           2828.07
  20.2        8       161.6           3264.32
  20.3        9       182.7           3708.81
  20.4        6       122.4           2496.96
  20.5        3        61.5           1260.75
  20.6        2        41.2            848.72
  20.7        2        41.4            856.98
  20.8        2        41.6            865.28
  20.9        2        41.8            873.62
  21.0        4        84.0           1764.00
  21.2        1        21.2            449.44
  Sum       100      2002.8          40133.62

Mean? Variance?
\[ N = \sum n_i = 100 \]
\[ \bar{l} = \frac{1}{N} \sum n_i l_i = 20.028\ \text{cm} \]
\[ s^2 = \frac{1}{N-1} \left( \sum n_i l_i^2 - \frac{1}{N} \Big( \sum n_i l_i \Big)^2 \right) = 0.2176\ \text{cm}^2 \]
\[ l = \bar{l} \pm \frac{s}{\sqrt{N}} = (20.028 \pm 0.047)\ \text{cm} \]
\[ s = s \pm \frac{s}{\sqrt{2(N-1)}} = (0.466 \pm 0.033)\ \text{cm} \]
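The numbers quoted for this sample can be reproduced directly from the table. The sketch below copies the (length, count) pairs from the table above; the variable names are chosen for this example.

```python
import math

# (length in cm, count) pairs copied from the table above.
table = [
    (18.9, 1), (19.1, 1), (19.2, 2), (19.3, 1), (19.4, 4), (19.5, 3),
    (19.6, 9), (19.7, 8), (19.8, 11), (19.9, 9), (20.0, 5), (20.1, 7),
    (20.2, 8), (20.3, 9), (20.4, 6), (20.5, 3), (20.6, 2), (20.7, 2),
    (20.8, 2), (20.9, 2), (21.0, 4), (21.2, 1),
]

N = sum(n for _, n in table)                  # total number of measurements
sum_l = sum(n * l for l, n in table)          # sum of n_i l_i
sum_l2 = sum(n * l * l for l, n in table)     # sum of n_i l_i^2

mean = sum_l / N
variance = (sum_l2 - sum_l ** 2 / N) / (N - 1)
s = math.sqrt(variance)
err_mean = s / math.sqrt(N)                   # uncertainty of the mean
err_s = s / math.sqrt(2 * (N - 1))            # uncertainty of s
print(N, mean, variance, err_mean, s, err_s)
```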

Sampling
[Figure: histogram of "length.dat" (frequency versus length / cm, from 18.5 to 21.5 cm), compared with Gauss(µ=20.028, σ=0.466) and Gauss(µ=20.0, σ=0.5).]

Numerical computation of sample mean and variance
Well-known formulas:
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i, \qquad s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \]
Evaluated in this form, the data have to be read twice: once for $\bar{x}$ and once for $s^2$. However, the computation can also be done in a single loop, which matters for large samples:
\[ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{n-1} \left( \sum_{i=1}^{n} x_i^2 - \frac{1}{n} \Big( \sum_{i=1}^{n} x_i \Big)^2 \right) \]
Two sums have to be accumulated:
\[ S_x = \sum_{i=1}^{n} x_i, \qquad S_{xx} = \sum_{i=1}^{n} x_i^2 \]
Mean and variance then follow from
\[ \bar{x} = \frac{1}{n} S_x, \qquad s^2 = \frac{1}{n-1} \left( S_{xx} - \frac{1}{n} S_x^2 \right) \]
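The single-loop computation can be sketched as follows. The function name and the small data set are assumptions made for illustration; the input may be any iterable, so the data need to be read only once.

```python
def one_pass_mean_variance(values):
    """Accumulate n, S_x and S_xx in a single loop, then form mean and variance."""
    n = 0
    s_x = 0.0
    s_xx = 0.0
    for x in values:          # works for any iterable, e.g. numbers read from a file
        n += 1
        s_x += x
        s_xx += x * x
    mean = s_x / n
    variance = (s_xx - s_x * s_x / n) / (n - 1)
    return mean, variance

# iter(...) emphasizes that the data are consumed exactly once.
mean, variance = one_pass_mean_variance(iter([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
print(mean, variance)
```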

Numerical computation of sample mean and variance
Under some circumstances this requires subtracting large numbers from each other. Depending on how numbers are represented on the computer, this can lead to numerical problems. It is therefore better to use a rough estimate $x_e$ of the mean (for instance the first measured value):
\[ T_x = \sum_{i=1}^{n} (x_i - x_e), \qquad T_{xx} = \sum_{i=1}^{n} (x_i - x_e)^2 \]
This yields
\[ \bar{x} = x_e + \frac{1}{n} T_x, \qquad s^2 = \frac{1}{n-1} \left( T_{xx} - \frac{1}{n} T_x^2 \right) \]
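A minimal sketch of the shifted computation, using the first value as the rough estimate $x_e$. The unshifted version is included only for comparison; the data set (a small spread on top of an offset of $10^9$) and the function names are assumptions chosen to make the cancellation problem visible.

```python
def shifted_mean_variance(values):
    """One-pass mean and variance using the first value as rough estimate x_e."""
    it = iter(values)
    x_e = next(it)            # rough estimate of the mean: the first measurement
    n = 1
    t_x = 0.0                 # the first value contributes (x_e - x_e) = 0
    t_xx = 0.0
    for x in it:
        d = x - x_e
        n += 1
        t_x += d
        t_xx += d * d
    mean = x_e + t_x / n
    variance = (t_xx - t_x * t_x / n) / (n - 1)
    return mean, variance

def naive_mean_variance(values):
    """Unshifted single-pass version, shown only for comparison."""
    n, s_x, s_xx = 0, 0.0, 0.0
    for x in values:
        n += 1
        s_x += x
        s_xx += x * x
    return s_x / n, (s_xx - s_x * s_x / n) / (n - 1)

# Small spread on top of a large offset: the exact variance is 1.0.
data = [1.0e9 + d for d in (-1.0, 0.0, 1.0)]
print(shifted_mean_variance(data))
print(naive_mean_variance(data))   # may lose digits to cancellation
```

In the shifted version only small differences are squared and summed, so the subtraction in the variance formula involves numbers of moderate size.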

1.7 Multidimensional distributions
1.7.1 Random variables in two dimensions
The multidimensional probability density $f(x, y)$ of the two random variables $\tilde{x}$ and $\tilde{y}$ is defined by the probability of finding the pair $(\tilde{x}, \tilde{y})$ in the intervals $a \le \tilde{x} < b$ and $c \le \tilde{y} < d$:
\[ P(a \le \tilde{x} < b,\ c \le \tilde{y} < d) = \int_c^d \int_a^b f(x, y)\, dx\, dy \]
Normalization:
\[ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\, dx\, dy = 1 \]
If
\[ f(x, y) = h(x)\, g(y) \]
holds, the two random variables are independent.

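Random variables in two dimensions
Means and variances follow naturally (compare the one-dimensional case):
\[ \langle x \rangle = E[x] = \iint x\, f(x, y)\, dx\, dy = \int x\, f_y(x)\, dx \]
\[ \langle y \rangle = E[y] = \iint y\, f(x, y)\, dx\, dy = \int y\, f_x(y)\, dy \]
\[ V[x] = \iint (x - \langle x \rangle)^2 f(x, y)\, dx\, dy = \sigma_x^2 \]
\[ V[y] = \iint (y - \langle y \rangle)^2 f(x, y)\, dx\, dy = \sigma_y^2 \]
Here $f_y(x) = \int f(x, y)\, dy$ and $f_x(y) = \int f(x, y)\, dx$ denote the densities obtained by integrating over the other variable.
Let $z$ be a function of $x$ and $y$: $z = z(x, y)$. Then $z$ is a random variable as well, with
\[ \langle z \rangle = \iint z(x, y)\, f(x, y)\, dx\, dy, \qquad \sigma_z^2 = \left\langle (z - \langle z \rangle)^2 \right\rangle \]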

Random variables in two dimensions
Simple example: $z(x, y) = a\,x + b\,y$
\[ \langle z \rangle = a \iint x\, f(x, y)\, dx\, dy + b \iint y\, f(x, y)\, dx\, dy = a \langle x \rangle + b \langle y \rangle \]
The mean is straightforward.

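Random variables in two dimensions
Variance for $z(x, y) = a\,x + b\,y$:
\[ \sigma_z^2 = \left\langle \big( (a x + b y) - (a \langle x \rangle + b \langle y \rangle) \big)^2 \right\rangle = \left\langle \big( (a x - a \langle x \rangle) + (b y - b \langle y \rangle) \big)^2 \right\rangle \]
\[ \sigma_z^2 = a^2 \underbrace{\left\langle (x - \langle x \rangle)^2 \right\rangle}_{\sigma_x^2} + b^2 \underbrace{\left\langle (y - \langle y \rangle)^2 \right\rangle}_{\sigma_y^2} + 2ab\, \underbrace{\left\langle (x - \langle x \rangle)(y - \langle y \rangle) \right\rangle}_{??} \]
The mixed term defines the covariance:
\[ \left\langle (x - \langle x \rangle)(y - \langle y \rangle) \right\rangle = \operatorname{cov}(x, y) = \sigma_{xy} = \iint (x - \langle x \rangle)(y - \langle y \rangle)\, f(x, y)\, dx\, dy \]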
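The role of the covariance term can be seen on simulated data. In this sketch the sample, the coefficients $a = 2$, $b = -3$ and the helper names are assumptions; expectation values are replaced by sample averages, for which the decomposition of $\sigma_z^2$ holds exactly up to rounding.

```python
import random

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Sample analogue of <(x - <x>)(y - <y>)>, normalized by the sample size."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(3)
xs = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
# y is built from x plus independent noise, so the two are correlated.
ys = [0.5 * x + rng.gauss(0.0, 1.0) for x in xs]

a, b = 2.0, -3.0                                # illustrative coefficients
zs = [a * x + b * y for x, y in zip(xs, ys)]

lhs = covariance(zs, zs)                        # variance of z
rhs = (a * a * covariance(xs, xs) + b * b * covariance(ys, ys)
       + 2.0 * a * b * covariance(xs, ys))
without_cov = a * a * covariance(xs, xs) + b * b * covariance(ys, ys)
print(lhs, rhs, without_cov)
```

Dropping the covariance term (last printed value) gives a clearly different result for correlated variables.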

Random variables in two dimensions
Normalized covariance:
\[ \rho_{xy} = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y} \]
The correlation coefficient $\rho_{xy}$ is a dimensionless measure of the degree of correlation between two variables:
\[ -1 \le \rho_{xy} \le 1 \]

Random variables in two dimensions
For the determinant of the covariance matrix we have:
\[ \begin{vmatrix} \sigma_x^2 & \sigma_{xy} \\ \sigma_{xy} & \sigma_y^2 \end{vmatrix} = \sigma_x^2 \sigma_y^2 - \sigma_{xy}^2 = \sigma_x^2 \sigma_y^2 (1 - \rho^2) \ge 0 \]

Two-dimensional Gaussian distribution
[Figure: covariance ellipse in the plane of Parameter a_1 (horizontal axis, 1.85 to 2.15) and Parameter a_2 (vertical axis, -3.3 to -2.7).]
The probability content of the covariance ellipse is 39.3%.
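The value 39.3% equals $1 - e^{-1/2}$, the probability that a two-dimensional Gaussian variable lies inside the ellipse $(\vec{x} - \langle \vec{x} \rangle)^T V^{-1} (\vec{x} - \langle \vec{x} \rangle) \le 1$. The following Monte Carlo sketch checks this; the widths and the correlation coefficient are illustrative assumptions and are not read off the figure, and the means are set to zero.

```python
import math
import random

def inside_covariance_ellipse(x, y, sx, sy, rho):
    """True if (x, y) V^-1 (x, y)^T <= 1 for a zero-mean two-dimensional Gaussian."""
    u, v = x / sx, y / sy
    q = (u * u - 2.0 * rho * u * v + v * v) / (1.0 - rho * rho)
    return q <= 1.0

def sample_pair(sx, sy, rho, rng):
    """Draw one pair from a zero-mean Gaussian with widths sx, sy and correlation rho."""
    g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    x = sx * g1
    y = sy * (rho * g1 + math.sqrt(1.0 - rho * rho) * g2)
    return x, y

rng = random.Random(5)
sx, sy, rho = 0.05, 0.1, -0.6      # illustrative values, not taken from the figure
draws = 200_000
hits = sum(1 for _ in range(draws)
           if inside_covariance_ellipse(*sample_pair(sx, sy, rho, rng), sx, sy, rho))
fraction = hits / draws
exact = 1.0 - math.exp(-0.5)       # about 0.393
print(fraction, exact)
```

The result does not depend on the chosen widths or on the correlation, since the quadratic form is a sum of two squared standard normal variables.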

Covariance matrix in n dimensions
The variance can be generalized to the covariance matrix:
\[ V = \left\langle (\vec{x} - \langle \vec{x} \rangle)(\vec{x} - \langle \vec{x} \rangle)^T \right\rangle \]
The diagonal elements of the matrix $V_{ij}$ are the variances and the off-diagonal elements are the covariances:
\[ V_{ii} = \operatorname{var}(x_i) = \int (x_i - \langle x_i \rangle)^2 f(\vec{x})\, dx_1\, dx_2 \ldots dx_n \]
\[ V_{ij} = \operatorname{cov}(x_i, x_j) = \int (x_i - \langle x_i \rangle)(x_j - \langle x_j \rangle)\, f(\vec{x})\, dx_1\, dx_2 \ldots dx_n \]

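Covariance matrix in n dimensions
The covariance matrix
\[ V_{ij} = \begin{pmatrix} \operatorname{var}(x_1) & \operatorname{cov}(x_1, x_2) & \ldots & \operatorname{cov}(x_1, x_n) \\ \operatorname{cov}(x_2, x_1) & \operatorname{var}(x_2) & \ldots & \operatorname{cov}(x_2, x_n) \\ \ldots & \ldots & \ldots & \ldots \\ \operatorname{cov}(x_n, x_1) & \operatorname{cov}(x_n, x_2) & \ldots & \operatorname{var}(x_n) \end{pmatrix} \]
is a symmetric $n \times n$ matrix:
\[ V_{ij} = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \ldots & \sigma_{1n} \\ \sigma_{21} & \sigma_2^2 & \ldots & \sigma_{2n} \\ \ldots & \ldots & \ldots & \ldots \\ \sigma_{n1} & \sigma_{n2} & \ldots & \sigma_n^2 \end{pmatrix} \]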

1.8 Transformation of probability densities
A function of a random variable is itself a random variable. The probability density $f_x(x)$ of the variable $x$ is to be transformed to another variable $y$ by means of $y = y(x)$:
\[ f_x(x) \ \xrightarrow{\ y = y(x)\ }\ f_y(y) \]
Consider the interval $(x, x + dx) \to (y, y + dy)$.
Keep in mind that the areas under the probability densities in the respective intervals must be equal:
\[ f_x(x)\, dx = f_y(y)\, dy \quad\Rightarrow\quad f_y(y) = f_x(x(y)) \left| \frac{dx}{dy} \right| \]
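As an illustration of the transformation rule, take $x$ uniform on $(0, 1)$ and $y(x) = -\ln(x)/\lambda$. Then $x(y) = e^{-\lambda y}$ and $f_y(y) = \lambda e^{-\lambda y}$. This example and the value $\lambda = 2$ are assumptions made here, not part of the slide. The sketch compares the observed fraction of transformed values in a narrow interval with the prediction from $f_y$.

```python
import math
import random

lam = 2.0                                   # illustrative rate parameter (assumption)

def y_of_x(x):
    """Transformation y(x) = -ln(x) / lam applied to a uniform variable x."""
    return -math.log(x) / lam

def f_y(y):
    """f_y(y) = f_x(x(y)) |dx/dy| with f_x = 1 and x(y) = exp(-lam y)."""
    return lam * math.exp(-lam * y)

rng = random.Random(11)
# 1 - random() lies in (0, 1], which avoids log(0).
ys = [y_of_x(1.0 - rng.random()) for _ in range(200_000)]

# Compare the observed fraction in a narrow interval with the integral of f_y.
lo, hi = 0.5, 0.6
observed = sum(1 for y in ys if lo <= y < hi) / len(ys)
predicted = math.exp(-lam * lo) - math.exp(-lam * hi)   # integral of f_y over [lo, hi)
print(observed, predicted, f_y(0.55) * (hi - lo))
```

The third printed value, $f_y(y)\,dy$ at the interval centre, approximates the same probability for a narrow interval.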

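Transformation of mean and variance, error propagation
Expansion around the mean:
\[ y(x) = y(\langle x \rangle) + (x - \langle x \rangle) \left. \frac{dy}{dx} \right|_{x = \langle x \rangle} + \frac{1}{2} (x - \langle x \rangle)^2 \left. \frac{d^2 y}{dx^2} \right|_{x = \langle x \rangle} + \ldots \]
Up to second order:
\[ E[y] \approx y(\langle x \rangle) + \underbrace{E[x - \langle x \rangle]}_{=0} \left. \frac{dy}{dx} \right|_{x = \langle x \rangle} + \frac{1}{2} E[(x - \langle x \rangle)^2] \left. \frac{d^2 y}{dx^2} \right|_{x = \langle x \rangle} \]
\[ \langle y \rangle \approx y(\langle x \rangle) + \underbrace{\frac{1}{2} \sigma_x^2 \left. \frac{d^2 y}{dx^2} \right|_{x = \langle x \rangle}}_{\text{often omitted}} \]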

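Error propagation
For the variance we assume $\langle y \rangle \approx y(\langle x \rangle)$ and expand $y(x)$ around the mean $\langle x \rangle$ up to first order:
\[ V[y] = E\left[ (y - \langle y \rangle)^2 \right] = E\left[ \left( (x - \langle x \rangle) \left. \frac{dy}{dx} \right|_{x = \langle x \rangle} \right)^2 \right] = \left( \left. \frac{dy}{dx} \right|_{x = \langle x \rangle} \right)^2 E\left[ (x - \langle x \rangle)^2 \right] = \left( \left. \frac{dy}{dx} \right|_{x = \langle x \rangle} \right)^2 \sigma_x^2 \]
This is the law of error propagation for one random variable.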
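The first-order law can be compared with a direct simulation. In this sketch the function $y(x) = x^2$, the values $\langle x \rangle = 10$, $\sigma_x = 0.1$ and the helper name `propagate` are illustrative assumptions; the derivative is taken numerically by a central difference.

```python
import math
import random

def propagate(y, x_mean, sigma_x, h=1e-5):
    """First-order estimate sigma_y = |dy/dx| sigma_x, derivative by central difference."""
    dydx = (y(x_mean + h) - y(x_mean - h)) / (2.0 * h)
    return abs(dydx) * sigma_x

def y(x):
    return x * x                      # illustrative choice of y(x)

x_mean, sigma_x = 10.0, 0.1
sigma_y_linear = propagate(y, x_mean, sigma_x)      # analytically 2 * 10 * 0.1 = 2

# Monte Carlo cross-check: transform Gaussian x values and take the sample spread.
rng = random.Random(8)
ys = [y(rng.gauss(x_mean, sigma_x)) for _ in range(100_000)]
m = sum(ys) / len(ys)
sigma_y_mc = math.sqrt(sum((v - m) ** 2 for v in ys) / (len(ys) - 1))
print(sigma_y_linear, sigma_y_mc)
```

The agreement is good here because $\sigma_x$ is small compared with the scale on which $dy/dx$ changes; for strongly nonlinear functions or large $\sigma_x$ the first-order result becomes less reliable.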

1.9 Convolution
Let two random variables $x$ and $y$ be given by their probability densities $f_x(x)$ and $f_y(y)$. Obviously their sum $w = x + y$ is a random variable as well. Let $f_w(w)$ be the probability density of the sum $w$. It is obtained by a convolution of $f_x$ with $f_y$; writing the joint density as the product $f_x(x) f_y(y)$ presupposes that $x$ and $y$ are independent:
\[ f_w(w) = \iint f_x(x)\, f_y(y)\, \delta(w - x - y)\, dx\, dy = \int f_x(x)\, f_y(w - x)\, dx = \int f_y(y)\, f_x(w - y)\, dy \]
\[ f_w(w) = f_x(x) \otimes f_y(y) \]
For the characteristic functions the convolution becomes a product:
\[ \varphi_w(t) = \varphi_x(t) \cdot \varphi_y(t) \]
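The convolution integral can be evaluated numerically. As a sketch, the example below convolves two uniform densities on $[0, 1)$ with a midpoint rule and compares the result with the known triangular density of the sum; the choice of densities, the integration range and the function names are assumptions made for illustration.

```python
def uniform_density(x):
    """Density of a uniform variable on [0, 1)."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0

def convolve(f_x, f_y, w, lo=-1.0, hi=3.0, steps=4000):
    """Midpoint-rule approximation of the integral of f_x(x) f_y(w - x) dx."""
    dx = (hi - lo) / steps
    return sum(f_x(lo + (i + 0.5) * dx) * f_y(w - (lo + (i + 0.5) * dx))
               for i in range(steps)) * dx

def triangle(w):
    """Known density of the sum of two independent uniform(0, 1) variables."""
    return w if 0.0 <= w <= 1.0 else (2.0 - w if 1.0 < w <= 2.0 else 0.0)

for w in (0.25, 1.0, 1.5):
    print(w, convolve(uniform_density, uniform_density, w), triangle(w))
```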