ETH Zürich, Summer Semester 2003
Multivariate Statistics
Dr. Werner Stahel, Seminar für Statistik
//stat.ethz.ch/ sfs/ank03/multivariate/, //www.stat.math.ethz.ch/ stahel/courses/multivariate/
Contents

1 Introduction
1.1 Questions and Examples
1.2 Software
1.3 Prerequisites and Goals
2 Graphical Displays
2.1 Scatter Plots
2.2 Symbols
2.3 Dynamic Graphics
3 Models
3.1 Vector-Valued Random Variables
3.2 Normal Distribution
3.3 Alternative Models
3.4 Classical Estimation of the Parameters
3.5 Missing Data
3.6 Distributions of the Estimates, Wishart Distribution
3.7 Tests and Confidence Regions
3.8 Geometry in the Space of Samples or of Random Variables
4 Correlation, Regression, Analysis of Variance
4.1 Several Samples, Simple Multivariate Analysis of Variance
4.2 Multivariate Regression
4.3 Inverse Regression, Calibration
4.4 Correlations
4.5 Analysis of Variance and Regression with Random Effects
5 Robust Estimation
5.1 Estimators as Functionals
5.2 Robustness
5.3 Multivariate Location and Scale, Affine Equivariance
5.4 Influence
5.5 Breakdown Point
6 Discriminant Analysis
6.1 Introduction
6.2 Classification with Known Distributions, Decision Theory
6.3 Two Groups, Equal Covariances
6.4 Several Groups and Unequal Covariances
6.5 Error Rates
6.6 Further Methods of Discriminant Analysis
7 Principal Component and Factor Analysis
7.1 Principal Component Analysis
7.2 Biplot
7.3 Factor Analysis
7.4 Linear Unmixing
7.5 Functional Data Analysis, Analysis of Spectra
7.6 Regression with Measurement Errors in the Explanatory Variables
7.7 Principal Component Regression and Related Methods
7.8 Structural Equation and Graphical Models
References
8 Cluster Analysis, Distance Methods, Scaling
8.1 Introduction
8.2 Similarities and Dissimilarities
8.3 Optimal Partitions
8.4 Mixture Distributions
8.5 Agglomerative Methods
8.6 Divisive Methods
8.7 Multidimensional Scaling
8.8 Minimal Spanning Tree
8.9 Procrustes Analysis
References on Cluster Analysis
9 Miscellaneous
Bibliography

Aitchison, J. (1987). The Statistical Analysis of Compositional Data, Chapman & Hall.
Anderberg, M. R. (1973). Cluster Analysis for Applications, Academic Press, N. Y.
Anderson, E. (1935). The irises of the Gaspé Peninsula, Bulletin of the American Iris Society 59: 2-5.
Anderson, T. W. (1984). An Introduction to Multivariate Statistical Analysis, Wiley, N. Y.
Bilodeau, M. and Brenner, D. (1999). Theory of Multivariate Statistics, Springer Texts in Statistics, Springer-Verlag, New York.
Bock, H. H. (1974). Automatische Klassifikation, Vandenhoeck & Ruprecht, Göttingen.
Brown, P. J. (1993). Measurement, Regression, and Calibration, Clarendon Press, Oxford, U.K.
Chatfield, C. and Collins, A. J. (1980). Introduction to Multivariate Analysis, Science Paperbacks, Chapman and Hall, London.
Cooley, W. W. and Lohnes, P. R. (1971). Multivariate Data Analysis, Wiley, New York.
Croux, C., Rousseeuw, P. J. and Hössjer, O. (1994). Generalized S-estimators, Journal of the American Statistical Association 89(428): 1271-1281.
Deichsel, G. and Trampisch, H. J. (1985). Clusteranalyse und Diskriminanzanalyse, VEB Gustav Fischer Verlag (Stuttgart).
Ernste, H. (1999). Strukturgleichungsmodellierung, lecture notes for the NDK Statistik, ETH Zürich.
Everitt, B. (1980). Cluster Analysis, Second Edition, Halsted Press, Wiley.
Everitt, B. S. (1978). Graphical Techniques for Multivariate Data, Heinemann Educational Books.
Fahrmeir, L., Hamerle, A. and Tutz, G. (eds) (1996). Multivariate statistische Verfahren, 2nd edn, de Gruyter, Berlin.
Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems, Ann. Eugenics 7: 179-184.
Flury, B. (1997). A First Course in Multivariate Statistics, Springer Texts in Statistics, Springer-Verlag, NY.
Flury, B. and Riedwyl, H. (1983). Angewandte multivariate Statistik, Gustav Fischer, Stuttgart.
Friedman, Hastie and Tibshirani (2000). Additive logistic regression: a statistical view of boosting, 28: 377-386.
Fuller, W. A. (1987). Measurement Error Models, Wiley, N. Y.
Gabriel, K. R. (1971). The biplot graphical display of matrices with applications to principal component analysis, Biometrika 58: 453-467.
Gnanadesikan, R. (1997). Methods for Statistical Data Analysis of Multivariate Observations, Series in Probability and Statistics, 2nd edn, Wiley, NY.
Gordon, A. D. (1981). Classification. Methods for the Exploratory Analysis of Multivariate Data, Chapman & Hall, London.
Gower, J. C. (1996). Biplots, Methuen, London.
Green, P. E. and Carroll, J. D. (1976). Mathematical Tools for Applied Multivariate Analysis, Academic Press, New York.
Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J. and Stahel, W. A. (1986). Robust Statistics: The Approach Based on Influence Functions, Wiley, N. Y.
Harris, R. J. (1975). A Primer of Multivariate Statistics, Academic Press, New York.
Hartigan, J. A. (1975). Clustering Algorithms, Wiley.
Hartung, J., Elpelt, B. and Klösener, K.-H. (1987). Statistik. Lehr- und Handbuch der angewandten Statistik, 6th edn, Oldenbourg, München.
Hastie, T. and Tibshirani, R. (1994). Discriminant analysis by Gaussian mixtures, Journal of the Royal Statistical Society B ?: ?
Hastie, T., Buja, A. and Tibshirani, R. (1995). Penalized discriminant analysis, Annals of Statistics.
Hastie, T., Tibshirani, R. and Buja, A. (1994). Flexible discriminant analysis by optimal scoring, Journal of the American Statistical Association pp. 1255-1270.
Johnson, N. L. and Kotz, S. (1972). Continuous Multivariate Distributions, A Wiley Publication in Applied Statistics, Wiley, New York.
Johnson, R. A. and Wichern, D. W. (1982, 1988). Applied Multivariate Statistical Analysis, Prentice Hall Series in Statistics, 2nd edn, Prentice Hall Int., Englewood Cliffs, N.J., USA.
Karson, M. J. (1982). Multivariate Statistical Methods, The Iowa State University Press, Ames.
Kaufman, L. and Rousseeuw, P. J. (1989). Finding Groups in Data: An Introduction to Cluster Analysis, Wiley, N. Y.
Kendall, M. G. (1957, 1961). A Course in Multivariate Analysis, Griffin's Statistical Monographs & Courses, No. 2, 2nd edn, Charles Griffin, London.
Krzanowski, W. J. (1988). Principles of Multivariate Analysis: A User's Perspective, Clarendon Press, Oxford.
Lawley, D. N. and Maxwell, A. E. (1971). Factor Analysis as a Statistical Method, Butterworths Mathematical Texts, 2nd edn, Butterworths, London.
Little, R. J. A. and Rubin, D. B. (1987). Statistical Analysis with Missing Data, Wiley, N. Y.
Manly, B. F. J. (1986, 1990). Multivariate Statistical Methods: A Primer, Chapman and Hall, London.
Mardia, K. V., Kent, J. T. and Bibby, J. M. (1979). Multivariate Analysis, Academic Press, London.
Maronna, R. A. and Yohai, V. J. (1995). The behavior of the Stahel-Donoho robust multivariate estimator, Journal of the American Statistical Association 90(429): 330-341.
Maxwell, A. E. (1977). Multivariate Analysis in Behavioural Research, Monographs on Applied Probability and Statistics, Chapman and Hall, London.
Morrison, D. F. (1967, 1976). Multivariate Statistical Methods, McGraw-Hill Series in Probability and Statistics, 2nd edn, McGraw-Hill Book Co., New York.
Muirhead, R. J. (1982). Aspects of Multivariate Statistical Theory, Wiley, N. Y.
Paatero, P. (1996). Least squares formulation of robust non-negative factor analysis, Chemometrics and Intelligent Laboratory Systems 762: 1-13.
Paatero, P. and Tapper, U. (1994). Positive matrix factorization: a non-negative factor model with optimal utilization of error estimates of data values, Environmetrics 5: 111-126.
Rencher, A. C. (1995). Methods of Multivariate Analysis, Wiley, N. Y.
Rencher, A. C. (1998). Multivariate Statistical Inference and Applications, Wiley, N. Y.
Ripley, B. D. (1996). Pattern Recognition and Neural Networks, Cambridge University Press, Cambridge UK.
Rousseeuw, P. J. and Leroy, A. M. (1987). Robust Regression & Outlier Detection, Wiley, N. Y.
Rousseeuw, P. J. and Yohai, V. (1984). Robust regression by means of S-estimators, in J. Franke, W. Härdle and R. D. Martin (eds), Robust and Nonlinear Time Series Analysis, Vol. 26 of Lecture Notes in Statistics, Springer, pp. 256-272.
Schafer, J. L. (1997). Analysis of Incomplete Multivariate Data, number 72 in Monographs on Statistics and Applied Probability, Chapman and Hall.
Seber, G. A. F. (1984). Multivariate Observations, Wiley, N. Y.
Sokal, R. R. and Sneath, P. H. A. (1963). Principles of Numerical Taxonomy, Freeman, San Francisco.
Späth, H. (1977). Cluster-Analyse-Algorithmen zur Objektklassifizierung und Datenreduktion, Oldenbourg, München, Wien.
Späth, H. (1983). Cluster-Formation und -Analyse: Theorie, FORTRAN-Programme und Beispiele, Oldenbourg, München, Wien.
Srivastava, M. S. and Carter, E. M. (1983). An Introduction to Applied Multivariate Statistics, North Holland.
Steinhausen, D. and Langer, K. (1977). Clusteranalyse: Einführung in Methoden und Verfahren der automatischen Klassifikation, de Gruyter, Berlin.
Tatsuoka, M. M. (1971). Multivariate Analysis: Techniques for Educational and Psychological Research, Wiley, New York.
Yohai, V. J. (1987). High breakdown-point and high efficiency robust estimates for regression, 15(2): 642-656.