Statistics 501

Multivariate Statistical Methods


Basic Information:

Date:  Spring 2005

Instructor:   Kenneth J. Koehler
              120 Snedecor Hall
              Telephone: 515-294-4181
              Fax: 515-294-4040
              E-mail: kkoehler@iastate.edu

              Office hours: MWF 2-3 pm (Friday session is in 115 Davidson)

Teaching Assistant:   MinHui Paik
                      202D Snedecor Hall
                      Telephone: 294-6609
                      E-mail: minhui@iastate.edu

                      Office hour: Tuesday 1-2 pm

Textbook:

Johnson, R. A. & Wichern, D. W.,  Applied Multivariate Statistical Analysis,  5th edition,
        Prentice-Hall,  Upper Saddle River, New Jersey,  2002.  ISBN 0-471-29008-5

Lecture Notes:

Lecture notes will be scanned and posted on this webpage as the semester progresses.  

Objectives:
In this course we will examine statistical methods for situations in which there is more than one response variable.  We will develop multivariate extensions of t-tests and analysis of variance procedures.  Since these extensions are based on the multivariate normal distribution, we will briefly explore properties of that distribution.  We will explore ordination techniques for selecting low-dimensional summaries of high-dimensional data.  These include principal component analysis, factor analysis, canonical correlations, correspondence analysis, projection pursuit, multidimensional scaling, and related graphical techniques.  We will explore a variety of methods for classifying cases into pre-specified groups, including linear and quadratic discriminant analysis, logistic regression, neural nets, and classification trees.  We will also explore methods for creating groups, known as hierarchical and non-hierarchical clustering procedures.
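
As a rough illustration of what a multivariate extension of the t-test looks like, the sketch below computes a one-sample Hotelling T^2 statistic, T^2 = n (xbar - mu0)' S^{-1} (xbar - mu0), for two response variables.  It is written in plain Python only so that it is self-contained; that choice is an assumption made for this illustration, since course examples use SAS and R / S-PLUS.  The function name, the data, and the hypothesized mean are all made up.

```python
# Illustrative one-sample Hotelling T^2 for bivariate data (p = 2).
# The function name, the data, and the hypothesized mean are made up.

def hotelling_t2_bivariate(data, mu0):
    """Return (T2, F) for testing H0: mean vector equals mu0, with p = 2."""
    n = len(data)
    p = 2
    # Sample mean vector
    xbar = [sum(row[j] for row in data) / n for j in range(p)]
    # Sample covariance matrix S (divisor n - 1)
    s = [[sum((row[i] - xbar[i]) * (row[j] - xbar[j]) for row in data) / (n - 1)
          for j in range(p)] for i in range(p)]
    # Inverse of the 2x2 matrix S
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    s_inv = [[s[1][1] / det, -s[0][1] / det],
             [-s[1][0] / det, s[0][0] / det]]
    # T^2 = n (xbar - mu0)' S^{-1} (xbar - mu0)
    d = [xbar[j] - mu0[j] for j in range(p)]
    t2 = n * sum(d[i] * s_inv[i][j] * d[j] for i in range(p) for j in range(p))
    # Under H0 with multivariate normal data, (n - p) / (p (n - 1)) * T^2
    # follows an F distribution with p and n - p degrees of freedom.
    f_stat = (n - p) / (p * (n - 1)) * t2
    return t2, f_stat


sample = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]
t2, f_stat = hotelling_t2_bivariate(sample, mu0=(0.0, 0.0))
print(t2, f_stat)
```

With a single response variable the same formula reduces to the square of the usual one-sample t statistic, which is the sense in which T^2 extends the t-test.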

Software and computational devices:

Assignments largely focus on analyzing data and becoming familiar with software packages.  Examples will be presented in the lectures using SAS and either S-PLUS or R.  Links to files containing SAS and R code will be made available on this web page as we present them in the lectures.  Files containing R code should also run in S-PLUS.  Students may use other software, such as JMP, SPSS, or MATLAB, to complete assignments, but we will only provide examples and help for the SAS, R, and S-PLUS packages.

R software can be downloaded for free from the Comprehensive R Archive Network (CRAN), which may be reached through the main R web site:

            http://www.r-project.org

Click on the CRAN button under Downloads and scroll down to the United States entries.  You can explore those web sites, but clicking on 

            http://cran.us.r-project.org/

will get you to a web site from which you can download the version of R that fits your operating system.  There are Windows, Mac, and Linux versions.

Details on how to access SAS are available on the new SAS web page from the Statistics Department:

           Homepage->Resources->Software->SAS 

           (URL http://www.stat.iastate.edu/sas/sas.html ).

Linux SAS is now available for all ISU NetIDs. Vincent SAS will completely stop running at the end of February 2005.  Linux SAS is secured by SSH technology and can be run in full screen mode by installing the SSH version of Xwin32. It can also be run in batch mode by installing either putty.exe or the SSH client available from Scout. Details on how to install Xwin32/putty and how to access and run Linux SAS are available at 

             http://www.stat.iastate.edu/sas/sas.html


Copies of S-PLUS, R and SAS books and manuals are available in Snedecor 115, the Computation Center Library, and the Parks Library.  On-line help is also available.

You should have a calculator that you can bring to exams.


Course grades

Grades are based on about nine written assignments, a mid-term exam, and a final exam.


Material to be Covered

 
Topics, reading assignments, and lecture notes:

 1.  Introduction: Applications and notation
     (You can review matrix algebra on your own, Chapter 2 and Supplement 2A)
         Reading:        Johnson & Wichern, Chapter 1 and Sections 2.5-2.6
         Lecture notes:  Introduction
                         Random Vectors

 2.  Properties of the multivariate normal distribution and measures of variability
         Reading:        Johnson & Wichern, Chapters 4 and 3
         Lecture notes:  Distributions (part 1)
                         Distributions (part 2)
                         Distributions (part 3)

 3.  Inferences about mean vectors
         Reading:        Johnson & Wichern, Chapter 5 and Sections 6.1-6.3
         Lecture notes:  Covariance inference (part 1)
                         Covariance inference (part 2)
                         One Sample Hotelling T^2
                         Two Sample Hotelling T^2

 4.  Multivariate analysis of variance (MANOVA) and multivariate regression
         Reading:        Johnson & Wichern, Sections 6.4-6.9,
                         Sections 7.7-7.10, Supplement 7A
                         (You can also review the material in Sections 7.1-7.6 on univariate regression)
         Lecture notes:  Repeated measures (part 1)
                         Repeated measures (part 2)
                         Manova (part 1)
                         Manova (part 2)
                         Manova (part 3)

 5.  Principal component analysis, projection pursuit and graphical procedures
         Reading:        Johnson & Wichern, Chapter 8 and Sections 12.7-12.8
         Lecture notes:  PC (part 1)
                         PC (part 2)
                         PC (part 3)

 6.  Factor analysis
         Reading:        Johnson & Wichern, Chapter 9
         Lecture notes:  FactorAnalysis (part 1)
                         FactorAnalysis (part 2)
                         FactorAnalysis (part 3)
                         FactorAnalysis (part 4)

 7.  Correspondence analysis
         Reading:        Johnson & Wichern, Section 12.6
         Lecture notes:  CorrespondenceAnalysis

 8.  Classification: linear and quadratic discriminant analysis, classification trees, logistic regression, neural nets
         Reading:        Johnson & Wichern, Chapter 11
         Lecture notes:  Classification (part 1)
                         Classification (part 2)
                         Classification (part 3)
                         Classification (part 4)
                         Classification (part 5)

 9.  Canonical correlation
         Reading:        Johnson & Wichern, Chapter 10

10.  Cluster analysis
         Reading:        Johnson & Wichern, Sections 12.1-12.4
         Lecture notes:  Cluster (part 1)
                         Cluster (part 2)

11.  Multidimensional scaling
         Reading:        Johnson & Wichern, Section 12.5
   

     Assignments        Data files for       SAS and R code
     (pdf files)        assignments          for assignments      Solutions

                                             rabbits.R

                        rating.dat           rating.sas
                                             rating.R

                        census.dat

                        broota.dat           broota05.sas
                                             broota05.R

                        shippers.dat         shippers.sas
                                             shippers.R

                                             chbones.sas

 

Current Exams:

      1.  Formula sheets

Previous Exams:

      1.  Midterm Exam 1999      No solutions are available
      2.  Final Exam 1999        Solutions
      3.  Midterm Exam 2001      Solutions
      4.  Final Exam 2001        Solutions
      5.  Midterm Exam 2005      Solutions
 
 

R / S-PLUS and SAS code and data files for lecture examples:

        R / S-PLUS           SAS
        Program Code         Program Code         Data Files

   1.   trees.R              trees.sas            trees.dat
   2.   lrcov.R              lrcov.sas
   3.   lawschl.R            lawschl.sas          lawschl.dat
        lawschl.boot.R       lawschl.boot.sas
   4.   turtles.cov.R        turtles.cov.sas      turtles.dat
   5.   hotel1.R             turnips.sas
   6.   hotel2.R             steel.sas            steel.dat
   7.   dogs.R               dogs.sas             dogs.dat
   8.   morel.R              morel.sas            morel.dat
   9.   morel.Wilks.R
  10.   plates.R             plates.sas
  11.                        turtlef.sas
  12.   turtles.R            turtleb.sas          turtles.dat
  13.   race100k.R           race100k.sas         rack100k.dat
  13.                                             fbeetle.dat
  14.                        stocks.sas           stocks.dat
  15.                        ecorr.sas            ecorr.dat
  16.   corresp2.R           srolecor.sas
  17.   careers.R            careers.sas          careers.dat
                                                  careersR.dat
  18.   loans.R              loans.sas            loans.dat
  19.   logcrime.R           logcrime.sas         crime.dat
  20.                        logcross.sas
  21.   treecrim.R                                crimeR.dat
  22.   diabetes.R           diabetes.sas         diabetes.dat