Statistics 501

Multivariate Statistical Methods

Basic Information:

Date:  Spring 2005


Instructor:   Kenneth J. Koehler
              120 Snedecor Hall
              Telephone:  515-294-4181
              Fax:  515-294-4040
              Office hours:  MWF 2-3 pm  (Friday session is in 115 Davidson)

Teaching Assistant:   MinHui Paik
                      202D Snedecor Hall
                      Telephone:  294-6609
                      Office hour:  Tuesday 1-2 pm


Textbook:

Johnson, R. A. & Wichern, D. W.,  Applied Multivariate Statistical Analysis,  5th edition,
        Prentice-Hall,  2002,  Upper Saddle River, New Jersey.  ISBN 0-471-29008-5

Lecture Notes:

Lecture notes will be scanned and posted on this webpage as the semester progresses.  

In this course we will examine statistical methods for situations in which there is more than one response variable.  We will develop multivariate extensions of t-tests and analysis of variance procedures.  Since these extensions are based on the multivariate normal distribution, we will briefly explore its properties.  We will explore ordination techniques for selecting low-dimensional summaries of high-dimensional data.  These include principal component analysis, factor analysis, canonical correlations, correspondence analysis, projection pursuit, multidimensional scaling, and related graphical techniques.  We will explore a variety of methods for classifying cases into pre-specified groups, including linear and quadratic discriminant analysis, logistic regression, neural nets, and classification trees.  We will also explore methods for creating groups, known as hierarchical and non-hierarchical clustering procedures.

Software and computational devices:

Assignments largely focus on analyzing data and becoming familiar with software packages.  Examples will be presented in the lectures using SAS and either S-PLUS or R.  Links to files containing SAS and R code will be made available on this web page as we present them in the lectures.  Files containing R code should also run in S-PLUS.  Students may use other software, such as JMP, SPSS, or MATLAB, to complete assignments, but we will only provide examples and help for SAS, R, and S-PLUS.

R can be downloaded for free from the Comprehensive R Archive Network (CRAN), which may be reached through the main R web site 


Click on the CRAN button under Downloads and scroll down to the United States entries.  You can explore those web sites, but clicking on 


will get you to a web site from which you can download the version of R that fits your operating system.  There are Windows, Mac, and Linux versions.

Details on how to access SAS are available on the new SAS web page from the Statistics Department


           (URL http// ).

Linux SAS is now available for all ISU NetIDs. Vincent SAS will completely stop running at the end of February 2005.  Linux SAS is secured by SSH technology and can be run in full screen mode by installing the SSH version of Xwin32. It can also be run in batch mode by installing either putty.exe or the SSH client available from Scout. Details on how to install Xwin32/putty and how to access and run Linux SAS are available at 


Copies of S-PLUS, R and SAS books and manuals are available in Snedecor 115, the Computation Center Library, and the Parks Library.  On-line help is also available.

You should have a calculator that you can bring to exams.

Course grades

Course grades are based on about nine written assignments, a mid-term exam, and a final exam.

Material to be Covered

Each topic below lists its reading assignments and the lecture notes that cover it.

 1.  Introduction: Applications and notation
     (You can review matrix algebra on your own, Chapter 2 and Supplement 2A)
     Reading Assignments:  Johnson & Wichern, Chapter 1 and Sections 2.5-2.6
     Lecture Notes:        Random Vectors

 2.  Properties of the multivariate normal distribution and measures of variability
     Reading Assignments:  Johnson & Wichern, Chapters 4 and 3
     Lecture Notes:        Distributions (part 1)
                           Distributions (part 2)
                           Distributions (part 3)

 3.  Inferences about mean vectors
     Reading Assignments:  Johnson & Wichern, Chapter 5 and Sections 6.1-6.3
     Lecture Notes:        Covariance inference (part 1)
                           Covariance inference (part 2)
                           One Sample Hotelling T^2
                           Two Sample Hotelling T^2

 4.  Multivariate analysis of variance (MANOVA) and multivariate regression
     Reading Assignments:  Johnson & Wichern, Sections 6.4-6.9,
                           Sections 7.7-7.10, Supplement 7A
                           (You can also review the material in Sections 7.1-7.6 on univariate regression)
     Lecture Notes:        Repeated measures (part 1)
                           Repeated measures (part 2)
                           Manova (part 1)
                           Manova (part 2)
                           Manova (part 3)

 5.  Principal component analysis, projection pursuit and graphical procedures
     Reading Assignments:  Johnson & Wichern, Chapter 8 and Sections 12.7-12.8
     Lecture Notes:        PC (part 1)
                           PC (part 2)
                           PC (part 3)

 6.  Factor analysis
     Reading Assignments:  Johnson & Wichern, Chapter 9
     Lecture Notes:        FactorAnalysis (part 1)
                           FactorAnalysis (part 2)
                           FactorAnalysis (part 3)
                           FactorAnalysis (part 4)

 7.  Correspondence analysis
     Reading Assignments:  Johnson & Wichern, Section 12.6
     Lecture Notes:        CorrespondenceAnalysis

 8.  Classification: linear and quadratic discriminant analysis, classification trees, logistic regression, neural nets
     Reading Assignments:  Johnson & Wichern, Chapter 11
     Lecture Notes:        Classification (part 1)
                           Classification (part 2)
                           Classification (part 3)
                           Classification (part 4)
                           Classification (part 5)

 9.  Canonical correlation
     Reading Assignments:  Johnson & Wichern, Chapter 10

10.  Cluster analysis
     Reading Assignments:  Johnson & Wichern, Sections 12.1-12.4
     Lecture Notes:        Cluster (part 1)
                           Cluster (part 2)

11.  Multidimensional scaling
     Reading Assignments:  Johnson & Wichern, Section 12.5

Assignments (these are pdf files)     Data files for Assignments     SAS and R Code for Assignments     Solutions

Current Exams:
       1. Formula sheets

Previous Exams:

       1.  Midterm Exam 1999       No solutions are available
       2.  Final Exam 1999         Solutions
       3.  Midterm Exam 2001       Solutions
       4.  Final Exam 2001         Solutions
       5.  Midterm Exam 2005       Solutions

         R / S-PLUS and SAS code and data files
                  for lecture examples:

         R / S-PLUS          SAS
         Program Code        Program Code        Data Files

   1.    trees.R                                 trees.dat
   2.    lrcov.R
   3.    lawschl.R                               lawschl.dat
   4.    turtles.cov.R                           turtles.dat
   5.    hotel1.R
   6.    hotel2.R                                steel.dat
   7.    dogs.R                                  dogs.dat
   8.    morel.R                                 morel.dat
   9.    morel.Wilks.R
  10.    plates.R
  12.    turtles.R                               turtles.dat
  13.    race100k.R                              rack100k.dat
  13.                                            fbeetle.dat
  14.                                            stocks.dat
  15.                                            ecorr.dat
  16.    corresp2.R
  17.    careers.R                               careers.dat
  18.    loans.R                                 loans.dat
  19.    logcrime.R                              crime.dat
  21.    treecrim.R                              crimeR.dat
  22.    diabetes.R                              diabetes.dat