SAS/STAT Discriminant Analysis Procedure

In our previous tutorial we looked at SAS/STAT Longitudinal Data Analysis Procedures. Today we will look at SAS/STAT discriminant analysis and discuss how to use it.

Our focus here is on the three procedures for performing discriminant analysis in SAS/STAT, namely PROC DISCRIM, PROC CANDISC, and PROC STEPDISC, each illustrated with an example.

So, let’s start with the SAS/STAT Discriminant Analysis Procedure.

What is SAS/STAT Discriminant Analysis?

Discriminant analysis is a statistical technique used when the criterion (dependent) variable is categorical and the predictor (independent) variables are interval-scaled, that is, quantitative.
Discriminant analysis is closely related to analysis of variance (ANOVA): both ask whether group means differ on the measured variables, but discriminant analysis then uses those differences to predict group membership.

Consider a simple example. Suppose we measure height in a random sample of 50 males and 50 females. Females are, on average, not as tall as males, and this difference shows up as a difference in the group means of the variable Height. Height therefore lets us discriminate between males and females with better-than-chance probability: a tall person is more likely to be male, and a short person is more likely to be female.
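
The height example can be tried out with simulated data. The following is only a sketch: the data set name heights, the seed, and the means and standard deviations are illustrative assumptions, not measured values.

```sas
/* Simulated heights in cm; the means and standard deviations
   below are illustrative assumptions, not real measurements */
data heights;
   call streaminit(2024);            /* fixed seed so the run is repeatable */
   length sex $6;
   do i = 1 to 50;
      sex = 'Male';   height = rand('normal', 176, 7); output;
      sex = 'Female'; height = rand('normal', 163, 6); output;
   end;
   drop i;
run;

/* Compare the two group means */
proc means data=heights mean std;
   class sex;
   var height;
run;

/* Classify each person as male or female from height alone */
proc discrim data=heights;
   class sex;
   var height;
run;
```

The classification summary printed by PROC DISCRIM should show an error rate below 50%, which is what "better than chance" means here. The exact counts depend on the simulated sample.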

A common application of discriminant analysis is to include many measures in a study in order to determine which ones discriminate between groups. For example, an educational researcher interested in predicting high school graduates' choices for further education would probably include as many measures of personality, achievement, motivation, academic performance, and so on as possible, in order to learn which ones offer the best prediction.

[Figure: Steps in Discriminant Analysis in SAS/STAT]

Procedures for Performing Discriminant Analysis in SAS/STAT

The following SAS/STAT procedures perform discriminant analysis on sample data. Each procedure has its own syntax and is suited to a different kind of task. Let us explore each of them.

a. PROC CANDISC

PROC CANDISC performs canonical discriminant analysis, a dimension reduction technique that finds linear combinations of the quantitative variables (canonical variables) providing maximum separation between classes. With the DISTANCE option it also reports the squared Mahalanobis distances between class means.
Syntax of PROC CANDISC

PROC CANDISC DATA=dataset <options>;
CLASS variable;
VAR variables;
RUN;

Example of PROC CANDISC

data iris;
set sashelp.iris;
run;
/* OUT= saves the canonical variable scores; DISTANCE and ANOVA request extra tables */
proc candisc data=iris out=outcan distance anova;
class species;
var sepallength sepalwidth petallength petalwidth;
run;
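
One way to see the separation is to plot the first two canonical variables. The sketch below assumes the OUT= data set stores them as Can1 and Can2 (the default names, as far as we know) and uses PROC SGPLOT, which is not part of the original example.

```sas
/* Recompute the canonical scores quietly; NOPRINT suppresses the tables */
proc candisc data=sashelp.iris out=outcan noprint;
   class species;
   var sepallength sepalwidth petallength petalwidth;
run;

/* Scatter plot of the first two canonical variables, colored by species */
proc sgplot data=outcan;
   scatter x=Can1 y=Can2 / group=species;
   xaxis label='Canonical variable 1';
   yaxis label='Canonical variable 2';
run;
```

If the classes are well separated, the three species should form distinct clusters, mostly along the first canonical variable.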

 

[Output screenshots: PROC CANDISC results for the iris data]

b. PROC DISCRIM

PROC DISCRIM performs discriminant analysis and uses the result to classify observations into groups. Its purpose is similar to that of logistic regression, but where binary logistic regression handles a response with two categories, PROC DISCRIM works with a class variable that can have several categories.
Syntax for PROC DISCRIM

PROC DISCRIM DATA=dataset <options>;
CLASS variable;
VAR variables;
RUN;

Example of PROC DISCRIM

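/* Copy the iris data from the SASHELP library */
data iris;
set sashelp.iris;
run;
/* DISTANCE, ANOVA, MANOVA: between-class distances and tests of class means;
   CROSSLISTERR: list the observations misclassified under cross validation */
proc discrim data=iris
distance anova manova crosslisterr;
class species;
var sepallength sepalwidth petallength petalwidth;
run;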

The DISCRIM procedure begins by displaying summary information about the variables in the analysis. This information includes the number of observations, the number of quantitative variables in the analysis (specified with the VAR statement), and the number of classes in the classification variable (specified with the CLASS statement).

The frequency of each class, its weight, the proportion of the total sample, and the prior probability are also displayed.
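
PROC DISCRIM can also classify observations that were not used to build the rule. The sketch below is one way to do that, under some assumptions: the data set names train, test, and scored are hypothetical, the every-fifth-row split is arbitrary, and the predicted class is taken from the _INTO_ variable that the TESTOUT= data set is expected to contain.

```sas
/* Hold out every fifth observation as a test set (an arbitrary split) */
data train test;
   set sashelp.iris;
   if mod(_n_, 5) = 0 then output test;
   else output train;
run;

/* Build the rule on TRAIN and score TEST; PRIORS PROPORTIONAL sets the
   prior probabilities to the class proportions in the training data */
proc discrim data=train testdata=test testout=scored;
   class species;
   var sepallength sepalwidth petallength petalwidth;
   priors proportional;
run;

/* Cross-tabulate actual species against the predicted class */
proc freq data=scored;
   tables species * _INTO_ / norow nocol nopercent;
run;
```

Off-diagonal cells in the resulting table are test observations that the rule assigned to the wrong species.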

[Output screenshots: PROC DISCRIM results for the iris data]

c. PROC STEPDISC

The PROC STEPDISC procedure in SAS/STAT performs a stepwise discriminant analysis to select a subset of the quantitative variables for use in discriminating among the classes. The STEPDISC procedure can be used for forward selection, backward elimination, or stepwise selection.
Syntax for PROC STEPDISC

PROC STEPDISC DATA=dataset <options>;
CLASS variable;
VAR variables;
RUN;

Example of PROC STEPDISC

data iris;
set sashelp.iris;
run; 
/* With no METHOD= option, PROC STEPDISC uses stepwise selection */
proc stepdisc data=iris;
class species;
var sepallength sepalwidth petallength petalwidth;
run;
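
The three selection methods mentioned above can be requested explicitly with the METHOD= option. The following is a sketch only; the significance levels shown for SLENTRY= and SLSTAY= are illustrative choices, not recommendations.

```sas
/* Stepwise selection: variables may enter and later be removed */
proc stepdisc data=sashelp.iris method=stepwise slentry=0.15 slstay=0.15;
   class species;
   var sepallength sepalwidth petallength petalwidth;
run;

/* Forward selection: variables only enter, one at a time */
proc stepdisc data=sashelp.iris method=forward slentry=0.05;
   class species;
   var sepallength sepalwidth petallength petalwidth;
run;

/* Backward elimination: start with all variables and remove the weakest */
proc stepdisc data=sashelp.iris method=backward slstay=0.05;
   class species;
   var sepallength sepalwidth petallength petalwidth;
run;
```

SLENTRY= is the significance level a variable must meet to enter the model and SLSTAY= the level it must meet to stay. The selected variables can then be passed to PROC DISCRIM or PROC CANDISC in a VAR statement.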

[Output screenshots: PROC STEPDISC results for the iris data]

Conclusion

In this tutorial we covered the three discriminant analysis procedures offered by SAS/STAT: PROC DISCRIM, PROC CANDISC, and PROC STEPDISC. We looked at what each one does, its syntax, and an example of how to use it.

We hope you enjoyed it. Stay tuned for more topics in SAS/STAT, and if you have any questions, post them in the comments section below.
