Different Cases of SAS Concatenate Data Sets with SET Statement

FREE Online Courses: Transform Your Career – Enroll for Free!

Previously, we studied SAS Functions, now we will move on to a very easy and interesting method to combine two data sets in SAS programming i.e. SAS concatenate data sets.

After the end of this tutorial, you will be able to understand SAS concatenate datasets and different cases of concatenating data sets.

What is SAS Concatenate Data Sets?

We come across different types of SAS datasets and many times we would want to combine two data sets to have a clear idea about our analysis. The SET statement is used to concatenate two datasets in SAS.

If there are two data sets, say one has 2 observations and the other has 3 observations, then, our SAS concatenate dataset will have 5 observations. The order of observations is sequential. All observations from the first data set are followed by all observations from the second data set, and so on.

Below is the syntax of a SET statement in SAS

SET data-set 1 data-set 2 data-set 3.....;

Data-set 1, data-set 2 and data-set 3 are names of the data sets we wish to combine.

For example – suppose the data set store1 contains three variables, store (number), day (of the week), and sales (in dollars):

Store        Day           Sales
1               M             1200
1               T              1435
1               W             1712
1               R              1529
1               F              1920
1               S              2325

and the data set store2 contains the same three variables:

Store         Day           Sales
2                M            2215
2                T             2458
2                W            1798
2                R             1692
2                F             2105
2                S             2847

For SAS concatenate data sets, you simply specify a list of data set names in one SET statement.

DATA store1;
      input Store Day $ Sales;
1    M    1200
1    T    1435
1    W    1712
1    R    1529
1    F    1920
1    S    2325
DATA store2;
      input Store Day $ Sales;
2    M   2215
2    T   2458
2    W   1798
2    R   1692
2    F   2105
2    S   2847
DATA bothstores;
      set store1 store2;
PROC PRINT data = bothstores;
       title ‘ The bothstores dataset’;

The output of the above program will look like:
Store      Day        Sales
1             M           1200
1             T             1435
1             W            1712
1             R             1529
1             F             1920
1             S             2325
2             M             2215
2             T             2458
2             W            1798
2             R             1692
2             F              2105
2             S             2847

Different Cases of Concatenating Data Sets

In the above example, we saw that both datasets had three variables so it was easy to concatenate them. But, what happens if the variables are different, or of different lengths or of different names? Let us understand this from SAS Concatenate Datasets in different cases.

SAS Concatenate Data Sets

1. Different Names of Variables

In this case, both the data sets have different names for their variables. So, when we try and concatenate, an error might get displayed.

In order to overcome this, when we concatenate the datasets, we will rename the variables so that their name appears as one in the concatenated data set. Let us look at how we can do this.


In the below example data set x1 has the variable name x1inc whereas the data set x2 has the variable name x2inc. But both of these variables represent the same type(numeric). We apply the RENAME function in the SET statement as shown below:

DATA x1;
INPUT empid name $ x1inc ;
2 Art  22000
1 Bill 30000
3 Paul 25000
DATA x2;
INPUT empid name $ x2inc ;
1 Bess 15000
3 Pat  50000
2 Amy  18000
DATA x3; 
SET x1(RENAME=(x1inc=inc))x2(RENAME=(x2inc=inc));


OBS           EMPID          NAME             INC
1                 2                  Art              22000
2                 1                  Bill              30000
3                 3                  Paul             25000
4                 1                  Bess            15000
5                 3                  Pat              50000
6                 2                  Amy            18000

2. Different Variable Attributes

We talked about variable attributes in earlier classes which are a type, length, label, etc. Now, what happens if these differ. Again, while combining datasets, we can specify the length in our SET statement, that we want to keep or the data type of the variables.

Let us look at an example:

In the below example, the variable name is of length 3 in the first data set and 4 in the second. When concatenating, we apply the LENGTH statement in SAS concatenate data set to set the name length to 7 or the sum of two lengths.

DATA x1;   
INPUT empid name $3. inc;
2 Art  22000  
1 Bob  30000  
3 Tom  25000  
DATA x2;  
INPUT empid name $4. inc;
1 Bess 15000   
3 Rory 50000   
2 Jane 18000 
DATA x3;  
SET x1 x2;

OBS        empID         NAME           INC
1              2                Art             22000
2              1                Bob            30000
3              3                Tom            25000
4              1                Bes            15000
5              3                Ror            50000
6              2                Jan            18000

3. Different Number of Variables

If one of the two datasets has more number of variables than the other, then the data set which has less number of variables shows missing values in place of that extra variable.


Today we learned what is concatenation in SAS, how SAS concatenate data sets in SAS programming. We also discussed different cases of concatenating data sets in SAS.

Hope you find the explanation helpful. If you have any queries, feel free to ask in the comment section.

Did you like this article? If Yes, please give DataFlair 5 Stars on Google

follow dataflair on YouTube

Leave a Reply

Your email address will not be published. Required fields are marked *