The INDSNAME Option in SAS

I frequently need to concatenate data sets while still being able to tell which row came from which data set. Introductory SAS courses tend to teach the IN= data set option, for a workflow similar to this:

data Concat1;
    set data1(in = ds0)
        data2(in = ds1);
    if ds0 then source = "data1";
    else if ds1 then source = "data2";
run;

With more than two input data sets, this gets unwieldy and repetitive. An old post on Rick Wicklin’s blog, The DO Loop, introduces a better method: the INDSNAME= option on the SET statement. With it, the code above becomes much shorter:

data Concat2;
set data1-data2 indsname = source;  /* the INDSNAME= option is on the SET statement */
libref = scan(source,1,'.');        /* extract the libref */
dsname = scan(source,2,'.');        /* extract the data set name */
run;

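As long as your input data sets are reasonably named, libref and dsname now give you all the information you need. Note that the INDSNAME= variable itself (source here) exists only while the DATA step runs and is not written to the output data set, so copy it to another variable if you want to keep the full name.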
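
To try this end to end, here is a minimal sketch. The input data sets, the variable x, and the names Concat3 and dsn are made up for illustration, and the sketch assumes everything lives in the WORK library. It also copies the INDSNAME= value into a regular variable, since the INDSNAME= variable is not saved to the output.

```sas
/* Hypothetical sample inputs; names and values are for illustration only */
data data1;
    x = 1; output;
    x = 2; output;
run;

data data2;
    x = 3; output;
run;

data Concat3;
    length source $41;                 /* 8 (libref) + 1 (dot) + 32 (data set name) */
    set data1-data2 indsname = dsn;    /* dsn is temporary and is not written to Concat3 */
    source = dsn;                      /* keep the full two-level name, e.g. WORK.DATA1 */
    libref = scan(dsn, 1, '.');        /* extract the libref */
    dsname = scan(dsn, 2, '.');        /* extract the data set name */
run;

proc print data = Concat3;
run;
```

SAS stores the name in uppercase, so compare against values such as "WORK.DATA1" rather than "work.data1" when filtering on source.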

D. Michael Senter
Research Statistician Developer

My research interests include data analytics and missing data.