Parallel execution in SAS

The example below shows one way to do parallel execution of SAS sessions. In the example below two SAS sessions is started in parallel and a third is executed depending on the result of the two session.

Gets the path of the main SAS program that is executed.

%macro GetSASProgramPath;
 %qsubstr(%sysget(SAS_EXECFILEPATH),1,%length(%sysget(SAS_EXECFILEPATH))-%length(%sysget(SAS_EXECFILEname))-1);
%mend;

 %let SASProgramPath = %GetSASProgramPath;
%put &SASProgramPath;

Starts the first SAS session. %syslput makes it possible to give the SAS session a macrovariable from the main program. In this case it is the path for the SAS program being executed.
wait=no tells SAS not to wait for the SAS session to end before starting the next SAS session.
%sysrput makes it possible to return a macrovariable from a SAS session to the main SAS program. In this cause it whether or not the SAS session executed with og without errors.

option sascmd = “sas”;
signon remote = OneSes;
%syslput _SASProgramPath = &SASProgramPath;
rsubmit process = OneSes wait=no;
 %include “&_SASProgramPath\<SOME PROGRAM.SAS>”;
 %sysrput FirstSessionError = &syscc;
endrsubmit;

Starts the second SAS session.

option sascmd = “sas”;
signon remote = TwoSes;
%syslput _SASProgramPath = &SASProgramPath;
rsubmit process = TwoSes wait=no;
%include “&_SASProgramPath\<SOME PROGRAM.SAS>”;
%sysrput SecondSessionError = &syscc;
endrsubmit;

The statement below waits for the two sessions above to finnish and executes a SAS program if the two sessions did not fail.

waitfor OneSes TwoSes;
signoff OneSes;
signoff TwoSes;
%macro FinalRun;
%if &FirstSessionError < 5 and &SecondSessionError < 5 %then %do;
%put –== No Errors ==–;
%include “&_SASProgramPath\<SOME PROGRAM.SAS>”;
%end;
%mend;
%FinalRun;

Use SAS with multiple cores in a CPU – and other hardware considerations

SAS does use multiple cores in some of it’s procedures eg. PROC SORT, PROC SQL (order by and group by) and others. The use of parallel execution was introduced with SAS version 9. A description of how SAS uses threads can be found here.

It is possible to test the use and impact of executing SAS using multiple cores in parallel by using the OPTION NOTHREADS. This option tells SAS, not to use parallel execution when possible. OPTION THREADS is on by default. It is also possible to do some experimenting with OPTIONS CPUCOUNT where it is possible to set the number of cores for SAS to use in it’s execution of the SAS-program.
Experience says that SAS works best in a 4 core environment. Which probably has something to do with  Amdahls law. That is of course not the case if your system handles multiple users or you start multiple SAS-processes eg. through the use of RSUBMIT with multiple tasks in one SAS-session or a script that starts multiple SAS-sessions.
If you are operating in an Intel environment the type of processor you use can have a big impact on the performance of your SAS-program. Intel XEON X5000-series performs a lot better than the Intel XEON X7000-series. Rule of thumb is that for big jobs a high clockfrequency is preferable.

Another rule of thumb is to use 4GB of RAM per core on a 64bit system and 2GB of RAM per core on a 32bit system of course depending on your individual use of the available memory. Paging of the memory should of course be avoided.

Reading from SSD disks connected to the server is of course very fast but still expensive considering that you probably need some sort of RAID configuration. Writing temporary files and log files to SSD disks could give a performance advantage, but with long seriel writes cheaper disk maybe connected through a SAN might give a better performance than SSD disks.