Using Git with SAS Studio
Git is a widely used version control system that allows users to track their software development in both public and private repositories. It is also increasingly used to store data in text formats; see, for example, the New York Times COVID-19 data set. This post briefly demonstrates how to clone a GitHub repository and pull updates from it using the git functions that are built into SAS Studio.
Git functionality has been built into SAS Studio for a little while, so there are actually
two slightly different iterations of the git functions. The examples in this post will use the versions
compatible with SAS Studio 3.8, which is the current version available at SAS OnDemand for Academics.
All git functions use the same prefix. In older versions such as SAS Studio 3.8 the prefix is gitfn_, which is followed by a git command such as “clone” or “pull”. In SAS Studio 5, the prefix has been simplified to just git_. Most git functions have the same name between the two versions, so that the only difference is the prefix. A complete table of the old and new versions of the git functions is available in the documentation.
We use the git functions by calling them in an otherwise empty DATA step, in the following form:
data _null_;
    /* use your git functions here */
run;
Cloning a Repo
To clone a repo from GitHub we use gitfn_clone. It takes two arguments: the URL of the repository of interest and the path to an empty folder. You can have SAS create the folder for you by using OPTIONS DLCREATEDIR. The basic syntax for the clone is as follows:
data _null_;
    rc = gitfn_clone(
        "&repoURL.",     /* URL to repo */
        "&targetDIR.");  /* folder to put repo in */
    put rc=;             /* equals 0 if successful */
run;
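The folder creation mentioned above can be sketched as follows. This is a minimal example, assuming that the parent directory already exists and that you have write access to it; the path and the libref name repodir are hypothetical placeholders, not values from the original post. With DLCREATEDIR in effect, a LIBNAME statement that points at a missing directory creates it.

```sas
/* Hypothetical target path: replace with a location you can write to */
%let targetDIR=/home/your-user-id/covid-19-data;

/* With DLCREATEDIR set, LIBNAME creates the directory if it is missing */
options dlcreatedir;
libname repodir "&targetDIR.";

/* The libref (a placeholder name) is not needed once the folder exists */
libname repodir clear;

/* Restore the default so later LIBNAME typos do not create folders */
options nodlcreatedir;
```

Run this once before the clone step so that gitfn_clone finds an empty folder at the target path.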
It doesn’t matter whether the URL you use ends in “.git” or not. In other words, the following two macro variable definitions work the same:
%LET repoURL=https://github.com/nytimes/covid-19-data;
/* works the same as */
%LET repoURL=https://github.com/nytimes/covid-19-data.git;
You can also use password-based authentication to clone private repositories:
data _null_;
    rc = gitfn_clone(
        "&repoURL.",
        "&targetDIR.",
        "&githubUSER.",    /* your GitHub username */
        "&githubPASSW.");  /* your GitHub password */
    put rc=;               /* equals 0 if successful */
run;
NOTE: GitHub is deprecating password-based authentication; you will need to switch to OAuth authentication or SSH keys if you are not already using them. To access a repository using an SSH key, use the following:
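data _null_;
    rc = gitfn_clone(
        "&repoURL.",
        "&targetDIR.",
        "&sshUSER.",      /* SSH user name */
        "&sshPASSW.",     /* SSH password */
        "&sshPUBkey.",    /* path to the public key file */
        "&sshPRIVkey.");  /* path to the private key file */
    put rc=;              /* equals 0 if successful */
run;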
Pulling in Updates
It is just as easy to pull in updates to a local repository by using gitfn_pull("&repoDIR."). This also works with SSH keys for private repositories:
data _null_;
    rc = gitfn_pull(
        "&repoDIR.",      /* folder containing the local repo */
        "&sshUSER.",
        "&sshPASSW.",
        "&sshPUBkey.",
        "&sshPRIVkey.");
    put rc=;              /* write the return code to the log */
run;
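For a public repository, the one-argument form mentioned above can be wrapped in a DATA step that also reports the outcome. This is a minimal sketch, assuming that &repoDIR. already points at a folder cloned earlier and that, as with gitfn_clone, a return code of 0 indicates success; the log messages are illustrative only.

```sas
data _null_;
    /* Pull updates into the existing local repository */
    rc = gitfn_pull("&repoDIR.");

    /* Assumption: 0 means success, as with gitfn_clone */
    if rc = 0 then put "NOTE: Pull succeeded.";
    else put "WARNING: Pull returned a nonzero code: " rc=;
run;
```

Checking the return code in the log makes it easier to spot a failed pull before later steps read stale data.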
Other Functions
SAS also offers other built-in functions, such as _diff, _status, _push, and _commit. For a complete list, see the SAS documentation.