sas

Missing Data Mechanisms

A key concept in missing data analysis is whether a variable’s missingness is related to the underlying values of the data. We distinguish three broad categories: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR).
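To make the three categories concrete, here is a minimal simulation sketch. It is written in Python rather than SAS purely for illustration. The variable names, the 30% and 60% missingness probabilities, and the data-generating model are all assumptions chosen for the example; none of them come from the post itself.

```python
import random
import statistics


def observed_means(n=20000, seed=0):
    """Mean of the observed y values under three hypothetical missingness rules."""
    rng = random.Random(seed)
    kept = {"MCAR": [], "MAR": [], "MNAR": []}
    for _ in range(n):
        x = rng.gauss(0, 1)        # covariate that is always observed
        y = x + rng.gauss(0, 1)    # variable that may go missing; true mean is 0

        # MCAR: every y has the same 30% chance of being missing.
        if rng.random() >= 0.3:
            kept["MCAR"].append(y)
        # MAR: the chance of missingness depends only on the observed covariate x.
        if rng.random() >= (0.6 if x > 0 else 0.0):
            kept["MAR"].append(y)
        # MNAR: the chance of missingness depends on the value of y itself.
        if rng.random() >= (0.6 if y > 0 else 0.0):
            kept["MNAR"].append(y)
    return {name: statistics.fmean(values) for name, values in kept.items()}


if __name__ == "__main__":
    for mechanism, mean in observed_means().items():
        print(f"{mechanism}: mean of observed y = {mean:.3f}")
```

Under these assumptions, the mean of the observed values stays close to the true mean of 0 for MCAR, while it is pulled below 0 for both MAR and MNAR. The difference between the last two is that the MAR bias can in principle be corrected by conditioning on the observed x, whereas the MNAR bias cannot be recovered from the observed data alone.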

CSV2DS

CSV2DS is a new program I wrote in Go to help me create minimum working examples for SAS that can be shared as a single SAS script.

SAS Markdown for Reproducibility

One of the coolest packages for R is knitr. Essentially, it allows you to combine explanatory writing, such as a paper or blog post, directly with your analysis code in a Markdown document.

Loading Zillow Housing Data in SAS

Zillow is a well-known website widely used by those searching for a home or curious to find out the value of their current home. What you may not know is that Zillow has a dedicated research page.

The INDSNAME Option in SAS

I frequently need to concatenate data sets while still being able to tell which data set each row originally came from. Introductory SAS courses tend to teach the IN= data set option for a workflow similar to this:

Working with the Census API Directly from SAS

A post showing how PROC HTTP and LIBNAME JSON can be used to directly work with the Census API from SAS.

Cleaning up a Date String with RegEx in SAS

Sometimes we have to deal with manually entered data, which usually needs to be cleaned for consistency. Errors inevitably creep in when data is typed in by hand, and different people often enter the same information in different ways.

From Proc Import to a Data Step with Regex

I often need to import CSV files with a relatively large number of columns. In many cases, PROC IMPORT works surprisingly well in giving me what I want. Sometimes, though, I need to do some work while reading in the file. A data step would be the natural tool for that, but I don’t want to type it in by hand.

Making INPUT and LABEL Statements with AWK

I am currently working with a database provided by the North Carolina Department of Public Safety that consists of several fixed-width files. Each file has an associated codebook that gives, for every variable, its internal name, label, and data type, as well as the start column and length of its field.

SASPy Video Tutorial

I have been using both SAS and Python extensively for a while now. Each has great features, so it has been very useful to combine my skills in both languages by moving seamlessly between SAS and Python in a single notebook.