sas

New MI Feature: Flux Statistics

The Viya 2024.04 release includes a new MI feature: additional missing data statistics. An important choice when building an imputation model is which variables to include. One way to guide that selection is to use summary statistics such as influx and outflux, as proposed by van Buuren.
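
To give a feel for what influx and outflux measure, here is a minimal Python sketch based on my reading of van Buuren's definitions: influx counts the pairs where a variable is missing and another variable is observed, divided by the total number of observed cells; outflux counts the pairs where the variable is observed and another is missing, divided by the total number of missing cells. This is not the SAS implementation. The function name `flux` and the toy data are made up for illustration.

```python
def flux(rows):
    """Return {column index: (influx, outflux)}; None marks a missing cell."""
    n_vars = len(rows[0])
    # r[i][j] is 1 when cell (i, j) is observed and 0 when it is missing
    r = [[0 if value is None else 1 for value in row] for row in rows]
    observed_cells = sum(map(sum, r))
    missing_cells = len(rows) * n_vars - observed_cells
    result = {}
    for j in range(n_vars):
        # pairs where column j is missing and column k is observed
        pairs_in = sum((1 - row[j]) * row[k] for row in r for k in range(n_vars))
        # pairs where column j is observed and column k is missing
        pairs_out = sum(row[j] * (1 - row[k]) for row in r for k in range(n_vars))
        influx = pairs_in / observed_cells if observed_cells else 0.0
        outflux = pairs_out / missing_cells if missing_cells else 0.0
        result[j] = (influx, outflux)
    return result


# Hypothetical toy data: column 0 is complete, columns 1 and 2 have gaps
toy_rows = [
    [1, 2, 3],
    [1, None, 3],
    [1, None, None],
    [1, 2, None],
]
toy_flux = flux(toy_rows)
```

In this toy data the complete column has influx 0 and outflux 1: it needs nothing from the other columns, and it is observed wherever they are missing, which makes it a useful predictor for imputing them.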

Calling R From SAS

The statistics literature is filled with example code and sample data in R. Sometimes I find myself wanting to work through some provided sample data and compare the output from R with SAS code.

Some Basic SQL Joins

A non-technical friend recently asked me for help with a merge problem. They had two separate data pulls of electronic medical records based on specific study parameters. The set of people in the database who fit the study parameters changed between the data pulls, for example because people aged into or out of the study, or because new diagnoses added to their records caused them to be newly included or excluded.
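
A small sketch of this kind of comparison, using Python's built-in `sqlite3` module rather than SAS. The table names `pull1` and `pull2`, the `patient_id` column, and the sample IDs are all hypothetical. An inner join finds people present in both pulls, and a left join filtered on NULL finds people present in only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pull1 (patient_id INTEGER PRIMARY KEY);
    CREATE TABLE pull2 (patient_id INTEGER PRIMARY KEY);
    INSERT INTO pull1 VALUES (1), (2), (3);
    INSERT INTO pull2 VALUES (2), (3), (4);
""")


def ids(query):
    """Run a query and return the first column as a list."""
    return [row[0] for row in conn.execute(query)]


# People who appear in both pulls
in_both = ids("""
    SELECT a.patient_id
    FROM pull1 a INNER JOIN pull2 b ON a.patient_id = b.patient_id
    ORDER BY a.patient_id
""")

# People in the first pull who are no longer in the second
dropped = ids("""
    SELECT a.patient_id
    FROM pull1 a LEFT JOIN pull2 b ON a.patient_id = b.patient_id
    WHERE b.patient_id IS NULL
    ORDER BY a.patient_id
""")

# People who are new in the second pull
added = ids("""
    SELECT b.patient_id
    FROM pull2 b LEFT JOIN pull1 a ON a.patient_id = b.patient_id
    WHERE a.patient_id IS NULL
    ORDER BY b.patient_id
""")
```

The same three queries should carry over to PROC SQL with little change, since they use only standard join syntax.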

Univariate Missing Data with PROC MI

In Chapter 3 of van Buuren’s Flexible Imputation of Missing Data, a variety of methods for imputing univariate missing data are presented. This post will summarize these techniques and show how to implement them in SAS.

Sampling Regression Lines

Last week we saw how to generate posterior samples using PROC MCMC for simple linear and logistic regression models. This week, I want to show how to sample regression lines from the data set returned by MCMC by plotting several sample regression lines on top of a scatter plot of the source data.

Simple Regression With PROC MCMC

In this post I’ll show how to fit simple linear and logistic regression models using the MCMC procedure in SAS. Note that the point of this post is to show how the mathematical model is translated into PROC MCMC syntax and not to discuss the method itself.

Loading Several XPT Files From a URL

The SAS Transport File Format (XPORT) is an open file format maintained by SAS for exchanging datasets. Its use is mandated by the FDA for data set submission for new drug or device applications and the CDC uses this format to distribute public data.

PROC MI Added to SASPy

I’m excited to announce that the new SASPy v4.6.0 release includes a pull request of mine that adds PROC MI to the SAS/STAT procedures directly exposed in SASPy. This procedure allows you to analyze missing data patterns and create imputations for missing data.

Missing Data Mechanisms

Understanding whether a variable’s missingness from a dataset is related to the underlying value of the data is a key concept in the field of missing data analysis. We distinguish three broad categories: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR).
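
The three mechanisms can be illustrated with a small simulation. This is a sketch under assumptions of my own, not taken from the post: the variables `x` and `y`, the missingness probabilities, and the seed are arbitrary choices. Only `y` is made missing. Under MCAR the probability is constant, under MAR it depends on the fully observed `x`, and under MNAR it depends on the value of `y` itself.

```python
import random

rng = random.Random(2024)  # arbitrary seed so the run is repeatable
n = 10_000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [xi + rng.gauss(0, 1) for xi in x]  # y is correlated with x


def mask(probabilities):
    """Return a list of booleans; True means that y value is missing."""
    return [rng.random() < p for p in probabilities]


# MCAR: every y has the same chance of being missing
mcar = mask([0.3] * n)
# MAR: missingness depends only on the observed x
mar = mask([0.6 if xi > 0 else 0.1 for xi in x])
# MNAR: missingness depends on the unobserved y itself
mnar = mask([0.6 if yi > 0 else 0.1 for yi in y])


def observed_mean(values, missing):
    """Mean of the values that were not masked out."""
    kept = [v for v, m in zip(values, missing) if not m]
    return sum(kept) / len(kept)


true_mean = sum(y) / n
```

Comparing `observed_mean(y, mcar)`, `observed_mean(y, mar)`, and `observed_mean(y, mnar)` against `true_mean` shows the practical difference: the MCAR mean stays close to the truth, while the MAR and MNAR means are pulled downward. Under MAR that bias can be corrected using `x`. Under MNAR it cannot be corrected from the observed data alone.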

CSV2DS

CSV2DS is a new program I wrote in Go to help me create minimum working examples for SAS that can be shared as a single SAS script.