Remote Hosted, Local Jupyter?!

If you visit the Project Jupyter website you’ll encounter a bunch of “try it in your browser” buttons. If you’ve used Jupyter for a decade or so like me, you have probably been ignoring these buttons. And if you have clicked on them, you might have been led to mybinder.org. Don’t get me wrong, mybinder is cool. It creates a Docker image that remote-hosts a live environment so that you can share your interactive notebooks on the web. Cool stuff. But I just found something better. ...

Dec 23, 2024 · 2 min · 351 words · D. Michael Senter

PROC MI Added to SASPy

I’m excited to announce that the new SASPy v4.6.0 release includes a pull request of mine that adds PROC MI to the SAS/STAT procedures directly exposed in SASPy. This procedure allows you to analyze missing data patterns and create imputations for missing data. ...

Feb 6, 2023 · 3 min · 546 words · D. Michael Senter

Conditional RegEx Matching with Python

When the endpoint of your match depends on an earlier term, try conditional regex matching in Python.
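
As a rough illustration of the idea (not necessarily the example used in the post), Python's `re` module supports the conditional syntax `(?(1)yes|no)`, which picks a sub-pattern based on whether group 1 matched. The sketch below assumes a hypothetical task: accept an address either bare or wrapped in angle brackets, requiring the closing `>` only when the opening `<` was present. The names `PATTERN` and `extract_address` are made up for this example.

```python
import re

# Group 1 optionally captures an opening angle bracket.
# (?(1)>|$) reads: if group 1 matched, require ">", otherwise require end of string.
PATTERN = re.compile(r"^(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)")


def extract_address(text):
    """Return the address if the brackets are balanced (or absent), else None."""
    match = PATTERN.match(text)
    return match.group(2) if match else None


print(extract_address("<user@host.com>"))  # bracketed on both sides
print(extract_address("user@host.com"))    # no brackets at all
print(extract_address("<user@host.com"))   # missing closing bracket
print(extract_address("user@host.com>"))   # stray closing bracket
```

The first two calls return the address, while the last two return `None`, because the endpoint the pattern demands depends on what the first group saw.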

May 19, 2022 · 3 min · 529 words · D. Michael Senter

Making VS Code and Python Play Nice on Windows

One of the editors I use regularly is VS Code. I work a lot with Python, but when installing Anaconda using default settings on a Windows machine that already has VSC installed, there’s a good chance you’ll run into an issue. When attempting to run Python code straight from VSC you may get an error. This should be fixed on some newer versions of Anaconda, but I’ve needed to do something about it often enough that I feel it’s useful to save the solution janh posted on StackExchange. ...

Jul 21, 2021 · 1 min · 126 words · D. Michael Senter

SASPy Video Tutorial

I have been using both SAS and Python extensively for a while now. With each having great features, it was very useful to combine my skills in both languages by seamlessly moving between SAS and Python in a single notebook. In the video below, fellow SAS intern Ariel Chien and I show how easy it is to connect the SAS and Python kernels using the open-source SASPy package together with SAS OnDemand for Academics. I hope you will also find that this adds to your workflow! ...

Jun 29, 2021 · 1 min · 130 words · D. Michael Senter

North Carolina Housing Data

A popular beginner’s machine learning problem is the prediction of housing prices. A frequently used data set for this purpose uses housing prices in California along with some additional data gathered through the 1990 Census. One such data set is available here at Kaggle. Unfortunately, that data set is rather old. And I live in North Carolina, not California! So I figured I might as well create a new housing data set, but this time with more up-to-date information and using North Carolina as the state to be analyzed. One thing that may be interesting about North Carolina as compared to California is the position of major population centers. In California, major population centers are near the beach, while major population centers in North Carolina are in the interior of the state. Both large cities and proximity to the beach tend to correlate with higher housing prices. In California, unlike in North Carolina, both of these go together. ...

Nov 6, 2020 · 8 min · 1644 words · D. Michael Senter

Teacher Salaries

What do you do when your data table is in PDF format? Let’s use tabula-py to extract teacher salary information from PDFs directly into Pandas dataframes. We’ll also use some regex to clean up the results.

Oct 29, 2020 · 9 min · 1788 words · D. Michael Senter

Accessing Census Data via API

The Census Bureau makes an incredible amount of data available online. In this post, I will summarize how to get access to this data via Python by using the Census Bureau’s API. The Census Bureau makes a pretty useful guide available here - I recommend checking it out. ...

Aug 22, 2020 · 5 min · 999 words · D. Michael Senter