Some Basic SQL Joins
A non-technical friend recently asked me for help with a merge problem. They had two separate data pulls of electronic medical records based on specific study parameters. The set of people in the database who fit the study parameters changed in between the data pulls, for example by having people age into our out of a study, or by having new diagnoses added to their records that cause them to either be newly included or excluded. Let’s call the older data set A and the newer data set B. The goal was to get all those entries from B that don’t also show up in A. The data sets were pulled by a staff data scientist at that company who, despite their title, said they couldn’t figure out how to remove those entries from B that were already in A. Barring any special circumstances, this is a fairly standard problem so let’s look at a couple of tools we could use to solve it. ...