Loading the 2019 VAERS CSV Datasets into SQL Server

Loading the 2019 VAERS CSV Datasets into SQL Server

To quickly view the CSV files, you can always open them into a spreadsheet.

The VAERS data has been successfully loaded onto both SQL Server 2014 (Windows 7), and 2017 (Windows 10). Here, it is loaded onto SQL Server 2014 for backward compatibility.

SELECT @@SERVICENAME as Servicename
, @@VERSION as Version

Servicename Version
MSSQLSERVER Microsoft SQL Server 2014 - 12.0.2000.8 (X64) Feb 20 2014 20:04:26 Copyright (c) Microsoft Corporation Express Edition (64-bit) on Windows NT 6.1 (Build 7601: Service Pack 1)

Download:

Download the VAERS CSV datasets from:
https://vaers.hhs.gov/data/datasets.html

Chose the 2019 Zip file,  about 11 MB (in Februrary 2020)

The zip file unzips into three CSV files:
2019VAERSDATA.csv
2019VAERSSYMPTOMS.csv
2019VAERSVAX.csv

Create Sql Server Database:

On your SQL Server platform, if not already created, create a SQL Server database called:
VAERS_2019

Then of course, before running any SQL Server commands, first run the command:

use VAERS_2019;
go

 

Loading CSV Files Via GUI:

You can try to use the SQL Server GUI to load. Although there are issues loading two of the files.

Loading CSV Files Via SQL Scripts:

So, to do the load, SQL scripts were created at
http://convertcsv.com/csv-to-sql.htm
Which worked well.

Download the SQL scripts from here:  2019LoadCSVTables.zip
The zip file is about 12.6 mb

Which will contain the files:

2019VAERSDATA1.sql
2019VAERSDATA2.sql
2019VAERSSYMPTOMS1.sql
2019VAERSSYMPTOMS2.sql
2019VAERSSYMPTOMS3.sql
2019VAERSVAX1.sql
2019VAERSVAX2.sql

 

Using SQL Server, run the files one by one.
Create the table.
Load the data.

 

Primary Key:
Add a primary key to 2019VAERSDATA:

alter table
dbo.[2019VAERSDATA]
add constraint pk_2019VAERSDATA
primary key
( vaers_id ) ;

 

Quick Query:

Do a quick query to make sure the data loaded

use vaers_2019;
go

select
min (vaers_id) min_vaers_id
, max (vaers_id) max_vaers_id
, count(*) count
from dbo.[2019VAERSDATA];
go

countmax_vaers_idcount
47183856660759483

Duplicate Rows:

If you run some queries, you will notice that there are some duplicate rows in 2019VAERSSYMPTOMS, and 2019VAERSVAX.

Thus the difficulty of adding primary keys.

This will be a subject for another blog post.

Leave a Reply

Your email address will not be published. Required fields are marked *