Loading the 2019 VAERS CSV Datasets into SQL Server
To quickly view the CSV files, you can always open them into a spreadsheet.
The VAERS data has been successfully loaded onto both SQL Server 2014 (Windows 7), and 2017 (Windows 10). Here, it is loaded onto SQL Server 2014 for backward compatibility.
SELECT @@SERVICENAME as Servicename , @@VERSION as Version Servicename Version MSSQLSERVER Microsoft SQL Server 2014 - 12.0.2000.8 (X64) Feb 20 2014 20:04:26 Copyright (c) Microsoft Corporation Express Edition (64-bit) on Windows NT 6.1 (Build 7601: Service Pack 1)
Download:
Download the VAERS CSV datasets from:
https://vaers.hhs.gov/data/datasets.html
Chose the 2019 Zip file, about 11 MB (in Februrary 2020)
The zip file unzips into three CSV files:
2019VAERSDATA.csv
2019VAERSSYMPTOMS.csv
2019VAERSVAX.csv
Create Sql Server Database:
On your SQL Server platform, if not already created, create a SQL Server database called:
VAERS_2019
Then of course, before running any SQL Server commands, first run the command:
use VAERS_2019;
go
Loading CSV Files Via GUI:
You can try to use the SQL Server GUI to load. Although there are issues loading two of the files.
Loading CSV Files Via SQL Scripts:
So, to do the load, SQL scripts were created at
http://convertcsv.com/csv-to-sql.htm
Which worked well.
Download the SQL scripts from here: 2019LoadCSVTables.zip
The zip file is about 12.6 mb
Which will contain the files:
2019VAERSDATA1.sql
2019VAERSDATA2.sql
2019VAERSSYMPTOMS1.sql
2019VAERSSYMPTOMS2.sql
2019VAERSSYMPTOMS3.sql
2019VAERSVAX1.sql
2019VAERSVAX2.sql
Using SQL Server, run the files one by one.
Create the table.
Load the data.
Primary Key:
Add a primary key to 2019VAERSDATA:
alter table dbo.[2019VAERSDATA] add constraint pk_2019VAERSDATA primary key ( vaers_id ) ;
Quick Query:
Do a quick query to make sure the data loaded
use vaers_2019; go select min (vaers_id) min_vaers_id , max (vaers_id) max_vaers_id , count(*) count from dbo.[2019VAERSDATA]; go
count | max_vaers_id | count |
---|---|---|
47183 | 856660 | 759483 |
Duplicate Rows:
If you run some queries, you will notice that there are some duplicate rows in 2019VAERSSYMPTOMS, and 2019VAERSVAX.
Thus the difficulty of adding primary keys.
This will be a subject for another blog post.