Field Analysis of NUMDAYS

Exploratory data analysis.
Anomalies of the field: VAERSDATA.NUMDAYS

Most of the fields in the VAERS database have pretty clean data. For instance, many fields will have either ‘Y’, or NULL in them.

You won’t find odd data such as ‘N’, numbers or other letters. The
data looks quite reliable.

However, analysis of the field NUMDAYS reveals some oddities.

In the VAERS documentation, NUMDAYS is defined as:
Number of days (ONSET_DATE – VAX_DATE)

Does this hold true, and can it be relied upon?
In how many rows is NUMDAYS populated?

 

use vaers_2019;
go

Select  count (numdays ) cnt_numdays
, count(*) Count
, 100.0 * count (numdays ) / count(*)  as Pct_NUMDAYS_Not_Null
from dbo.[vaersdata];
go

Cnt_numdaysCountPct_NUMDAYS_Not_Null
593,718759,48378.17

 

 
Approximately 78% of the rows in VAERSDATA have the column NUMDAYS, populated.

 
Interestingly, after loading the VAERSDATA MDB file into SQL SERVER, the NUMDAYS column is loaded as type Nvarchar.

Are there any rows where NUMDAYS differs from the difference in days between VAX_DATE and ONSET_DATE?

 


select 
DATEDIFF(DAY, VAX_DATE, ONSET_DATE	)  DateDiff
, CONVERT(INT, numdays)   NUMDAYS
from dbo.[vaersdata]
where numdays is not null
and  DATEDIFF(DAY, VAX_DATE, ONSET_DATE	)
<> CONVERT(INT, numdays)   

Zero rows. Reliable.

However, looking at the fields, VAX_DATE and ONSET_DATE, there are a number of rows where the ONSET_DATE precedes the VAX_DATE. This does not make sense. VAX_DATE needs to precedes the ONSET_DATE.


Select count(*)  Count
from dbo.[vaersdata]
where VAX_DATE > ONSET_DATE

Count:
2,530

Does this affect NUMDAYS?


Select NUMDAYS
, count(*)  Count
from dbo.[vaersdata]
where ONSET_DATE <  VAX_DATE 
group by NUMDAYS;

NUMDAYSCount
NULL2530

 
If the ONSET_DATE precedes the VAX_DATE, then NUMDAYS remains NULL.
...

 

When ONSET_DATE < VAX_DATE, what values are found?


select 
  DATEDIFF(DAY, VAX_DATE, ONSET_DATE	)  DateDiff
, CONVERT(INT, numdays)   NUMDAYS
, count(*) Count
from dbo.[vaersdata]
where numdays is null
and VAX_DATE is not null
and ONSET_DATE is not null	
Group by DATEDIFF(DAY, VAX_DATE, ONSET_DATE	)  
, CONVERT(INT, numdays) 
order by 1 desc

 

Row NumberDateDiffNUMDAYSCount
1-1NULL135
2-2NULL123
3-3NULL105
4-4NULL98
5-5NULL98
6-6NULL86
7-7NULL97
8-8NULL94
9-9NULL101
10-10NULL99
11-11NULL80
12-12NULL109
13-13NULL77
14-14NULL90
15-15NULL92
   
95-815NULL1
96-1,271NULL1
97-1,511NULL1
98-1,854NULL1
99-2,554NULL1
100-2,701NULL1
101-3,287NULL1
102-3,652NULL3
103-23,010NULL1
104-35,667NULL1

104 rows.

The majority of the odd results are within 30 days.

At the same time, there are some very odd results. The maximum, 35667 days would be over 97 years.

Some Excerpts Where ONSET_DATE < VAX_DATE


select VAERS_ID
,  VAX_DATE
, ONSET_DATE
, CONCAT (symptom_text, symptom_text2) symptom_text
from dbo.[vaersdata]
where numdays is null
and DATEDIFF(DAY, VAX_DATE, ONSET_DATE	)  =  -2
order by 1;

 

VAERS_IDVAX_DATEONSET_DATE
7913832018-12-032018-12-01

Symptom_text:
This case was reported by a consumer via call center representative and described the occurrence of dizziness in a 50-year-old female patient who received Shingles vaccine for prophylaxis. On 3rd December 2018, the patient received Shingles vaccine. In December 2018, less than a week after receiving Shingles vaccine, the patient experienced dizziness. On an unknown date, the outcome of the dizziness was unknown. It was unknown if the reporter considered the dizziness to be related to Shingles vaccine. Add

Comment:
Looks like ONSET_DATE should be entered as: 2018-12-10, one week after the VAX_DATE.

VAERS_IDVAX_DATEONSET_DATE
8337372019-09-032019-09-01

 
Symptom_text:
The edema increased until the elbow and on 05 Sep2019 [the arm of the patient) was swollen until the elbow with spots, very warm and painful at touch; The edema increased until the elbow and on 05 Sep2019 [the arm of the patient) was swollen until the elbow with spots, very warm and painful at touch; The edema increased until the elbow and on 05 Sep2019 [the arm of the patient) was swollen until the elbow with spots, very warm and painful at touch; The edema increased until the elbow and on 05 Sep2019 [th

Comment:
Onset date should be: 05 Sep2019, as it says in the field SYMPTOM_TEXT. Not 2019-09-01
Onset_Date should be two days higher, not two days less.

VAERS_IDVAX_DATEONSET_DATE
8351362019-08-032019-08-01

 
Symptom_Text:
Diarrhoea; Pyrexia; This case was reported by a consumer via regulatory authority and described the occurrence of diarrhoea in a 2-year-old female patient who received Rabies NVS (Rabipur) (batch number ARBA341A, expiry date unknown) for prophylaxis. Co-suspect products included TYPHOID VACCINE (TYPHIM VI) (batch number P1E191V, expiry date unknown) for prophylaxis. On 3rd August 2019, the patient received Rabipur (parenteral) and TYPHIM VI (parenteral). On 5th August 2019, 2 days after receiving Rabi

Comment:
Vaccination date: 2019-08-03
Onset date should be: 5th August 2019, as specified in the text.
Not 2019-08-01
Onset_Date should be two days higher, not two days less.

….

These all look like data entry errors.

—-

Date anomalies in thousands of days:


select VAERS_ID
,  VAX_DATE
, ONSET_DATE
, Age_yrs
, DATEDIFF(DAY, VAX_DATE, ONSET_DATE	) as theDateDiff
, CONCAT (symptom_text, symptom_text2) Symptom_text
from dbo.[vaersdata]
where numdays is null
and DATEDIFF(DAY, VAX_DATE, ONSET_DATE	)  < -10000
order by 1

VAERS_IDVAX_DATEONSET_DATEAge_yrstheDateDiff
8167652019-01-081921-05-1531-35,667

 
Symptom_text:
Arthralgia

Comment:
Entered as 1921-05-15.
Probably should have been entered as: 2019-05-15

....

VAERS_IDVAX_DATEONSET_DATEAge_yrstheDateDiff
8359102019-09-221956-09-2263-23,010

Symptom_text:
Gave HD dose to 63 year old patient inadvertedly

Comment:
1956-09-22 to 2019-09-22 is exactly 63 years, the patient's age.
Some kind of data entry error.

---

Does It Affect The Results?


Select  count(*) NUMDAYS_Count
from dbo.[vaersdata]
where numdays is not null

NUMDAYS_Count:
593,718

...

Select  count(*) Vax_Onset_Date_Count
from dbo.[vaersdata]
where VAX_DATE is not null
and  ONSET_DATE  is not null

Vax_Onset_Date_Count:
596,248

...

Select  count(*) Count
from dbo.[vaersdata]

Count:
759,483

...

NUMDAYS percent populated:
100 * 593718/759483 = 78.17%

VAX_DATE and ONSET_DATE populated:
100 * 596248/759483 = 78.5%

Not even a 1% difference

 

Leave a Reply

Your email address will not be published. Required fields are marked *