Exploratory data analysis.
Anomalies in the fields NUMDAYS and AGE_YRS.
In some rows, NUMDAYS, converted to years, is larger than the person’s age in years (AGE_YRS).
In the VAERS documentation, NUMDAYS is defined as:
Number of days (ONSET_DATE – VAX_DATE)
In how many rows is NUMDAYS populated?
Select count (numdays ) Cnt_numdays , count(*) Count , 100.0 * count (numdays ) / count(*) as Pct_NUMDAYS_Not_Null from dbo.[vaersdata]
Cnt_numdays | Count | Pct_NUMDAYS_Not_Null |
---|---|---|
593,718 | 759,483 | 78.17 |
In how many rows is AGE_YRS populated?
Select count ( AGE_YRS) cnt_AGE_YRS , count(*) Count , 100.0 * count (AGE_YRS) / count(*) as Pct_AGE_YRS_Not_Null from dbo.[vaersdata]
Cnt_AGE_YRS | Count | Pct_AGE_YRS_Not_Null |
---|---|---|
623,450 | 759,483 | 82.09 |
In how many rows are both AGE_YRS and NUMDAYS populated?
select count(*) Count from dbo.[vaersdata] where AGE_YRS is not null and NUMDAYS is not null
Count:
546,062
Percent of VAERSDATA populated with both AGE_YRS and NUMDAYS NOT NULL
= 100 * 546,062 / 759,483
= 71.90%
Rounded, is: 72%
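The same percentage can also be computed in one pass rather than by hand. This is only a sketch against the same dbo.[vaersdata] table; the column aliases (Cnt_Both, Cnt_All, Pct_Both_Not_Null) are made up for illustration.

```sql
-- count ( case ... end ) counts only the rows where the CASE returns a non-null value,
-- i.e. rows where both fields are populated.
select
    count ( case when AGE_YRS is not null and NUMDAYS is not null then 1 end ) as Cnt_Both
  , count(*) as Cnt_All
  , 100.0 * count ( case when AGE_YRS is not null and NUMDAYS is not null then 1 end )
          / count(*) as Pct_Both_Not_Null
from dbo.[vaersdata]
```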
Interestingly, after being loaded into SQL Server, the datatype of both AGE_YRS and NUMDAYS is nvarchar, not a numeric type. So, for calculations, the data in these fields must be converted to a numeric type.
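Before relying on convert, it may be worth checking whether every populated value actually converts. A possible check, assuming SQL Server 2012 or later (where try_convert is available) and the same table; the aliases Bad_NUMDAYS and Bad_AGE_YRS are illustrative only.

```sql
-- try_convert returns NULL instead of raising an error when a value cannot be converted,
-- so a populated value whose try_convert is NULL is one that convert would fail on.
select
    sum ( case when NUMDAYS is not null and try_convert ( float, NUMDAYS ) is null
               then 1 else 0 end ) as Bad_NUMDAYS
  , sum ( case when AGE_YRS is not null and try_convert ( float, AGE_YRS ) is null
               then 1 else 0 end ) as Bad_AGE_YRS
from dbo.[vaersdata]
```

If both counts come back as zero, the plain convert calls used in the queries here should run without conversion errors.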
Documentation for AGE_YRS:
AGE_YRS Num(xxx.xx) Box 4: Age in Years
– Does this mean the age when they first got a vaccine, perhaps years earlier?
– Or the age when the event was reported?
—–
Eyeballing the Data:
select VAERS_ID , convert ( float, age_yrs ) AGE_YRS , vax_date , onset_date , numdays , ( convert ( float, NUMDAYS ) / 365.0 ) as NUM_YRS , ( convert ( float, NUMDAYS ) / 365.0 ) - convert ( float, age_yrs ) as YRS_Diff , concat ( symptom_text , symptom_text2 ) as Symptom_text from dbo.[VAERSDATA] where NUMDAYS IS NOT NULL and age_yrs is NOT NULL AND ( convert ( float, NUMDAYS ) / 365.0 ) /* NUM_YRS */ - convert ( float, age_yrs ) /* AGE_YRS */ > .2 /* YRS_Diff cutoff */ order by 7 desc
In this query, NUMDAYS (ONSET_DATE – VAX_DATE) is translated into NUM_YRS by dividing by 365.
YRS_Diff is NUM_YRS minus AGE_YRS; the query keeps only rows where YRS_Diff is greater than 0.2 years.
Results:
VAERS_ID | AGE_YRS | vax_date | onset_date | numdays | NUM_YRS | YRS_Diff |
---|---|---|---|---|---|---|
784039 | 0.67 | October 3, 1935 | October 5, 2018 | 30,318 | 83.06 | 82.39 |
576345 | 0.28 | April 27, 1946 | April 27, 2015 | 25,202 | 69.05 | 68.77 |
507426 | 0.47 | October 16, 1947 | October 16, 2013 | 24,107 | 66.05 | 65.58 |
794101 | 0.08 | December 28, 1954 | December 29, 2018 | 23,377 | 64.05 | 63.97 |
299070 | 0.2 | November 16, 1945 | November 29, 2007 | 22,658 | 62.08 | 61.88 |
428021 | 0.3 | July 21, 1949 | July 23, 2011 | 22,647 | 62.05 | 61.75 |
534164 | 0.32 | June 2, 1954 | June 5, 2014 | 21,918 | 60.05 | 59.73 |
727856 | 0.92 | October 16, 1957 | November 10, 2017 | 21,940 | 60.11 | 59.19 |
361670 | 0.41 | October 19, 1953 | October 19, 2009 | 20,454 | 56.04 | 55.63 |
486526 | 0.5 | September 3, 1957 | September 3, 2012 | 20,089 | 55.04 | 54.54 |
668153 | 0.33 | November 7, 1963 | November 7, 2016 | 19,359 | 53.04 | 52.71 |
809506 | 0 | October 5, 1966 | April 10, 2019 | 19,180 | 52.55 | 52.55 |
400196 | 0.41 | September 23, 1961 | September 23, 2010 | 17,897 | 49.03 | 48.62 |
… | ||||||
94174 | 1 | August 8, 1995 | October 21, 1996 | 440 | 1.21 | 0.21 |
453451 | 0.51 | August 27, 2010 | May 15, 2011 | 261 | 0.72 | 0.21 |
238767 | 1.05 | August 13, 2003 | November 13, 2004 | 458 | 1.25 | 0.20 |
202686 | 0.8 | April 23, 2002 | April 24, 2003 | 366 | 1.00 | 0.20 |
106472 | 1 | July 10, 1996 | September 22, 1997 | 439 | 1.20 | 0.20 |
643785 | 1.2 | April 17, 2013 | September 11, 2014 | 512 | 1.40 | 0.20 |
56590 | 0.3 | March 12, 1993 | September 11, 1993 | 183 | 0.50 | 0.20 |
324175 | 1.3 | March 6, 2007 | September 4, 2008 | 548 | 1.50 | 0.20 |
385451 | 0.01 | September 1, 2008 | November 17, 2008 | 77 | 0.21 | 0.20 |
570800 | 0.83 | June 1, 2011 | June 11, 2012 | 376 | 1.03 | 0.20 |
6425 rows |
—-
Using different cutoffs in the query, where YRS_Diff stands for ( convert ( float, NUMDAYS ) / 365.0 ) - convert ( float, age_yrs ):
YRS_Diff > .1
Count total rows: 6688
….
YRS_Diff > .2
Count total rows: 6425
….
YRS_Diff > .5
Count total rows: 5780
….
YRS_Diff > 1.0
Count total rows: 4884
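Rather than rerunning the query once per cutoff, the counts can be gathered in a single pass. A sketch using a common table expression; the names Diffs and Cnt_Over_... are invented for this example, and the cutoff is applied to YRS_Diff (NUM_YRS minus AGE_YRS), the same comparison used in the WHERE clause of the query above.

```sql
-- Compute YRS_Diff once, then count the rows above each cutoff.
with Diffs as (
    select ( convert ( float, NUMDAYS ) / 365.0 ) - convert ( float, AGE_YRS ) as YRS_Diff
    from dbo.[VAERSDATA]
    where NUMDAYS is not null
      and AGE_YRS is not null
)
select
    sum ( case when YRS_Diff > 0.1 then 1 else 0 end ) as Cnt_Over_0_1
  , sum ( case when YRS_Diff > 0.2 then 1 else 0 end ) as Cnt_Over_0_2
  , sum ( case when YRS_Diff > 0.5 then 1 else 0 end ) as Cnt_Over_0_5
  , sum ( case when YRS_Diff > 1.0 then 1 else 0 end ) as Cnt_Over_1_0
from Diffs
```

On the same data, the four columns should line up with the four row counts listed above.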
—–
Some Examples of NUM_YRS Greater Than AGE_YRS:
VAERS_ID: 784039
VAERS_ID | AGE_YRS | vax_date | onset_date | numdays | NUM_YRS | YRS_Diff |
---|---|---|---|---|---|---|
784039 | 0.67 | October 3, 1935 | October 5, 2018 | 30,318 | 83.06 | 82.39 |
VAERS_ID 784039 has an AGE_YRS of 0.67 (about 8 months old).
But it has a NUMDAYS of 30,318, which works out to 83.06 years between vaccination and onset.
Looking at some other fields for VAERS_ID 784039
Symptom_Text: Itching and rash.
Other Meds:
Ativan; Carvedilol; Coumadin; Hydralazine; oxybutynin; Protonix; Requip; Potassium Chloride; Preservision supplement; Lasix
History:
CHF low back pain; hyperlipidemia; diabetes; restless legs; anxiety
(CHF = Congestive Heart Failure)
Comment:
Lots of medications and lots of health problems, of the kind usually seen in an older person, not an 8-month-old infant.
This looks like a data entry error.
ONSET_DATE is October 5, 2018.
VAX_DATE should probably be: October 3, 2018
—
VAERS_ID: 576345
VAERS_ID | AGE_YRS | vax_date | onset_date | numdays | NUM_YRS | YRS_Diff |
---|---|---|---|---|---|---|
576345 | 0.28 | April 27, 1946 | April 27, 2015 | 25,202 | 69.05 | 68.77 |
Symptom_text:
After having a regular checkup, my doctor recommended at my age I take the preventative pneumonia shot. I saw no reason not to do so however, following the PCV13 Vaccine, the injection site was very sore and within 3 hours I had a high fever, chills, chest/stomach cramps. Later my chest became congested and I have been in bed for 4 days. I did notify my doctor and he indicated it was unusual to have a negative side effect. I have been consuming liquids and resting.
OTHER_MEDS:
Simvastatin; Triamt/hctz; Propranolol
Comment:
Symptom_Text indicates that the onset of the symptoms occurred:
“within 3 hours”
So, both VAX_DATE and ONSET_DATE should be the same:
2015-04-27
Judging by the medications, this female patient is an older adult:
Simvastatin is a lipid-lowering medication, also used to decrease the risk of heart problems.
Triamterene-hydrochlorothiazide is used to treat high blood pressure.
Propranolol is used to treat high blood pressure.
It could be that the date of birth was erroneously entered into VAX_DATE. Taking 1946-04-27 as the date of birth, the patient would be 69 years old on the onset date.
AGE_YRS has been entered as .28, which is clearly incorrect.
This entry was made by the patient themselves, so it could be that the website was confusing, as so many websites are.
———-
ONSET_DATE Starts A Few Years Later:
In some cases, the patient was young at the time of vaccination, but the ONSET_DATE is a few years later: either the date they saw a doctor or the date the symptoms began.
VAERS_ID: 106306
VAERS_ID | AGE_YRS | vax_date | onset_date | numdays | NUM_YRS | YRS_Diff |
---|---|---|---|---|---|---|
103758 | 1.3 | September 22, 1995 | August 17, 1997 | 695 | 1.90 | 0.60 |
Symptom_text:
17AUG97 pt devel herpes zoster lesions;21AUG97 pt was dx w/shingles;It was reported that no cult were available for analysis;
Comment:
It seems that the patient was 1.3 years old when the vaccination was received, and the Onset_date was 1.9 years later.
—-
VAERS_ID: 203430
VAERS_ID | AGE_YRS | vax_date | onset_date | numdays | NUM_YRS | YRS_Diff |
---|---|---|---|---|---|---|
203430 | 0.01 | December 17, 1996 | May 1, 2001 | 1,596 | 4.37 | 4.36 |
Symptom_text:
This report describes the occurrence of mercury poisoning in a 4 year old male who was vaccinated with hep B vaccine recombinant (Engerix B) for prophylaxis. This report was received as part of litigation proceedings and has been verified by a physician or other healthcare professional. The subject’s medical history, concurrent conditions, concurrent medications, and concomitant vaccines were not reported. The subject reportedly received injections of Engerix B on 11/19/96 and 12/17/96.The subject’s att
Comment:
AGE_YRS is listed as 0.01 years, a newborn.
But in the Symptom_text field, the patient is described as a 4 year old male.
It seems that the patient was vaccinated as a newborn, but diagnosed with mercury poisoning some years later.
—————————————-
Data Entry Errors:
VAERS_ID | AGE_YRS | vax_date | onset_date | numdays | NUM_YRS | YRS_Diff |
---|---|---|---|---|---|---|
500644 | 0.06 | August 8, 2012 | August 19, 2013 | 376 | 1.03 | 0.97 |
Symptom_text:
Severe measles type rash 11 days post vaccine – fever, fussy and painful.
Comment:
Wrong year on Vax_Date, or Onset_Date
11 days after 2012-08-08, would be 2012-08-19, not in 2013, as listed.
Data entry error.
———
More Data Entry Errors:
VAX_DATE and ONSET_DATE are an exact multiple of 365 days apart:
select VAERS_ID , convert ( float, age_yrs ) AGE_YRS , vax_date , onset_date , numdays , ( convert ( float, NUMDAYS ) / 365.0 ) as NUM_YRS , ( convert ( float, NUMDAYS ) / 365.0 ) - convert ( float, age_yrs ) as YRS_Diff , concat ( symptom_text , symptom_text2 ) as Symptom_text from dbo.[VAERSDATA] where NUMDAYS IS NOT NULL and age_yrs is NOT NULL and ( convert ( float, NUMDAYS ) / 365.0 ) - convert ( float, age_yrs ) > 0 /* NUMDAYS more than patient's age */ and convert ( int, NUMDAYS ) % 365 = 0 /* NUMDAYS is a whole multiple of 365 days */ order by 6, 1
VAERS_ID | AGE_YRS | vax_date | onset_date | numdays | NUM_YRS | YRS_Diff |
---|---|---|---|---|---|---|
29291 | 0.30 | February 19, 1990 | February 19, 1991 | 365 | 1 | 0.70 |
52265 | 0.80 | March 16, 1992 | March 16, 1993 | 365 | 1 | 0.20 |
114216 | 0.60 | June 26, 1997 | June 26, 1998 | 365 | 1 | 0.40 |
196819 | 0.30 | January 23, 2002 | January 23, 2003 | 365 | 1 | 0.70 |
204919 | 0.30 | November 1, 2002 | November 1, 2003 | 365 | 1 | 0.70 |
213538 | 0.08 | September 17, 2002 | September 17, 2003 | 365 | 1 | 0.92 |
259761 | 0.33 | May 5, 2005 | May 5, 2006 | 365 | 1 | 0.67 |
262407 | 0.32 | May 2, 2005 | May 2, 2006 | 365 | 1 | 0.68 |
454649 | 0.10 | February 14, 2011 | February 14, 2012 | 365 | 1 | 0.90 |
495894 | 0.34 | July 3, 2012 | July 3, 2013 | 365 | 1 | 0.66 |
512886 | 0.01 | February 11, 2011 | February 11, 2012 | 365 | 1 | 0.99 |
522752 | 0.33 | February 14, 2013 | February 14, 2014 | 365 | 1 | 0.67 |
536603 | 0.36 | March 25, 2013 | March 25, 2014 | 365 | 1 | 0.64 |
558048 | 0.50 | December 11, 2013 | December 11, 2014 | 365 | 1 | 0.50 |
614507 | 0.40 | August 22, 2014 | August 22, 2015 | 365 | 1 | 0.60 |
693416 | 0.33 | March 31, 2016 | March 31, 2017 | 365 | 1 | 0.67 |
731552 | 0.33 | June 23, 2016 | June 23, 2017 | 365 | 1 | 0.67 |
766331 | 0.17 | January 8, 2017 | January 8, 2018 | 365 | 1 | 0.83 |
195200 | 0.60 | April 27, 2000 | April 27, 2002 | 730 | 2 | 1.40 |
293877 | 0.20 | July 8, 2005 | July 8, 2007 | 730 | 2 | 1.80 |
333889 | 1.00 | January 18, 2006 | January 18, 2008 | 730 | 2 | 1.00 |
802963 | 0.25 | November 25, 2013 | November 25, 2015 | 730 | 2 | 1.75 |
347850 | 1.00 | June 5, 2005 | June 4, 2008 | 1,095 | 3 | 2.00 |
453004 | 0.36 | March 10, 2009 | March 9, 2012 | 1,095 | 3 | 2.64 |
248601 | 1.04 | November 29, 2001 | November 28, 2005 | 1,460 | 4 | 2.96 |
732241 | 1.00 | November 29, 2012 | November 28, 2017 | 1,825 | 5 | 4.00 |
178283 | 4.00 | October 26, 1995 | October 24, 2001 | 2,190 | 6 | 2.00 |
256869 | 3.00 | November 4, 1999 | November 2, 2005 | 2,190 | 6 | 3.00 |
281275 | 1.00 | July 25, 2000 | July 24, 2006 | 2,190 | 6 | 5.00 |
324891 | 1.00 | September 12, 2002 | September 10, 2008 | 2,190 | 6 | 5.00 |
238742 | 1.02 | November 3, 1995 | October 31, 2004 | 3,285 | 9 | 7.98 |
423772 | 1.01 | September 4, 1998 | September 1, 2010 | 4,380 | 12 | 10.99 |
388723 | 1.26 | November 16, 1995 | November 12, 2009 | 5,110 | 14 | 12.74 |
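A multiple-of-365 test is sensitive to leap days. It picks up pairs that are a day or more off the same calendar date (June 5, 2005 to June 4, 2008 is 1,095 days), and it misses same-date pairs spanning many years (VAERS_ID 507426 above, October 16, 1947 to October 16, 2013, is 24,107 days, which is not a multiple of 365). A possible alternative is to compare the calendar month and day directly. This sketch assumes SQL Server 2012 or later and that vax_date and onset_date hold values try_convert can read as dates; the names V, Vax_Dt, Onset_Dt and Years_Apart are made up for the example.

```sql
-- Find rows where onset falls on the same month and day as vaccination, in a later year.
select D.VAERS_ID
     , D.vax_date
     , D.onset_date
     , D.numdays
     , datediff ( year, V.Vax_Dt, V.Onset_Dt ) as Years_Apart
from dbo.[VAERSDATA] as D
cross apply ( select try_convert ( date, D.vax_date )   as Vax_Dt
                   , try_convert ( date, D.onset_date ) as Onset_Dt ) as V
where V.Vax_Dt is not null
  and V.Onset_Dt is not null
  and month ( V.Vax_Dt ) = month ( V.Onset_Dt )
  and day ( V.Vax_Dt ) = day ( V.Onset_Dt )
  and year ( V.Onset_Dt ) > year ( V.Vax_Dt )
order by Years_Apart, D.VAERS_ID
```

Not every row this returns is necessarily an error, so the Symptom_text would still need to be read, as in the examples that follow.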
—-
VAERS_ID: 114216
Symptom_text:
Pt recv vax on 6/26/97; on the same day pt exp fever (104) after Tylenol &Advil tx. Pt went to Dr who advised no more pertussis vax. Pt was irritable.
Comment:
VAX_DATE: 1997-06-26
ONSET_DATE: 1998-06-26
But SYMPTOM_TEXT describes the reaction as occurring the same day.
Data entry error.
—-
VAERS_ID: 204919
Symptom_Text:
Thirty to 45 min. after heart rate and resp. rate started to drop. She was on a apnea monitor. She didn’t act right and the next 24 hrs feeding, crying a lot Nov 3 she was taken back to Dr. jerking ? and legs and dismissed. Seizure disorder.
Comment:
VAX_DATE: 2002-11-01
ONSET_DATE: 2003-11-01
But SYMPTOM_TEXT describes the reaction as starting the same day, 30 to 45 minutes after the vaccination. ONSET_DATE should be the same as VAX_DATE, not the same day of the following year.
——
VAERS_ID: 766331
Symptom_Text:
Within 5 minutes of being vaccinated patient let out high pitch screaming, followed by high pitch crying, then lips turned purple and gasping for breathe.
Comment:
VAX_DATE: 2017-01-08
ONSET_DATE: 2018-01-08
Symptom_Text says “Within 5 minutes”, so ONSET_DATE should be the same day as VAX_DATE.
————–
Incorrect AGE_YRS:
VAERS_ID | AGE_YRS | vax_date | onset_date | numdays | NUM_YRS | YRS_Diff |
---|---|---|---|---|---|---|
747613 | 0 | May 7, 2007 | May 7, 2018 | 4,018 | 11.01 | 11.01 |
Symptom_text:
Late entry 0953 11 year old female received Tdap and MCV-4 while getting in position to administer last shot (HPV-9) she had a syncope episode with Skin pale, clammy, jerking activity, unresponsive and sliding to the floor. Assisted patient to floor safely and elevated her feet. Patient regained consciousness immediately. Alert and oriented to name and place. BP 110/62 HR 88 respirations 22. No rashes noted, denied itching, no SOB Complained of nausea and headache. Lungs clear all 4 lobes. Patient hands s
Comment:
AGE_YRS listed as 0
But Symptom_text lists age as 11
vax_date: 2007-05-07
onset_date: 2018-05-07
This looks like a data entry error.
From Symptom_text, Vax_Date should be the same as the Onset_date.
It looks like the birth date was entered as Vax_Date.
AGE_YRS is listed as 0, when it should be the actual age of 11, or NULL.
—–
Impact Analysis:
Percent of VAERSDATA populated with both AGE_YRS and NUMDAYS NOT NULL
(see queries above)
= 100 * 546,062/ 759,483
= 71.90%
Rounded, is: 72%
If we don’t include the 6,425 rows that come from this query (see above), where NUM_YRS exceeds AGE_YRS by more than .2:
( convert ( float, NUMDAYS ) / 365.0 ) - convert ( float, age_yrs ) > .2
546,062 – 6,425
= 539,637
= 100 * 539,637 / 759,483
= 71.05%
Rounded, is: 71%
That is a difference of a little under one percentage point.
These rows will be eliminated from some queries involving NUMDAYS and/or AGE_YRS.
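One way the exclusion might look in practice is to invert the cutoff in the WHERE clause. This is a sketch only, reusing the 0.2 cutoff from above; the alias Cnt_Kept is illustrative.

```sql
-- Keep rows where both fields are populated and NUM_YRS does not exceed AGE_YRS by more than 0.2 years.
select count(*) as Cnt_Kept
from dbo.[VAERSDATA]
where NUMDAYS is not null
  and AGE_YRS is not null
  and ( convert ( float, NUMDAYS ) / 365.0 ) - convert ( float, AGE_YRS ) <= 0.2
```

On the same data, this count should match the 539,637 worked out above, since it is the complement of the flagged rows among those with both fields populated.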