Errors in DOH data were honest mistakes, says data scientist



May 14 - The errors and inconsistencies in the Department of Health’s (DOH) data on coronavirus patients were mostly the result of honest mistakes by frontliners, a data scientist said Wednesday.

Stephanie Sy, chief of Thinking Machines Data Science, which is working with the DOH amid the pandemic, explained that the mistakes pointed out by the University of the Philippines Resilience Institute accounted for less than 1 percent of the total data.

“There is no such thing as a 100 percent correct data set in the real world. In this case, our objective is to provide timely and useful data sets for policymaking,” she said. “A 99 percent consistent and reliable data set is a useful data set for the policies and decisions that we have been making.”

The DOH has likewise clarified that the errors and inconsistencies were only a “nominal” part of the data.

Sy also gave assurances that her team is working toward a data set that is 99.9 percent reliable and consistent by digitizing the surveillance of coronavirus patients through the COVID KAYA application. At present, health workers must manually fill out case investigation forms, which are then encoded into the DOH’s system.

Sy explained that publicly shared data sets are not updated after 4 p.m., the time when the DOH announces updates on coronavirus cases and holds its daily press briefing.

“We do not update the previous day’s data because we believe in transparency and maintaining what is in the public record as was true of that day, but we do have it in the technical notes that the most reliable data is the data shared for that day,” she said.

Starting Wednesday, the publicly released data sets will indicate which particular data points were corrected or modified, Sy added.

“Our goal here is to create a data system that can account for these errors, correct these errors, and have the corrected data sets reflected so that policymakers can make good decisions,” she said.

Source: gmanetwork.com