Intern data scientists developing Covid-19 database
Data science interns from Africa's biggest data science school are using their skills to build a comprehensive database on the Covid-19 pandemic that will be available in the next week.
In a world where there is an abundance of data due to the virus, they believe that it is important for information to be centralised and easy to consume. Data scientists use science, maths and algorithms to centralise extensive data.
Twenty-four-year-old Aphiwe Rasisemula is one of these students from Explore Data Science Academy who are working on the publicly available database.
"With this database we hope to give valuable insights, so that the government and those working so hard on the ground to try reduce the spread of this virus can make informed decisions based on the current statistics. We hope the database helps us understand this pandemic and ensure a successful fight in saving lives," he said.
Although the government has its own database, the interns believe theirs can be complementary and is the first to be created by young data scientists under the watchful eye of experienced scientists.
The database will be released in two stages. The first stage will include daily testing numbers, case numbers by country and province, and government countermeasure data, with automated real-time updates.
The second stage will include patient-level data, meaning people will be able to see details such as patients' travel history, symptoms, age and gender. Hospital data will also be available, including the number of available ICU beds and patients on ventilators.
Rasisemula, who hails from Vosloorus in Ekurhuleni, said he joined the academy because he wants to be a part of the fourth industrial revolution.
"I joined Explore Data Science Academy because I realised the world is now in Fourth Industrial Revolution [mode]. Everything is run by computers and I had to be a part of it. Taking this journey towards becoming a data analyst was a transition from electrical engineering, which I was studying at the University of Johannesburg."
Co-founder of Explore Data Science Academy Shaun Dipnall said: "The aim is to centralise data coming from a host of available resources, all of which are useful in making beneficial analyses. These resources include data from GitHub, repositories such as the one from the University of Pretoria, the NICD and global data sources."
NICD stands for the National Institute for Communicable Diseases, while GitHub.com is a well-known platform for software development.