Data Science vs Data Literacy

I have just returned from two and a half days at the Data Science Education Technology (DSET) Conference in Berkeley, California. It was truly amazing to have ~100 people in a room who have all spent time thinking about how to teach students to work with data, and who each offer a unique perspective on what, exactly, that means.

One of the topics that occupied a great deal of conversation throughout the conference was the distinction between “data science” and “data literacy.” According to Wikipedia, “Data science, also known as data-driven science, is an interdisciplinary field about scientific methods, processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured, similar to Knowledge Discovery in Databases (KDD).” I actually like that definition: it treats data as part of a scientific method of producing knowledge, and it describes data science as an interdisciplinary field. Furthermore, it is consistent with the idea that data science can be pursued at multiple levels, ranging from that of a novice user to that of a highly trained professional data analyst.

Data science is not currently taught in most schools. Statistics, however, has been part of education for decades and plays an important role in data science. It provides a set of mathematical tools and techniques that are critical to analyzing data, so data science must be grounded in a solid knowledge of statistical methods. Still, other aspects of data science, such as data management, documentation, and metadata, seem to fall beyond the realm of typical statistics. Also, as datasets become larger and less structured in the world of “big data,” new challenges arise in applying classical statistical methods, with phenomena like autocorrelation becoming increasingly likely.

At the Oceans of Data Institute, our focus has been to promote “data literacy,” which is not quite the same thing as data science, and which extends far beyond the boundaries of statistics. In our “Promoting Global Data Literacy” workshop, a panel of twelve experts was asked to help us define what a data-literate high school graduate looks like. They agreed that:

The data-literate individual understands, explains, and documents the utility and limitations of data by becoming a critical consumer of data, controlling his/her personal data trail, finding meaning in data, and taking action based on data. The data-literate individual can identify, collect, evaluate, analyze, interpret, present, and protect data.

I think that individuals focused on statistics will find plenty of familiar ground in this definition; those concerned with data science even more so. However, I don’t believe that either of these disciplines quite captures the entirety of what it means to be data-literate, which includes the critical thinking skills involved in becoming a critical consumer of data, and consciously curating the data trail that each of us is continuously producing, a trail that is growing in depth and breadth with each new digital device we incorporate into our lives. I believe that all of these aspects are important to being a responsible citizen in our society. To achieve data literacy, we must honor and build upon the considerable knowledge that has already been gained over the years about teaching data skills, while incorporating the teaching of new skills essential in the age of big data.


Randy Kochevar, Director
The EDC Oceans of Data Institute

View the DSET Program & Presenters.

See DSET Conference Tweets at #dsetonline.
Sign our Call for Action to Promote Data Literacy