The South Big Data Hub brings missing voices to the conversation on data science education


Participants of the Negotiating the Digital and Data Divide Workshop, in front of the wall of challenges and visions used to collect ideas on the future of data science education.

This month, participants from universities across the nation, community colleges, tribal colleges, minority-serving institutions, nonprofits, and industry joined forces with the South Big Data Hub and Georgia Tech to confront the challenges of building data science capacity through traditional and alternative educational practices. Organized by Dr. Renata Rawlings-Goss, a co-executive director of the South Big Data Hub, the two-day workshop, sponsored by multiple directorates within the National Science Foundation, brought together a diverse mix of participants to navigate the complex issues of reforming data science education to prepare for the data-driven workforce of the future.

“An entirely new type of workforce is needed for the 21st century. One that will require data-enabled talent for jobs across industries and government, as well as for future scientific discovery,” said Rawlings-Goss. “That is why we are partnering with sponsors and people spanning many disciplines and roles to make sure that the discussion of data science remains as broad as it needs to be. To achieve the talent pool needed for continued U.S. growth and competitiveness in this new data economy, we must break open and structure education with access for all. Programs must include minorities, reach low-income future students, and pair different institution types like two-year teaching institutions with universities.”

Rawlings-Goss also heads the Big Data Hub’s education working group, a nationwide network that tackles tough problems like these, and she wants everyone to join the conversation. “Data science is something that can be brought into nearly every discipline; the idea that data science resides in only a few traditional math and computer science disciplines is part of the challenge we are trying to overcome in education,” she said.

National Science Foundation Program Manager Stephanie August, who attended the event, said, “NSF is involved because we want to get at the missing voices. We need these participants who are essential but mostly absent and unheard in the movement to develop data science programs.”

Many research universities are developing comprehensive data science programs at the undergraduate and graduate levels. As the movement to develop data science programs grows, a gap is forming that separates research institutions from primarily undergraduate-focused institutions, including community colleges and several minority-serving institutions.

Specific issues discussed included access to data, critical thinking, designing curriculum and assessment, data literacy, diversity, ethics, resources and staffing, building collaborations, and the pipeline to higher education from K-12. Participants also addressed training data science practitioners, and translational data science, or the application of data science principles, techniques, and technologies to scientific problems impacting human or societal welfare.

Recent education-enabling projects were showcased at the event. Aleksandr Blekh of Georgia Tech’s College of Engineering introduced participants to his work with JupyterHub, building executable textbooks in support of data science education. The South Hub recently partnered with U.C. Berkeley and completed an installation and demo that included 60 faculty members and deans from across the country interested in using the tool to expand their data science capacity.

The workshop agenda moved from awareness of existing programs, practices, and challenges, through specific topics and stakeholders, to the actions needed to create the vision, curricula, programs, and opportunities of the future. It included discussion of what is important for the future of technology and society in the United States, and of improving access to essential and economically viable jobs, especially for minorities and low-income students and workers.

Tasha Inniss, director of education and industry outreach at the nonprofit professional society INFORMS, said, “We have all gone to conferences that attempt to discuss these issues and policy without having the full representation of the necessary stakeholders. This workshop, fortunately, has been different, because it includes participation of faculty from minority-serving institutions and community colleges as well as representatives from industry and nonprofit organizations. Since INFORMS has members who are analytics and operations research professionals and students, I am particularly interested because it fits nicely with our analytics education goals.”

Mary Rudis, a mathematics instructor from Bates College, agreed. “This is the first event I have been to that is constructed from the ground up with the right mix, capable of producing a real outcome. The blend of participants in terms of diversity, institutional types, roles, and expertise reflects the very nature of data science. It is multidisciplinary, transdisciplinary, with complex issues that require input from many points of view.”

Rawlings-Goss, Inniss, Rudis, and more than a dozen other authors will continue to work together after the event to develop a report including plans and schedules to convert their vision to reality. Findings will be presented at the National Academies of Science in Washington, D.C., in early December.

Amy Langville, Professor of Mathematics at the College of Charleston, joins the conversation on keeping data science broad at the Negotiating the Digital and Data Divide Workshop.

This workshop, “Negotiating the Digital and Data Divide,” is part of the “Keeping Data Science Broad” series created by Rawlings-Goss. Other activities include webinars and presentations to garner community input into pathways for keeping data science as a discipline broadly inclusive. The growing community seeks input from data science programs in any region across the nation, either traditional or alternative, and from a range of institution types. Two webinars leading up to the workshop explored the future of data science education and workforce at institutions of higher learning that are primarily teaching-focused. A webinar after this workshop will be announced to report on its outcomes and next steps.

The Keeping Data Science Broad series is co-sponsored by the National Science Foundation’s Directorates of CISE, MPS, EHR, and SBE, with participation from the National Academies of Science. It is also sponsored by the South Big Data Innovation Hub and Georgia Tech’s Institute for Data Engineering and Science.

Visit this link to share your thoughts on the future of data science education with the South Big Data Hub.


Previous events in the Keeping Data Science Broad series include:

  • “Data Science Education in Traditional Contexts,” recorded August 31, 2017, highlighted universities, teaching institutions, community colleges, and minority-serving institutions that have implemented undergraduate data science education programs, presented as case studies for workshop participants to consider and compare to their own contexts.
  • “Alternative Avenues for Development of Data Science Education Capacity,” recorded September 22, 2017, explored efforts that build data science education capacity outside the context of traditional curricular program development. Examples include integration of data science into courses and curricula outside of the traditional computer science, math, or statistics context (e.g., arts and humanities), the expansion of capacity by integrating third-party or shared resources (e.g., MOOCs and open source educational resources) into curricula, and additional educational options outside of traditional courses (e.g., faculty training, “Data Science for Social Good” programs, and bootcamps).

View recordings of past events on the South Hub website. Engage in upcoming 2018 South Big Data Hub activities. Join the Education Working Group or participate in a South Big Data Hub Data Carpentry Event. Increase your impact by attending a “train the trainer” session, or learn about the Hub’s recently completed installation and demo of the JupyterHub executable textbooks in support of data science education. Contact Renata Rawlings-Goss of the South Big Data Hub at




Participants at the Negotiating the Digital and Data Divide Workshop cluster ideas into topics. Kari Jordan, Director of Assessment at Data Carpentry; Nicholas Horton, Professor of Statistics at Amherst College; (seated) Melvin Greer, Chief Data Scientist at Intel Corp.


Small writing teams at the Negotiating the Digital and Data Divide Workshop capture and translate ideas into recommendations. Melvin Greer, Chief Data Scientist of Intel Corp.; Patricia Ordóñez of the University of Puerto Rico – Rio Piedras.


Voting on top priority “asks,” representing concrete next steps to achieving the bright future envisioned for data science education at the Negotiating the Digital and Data Divide Workshop. (right to left) Patricia Ordóñez of the University of Puerto Rico – Rio Piedras; Jessis Bemley, professor at Bowie State University; Catherine Cramer, senior program developer at New York Hall of Science; Karl Schmitt, director of data sciences at Valparaiso University.


Participants give presentations and seek peer feedback at the Negotiating the Digital and Data Divide Workshop. (from left) Velma Latson, lecturer at Bowie State University; Tasha Inniss, director of education and industry outreach at INFORMS.


Renata Rawlings-Goss, co-executive director of the South Big Data Hub, and director of industry engagement for the Institute for Data Engineering and Science at Georgia Tech. Rawlings-Goss is the creator of the Keeping Data Science Broad Series; she organized and ran the Negotiating the Digital and Data Divide Workshop and other series events.





