In November 2017, the National Science Foundation’s Big Data Innovation Hubs sponsored a workshop in Versailles, France, to discuss the formation of public-private partnerships in big data research among institutions in the United States and the European Union. Organized in conjunction with the Big Data Value Association, the PICASSO Project, and Inria, the workshop was the first of its kind to bring together international big data experts representing government, industry, and academia.
The one-day workshop began with thematic keynote speeches and panel discussions in the morning, followed by breakout sessions with facilitated discussions of innovative projects in academia and industry. Challenges and opportunities in four main areas of big data research emerged as discussions unfolded:
Smart Cities
Those working to create smart cities face several challenges: collecting and integrating massive amounts of data and making it usable; a lack of public-private partnerships; and ensuring that all smart city residents have equal opportunities to use and benefit from big data. Countries around the world are experimenting with various levels of “smartness” within their cities, so there are many opportunities to learn what works in creating cities in which data is used to benefit all. The group recommended compiling a list of smart cities with exemplary success stories, creating and expanding public-private partnerships and integrated transatlantic working groups, and facilitating conversations through joint presentations and conferences in a variety of sectors.
Health
There are numerous challenges associated with collecting, sharing, and storing vast amounts of complex and disparate health data. These challenges include achieving data interoperability and integration, navigating the complex legal requirements and regulations that apply when sharing data internationally, and addressing concerns about biased data that lack inputs from minority or disadvantaged populations. Despite these challenges, there are opportunities to build collaboration on the commonalities and goals that the EU and US share. For example, a unique opportunity exists to create an international community of citizen scientists, where individuals can share data or propose their own research questions. Data from this community could be stored and managed separately within the EU and US, and each side could use machine learning to develop, train, and test models on its own datasets. This crowdsourcing approach involves sharing machine learning models and metadata without sharing the raw data. Additionally, the advanced statistical methods used in creating these models could be applied to the problem of data sparsity related to disadvantaged populations.
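As a rough illustration of sharing models and metadata rather than raw data, the sketch below fits a simple model separately on two regional datasets and then combines only the fitted parameters and sample counts. The workshop did not specify a method; the sample-weighted parameter averaging, the one-variable linear model, the names `SharedModel`, `fit_local`, and `combine`, and the toy data are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SharedModel:
    """What a site shares: fitted parameters plus metadata, never raw records."""
    site: str
    slope: float
    intercept: float
    n_samples: int


def fit_local(site: str, records: List[Tuple[float, float]]) -> SharedModel:
    """Fit y = slope * x + intercept by ordinary least squares on local data."""
    n = len(records)
    mean_x = sum(x for x, _ in records) / n
    mean_y = sum(y for _, y in records) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in records)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in records)
    slope = sxy / sxx
    return SharedModel(site, slope, mean_y - slope * mean_x, n)


def combine(models: List[SharedModel]) -> Tuple[float, float]:
    """Average the shared parameters, weighting each site by its sample count."""
    total = sum(m.n_samples for m in models)
    slope = sum(m.slope * m.n_samples for m in models) / total
    intercept = sum(m.intercept * m.n_samples for m in models) / total
    return slope, intercept


# Hypothetical toy data; each list stays inside its own region.
eu_records = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
us_records = [(0.0, 1.0), (2.0, 5.0), (4.0, 9.0), (6.0, 13.0)]

# Only the SharedModel objects cross the regional boundary.
eu_model = fit_local("EU", eu_records)
us_model = fit_local("US", us_records)
slope, intercept = combine([eu_model, us_model])
```

Only the `SharedModel` objects are exchanged here, so the individual records never leave the region that collected them. A real health-data collaboration would need far more than this, including privacy review of what the shared parameters themselves might reveal.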
Environment, Agriculture, Food, Energy, and Water
The connections among the environment, agriculture, food, energy, and water are vast and complicated, but several themes emerged during the breakout sessions. Data interoperability was identified as a significant issue; a large amount of available data doesn’t reach potential users due to a lack of organized, enriched metadata. Sensors and sensor networks are vital to many projects in this sphere, and there is an urgent need for low-cost, high-resolution sensors that can deliver quality data. These kinds of sensors would enable not just professional researchers but also citizen scientists to generate valuable data with the potential to shape policy and manage the demand for scarce resources.
Data Science Education
In any data science education program, the most significant measure of success will be not how well the program teaches students to master specific technologies or skills, but how well it produces graduates who can adapt their skills to an ever-changing technical landscape. There is a strong need for a standardized curriculum that teaches data skills from elementary school all the way through master’s and doctoral programs, though there is not yet a clear path forward on how to share those standards across institutions and countries.
To learn more about the workshop and the report that will be produced about it, or to discover all of the NSF Big Data Innovation Hubs’ efforts, stay in the loop with each of the four regional hubs: The South Hub, The Northeast Hub, The Midwest Hub, and The West Hub.
Authors: Samia Ansari, Alex Cheng, Wendy Christensen, Ashley Griffin, Chloe Rotman