Prompt 4: How can data sharing and archiving capabilities be enhanced to ensure the greatest scientific impact?
We conducted our research project explicitly to inform mitigation and rebuilding after Hurricane Harvey. When the goal of research is to inform operational decision-making, it is essential to share data not only as broadly as possible, but also in a targeted manner with those capable of implementing research results. In our case, this meant the Federal Emergency Management Agency and the Texas state officials overseeing the Hazard Mitigation Grant Program, as well as the administrators of both public and private hospitals. It was therefore critical to consider not only peer-reviewed publications, but also channels for engaging policy makers and emergency managers directly; doing so ensures the greatest scientific and operational impact.
Recovery efforts from natural disasters can be more efficient with data-driven information on current needs and future risks. We advance open-source software infrastructure to support scientific investigation and data-driven decision making, demonstrated through a data sharing system built around a water quality assessment of post-Hurricane Maria drinking water contamination in Puerto Rico. One barrier to effective disaster response is the lack of easy, rapid access to diverse information about available resources and to maps of community resource needs and risks. Research products are made Findable, Accessible, Interoperable, and Reusable (FAIR) using HydroShare, a collaborative online sharing platform. Curating a central repository of assembled research data has the potential to greatly facilitate coordinated disaster responses of all types, with opportunities to improve planning, preparedness, and monitoring of the recovery process.
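As an illustration of FAIR sharing through HydroShare, the minimal sketch below uses the hs_restclient Python package to upload a dataset and make it publicly accessible. The credentials, file name, and metadata are placeholders, and the calls reflect our reading of that client's documented interface rather than the exact workflow used in this project.

```python
# Minimal sketch: sharing a post-disaster water quality dataset on
# HydroShare via its REST API (pip install hs_restclient).
# All credentials, file names, and metadata below are placeholders.
from hs_restclient import HydroShare, HydroShareAuthBasic

auth = HydroShareAuthBasic(username="your_user", password="your_password")
hs = HydroShare(auth=auth)

# Create a resource carrying the assembled field data; the abstract and
# keywords supply the metadata that makes the record findable.
resource_id = hs.createResource(
    "CompositeResource",
    "Post-Hurricane Maria Drinking Water Quality, Puerto Rico",
    resource_file="water_quality_samples.csv",
    abstract="Field measurements of drinking water contamination "
             "collected after Hurricane Maria.",
    keywords=["Hurricane Maria", "water quality", "disaster response"],
)

# Make the resource public so responders and other researchers can
# discover and reuse it.
hs.setAccessRules(resource_id, public=True)
print(f"Shared: https://www.hydroshare.org/resource/{resource_id}/")
```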
Despite the tremendous financial investments made in collecting valuable data about the post-event condition of our communities, current data repositories have several limitations, and overcoming them would greatly accelerate the use of these data to learn from hazards. Searching across multiple data repositories at once is not currently possible, and access to image collections is typically restricted to a given portal interface. The images are rarely annotated, and when they are, the annotations suffer from inconsistency and ambiguous definitions. Furthermore, none of these repositories supports searching the visual contents of images, and the analytic tools and capabilities researchers need are not available. Overall, a user must already know what he or she is looking for before searching for it, and only a limited portion of the data in the repositories is searchable for future reuse.
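To make the missing capability concrete, the following sketch shows one way content-based retrieval over a reconnaissance photo collection could work: each image is mapped to a feature vector, and queries are answered by nearest-neighbor search over those vectors. The embed() function is a placeholder standing in for any pretrained vision model, and the file names are hypothetical.

```python
# Sketch of content-based image retrieval: embed images once, then
# answer visual queries with cosine-similarity nearest-neighbor search.
import numpy as np

def embed(image_path: str) -> np.ndarray:
    """Placeholder: map an image to a feature vector.
    In practice this would run a pretrained vision model."""
    rng = np.random.default_rng(abs(hash(image_path)) % (2**32))
    return rng.standard_normal(512)

def build_index(image_paths):
    """Embed every image and L2-normalize for cosine similarity."""
    vecs = np.stack([embed(p) for p in image_paths])
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

def search(query_path, image_paths, index, k=5):
    """Return the k images most visually similar to the query."""
    q = embed(query_path)
    q /= np.linalg.norm(q)
    scores = index @ q                      # cosine similarities
    top = np.argsort(scores)[::-1][:k]
    return [(image_paths[i], float(scores[i])) for i in top]

photos = ["IMG_0001.jpg", "IMG_0002.jpg", "IMG_0003.jpg"]  # hypothetical
index = build_index(photos)
print(search("collapsed_bridge.jpg", photos, index, k=2))
```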
A database of the data and observations collected should be available to every researcher, so that collaborations aimed at improving the resilience and sustainability of the built environment can be pursued efficiently. New trends in technology also enable different sharing capabilities; a powerful example is the DesignSafe initiative supported by the National Science Foundation.
Data sharing and archiving promote the scientific impact of earthquake reconnaissance and research through enhanced access to information. Earthquake reconnaissance requires extraordinary financial, time, and personnel resources. As more multidisciplinary reconnaissance teams are formed and actively engage with communities during reconnaissance, the data resulting from these trips provide both breadth and depth across reconnaissance topics. Data sharing and archiving give researchers around the world opportunities to build on this work and to use the data to benchmark numerical analysis techniques. Access to such data also enables cross-disciplinary research that demonstrates whole-community impacts of a disaster, as well as innovative pre-disaster mitigation and post-disaster recovery initiatives. Through data sharing and archiving, the scientific impact of post-earthquake reconnaissance and data collection can have a real-life impact on communities around the world.
The federation and fusion of multi-sourced data can greatly facilitate damage assessment. In addition, content-based retrieval is key to efficient exploration of large disaster datasets. Cloud-based infrastructures provide promising capabilities in data archiving and sharing. However, more research is needed on dedicated data analytics.
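As a concrete illustration of what federation involves, the sketch below normalizes records from two hypothetical repositories with different schemas into a single damage-assessment view that can be queried uniformly. All repository names, field names, and values are invented for illustration.

```python
# Illustrative data federation: per-source adapters map heterogeneous
# (hypothetical) repository records into one shared schema, producing a
# fused view that damage assessments can query directly.
from dataclasses import dataclass
from typing import Iterable

@dataclass
class DamageRecord:
    source: str
    latitude: float
    longitude: float
    damage_level: str   # "minor", "moderate", or "severe"

def from_repo_a(rows: Iterable[dict]) -> list:
    # Repository A reports "lat"/"lon" and a numeric damage score.
    levels = {1: "minor", 2: "moderate", 3: "severe"}
    return [DamageRecord("repo_a", r["lat"], r["lon"], levels[r["score"]])
            for r in rows]

def from_repo_b(rows: Iterable[dict]) -> list:
    # Repository B uses a nested "location" object and text labels.
    return [DamageRecord("repo_b", r["location"]["y"], r["location"]["x"],
                         r["damage"].lower()) for r in rows]

fused = (from_repo_a([{"lat": 18.2, "lon": -66.5, "score": 3}]) +
         from_repo_b([{"location": {"x": -66.1, "y": 18.4},
                       "damage": "Minor"}]))
severe = [r for r in fused if r.damage_level == "severe"]
print(f"{len(severe)} severe-damage records among {len(fused)} fused records")
```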
Responses to various disasters and health emergencies have revealed the dire need for an improved ability to perform timely data collection and research for such events. While much has been done to improve the life-saving response to public health emergencies, these events have revealed notable gaps in our ability to develop, coordinate, and implement the scientific research needed in response to disasters. It took 11 months to begin a longitudinal health study of exposed workers after the Gulf oil spill, and similar delays were experienced with Superstorm Sandy, the Ebola and Zika outbreaks, and other recent events. Such delays undermine the ability to identify participants or gather the critical information needed to determine disaster-related risk factors, such as resiliency, health outcomes related to exposure or other stressors, or the efficacy of various response activities. Critical data are lost if not collected in a timely, systematic, and scientifically rigorous manner through coordinated interdisciplinary efforts with multiple stakeholders, including impacted communities.
A workshop was held at the most recent National Conference on Earthquake Engineering in Anchorage, Alaska, to convene researchers from the United States, New Zealand, Italy, Chile, Japan, and Canada to discuss recent seismic events, experiences with post-earthquake data collection, and the current culture of data sharing among scientists and engineers. Most of the workshop's outcomes are relevant across hazard types and disciplines and are necessary to enhance the knowledge gained from field activities. These include cooperation during data collection in the field, a culture of open sharing, agreements to support data sharing, collaborative relationships established before an event, a standardized taxonomy, and inventories of existing infrastructure. Several important developments have occurred in the past four years, including the establishment of the National Institute of Standards and Technology's Center of Excellence for Risk-Based Community Resilience Planning and the National Science Foundation's Natural Hazards Engineering Research Infrastructure network, that can contribute significantly toward addressing the needs identified by this workshop.
Disaster-related assistance and resources are limited because private well water is unregulated by state and federal agencies. Continued collaboration and data sharing will help further characterize the needs of, and threats to, this underserved community. Our well water quality dataset and accompanying surveys provide an overview of flood-impacted groundwater quality, well system characteristics, well owner behaviors before and after hurricanes, well water resources and information needs, and flooding damage. Such information could be paired with other datasets, such as outbreak surveillance data, to understand the potential health risks for this community. Moreover, emergency-response sampling protocols, research surveys, and outreach strategies need collaborative refinement to ensure that outcomes of mutual interest are assessed.
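As a hedged sketch of such pairing, the example below joins invented well water sampling results with invented outbreak surveillance counts by county and month to screen for co-occurring contamination and illness. The column names, thresholds, and values are purely illustrative, not drawn from our dataset.

```python
# Illustrative pairing of well water results with outbreak surveillance
# data, keyed on county and month. All values are invented.
import pandas as pd

wells = pd.DataFrame({
    "county": ["A", "A", "B"],
    "month": ["2017-09", "2017-10", "2017-09"],
    "pct_ecoli_positive": [42.0, 18.0, 7.5],   # share of samples positive
})
surveillance = pd.DataFrame({
    "county": ["A", "B"],
    "month": ["2017-09", "2017-09"],
    "gi_illness_cases": [31, 4],               # reported GI illness cases
})

paired = wells.merge(surveillance, on=["county", "month"], how="left")
# Flag county-months where high contamination coincides with case reports.
paired["flag"] = ((paired["pct_ecoli_positive"] > 20) &
                  (paired["gi_illness_cases"].fillna(0) > 10))
print(paired)
```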
A key aspect of DesignSafe's mission is to provide researchers with the infrastructure to share their data and research results so that they may be discovered, reused, and cited. DesignSafe is a web-based platform where researchers can upload, analyze, curate, and publish data. The curation process is based on data models developed jointly by the DesignSafe team and the research community, so that the vocabulary is controlled and consistent, enhancing the discoverability and reuse of data. Researchers curate their data by assigning files to categories (e.g., model, sensor, event) as described in the data model relevant to their research (e.g., experimental, field reconnaissance, numerical simulation). Researchers then select which of their data to publish; the published data receive a Digital Object Identifier (DOI), creating a permanent, citable record.
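The sketch below illustrates the idea behind this kind of curation: files are assigned to categories drawn from a controlled data model, and assignments outside the model are rejected. It is not DesignSafe's actual API, and the category lists shown are simplified assumptions for illustration only.

```python
# Illustrative (not DesignSafe's API): a minimal controlled data model in
# which files must be assigned to categories valid for the project type.
from dataclasses import dataclass, field

ALLOWED = {  # assumed, simplified category vocabulary per project type
    "experimental": {"model", "sensor", "event"},
    "field reconnaissance": {"planning", "social science", "engineering"},
    "numerical simulation": {"model", "input", "output"},
}

@dataclass
class CuratedProject:
    title: str
    project_type: str                          # e.g., "experimental"
    files: dict = field(default_factory=dict)  # category -> file paths

    def assign(self, category: str, path: str) -> None:
        """Attach a file under a category defined by the data model."""
        if category not in ALLOWED[self.project_type]:
            raise ValueError(f"'{category}' is not a valid category for "
                             f"{self.project_type} projects")
        self.files.setdefault(category, []).append(path)

proj = CuratedProject("Shake table test of a braced frame", "experimental")
proj.assign("model", "frame_drawings.pdf")
proj.assign("sensor", "accelerometer_layout.csv")
proj.assign("event", "run01_records.zip")
print(proj.files)
```

Constraining assignments to a shared vocabulary is what keeps published records consistent and discoverable across projects.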