Prompt 4: How can data sharing and archiving capabilities be enhanced to ensure the greatest scientific impact?

Sarah Alcala, Motivf
Laura Wolf, U.S. Department of Health and Human Services
Leremy Colf, U.S. Department of Health and Human Services

We conducted our research project explicitly to influence mitigation and rebuilding decisions after Hurricane Harvey. When the goal of research is to inform operational decision-making, it is essential to share data not only as broadly as possible, but also in a targeted manner with those capable of implementing research results. In our case, this meant the Federal Emergency Management Agency and Texas state officials overseeing the Hazard Mitigation Grant Program, as well as the administrators of both public and private hospitals. It was, therefore, critical to consider not only peer-reviewed publications, but also options for engaging policy makers and emergency managers. Sharing results through both channels ensures the greatest scientific and operational impacts.

Christina Bandaragoda, University of Washington
Miguel Leon, University of Pennsylvania
Jim Phuong, University of Washington
Graciela Ramirez-Toro, Inter American University of Puerto Rico
Kelsey Pieper, Virginia Tech
William Rhoads, Virginia Tech
Tim Ferguson-Sauder, Olin College
Jeffery Horsburgh, Utah State University
Jerad Bales, Consortium of Universities for the Advancement of Hydrologic Science
Sean Mooney, University of Washington
Martin Seul, Consortium of Universities for the Advancement of Hydrologic Science
Kari Stephens, University of Washington
Erkan Istanbulluoglu, University of Washington
Julia Hart, University of Washington
Marc Edwards, Virginia Tech
Amy Pruden, Virginia Tech
Virginia Riquelme, Virginia Tech
Ishi Keenum, Virginia Tech
Ben Davis, Virginia Tech
Emily Lipscomb, Virginia Tech
David Tarboton, Utah State University
Amber Spackman Jones, Utah State University
Eric Hutton, Cooperative Institute for Research in Environmental Sciences
Gregory Tucker, University of Colorado Boulder
Scott Peckham, University of Colorado Boulder
Christopher Lenhardt, Renaissance Computing Institute
William McDowell, University of New Hampshire
David Arctur, University of Texas at Austin

Recovery efforts from natural disasters can be more efficient with data-driven information on current needs and future risks. We advance open-source software infrastructure that supports scientific investigation and data-driven decision making through a data sharing system, demonstrated with a water quality assessment developed to investigate drinking water contamination in Puerto Rico after Hurricane Maria. One limitation to effective disaster response is the lack of easy and rapid access to diverse information about available resources and to maps of community resource needs and risks. Research products are made Findable, Accessible, Interoperable, and Reusable (FAIR) using HydroShare, a collaborative online sharing platform. Curating a central repository of assembled research data has the potential to greatly facilitate coordinated disaster responses of all types, with opportunities to improve planning, preparedness, and monitoring of the recovery process.

Shirley Dyke, Purdue University
Chul Min Yeum, Purdue University
Mathieu Gaillard, Purdue University
Bedrich Benes, Purdue University
Thomas Hacker, Purdue University
Alana Lund, Purdue University
Ali Lenjani, Purdue University
Julio Ramirez, Purdue University

Despite the tremendous financial investments made in collecting valuable data about the post-event condition of our communities, current data repositories have several limitations. These limitations must be overcome to greatly accelerate the use of these data to learn from hazards. For instance, searching across several data repositories, or even across all of the contents of one, is not possible. Access to the image collections is typically restricted to a given portal interface. The images are rarely annotated and, when they are, the annotations are inconsistent and rely on ambiguous definitions. In addition, none of these repositories allows searching the visual contents of the images, and the analytic tools and capabilities that researchers need are not available. Overall, users must already know what they are looking for before searching for it. Only a limited portion of the data in the repositories is searchable for future reuse.

Amal Elawady, Florida International University
Ehssan Sayyafi, Rimkus Consulting Group
Arindam Gan Chowdhury, Florida International University
Peter Irwin, Florida International University

A database of collected data and observations should be available to every researcher. This way, collaboration stemming from the need to develop the resiliency and sustainability of the built environment can be pursued efficiently. In addition, new trends in technology allow different sharing capabilities. A very powerful example is the DesignSafe initiative encouraged by the National Science Foundation.

Erica Fischer, Oregon State University
Manny Hakhamaneshi, AMEC Foster Wheeler
Maggie Ortiz-Millan, Earthquake Engineering Research Institute
Beki McElvain, Earthquake Engineering Research Institute

Data sharing and archiving promote the scientific impact of earthquake reconnaissance and research through enhanced access to information. Earthquake reconnaissance requires extraordinary financial, time, and personnel resources. As more multidisciplinary reconnaissance teams are formed and actively engage with communities during reconnaissance, the data that result from these trips provide both breadth and depth on reconnaissance topics. Data sharing and archiving provide opportunities for researchers around the world to build on this work and use the data to benchmark numerical analysis techniques. Access to this type of data also provides opportunities for cross-disciplinary research to demonstrate the whole-community impacts of a disaster and to develop innovative pre-disaster mitigation and post-disaster recovery initiatives. Through data sharing and archiving, post-earthquake reconnaissance and data collection can have a real-life impact on communities around the world.

Jie Gong, Rutgers University
Mengyang Guo, Rutgers University
Yi Yu, Rutgers University

The federation and fusion of multi-sourced data can greatly facilitate damage assessment. In addition, content-based retrieval is key to efficient exploration of large disaster datasets. Cloud-based infrastructures provide promising capabilities in data archiving and sharing. However, more research is needed on dedicated data analytics.

Michelle Meyer, Louisiana State University
Brant Mitchell, Louisiana State University
Stuart Nolan, Louisiana State University

This project involves qualitative interviews and document data collection from persons who value their privacy and confidentiality. As a result, this research presents a challenge to data sharing and archiving, as much of the interview data will need to be redacted to ensure confidentiality is maintained. We would like to discuss processes and protocols for ensuring data access while also maintaining confidentiality for participants, especially for qualitative research.

Aubrey Miller, National Institute of Environmental Health Sciences

Responses to various disasters and health emergencies have revealed the dire need for an improved ability to perform timely data collection and research for such events. While much has been done to improve the life-saving response to public health emergencies, these events have revealed notable gaps in our ability to develop, coordinate, and implement needed scientific research in response to disasters. It took 11 months to begin a longitudinal health study of exposed workers after the Gulf oil spill. Unfortunate delays were also experienced with Superstorm Sandy, responses to the Ebola and Zika outbreaks, and other recent events. Such delays adversely affect the ability to identify participants or gather critical information to determine disaster-related risk factors, such as resiliency, health outcomes related to exposure or other stressors, or the efficacy of various response activities. Critical data are lost if not collected in a timely, systematic, and scientifically rigorous manner through coordinated interdisciplinary efforts with multiple stakeholders, including impacted communities.

Judith Mitrani-Reiser, National Institute of Standards and Technology
John van de Lindt, Colorado State University
Shane Crawford, University of Alabama
Andrew Graettinger, University of Alabama
Nathanael Rosenheim, Texas A&M University
Walter Peacock, Texas A&M University

A workshop was held at the last National Conference on Earthquake Engineering in Anchorage, Alaska, to convene researchers from the United States, New Zealand, Italy, Chile, Japan, and Canada to discuss recent seismic events, experiences with post-earthquake data collection, and the current culture of data sharing among scientists and engineers. The majority of the outcomes of this workshop are relevant across hazard types and disciplines, and are necessary to enhance the knowledge gained from these field activities. These include cooperation during data collection in the field, a culture of open sharing, development of agreements to support data sharing, establishment of collaborative relationships pre-event, a standardized taxonomy, and development of inventories of existing infrastructure. Several important developments have occurred in the past four years that can significantly contribute towards addressing the needs identified by this workshop, including the establishment of the National Institute of Standards and Technology's Center of Excellence for Risk-Based Community Resilience Planning and the National Science Foundation's Natural Hazards Engineering Research Infrastructure network.

Ali Mostafavi, Texas A&M University
Philip Berke, Texas A&M University
Arnold Vedlitz, Texas A&M University
Bjorn Birgisson, Texas A&M University
Sierra Woodruff, Texas A&M University

In the aftermath of disasters, several research teams conduct rapid response studies and data collection. One important limitation is the lack of information regarding who collects what data. Improving coordination in data collection efforts can provide great opportunities for synergizing efforts and reducing the number of requests that agencies receive for survey participation and interviews. An online application system that enables researchers to coordinate data collection efforts, and possibly share and integrate datasets, would have significant scientific impacts. To accomplish this, users of the online application could sign a memorandum of understanding to ensure the confidentiality of the information shared. In addition to a system for coordinating data collection, there could be incentives for research teams to collaborate in data collection efforts. Perhaps part of the National Science Foundation RAPID awards could be dedicated to collaborative data collection efforts as an incentive for further coordination and collaboration. Institutional review board issues for multi-team, multi-institution data collection efforts would be another challenge to address.

Kelsey Pieper, Virginia Tech
William Rhoads, Virginia Tech
Drew Gholson, Texas A&M University
Diane Boellstorff, Texas A&M University
Gregory House, Virginia Tech
Adrienne Katner, Louisiana State University
Kristine Mapili, Virginia Tech
Amy Pruden, Virginia Tech
Marc Edwards, Virginia Tech

Disaster-related assistance and resources for private well owners are limited because private well water is unregulated by state and federal agencies. Continued collaboration and data sharing will help further characterize the needs of, and threats to, this underserved community. Our well water quality dataset and accompanying surveys provide an overview of flood-impacted groundwater quality, well system characteristics, well owner behaviors before and after hurricanes, well water resource and information needs, and flooding damage. Such information could be paired with other datasets, such as outbreak surveillance data, to understand the potential health risks for this community. Moreover, emergency-response-related sampling protocols, research surveys, and outreach strategies are in need of collaborative refinement to ensure that outcomes of mutual interest are assessed.

Ellen Rathje, University of Texas at Austin
Tim Cockerill, University of Texas at Austin

A key aspect of DesignSafe's mission is to provide researchers with the infrastructure to share their data and research results so that they can be discovered, reused, and cited. DesignSafe is a web-based platform where researchers can upload, analyze, curate, and publish data. The curation process is based on data models developed jointly by the DesignSafe team and the research community so that the vocabulary is controlled and consistent, enhancing the discoverability and reuse of data. Researchers curate their data by assigning their files to categories (e.g., model, sensor, event) as described in the relevant data model for their research (e.g., experimental, field reconnaissance, numerical simulation). Researchers then select which of their data they want to publish; the published data receive a Digital Object Identifier, which serves as a permanent, citable record.
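
The curation steps described above (choose a data model, assign files to categories from its controlled vocabulary, then select what to publish) can be illustrated with a minimal Python sketch. This is not DesignSafe's actual software or API. The `Project` class, its method names, the file names, and the assumption that the categories model, sensor, and event belong to the experimental data model are all hypothetical and chosen only for illustration. The sketch stops short of issuing a Digital Object Identifier, since that step is handled by the platform.

```python
from dataclasses import dataclass, field

# Hypothetical controlled vocabulary: data model name -> allowed categories.
# Pairing these three categories with "experimental" is an assumption.
DATA_MODELS = {
    "experimental": {"model", "sensor", "event"},
}


@dataclass
class Project:
    """Hypothetical curation record for one research project."""
    data_model: str
    files: dict = field(default_factory=dict)    # file name -> category
    selected: set = field(default_factory=set)   # files chosen for publication

    def __post_init__(self):
        if self.data_model not in DATA_MODELS:
            raise ValueError(f"unknown data model: {self.data_model!r}")

    def assign(self, file_name, category):
        """Assign a file to a category, enforcing the controlled vocabulary."""
        allowed = DATA_MODELS[self.data_model]
        if category not in allowed:
            raise ValueError(
                f"{category!r} is not a category of the "
                f"{self.data_model!r} data model; allowed: {sorted(allowed)}"
            )
        self.files[file_name] = category

    def select_for_publication(self, file_name):
        """Mark a curated file for publication; uncurated files are rejected."""
        if file_name not in self.files:
            raise KeyError(f"{file_name!r} must be categorized before publication")
        self.selected.add(file_name)

    def publication_manifest(self):
        """Return the selected files with their categories, sorted by name."""
        return {name: self.files[name] for name in sorted(self.selected)}


project = Project("experimental")
project.assign("shake_table_run1.csv", "sensor")
project.assign("specimen_drawing.pdf", "model")
project.select_for_publication("shake_table_run1.csv")
print(project.publication_manifest())  # {'shake_table_run1.csv': 'sensor'}
```

Rejecting any category outside the chosen data model is what keeps the vocabulary controlled and consistent in this sketch, which is the property the text credits with improving discoverability and reuse.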