Chapter 1: Introduction to the User Guide

Democratizing Data Search and Discovery Platform User Guide

1.1 Overview

The Search and Discovery Platform has been developed to describe how datasets identified by a set of federal agencies have been used. This user guide provides information about the Search and Discovery Platform workflow. The Guide is designed to inform agency staff and researchers who use the platform to understand how datasets are used and who want to know more about how the reported results were generated. The Guide is also designed to encourage the agency and researcher communities to contribute to the platform, and thus increase the value of data for both themselves and the community at large. Understanding data use and value is a complex endeavor, and the platform will need the contributions of many experts to be fully successful.

With these two goals in mind, each chapter unpacks a piece of the workflow: a brief summary is followed by a non-technical description, with links to more details for those who are interested. More technical information is provided in the appendices.

The structure of the user guide is as follows. It begins with the vision, goals, and context (Chapter 2). Chapter 3 provides a roadmap to each of the subsequent chapters. Chapters 4-6 describe the process for generating the underlying information (the metadata): the source corpus (Chapter 4), the machine learning models that are used to find how datasets are used in publications (Chapter 5), and how the output is validated (Chapter 6). Chapters 7-8 describe how users can access the data through SciServer (Chapter 7) and through the API (Chapter 8). The datasets are currently used through the researcher dashboard (Chapter 9) and the network visualization tools (Chapter 10), although new uses can always be developed by the community through SciServer. The concluding chapter (Chapter 11) describes the potential user community and identifies ways in which the agencies can engage with it. It describes how government agencies and researchers are beginning to use the tools, and provides information about other ways in which stakeholders can become involved, including participating in upcoming workshops, developing better models, contributing new usage measures, or providing links to missing documents or data providers. The appendices provide details about the data models and dictionaries.

The project team members would be thrilled to receive suggestions about how to improve the guide, the platform, or any of its components. Please send any suggestions to: democratizing.data.project@gmail.com.
