APPENDIX B: Metadata Table and Data Dictionary

The following metadata tables apply to the data that can be accessed both in SciServer and through the APIs. For each API call, results can be filtered by agency designation using the URL parameter agency={AGENCY}. For example: https://prod.democratizing-data.tacc.utexas.edu/authors?agency=USDA
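
The agency filter can be sketched in Python as follows. This is a minimal illustration, not part of the API itself: the helper name build_url is hypothetical, and only the base URL, the authors and agency_runs endpoints, and the agency parameter come from this appendix.

```python
from urllib.parse import urlencode

# Base URL taken from the endpoints listed in this appendix.
BASE_URL = "https://prod.democratizing-data.tacc.utexas.edu"


def build_url(endpoint, agency=None):
    """Return the URL for an endpoint, optionally filtered by agency.

    build_url is a hypothetical helper used only for illustration.
    """
    url = f"{BASE_URL}/{endpoint.strip('/')}"
    if agency is not None:
        # Append the agency={AGENCY} query parameter described above.
        url += "?" + urlencode({"agency": agency})
    return url


print(build_url("authors", agency="USDA"))
print(build_url("agency_runs"))
```

The first call reproduces the example URL given above; the second returns the unfiltered agency_runs endpoint.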

In each case below, the API returns data in the schema listed in the table, unless noted otherwise. The API endpoint can be found below each table.

agency_run: the table of runs for the different agencies and their datasets

https://prod.democratizing-data.tacc.utexas.edu/agency_runs

asjc: the table with the All Science Journal Classification codes which define the research area of a journal and the articles it contains

https://prod.democratizing-data.tacc.utexas.edu/asjc

A publication's research areas are defined through Elsevier's All Science Journal Classification scheme. More information about this classification system can be found here: https://service.elsevier.com/app/answers/detail/a_id/15181/supporthub/scopus/

author: the table with author information

https://prod.democratizing-data.tacc.utexas.edu/authors

author_affiliation: the table linking authors to their affiliations in a publication

https://prod.democratizing-data.tacc.utexas.edu/author_affiliations

dataset_alias: the datasets provided by an agency for a particular run and possible aliases

https://prod.democratizing-data.tacc.utexas.edu/dataset_aliases

Note that the endpoint https://prod.democratizing-data.tacc.utexas.edu/datasets joins dataset_alias and agency_run, the schema being a concatenation of the two.

To filter dataset_alias by other parameters, use:
https://prod.democratizing-data.tacc.utexas.edu/topics/{topic_id}/datasets
https://prod.democratizing-data.tacc.utexas.edu/authors/{author_id}/datasets
https://prod.democratizing-data.tacc.utexas.edu/publications/{publication_id}/datasets
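
The filtered dataset endpoints above all follow the pattern /{parent}/{id}/datasets. A minimal sketch of building them is shown below. The helper name datasets_url and the identifier value 123 are hypothetical placeholders; real identifiers come from the corresponding tables.

```python
# Base URL taken from the endpoints listed in this appendix.
BASE_URL = "https://prod.democratizing-data.tacc.utexas.edu"

# Parent resources that this appendix lists as filters for dataset_alias.
DATASET_FILTER_PARENTS = ("topics", "authors", "publications")


def datasets_url(parent, parent_id):
    """Build the URL that lists datasets for one topic, author, or publication.

    datasets_url is a hypothetical helper used only for illustration.
    """
    if parent not in DATASET_FILTER_PARENTS:
        raise ValueError(f"unsupported filter: {parent!r}")
    return f"{BASE_URL}/{parent}/{parent_id}/datasets"


# 123 is a placeholder identifier, not a real author_id.
print(datasets_url("authors", 123))
```

Restricting the parent argument to the three documented resources keeps the sketch from producing URLs that this appendix does not list.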

dyad: the core table with dyads representing dataset references

https://prod.democratizing-data.tacc.utexas.edu/dyads

*This column is set to NULL in the publicly available databases, and is only shown in restricted use databases.

dyad_model: the table with model scores for particular entries in the dyad table

https://prod.democratizing-data.tacc.utexas.edu/dyad_models

issn: the table with ISSN/ISBN codes for the journal

https://prod.democratizing-data.tacc.utexas.edu/issns

journal: the table linking publications to the journal in which they appeared

https://prod.democratizing-data.tacc.utexas.edu/journals

model: the table with the Kaggle models that are run

https://prod.democratizing-data.tacc.utexas.edu/models

publication: the publications discovered in a run and their metadata

https://prod.democratizing-data.tacc.utexas.edu/publications

To filter publications using other parameters, use:
https://prod.democratizing-data.tacc.utexas.edu/topics/{topic_id}/publications
https://prod.democratizing-data.tacc.utexas.edu/authors/{author_id}/publications
https://prod.democratizing-data.tacc.utexas.edu/datasets/{parent_alias_id}/publications
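
As an illustration, the publications for one dataset might be retrieved as sketched below. This appendix does not state the response format, so the sketch assumes the endpoint returns UTF-8 encoded JSON; the helper name fetch_publications and its opener parameter are hypothetical.

```python
import json
from urllib.request import urlopen

# Base URL taken from the endpoints listed in this appendix.
BASE_URL = "https://prod.democratizing-data.tacc.utexas.edu"


def fetch_publications(parent_alias_id, opener=urlopen):
    """Fetch the publications linked to a dataset.

    Assumption: the response body is UTF-8 encoded JSON. The opener
    argument is a hypothetical hook that lets the HTTP call be replaced,
    for example in tests.
    """
    url = f"{BASE_URL}/datasets/{parent_alias_id}/publications"
    with opener(url) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_publications with a real parent_alias_id performs a live request. Passing a different opener allows the function to be exercised without network access.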

publication_affiliation: the table with affiliations linked to a publication

https://prod.democratizing-data.tacc.utexas.edu/publication_affiliations

publication_asjc: the table linking a publication to its ASJC code(s)

https://prod.democratizing-data.tacc.utexas.edu/publication_asjcs

publication_author: the table linking publication and author tables

https://prod.democratizing-data.tacc.utexas.edu/publication_authors

publication_topic: the table identifying the topic assigned to a publication

https://prod.democratizing-data.tacc.utexas.edu/publication_topics

publisher: the table with a list of publishers that can be linked to journals

https://prod.democratizing-data.tacc.utexas.edu/publishers

topic: the table with Topics defined by Elsevier, consisting of topic names formed from three concatenated keywords

https://prod.democratizing-data.tacc.utexas.edu/topics

Topics are defined through Elsevier's Topic Prominence in Science methodology. More information about these Topics can be found here: https://www.elsevier.com/solutions/scival/features/topic-prominence-in-science

To filter topic by other parameters, use:
https://prod.democratizing-data.tacc.utexas.edu/authors/{author_id}/topics
https://prod.democratizing-data.tacc.utexas.edu/datasets/{parent_alias_id}/topics
https://prod.democratizing-data.tacc.utexas.edu/publications/{publication_id}/topics

publication_ufc: the table with unified fingerprint concepts assigned to a publication

The following metadata tables apply only to the data that can be accessed in SciServer restricted use databases.

reviewer: the table with reviewers assigned to validate dyads in the publication_dataset_alias table

snippet_validation: the table containing validation results for dyads provided by reviewers

susd_user: the table identifying users of the validation tool

Last updated