The Code of Practice for Research Data Usage Metrics standardizes the generation and distribution of usage metrics for research data, enabling for the first time the consistent and credible reporting of research data usage.
COUNTER welcomes input and feedback from the community on this first iteration, so that it can be further developed and refined.
The terms below are aligned as much as possible with the COUNTER Code of Practice Release 5 glossary.
|Access_Method||A COUNTER attribute indicating whether the usage related to investigations and requests was generated by a human user browsing and searching a website (Regular) or by a computer (Machine).|
|Collection||A curated collection of metadata about content items.|
|Component||A uniquely identifiable constituent part of a content item composed of more than one file (digital object).|
|Content item||A generic term describing a unit of content accessed by a user of a content host. Typical content items include articles, books, chapters, datasets, multimedia, etc.|
|Content provider||An organization whose function is to commission, create, collect, validate, host, distribute, and trade information in electronic form.|
|Creator(s)||The person/people who wrote/created the datasets whose usage is being reported.|
|Data repository||A content provider that provides access to research data.|
|Data type||The field identifying type of content. The Code of Practice for Research Data Usage Metrics only recognizes the Data type Dataset.|
|Dataset||An aggregation of data, published or curated by a single agent, and available for access or download in one or more formats, with accompanying metadata. Other term: data package.|
|Description||A short description of a dataset. Accessing the description falls into the usage category of Investigations.|
|DOI (digital object identifier)||The digital object identifier is a means of identifying a piece of intellectual property (a creation) on a digital network, irrespective of its current location (IDF).|
|Double-click||A repeated click or repeated access to the same resource by the same user within a period of 30 seconds. COUNTER requires that double-clicks must be counted as a single click.|
|Host types||A categorization of Content Providers used by COUNTER. The Code of Practice for Research Data Usage Metrics uses the following host type: Data Repository.|
|Internet robot, crawler, spider||An identifiable, automated program or script that visits websites and systematically retrieves information from them, often to provide indexes for search engines rather than for research. Not all programs or scripts are classified as robots.|
|Investigation||A category of COUNTER metric types that represent a user accessing information related to a dataset (i.e. a description or detailed descriptive metadata) or the content of the dataset itself.|
|Log file analysis||A method of collecting usage data in which the web server records all of its transactions.|
|Machine||A category of COUNTER Metric Types that represents a machine accessing content, e.g. a script written by a researcher. This does not include robots, crawlers and spiders.|
|Master reports||Reports that contain additional filters and breakdowns beyond those included in the standard COUNTER reports.|
|Metadata||A series of textual elements that describes a content item but does not include the item itself. For example, metadata for a dataset would typically include publisher, a list of names and affiliations of the creators, the title and description, and keywords or other subject classifications.|
|Metric types, Metric_Type||An attribute of COUNTER usage that identifies the nature of the usage activity.|
|ORCID (Open Researcher and Contributor ID)||An international standard identifier for individuals (i.e. authors) to use with their name as they engage in research, scholarship, and innovation activities.|
|Persistent Identifier (PID)||Globally unique identifier and associated metadata for research data, or other entities (articles, researchers, scholarly institutions) relevant in scholarly communication.|
|Platform||An interface from an aggregator, publisher, or other online service that delivers the content to the user and that counts and provides the COUNTER usage reports.|
|Provider ID||A unique identifier for a Content Provider and used by discovery services and other content sites to track usage for content items provided by that provider.|
|Publication date, Publication_Date||An optional field in COUNTER item reports and Provider Discovery Reports. The date of release by the publisher to customers of a content item.|
|Publisher||An organization whose function is to commission, create, collect, validate, host, distribute and trade information online and/or in printed form.|
|Regular||A COUNTER Access_Method. Indicates that usage was generated by a human user browsing/searching a website, rather than by a computer.|
|Reporting period, Reporting_Period||The total time period covered in a usage report.|
|Request||A category of COUNTER Metric Types that represents a user accessing the dataset content.|
|Session||A successful request of an online service. A single user connects to the service or database and ends by terminating activity that is either explicit (by leaving the service through exit or logout) or implicit (timeout due to user inactivity). (NISO).|
|SUSHI||An international standard (Z39.93) that describes a method for automating the harvesting of reports. The Research Data SUSHI API Specification is an implementation of this standard for harvesting Code of Practice for Research Data Usage Metrics reports.|
|Total_Dataset_Investigations||A COUNTER Metric_Type that represents the number of times users accessed the content of a dataset, or information describing that dataset (i.e. metadata).|
|Total_Dataset_Requests||A COUNTER Metric_Type that represents the number of times users requested the content of a dataset. Requests may take the form of viewing, downloading, or emailing the dataset provided such actions can be tracked by the content provider’s server.|
|Transactions||A usage event.|
|Unique_Dataset_Investigations||A COUNTER Metric_Type that represents the number of unique datasets investigated in a user-session.|
|Unique_Dataset_Requests||A COUNTER Metric_Type that represents the number of unique datasets requested in a user-session.|
|User||A person who accesses the online resource.|
|User agent||An identifier that is part of the HTTP/S protocol that identifies the software (i.e. browser) being used to access the site. May be used by robots to identify themselves.|
|Version||Multiple versions of a dataset are defined by significant changes to the content and/or metadata, associated with changes in one or more components.|
|Year of publication||Calendar year in which a dataset is published.|
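The Double-click entry in the glossary above can be illustrated with a minimal sketch. It assumes usage events are available as simple records; the `Event` type, its field names, and the `collapse_double_clicks` helper are hypothetical and not part of the Code of Practice. The sketch also assumes that clicks exactly 30 seconds apart still count as a double-click and that the first click of a chain is the one kept; only the resulting count matters for the rule as stated.

```python
from collections import namedtuple

# Hypothetical event record: who accessed which resource, and when (in seconds).
Event = namedtuple("Event", ["user_id", "resource", "timestamp"])

DOUBLE_CLICK_WINDOW = 30  # seconds, as given in the glossary


def collapse_double_clicks(events, window=DOUBLE_CLICK_WINDOW):
    """Count repeated clicks on the same resource by the same user as one click.

    A click is dropped when it follows the previous click on the same
    (user, resource) pair by `window` seconds or less.
    """
    kept = []
    last_seen = {}  # (user_id, resource) -> timestamp of the most recent click
    for event in sorted(events, key=lambda e: e.timestamp):
        key = (event.user_id, event.resource)
        previous = last_seen.get(key)
        if previous is None or event.timestamp - previous > window:
            kept.append(event)
        # Always remember the latest click so that chains of rapid clicks collapse.
        last_seen[key] = event.timestamp
    return kept


events = [
    Event("user-1", "dataset-a", 0),
    Event("user-1", "dataset-a", 10),  # within 30 s of the previous click: dropped
    Event("user-1", "dataset-a", 45),  # more than 30 s after the previous click: kept
    Event("user-2", "dataset-a", 5),   # different user: kept
]
print(len(collapse_double_clicks(events)))  # prints 3
```

Identifying "the same user" in practice is left open here; the sketch simply takes a `user_id` as given.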
Usage metrics for research data are seen as an important indicator of impact by researchers and other stakeholders (Costas, Meijer, Zahedi, & Wouters, 2013; Kratz & Strasser, 2015), second only to data citations. They currently cannot fill that role because there is no standard for how usage metrics should be collected and reported.
The Code of Practice for Research Data Usage Metrics standardizes the generation and distribution of usage metrics for research data. This enables data repositories and providers to produce consistent and credible usage metrics for research data, and helps data repositories, libraries, funders and other stakeholders to understand and demonstrate the reuse of research data.
This is the first release of the Code of Practice for Research Data Usage Metrics. The recommendations are aligned as much as possible with the COUNTER Code of Practice Release 5 (COUNTER Code of Practice Release 5, 2017) that standardizes usage metrics for many scholarly resources, including journals and books. Many definitions, processing rules, and reporting recommendations apply to research data in the same way as they apply to other scholarly resources.
The dataset (an aggregation of data published or curated by a single agent) is the content item for which we report usage in terms of investigations (i.e. how many times the dataset or its metadata are accessed) and requests (i.e. how many times data are retrieved, a subset of all investigations). Investigations and requests for components of the dataset can be reported in the same way as for other scholarly resources under COUNTER Code of Practice Release 5, in that the total number of investigations or requests is summed across the components of a given dataset. Sessions allow differentiation between total investigations and requests of a dataset (in which all accesses are summed) and unique investigations and requests (in which accesses are counted only once per dataset within a unique user-session), aligned with the reporting for content items in COUNTER Code of Practice Release 5.
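The difference between total and unique counts can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the event tuple layout and the `summarize_usage` helper are hypothetical, accesses to components are assumed to have already been attributed to their parent dataset, and sessions are assumed to be identified upstream. Only the four metric names come from the glossary.

```python
from collections import defaultdict


def summarize_usage(events):
    """Compute total and unique investigations/requests per dataset.

    `events` is an iterable of (session_id, dataset_id, kind) tuples, where
    `kind` is "investigation" or "request". Every request is also counted as
    an investigation, since requests are a subset of investigations.
    """
    report = defaultdict(lambda: {
        "Total_Dataset_Investigations": 0,
        "Unique_Dataset_Investigations": 0,
        "Total_Dataset_Requests": 0,
        "Unique_Dataset_Requests": 0,
    })
    seen = set()  # (session_id, dataset_id, category) combinations already counted
    for session_id, dataset_id, kind in events:
        categories = ["Investigations"]
        if kind == "request":
            categories.append("Requests")
        row = report[dataset_id]
        for category in categories:
            # Totals: every access is summed.
            row[f"Total_Dataset_{category}"] += 1
            # Uniques: counted once per dataset per user-session.
            marker = (session_id, dataset_id, category)
            if marker not in seen:
                seen.add(marker)
                row[f"Unique_Dataset_{category}"] += 1
    return dict(report)


events = [
    ("session-1", "dataset-a", "investigation"),  # metadata viewed
    ("session-1", "dataset-a", "request"),        # data retrieved
    ("session-1", "dataset-a", "request"),        # a second component retrieved
    ("session-2", "dataset-a", "investigation"),
]
print(summarize_usage(events)["dataset-a"])
```

For this input the dataset has four total investigations and two total requests, but only two unique investigations (one per session) and one unique request.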
Some aspects of the processing and reporting of usage data are unique to research data, and the Code of Practice for Research Data Usage Metrics thus at times needs to deviate from the COUNTER Code of Practice Release 5 and specifically address them. This starts with the main use cases for data usage metrics reporting: subscription access to research data is uncommon, so breaking down the usage data by the institution accessing the research data is less relevant. While there is interest in understanding the geographic distribution of investigations and requests to research data, these usage data can be reported at a coarser granularity (by country rather than by institution) and can be aggregated and openly shared.
COUNTER Code of Practice Release 5 focuses usage reporting on human users and filters out all known robots, crawlers, and spiders. While the same exclusion list should be applied to research data, there is significant legitimate usage in which humans employ scripts and other automated tools in the normal course of research. The Code of Practice for Research Data Usage Metrics defines how usage from these automated tools can be reported.
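One way to picture this three-way split is a classifier over the user agent string. The sketch below is only illustrative: the patterns in `ROBOT_PATTERNS` and `MACHINE_PATTERNS` are hypothetical stand-ins, not the actual exclusion list, and the `classify_user_agent` helper and the `"Excluded"` label are assumptions introduced here. Only the `Regular` and `Machine` values come from the Access_Method definition in the glossary.

```python
import re

# Hypothetical stand-ins for an exclusion list of robots, crawlers, and spiders.
ROBOT_PATTERNS = [re.compile(p, re.IGNORECASE) for p in (r"bot", r"crawler", r"spider")]

# Hypothetical examples of scripted clients that a researcher might use.
MACHINE_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (r"^curl/", r"^python-requests/", r"^wget/")
]


def classify_user_agent(user_agent):
    """Return "Excluded", "Machine", or "Regular" for a user agent string."""
    # Robots are checked first so that they never reach the usage counts.
    if any(p.search(user_agent) for p in ROBOT_PATTERNS):
        return "Excluded"
    if any(p.search(user_agent) for p in MACHINE_PATTERNS):
        return "Machine"
    return "Regular"


print(classify_user_agent("python-requests/2.31.0"))  # prints Machine
```

Checking the robot patterns before the machine patterns reflects the text above: robot traffic is removed entirely, while usage from scripts and other automated tools is kept and reported separately from usage by human users.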
Versioning is much more common and complex with research data compared to most other scholarly resources, and the Code of Practice for Research Data Usage Metrics addresses this. We recommend reporting usage metrics for each specific version, as well as the combined usage for all versions. This first release of the Code of Practice for Research Data Usage Metrics will not fully address the particular challenges associated with reporting usage for dynamically changing datasets.
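The recommendation to report both per-version and combined usage could be sketched as below. The `usage_by_version` helper, the input layout, and the version labels are hypothetical; how a repository links versions to a common dataset identifier is not specified here and is assumed to be resolved beforehand.

```python
from collections import Counter


def usage_by_version(accesses):
    """Count accesses per specific version and combined across all versions.

    `accesses` is an iterable of (dataset_id, version) pairs, one per counted
    investigation or request.
    """
    per_version = Counter()
    combined = Counter()
    for dataset_id, version in accesses:
        per_version[(dataset_id, version)] += 1
        combined[dataset_id] += 1
    return per_version, combined


accesses = [
    ("dataset-a", "v1"),
    ("dataset-a", "v1"),
    ("dataset-a", "v2"),
    ("dataset-b", "v1"),
]
per_version, combined = usage_by_version(accesses)
print(per_version[("dataset-a", "v1")], combined["dataset-a"])  # prints 2 3
```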
Research data can be retrieved in a wide variety of file formats, unlike text-based scholarly resources. For the Code of Practice for Research Data Usage Metrics we will not break down requests by file format. We will include the volume of data transferred as part of the reporting, since the variations are much greater than for other scholarly resources. Reporting data transfer in addition to the number of requests and investigations also helps with understanding differences between data repositories with regard to how data are packaged and made available for retrieval.
The Code of Practice for Research Data Usage Metrics enables different data repositories to report usage metrics following common best practices, and is thus an essential step towards usage metrics that help us understand how publicly available research data are being reused. This complements ongoing work on establishing best practices and services for data citation (Burton, Fenner, Haak, & Manghi, 2017).