The COKI Open Access Dataset is available in JSON Lines format. See below for dataset releases, license, how to cite the website and dataset, attributions and the dataset schema.
Releases
Release date | File |
---|---|
2024-11-10 | Download coki-oa-dataset.zip |
2024-10-07 | Download coki-oa-dataset.zip |
2024-09-16 | Download coki-oa-dataset.zip |
2024-08-13 | Download coki-oa-dataset.zip |
2024-07-29 | Download coki-oa-dataset.zip |
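Each release is a zip archive, and the dataset itself is in JSON Lines format (one JSON object per line). The sketch below shows one way to stream records from such an archive without unpacking it to disk. The helper name `read_jsonl_from_zip` and the member name used in the usage note are assumptions for illustration; check the actual file names inside the downloaded archive.

```python
import io
import json
import zipfile
from typing import Iterator


def read_jsonl_from_zip(zip_path, member: str) -> Iterator[dict]:
    """Yield one dict per non-empty line of a JSON Lines file stored in a zip.

    `zip_path` may be a file path or a binary file-like object.
    `member` is the name of the JSON Lines file inside the archive
    (hypothetical here; list the archive contents to find the real names).
    """
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as raw:
            # Decode the binary stream as UTF-8 text, line by line.
            for line in io.TextIOWrapper(raw, encoding="utf-8"):
                line = line.strip()
                if line:  # skip blank lines
                    yield json.loads(line)
```

For example, `for record in read_jsonl_from_zip("coki-oa-dataset.zip", "country.jsonl"): ...` would iterate over records, assuming the archive contains a member with that name. `zipfile.ZipFile(path).namelist()` lists the members that are actually present.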
The COKI Open Access Dataset © 2022 by Curtin University is licensed under CC BY 4.0.
Citing
To cite the COKI Open Access Dashboard, please use the following citation:
Diprose, J., Hosking, R., Rigoni, R., Roelofs, A., Chien, T., Napier, K., Wilson, K., Huang, C., Handcock, R., Montgomery, L., & Neylon, C. (2023). A User-Friendly Dashboard for Tracking Global Open Access Performance. The Journal of Electronic Publishing 26(1). doi: https://doi.org/10.3998/jep.3398
If you use the website code, please cite it as below:
James P. Diprose, Richard Hosking, Richard Rigoni, Aniek Roelofs, Alex Massen-Hane, Kathryn R. Napier, Tuan-Yow Chien, Katie S. Wilson, Lucy Montgomery, & Cameron Neylon. (2022). COKI Open Access Website. Zenodo. https://doi.org/10.5281/zenodo.6374486
If you use this dataset, please cite it as below:
Richard Hosking, James P. Diprose, Aniek Roelofs, Tuan-Yow Chien, Lucy Montgomery, & Cameron Neylon. (2022). COKI Open Access Dataset [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6399462
For other citation formats, follow the doi.org links in the above citations.
The COKI Open Access Dataset contains information from:
Field | Type | Description |
---|---|---|
id | String | The country id; an ISO 3166-1 alpha-3 country code. |
name | String | The country name. |
subregion | String | The name of the subregion the country is located in. |
region | String | The name of the region the country is located in. |
start_year | Integer | The start year of data used to calculate the statistics. |
end_year | Integer | The end year of data used to calculate the statistics. |
stats | PublicationStats | The aggregated publication statistics for this country, for all time. |
years | List<Year> | The publication statistics for each year. |
Table 1. Country Schema.
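As an illustration of Table 1, a country record could be mapped onto a typed container as sketched below. The `Country` class and its `from_record` method are hypothetical names, not part of the dataset or any COKI library; `stats` and `years` are kept as plain dictionaries here, following the PublicationStats and Year schemas.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Country:
    """Typed view of one country record (fields as listed in Table 1)."""

    id: str                      # ISO 3166-1 alpha-3 country code
    name: str
    subregion: str
    region: str
    start_year: int
    end_year: int
    stats: Dict[str, Any]        # PublicationStats, all time
    years: List[Dict[str, Any]]  # one Year entry per year

    @classmethod
    def from_record(cls, record: Dict[str, Any]) -> "Country":
        country_id = record["id"]
        # Alpha-3 codes are three uppercase ASCII letters.
        if not (len(country_id) == 3 and country_id.isascii()
                and country_id.isalpha() and country_id.isupper()):
            raise ValueError(f"not an alpha-3 country code: {country_id!r}")
        return cls(
            id=country_id,
            name=record["name"],
            subregion=record["subregion"],
            region=record["region"],
            start_year=int(record["start_year"]),
            end_year=int(record["end_year"]),
            stats=record["stats"],
            years=list(record["years"]),
        )
```

A missing field raises `KeyError`, which makes schema drift between releases visible early rather than silently producing partial objects.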
Field | Type | Description |
---|---|---|
id | String | The institution id; a Research Organization Registry identifier. |
name | String | The institution name. |
country_name | String | The name of the country where the institution is located. |
country_code | String | The ISO 3166-1 alpha-3 (three-letter) code of the country where the institution is located. |
subregion | String | The name of the subregion where the institution is located. |
region | String | The name of the region where the institution is located. |
institution_types | List<String> | A list of institution types that apply to this institution. Each instance can be one of: Education, Healthcare, Company, Archive, Nonprofit, Government, Facility, Other. |
start_year | Integer | The start year of data used to calculate the statistics. |
end_year | Integer | The end year of data used to calculate the statistics. |
stats | PublicationStats | The aggregated publication statistics for this institution, for all time. |
years | List<Year> | The publication statistics for each year. |
Table 2. Institution Schema.
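The `country_code` and `institution_types` fields in Table 2 lend themselves to simple filtering. The following is a minimal sketch, assuming the institution records have already been loaded as dictionaries; `filter_institutions` is a hypothetical helper name, and the set of allowed types is copied from the table above.

```python
from typing import Any, Dict, Iterable, List, Optional

# Institution types listed in Table 2.
INSTITUTION_TYPES = frozenset({
    "Education", "Healthcare", "Company", "Archive",
    "Nonprofit", "Government", "Facility", "Other",
})


def filter_institutions(
    records: Iterable[Dict[str, Any]],
    country_code: Optional[str] = None,
    institution_type: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Return the records matching every filter that is not None."""
    if institution_type is not None and institution_type not in INSTITUTION_TYPES:
        raise ValueError(f"unknown institution type: {institution_type!r}")
    matches = []
    for record in records:
        if country_code is not None and record["country_code"] != country_code:
            continue
        # institution_types is a list, so test membership rather than equality.
        if institution_type is not None and institution_type not in record["institution_types"]:
            continue
        matches.append(record)
    return matches
```

For instance, `filter_institutions(records, country_code="AUS", institution_type="Education")` would keep only Australian institutions whose type list includes Education.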
Field | Type | Description |
---|---|---|
n_citations | Integer | The total number of citations to the outputs. |
n_outputs | Integer | The total number of outputs published. |
n_outputs_open | Integer | The total number of open outputs. |
n_outputs_publisher_open | Integer | The total number of outputs published as Publisher Open. |
n_outputs_publisher_open_only | Integer | The total number of outputs published only as Publisher Open (and not Other Platform Open or Closed). |
n_outputs_both | Integer | The total number of outputs published that are both Publisher Open and Other Platform Open. |
n_outputs_other_platform_open | Integer | The total number of outputs published as Other Platform Open. |
n_outputs_other_platform_open_only | Integer | The total number of outputs published only as Other Platform Open (and not Publisher Open or Closed). |
n_outputs_closed | Integer | The total number of outputs published as Closed. |
n_outputs_oa_journal | Integer | Publisher Open Breakdown: the total number of outputs published in an Open Access Journal. |
n_outputs_hybrid | Integer | Publisher Open Breakdown: the total number of outputs made accessible in a Subscription Journal with an open license. |
n_outputs_no_guarantees | Integer | Publisher Open Breakdown: the total number of outputs made accessible in a Subscription Publisher with no reuse rights. |
p_outputs_open | Float | The percentage of open outputs. |
p_outputs_publisher_open | Float | The percentage of outputs published as Publisher Open. |
p_outputs_publisher_open_only | Float | The percentage of outputs published only as Publisher Open (and not Other Platform Open or Closed). |
p_outputs_both | Float | The percentage of outputs published that are both Publisher Open and Other Platform Open. |
p_outputs_other_platform_open | Float | The percentage of outputs published as Other Platform Open. |
p_outputs_other_platform_open_only | Float | The percentage of outputs published only as Other Platform Open (and not Publisher Open or Closed). |
p_outputs_closed | Float | The percentage of outputs published as Closed. |
p_outputs_oa_journal | Float | The percentage of Publisher Open outputs published in an Open Access Journal. |
p_outputs_hybrid | Float | The percentage of Publisher Open outputs made accessible in a Subscription Journal with an open license. |
p_outputs_no_guarantees | Float | The percentage of Publisher Open outputs made accessible in a Subscription Publisher with no reuse rights. |
Table 3. PublicationStats Schema.
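The relationship between the count fields and the percentage fields in Table 3 can be sketched as follows. This is an illustration under stated assumptions, not the dataset's own calculation: it assumes percentages are expressed on a 0 to 100 scale, that the main categories are taken relative to `n_outputs`, and that the Publisher Open breakdown is taken relative to `n_outputs_publisher_open` (as the descriptions "percentage of Publisher Open outputs" suggest). The function names are hypothetical.

```python
from typing import Dict

# Categories measured against all outputs.
MAIN_CATEGORIES = (
    "open", "publisher_open", "publisher_open_only", "both",
    "other_platform_open", "other_platform_open_only", "closed",
)
# Breakdown categories measured against Publisher Open outputs.
PUBLISHER_OPEN_BREAKDOWN = ("oa_journal", "hybrid", "no_guarantees")


def safe_pct(numerator: int, denominator: int) -> float:
    """Percentage on a 0-100 scale; 0.0 when the denominator is zero."""
    return 100.0 * numerator / denominator if denominator else 0.0


def derive_percentages(stats: Dict[str, int]) -> Dict[str, float]:
    """Recompute the p_outputs_* fields from the n_outputs_* counts."""
    total = stats["n_outputs"]
    publisher_open = stats["n_outputs_publisher_open"]
    pct = {}
    for name in MAIN_CATEGORIES:
        pct[f"p_outputs_{name}"] = safe_pct(stats[f"n_outputs_{name}"], total)
    for name in PUBLISHER_OPEN_BREAKDOWN:
        pct[f"p_outputs_{name}"] = safe_pct(stats[f"n_outputs_{name}"], publisher_open)
    return pct


def categories_sum_to_total(stats: Dict[str, int]) -> bool:
    """Check that the four mutually exclusive categories add up to n_outputs."""
    return (
        stats["n_outputs_publisher_open_only"]
        + stats["n_outputs_both"]
        + stats["n_outputs_other_platform_open_only"]
        + stats["n_outputs_closed"]
    ) == stats["n_outputs"]
```

Comparing the recomputed values with the stored `p_outputs_*` fields (within a small tolerance) is one way to confirm which scale and denominators a given release actually uses.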
Field | Type | Description |
---|---|---|
year | Integer | The year that this record applies to. |
date | Date | The date that this record applies to, in the format YYYY-MM-DD. The day and month are always the end of the year in question, i.e. the 31st of December. |
stats | PublicationStats | The aggregated publication statistics for the year that this record applies to. |
Table 4. Year Schema.
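Because every country and institution record carries a `years` list following Table 4, a per-year time series can be extracted with a few lines. The sketch below assumes a record already loaded as a dictionary; `yearly_series` is a hypothetical helper name. It also checks the documented convention that `date` is the 31st of December of `year`.

```python
from datetime import date
from typing import Any, Dict, List, Tuple


def yearly_series(
    entity: Dict[str, Any], field: str = "p_outputs_open"
) -> List[Tuple[int, Any]]:
    """Return (year, value) pairs for one PublicationStats field, sorted by year."""
    series = []
    for entry in entity["years"]:
        record_date = date.fromisoformat(entry["date"])  # expects YYYY-MM-DD
        expected = date(entry["year"], 12, 31)
        if record_date != expected:
            raise ValueError(
                f"date {entry['date']!r} is not the end of year {entry['year']}"
            )
        series.append((entry["year"], entry["stats"][field]))
    # Sort by year so the result is ordered even if the input is not.
    return sorted(series, key=lambda pair: pair[0])
```

Passing a different `field`, such as `"n_outputs"`, yields the corresponding series for any other PublicationStats field.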