The CWTS Leiden Ranking
The Leiden Ranking, produced by the Centre for Science and Technology Studies (CWTS) at Leiden University in the Netherlands, is widely regarded as a leading global university ranking. Unlike many other ranking systems, it is built on well-established scientometric principles and indicators. It evaluates the scientific standing of universities using research publications indexed in the Web of Science citation database of Clarivate Analytics.
The 2023 Leiden Ranking is based on core publications (articles and review articles) published from 2018 to 2021 and indexed in the science, social science, and arts and humanities citation indexes of the Web of Science database. Core publications are defined as publications in core international scientific journals that are suitable for citation analysis: they are typically written in English, attributed to identifiable authors, and have not been retracted.
To be included in the 2023 Leiden Ranking, a university must have at least 800 articles in the Web of Science database. Articles are assigned to universities using the fractional counting method, which gives reduced weight to publications resulting from collaboration between multiple universities. The data are further enriched with citation matching, geocoding, open access status, and the gender of authors.
The 2023 Leiden Ranking counts only citations (excluding self-citations) received up to the end of 2022, the first year after the publication window. Iranian universities have been included in the Leiden Ranking since 2013, reflecting the quantitative and qualitative growth of their scientific output, their increased scientific engagement, and their growing international standing.
Outlined below are the general criteria and indicators of the Leiden ranking system:
- Utilization of advanced scientific principles and scientometric indicators
- Data sourced from research works published from 2018 to 2021
- Core research works indexed in the Web of Science database
- Requirement of a minimum of 800 articles in the database for university ranking
- Utilization of the Fractional Counting method for article assignment
- Data enrichment through citation matching, geographic coding, open access, and gender diversity of authors
- Sole consideration of citations until the end of 2022 in the 2023 ranking
- Exclusion of self-citations in the calculation of indicators.
The indicators available in the Leiden Ranking are discussed in detail below.
• Size-dependent vs. size-independent indicators
Indicators included in the Leiden Ranking have two variants: A size-dependent and a size-independent variant. In general, size-dependent indicators are obtained by counting the absolute number of publications of a university that have a certain property, while size-independent indicators are obtained by calculating the proportion of the publications of a university with a certain property. For instance, the number of highly cited publications of a university and the number of publications of a university co-authored with other organizations are size-dependent indicators. The proportion of the publications of a university that are highly cited and the proportion of a university’s publications co-authored with other organizations are size-independent indicators. In the case of size-dependent indicators, universities with a larger publication output tend to perform better than universities with a smaller publication output. Size-independent indicators have been corrected for the size of the publication output of a university. Hence, when size-independent indicators are used, both larger and smaller universities may perform well.
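This distinction can be illustrated with a short Python sketch over hypothetical data (the universities and highly-cited flags below are invented purely for illustration):

```python
# Hypothetical data: each university maps to a list of publications,
# with True marking a highly cited publication.
pubs = {
    "Univ A": [True, False, True, True, False, False, False, False, False, False],
    "Univ B": [True, False, False, False],
}

def p_highly_cited(pub_flags):
    """Size-dependent variant: absolute number of highly cited publications."""
    return sum(pub_flags)

def pp_highly_cited(pub_flags):
    """Size-independent variant: proportion of highly cited publications."""
    return sum(pub_flags) / len(pub_flags)
```

Here the larger university wins on the size-dependent count (3 vs. 1), while the proportions (0.30 vs. 0.25) put the two universities on a comparable footing.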
• Scientific impact indicators
The Leiden Ranking provides the following indicators of scientific impact:
P. Total number of publications of a university.
P(top 1%) and PP(top 1%). The number and the proportion of a university’s publications that, compared with other publications in the same field and in the same year, belong to the top 1% most frequently cited.
P(top 5%) and PP(top 5%). The number and the proportion of a university’s publications that, compared with other publications in the same field and in the same year, belong to the top 5% most frequently cited.
P(top 10%) and PP(top 10%). The number and the proportion of a university’s publications that, compared with other publications in the same field and in the same year, belong to the top 10% most frequently cited.
P(top 50%) and PP(top 50%). The number and the proportion of a university’s publications that, compared with other publications in the same field and in the same year, belong to the top 50% most frequently cited.
TCS and MCS. The total and the average number of citations of the publications of a university.
TNCS and MNCS. The total and the average number of citations of the publications of a university, normalized for field and publication year. An MNCS value of two for instance means that the publications of a university have been cited twice above the average of their field and publication year.
Citations are counted until the end of 2023 in the calculation of the above indicators. Author self–citations are excluded. All indicators except for TCS and MCS are normalized for differences in citation patterns between scientific fields. For the purpose of this field normalization, about 4000 fields are distinguished. These fields are defined at the level of individual publications. Using a computer algorithm, each publication in Web of Science is assigned to a field based on its citation relations with other publications.
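The field- and year-normalization behind TNCS and MNCS can be sketched in Python. The records below are invented for illustration; each publication's citation count is divided by the average citation count of its (field, year) cell:

```python
from collections import defaultdict

# Hypothetical records: (university, field, year, citations).
records = [
    ("Univ A", "physics", 2020, 10),
    ("Univ A", "physics", 2020, 0),
    ("Univ B", "physics", 2020, 5),
    ("Univ A", "biology", 2021, 8),
    ("Univ B", "biology", 2021, 8),
]

# Average citations per (field, year) cell across all publications.
cell = defaultdict(list)
for _, field, year, cits in records:
    cell[(field, year)].append(cits)
cell_mean = {k: sum(v) / len(v) for k, v in cell.items()}

# Normalized citation score of a publication = citations / cell average.
scores = defaultdict(list)
for uni, field, year, cits in records:
    scores[uni].append(cits / cell_mean[(field, year)])

tncs = {u: sum(s) for u, s in scores.items()}            # total
mncs = {u: sum(s) / len(s) for u, s in scores.items()}   # mean
```

An MNCS of 1.0, as both toy universities obtain here, means a university is cited exactly at the world average of its fields and publication years.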
The TCS, MCS, TNCS, and MNCS indicators are not available on the main ranking page. These indicators can be accessed by clicking on the name of a university. An overview of all bibliometric statistics available for the university will then be presented. This overview also includes the TCS, MCS, TNCS, and MNCS indicators.
• Collaboration indicators
The Leiden Ranking provides the following indicators of collaboration:
P. Total number of publications of a university.
P(collab) and PP(collab). The number and the proportion of a university’s publications that have been co-authored with one or more other organizations.
P(int collab) and PP(int collab). The number and the proportion of a university’s publications that have been co-authored by researchers based in two or more countries.
P(industry) and PP(industry). The number and the proportion of a university’s publications that have been co-authored with one or more industrial organizations. All private-sector for-profit business enterprises, covering all manufacturing and services sectors, are regarded as industrial organizations. This includes research institutes and other corporate R&D laboratories that are fully funded or owned by for-profit business enterprises. Organizations in the private education sector and the private medical/health sector (including hospitals and clinics) are not classified as industrial organizations.
P(<100 km) and PP(<100 km). The number and the proportion of a university’s publications with a geographical collaboration distance of less than 100 km. The geographical collaboration distance of a publication equals the largest geographical distance between two addresses mentioned in the publication’s address list.
P(>5000 km) and PP(>5000 km). The number and the proportion of a university’s publications with a geographical collaboration distance of more than 5000 km.
Some limitations of the above indicators need to be mentioned. In the case of the P(industry) and PP(industry) indicators, we have made an effort to identify industrial organizations as accurately as possible. Inevitably, however, there will be inaccuracies and omissions in the identification of industrial organizations. In the case of the P(<100 km), PP(<100 km), P(>5000 km), and PP(>5000 km) indicators, we rely on geocoding of addresses listed in Web of Science. There may be some inaccuracies in the geocoding that we have performed, and for addresses that are used infrequently no geocodes may be available. In general, we expect these inaccuracies and omissions to have only a small effect on the indicators.
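The geographical collaboration distance of a single publication can be sketched as the largest great-circle distance between any two geocoded addresses. The sketch below uses the haversine formula; the coordinates are approximate locations of Leiden, Delft, and Tokyo, used purely as an example:

```python
from itertools import combinations
from math import radians, sin, cos, asin, sqrt

def haversine_km(p, q):
    """Great-circle distance in km between two (latitude, longitude) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*p, *q))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def collaboration_distance_km(addresses):
    """Largest pairwise distance among the geocoded addresses of one publication."""
    if len(addresses) < 2:
        return 0.0
    return max(haversine_km(p, q) for p, q in combinations(addresses, 2))

# Hypothetical publication with addresses in Leiden, Delft, and Tokyo.
addresses = [(52.16, 4.49), (52.01, 4.36), (35.68, 139.69)]
distance = collaboration_distance_km(addresses)
```

A Leiden–Delft-only publication would fall under P(<100 km), while the Leiden–Tokyo pair pushes this publication above the 5000 km threshold.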
• Open access indicators
The Leiden Ranking provides the following indicators of open access publishing:
P. Total number of publications of a university.
P(OA) and PP(OA). The number and the proportion of open access publications of a university.
P(gold OA) and PP(gold OA). The number and the proportion of gold open access publications of a university. Gold open access publications are publications in an open access journal.
P(hybrid OA) and PP(hybrid OA). The number and the proportion of hybrid open access publications of a university. Hybrid open access publications are publications in a subscription journal that are open access with a license that allows the publication to be reused.
P(bronze OA) and PP(bronze OA). The number and the proportion of bronze open access publications of a university. Bronze open access publications are publications in a subscription journal that are open access without a license that allows the publication to be reused.
P(green OA) and PP(green OA). The number and the proportion of green open access publications of a university. Green open access publications are publications in a subscription journal that are open access not in the journal itself but in a repository.
P(OA unknown) and PP(OA unknown). The number and the proportion of a university’s publications for which the open access status is unknown. These publications typically do not have a DOI in the Web of Science database.
In the calculation of the P(OA) and PP(OA) indicators, a publication is considered open access if it is gold, hybrid, bronze, or green open access. The open access status of a publication is determined based on OpenAlex data.
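The classification logic above can be sketched as a simple decision cascade. The publication record schema below (boolean fields such as `in_oa_journal`) is hypothetical; the actual Leiden Ranking derives these statuses from OpenAlex data:

```python
def oa_status(pub):
    """Classify a publication record into the open access categories above.

    `pub` is a hypothetical dict with boolean fields: has_doi, in_oa_journal,
    free_at_publisher, reuse_license, in_repository.
    """
    if not pub.get("has_doi"):
        return "unknown"            # no DOI: open access status cannot be determined
    if pub.get("in_oa_journal"):
        return "gold"               # published in an open access journal
    if pub.get("free_at_publisher"):
        # subscription journal, but freely available at the publisher
        return "hybrid" if pub.get("reuse_license") else "bronze"
    if pub.get("in_repository"):
        return "green"              # open access via a repository only
    return "closed"
```

P(OA) then counts the publications classified as gold, hybrid, bronze, or green.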
• Gender indicators
The Leiden Ranking provides the following indicators of gender diversity:
A. The total number of authorships of a university. Consider for instance a publication that has five authors, of which three report university X as their affiliation and two report university Y as their affiliation. This publication then yields three authorships for university X and two authorships for university Y.
A(MF). The number of male and female authorships of a university, that is, a university’s number of authorships for which the gender is known.
A(unknown) and PA(unknown). The number of authorships of a university for which the gender is unknown and the number of authorships for which the gender is unknown as a proportion of a university’s total number of authorships.
A(M), PA(M), and PA(M|MF). The number of male authorships of a university, the number of male authorships as a proportion of a university’s total number of authorships, and the number of male authorships as a proportion of a university’s number of male and female authorships.
A(F), PA(F), and PA(F|MF). The number of female authorships of a university, the number of female authorships as a proportion of a university’s total number of authorships, and the number of female authorships as a proportion of a university’s number of male and female authorships.
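The relation between these counts and proportions can be sketched on a single hypothetical publication (the affiliations and genders below are invented; `None` marks an unknown gender):

```python
from collections import Counter

# One hypothetical publication: (affiliation, gender) per author.
authorships = [
    ("Univ X", "F"), ("Univ X", "M"), ("Univ X", None),
    ("Univ Y", "F"), ("Univ Y", "M"),
]

def gender_indicators(authorships, university):
    """Compute A, A(MF), PA(unknown), and PA(F|MF) for one university."""
    counts = Counter(g for uni, g in authorships if uni == university)
    a = sum(counts.values())            # A: total authorships
    a_mf = counts["M"] + counts["F"]    # A(MF): authorships with known gender
    return {
        "A": a,
        "A(MF)": a_mf,
        "PA(unknown)": counts[None] / a,
        "PA(F|MF)": counts["F"] / a_mf if a_mf else 0.0,
    }
```

For university X this yields three authorships, of which two have a known gender, so PA(F|MF) = 1/2 even though PA(F) would be 1/3.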
For each authorship of a university, the gender is determined using the following four-step procedure:
Author disambiguation. Using an author disambiguation algorithm developed by CWTS, authorships are linked to authors. If there is sufficient evidence to assume that different publications have been authored by the same individual, the algorithm links the corresponding authorships to the same author.
Author-country linking. Each author is linked to one or more countries. If the country of the author’s first publication is the same as the country occurring most often in the author’s publications, the author is linked to this country. Otherwise, the author is linked to all countries occurring in his or her publications.
Retrieval of gender statistics. For each author, gender statistics are collected from three sources: Gender API, Genderize.io, and Gender Guesser. Gender statistics are obtained based on the first name of an author and the countries to which the author is linked.
Gender assignment. For each author, a gender (male or female) is assigned if Gender API is able to determine the gender with a reported accuracy of at least 90%. If Gender API does not recognize the first name of an author, Gender Guesser and Genderize.io are used. If none of these sources is able to determine the gender of an author with sufficient accuracy, the gender is considered unknown. For authors from Russia and a number of other countries, the last name is also used to determine the gender of the author.
Using the above procedure, the gender can be determined for about 70% of all authorships of universities included in the Leiden Ranking. For the remaining authorships, the gender is unknown.
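The gender-assignment step of this procedure amounts to a cascade over name-based sources with an accuracy threshold. The sketch below uses a stub lookup standing in for Gender API, Genderize.io, and Gender Guesser, whose real interfaces differ:

```python
def assign_gender(first_name, country, sources):
    """Try an ordered list of gender sources (hypothetical interface).

    Each source is a callable returning (gender, accuracy) for a recognized
    first name, or None otherwise. A gender is assigned only when a source
    reports at least 90% accuracy.
    """
    for lookup in sources:
        result = lookup(first_name, country)
        if result is None:
            continue  # name not recognized by this source; try the next one
        gender, accuracy = result
        if accuracy >= 0.90:
            return gender
    return "unknown"

# Stub standing in for the real name-based gender sources.
def stub_source(first_name, country):
    table = {("Maria", "NL"): ("F", 0.98), ("Andrea", "IT"): ("M", 0.55)}
    return table.get((first_name, country))
```

A name recognized only with low accuracy (such as the gender-ambiguous "Andrea" in this stub) is left as unknown, mirroring the roughly 30% of authorships for which no gender can be determined.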
• Counting method
The scientific impact indicators in the Leiden Ranking can be calculated using either a full counting or a fractional counting method. The full counting method gives a full weight of one to each publication of a university. The fractional counting method gives less weight to collaborative publications than to non-collaborative ones. For instance, if a publication has been co-authored by five researchers and two of these researchers are affiliated with a particular university, the publication has a weight of 2 / 5 = 0.4 in the calculation of the scientific impact indicators for this university. The fractional counting method leads to a more proper field normalization of scientific impact indicators and therefore to fairer comparisons between universities active in different fields. For this reason, fractional counting is the preferred counting method for the scientific impact indicators in the Leiden Ranking. Collaboration, open access, and gender indicators are always calculated using the full counting method.
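The two counting methods reduce to a simple weighting rule per publication, sketched below (the function name is ours, not part of the Leiden Ranking):

```python
def publication_weight(n_authors, n_from_university, method):
    """Weight of one publication for a university under each counting method."""
    if n_from_university == 0:
        return 0.0  # publication does not belong to this university
    if method == "full":
        return 1.0  # every publication of the university counts fully
    if method == "fractional":
        # weight proportional to the university's share of the authors
        return n_from_university / n_authors
    raise ValueError(f"unknown counting method: {method}")
```

For the example in the text (two of five authors affiliated with the university), fractional counting gives 2 / 5 = 0.4, while full counting gives 1.0.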
• Trend analysis
To facilitate trend analyses, the Leiden Ranking provides statistics not only based on publications from the period 2019–2022, but also based on publications from earlier periods: 2006–2009, 2007–2010, ..., 2018–2021. The statistics for the different periods are calculated in a fully consistent way. For each period, citations are counted until the end of the first year after the period has ended. For instance, in the case of the period 2006–2009 citations are counted until the end of 2010, while in the case of the period 2019–2022 citations are counted until the end of 2023.
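The rolling four-year windows and their citation cutoffs follow a fixed rule (citations counted until the end of the first year after the window), which can be generated directly:

```python
# Rolling four-year publication windows, 2006-2009 through 2019-2022,
# each paired with the year until which citations are counted (end year + 1).
periods = [(start, start + 3) for start in range(2006, 2020)]
cutoffs = {(start, end): end + 1 for start, end in periods}
```

This reproduces the examples in the text: the 2006–2009 window is cited until the end of 2010, and the 2019–2022 window until the end of 2023.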
• Stability intervals
Stability intervals provide some insight into the uncertainty in bibliometric statistics. A stability interval indicates a range of values of an indicator that are likely to be observed when the underlying set of publications changes. For instance, the PP(top 10%) indicator may be equal to 15.3% for a particular university, with a stability interval ranging from 14.1% to 16.5%. This means that the PP(top 10%) indicator equals 15.3% for this university, but that changes in the set of publications of the university may relatively easily lead to PP(top 10%) values in the range from 14.1% to 16.5%. The Leiden Ranking employs 95% stability intervals constructed using a statistical technique known as bootstrapping.
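Bootstrapping such a stability interval can be sketched in a few lines: resample the university's publications with replacement, recompute the indicator each time, and take the 2.5th and 97.5th percentiles. The publication set below is invented (150 of 1000 publications in the top 10%), and the resample count is a choice of this sketch, not the official procedure:

```python
import random

def bootstrap_interval(top10_flags, n_resamples=2000, seed=42):
    """95% stability interval for PP(top 10%) via bootstrap resampling."""
    rng = random.Random(seed)
    n = len(top10_flags)
    stats = sorted(
        sum(rng.choices(top10_flags, k=n)) / n  # PP(top 10%) of one resample
        for _ in range(n_resamples)
    )
    return stats[int(0.025 * n_resamples)], stats[int(0.975 * n_resamples)]

# Hypothetical university: 150 of its 1000 publications are in the top 10%.
flags = [1] * 150 + [0] * 850
low, high = bootstrap_interval(flags)
```

The point estimate PP(top 10%) = 15% always lies inside the resulting interval; the width of the interval shrinks as the university's publication output grows.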