Data & Methodology

The Indian Science Reports portal provides indicators and values on research output, citations, authorship, international collaboration, gender distribution, open access, research grants, and social media visibility for Indian research output during 2010-19. The portal not only presents analytical results about Indian research but also compares the values with the corresponding values for some other major countries. It also includes institutional reports containing different analyses for 1000 Indian institutions.

The data for analysis are obtained from multiple sources and processed using standard scientometric, computational, network-theoretic and text-based methods. The Dimensions scholarly database is taken as the primary source of research metadata, and inputs from several other sources, such as Gender API, are also used. The portal only provides computed indicators of Indian scientific research and does not in any way expose any data taken from the different sources.

In terms of methodological choices, we have used whole counting for research output values. Citations are analysed both as absolute counts and as relative citation ratio. The cited percentage of research output is computed as the proportion of research output that received at least one citation. Collaboration patterns are identified from the author affiliation field, and a research paper is considered an instance of international collaboration if it involves authors from at least two countries. A research paper is called Domestic-Single Institution if it involves authors from a single institution only. A research paper is denoted as Domestic-Multi Institution if it involves authors from at least two different institutions of the same country.
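The counting rules in this paragraph can be illustrated with a minimal sketch. It is not the portal's actual code: the record layout (a list of `(institution, country)` pairs plus a `citations` count), the function names and the sample records are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def collaboration_type(affiliations):
    """Classify a paper from its (institution, country) affiliation pairs.

    The pair layout is an assumption made for this sketch.
    """
    countries = {country for _, country in affiliations}
    institutions = {institution for institution, _ in affiliations}
    if len(countries) >= 2:
        return "International"
    if len(institutions) >= 2:
        return "Domestic-Multi Institution"
    return "Domestic-Single Institution"


def whole_counts(papers):
    """Whole counting: every country on a paper receives a full credit of 1."""
    counts = Counter()
    for paper in papers:
        for country in {country for _, country in paper["affiliations"]}:
            counts[country] += 1
    return counts


def cited_percentage(papers):
    """Percentage of papers that received at least one citation."""
    if not papers:
        return 0.0
    cited = sum(1 for paper in papers if paper["citations"] >= 1)
    return 100.0 * cited / len(papers)


# Made-up records, for illustration only.
papers = [
    {"affiliations": [("Institute A", "India"), ("Institute X", "Germany")], "citations": 12},
    {"affiliations": [("Institute A", "India"), ("Institute A", "India")], "citations": 0},
    {"affiliations": [("Institute A", "India"), ("Institute B", "India")], "citations": 3},
    {"affiliations": [("Institute B", "India")], "citations": 0},
]

types = [collaboration_type(p["affiliations"]) for p in papers]
counts = whole_counts(papers)
share_cited = cited_percentage(papers)
```

Under whole counting, the first sample paper adds one full credit to both India and Germany, in contrast to fractional counting, which would split a single credit between them.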

The gender of the first author of each research paper is determined using the Gender API service. Based on the value returned by the API, a research paper is categorised as female 1st authored or male 1st authored, depending on whether the first author is female or male, respectively. A confidence threshold of 70% is used, i.e. gender determination by the API is taken as successful only if the returned accuracy is more than 70%.
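The thresholding step can be sketched as follows. The sketch does not call the service. It assumes the API response has already been parsed into a dictionary with `gender` and `accuracy` keys, and the function name and the choice to return `None` for unsuccessful determinations are hypothetical.

```python
ACCURACY_THRESHOLD = 70  # per cent; the accuracy must be strictly greater than this


def first_author_gender(response, threshold=ACCURACY_THRESHOLD):
    """Return "female" or "male" if the determination is accepted, else None.

    `response` is assumed to be a parsed API reply such as
    {"gender": "female", "accuracy": 98}; this shape is an assumption.
    """
    gender = response.get("gender")
    accuracy = response.get("accuracy", 0)
    if gender in ("female", "male") and accuracy > threshold:
        return gender
    return None


accepted = first_author_gender({"gender": "female", "accuracy": 98})
borderline = first_author_gender({"gender": "male", "accuracy": 70})
unknown = first_author_gender({"gender": "unknown", "accuracy": 0})
```

Here an accuracy of exactly 70 is rejected, because the rule above requires more than 70%. Papers for which the function returns `None` would simply be left out of the female and male first-authored categories.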

The data for grants and open access availability are taken as obtained from the Dimensions database. Grants data for both domestic and international funding agencies are obtained and analysed. Open access availability in its different forms (gold, green, bronze, hybrid) is captured and analysed.

Several indicators, such as CAGR and h-type indices, are computed as per their standard definitions. The computation of the x-index and x(g) index follows the idea proposed in Lathabai, Nandy & Singh (2021). Some external data (such as international and NIRF rankings) for selected Indian institutions are also obtained from the respective sources and presented as is. The descriptions of institutions are obtained from Wikipedia, and their logos are obtained by web scraping of Wikipedia and the institutional webpages.
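For reference, the standard definitions of two of these indicators can be written as a short sketch. It assumes CAGR is taken over the number of yearly intervals between the first and last year, and shows the basic h-index as one representative of the h-type family. The function names are hypothetical, and the x-index and x(g) index are not covered here.

```python
def cagr(start_value, end_value, periods):
    """Compound annual growth rate: (end / start) ** (1 / periods) - 1."""
    if start_value <= 0 or periods <= 0:
        raise ValueError("start_value and periods must be positive")
    return (end_value / start_value) ** (1 / periods) - 1


def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Made-up numbers, for illustration only.
growth = cagr(1000, 8000, 3)      # output growing eightfold over 3 intervals
h = h_index([10, 8, 5, 4, 3])     # five papers with these citation counts
```

For a window such as 2010-19, this convention would give nine yearly intervals between the first and last year.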