
My Journey In Writing A Bibliometric Analysis Paper

June 1, 2025 by Jonah Hall

Hello everyone! I am Richard D. Amoako, a third-year doctoral student in Evaluation, Statistics, and Methodology at the University of Tennessee, Knoxville. I recently completed a bibliometric analysis paper for my capstone project on Data Visualization and Communication in Evaluation. Bibliometrics offers a powerful way to quantify research trends, map scholarly networks, and identify gaps in the literature, making it an invaluable research method for evaluators and researchers alike. 

Learning bibliometrics isn’t always straightforward. Between choosing the right database, wrangling APIs, and figuring out which R or Python packages won’t crash your laptop, there’s a steep learning curve. That’s why I’m writing this: to share the lessons, tools, and occasional frustrations I’ve picked up along the way. Whether you’re an evaluator looking to map trends in your field or a researcher venturing into bibliometrics for the first time, I hope this post saves you time, sanity, and a few coding headaches. Let’s explore the methodology, applications, and resources that shaped my project. 

Understanding Bibliometric Analysis 

Bibliometric analysis is the systematic study of academic publications through quantitative methods: examining citations, authorship patterns, and keyword frequencies to reveal research trends. It differs from traditional literature reviews by delivering data-driven insights into how knowledge evolves within a field. Common applications include identifying influential papers, mapping collaboration networks, and assessing journal impact (Donthu et al., 2021; Van Raan, 2018; Zupic & Čater, 2015). 

For evaluators, this approach is particularly valuable. It helps track the adoption of evaluation frameworks, measure scholarly influence, and detect emerging themes, such as how data visualization has gained traction in recent years. My interest in bibliometrics began while reviewing literature for my capstone project. Faced with hundreds of papers, I needed a way to analyze trends objectively rather than rely on subjective selection. Bibliometrics provides that structure, turning scattered research into actionable insights. 

Key Steps in Writing a Bibliometric Paper 

Defining Research Objectives 
The foundation of any successful bibliometric study lies in crafting a precise research question. For my capstone on data visualization in evaluation literature, I focused on: "How has the application of data visualization techniques evolved in program evaluation research from 2010 to 2025?" This specificity helped me avoid irrelevant data while maintaining analytical depth. Before finalizing my question, I reviewed existing systematic reviews to identify underexplored areas, a crucial step that prevented duplication of prior work. When brainstorming and refining your question, generative AI tools (e.g., ChatGPT, Claude, Perplexity, Google Gemini, Microsoft Copilot, DeepSeek) can help you sharpen and clarify your ideas. 

Database Selection and Data Collection 
Choosing the right database significantly impacts study quality. After comparing options, I selected Scopus for its comprehensive coverage of social science literature and robust citation metrics. While Web of Science (WoS) offers stronger impact metrics, its limited coverage of evaluation journals made it less suitable, although I did examine its potential as a supplementary source. Google Scholar's expansive but uncurated collection proved too noisy for systematic analysis. Scopus's ability to export 2,000 records at a time, complete with metadata such as author affiliations and countries, proved invaluable for my collaboration mapping. 

Data Extraction and Automation 
To efficiently handle large datasets, I leveraged R's bibliometrix package. You can also automate data extraction through the Scopus API (Application Programming Interface), as sketched below. APIs let software systems communicate with each other, so researchers can pull database records (from Scopus, WoS, and similar sources) without manual downloading. To access the Scopus API, request a key via Elsevier's Developer Portal. 

Pros: Good for large-scale, reproducible retrieval. Cons: Requires API key approval (which can take days or weeks). 
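If you go the API route, the rscopus package can handle the requests for you. Here is a minimal sketch, assuming you already have an approved API key from Elsevier's Developer Portal; the query string and record limits are illustrative, not the exact settings from my project: 

library(rscopus) 

set_api_key("YOUR_SCOPUS_API_KEY")  # paste your approved Elsevier key here 

# Search Scopus and page through results automatically (illustrative limits) 
res <- scopus_search( 
  query = 'TITLE-ABS-KEY("data visualization" AND "evaluation")', 
  count = 25,        # records returned per request 
  max_count = 2000   # stop after 2,000 records 
) 

# Flatten the nested entries into a data frame for cleaning in R 
df <- gen_entries_to_df(res$entries)$df 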

For targeted bibliometric searches, carefully construct your keyword strings using Boolean operators (AND/OR/NOT) and field tags like TITLE-ABS-KEY() to balance recall and precision. For example, my search TITLE-ABS-KEY("data visualization" AND "evaluation") retrieved 37% more relevant papers than a simple keyword search by excluding off-topic mentions in reference lists. 
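For reference, a fuller query built this way might look like the following; the synonym list and date window are examples, not my final search string: 

query <- paste0( 
  'TITLE-ABS-KEY(("data visualization" OR "data visualisation" OR "information visualization") ', 
  'AND ("program evaluation" OR "evaluation practice")) ', 
  'AND PUBYEAR > 2009 AND PUBYEAR < 2026'   # restrict to 2010-2025 
) 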

After exporting Scopus results to CSV, a simple script converted and analyzed the data (Aria & Cuccurullo, 2017): 

library(bibliometrix) 

# Import the Scopus CSV export and convert it into a bibliometrix data frame 
M <- convert2df("scopus.csv", dbsource = "scopus", format = "csv") 

# Run the core descriptive analysis (citations, authors, sources, countries) 
results <- biblioAnalysis(M) 

This approach provided immediate insights into citation patterns and author networks.  
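For a quick overview before diving deeper, the package's summary and plot methods print the headline statistics; a small sketch, assuming the results object created above: 

summary(results, k = 10, pause = FALSE)  # top 10 authors, sources, countries, keywords 
plot(results, k = 10, pause = FALSE)     # basic descriptive plots (production, citations) 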

Data Screening and Cleaning 
The initial search may return many papers; mine returned over 2,000. To narrow the results to the most relevant articles, you can apply filters such as: 

  1. Removing duplicates via DOI matching (in R: M <- M[!duplicated(M$DO), ] # remove duplicates by DOI); duplicates are common in multi-database studies. 
  2. Excluding non-journal articles 
  3. Excluding irrelevant articles that do not match your research questions or inclusion criteria 
  4. Manually reviewing random samples to verify relevance 

Additional data cleaning may be required. I used R's tidyverse (including dplyr) and janitor packages for these tasks. 
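As a rough illustration of what that cleaning looks like in practice (the field tags DO, DT, PY, and TI follow bibliometrix's naming for DOI, document type, publication year, and title; the exact filter values will depend on your export): 

library(dplyr) 
library(janitor) 

M_clean <- M %>% 
  distinct(DO, .keep_all = TRUE) %>%      # drop duplicate DOIs 
  filter(DT == "ARTICLE") %>%             # keep journal articles only 
  filter(PY >= 2010, PY <= 2025)          # restrict to the study window 

get_dupes(M_clean, TI)                    # janitor: flag any remaining duplicate titles 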

The screening process can be overwhelming and time-consuming if performed manually. Fortunately, several tools are available to assist with this task, including abstrackr, Covidence, Rayyan, ASReview, Loonlens.com, and Nested Knowledge. All of them require well-defined inclusion and exclusion criteria, so settle on those before you begin. My preferred choice is Loonlens.com, which automates screening against the specified criteria and generates a CSV file of decisions and reasons upon completion. 

Analysis and Visualization  

Key analytical approaches included (refer to the appendix below for R code and this guideline): 

  • Citation analysis to identify influential works 
  • Co-authorship network mapping to reveal collaboration patterns 
  • Keyword co-occurrence analysis to track conceptual evolution 
  • Country and institution analysis to identify geographical collaborations and impacts 

For visualization, VOSviewer creates clear keyword co-occurrence maps, while CiteSpace helps identify temporal trends. The bibliometrix package streamlined these analyses, with functions like conceptualStructure() revealing important thematic connections. Visualization adjustments (like setting minimum node frequencies) transformed initial “hairball” network diagrams into clear, interpretable maps.  
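As an example of the kind of adjustment I mean, here is a minimal bibliometrix sketch of a keyword co-occurrence map; the node limit of 30 is illustrative, not the exact threshold from my paper: 

# Build the keyword co-occurrence matrix from the bibliographic data frame 
NetMatrix <- biblioNetwork(M, analysis = "co-occurrences", network = "keywords", sep = ";") 

# Plot only the 30 most frequent keywords to avoid a "hairball" diagram 
networkPlot(NetMatrix, n = 30, type = "fruchterman", Title = "Keyword co-occurrence", labelsize = 0.8) 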

This structured approach, from precise question formulation through iterative visualization, transformed a potentially overwhelming project into manageable stages. The automation and filtering strategies proved particularly valuable, saving countless hours of manual processing while ensuring analytical rigor. 

All the R code I used for data cleaning, analysis, and visualization is available on my GitHub repository. 

Challenges & How to Overcome Them 

Bibliometric analysis comes with its fair share of hurdles. Early in my project, I hit a major roadblock when I discovered many key papers were behind paywalls. My solution? I leveraged my university's interlibrary loan and resource-sharing system and reached out directly to authors via ResearchGate to request the full text; some responded with their papers. API limits were another frustration, particularly with Scopus's weekly request cap (20,000 publications per week). I used R's httr package to space out requests systematically, grouping queries by year or keyword to stay under the limit while automating the process. In addition to using the API, you can sign in to Scopus with your institutional credentials, search manually with your key terms, and export the results in formats such as CSV, RIS, and BibTeX. 
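Here is a rough sketch of that throttling; the endpoint and header come from Scopus's documented REST API, while the query template, key handling, and one-second delay are my own illustrative choices: 

library(httr) 

api_key <- Sys.getenv("SCOPUS_API_KEY")   # keep the key out of your script 

# One query per publication year (illustrative template) 
queries <- paste0('TITLE-ABS-KEY("data visualization" AND "evaluation") AND PUBYEAR = ', 2010:2025) 

results <- list() 
for (q in queries) { 
  resp <- GET( 
    "https://api.elsevier.com/content/search/scopus", 
    query = list(query = q, count = 25), 
    add_headers("X-ELS-APIKey" = api_key) 
  ) 
  results[[q]] <- content(resp, as = "parsed")  # parsed JSON for later flattening 
  Sys.sleep(1)                                  # pause between requests to stay under the cap 
} 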

The learning curve for R's bibliometrix package nearly derailed me in week two. After hours of error messages, I discovered the package's excellent documentation and worked through its tutorial examples line by line. This hands-on approach helped me master the essential functions within a week. 

Perhaps the trickiest challenge was avoiding overinterpretation. My initial excitement at seeing strong keyword clusters nearly led me to make unsupported claims. Consult your advisor, a colleague, or an expert in your field to help you distinguish between meaningful patterns and statistical noise. For instance, I found that a seemingly important keyword connection was driven largely by a single prolific author's preferred terminology. 

For clarity, I used a consistent color scheme across visualizations to help readers quickly identify key themes: blue for methodological terms, green for application areas, and red for emerging concepts. This small touch markedly improved the readability of my visuals. 
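A small ggplot2 sketch of the idea; the keyword data frame and the category assignments below are invented purely for illustration: 

library(ggplot2) 

# One palette, reused across every figure in the paper 
theme_colors <- c(methodology = "#1f77b4", application = "#2ca02c", emerging = "#d62728") 

keyword_df <- data.frame( 
  keyword  = c("network analysis", "program evaluation", "dashboards"), 
  category = c("methodology", "application", "emerging"), 
  freq     = c(42, 35, 18) 
) 

ggplot(keyword_df, aes(x = reorder(keyword, freq), y = freq, fill = category)) + 
  geom_col() + 
  scale_fill_manual(values = theme_colors) +   # same palette across all figures 
  coord_flip() + 
  labs(x = NULL, y = "Keyword frequency", fill = "Theme") 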

Conclusion 

This journey through bibliometric analysis has transformed how I approach research. From crafting precise questions to interpreting network visualizations, these methods bring clarity to complex literature landscapes. The technical hurdles are real but manageable – the payoff in insights is worth the effort. 

For those just starting, I recommend beginning with a small pilot study, perhaps analyzing 100-200 papers on a focused topic. The skills build quickly. 

I’d love to hear about your experiences with bibliometrics or help troubleshoot any challenges you encounter. Feel free to reach out at contact@rd-amoako.com or continue the conversation on research forums and other online platforms. Let’s explore how these methods can advance our evaluation and research  practice together. 

Interested in seeing the results of my bibliometric analysis and exploring the key findings? Connect with me via LinkedIn  or my blog. 

View an interactive map of publication counts by country from my project:  publications_map.html  
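If you would like to build a similar interactive map, here is a rough plotly sketch; the country counts below are placeholders (mine came from the Scopus affiliation fields), and saveWidget writes a standalone HTML file like the one linked above: 

library(plotly) 
library(htmlwidgets) 

country_counts <- data.frame( 
  country = c("United States", "United Kingdom", "Canada"), 
  n       = c(120, 45, 30)        # placeholder counts, not my actual results 
) 

map <- plot_ly( 
  data = country_counts, 
  type = "choropleth", 
  locations = ~country, 
  locationmode = "country names", 
  z = ~n, 
  colorscale = "Blues" 
) 

saveWidget(map, "publications_map.html") 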

Bibliography 

Van Eck, N. J., & Waltman, L. (2014). Visualizing bibliometric networks. In Y. Ding, R. Rousseau, & D. Wolfram (Eds.), Measuring scholarly impact: Methods and practice (pp. 285–320). Springer. 

Aria, M., & Cuccurullo, C. (2017). bibliometrix: An R-tool for comprehensive science mapping analysis. Journal of Informetrics, 11(4), 959–975. 

Donthu, N., Kumar, S., Mukherjee, D., Pandey, N., & Lim, W. M. (2021). How to conduct a bibliometric analysis: An overview and guidelines. Journal of Business Research, 133, 285–296. https://doi.org/10.1016/j.jbusres.2021.04.070 

Liu, A., Urquía-Grande, E., López-Sánchez, P., & Rodríguez-López, Á. (2023). Research into microfinance and ICTs: A bibliometric analysis. Evaluation and Program Planning, 97, 102215. https://doi.org/10.1016/j.evalprogplan.2022.102215 

Van Raan, A. F. J. (2018). Measuring science: Basic principles and application of advanced bibliometrics. In W. Glänzel, H. F. Moed, U. Schmoch, & M. Thelwall (Eds.), Handbook of science and technology indicators. Springer. 

Waltman, L., Calero-Medina, C., Kosten, J., Noyons, E. C. M., Tijssen, R. J. W., Van Eck, N. J., & Wouters, P. (2012). The Leiden Ranking 2011/2012: Data collection, indicators, and interpretation. Journal of the American Society for Information Science and Technology, 63(12), 2419–2432. https://doi.org/10.1002/asi.22708 

Yao, S., Tang, Y., Yi, C., & Xiao, Y. (2022). Research hotspots and trend exploration on the clinical translational outcome of simulation-based medical education: A 10-year scientific bibliometric analysis from 2011 to 2021. Frontiers in Medicine, 8, 801277. https://doi.org/10.3389/fmed.2021.801277 

Zupic, I., & Čater, T. (2015). Bibliometric methods in management and organization. Organizational Research Methods, 18(3), 429–472. https://doi.org/10.1177/1094428114562629 

 Resources: 

  • Bibliometrix Tutorial 
  • Scopus API Guide 
  • VOSviewer 
  • CiteSpace Manual  

Data Screening  

abstrackr - https://www.youtube.com/watch?v=jy9NJsODtT8 

Covidence - https://www.youtube.com/watch?v=tPGuwoh834A 

Rayyan - https://www.youtube.com/watch?v=YFfzH4P6YKw&t=9s 

ASReview - https://www.youtube.com/watch?v=gBmDJ1pdPR0 

Nested Knowledge - https://www.youtube.com/watch?v=7xih-5awJuM 

R resources:  

My project repository https://github.com/amoakor/BibliometricAnalysis.git 

Packages: 

tidyverse, bibliometrix, rscopus, janitor, psych, tm 

httr package documentation: https://httr.r-lib.org/, https://github.com/r-lib/httr 

Analyzing & Visualizing Data 

  • Key Metrics to Explore (See the Bibliometrix Tutorial for more examples): 

  1. Citation Analysis: 

citations <- citations(M, field = "article", sep = ";") 

head(citations$Cited, 10) # Top 10 most cited references 

  2. Co-authorship Networks: 

NetMatrix <- biblioNetwork(M, analysis = "collaboration", network = "authors", sep = ";") 

networkPlot(NetMatrix, normalize = "salton", n = 30, type = "auto") # plot the 30 most connected authors 

  3. Keyword Trends: 

conceptualStructure(M, field = "ID", method = "CA", minDegree = 10) 

Filed Under: Evaluation Methodology Blog
