NIH Topic Explorer

Project Summary

This project uses AI-powered topic modeling to uncover and visualize the main research themes in NIH-funded grant applications. Each proposal abstract is assigned to a single research "topic" by a clustering algorithm (k-means). We then visualize how common each topic is and which NIH Institutes and Centers (ICs) fund research in that topic.
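As a rough illustration of how k-means can group abstracts, here is a minimal standard-library sketch. It makes several assumptions the project description does not confirm: abstracts are represented as unit-length bag-of-words vectors, the initial centers are chosen by a deterministic farthest-first rule, and the four sample abstracts are invented. The function names `vectorize` and `kmeans` are hypothetical, and a real pipeline would more likely rely on a library implementation.

```python
import math
from collections import Counter


def vectorize(texts):
    """Turn each text into a unit-length bag-of-words vector (an assumed representation)."""
    tokenized = [t.lower().split() for t in texts]
    vocab = sorted({w for words in tokenized for w in words})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for words in tokenized:
        vec = [0.0] * len(vocab)
        for w, c in Counter(words).items():
            vec[index[w]] = float(c)
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, n_iter=20):
    """Lloyd's k-means with farthest-first initialization; returns one label per vector."""
    centers = [list(vectors[0])]
    while len(centers) < k:
        # Next center: the point farthest from its nearest existing center.
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centers))
        centers.append(list(far))
    labels = []
    for _ in range(n_iter):
        # Assignment step: each vector joins its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(v, centers[j])) for v in vectors]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Invented example abstracts, not taken from the dataset.
abstracts = [
    "tumor immunotherapy trial in lung cancer",
    "neural circuits of memory in the mouse hippocampus",
    "lung cancer tumor genomics",
    "hippocampus memory and neural plasticity",
]
labels = kmeans(vectorize(abstracts), k=2)
print(labels)
```

With k = 2, the two cancer abstracts should share one label and the two neuroscience abstracts the other. Raising k splits the same collection into finer topics.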

What you can explore:
– Which research topics are most common (left bar chart)
– Which NIH ICs fund each topic (right bubble chart)
– Multiple levels of detail, set by the number of topics k (k = 5, 10, 15, 20, or 30)

This tool helps researchers, policymakers, and the public understand how NIH distributes funding across scientific themes.

How to Read the Visualizations

Left chart: Bars show how many applications fall under each research topic: the longer the bar, the more applications. Topic names appear next to the bars, which are sorted from most applications (top) to fewest (bottom).
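The data behind a chart like this is a count of applications per topic, sorted in descending order. The sketch below shows one way to compute it. The topic labels are invented for illustration and are not the explorer's actual topic names.

```python
from collections import Counter

# Hypothetical topic label for each application (not real data).
topic_per_application = [
    "Cancer biology",
    "Neuroscience",
    "Cancer biology",
    "Infectious disease",
    "Cancer biology",
    "Neuroscience",
]

# most_common() orders topics from most to fewest applications,
# which matches the top-to-bottom order of the bars.
bars = Counter(topic_per_application).most_common()

for topic, n in bars:
    # A text "bar": one '#' per application.
    print(f"{topic:<20} {'#' * n} ({n})")
```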

Right chart: In each topic row, a circle under an IC code shows that the institute funds that topic; the bigger the circle, the more applications. Hovering over a circle reveals the full institute name and the application count.
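Circle sizes correspond to counts of applications per (topic, IC) pair. The following sketch computes those counts from CSV-style rows and builds a hover label. The column names `topic` and `ic`, the sample rows, and the helper `hover_text` are assumptions made for illustration; the downloadable dataset may be organized differently.

```python
import csv
import io
from collections import Counter

# Hypothetical rows; the real CSV's column names and values may differ.
SAMPLE_CSV = """topic,ic
Cancer biology,NCI
Cancer biology,NCI
Cancer biology,NIAID
Neuroscience,NINDS
Neuroscience,NINDS
Infectious disease,NIAID
"""

# Full names for the IC codes used in the sample rows.
IC_NAMES = {
    "NCI": "National Cancer Institute",
    "NINDS": "National Institute of Neurological Disorders and Stroke",
    "NIAID": "National Institute of Allergy and Infectious Diseases",
}

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# One counter entry per (topic, IC) pair; the count drives the circle size.
bubble_counts = Counter((r["topic"], r["ic"]) for r in rows)


def hover_text(topic, ic):
    """Build a hover label with the full institute name and the count."""
    n = bubble_counts[(topic, ic)]
    return f"{IC_NAMES.get(ic, ic)}: {n} application(s)"


for (topic, ic), n in sorted(bubble_counts.items()):
    print(topic, ic, n)
```

Pairs that never occur in the data have a count of zero, which corresponds to an empty cell in the chart.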

How to read both: First look at bar length to see which topics are busiest, then check that topic’s row of circles to see which institutes are most involved.

📥 Download the dataset (CSV)

Topics with k = 5

Topics with k = 10

Topics with k = 15

Topics with k = 20

Topics with k = 30

Cite this work

If you use this visualization or dataset, please cite:
Paula Fearon. NIH Topic Explorer: Mapping Research Priorities in NIH-Funded Grants Using Topic Modeling. arXiv preprint arXiv:xxxx.xxxxx, 2025. https://arxiv.org/abs/xxxx.xxxxx