Opening your research funding data: a practical guide for funders

Open research infrastructure works best when the underlying metadata is open too. Research funding is one important area where that openness still lags. Over the past year, we’ve been working directly with funders around the world to help make funding information a first-class part of the open scholarly graph (see more on that here: https://blog.openalex.org/funding-metadata-in-openalex). Along the way, we have been hearing from funders of all shapes and sizes that they want practical advice: for those that are ready to make their metadata more open, what exactly should they share, and where should they start? This post is our attempt to answer those questions in a concrete and incremental way.

Why this matters

When funding data is openly available, it can be linked to the rest of the scholarly ecosystem: research outputs (papers, datasets, software), institutions, researchers, topics, and citations. This makes it much easier to understand what research was supported, by whom, and with what impact, as well as where funding is flowing (and not flowing).

OpenAlex represents connections between funding and research either as funders → grants → outputs, or directly as funders → outputs when information on specific grants is not available or linkable.

Two ways to connect funding to research

1) Funder identity + acknowledgement matching (no grant database required)

Even if you don’t share grant-by-grant information, funded research can often be identified by scanning the acknowledgements sections of research outputs for funder mentions and linking those mentions to a known funder.
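The core idea can be sketched in a few lines. This is only an illustration of the technique, not OpenAlex’s actual matching pipeline; the funder, its ROR ID, and its name variants below are all made up.

```python
import re

# Illustrative name variants for a hypothetical funder. In practice these
# would come from the funder's curated ROR record (aliases, acronyms,
# historical names, translations).
FUNDER_VARIANTS = {
    "https://ror.org/example": [
        "Example Science Foundation",
        "ESF",
        "Fondation Exemple pour la Science",  # translated name
        "Example Foundation",                 # historical name
    ],
}

def find_funder_mentions(acknowledgement_text):
    """Return ROR IDs of funders mentioned in an acknowledgements section."""
    matches = set()
    for ror_id, variants in FUNDER_VARIANTS.items():
        for name in variants:
            # Case-insensitive for full names, but exact case for short
            # acronyms to avoid false positives on ordinary words.
            flags = 0 if name.isupper() and len(name) <= 5 else re.IGNORECASE
            if re.search(r"\b" + re.escape(name) + r"\b",
                         acknowledgement_text, flags):
                matches.add(ror_id)
    return matches

text = "This work was supported by the Example Science Foundation (grant 123)."
print(find_funder_mentions(text))  # → {'https://ror.org/example'}
```

Notice that the whole sketch hinges on having a good variant list, which is exactly why a curated ROR record matters so much.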

What this enables

  • Tracking of research outputs and their impacts by funder
  • Topic and institutional views based on acknowledged support
  • Better tracking despite acronyms, translations, and name changes

What we need for this to be effective

  • A public record of your organization’s operating names (e.g., meaningful name variants and historical names). ROR.org already includes most funders; adding and curating records is free and easy, and it makes it easier to trace the different ways researchers might refer to your funding.
  • Guidance for researchers on how to represent your organization in publications when they declare funding

If you share nothing else, curating your ROR record and name variants is still high value.

2) Grant/Award records (richer linkages + funding-flow intelligence)

If you also share information about your grants/awards, you can make connections to specific grants (not just the funder name) and unlock deeper intelligence about research funding flows.

What this enables

  • Which funders fund topic X (and how that changes over time)
  • Which institutions receive funding from funder Y
  • Where there may be “underfunded” topics
  • Linking specific grants/programs → outputs → citations/impact
  • Funding-flow analytics (when amounts/currencies/dates are shared)

“Share what you can” tiers (minimum → recommended → optional)

Funders vary widely (public accountability vs. privacy, administrative constraints, donor preferences). It’s okay to publish only a subset of fields. You can start small and expand over time.

Tier 0: No grant records, but you want to track funded research

Do this

  • Ensure you have a ROR record and keep it updated (especially name variants).

Tier 1: Minimum viable grant data (high value, low sensitivity)

Share these fields for each grant/award (even without funding amounts):

  • Grant ID (your internal identifier)
  • Grant title
  • Short description/abstract 
  • Start year/date (and end year/date if possible)

Enables: topic discovery, portfolio timelines, and better matching to outputs—without publishing sensitive financial info.
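As a concrete illustration, a minimal Tier 1 record could be published as simple JSON. The field names here are ours, not a required schema; the point is only that each award carries an ID, a title, a description, and dates.

```python
import json

# A minimal, illustrative Tier 1 grant record (hypothetical grant).
tier1_grant = {
    "grant_id": "ABC-2024-0117",          # your internal identifier
    "title": "Modeling coastal wetland carbon storage",
    "description": "A three-year project measuring carbon sequestration "
                   "rates across restored and natural coastal wetlands.",
    "start_date": "2024-03-01",
    "end_date": "2027-02-28",             # include if possible
}

print(json.dumps(tier1_grant, indent=2))
```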

Tier 2: Strong linkage + attribution (recommended for most funders)

Add:

  • Awarded institutions
  • Investigators (PI or individual awardee; co-investigators optional)
  • Program/scheme/call name
  • Funding type (grant/fellowship/infrastructure/etc.) 
  • Reported outputs linked to grants (optional): if you collect publication/output lists from grantee reporting (papers, datasets, software, preprints, etc.), consider sharing those as explicit grant → output links. This captures links that may not appear in acknowledgement sections and reuses effort researchers already invested in reporting.

Enables: “who/where did we fund,” collaboration networks, and more reliable grant-to-output linking.
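Tier 2 extends the same record with attribution fields and, optionally, explicit grant → output links. Again, the field names and identifiers below are illustrative, not a prescribed format.

```python
# An illustrative Tier 2 record for a hypothetical grant. The ROR IDs and
# DOI are placeholders.
tier2_grant = {
    "grant_id": "ABC-2024-0117",
    "title": "Modeling coastal wetland carbon storage",
    "start_date": "2024-03-01",
    # Tier 2 additions:
    "awarded_institutions": [{"name": "University of Example",
                              "ror": "https://ror.org/exampleuni"}],
    "investigators": [{"name": "Dr. Jane Doe", "role": "PI"}],
    "program": "Blue Carbon Initiative",
    "funding_type": "grant",
    # Reported outputs become explicit grant → output links:
    "outputs": ["https://doi.org/10.xxxx/example-paper"],
}
```

The `outputs` list is what captures grantee-reported publications that might never appear in an acknowledgements section.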

Tier 3: Funding-flow intelligence (optional; may be sensitive)

Add:

  • Amount + currency 

Enables: investment totals by topic, cross-funder comparisons, and richer budget analytics.

What if you don’t want to share certain fields?

That’s okay! Sharing partial information is far better than withholding everything because a few fields are sensitive. Here’s what you gain/lose:

  • No amounts/currency: you lose “how much was invested” analytics, but still get strong portfolio discovery and topic/institution linkages via titles/descriptions/dates/institutions.
  • No investigator identities: you lose person-level linking, but institutional and topical linking can remain strong.

Special case: Donor Advised Funds (DAFs) and fund administrators

Sometimes a fund administrator or DAF sponsor has stricter disclosure rules than the underlying funder’s preferences. In these cases, funders and administrators can collaborate on a field-sharing policy that protects sensitive info while still enabling research tracking.

Practical approach

  • Separate donor identity constraints from research grant data
  • Share Tier 1–2 fields first (title, description, years, institutions)
  • If amounts are sensitive: omit them
  • Use a simple workflow: funder approves which fields are public; administrator publishes and maintains the feed

A simple “how to start” plan

  1. Confirm identity: ensure your funder organization has a curated ROR record (names/aliases).
  2. Pick a tier: Tier 0 if you can’t publish grants; Tier 1 if you can publish minimal records.
  3. Publish in a sustainable format: website, database, bulk download, or API—getting data out there is more important than the specific format.
  4. Iterate: add fields over time as policy and capacity allow.

Thank you to the many funders who have already worked with us to help make research funding data more open and more useful. Building a comprehensive, connected, and open view of research funding will take collaboration from funders of all kinds, and we’re excited to keep learning alongside the community. If your organization is thinking about sharing its metadata, we’d be glad to talk—please reach out to kyle@openalex.org.

Recommitting to the Principles of Open Scholarly Infrastructure (POSI)

We’re excited to reconfirm our commitment to the Principles of Open Scholarly Infrastructure (POSI).

Back in 2021, when we were still called OurResearch, we published a blog post assessing our fit to POSI and making a public commitment to these principles. That commitment still stands. But both we and POSI have evolved since then, and with the recent release of POSI v2, it felt like a good time to revisit that post.

Since our beginnings in an all-night hackathon more than a decade ago, we’ve tried to build a sustainable, mission-driven, genuinely open piece of scholarly infrastructure. So while we didn’t write POSI, we’ve long felt that it describes the kind of organization we want to be: one that belongs to the scholarly community, one that takes sustainability seriously, and one that is built to endure—or to wind down responsibly, if that ever becomes the right thing to do.

The POSI community has now released a revised version of the principles. The updated version keeps the same overall spirit, but clarifies and strengthens a few important areas. In particular, it now separates transparent governance from transparent operations, updates the principle on lobbying to allow advocacy in support of the community, strengthens expectations around living wills and transitions, and adds more explicit attention to financial reserves, volunteer labour, preservation, and interoperability.

As before, this post is our public self-assessment. We are not claiming perfection, but we are claiming commitment.

Summary

Governance

💚 Coverage across the scholarly enterprise
💚 Stakeholder governed
💚 Non-discriminatory participation or membership
💚 Transparent governance
💚 Cannot lobby
💛 Living will
💚 Regular review of community support and need

Sustainability

💚 Transparent operations
💚 Time-limited funds are used only for time-limited activities
💚 Goal to generate surplus
💛 Goal to create financial reserves
💚 Mission-consistent revenue generation
💚 Revenue based on services, not data
💚 Volunteer labour
💚 Transition planning

Insurance

💚 Open source
💚 Open data (within constraints of privacy laws)
💚 Available data (within constraints of privacy laws)
💚 Patent non-assertion
💚 Preservation
💚 Interoperability and open standards

(💚 = good, 💛 = less good)

Governance

💚 Coverage across the scholarly enterprise

What it means: Research transcends disciplines, geographies, institutions, and stakeholder groups. The infrastructure that supports it should too.

OpenAlex: This remains central to our mission. OpenAlex aims to index and support the entire research ecosystem: across disciplines, geographies, languages, output types, and stakeholder groups. Since our 2021 assessment, we’ve also broadened the scope of what OpenAlex covers, including major work to expand beyond the traditional DOI-centered literature. We still have more work to do here, but are strongly aligned with the principle.

💚 Stakeholder governed

What it means: A board-governed organization drawn from the stakeholder community builds confidence that decisions will reflect community interests.

OpenAlex: OpenAlex is a nonprofit with a public-interest mission and a governing board. Since our 2021 post, we have also added a Community Advisory Board, which brings broader stakeholder perspectives into our work. Information about our team and governance is public, the CAB terms are public, and the CAB itself was selected through a community vote. We think we are in a stronger position here than we were a few years ago.

💚 Non-discriminatory participation or membership

What it means: Any stakeholder group should be able to participate, and participation should be inclusive.

OpenAlex: Anyone can use OpenAlex. Anyone can access the data. Anyone can use the API or snapshots. Anyone can report bugs, suggest features, or engage with us publicly. Since 2021, we’ve also added more structured ways for the community to participate, including the CAB, CAB working groups, community curation pathways, community Google Groups, and the Member program (which is intended to support sustainability and deepen engagement, not to gate access to the infrastructure itself).

💚 Transparent governance

What it means: To achieve trust, the processes and policies for governance should be transparent.

OpenAlex: This is one place where POSI v2 improves on the earlier version. Governance transparency and operational transparency are now treated separately, which makes sense to us. OpenAlex now has both a governing board and a public Community Advisory Board. CAB terms are public, and the advisory board was selected through a community vote. We think that puts us in a better place than we were in 2021, when some of these structures were more aspirational than real.

💚 Cannot lobby

What it means: Infrastructure organizations should not lobby for narrow self-interest, though they may advocate for policy changes in support of their communities.

OpenAlex: We like the revised wording of this principle. OpenAlex is a mission-driven nonprofit. We do not lobby for narrow organizational advantage, and as a 501(c)(3) we also operate within legal constraints in this area. At the same time, we do advocate publicly for the adoption of open science, open infrastructure, open metadata, and transparent research intelligence. We see that as support for the community, not as self-serving lobbying.

💛 Living will

What it means: A trustworthy organization should describe the conditions under which it or its services would be wound down, and how assets would be preserved or passed to a successor that also honors POSI.

OpenAlex: We continue to support this principle strongly. The core assets of OpenAlex remain our source code and datasets, which are completely open under CC0 and MIT licenses. Our code is openly developed and archived. Our snapshots are publicly available, including via third-party mirrors and a backup archived on Zenodo, and a growing number of academic groups around the world host local copies of OpenAlex snapshots. That said, our position here is broadly the same as it was in 2021: we have many of the practical ingredients of a living will, but we still do not have a formal public statement of wind-down and successor conditions.

💚 Regular review of community support and need

What it means: Organizations should regularly review whether their activities are still needed and still supported by the community.

OpenAlex: A lot of what we build is, in one way or another, a response to failures or gaps in the current scholarly communication system. If those failures disappeared, some of our work should disappear too. OpenAlex is not trying to exist forever for its own sake. We want to build infrastructure that is useful, community-aligned, and durable for as long as it is needed.

Sustainability

💚 Transparent operations

What it means: Community trust requires transparency not just in governance, but in the practical realities of how the organization works.

OpenAlex: OpenAlex publishes pricing information, member information, nonprofit transparency materials, and public information about our grant funding (including openly depositing our grant proposals on Open Grants). We want the community to be able to see not just what we say we value, but how we are actually trying to sustain the work.

💚 Time-limited funds are used only for time-limited activities

What it means: Operations should be supported by sustainable revenue sources. Time-limited funds should be used for time-limited work.

OpenAlex: This remains central to how we think about sustainability. We continue to use grants to support bounded development work, major new initiatives, and strategic expansion. We continue to believe that day-to-day operations should be supported by revenue from sustainable services. Since the initial assessment, our operational revenue has grown significantly, allowing us to scale services in step with OpenAlex’s growing operational needs.

💚 Goal to generate surplus

What it means: Merely breaking even is not enough. Infrastructure organizations need enough flexibility to adapt and survive shocks.

OpenAlex: We agree with this more strongly now than we did in 2021. Open infrastructure at global scale needs slack. It needs room to invest, recover from surprises, and support transitions. A model that aims only at exact cost recovery is too brittle. We’re in a massive build phase at the moment, investing significant resources in OpenAlex, but we aim to generate surplus in future operational phases.

💛 Goal to create financial reserves

What it means: Organizations should maintain reserves that can support orderly wind-down, transition, or response to major unexpected events.

OpenAlex: We continue to believe in this principle, and we continue to work toward it. We currently have funds available to support our operations for the next year, but we have not set aside a formal contingency fund, so this remains an area where the work is ongoing.

💚 Mission-consistent revenue generation

What it means: Revenue sources should support the mission, not undermine it.

OpenAlex: This remains one of our strongest convictions. OpenAlex needs revenue to survive, but not all revenue is equally compatible with our mission. We believe a robust sustainability model is built on revenue from services, support, memberships, and mission-aligned partnerships and not on revenue that would require restricting the core infrastructure or distorting our priorities.

💚 Revenue based on services, not data

What it means: Data related to the scholarly infrastructure should be community property. Revenue should come from services, not from locking up the data itself.

OpenAlex: This remains a core principle for us. OpenAlex data remains open and will always be open. What we charge for are services around that infrastructure: support, enhanced access, memberships, and other value-added offerings.

💚 Volunteer labour

What it means: Organizations should be honest about the extent to which they rely on volunteer labour, and thoughtful about the risks and responsibilities that come with it.

OpenAlex: OpenAlex benefits from a great deal of community contribution. CAB members volunteer their time and expertise. CAB working groups help us think through specific topics. Community members submit metadata corrections, bug reports, and feature requests. We are grateful for that work and rely on it to ensure that our development meets community needs.

💚 Transition planning

What it means: Organizations should reduce dependence on a small number of people and make transitions survivable.

OpenAlex: Since 2021, one of our co-founders left the organization (and field). It was a difficult transition period that reinforced for us the importance of transition planning, but it also demonstrated that we can handle major transitions. Since then, we have brought on new team members with clearer roles, better documentation, and more mature systems. The launch of Walden, with its easier-to-operate architecture that LLMs can grok and develop, is also an important part of this story. All of this reduces key-person risk and makes the organization more durable.

Insurance

💚 Open source

What it means: All software and assets required to run the infrastructure should be available under an open-source license.

OpenAlex: As in 2021, our code is openly available and openly developed. We continue to see “born open” development as the right default. Our code is also archived through Software Heritage.

💚 Open data (within constraints of privacy laws)

What it means: For an infrastructure to be reproducible, the relevant data must be openly and legally available where possible.

OpenAlex: The core data behind OpenAlex is open and intended to remain open. At the same time, some of our products and services involve private or user-provided data. As we wrote in 2021, when users share private data with us in order to receive a service, we do not share that private data or the data derived from it.

💚 Available data (within constraints of privacy laws)

What it means: It is not enough for data to be open in principle; there must be a practical way to obtain it.

OpenAlex: OpenAlex data is not just nominally open; it is completely open through full snapshots that are free to download as well as open UI and API services with generous daily free limits. Snapshots are also preserved and redistributed in ways that reduce dependence on us as the sole host and data is always available under a CC0 license.

💚 Patent non-assertion

What it means: Organizations should not use patents to prevent the community from replicating the infrastructure.

OpenAlex: Our position here is unchanged in substance from 2021. We do not believe patents belong at the core of scholarly infrastructure. In our earlier post, we said we would not pursue or assert patents and would look into formalizing that commitment. That still reflects our position.

💚 Preservation

What it means: Open infrastructure should be preserved in ways that make rescue and continuity possible.

OpenAlex: OpenAlex snapshots are stored on Zenodo, our code is archived via Software Heritage, and copies of the data are increasingly being hosted by academic groups around the world. That kind of distributed preservation is exactly what open infrastructure should enable.

💚 Interoperability and open standards

What it means: Infrastructure should use open standards and fit into the larger ecosystem in ways that make continuity and reuse easier.

OpenAlex: Interoperability has always been central to what OpenAlex is trying to do. We rely heavily on shared identifiers, open metadata flows, public snapshots, and open APIs. We want others to be able to build with and around OpenAlex without asking permission.

Closing

So: OpenAlex remains committed to POSI.

The revised principles are better in a few important ways. They push organizations like ours to be clearer about governance, more transparent about operations, more deliberate about reserves and transitions, more honest about volunteer labour, and more serious about preservation and interoperability.

As in 2021, we are publishing this not because we think we have everything figured out, but because we think these commitments are worth making in public. They give our community something concrete to expect from us, and something concrete to hold us accountable to.

And if the long arc of scholarly infrastructure bends toward a world where less of our current work is necessary because the ecosystem has become more open, more interoperable, and less broken, we will count that as success too.

That would be a nice problem to have.

Affiliation curation is coming to OpenAlex 

Algorithmic matching of affiliation text to real institutions is one of those things that only really becomes visible when it’s wrong.

For institutions adopting open research metadata, accurate affiliation matching is foundational: after all, tracking and understanding your research outputs requires first having an accurate list of your research outputs. When affiliation matching is noisy, institutions can lose confidence in open data—sometimes even when the underlying work’s metadata is otherwise excellent.

That’s why we’re launching a new affiliation curation tool inside OpenAlex, starting with our existing Member supporters.

Why we’re launching this now — and why it’s Member-only

Building affiliation curation properly is labour-intensive in two ways:

  1. Developing the tool itself
    We’re bringing curation into our production environment so it’s stable, auditable, and fast. That means building the interface, workflows, safeguards, and monitoring needed to support real institutional use at scale. And, of course, we need to iteratively develop this tool with partners as they start using it.
  2. Operating curation as a service
    Affiliation curation is much more complex than it looks, and we can’t sustain the activities needed to moderate curation requests from every user. Moving forward, we need to provide training, guidance, moderation practices, and ongoing support.

OpenAlex Members aren’t just users of our data—they help us stress-test the workflows, surface edge cases, and shape the FAQ, training materials, and governance that will make the tool durable long-term.

What the tool does (and doesn’t do)

This new tool lets authorized institutional curators create and manage matches between:

  • Raw affiliation strings — the free-text affiliation lines authors include in publications (e.g., “University of X, Dept. Y, City, Country”), and
  • Your institution’s ROR record — the persistent identifier record for your organization.

In plain language: it helps you link the affiliation text that appears in publications to the correct institutional identity in OpenAlex.
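At its core, each curation decision boils down to a mapping from a raw affiliation string to a ROR ID (or to “no match”). The sketch below illustrates that data model only; it is not the actual OpenAlex curation interface, and the strings and ROR IDs are invented.

```python
# Curator-confirmed matches from raw affiliation strings to ROR IDs.
# All values here are hypothetical placeholders.
curated_matches = {
    # Recall fix: a string the algorithm missed, confirmed by a curator.
    "Univ. of Example, Dept. of Physics, Springfield, USA":
        "https://ror.org/exampleuni",
    # Precision fix: a similarly named but distinct organization,
    # explicitly pointed at its own identifier.
    "Example State University Hospital":
        "https://ror.org/examplehosp",
}

def resolve_affiliation(raw_string):
    """Look up a curator-confirmed ROR ID for a raw affiliation string."""
    return curated_matches.get(raw_string)  # None = not yet curated

print(resolve_affiliation("Univ. of Example, Dept. of Physics, Springfield, USA"))
```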

What it does not do:

  • It does not let an institution “claim” a work if the institution isn’t actually present in the affiliation text. 
  • It’s not designed to replicate your full internal hierarchy (departments, labs, etc.). In some cases, distinct branded units that report to the institution may warrant their own organizational identifier, but the tool’s core job is linking affiliation text to organizational identifiers.

A quick thank-you to our French partners

This work builds on a strong collaboration with our partners in the French Ministry of Higher Education and Research (MESR), who created and operated the works-magnet. This tool supported affiliation curation at global scale, demonstrated just how much the community is willing to contribute to better open metadata, and enabled many institutions to shift from proprietary to open databases.

We’re hugely appreciative: the success of works-magnet made the need (and the opportunity) unmistakable, and we’re grateful to continue this partnership as we bring curation natively into OpenAlex.

As part of this transition, the works-magnet submission pathway has been closed and we have fully processed previous submissions. We’re excited to move forward with a workflow that’s stable inside our production systems.

What Members can expect

Member institutions will be able to:

  • access the curation interface through a curator-enabled OpenAlex account,
  • search affiliation strings that may refer to their institution (including variants, acronyms, and location cues),
  • filter between strings that are already matched vs not yet matched,
  • and add or remove linkages to improve both recall (catch missing matches) and precision (remove incorrect matches when names are similar across institutions).

We’ll provide onboarding and training, plus guidance on best practices—especially for tricky scenarios like similarly named universities, multilingual variants, and hospital/university affiliation patterns.

What if you have an urgent need but can’t become a Member?

If poor affiliation matching is causing significant harm in a time-sensitive workflow (for example, a major reporting deadline or a high-stakes rankings exercise) and your institution can’t currently support membership, please reach out to kyle@openalex.org.

We can’t promise we’ll be able to solve every case immediately, but we do want to understand urgent situations and help where we can—especially when a small, well-scoped intervention can prevent real damage.

What’s next

We’re excited to put better affiliation control directly into the hands of institutions who rely on OpenAlex—and to do it in a way that’s sustainable for open infrastructure.

If you’re already a Member, keep an eye out for onboarding details and training materials. If you’re considering membership, you can learn more at openalex.org/members.

And if you’ve been part of the works-magnet effort: thank you. This launch is a continuation of that shared work—making open research metadata not just available, but dependable.

A new way to support OpenAlex: become a Member!

Starting today, institutions can support OpenAlex as a Member for $5,000 USD/year—a lightweight way to help sustain fully open research metadata, designed for institutions that don’t need the services provided by our existing institutional offerings.

🎉A special thank you and shout-out to the University of Victoria for becoming our first OpenAlex Member supporter!

OpenAlex remains free to use (website, API, and quarterly public snapshot), with data released under CC0 license. Membership is about keeping that open infrastructure healthy and helping us scale sustainably.

What you get as a Member

Membership is designed for institutions (often university libraries) who want to invest in open infrastructure and also get a few practical benefits in return.

The Member tier includes:

  • Admin dashboard (with institutional use statistics)
  • Affiliation editor (access provided to certified curators)
  • Unsub access (helping libraries with data-driven collections strategies)
  • Nomination rights (for our Community Advisory Board)
  • Members roundtables (quarterly meetings on roadmap priorities)

For more information on what is included in the new Member support package, head to https://openalex.org/members

We also offer higher tiers of membership

If your institution relies on higher-volume access to OpenAlex or needs our time for additional services, we offer Member+ and Partner support packages that include increased API quotas and consulting hours, in addition to all of the Member benefits listed above. For more information on what’s included in each membership tier, check out https://openalex.org/pricing/institutions.

Why we’re doing this

OpenAlex is completely open research infrastructure that ingests, deduplicates, links, and enriches metadata so anyone in the world can build on a shared, open index of the global research system. Keeping that open takes real resources. Revenue from our existing paid subscriptions (previously called Premium and Institutional, now Member+ and Partner) has been critical to our growth over the last few years. But we’ve heard from many institutions with less extensive service needs that they would like a lighter-weight option: fewer services at a lower cost, similar to what other open infrastructures offer (e.g., ORCID). And so that’s what we’ve done!

How to join

For more information on which membership level is right for your institution, head to https://openalex.org/pricing/institutions. If you’re ready to become an OpenAlex Member, Member+, or Partner, or would like to discuss these options further, send an e-mail to sales@openalex.org.

Funding metadata in OpenAlex

With the Walden launch behind us, 2026 promises to be an exciting year for OpenAlex. And thanks to a transformative grant from Wellcome of $3.6M over three years, funding metadata will be a major focus of that development.

This Wellcome-funded project aims to make funding information a first-class part of the open scholarly graph so that funders, institutions, researchers, and tool-builders can rely on open, structured, reusable funding metadata.

Below is a progress update on what we’ve shipped so far, what we’re working on now, and how funders can help shape what comes next.

Why funding metadata (and why now)

Funding data is essential infrastructure for research strategy and accountability: funders need to understand what they supported, what it produced, and what changed as a result. They also need comparable data from across the ecosystem to position their work within the global funding landscape.

But today, most funding intelligence workflows still depend on closed databases or on burdensome reporting from grantees into siloed funder databases. OpenAlex already provides a comprehensive, open inventory of research outputs. This project extends that foundation so funding metadata becomes similarly open, structured, and connected.

What’s new in OpenAlex

We are hosting a webinar February 19, 2026 at 10am EST to review updates in more detail and allow time for interactive Q&A. You can register for that webinar here and a recording will be available on our YouTube channel afterwards. Here’s a quick update on recent progress.

1) We’re mining full text to match funders to outputs

We’ve begun matching funder names to research outputs through full-text data mining, adding millions of new linkages between funders and their outputs.

We have just started this work and have tens of millions of PDFs still to work through, but the momentum is building quickly.

2) “Awards” are now first-class objects in the OpenAlex graph

We’ve updated the OpenAlex schema so awards are first-class citizens, with their own entity type and API endpoint: https://api.openalex.org/awards

This is foundational work: it lets us represent grants/awards as structured nodes in the graph (instead of only as scattered fragments attached to works), which is required for reliable linking, curation, and downstream funding intelligence.
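For developers, the endpoint itself (https://api.openalex.org/awards) is live. The sketch below builds a filtered query URL; the specific filter name and funder ID are assumptions modeled on OpenAlex’s general API conventions, so check the API docs for the parameters actually supported.

```python
from urllib.parse import urlencode

# Building a query against the new awards endpoint. The endpoint is real;
# the filter key ("funder.id") and the funder ID below are illustrative
# placeholders, not confirmed parameters.
BASE = "https://api.openalex.org/awards"

def awards_query(filters, per_page=25):
    """Compose an awards API URL using OpenAlex-style filter syntax."""
    params = {
        "filter": ",".join(f"{k}:{v}" for k, v in filters.items()),
        "per-page": per_page,
    }
    return f"{BASE}?{urlencode(params)}"

url = awards_query({"funder.id": "F0000000000"})
print(url)
```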

3) When DOIs are registered for grants, they appear in OpenAlex

Any funder registering DOIs for grants can now have their award metadata show up in OpenAlex almost immediately after registration. We’ve built this integration for Crossref award DOIs and will soon complete the integration for DataCite award DOIs as well.

4) We’re ingesting grant metadata directly from funders

We’ve started ingesting funding metadata directly from funders who make their grant data available online but don’t mint DOIs. As of this post, we have already ingested 11.5M grants.

This is critical: to build a comprehensive database of funding metadata, we need to meet funders where they are and ingest their data directly, in the formats they’ve made available.

What we’re working on next

Here’s what we’re working on during 2026:

  • Full-text matching (finish running across our corpus of full text; set up an ongoing pipeline for new PDFs)
  • Improving matching quality (funder name disambiguation)
  • Grant ID matching (create linkages between individual grant IDs and papers)
  • Scaling ingest across many funders and formats (from well-structured national databases to the long tail of smaller or distributed sources)
    • We’re starting with a seed list of 50 funders to develop these pipelines. You can check out that list and monitor our progress here
    • We’ll scale funder ingest later this year, but if you want to suggest specific funders you don’t see on our roadmap yet, email kyle@openalex.org
  • Expanding linkages beyond acknowledgements by incorporating trusted reporting sources wherever possible (e.g., funder impact reports)
  • Clarifying and prioritizing use cases so we build the funding intelligence workflows funders actually need
  • Pilot apps that suggest linkages between grants and outputs (e.g., based on vector distance of text in grants and outputs)
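The last pilot idea above, suggesting grant-output linkages by vector distance, can be sketched as plain cosine similarity over text embeddings. This is an illustrative sketch only: the toy vectors stand in for real embeddings of grant and output text, and the threshold value is an arbitrary assumption.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def suggest_links(grant_vec, output_vecs, threshold=0.8):
    """Return indices of outputs whose text embedding is close to the grant's."""
    return [i for i, v in enumerate(output_vecs)
            if cosine_similarity(grant_vec, v) >= threshold]

# Toy vectors standing in for real text embeddings:
grant = [0.9, 0.1, 0.0]
outputs = [[0.88, 0.12, 0.01], [0.0, 0.2, 0.95]]
print(suggest_links(grant, outputs))  # → [0]: only the first output is a candidate
```

In practice, suggestions like these would be surfaced for human confirmation rather than written to the graph automatically.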

Funder workshop in London: April 27–28, 2026

We’re convening an in-person workshop with collaborating funders on April 27–28, 2026 in London, England.

The goals are to:

  1. Review what we’ve learned so far (what’s working, what’s messy, what needs partner input)
  2. Confirm and refine funder use cases for open funding intelligence and impact reporting
  3. Jointly shape the next phase of the project—both technical priorities and outreach activities to scale this initiative globally in the following two years

We will publish a report summarizing the workshop and detailing next phases of the project.

Call to action: we’re looking for funder collaborators (all shapes and sizes)

If you’re a funder—large or small, national or regional, public or private, anywhere in the world—we’d love to talk.

With each funder collaborator, we’re looking to:

  • Assess the current state of their grant metadata (coverage, structure, identifiers, openness, and constraints)
  • Help make their award records (and impact reports) easier to discover and reuse when possible
  • Ingest their grant metadata into OpenAlex to improve linkages between awards and outputs
  • Fully understand the funding intelligence use cases that matter most to them, so the open dataset supports real reporting and strategy needs

How to get started

The simplest next step is an introductory meeting.

Email the project lead and OpenAlex COO, Kyle Demes: kyle@openalex.org

Thanks (and more soon)

—Kyle

OpenAlex and NORA Collaborate to connect publications to the OECD FORD Taxonomy

OpenAlex and NORA (the Danish National Open Research Analytics team) are pleased to announce a collaboration mapping the OpenAlex research classification system to the OECD Fields of Research and Development (FORD) taxonomy. This alignment supports the upcoming launch of the new Danish Research Portal, but also enables OpenAlex users globally to use the taxonomy in their research analytics.

🎯 Why This Matters for Research Analytics

Widely adopted taxonomies like OECD FORD are critical for international benchmarking, reporting, and policy alignment. At the same time, national governments, research institutions, and regional bodies often rely on their own classification schemes that reflect local research priorities and funding strategies.

By linking OpenAlex’s aboutness classification system with the OECD FORD taxonomy, this collaboration creates:

  • A bridge between global standards and national strategy
  • An open and transparent alternative to proprietary classification systems
  • A pathway for countries and institutions to conduct policy-relevant analytics using fully open data
  • A blueprint for creating crosswalks between OpenAlex and additional research taxonomies

This mapping supports both broader interoperability and regionally specific analysis—without compromising either goal.

🧭 How We Built the Mapping

The mapping was developed using a systematic methodology that relates OpenAlex research subfields with OECD FORD categories. OpenAlex uses metadata about research articles (e.g., title, abstract, journal) to classify research outputs into research topics, subfields, fields, and domains (full documentation here).

  • OpenAlex subfields were successfully mapped to 38 out of 42 two-digit FORD fields.
  • The four remaining categories did not have direct equivalents given the current OpenAlex taxonomy structure.
  • The resulting crosswalk supports comprehensive coverage of major research areas across the OECD framework.

The figure below shows the number of OpenAlex subfields that were mapped to each FORD category. A full table listing each OpenAlex subfield and its corresponding FORD categories is available here.
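Mechanically, applying a crosswalk like this is a lookup from each work's OpenAlex subfield to its FORD category. A minimal sketch, using invented subfield→FORD pairs rather than entries from the published table:

```python
from collections import Counter

# Hypothetical slice of the crosswalk (illustrative pairs only;
# the real mapping is in the published table linked above).
CROSSWALK = {
    "Artificial Intelligence": "1.2 Computer and information sciences",
    "Oncology": "3.2 Clinical medicine",
    "Econometrics": "5.2 Economics and business",
}

def ford_counts(works: list[dict]) -> Counter:
    """Tally works per FORD category via their OpenAlex subfield."""
    counts = Counter()
    for work in works:
        ford = CROSSWALK.get(work["subfield"])
        if ford is not None:  # subfields without a FORD equivalent are skipped
            counts[ford] += 1
    return counts

works = [
    {"id": "W1", "subfield": "Artificial Intelligence"},
    {"id": "W2", "subfield": "Oncology"},
    {"id": "W3", "subfield": "Artificial Intelligence"},
]
print(ford_counts(works))
```

Because a subfield can map to more than one FORD category in the published crosswalk, a production version would look up a list of categories per subfield rather than a single value.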

🤖 Combining Expert Knowledge with AI

To ensure quality and scalability, we employed a dual approach:

  • A human expert (from OpenAlex) manually assigned OpenAlex subfields to FORD categories.
  • The same task was conducted using ChatGPT to test whether AI could reliably assist in classification alignment.

Out of 250+ assignments, the two approaches differed in only 11 cases. These were reviewed in collaboration with researchers in those fields: ChatGPT’s classification was judged the better fit in 7 of the 11 cases, and the human’s in the remaining 4!

This result gives both teams confidence in using AI to assist with future classification crosswalks—especially as a way to accelerate mappings between OpenAlex and other national or domain-specific taxonomies.

📊 What the Mapping Enables

Once mapped, the classifications were applied by NORA to publications in the Danish Research Portal, which aggregates research outputs from across Denmark’s institutions. The FORD classifications derived from OpenAlex were then compared with classifications from Scopus and Web of Science.

While proprietary licensing prevents sharing of detailed comparisons, results from the three systems were broadly aligned, with some differences reflecting their underlying methodologies. Importantly, this confirms that open infrastructure can meet the same analytical needs traditionally served by closed systems.

🚀 What’s Next

  • OpenAlex users around the world can apply the crosswalk in their own analyses. If you think it’s useful for us to expose the OECD FORD categories directly in our public API, let us know! If there is enough interest, we’ll add it this year.
  • The Danish Research Portal will launch in mid-2026, showcasing Danish research outputs across the OECD FORD classifications.

With the new OpenAlex Walden system, we look forward to expanding support for multiple taxonomies to meet the needs of different countries, research communities, and policy environments.

⚠️ Important Note on Use

This mapping is not formally endorsed by the OECD. We consulted with the OECD team and shared preliminary results to ensure accuracy and transparency. However, users conducting official reporting should validate the mapping according to their institutional or national guidance.

🌍 A Shared Vision for Open, Interoperable Research Infrastructure

This collaboration demonstrates what is possible when national research infrastructure and open data providers work together to align global and local needs. By combining methodological rigor, AI-assisted innovation, and a commitment to openness, NORA and OpenAlex are helping advance a more interoperable and transparent research ecosystem.

If your organization or country uses its own classification system and is interested in implementing it in OpenAlex, we invite you to reach out and collaborate with us.

— The OpenAlex and NORA Teams

OpenAlex: 2025 in Review

2025 was a defining year for OpenAlex. After two years of learning what the world needs from OpenAlex, we spent last year rebuilding our entire foundation and massively expanding our coverage. During this rebuild, we served exponential growth across academia, government, and industry, solidifying OpenAlex as essential global infrastructure for research.

A New Foundation: Walden Launch

At the end of the year, we launched Walden, the complete rewrite of the OpenAlex system.

On day one, Walden added more than 190 million new works, including records from DataCite and thousands of institutional repositories. For the first time, OpenAlex now creates records even when research exists only in repositories—making millions of previously hard-to-find works truly discoverable. These new records currently live as a dedicated subset (xpac) while we continue strengthening metadata before full integration into the core index.

Walden also gives OpenAlex a modern, flexible architecture making it faster to add new sources, easier to improve quality at scale, and ready for the next generation of features and curation.

Unprecedented Adoption & Global Reach

Use of OpenAlex grew dramatically, ending the year with:

  • 350,000+ monthly unique visitors to our UI
  • 3+ million monthly pageviews on our UI
  • 1.5 billion monthly API calls across OpenAlex (1B) + Unpaywall (0.5B), exceeding Crossref for the first time!
  • 1,100+ research outputs in 2025 referencing OpenAlex

Rebranding and Clarifying the Mission

As OpenAlex continued to expand, it became clear that OpenAlex is not just one of our products—it is our mission. And in 2025, we reorganized to reflect that realization.

Today:

  • OpenAlex is the purpose and platform.
  • Unpaywall is a slice of the OpenAlex database delivered in a specific format.
  • Unsub is a dashboard built on top of OpenAlex, supporting specific use cases.

This unified identity makes it clearer for our users, clearer for our partners, and clearer for ourselves what we are collectively building together.

Financial Progress & Sustainability

We achieved major sustainability milestones in 2025:

  • Reached our year 2 $800k ARR target—three months ahead of schedule
  • Received a $3.5M Wellcome grant to integrate global research funding metadata
  • Continued strong renewal rates and growing institutional engagement

Running both the old and new systems in parallel, supporting unprecedented usage growth, and delivering Walden led to higher costs than projected. But these were intentional investments to make OpenAlex stronger, more scalable, and more valuable for the long term.

Looking Ahead

With Walden now live, we’re excited to start our next chapter. In 2026, we will:

  • Launch full community curation pipelines
  • Integrate global funding metadata
  • Begin integrating research software as first-class research objects
  • Deepen partnerships with governments, universities, and industry, rolling out new support models and new features
  • Continue strengthening sustainability and reliability

Thank You

To everyone who contributed, partnered, advocated, experimented, and trusted OpenAlex this year: thank you! We are thrilled and humbled to watch OpenAlex become the open, global scholarly knowledge graph the world depends on and are deeply aware that none of this happens without you.

Here’s to an even bigger 2026.

The OpenAlex Team

A Better Way to Detect Language in OpenAlex—and a Better Way to Collaborate

As part of the recent Walden system launch, we’ve improved how OpenAlex detects the language of scholarly works. The results are immediately visible in the data: many more works are now correctly recognized as non-English, new languages appear that weren’t represented at all before, and previously unclassified works now have accurate language assignments. 

The chart below (source) shows the number of works attributed to each language in Classic vs. Walden OpenAlex. Most languages fall above the diagonal line, meaning more works in Walden are classified with that language; the cluster of languages on the y-axis had no works at all in Classic OpenAlex but now have works in Walden.

We’re excited about this improvement. But the story behind this improvement is just as important as the technical result—it’s a model for how the research community and open infrastructures like OpenAlex can collaborate to make real, shared progress.

From helpful critique to a true collaboration

Last year, a group of researchers published a preprint evaluating OpenAlex’s language-classification system using a large multilingual gold standard (Céspedes et al., arXiv:2409.10633v2, now published as https://doi.org/10.1002/asi.24979). We were excited to see that an international research collaborative had undertaken such a significant project using OpenAlex with the aim of improving its usefulness for the global research community. Their study was rigorous and thoughtful, and it confirmed something we already knew: our approach to language detection could be improved.

However, the paper stopped short of evaluating and recommending the concrete next steps we could take to improve language detection in OpenAlex. Because we hadn’t been involved at the start of the study, the authors lacked the kinds of metrics and performance comparisons that would actually let us deploy a better model in production. After publication, we met with some of the authors to discuss what we needed to turn their work into improvements in OpenAlex:

  • We needed precision and recall metrics for multiple competing candidate algorithms (with a bias towards precision); and
  • We needed analysis that considered cost and runtime, given that any model we deploy must scale to 400 million+ records.
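The first requirement above boils down to standard per-language precision and recall against a gold-standard label set. A minimal sketch of that evaluation (the label values here are illustrative):

```python
def precision_recall(predicted: list[str], gold: list[str], lang: str):
    """Per-language precision and recall of a language classifier."""
    tp = sum(1 for p, g in zip(predicted, gold) if p == lang and g == lang)
    fp = sum(1 for p, g in zip(predicted, gold) if p == lang and g != lang)
    fn = sum(1 for p, g in zip(predicted, gold) if p != lang and g == lang)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy example: a classifier that over-predicts Danish ("da"),
# e.g. by mislabeling a Norwegian ("no") work.
gold = ["da", "no", "da", "en"]
pred = ["da", "da", "da", "en"]
print(precision_recall(pred, gold, "da"))  # → (0.666..., 1.0)
```

A bias towards precision, as noted above, means preferring candidate models that keep the first number high, even at some cost to the second.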

The researchers enthusiastically took on the additional work, checking in with us throughout the process to make sure they were on the right track. The result was a preprint from their follow-on study (Sainte-Marie et al., arXiv:2502.03627) that provided exactly the applied, scalable insight we needed.

Turning research into real-world impact

As part of the Walden rewrite, we implemented one of the top-recommended approaches from their study. The improvement has been dramatic:

  • More works are now correctly classified as non-English languages, instead of being incorrectly labeled as English.
  • New languages, previously absent from OpenAlex, are now detected for the first time.
  • Previously “null” records now have reliable language tags.

Before deploying the new model in production, we already knew from the researchers’ analyses and their multilingual gold-standard sets that it would yield a strong overall improvement across the corpus. But we wanted to confirm that in practice. So we manually reviewed a random sample of works whose language classification differed between the old and new systems—and in the vast majority of those cases, the new system was correct.

We also validated against real-world feedback. For instance, the NORA team at Research Portal Denmark had previously submitted support tickets detailing mix-ups between Danish and Norwegian, two languages that are notoriously similar in writing. In ~75% of those cases, the new system now gets it right.

A model for future collaboration

To be clear: we value and learn from every independent evaluation of OpenAlex. One-way critiques from researchers are a vital part of the open-infrastructure ecosystem, and we deeply appreciate the time and expertise the global research community is investing in making OpenAlex better.

What made this case stand out was the second step: turning that critique into a direct collaboration that produced immediately deployable improvements. By working together, we created a fast-tracked feedback loop—from identifying issues in OpenAlex, to developing and testing solutions, to rolling out fixes across hundreds of millions of records. It’s a model we’d love to repeat.

And this is only the beginning. In the next few weeks, we’ll be launching a new community curation system that lets researchers and metadata experts around the world submit corrections directly to OpenAlex, creating an even faster, more transparent, and more collaborative way to improve research metadata at scale.

Stay tuned—and thank you to everyone helping make open research information better, one contribution (and one collaboration) at a time.

Major Update to Unpaywall Database

We recently announced major changes to Unpaywall on our Unpaywall google group (https://groups.google.com/g/unpaywall) and via email to Unpaywall Premium Subscribers. A lot of folks aren’t on the group, so we’re announcing here as well.


TL;DR
Unpaywall has migrated to a new codebase that helps us address data quality issues faster, and you may notice some changes.

  • The API is way faster → 10× faster API responses (avg 500 ms → 50 ms).
  • Some data has changed → About 23% of works saw data changes, with about 10% seeing changes in oa_status (green, gold, etc.) and 5% in is_oa (open or closed).
  • Overall accuracy is similar → Precision remains constant overall; we have better recall of some Gold articles and worse detection of some Green articles.
  • Tiny schema changes → Your scripts, API calls, and data feeds keep running, but two fields are now deprecated (oa_locations.evidence & oa_locations.updated).
  • Community curation → Users can now report and fix errors at unpaywall.org/fix.
  • Action required only if you host the full dataset locally (details below).
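For API users, each record carries the fields named above. The snippet below reads them from a sample response shaped like Unpaywall v2 output; the DOI and values are made up for illustration:

```python
# A hand-made sample in the documented Unpaywall v2 shape (not real data):
sample = {
    "doi": "10.1234/example",
    "is_oa": True,
    "oa_status": "gold",
    "best_oa_location": {"url_for_pdf": "https://example.org/paper.pdf"},
}

def summarize(record: dict) -> str:
    """One-line summary of a record's open access status."""
    if not record.get("is_oa"):
        return f'{record["doi"]}: closed'
    loc = record.get("best_oa_location") or {}
    return f'{record["doi"]}: {record["oa_status"]} ({loc.get("url_for_pdf")})'

print(summarize(sample))  # → 10.1234/example: gold (https://example.org/paper.pdf)
```

Because the schema is unchanged apart from the two deprecated fields, code like this runs identically against records from the old and new codebases.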

Why rewrite a perfectly good tool?

A decade ago we developed Unpaywall to:

  1. make open access research in institutional repositories discoverable by users globally,
  2. track open access behaviours and generate evidence for effective open access policies, and 
  3. raise the bar for open infrastructures by ensuring that the industry standard for determining open access status was itself completely open.

We’re happy to report it has been very effective at achieving those goals: 

  • Our Chrome and Firefox extensions are used by 800k monthly active users around the world, 
  • Unpaywall sees an average of 200 API calls per second every second of the year, 
  • Unpaywall now underpins every major open access monitoring and tracking initiative globally, and 
  • Unpaywall has demonstrated an effective model for operating open research infrastructure. 

Over the years, Open Access has become increasingly important to researchers, institutions, funders, and publishers. Steady changes over that time brought us to a publishing system that looks different from the one we started in. At first, it was exceedingly rare for an open access publication to later become closed access. It was rare for publishers to make closed access works openly available for short periods (as during COVID). And with the exception of embargo periods, it was rare for closed journals to later be made completely open.

All of these are common now, and at the scale of millions of publications. And publication landing pages aren’t just about providing users with access to information; they also now collect information on users. As scholarly communication evolved, it became clear that Unpaywall needed to evolve from a product into a process. Unfortunately, the code base supporting Unpaywall was struggling to adapt: with every change, we introduced new bugs, and fixing each new bug kept creating more. To continue delivering high-quality open access metadata efficiently, we needed to start from scratch.

We spent the last year completely re-writing the code base for Unpaywall to make it: 

  • faster; 
  • easier to fix when it breaks; and 
  • easier for users and publishers to curate.

On May 20, 2025 we launched the update. We have been working with our premium subscribers to implement the changes in their locally hosted databases that rely on Unpaywall. Most of our users switched to the new code base without even noticing, and that was intentional. Still, we think it is important for our users, especially those whose work depends on the Unpaywall database, to understand these changes.


What didn’t change

  • Data format & schema: All keys stay the same (only the fields oa_locations.evidence and oa_locations.updated are now marked “deprecated”).
  • API & data feed URLs: Zero downtime, same endpoints.
  • Aggregate metrics: 10% of records saw a change in oa_status (i.e., color) and 5% saw a change in is_oa (open access vs. closed). Some changes were improvements and some were degradations, but overall precision remains the same.

What did change

  • Speed: The API now returns in 50 ms on average, compared with 500 ms before, a 10× speedup! ⚡
  • Accuracy: We detect more Gold OA, licenses, fresh OA URLs, and works that were once open access but are now closed. We detect less Green OA (but we’ll be able to improve that soon).
  • Curation UI: Users around the world can submit fixes via a web form; they go live in days.
  • Bulk curation: Publishers can now submit bulk changes directly to us when their journals change from closed to open (or vice versa); they go live within 2 weeks.
  • Bug-fix velocity: Cleaner code = faster bug fixes.

Do you need to do anything?

  • API-only: Nothing. You’re already on the new code and likely didn’t even notice.
  • Data-feed mirror: Download our one-time “May 20 Snapshot” and overwrite your current database; there were too many small tweaks for a changefile.

Meet the new Curation Portal

We heard loud and clear from our users that they need to be able to fix open access metadata errors when they find them. And that’s why we developed a community curation pipeline for Unpaywall. 

Found a record that still looks off? Head to unpaywall.org/fix, flag the issue, and we’ll merge your correction shortly (typically within 3 business days). Your expertise powers continual data quality improvements. 

If you have ideas on how to improve the functionality of the curation user interface, please send them to brett@ourresearch.org


Looking ahead

  • Community curation of Unpaywall will become increasingly important for overall database accuracy, and fixes in Unpaywall will propagate to all downstream users (Web of Science, Scopus, Dimensions, and more).
  • We will collaborate more closely with publishers directly to make large-scale changes associated with journal policy changes more quickly and accurately.
  • We will continue refining specific parts of our pipelines to increase their overall reliability, including better detection of OA status, journal OA status, license information, and fulltext links.
  • Users will see faster patch cycles for reported issues.
  • We will increase repository coverage and enhance linkage between publisher and repository versions.
  • Later this summer, we’ll be launching a full re-write of OpenAlex to bring the databases into closer alignment where they overlap (i.e., OA status metadata for publications with Crossref DOIs)

Thank you

We heard loud and clear from our communities of users that timely fixes of data quality issues are critical for them to be able to rely on Unpaywall. And we know that our response times slipped while we tackled this rewrite; thanks for sticking with us!

If you spot an error in the Unpaywall database that you would like to see fixed, the fastest way is to report it at unpaywall.org/fix. If you have other questions, send a note to support@unpaywall.org.

Here’s to a faster, cleaner, and ever-more-useful Unpaywall!

The OurResearch Team

OpenAlex: 2024 in Review

As 2024 comes to a close, we’re taking the opportunity to reflect on the year behind us. And what a year it has been for OpenAlex!

It’s hard to believe that it was only one year ago that we launched the Beta of our web interface and the first university, the Sorbonne, announced that it was replacing its proprietary database with OpenAlex.

Since then the team has worked hard to meet the evolving needs of our communities of users. Below are some of the highlights of 2024.

Organization:

  • We received a 5-year grant from Arcadia totaling $7.5M to establish OpenAlex as a sustainably open index of the global research ecosystem
  • We received a 2-year grant from the Navigation Fund totaling $688k to enhance the OpenAlex user interface
  • We hired a Chief Operating Officer (Kyle Demes) and Senior Frontend Developer (Brett Lockspeiser)
  • Our Premium subscriptions exceeded our first year’s sustainability target by 25%

Data:

  • We started parsing fulltext PDFs to add more affiliation and reference metadata
  • We started matching references without DOIs (we now have 2.5B citations)
  • We added HAL as a primary source for new works
  • We started ingesting DataCite as a primary source. We now have 6.4M DataCite records (we’ll have them all in a few months)
  • We enhanced metadata accuracy for work type, publication year, author, institution, source, open access status, and more
  • Our data was adopted by three major university rankings
  • Our data was featured in a Science News article examining the sustainability of APC fees paid by researchers

Product:

  • We launched our Beta User Interface
  • We launched a new aboutness classification system (topics → subfields → fields → domains)
  • We launched new normalized citation metrics (field-weighted citation impact and citation percentiles) to facilitate comparison across fields and years.
  • We introduced user curation for affiliation, author, source, and work-level metadata and have already received more than 10k requests
  • We expanded our offerings of paid services to help us get to sustainability faster
  • We laid the foundation for an exciting new analytics product we’re looking forward to showing off early next year
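The normalized citation metrics above follow a standard definition: field-weighted citation impact is a work's citation count divided by the average citations of comparable works (same field and publication year). A minimal sketch, assuming that definition rather than the exact OpenAlex implementation:

```python
from statistics import mean

def fwci(work_citations: int, field_year_citations: list[int]) -> float:
    """Citations relative to the average of works in the same field and year."""
    expected = mean(field_year_citations)
    return work_citations / expected if expected else 0.0

# A work with 10 citations in a field-year cohort averaging 5 citations:
print(fwci(10, [2, 4, 6, 8]))  # → 2.0 (twice the expected impact)
```

An FWCI above 1.0 means the work is cited more than expected for its field and year, which is what makes the metric comparable across disciplines and publication ages.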

Community (you):

  • Monthly users of OpenAlex.org have grown from 28k at the beginning of the year to 78k, now representing 440k visits per month!
  • Our first OpenAlex User Meeting was a huge success with 27 presentations from OpenAlex users in diverse organizations around the world
  • We attended 9 conferences to promote OpenAlex and engage with our user community globally: Research Analytics Summit, CARA, BRIC, ICSSI, Make Data Count, LIS, STI, SRAI, The Charleston Conference and were truly humbled to see presenters and vendors at every conference using OpenAlex data!
  • We launched a YouTube channel which now has 49 videos, 736 subscribers, and almost 25,000 views!
  • Over 500 publications mention or reference OpenAlex and that number grows daily!
  • We hosted an open call for our first Community Advisory Board, where 50+ stellar nominees received almost 1,400 votes from the community; stay tuned for an announcement of results in early 2025

None of this would have been possible without all of you. So thank you! For your continued support, ideas, engagement, criticism, cheerleading, and collaboration. We’re looking forward to continuing to work together to build off these successes in 2025. Until then, Happy Holidays to you and yours.

Sincerely,

The OpenAlex Team