What the ROR integration tells us about metadata in Janeway

2,660 words

In this talk

Main question: How can we help Janeway users create and manage high-quality metadata that includes URIs?

  1. About ROR and Janeway
  2. The five whys of URIs
  3. Usability strategies for ROR features
  4. Comparison to ORCID and DOI features
  5. Takeaways
  6. References

About ROR and Janeway

We integrated ROR into Janeway in 2024-25

The ROR logo

The Research Organization Registry is “a global, community-led registry of open persistent identifiers for research and funding organizations.”

User stories:

  • “As an author I want to create and edit affiliations for me and my co-authors.”

  • “As an editor I want to edit affiliations for authors.”

  • “As an author or editor I want affiliations to be clear to readers and funders.”

Searching to add an affiliation on Janeway

A screenshot of the page in Janeway where you can search for an organization to add your affiliation.

ROR on Janeway article page

Screenshot showing an article page with a ROR-linked affiliation in the sidebar.

ROR in Janeway-generated Crossref metadata

Screenshot showing JSON Crossref metadata with a ROR-linked affiliation.

The five whys of URIs

Hold on. Why?

We now have some links:

Screenshot showing an article page with a ROR-linked affiliation in the sidebar.

Screenshot showing JSON Crossref metadata with a ROR-linked affiliation.

Why #1

Why do we need them? Can’t authors just type their affiliations in and call it a day?

Five whys: ask why five times to discover the root cause of a problem.

Griff, my college English professor: “Why is it thus and not otherwise?”

Organization names are re-usable and amorphous

Three separate organizations on three continents:

Name
Museum of Modern Art
Museum of Modern Art
Museum of Modern Art

Three names for one organization:

Name
University of Michigan–Ann Arbor
UMich
UM

Also: misspelling, capitalization, punctuation, translation...

Why #2

Why do people do this?

The vocabulary problem

In 1987, researchers at Bell Labs asked people to name things in several knowledge domains.

In most cases, people chose different words from each other.

“The data tell us there is no one good access term for most objects. The idea of an ‘obvious,’ ‘self-evident,’ or ‘natural’ term is a myth!”

They called this the “vocabulary problem in human-system communication.”

Why #3

Why is it worse when computers are involved?

Banal context collapse

The scholarly record available through the World Wide Web is international and immense, and what you find there has been recontextualized by library discovery systems and search engines.

Problems caused:

  • ambiguity - a word or phrase could refer to many things
  • duplication - multiple records exist for a single thing
  • retrieval failure - you don't always know the right search term
  • outdated references - past names may be inaccurate

Why #4

Why does the Web contribute to these problems?

The Web uses URLs

URLs are a ridiculously powerful vehicle. Their distinctive powers are:

  • universality
  • location

They can go get anything from any context (in theory) and put it next to anything else.

In a stadium at night, the words "This is for everyone" written in lights across the crowd

Web inventor Tim Berners-Lee's catchphrase "THIS IS FOR EVERYONE" at the 2012 Summer Olympics opening ceremony.

Why #5

I asked you about names. Why are we talking about links?

Links are names

When URLs refer to things in the world, not just their own webpages, they are called URIs or Uniform Resource Identifiers.

They are "identifiers" because they primarily identify the thing, and only secondarily point to a webpage about the thing.

Why #?

Using links as names is a horrible idea. They are too long and clunky. Can we shorten them at least?

You need the whole thing

https://ror.org/04a1a1e81

  • scheme: https
  • host: ror.org
  • path: /04a1a1e81

There are other schemes, like http and urn and ftp.

There are lots of hosts. The host is a namespace managed by an institution (e.g. ROR) that guarantees the URI is distinct, persistent, and uniform.

The path is essential. It distinguishes this item from all the others like it.
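
The three parts can be pulled apart with Python's standard library. This is only an illustrative sketch; the variable names are ours and nothing here is taken from Janeway's code.

```python
from urllib.parse import urlsplit

ror_uri = "https://ror.org/04a1a1e81"
parts = urlsplit(ror_uri)

scheme = parts.scheme  # how to fetch it
host = parts.netloc    # the namespace, managed by ROR
path = parts.path      # which item within that namespace

# Drop any one part and what remains no longer names the organization.
print(scheme, host, path)
```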

Why #?

So we are stuck putting raw links everywhere?

Usability strategies for ROR features

Principles that shed light

In 1990 Jakob Nielsen began developing usability heuristics for interaction design.

Today "Nielsen's Ten" is still used as a checklist for noticing what's wrong with a user interface, and what's working well.

An A4 size list of Nielsen's Ten Usability Heuristics

Users search with the name they know

Heuristic 2: Match between the system and the real world. The design should speak the users' language. Use words, phrases, and concepts familiar to the user, rather than internal jargon.

The Janeway ROR search with the search term "Washtenaw" highlighted

“Well, I know it has Washtenaw in it.”

Users don't edit the ROR ID or display name

Heuristic 5: Error prevention. Good error messages are important, but the best designs carefully prevent problems from occurring in the first place.

The Janeway ROR search showing that name and ROR ID cannot be edited

“Oh good, someone has already put it in.”

Users are shown other info they may recognize

Heuristic 6: Recognition rather than recall. Minimize the user's memory load by making elements, actions, and options visible. Avoid making users remember information.

The Janeway ROR search showing the information presented for recognition

“Ah yes, that's the WCC I know.”

Users can put ROR ID in Janeway using ORCID

The Janeway login screen with an option to use ORCID

The Janeway profile page showing a ROR affiliation that was populated from ORCID data

“It looks like it got my affiliation info from my ORCID profile.”

URIs combine to create linked open data

We can get ROR IDs programmatically from ORCID because they encode them as URIs, creating linked open data.

  1. User registers with ORCID
  2. Janeway checks their public ORCID profile for affiliations
  3. Janeway imports the primary affiliation and matches any ROR URI with other ROR data in the Janeway database
  4. Janeway shows the post-login screen to the user
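
Steps 2 and 3 might look roughly like the sketch below. It assumes the affiliations have already been fetched and flattened into simple dictionaries. The keys `org_id` and `id_source`, the `local_ror_records` lookup, and the choice to take the first ROR URI found are all hypothetical. The real ORCID response is more deeply nested, and this is not Janeway's actual implementation.

```python
from typing import Optional

ROR_PREFIX = "https://ror.org/"

# Hypothetical, simplified affiliation records. The real ORCID response
# is more deeply nested; these keys are made up for illustration.
sample_affiliations = [
    {"org_id": "12345", "id_source": "OTHER"},
    {"org_id": "https://ror.org/04a1a1e81", "id_source": "ROR"},
]

# Hypothetical stand-in for ROR data already stored locally.
local_ror_records = {"https://ror.org/04a1a1e81": {"local_pk": 1}}


def find_ror_uri(affiliations: list) -> Optional[str]:
    """Return the first organization identifier that is a ROR URI, if any."""
    for affiliation in affiliations:
        org_id = affiliation.get("org_id", "")
        if org_id.startswith(ROR_PREFIX):
            return org_id
    return None


ror_uri = find_ror_uri(sample_affiliations)
# Because the identifier is a full URI, it can be used directly as a lookup key.
local_record = local_ror_records.get(ror_uri)
```

The match needs no name comparison at all: the URI from one system is the key in the other.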

A colourful abstract network graph of data in Wikidata

Users expect us to talk to other systems

Heuristic 4: Consistency and standards. Users should not have to wonder whether different words, situations, or actions mean the same thing. Follow platform conventions.

The ORCID window for adding employment

“I have already entered my affiliation on ORCID and made it public. Can you just use that?”

Users may want to edit or delete the affiliation

Heuristic 3: User control and freedom. Users often perform actions by mistake. They need a clearly marked "emergency exit" to leave the unwanted action.

The Janeway window for editing an affiliation

“Ope. My affiliation on ORCID was out of date. Let me just change that.”

Comparison to ORCID IDs and DOIs

Turning to ORCID IDs

People’s names overlap and vary in form, just like organization names.

Real examples from orcid.org:

First name    Last name
Joseph        Muller
Joseph        Muller
Joseph        Muller
Joseph        Muller
Joseph        Muller

The ORCID logo

Should we use a search interface, like with ROR records?

A search interface to choose an ORCID is not always enough. It is better to have each ORCID holder authorize the link by logging in to ORCID.

Users can log in with ORCID

The Janeway login screen with an option to use ORCID

The ORCID sign-in page

“Yes please. One fewer password to track.”

Heuristic 2: Match between the system and the real world.

Heuristic 4: Consistency and standards.

Heuristic 6: Recognition rather than recall.

Oh dear! ORCID IDs can be saved in various ways

Here are some real ORCID ID field values for published OLH articles:

ORCID ID
0000-0002-2911-8382
https://orcid.org/0000-0002-9651-3315
orcid.org/0000-0001-9039-2201
https://orcid.org/ 0000-0001-8731-9097
ORCID: 0000-0002-1148-689X
000-0002-8489-4405
0009-0002-9526- 9012
nasukawa

Anything can be entered into the ORCID ID field, with no required format.

The ORCID field in Janeway

One result: Janeway might not know which ORCID ID format to check for, letting you create a duplicate account.

(Lest you get too worried: Janeway does make ORCID IDs uniform in outputs.)
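
As a rough sketch of what making these values uniform could involve, the function below pulls a 16-character ORCID iD out of free text and rewrites it as a full URI. A second function checks the final character using ORCID's published checksum (ISO 7064 MOD 11-2). The function names and the decision to silently drop whitespace are our assumptions. This is not a description of how Janeway does it.

```python
import re
from typing import Optional

# Four hyphen-separated groups; the last character may be a digit or X.
ORCID_PATTERN = re.compile(r"(?<!\d)(\d{4}-\d{4}-\d{4}-\d{3}[\dX])(?![\dX])")


def normalize_orcid(raw: str) -> Optional[str]:
    """Return the value as a full ORCID URI, or None if no iD is found."""
    compact = re.sub(r"\s+", "", raw).upper()  # assumption: stray spaces are noise
    match = ORCID_PATTERN.search(compact)
    if match is None:
        return None
    return "https://orcid.org/" + match.group(1)


def checksum_ok(orcid_uri: str) -> bool:
    """Check the final character against the ISO 7064 MOD 11-2 check digit."""
    digits = orcid_uri.rsplit("/", 1)[-1].replace("-", "")
    total = 0
    for char in digits[:15]:
        total = (total + int(char)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return digits[15] == expected


for value in ["ORCID: 0000-0002-1148-689X", "000-0002-8489-4405", "nasukawa"]:
    print(repr(value), "->", normalize_orcid(value))
```

Values like "000-0002-8489-4405" and "nasukawa" come back as None, which is the point at which an interface could ask the user to try again.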

Users never enter ORCID ID manually

Work in progress by Esther Verreau and Alainna Wrigley at CDL:

The new ORCID field in Janeway

This small button will make a big difference to users and improve the quality of ORCID metadata.

Heuristic 4: Consistency and standards.

Heuristic 5: Error prevention.

Heuristic 6: Recognition rather than recall.

Turning to DOIs

Article titles vary and overlap as well, so the five whys are similar for DOIs.

Real examples from Crossref:

Title
Back to the Future
Back to the Future
Back to the Future?
Back from the Future
Taking back the Future
Back to the Future II comment
The Future Is Back; Back to the Future!*

The DOI Foundation logo

Differences in circumstances:

  • Generation and registration of the URI rather than retrieval and linking
  • Journal editors and press managers, not authors
  • Additional namespace in URI path (e.g. 10.1234)
  • Multiple registration agencies
  • Location matters just as much as identity since a DOI represents a digital resource

Oh dear! Invalid DOIs can be created

Here are some real DOI field values for articles published on Janeway installations:

DOI
10.5334/gjgl.1610
10.16995/la.9419
10.16995/la.9419
/pn.2005
/ahac.9516
/cpo.1879
  • Problems with uniformity and

  • The middle ones are duplicates. This can lead to problems when registering the DOI, though only one is displayed on the public article page.

  • The last few are missing a DOI prefix (e.g. 10.16995) so won't resolve properly.

Users can check resolution before publishing

A screenshot of the verify DOI tool in Janeway

A screenshot of the DOI resolution status card in Janeway

To resolve, a DOI must be:

  • A valid URL
  • Registered with doi.org via Crossref or DataCite
  • Pointing back at the article page on Janeway
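
The three conditions above could be checked along these lines. This is a minimal sketch, not Janeway's verify tool. The function names are hypothetical, the URL check is deliberately crude, and the default resolver makes a real network request that some servers may refuse.

```python
from typing import Callable
from urllib.parse import urlsplit
from urllib.request import urlopen


def follow_redirects(url: str) -> str:
    """Return the final URL after redirects (performs a real network request)."""
    with urlopen(url, timeout=10) as response:
        return response.geturl()


def doi_points_at(
    doi: str,
    article_url: str,
    resolve: Callable[[str], str] = follow_redirects,
) -> bool:
    """True if https://doi.org/<doi> ends up at the expected article page."""
    doi_url = "https://doi.org/" + doi
    # Condition 1 (crude): the path must start with a DOI prefix such as /10.16995
    if not urlsplit(doi_url).path.startswith("/10."):
        return False
    try:
        # Condition 2: an unregistered DOI makes the request fail
        final_url = resolve(doi_url)
    except OSError:
        return False
    # Condition 3: the redirect chain ends at the article page
    return final_url.rstrip("/") == article_url.rstrip("/")
```

Passing a stand-in for `resolve` lets the logic be exercised without touching the network.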

Users can see how the DOI resolves and catch errors

This brings in two more usability heuristics:

Heuristic 1: Visibility of system status. Designs should keep users informed about what is going on, through appropriate, timely feedback.

Heuristic 9: Recognize, diagnose, and recover from errors. Error messages should be expressed in plain language (no error codes), precisely indicate the problem, and constructively suggest a solution.

What if: Users can check DOI validity earlier

We could add checks based on the expected character sequence of a DOI (using regular expressions).

These checks would happen before any communication with Crossref or DataCite, when the user...

  • edits the DOI prefix and pattern fields
  • advances an article, triggering a deposit
  • manually creates or edits a DOI
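
A sketch of what such a check could look like follows. The pattern is modeled on one Crossref has suggested for matching most modern DOIs. Treat it as an assumption: it will reject some legitimate older DOIs, and it is not what Janeway currently does. The function names are hypothetical.

```python
import re
from collections import Counter

# Prefix "10." plus 4 to 9 digits, a slash, then a suffix of common characters.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[-._;()/:A-Z0-9]+", re.IGNORECASE)


def looks_like_doi(value: str) -> bool:
    """True if the whole value has the expected shape of a DOI."""
    return DOI_PATTERN.fullmatch(value.strip()) is not None


def duplicated(values: list) -> list:
    """Return values that appear more than once, ignoring case and outer spaces."""
    counts = Counter(value.strip().lower() for value in values)
    return sorted(value for value, count in counts.items() if count > 1)


field_values = [
    "10.5334/gjgl.1610",
    "10.16995/la.9419",
    "10.16995/la.9419",
    "/pn.2005",
]
for value in field_values:
    print(value, looks_like_doi(value))
print("duplicates:", duplicated(field_values))
```

Both checks run locally, so the user can be told about a missing prefix or a duplicate before anything is sent for registration.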

Heuristic 1: Visibility of system status.

Heuristic 5: Error prevention.

Takeaways

Takeaways

  • When dealing with URIs, users work better when they have tailored interfaces that minimize error
  • Top strategies for guiding users with URIs include:
    • context-rich search
    • sensible limits on what can be edited
    • input validation
    • prepopulation from linked open data
  • Usability can have an effect on metadata quality
  • Janeway already provides many tailored interfaces, but we need a few more for ORCID IDs and DOIs, as well as any other URIs we integrate in the future

Thank you

References (1 of 2)

Furnas, G., T. Landauer, L. Gomez, and S. Dumais. “The Vocabulary Problem in Human-System Communication.” Communications of the ACM 30, no. 11 (1987): 964–71. https://doi.org/10.1145/32206.32212.

“Manuscript Submission Systems Integration Best Practices.” ORCID, n.d. Accessed May 18, 2026. https://info.orcid.org/manuscript-submission-systems/.

Molich, Rolf, and Jakob Nielsen. “Improving a Human-Computer Dialogue.” Communications of the ACM 33, no. 3 (1990): 338–48. https://doi.org/10.1145/77481.77486.

Muller, Joseph. From User Stories to High-Quality Data: Implementing ROR on the Janeway Platform. March 17, 2026. Text/html. https://doi.org/10.71938/E3FB-V153.

Nielsen Norman Group. “10 Usability Heuristics for User Interface Design.” Accessed May 18, 2026. https://www.nngroup.com/articles/ten-usability-heuristics/.

References (2 of 2)

Research Organization Registry (ROR). “About ROR.” Accessed May 18, 2026. https://ror.org/about/.

Webb, Nick. Tim Berners-Lee’s Tweet “This Is for Everyone” at the 2012 Summer Olympics Opening Ceremony. July 27, 2012. Flickr: DSC_3232. https://commons.wikimedia.org/wiki/File:This_is_for_Everyone.jpg.