How to reliably identify Crossref DOIs?

The “Crossref Display Guidelines (March 2017)” (DOI display guidelines - Crossref, retrieved 2021-12-09) say that https://doi.org is only for “Crossref DOIs, not anyone else’s DOIs, as not all DOIs are made equal.”

Neither this page nor the linked membership page clearly explains how to identify a Crossref DOI vs any other possible DOI.

An example URL is given, https://doi.org/10.xxxx/xxxxx, which seems to imply that:

  • All Crossref DOIs begin with “10.”
  • No non-Crossref DOIs begin with “10.”

Are both of these statements true? If not, what additional information do I need to reliably identify Crossref DOIs?

I am working on a script to automate conversion of references (in papers we publish) that include “doi:…” identifiers. I do not want to inadvertently convert to https://doi.org/ any doi: identifiers that do not “belong” to Crossref.

Thank you

Hi. The display guidelines say that they apply only to Crossref DOIs and not necessarily to those of other DOI agencies; they do not say that the resolver doi.org is for Crossref only. There are many other agencies of the DOI Foundation, so the first of your statements is true but the second is not: all DOIs begin with 10., even the DOIs that are for the entertainment industry or for the construction industry. Unfortunately, the DOI Foundation’s handbook is very out of date, but you can see their site to read about the other DOI agencies. If you’re scraping papers that you publish, then you’re probably fine to include all DOIs, whether they’re Crossref’s or not, as they will likely all be scholarly/research-related DOIs. Why limit to only Crossref’s, or to only the ones that are displayed as doi: (a format Crossref recommends against)?


Thank you for correcting my misunderstanding of the scope of the https://doi.org/ resolver, Ginny.

I think I understand that while a different DOI agency – say, the EU Publications Office, to just pick on the one I tested – might prefer to display their DOIs as just “doi:10.2870/…”, their DOIs should still be correctly handled and resolved by https://doi.org/ (the one I tested was).

As my organization is a (brand new) member of Crossref, it seems best (as you suggest is “probably fine”) to display any DOI using the resolver because that’s how we do it here in scholarly/research-related contexts.

To clarify the context, this scripting I’m developing is for our journal and proceedings editorial staff to efficiently process manuscripts on the way to publication. So, “scraping,” yes, but with decent human mediation :) Later steps will convert text that appears to be a URL into a hyperlink, so any DOIs already displayed using the resolver will also be “included,” not just ones initially given as a “doi:.” (Most come to us as “doi:” because that’s the recommendation – but not firm rule – for AMA style.)
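For anyone doing something similar, the “doi:” to resolver-URL step might look roughly like the sketch below. This is only an illustration under my own assumptions: the function name `doi_labels_to_urls` and the regular expression are hypothetical, the pattern assumes a prefix of “10.” followed by 4 to 9 digits and a suffix that runs to the next whitespace, and it only peels a trailing period, comma, or semicolon off the end. It does not check which agency registered the DOI, and it does not percent-encode unusual characters in the suffix.

```python
import re

# Assumed pattern (not an official DOI grammar): the label "doi:" in any case,
# optional spaces, then "10." + 4-9 digits + "/" + a suffix up to whitespace.
DOI_LABEL = re.compile(r"\bdoi:\s*(10\.\d{4,9}/\S+)", re.IGNORECASE)


def doi_labels_to_urls(text: str) -> str:
    """Rewrite 'doi:10.xxxx/yyyy' occurrences as 'https://doi.org/10.xxxx/yyyy'."""

    def replace(match: re.Match) -> str:
        doi = match.group(1)
        trailing = ""
        # Sentence punctuation directly after a DOI is almost never part of it,
        # so move it back outside the generated URL.
        while doi and doi[-1] in ".,;":
            trailing = doi[-1] + trailing
            doi = doi[:-1]
        return f"https://doi.org/{doi}{trailing}"

    return DOI_LABEL.sub(replace, text)


if __name__ == "__main__":
    print(doi_labels_to_urls("See doi:10.2870/12345."))
```

References that already use the resolver contain no “doi:” label, so they pass through unchanged and can be picked up by the later URL-to-hyperlink step.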

It seems that the actual answer to reliably identifying Crossref DOIs would have to be something like a real-time (or frequently cached) lookup into some list of all prefixes assigned to publishers by Crossref. This seems impractical, but also unnecessary for my purpose because non-Crossref DOIs will still work with the resolver.
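If a per-DOI check were ever needed, one option might be a lookup service rather than a maintained prefix list. The sketch below assumes that doi.org offers a “which registration agency” endpoint at https://doi.org/doiRA/ followed by the DOI, returning a JSON list of records with an "RA" field; please verify that against the current DOI Foundation documentation before relying on it. The helper names (`ra_lookup_url`, `parse_agency`, `registration_agency`) are hypothetical, and the sample payload in the usage lines is made up for illustration, not captured from the service.

```python
import json
import urllib.parse
import urllib.request


def ra_lookup_url(doi: str) -> str:
    """Build the (assumed) doi.org registration-agency lookup URL for a DOI."""
    return "https://doi.org/doiRA/" + urllib.parse.quote(doi, safe="/")


def parse_agency(payload: str):
    """Return the "RA" value from the first record, or None if it is absent.

    Assumes the response is a JSON list of objects such as
    [{"DOI": "...", "RA": "Crossref"}]; any other shape yields None.
    """
    records = json.loads(payload)
    if isinstance(records, list) and records and isinstance(records[0], dict):
        return records[0].get("RA")
    return None


def registration_agency(doi: str, timeout: float = 10.0):
    """Query the lookup service over the network and return the agency name."""
    with urllib.request.urlopen(ra_lookup_url(doi), timeout=timeout) as resp:
        return parse_agency(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Offline illustration with a made-up payload (no network call).
    sample = '[{"DOI": "10.5555/example", "RA": "Crossref"}]'
    print(ra_lookup_url("10.5555/example"))
    print(parse_agency(sample))
```

Results could be cached per prefix to avoid repeated requests, though for simply linking references this check is not needed.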

Thank you again,
David
