So, you published something and it was cited. That’s great! But it looks like there’s a problem — you check the number of citations on different websites and Crossref, Google Scholar, and Web of Science all disagree. Why is that? Did we do something wrong?
In reality, it’s quite normal for the citation counts on different platforms to be different. It’s all to do with which publications are included as sources and how the citations are collected.
So how does Crossref do it? Our citation counts only look at items in reference lists of works registered with us. The reference lists are provided by our members, but depositing them is optional because not every member has the resources to do so and, of course, some works don’t contain any references. We don’t scrape PDFs, full-text files, or websites to find references. Once we have the reference list in our system, we try to match any items without a DOI against other registered items. If we find a match, we add the DOI to the reference list entry. Sometimes references are altered after publication, and in this case we adjust the citation count. The citation counts we report cover citations between any two works that have a Crossref DOI, whatever their type.
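The process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Crossref’s actual implementation: the registry, the DOIs, the titles, and the exact-title matching rule are all invented for the example, and real reference matching is considerably more involved.

```python
from collections import Counter

# Toy registry of registered works: DOI -> title. All values are invented.
REGISTERED = {
    "10.5555/cited-1": "A study of widgets",
    "10.5555/cited-2": "Widgets revisited",
}


def normalise(text):
    """Lowercase and keep only letters, digits and spaces for a crude comparison."""
    return "".join(ch for ch in text.lower() if ch.isalnum() or ch == " ").strip()


def match_reference(ref):
    """Return a DOI for a reference entry, or None if no match is found.

    Assumption: an entry either carries a DOI already or is matched by an
    exact comparison of normalised titles. This rule is hypothetical.
    """
    if ref.get("DOI"):
        return ref["DOI"].lower()
    title = normalise(ref.get("title", ""))
    for doi, registered_title in REGISTERED.items():
        if title and title == normalise(registered_title):
            return doi
    return None


def citation_counts(reference_lists):
    """Count, for each cited DOI, the reference entries that resolve to it."""
    counts = Counter()
    for refs in reference_lists.values():
        for ref in refs:
            doi = match_reference(ref)
            if doi:
                counts[doi] += 1
    return counts


# Hypothetical deposited reference lists, keyed by the citing work's DOI.
deposited = {
    "10.5555/citing-a": [
        {"DOI": "10.5555/cited-1"},          # DOI supplied by the member
        {"title": "Widgets Revisited."},     # no DOI, matched by title
    ],
    "10.5555/citing-b": [
        {"title": "A Study of Widgets"},     # no DOI, matched by title
        {"title": "An unregistered report"}, # no match, so not counted
    ],
    "10.5555/citing-c": [],                  # registered without references
}

counts = citation_counts(deposited)
print(counts["10.5555/cited-1"])  # 2
print(counts["10.5555/cited-2"])  # 1
```

Note that the work registered without references and the unmatched report contribute nothing to any count, which is one way a count built like this can end up lower than a count from a service that scrapes full text.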
Other citation services do it differently. Some only include certain types of outputs (like journal articles or books) whereas others include works that don’t typically have a Crossref DOI (such as datasets or patents). Some are selective and only look at a limited corpus of works. Citation counts are often based on scraping full-text versions of works, either in PDF or XML form, and in some cases they are checked and moderated.
Why is there a difference?
No method is perfect, but two counts that disagree don’t necessarily mean that something went wrong. Different methods lead to different outcomes, and if you find that the counts are not the same there could be a number of reasons:
If the Crossref count is lower than another source, it could be because:
- The citing item doesn’t have a Crossref DOI, or we haven’t yet received the metadata.
- The publisher of the citing work has registered the DOI, but hasn’t included the references.
- We didn’t make the correct match or the other source made the wrong match.
- The reference was removed from the citing work after publication, but the other source hasn’t updated its count.
On the other hand, sometimes the Crossref count is higher, which can be because:
- The other service doesn’t include the citing work when counting citations: it might be the wrong type or not in their selected corpus.
- The cited item has been registered with Crossref but the other service hasn’t yet processed it.
- The reference was added after publication and hasn’t been reprocessed by the other service.
- We got the match wrong, or the other service missed the match.
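If you are able to get the list of citing works behind each count, comparing the two lists shows which of the cases above might apply. The following is a minimal sketch that assumes you already have the citing DOIs from each service as plain lists (not every service makes these available); the function name and the DOIs are made up for illustration.

```python
def compare_citing_sets(crossref_citing, other_citing):
    """Split citing DOIs into those seen by one service, the other, or both.

    DOIs are compared case-insensitively, so they are lowercased first.
    """
    a = {doi.strip().lower() for doi in crossref_citing}
    b = {doi.strip().lower() for doi in other_citing}
    return {
        "only_crossref": sorted(a - b),
        "only_other": sorted(b - a),
        "both": sorted(a & b),
    }


# Invented example data: citing works reported by each service.
crossref_list = ["10.5555/citing-a", "10.5555/citing-b"]
other_list = ["10.5555/CITING-B", "10.5555/citing-c"]

result = compare_citing_sets(crossref_list, other_list)
print(result["only_crossref"])  # ['10.5555/citing-a']
print(result["only_other"])     # ['10.5555/citing-c']
print(result["both"])           # ['10.5555/citing-b']
```

Works that appear in only one list are the ones to check against the reasons listed above.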
What can you do if you see a difference in citation counts?
First, check the lists above to see if any of those cases apply. The most likely scenario is that it’s a difference in how the citations are counted.
Second, ask yourself why you need citation counts and which method best suits that purpose. For example, if you’re interested in citations by patents, try something with a broader catch than Crossref, like Google Scholar or Dimensions; if you’re mainly interested in citations from journal articles, try a selective database like Web of Science or Scopus; or if you’re interested in a particular field of study, try a field-specific database like PubMed.
Finally, if you see a reference that we’ve missed, you can check whether the citing work includes deposited reference metadata (look for the ‘reference’ section in the metadata record). If the reference is missing, you could ask the publisher to consider depositing the reference metadata. Many members already deposit reference metadata and next year we will be looking at how to encourage even more members to do so. If the reference metadata is available from our APIs but we didn’t match it and you think we should have done, please leave a comment here or start a new forum post.
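As a rough illustration of that check, the sketch below looks for the ‘reference’ section in a metadata record and reports how many entries carry a DOI. It assumes the record has the shape returned by the Crossref REST API works route (`https://api.crossref.org/works/{doi}`), with the deposited references in a `reference` list; the helper names and the sample record are invented, and `fetch_record` is defined but not called here so the example runs without network access.

```python
import json
import urllib.parse
import urllib.request


def fetch_record(doi):
    """Fetch the metadata record for a DOI from the Crossref REST API."""
    url = "https://api.crossref.org/works/" + urllib.parse.quote(doi)
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)["message"]


def summarise_references(record):
    """Report whether a record has deposited references and how many have DOIs."""
    refs = record.get("reference", [])
    with_doi = [ref for ref in refs if ref.get("DOI")]
    return {
        "has_reference_section": "reference" in record,
        "entries": len(refs),
        "entries_with_doi": len(with_doi),
        "entries_without_doi": len(refs) - len(with_doi),
    }


# Invented sample record, trimmed to the fields used here.
sample = {
    "DOI": "10.5555/example",
    "reference": [
        {"key": "ref1", "DOI": "10.5555/cited-1"},
        {"key": "ref2", "DOI": "10.5555/cited-2"},
        {"key": "ref3", "unstructured": "An unregistered report, 2019."},
    ],
}

summary = summarise_references(sample)
print(summary["has_reference_section"])  # True
print(summary["entries_without_doi"])    # 1
```

To check a real work, you could pass the result of `fetch_record("<the citing work's DOI>")` to `summarise_references`. If there is no reference section, the publisher has not deposited references for that work; if the entry is there but has no DOI, it was not matched.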
Citations ≠ References
A pedantic postscript: a citation and a reference aren’t the same thing, but they’re frequently used interchangeably and unfortunately our metadata doesn’t clearly differentiate them:
- A reference is an entry in a bibliography, typically a list at the end of a work like a research article or book chapter: there is one entry for each work.
- A citation is where the reference is called out in the text: this could happen several times in different parts of the text.
Crossref has a metadata section called references, but we don’t specify whether it should include references or citations. While most members include works only once in the reference list, there’s nothing to stop them listing every citation separately. When we count how many times a work is mentioned in ‘references’ we call it a ‘citation count’, which unfortunately confuses the two concepts.
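To see why the distinction matters for counting, here is a small sketch with invented data. It compares two ways of counting a cited work: once per entry in a deposited list, and once per distinct citing work. The function names and DOIs are hypothetical, and the example only shows how the two numbers can diverge when a member lists every citation separately.

```python
# Invented deposited lists: citing DOI -> cited DOIs, one item per entry.
# The first citing work lists the same cited work three times, as it might
# if every in-text citation were deposited as its own entry.
deposited = {
    "10.5555/citing-a": [
        "10.5555/cited-1",
        "10.5555/cited-1",
        "10.5555/cited-1",
        "10.5555/cited-2",
    ],
    "10.5555/citing-b": ["10.5555/cited-1"],
}


def count_entries(reference_lists, target):
    """Count every entry that points at the target work."""
    return sum(refs.count(target) for refs in reference_lists.values())


def count_citing_works(reference_lists, target):
    """Count each citing work at most once, however often it lists the target."""
    return sum(1 for refs in reference_lists.values() if target in refs)


print(count_entries(deposited, "10.5555/cited-1"))       # 4
print(count_citing_works(deposited, "10.5555/cited-1"))  # 2
```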