Multi-language support

Hello, we are sending data to Crossref via the XML schema. Our articles have two language options, Turkish and English. We want the display in Crossref to follow the first language of the article. We send the first language of the article via original_title_language. However, the notation in Crossref is referencing the tag. How can we include the <original_language_title> tag in the display? Or how can we do multi-language support? Thank you.

Hi @mkoc ,

Thanks for your message, and welcome to the community forum.

DOIs are really citation identifiers, so we advise that members register the journal-level metadata as well as your journal-article-level metadata as you believe those works will be cited. Do you believe that the content is more likely to be cited in English or Turkish? If the former, register the journal title in English. If the latter, register the journal title in Turkish.

We recommend registering the metadata in the language that matches the language of the content itself. For illustration, if the article (or, content) was written and published in Turkish, then the article would most likely be cited in Turkish, so we’d recommend that the metadata registered with us also be in Turkish.

Keep in mind that <original_language_title> is meant for use with translations only. It’s not a way to insert extra metadata in another language. If you want to submit multiple titles for a given article, to have the metadata in two languages, then you can register two <titles> tags.

Please let me know if you have any additional questions,
Isaac

Thank you very much for the answer. I needed this so much.

Dear @ifarley,

I was reading the following example:

Could you confirm which option below would be correct when the multilingual content is not a translation, please?

Option A:

<titles>
<title>When your best metadata isn't good enough: working with an imperfect specification</title>
<title language="fr">Quand vos meilleures métadonnées ne suffisent pas: travailler avec une spécification imparfaite</title>
</titles>

Option B:

<titles>
<title>When your best metadata isn't good enough: working with an imperfect specification</title>
</titles>
<titles>
<title language="fr">Quand vos meilleures métadonnées ne suffisent pas: travailler avec une spécification imparfaite</title>
</titles>

Option C:

<titles>
<title>When your best metadata isn't good enough: working with an imperfect specification</title>
</titles>
<titles language="fr">
<title>Quand vos meilleures métadonnées ne suffisent pas: travailler avec une spécification imparfaite</title>
</titles>

I tried reading the schema but couldn’t confirm the issue above:
https://0-data-crossref-org.libus.csd.mu.edu/reports/help/schema_doc/5.3.1/crossref5_3_1_xsd.html#titles

Thank you,
-Felipe.

Hi @fgnievinski ,

[Note: I was wrong on this recommendation; our documentation, at the time of the original post, was not as clear as it could be and I misinterpreted the recommendation - <original_language_title> is only for content that is a translation.]

Yes. This is the correct format for multilingual content that is not a translation:

<titles>
<title>When your best metadata isn't good enough: working with an imperfect specification</title>
<original_language_title language="fr">Quand vos meilleures métadonnées ne suffisent pas: travailler avec une spécification imparfaite</original_language_title>
</titles>

Warm regards,
Isaac

Hi Isaac, now I’m confused:

  • initially we had “Keep in mind that <original_language_title> is meant for use with translations only.”
  • later we had “This is the correct format for multilingual content that is not a translation … <original_language_title>”

The two cases seem contradictory, unless I’m misinterpreting them?

Kind regards
Felipe

Hi @fgnievinski ,

My apologies for the added confusion. I think the documentation could be clearer, so I’m going to update it.

I was wrong in my previous response. Sorry about that. <original_language_title> is only for content that is a translation. Like I said above, that could be clearer in the documentation too, so I’ll get that updated. It will probably be next week before that change is live on https://0-www-crossref-org.libus.csd.mu.edu/documentation/schema-library/markup-guide-metadata-segments/multi-language/.

The annotation in the schema for <original_language_title> is pretty legible so I don’t think we need an update there:

<xsd:documentation> The title of an entity in its original language if the registration is for a translation of a work. When providing the original language of a title, you should set the language attribute.</xsd:documentation>
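
As a rough illustration of that annotation, the Python sketch below flags any <original_language_title> that is missing a language attribute before a deposit is sent. The function name untagged_original_titles is hypothetical, and the fragment is assumed to be un-namespaced, as in the snippets in this thread; a real deposit file declares the Crossref namespace, so the tag lookup would need adjusting.

```python
import xml.etree.ElementTree as ET
from typing import List


def untagged_original_titles(titles_xml: str) -> List[str]:
    """Return the text of each <original_language_title> lacking a language attribute.

    Hypothetical helper: assumes an un-namespaced fragment like the thread's examples.
    """
    root = ET.fromstring(titles_xml)
    return [
        (el.text or "").strip()
        for el in root.iter("original_language_title")
        if not el.get("language")  # missing or empty attribute
    ]


tagged = (
    "<titles><title>Translated title</title>"
    '<original_language_title language="fr">Original title</original_language_title>'
    "</titles>"
)
untagged = (
    "<titles><title>Translated title</title>"
    "<original_language_title>Original title</original_language_title>"
    "</titles>"
)

print(untagged_original_titles(tagged))    # -> []
print(untagged_original_titles(untagged))  # -> ['Original title']
```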

In your examples above, option C is the one that works for multilingual non-translated content and for the schema.

My best,
Isaac

Thanks for the clarification, Isaac – now I can follow it. Just to recap:

1st) for single-language translated content, do:

<titles>
<title>When your best metadata isn't good enough: working with an imperfect specification</title>
<original_language_title language="fr">Quand vos meilleures métadonnées ne suffisent pas: travailler avec une spécification imparfaite</original_language_title>
</titles>

2nd) for multi-language content, do:

<titles>
<title>When your best metadata isn't good enough: working with an imperfect specification</title>
</titles>
<titles language="fr">
<title>Quand vos meilleures métadonnées ne suffisent pas: travailler avec une spécification imparfaite</title>
</titles>

How about a third common case: multi-language metadata for mostly single-language, non-translated content, where only the title (and abstract) are provided in a secondary language? For example, the full text is in Portuguese (primary language) but the title is also provided in English, for broader dissemination. Could the metadata for this third case be deposited as in the second case above? My intention is to eventually expose the multi-language metadata already set up in PKP/OJS.

Thank you!
-Felipe.

Hi Felipe,

Adding the article titles and the abstracts in different languages should be fine. Yes, this additional example can follow the 2nd option.

My best,
Isaac

Thanks once again for the confirmation, Isaac.

I hope you don’t mind a follow-up question: what would be the best level for setting a non-English primary language attribute – in the journal_metadata, journal_article, or titles tag?

Journal level:

<journal>
<journal_metadata language="pt">
<full_title>A Revista</full_title>
</journal_metadata>
<journal_article publication_type="full_text">
<titles>
<title>Quando mesmo os seus melhores metadados não são o suficiente: trabalhando com uma especificação imperfeita</title>
</titles>
<titles language="en">
<title>When your best metadata isn't good enough: working with an imperfect specification</title>
</titles>
(...)
</journal>

Article level:

<journal>
<journal_metadata>
<full_title>A Revista</full_title>
</journal_metadata>
<journal_article publication_type="full_text" language="pt">
<titles>
<title>Quando mesmo os seus melhores metadados não são o suficiente: trabalhando com uma especificação imperfeita</title>
</titles>
<titles language="en">
<title>When your best metadata isn't good enough: working with an imperfect specification</title>
</titles>
(...)
</journal>

Titles level:

<journal>
<journal_metadata>
<full_title>A Revista</full_title>
</journal_metadata>
<journal_article publication_type="full_text">
<titles language="pt">
<title>Quando mesmo os seus melhores metadados não são o suficiente: trabalhando com uma especificação imperfeita</title>
</titles>
<titles language="en">
<title>When your best metadata isn't good enough: working with an imperfect specification</title>
</titles>
(...)
</journal>

I think I remember reading that journal_metadata would be the recommended level, but I’m thinking the journal_article might be more appropriate.

Thanks,
-Felipe.

Hi @fgnievinski ,

In your use case here, I’d recommend adding it to both the journal-level and journal-article-level metadata. Why not both?! I think it makes sense for it to be included at both levels.

-Isaac

I tried to use the mentioned format for multilingual metadata for conference proceedings, but it doesn’t work.

What I tried:

<conference>
	<!-- ... -->
	<proceedings_metadata language="uk">
		<proceedings_title>КОНСТРУКТИВНА РЕФЛЕКСІЯ КОНФРОНТАЦІЇ І КООПЕРАЦІЇ: ПСИХОЛОГІЧНІ РИЗИКИ І РЕСУРСИ ВІЙНИ. Матеріали міжнародної міжгалузевої конференції</proceedings_title>
		<!-- ... -->
	</proceedings_metadata>
	<conference_paper publication_type="full_text" language="uk">
		<titles><title>МIЖНАРОДНИЙ ТУРИЗМ ТА ПУБЛIЧНА ДИПЛОМАТIЯ</title></titles>
		<titles language="en"><title>INTERNATIONAL TOURISM AND PUBLIC DIPLOMACY</title></titles>
		<!-- ... -->
	</conference_paper>
</conference>

The error I got:

Error: cvc-complex-type.2.4.a: Invalid content was found starting with element 'titles'. One of '{"http://0-www-ncbi-nlm-nih-gov.libus.csd.mu.edu/JATS1":abstract, "http://0-www-crossref-org.libus.csd.mu.edu/schema/5.3.0":publication_date, "http://0-www-crossref-org.libus.csd.mu.edu/schema/5.3.0":acceptance_date, "http://0-www-crossref-org.libus.csd.mu.edu/schema/5.3.0":pages, "http://0-www-crossref-org.libus.csd.mu.edu/schema/5.3.0":publisher_item, "http://0-www-crossref-org.libus.csd.mu.edu/schema/5.3.0":crossmark, "http://0-www-crossref-org.libus.csd.mu.edu/fundref.xsd":program, "http://0-www-crossref-org.libus.csd.mu.edu/AccessIndicators.xsd":program, "http://0-www-crossref-org.libus.csd.mu.edu/clinicaltrials.xsd":program, "http://0-www-crossref-org.libus.csd.mu.edu/relations.xsd":program, "http://0-www-crossref-org.libus.csd.mu.edu/schema/5.3.0":archive_locations, "http://0-www-crossref-org.libus.csd.mu.edu/schema/5.3.0":scn_policies, "http://0-www-crossref-org.libus.csd.mu.edu/schema/5.3.0":doi_data}' is expected.
Error: cvc-complex-type.3.2.2: Attribute 'language' is not allowed to appear in element 'titles'.

Is it not supported for conference proceedings, or am I doing something wrong? The schema indeed doesn’t mention a language attribute on either the titles or title elements, but the same is true for titles in journal_article.

I would also like to enter an English version of the title of the proceedings book itself, if that is supported.
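
The two validator errors above can be caught before submission with a small pre-check. The following Python sketch is only an illustration: lint_titles is a hypothetical helper, it assumes an un-namespaced fragment like the one posted, and the two rules it applies (no language attribute on <titles>, at most one <titles> per <conference_paper>) are inferred from those error messages rather than read from the schema.

```python
import xml.etree.ElementTree as ET
from typing import List


def lint_titles(fragment: str) -> List[str]:
    """Report <titles> usage that the validator errors in this thread point to.

    Hypothetical pre-check; the rules are inferred from the error messages only.
    """
    root = ET.fromstring(fragment)
    problems: List[str] = []
    for parent in root.iter():
        # Only direct children, so each <titles> is counted under its own parent.
        titles = [child for child in parent if child.tag == "titles"]
        for t in titles:
            if "language" in t.attrib:
                problems.append(
                    f"<titles> inside <{parent.tag}> carries a language attribute"
                )
        if parent.tag == "conference_paper" and len(titles) > 1:
            problems.append(
                f"<conference_paper> contains {len(titles)} <titles> elements"
            )
    return problems


sample = """
<conference>
  <conference_paper publication_type="full_text" language="uk">
    <titles><title>Title in the primary language</title></titles>
    <titles language="en"><title>INTERNATIONAL TOURISM AND PUBLIC DIPLOMACY</title></titles>
  </conference_paper>
</conference>
"""

for problem in lint_titles(sample):
    print(problem)
```

Run on the sample, the check reports both issues; a fragment with a single, attribute-free <titles> returns an empty list.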

Hi @samogot ,

Thanks for posting this example, and apologies for the delayed response. Do you happen to have the submission ID that is producing that error message? If so, can you share it here so I can take a closer look?

-Isaac

Dear @ifarley, could you confirm if the citation and citation_list elements support the language attribute, please? Or do they only inherit the language attribute from the parent journal_article element?
Thanks,
-FGN.

Hi again Isaac. I’m afraid my constructed XML snippets might have misled @samogot. That’s because the language attribute remains undefined for the titles and title elements: neither has <xsd:attributeGroup ref="language.atts"/>, which is only defined for the original_language_title element:
Schema documentation for crossref5.3.1.xsd

This affects mainly the “third common case” discussed above (multilingual metadata of monolingual content), e.g.:

<journal>
<journal_metadata language="pt">
<full_title>A Revista</full_title>
</journal_metadata>
(...)
<journal_article language="pt">
<titles language="pt">
<title>Quando mesmo os seus melhores metadados não são o suficiente: trabalhando com uma especificação imperfeita</title>
</titles>
<titles language="en">
<title>When your best metadata isn't good enough: working with an imperfect specification</title>
</titles>
(...)
</journal>

In the current schema version, the secondary title’s language would have to be left unspecified. Could CrossRef consider supporting the language attribute for titles in the future, please? Otherwise, the value of providing multiple versions of titles seems more limited.

Thanks,
-Felipe.

Hello, I was wondering if this particular proposal could be considered in the roadmap for new features in CrossRef, please?

Thanks,
FGN.

Hello @fgnievinski ,

Thanks for following up. Can you help me understand the use case that the current schema is preventing you from accomplishing? I’m not sure I understand.

“The value of providing multiple versions of titles seems more limited.”

How so? DOIs are really citation identifiers, and in your case, we would certainly be matching citations based on both languages of the title in question.


As a reminder, I am going to include information on the best practice for multilingual content here (I’m using example DOIs from prefix 10.11606 below):

Advice on multilingual registrations often gets complicated because our members don’t always follow best practice and then want guidance on practices that fall short of our best practice. Given that, I want to take a step back and address our best practice recommendations for DOIs registered for multilingual content.

First off, let’s look at an example journal article where the full text of the article appears in English, Portuguese, and Spanish: https://www.revistas.usp.br/espinosanos/article/view/226719

In this example, the journal articles are published and available in English, Portuguese, and Spanish. Thus, best practice is to register a separate DOI for each of the English, Portuguese, and Spanish articles, since the article in English would be cited differently from the articles in Portuguese and Spanish (and journal article DOIs are really citation identifiers). In this hypothetical example, let’s say that the article in Portuguese is what you want to register as the primary DOI. You would register the Portuguese article by sending us only the article title in Portuguese in the XML registered with Crossref, like this:

<titles><title>Por uma Filosofia do Futuro em Ludwig Feuerbach</title></titles>

Then, the English translation of that journal article in Portuguese would be registered with its own distinct DOI and then linked to the DOI of the journal article in Portuguese. The title metadata would be registered with us like this:

<titles>
<title>For a philosophy of the future in Ludwig Feuerbach</title>
<original_language_title language="pt">Por uma Filosofia do Futuro em Ludwig Feuerbach</original_language_title>
</titles>

You’d also want to include a relationship in the metadata of the English translation to link back to the journal article in Portuguese, like this:

<program xmlns="https://0-www-crossref-org.libus.csd.mu.edu/relations.xsd">
<related_item>
<description>Portuguese translation of an article</description>
<intra_work_relation relationship-type="isTranslationOf" identifier-type="doi">10.11606/issn.2447-9012.espinosa.2024.226719</intra_work_relation>
</related_item>
</program>
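
For anyone generating this relationship block programmatically, a minimal Python sketch follows. The helper name build_translation_relation is hypothetical, and the namespace URI is copied exactly as it appears in the snippet above, so it should be confirmed against the relations schema before use in a real deposit.

```python
import xml.etree.ElementTree as ET

# Namespace copied as shown in the thread's snippet; verify before depositing.
REL_NS = "https://0-www-crossref-org.libus.csd.mu.edu/relations.xsd"


def build_translation_relation(original_doi: str, description: str) -> ET.Element:
    """Build a <program> element linking a translation to the original work's DOI.

    Hypothetical helper mirroring the structure of the example above.
    """
    program = ET.Element(f"{{{REL_NS}}}program")
    related = ET.SubElement(program, f"{{{REL_NS}}}related_item")
    desc = ET.SubElement(related, f"{{{REL_NS}}}description")
    desc.text = description
    relation = ET.SubElement(
        related,
        f"{{{REL_NS}}}intra_work_relation",
        {"relationship-type": "isTranslationOf", "identifier-type": "doi"},
    )
    relation.text = original_doi
    return program


# Emit the namespace as the default xmlns instead of an auto-generated prefix.
ET.register_namespace("", REL_NS)
program = build_translation_relation(
    "10.11606/issn.2447-9012.espinosa.2024.226719",
    "Portuguese translation of an article",
)
print(ET.tostring(program, encoding="unicode"))
```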

A full example of the XML needed for a translated article is available on our website. Also, relationships can be set for previously registered DOIs using what we call a resource-only deposit. An example of that for a translated article is available for review here.

That is best practice for translated DOIs. Anything short of that doesn’t meet best practice.

Warm regards,
Isaac

Hi @ifarley, sorry for my delay.

As you’ve described, the case for multi-language full-text content is well supported in CrossRef, especially with the isTranslationOf relationship and the discount for DOI minting of translations.

The use case I had in mind is for single-language full-text content with multi-language metadata. Journals whose primary language is not English often provide English translations of titles and abstracts. The main motivation is to improve content discoverability, in the hope that interested readers might find the publication. Upon accessing the full text (not in English), the reader can browse the figures and translate the most relevant parts of the article body.

DOIs are citation identifiers, so it’s the publication’s primary title that should always be included in citations, the one matching the full-text content. However, APA style and others often require the English translation of the title to be included between brackets in the citation text. Providing both titles in CrossRef would help in the automatic citation generation, or at least ease the burden on the writer who needs to manually tweak their list of references.
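
To make the bracket convention concrete, here is a minimal Python sketch of how a citation tool might combine the two titles if both were available. The helper name citation_title is hypothetical, and the "primary title [English translation]" layout is an assumption based on the APA convention mentioned above, not something Crossref provides.

```python
from typing import Optional


def citation_title(primary: str, english: Optional[str] = None) -> str:
    """Return the primary title, followed by the English translation in brackets if given.

    Hypothetical helper; the bracket layout is an assumed citation-style convention.
    """
    if english:
        return f"{primary} [{english}]"
    return primary


print(citation_title(
    "Por uma Filosofia do Futuro em Ludwig Feuerbach",
    "For a philosophy of the future in Ludwig Feuerbach",
))
```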

Currently, CrossRef already supports the language attribute for Abstract, even if it’s not typically used in citations. So, publishers can deposit Abstracts in different languages and tag them accordingly. The proposal would be to offer similar support for a language attribute in titles, too. It would especially benefit publications in smaller languages, which like to provide basic translations in the larger languages (English, French, etc.). Without a first-class space for title translations of single-language content, journals resort to questionable practices, such as cramming the English title at the beginning of the English abstract or, worse, at the end of the non-English title. These are all desperate measures to get the content discovered.

-Felipe.

Thanks for the additional context, Felipe!

Adding a language attribute for titles is not something that we’ll be adding. In your use case, only adding the title in multiple languages is a somewhat misleading practice, since the translated title is only available in the metadata. Thus, if a metadata user is looking for content in Spanish, French, and English and you’ve registered titles in all three languages within the metadata, but then there is no full-text in, say, French and English, we would argue that more harm than good has been done to that metadata user who was attempting to discover the content in those other languages. That metadata user would retrieve the metadata title in multiple languages to only find out that the full-text content itself was not available in those other languages. That’s not best practice.

Warm regards,
Isaac