xlf-analyze

v0.3.5

Published

4 months ago

Get statistics from XLF translation files

leibrug

xlf xliff i18n internationalization translation localization l10n analyze

Script for analyzing .xlf files (XLIFF version 1.2).

Rationale and example run: https://leibrug.pl/xlf-analyze

Features

Count <target>s in total and per state (new, translated, etc.), as numbers and percentages
Compare content in new and translated <target>s with corresponding <source>s
Count words in <source>s (for amount/price estimation of translations)
Point out ids of problematic translations
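
To make the word-count feature more concrete, here is a minimal sketch of counting words in the text of a single <source>. The `countWords` helper is hypothetical and is not taken from xlf-analyze; it assumes that inline markup should be ignored and that words are separated by whitespace, which may differ from what the package actually does.

```javascript
// Hypothetical helper, not part of xlf-analyze.
// Assumption: inline tags are ignored and words are whitespace-separated.
function countWords(sourceText) {
  // Replace inline markup such as <x id="..."/> with a space.
  const withoutTags = sourceText.replace(/<[^>]*>/g, ' ');
  // Split on runs of whitespace and drop empty pieces.
  const words = withoutTags.trim().split(/\s+/).filter(Boolean);
  return words.length;
}

console.log(countWords('Hello world!')); // 2
```

Summing this value over every <source> in a file would give a rough figure for estimating translation effort.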

Usage

As a command

npx xlf-analyze <file> <options>

where <file> is a path to one of the language files (that is, not the "base" messages.xlf).

Run the script against every language file to get the full picture, since the results may differ between them.

As a module

const analyze = require('xlf-analyze');

analyze(file, options).then(result => console.log(result));

With this method, you need to interpret the result yourself; for example, it contains no percentage stats. The XlfAnalyzeResult interface describes the shape of the result object.
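
Since the module result carries no percentage stats, you may want to derive them yourself. The sketch below assumes you have already extracted a plain object that maps each state name to its <target> count; that input shape and the `toPercentages` name are hypothetical, so check XlfAnalyzeResult for the real property names.

```javascript
// Hypothetical input: a plain object mapping a state name to its <target> count.
// The real result shape is defined by XlfAnalyzeResult and may differ.
function toPercentages(counts) {
  const total = Object.values(counts).reduce((sum, n) => sum + n, 0);
  const percentages = {};
  for (const [state, n] of Object.entries(counts)) {
    // Round to one decimal place; report 0 when there are no targets at all.
    percentages[state] = total === 0 ? 0 : Math.round((n / total) * 1000) / 10;
  }
  return percentages;
}

console.log(toPercentages({ new: 5, translated: 15 })); // { new: 25, translated: 75 }
```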

Options

Options can be passed as flags (for the npx use case) or as a plain object (for the module use case). All of them are boolean, optional, and false by default. Examples:

# Both have equal effect (word counter disabled by default or by explicit option)
npx xlf-analyze messages.pl.xlf
npx xlf-analyze messages.pl.xlf --words=false

# Both have equal effect (word counter enabled)
npx xlf-analyze messages.pl.xlf --words
npx xlf-analyze messages.pl.xlf --words=true

// Word count enabled in-code
analyze('messages.pl.xlf', { words: true });

You can also store values in environment variables so that you don't need to type the flags on every run. If command-line arguments are passed, they take precedence. This works only for the npx use case.

| Description                             | CLI flag | Property of options | Environment variable |
| --------------------------------------- | -------- | ------------------- | -------------------- |
| Add word count to the output.           | --words  | words               | XLF_ANALYZE_WORDS    |
| Return ids of problematic <trans-unit>s | --ids    | ids                 | XLF_ANALYZE_IDS      |
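
The precedence rule (command-line flag first, then environment variable, then the default of false) can be sketched as follows. This is only an illustration, not the package's actual code: `resolveOption` is a hypothetical name, and the assumption that an environment variable enables an option when its value is the string "true" is not confirmed by this README.

```javascript
// Hypothetical sketch of flag-over-environment precedence; not the package's actual code.
// Assumption: an environment variable enables an option when its value is the string "true".
function resolveOption(cliValue, envValue) {
  if (cliValue !== undefined) {
    return cliValue; // an explicit command-line flag always wins
  }
  return envValue === 'true'; // otherwise use the environment, defaulting to false
}

// No flag passed on the command line, so the environment variable (if set) decides.
const words = resolveOption(undefined, process.env.XLF_ANALYZE_WORDS);
console.log(words);
```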

Problematic translations

newButChanged

Has state="new", but differs from the source

The state was probably not updated after translation:

<source>Hello!</source>
<target state="new">Cześć!</target>

translatedButUnchanged

Has state="translated", but is the same as the source

It may be incorrectly marked:

<source>Hello!</source>
<target state="translated">Hello!</target><!-- still needs translation -->

Another example is a proper name...

<source>KFC</source>
<target state="translated">KFC</target><!-- may be worth undoing i18n on such text -->

...or something trivial that shouldn't be marked for translation, like punctuation or interpolation:

<source>.</source>
<target state="translated">.</target>

empty

Empty or missing state value

<trans-unit id="...">
  <source>Hello!</source>
  <target>Cześć!</target>
</trans-unit>
<trans-unit id="...">
  <source>Hello!</source>
  <target state="">Cześć!</target>
</trans-unit>

none

No translation at all

<trans-unit id="...">
  <source>Hello!</source>
  <!-- missing target element -->
</trans-unit>

nonStandard

Non-standard state value

The XLIFF 1.2 schema defines 10 standard state values, and the translation uses none of them:

<source>Hello!</source>
<target state="x-halfway">Cze</target>
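
The five categories above can be summarized in one small classifier. This is a sketch under stated assumptions, not the package's implementation: the `classify` helper, the `{ source, target, state }` input shape, and the order in which the checks run are all hypothetical. The list of standard states is the set of 10 values that XLIFF 1.2 defines for the state attribute.

```javascript
// The 10 state values that XLIFF 1.2 defines for <target>.
const STANDARD_STATES = new Set([
  'final', 'needs-adaptation', 'needs-l10n', 'needs-review-adaptation',
  'needs-review-l10n', 'needs-review-translation', 'needs-translation',
  'new', 'signed-off', 'translated',
]);

// Hypothetical helper: `unit` is { source, target, state }, where `target` is
// undefined if the <target> element is missing and `state` is undefined or ''
// if the attribute is absent or empty. The order of the checks is an assumption.
function classify(unit) {
  if (unit.target === undefined) return 'none';
  if (!unit.state) return 'empty';
  if (!STANDARD_STATES.has(unit.state)) return 'nonStandard';
  if (unit.state === 'new' && unit.target !== unit.source) return 'newButChanged';
  if (unit.state === 'translated' && unit.target === unit.source) return 'translatedButUnchanged';
  return null; // nothing to report
}

console.log(classify({ source: 'Hello!', target: 'Cześć!', state: 'new' })); // newButChanged
```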
