xlf-analyze
v0.3.5
Get statistics from XLF translation files
Script for analyzing .xlf files (XLIFF version 1.2).
Rationale and example run: https://leibrug.pl/xlf-analyze
Features
- Count `<target>`s per `state` (number and % of all, `new`, `translated`, etc.)
- Compare content in `new` and `translated` `<target>`s with corresponding `<source>`s
- Count words in `<source>`s (for amount/price estimation of translations)
- Point out `id`s of problematic translations
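To make the first and third features concrete, here is a minimal sketch of counting `<target>` states and `<source>` words. It is not the package's implementation: the regex-based parsing, the whitespace-based word splitting, and the names `countStates` and `countSourceWords` are all assumptions made for illustration.

```javascript
// Sketch only: regex-based parsing, assuming at most one <target> per <trans-unit>.
function countStates(xlf) {
  const counts = {};
  const units = xlf.match(/<trans-unit\b[\s\S]*?<\/trans-unit>/g) || [];
  for (const unit of units) {
    const target = unit.match(/<target\b([^>]*)>/);
    let key;
    if (!target) {
      key = '(no target)';
    } else {
      const state = target[1].match(/\bstate="([^"]*)"/);
      key = state && state[1] ? state[1] : '(no state)';
    }
    counts[key] = (counts[key] || 0) + 1;
  }
  return { total: units.length, counts };
}

// Assumption: a "word" is any whitespace-separated token left after
// inline tags (such as <x .../> placeholders) are stripped from <source>.
function countSourceWords(xlf) {
  const sources = xlf.match(/<source\b[^>]*>[\s\S]*?<\/source>/g) || [];
  let words = 0;
  for (const source of sources) {
    const text = source.replace(/<[^>]+>/g, ' ').trim();
    if (text) words += text.split(/\s+/).length;
  }
  return words;
}
```

A real XML parser would be more robust than regular expressions; the sketch only shows what is being counted.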
Usage
As a command
npx xlf-analyze <file> <options>
where `<file>` is the path to one of the language files (that is, not the "base" messages.xlf).
Run the script against every language file to get the full picture, since the files may differ from one another.
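One way to collect the language files for such a run is sketched below. It assumes the files follow the `messages.<lang>.xlf` naming pattern seen in this README's examples; `isLanguageFile` and `listLanguageFiles` are hypothetical helpers, not part of xlf-analyze.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: accepts "messages.<lang>.xlf" and rejects the
// base "messages.xlf" (assumed naming convention, adjust as needed).
function isLanguageFile(name) {
  return /^messages\.[^.]+\.xlf$/.test(name);
}

// Returns the sorted paths of all language files directly inside `dir`.
function listLanguageFiles(dir) {
  return fs
    .readdirSync(dir)
    .filter(isLanguageFile)
    .sort()
    .map((name) => path.join(dir, name));
}
```

Each returned path can then be passed to `npx xlf-analyze` in turn.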
As a module
const analyze = require('xlf-analyze');
analyze(file, options).then(result => console.log(result));
With this method, you need to interpret the result yourself; for example, it contains no percentage stats. The `XlfAnalyzeResult` interface describes the shape of the result object.
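Since the module output carries no percentages, they can be derived from raw counts. The sketch below assumes you have already reduced the result to a plain `{ state: count }` object; that shape and the helper name `withPercentages` are hypothetical, so check `XlfAnalyzeResult` for the actual field names.

```javascript
// `counts` is a hypothetical { state: number } map, not the real result shape.
// Map the actual XlfAnalyzeResult fields onto it before calling this helper.
function withPercentages(counts) {
  const total = Object.values(counts).reduce((sum, n) => sum + n, 0);
  const rows = {};
  for (const [state, count] of Object.entries(counts)) {
    // One decimal place; an empty file yields 0 rather than NaN.
    const percent = total === 0 ? 0 : Number(((count / total) * 100).toFixed(1));
    rows[state] = { count, percent };
  }
  return { total, rows };
}
```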
Options
Options can be passed as flags (for the npx use case) or as a plain object. All of them are optional booleans that default to false. Examples:
# Both have equal effect (word counter disabled by default or by explicit option)
npx xlf-analyze messages.pl.xlf
npx xlf-analyze messages.pl.xlf --words=false
# Both have equal effect (word counter enabled)
npx xlf-analyze messages.pl.xlf --words
npx xlf-analyze messages.pl.xlf --words=true
// Word count enabled in-code
analyze('messages.pl.xlf', { words: true });
You can also store values in environment variables so that you don't need to pass flags on every run. Command-line arguments, when passed, take precedence. Environment variables work only for the npx use case.
| Description | CLI flag | Property of options | Environment variable |
| ----------------------------------------- | --------- | --------------------- | -------------------- |
| Add word count to the output. | --words | words | XLF_ANALYZE_WORDS |
| Return ids of problematic `<trans-unit>`s. | --ids | ids | XLF_ANALYZE_IDS |
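The precedence rule (command-line flag first, then environment variable, then the `false` default) can be sketched as follows. This is an illustration rather than the package's code: `flagValue` and `resolveOption` are hypothetical names, and treating only the string `"true"` as an enabled environment value is an assumption, since this README does not say which values the CLI accepts.

```javascript
// Returns true/false when the flag is present, undefined when it is absent.
function flagValue(argv, name) {
  for (const arg of argv) {
    if (arg === `--${name}`) return true;
    if (arg.startsWith(`--${name}=`)) {
      return arg.slice(name.length + 3) === 'true';
    }
  }
  return undefined;
}

// A CLI flag wins over the environment variable; the default is false.
// Assumption: the environment value "true" enables the option.
function resolveOption(argv, env, name, envName) {
  const fromCli = flagValue(argv, name);
  if (fromCli !== undefined) return fromCli;
  return env[envName] === 'true';
}
```

For example, `resolveOption(process.argv.slice(2), process.env, 'words', 'XLF_ANALYZE_WORDS')` would resolve the word-count option under these assumptions.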
Problematic translations
newButChanged
Has state="new", but the target differs from the source
The state was probably not updated after translation:
<source>Hello!</source>
<target state="new">Cześć!</target>
translatedButUnchanged
Has state="translated", but the target is the same as the source
The unit may be incorrectly marked as translated:
<source>Hello!</source>
<target state="translated">Hello!</target><!-- still needs translation -->
Another example is a proper name...
<source>KFC</source>
<target state="translated">KFC</target><!-- may be worth undoing i18n on such text -->
...or something trivial that shouldn't be marked for translation, such as punctuation or an interpolation:
<source>.</source>
<target state="translated">.</target>
empty
Empty or missing state value
<trans-unit id="...">
<source>Hello!</source>
<target>Cześć!</target>
</trans-unit>
<trans-unit id="...">
<source>Hello!</source>
<target state="">Cześć!</target>
</trans-unit>
none
No translation at all
<trans-unit id="...">
<source>Hello!</source>
<!-- missing target element -->
</trans-unit>
nonStandard
Non-standard state value
The XLIFF 1.2 schema defines 10 standard state values, and the translation uses none of them:
<source>Hello!</source>
<target state="x-halfway">Cze</target>
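The five categories above can be summarized as a small classifier. This is a sketch of the rules as described in this section, not the package's actual logic: the `classify` function, its input shape, the order of the checks, and the exact string comparison between source and target are assumptions.

```javascript
// The 10 state values defined for <target> in XLIFF 1.2.
const STANDARD_STATES = new Set([
  'final', 'needs-adaptation', 'needs-l10n', 'needs-review-adaptation',
  'needs-review-l10n', 'needs-review-translation', 'needs-translation',
  'new', 'signed-off', 'translated',
]);

// unit: { source: string, target?: string, state?: string } (hypothetical shape).
// Returns the category name, or null when the unit does not look problematic.
function classify(unit) {
  if (unit.target === undefined) return 'none';              // no <target> at all
  if (!unit.state) return 'empty';                           // missing or empty state
  if (!STANDARD_STATES.has(unit.state)) return 'nonStandard';
  if (unit.state === 'new' && unit.target !== unit.source) return 'newButChanged';
  if (unit.state === 'translated' && unit.target === unit.source) {
    return 'translatedButUnchanged';
  }
  return null;
}
```

For instance, `classify({ source: 'KFC', target: 'KFC', state: 'translated' })` falls into `translatedButUnchanged`, matching the proper-name example above.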