How can I translate questions from another language besides English?

tom_bogle · January 31, 2018, 2:58pm

Currently Transcelerator’s user interface and base questions are only in English. Some progress has been made toward translating the questions into Spanish, but there are a lot of questions and it’s no small job. We have had field requests for French and Tok Pisin, but so far there has not been enough clamor for it to get the ball rolling. (The task of translating to create a localized version of SIL software is normally the responsibility of the field(s) that want it.)
If you have a specific need, please search the existing feature requests to see if there is already a request for the language you’re interested in. If so, vote for it. If not, please add your request. The more requests we get, the better the chances this will be moved to the front burner.

Martin · May 29, 2018, 10:41am

+1 for French please

tom_bogle · August 9, 2018, 7:23pm

Check out the latest news! Localization into French is coming…

Martin · August 13, 2018, 12:25pm

Hi, this is great news. I was planning on spending money and having the pros at Deepl translate a first draft of all the questions. Look here:

It seems I cannot post links, so go and find Deepl and its Pro page and writeup. (I am not affiliated, but have been a happy user of the free online Deepl for months.)

For about 30 Euros, I would get a first draft of all En>Fr if I can weed out tags and repetitions. I was planning on using OmegaT and Deepl Pro also as a research effort.

Now that Transcelerator can “handle” more languages, this will move up on my priorities list. If anybody else (developers) wants to try it directly, I might consider sponsoring it. So let us stay in touch.

I will also now sign up for the Crowdin, but need to research who is operating it.

So I have signed up to Crowdin and have just pasted a few example translations in LUK under my new user name “Transmogrinator”. So you can go and check the quality which we would get from Deepl. So far it looks equal to or better than what I see from MS Translator.

Update: What you see is not pure Deepl, I have already edited a few corrections and improvements. So with a professional TM and just some edits we could get this done nicely. I was happy how spiritual stuff like heaven/ciel was handled.

tom_bogle · August 13, 2018, 8:20pm

Thanks for your enthusiastic reply. And thanks for checking in before going for the deep dive. After releasing this version of Transcelerator and uploading the new format of the localizable file to Crowdin, I started working through the process of trying to upload my existing Spanish translations and realized I hadn’t fully understood their model and how it ties in with XLIFF. So I need a little more time to explore options and think things through to make sure we have a smooth path forward. I’ll keep you posted…

Martin · August 14, 2018, 10:29am

Yes, very good. If you have personally invested love and time into a Spanish version, this is best for figuring things out with Crowdin.

I would not mind experimenting (for my personal learning) with OmegaT and DeeplPro. So if you ever have time, you could send me simple text files in English (utf-8). Since I am doing “stuff like that” on the side, on top of my main work, I will need a few weeks anyway.

We could then benefit from your Spanish discoveries and see whether we want to upload my automatic drafts in Crowdin (or not). And how it could be done.

Personally I would rather review and correct 9000 questions and answers than work them out from scratch. At least working from a machine draft would help avoid many “Martin generated French typos”.

But for now, you should celebrate a little longer the fact that this tool is going i18n or L10n. I will hold my p3e.

tom_bogle · August 14, 2018, 2:07pm

By “simple text files”, I assume you mean an XML format such as XLIFF or something similar. I’d be happy to do that once we settle on a definite strategy. (Crowdin does allow uploads of XLIFF files, so any tool or process that can take an XLIFF file (v 1.2) and add the needed translations will fit in easily with Crowdin.)
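To illustrate what "take an XLIFF file (v 1.2) and add the needed translations" could look like, here is a minimal Python sketch. It assumes the usual XLIFF 1.2 layout of `trans-unit` elements holding a `source` and an optional `target`; the sample questions and the `add_translations` helper are invented for the example, and the real LocalizedPhrases file may carry extra attributes or grouping that a real tool would need to preserve.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
ET.register_namespace("", XLIFF_NS)  # write the namespace back out as the default one


def add_translations(xliff_text, translations):
    """Return XLIFF text with <target> filled in for every known source string.

    `translations` maps an English source string to its translation.
    Hypothetical helper: it ignores inline markup inside <source>.
    """
    root = ET.fromstring(xliff_text)
    for unit in root.iter(f"{{{XLIFF_NS}}}trans-unit"):
        source = unit.find(f"{{{XLIFF_NS}}}source")
        if source is None or source.text not in translations:
            continue
        target = unit.find(f"{{{XLIFF_NS}}}target")
        if target is None:
            # In XLIFF 1.2 the <target> element follows <source>.
            target = ET.Element(f"{{{XLIFF_NS}}}target")
            unit.insert(list(unit).index(source) + 1, target)
        target.text = translations[source.text]
    return ET.tostring(root, encoding="unicode")


# Made-up sample data, not taken from the real Transcelerator file.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file source-language="en" target-language="fr" datatype="plaintext" original="sample">
    <body>
      <trans-unit id="q1"><source>Who is John?</source></trans-unit>
      <trans-unit id="q2"><source>Where did he go?</source></trans-unit>
    </body>
  </file>
</xliff>"""

result = add_translations(SAMPLE, {"Who is John?": "Qui est Jean ?"})
print(result)
```

Units without a known translation are left untouched, so a partially translated file stays valid input for the next pass.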

I’m sure that for both Spanish and French (and other well-traveled languages with gigantic corpuses and lots of work invested in optimized machine-based translation approaches), we’ll definitely find that the fastest way to the finish line is to start with a machine-translated draft.

In response to your Crowdin message, yes, I realize that Crowdin is unable to interpret the structure of the incoming file and has no idea about Scripture books. I could change the approach to separate the questions for each book into a different file. The up-side of that is that Transcelerator could load them lazily, which might speed up initial opening time (especially if a filter is already in place) and reduce memory size. The biggest downside would be that it would be hard to properly deal with strings that occur in multiple books, without ending up with potentially unnecessary duplicates. Also, there would be additional file opening overhead, and a good bit more complexity in the Installer, as well as administrative overhead shepherding multiple files through the L10N process.
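The duplicate problem with per-book files can be made concrete with a small Python sketch. The book codes and questions below are invented sample data, and `shared_strings` is a hypothetical helper, not anything in Transcelerator; it only shows how one might measure how many strings would end up repeated across files.

```python
from collections import defaultdict


def shared_strings(questions_by_book):
    """Map each string that occurs in more than one book to the books using it."""
    books_for = defaultdict(set)
    for book, questions in questions_by_book.items():
        for question in questions:
            books_for[question].add(book)
    return {q: sorted(books) for q, books in books_for.items() if len(books) > 1}


# Invented sample data for illustration only.
sample = {
    "MAT": ["Who is speaking?", "What did Jesus say?"],
    "LUK": ["Who is speaking?", "Where did they go?"],
    "ACT": ["Where did they go?", "Who is speaking?"],
}

duplicates = shared_strings(sample)
print(duplicates)
```

Every entry in the result is a string that would need either a duplicate translation in several files or some shared lookup, which is the extra complexity described above.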

Martin · August 14, 2018, 2:31pm

Oops. Yes, all your hesitations are correct. My idea (chop the work into books or smaller portions with names) was only meant for the Crowdin translation volunteers. I had assumed that you would splice the results back into the original format for plugging into the existing Transcelerator structures.

I bet that Crowdin would find matches even across several individual files, especially within one project. So chopping your mother-file would be extra work for you if it is XML (each portion needs a header). So you could ask around, among people in your office or here in this forum, whether smaller portions would motivate them or not.

tom_bogle · August 16, 2018, 1:18pm

Hold the presses! Just in case anyone stumbles upon this and wants to start localizing, please be aware that we have decided to change course and use an XLIFF-compliant file to store localized strings. So if you do any translation to create a new localization (outside of Crowdin), there will probably not be an easy (non-manual) way to get your translations shifted over to the new approach. I hope to have a better new version available soon…

tom_bogle · September 4, 2018, 8:17pm

The XLIFF-compliant version is now available and the strings are up on Crowdin. I am aware of the serious performance issues with this version of the program if you try to display the questions in a language other than American English. I think that will be easy to fix, but it might be a few weeks before I get to it. The main priority for me right now was to get a working “proof of concept” and to get the strings up on Crowdin. But if you have a fast machine and a little patience, feel free to check out this feature in the new version. To try it, on the View menu, point to Display Language, and then select the language of your choice. (French won’t be displayed until we have some French strings localized and include that in the Installer.)

Martin · November 28, 2018, 11:01pm

I spent half the night just now figuring out how to use XLIFF files in my good ole OmegaT.

I found some Okapi Rainbow tool, which can convert an XLIFF file into an OmegaT translation memory, thereby making available any (potentially) already translated segments. And all segments should also come up as source-matter.
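The idea of turning already translated segments into a translation memory can be sketched in a few lines of Python. This is not what Rainbow does internally; it is only a rough illustration, assuming plain XLIFF 1.2 `trans-unit` elements with `source` and `target` children. The sample content and the `translated_pairs` helper are made up.

```python
import xml.etree.ElementTree as ET

NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}


def translated_pairs(xliff_text):
    """Collect source -> target pairs for units that already have a translation.

    Hypothetical helper: inline markup inside <source>/<target> is ignored.
    """
    root = ET.fromstring(xliff_text)
    pairs = {}
    for unit in root.iterfind(".//x:trans-unit", NS):
        source = unit.findtext("x:source", default="", namespaces=NS)
        target = unit.findtext("x:target", default="", namespaces=NS)
        if source.strip() and target.strip():
            pairs[source] = target
    return pairs


# Made-up sample: one translated unit, one empty target, one without a target.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file source-language="en" target-language="fr" datatype="plaintext" original="sample">
    <body>
      <trans-unit id="q1"><source>Who is John?</source><target>Qui est Jean ?</target></trans-unit>
      <trans-unit id="q2"><source>Where did he go?</source><target></target></trans-unit>
      <trans-unit id="q3"><source>Why did he leave?</source></trans-unit>
    </body>
  </file>
</xliff>"""

memory = translated_pairs(SAMPLE)
print(memory)
```

A real translation memory would normally be written out as TMX for OmegaT; the dictionary here only stands in for that step.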

Since OmegaT is my best access to the commercial Deepl service (rather than doing thousands of copy-and-paste operations by hand), and since having the data on my machine also means that I will need to pay less money as the work progresses and my tm grows, I beg again to please receive a copy of the XLIFF file called LocalizedPhrases.xlf.

I understand if you do not want to unlock the download in Crowdin (still locked), but maybe you can check my SIL credentials with our TGB translation coordinator (should be Joseph…) and send me (at least Luke) for testing. I want to translate the Questions for Luke since the project needs to get into checking asap - and we need to know whether Transcelerator will work for us or not.

Once we got Luke up and running, it might encourage other users to have a look, and then maybe help with the localization…

I have installed the latest Transcelerator and have spied around. It works fine in PT8, showing the Spanish questions. Took me a while to spot the LocalizedPhrases-es.xlf file. So French should also work, seems you got the technology figured out, compliments.

Grovel, grovel, please let us have the questions. Thank you.

tom_bogle · November 30, 2018, 7:33pm

I have enabled downloading of the files. Sorry, I hadn’t noticed that checkbox before. The Transcelerator project is entirely open source, so there is absolutely no reason to try to prevent downloading the XLIFF file. No groveling necessary.

Martin · December 1, 2018, 3:43pm

Awesome. I am in a two-day meeting, so I just managed to download the file from Crowdin. And I converted it rather painlessly into an OmegaT project. My few previous translation drafts were processed into its tm, so I do not need to do those again. (This is very important, as this detail will allow other volunteers to also work on this file and/or on Crowdin without repeating the same work.)

I believe in prototypes, from my first profession and from life experience. So I will try to translate 20 Euros worth of the file by Deepl Pro (and manual editing/correcting by myself). And then I would like to hear from you how this can/will show up in your (laboratory) installation of PT8. I noticed that for your Spanish tests, you also managed to include less-than-100% completed files.

If you find the quality acceptable (if you do not have access to Francophone consultants, we could together discuss how to approach some people we know here), could you please also test whether my 5% work on the file is worth splicing/merging/hacking back into Crowdin? I am very much interested in your input and opinion (even emotions) on this one.

Thank you for doing all this and keeping it Open Source. That is very generous and modern, and hopefully encouraging for other volunteers too.

Good weekend.

tom_bogle · December 4, 2018, 2:15pm

I am downloading the few French strings you translated in Crowdin now and will incorporate those into a build in Transcelerator, so you can see the results of your effort. For the strings you translate in OmegaT, I think it would be best to upload them back into Crowdin. Technically that isn’t absolutely essential as long as OmegaT spits back out an XLIFF file that correctly preserves the necessary fields. But unless you can commit to doing the entire translation task yourself using OmegaT, we’d obviously like to keep the Crowdin site up-to-date so others can collaborate. Once you’ve uploaded your translations, I can download them and use them to update the file that ships with Transcelerator.

tom_bogle · December 4, 2018, 2:22pm

By the way, if you wish you can test your work out directly in Transcelerator before uploading it. Assuming your XLIFF file is correct, naming it LocalizedPhrases-fr.xlf and copying it to C:\Program Files (x86)\Paratext 8\plugins\Transcelerator will make it immediately available in Transcelerator the next time you start it up. (The version I actually ship will be a stripped down version that only contains the strings that have actually been translated.)
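For anyone curious what a "stripped down" file might involve, here is a rough Python sketch. It is only an assumption about the general idea (drop every `trans-unit` that has no non-empty `target`), not the actual build step used for Transcelerator. The `strip_untranslated` and `localized_file_name` helpers and the sample content are hypothetical; only the LocalizedPhrases-fr.xlf naming pattern comes from this thread.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
ET.register_namespace("", XLIFF_NS)  # keep the default namespace on output


def localized_file_name(language_code):
    """Build the file name pattern mentioned above, e.g. LocalizedPhrases-fr.xlf."""
    return f"LocalizedPhrases-{language_code}.xlf"


def strip_untranslated(xliff_text):
    """Return XLIFF text containing only units with a non-empty <target>."""
    root = ET.fromstring(xliff_text)
    unit_tag = f"{{{XLIFF_NS}}}trans-unit"
    target_tag = f"{{{XLIFF_NS}}}target"
    # Snapshot the elements first, since units are removed while walking the tree.
    for parent in list(root.iter()):
        for unit in parent.findall(unit_tag):
            target = unit.find(target_tag)
            if target is None or not (target.text or "").strip():
                parent.remove(unit)
    return ET.tostring(root, encoding="unicode")


# Made-up sample data for illustration.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file source-language="en" target-language="fr" datatype="plaintext" original="sample">
    <body>
      <trans-unit id="q1"><source>Who is John?</source><target>Qui est Jean ?</target></trans-unit>
      <trans-unit id="q2"><source>Where did he go?</source></trans-unit>
    </body>
  </file>
</xliff>"""

stripped = strip_untranslated(SAMPLE)
name = localized_file_name("fr")
print(name)
print(stripped)
```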

Martin · December 6, 2018, 7:11am

Sorry for the delay, we need to travel abroad tomorrow and this is a crazy week.

I got stuck with DeepL because they got a new tariff structure and I was un-subscribed while it happened. Just fixed this morning before breakfast. Good news: They raised the rate by 50% but turned it into a flat-rate.

So I will indeed try to have a draft for the entire file done this month. I hope I do not need to click CTRL-M 46,000 times but can find a button “do them all”.

I do not know yet how to upload my stuff back into Crowdin, except by copy and paste. So I need your help. I will do a portion or the entire beast. And will re-convert into a valid XLIFF (hopefully) and send it to you somehow. If you can figure out some magic and if you like, then I would do the same for the entire Spanish. Remember that I found a way to “save” the already existing Spanish translations via tm into OmegaT.

So let us stay in touch, I need your help a lot but maybe I can contribute some helpful “magic” with my 30 Euros which went to DeepL just now. Greetings. Will be traveling and hopefully online next week from Togo, but not before Monday.

tom_bogle · December 6, 2018, 1:27pm

For me, when I’m logged into Crowdin and go to the page for a specific language on the Transcelerator project, the … menu has a command to Upload Translations. It’s possible that that is an admin-only function. If so, you can just send me the translated XLIFF file and I can upload it.

Martin · December 10, 2018, 8:22pm

Great, I can see “upload translations” now too.

I need to tick boxes for two choices. And not wanting to ruin your Crowdin setup, I ask you what to tick:

I believe I have created valid XLIFF files, which reproduce the structure of the XLIFF which I had downloaded from Crowdin.

Now I need to tick or not-tick “Add translations that duplicate the source strings”.

And I need to tick or not-tick “Add translations that duplicate existing ones”.

Since I basically bring back a filled-in-version of the original file, I ask you to please advise which options will do least damage to your Crowdin project.

If I understand the second option correctly, I am being asked whether I want to add my translations alongside already stored translations. And my answer would be yes. But I do not understand and do not imagine the consequences for the first option either way.

I have done a lot of testing all day long today and am also keeping notes, so that you can maybe use the same for Spanish, if you want to try. Sadly I do not write Groovy scripts, which would be useful for automating OmegaT.

I need to do more tidying up of typography, but would be ready to upload my first 2000 segments for testing tomorrow, Tuesday.

Or if you find it safer, I would gladly e-mail you my LocalizedPhrases-fr.xlf file so you can have a look before we take any risk with the Crowdin project. I hope to have the entire French version machine-drafted by Wednesday night, unless you find bad structural damage and my work-flow is useless.

I can also send you my dirty-draft-notes so you can see my work-flow.

Greetings.

tom_bogle · December 11, 2018, 3:06am

No need to tick the first one, but no harm if you do either. That just controls whether it will upload the strings if the French or Spanish happen to exactly match the English. The only advantage of ticking it would be to be able to note that you had in fact looked at the string (as opposed to it being a new string, or one that had been previously overlooked). In the case of Transcelerator, it should be fairly rare that the strings would be an exact match. I’m not entirely sure of the meaning of the second tick box. The documentation seems to skirt past it without any real explanation. On the surface of it, it seems pointless: uploading strings that are already there doesn’t seem like it would do much. But I’m thinking that it possibly means that if there are two instances of an identical English string and one of them is already translated in Crowdin, this tick box could control whether to upload a duplicate copy of the translation for the other instance. Crowdin does allow for the two instances to be translated differently if necessary, but by default if you translate one, that translation is automatically applied to the other. So I’m really not 100% sure how to advise. Bottom line is it probably won’t make a significant difference. I’d probably just go with the default for now and leave it cleared unless/until we can see a use for it.
When you’re ready, feel free to upload what you have. We have nothing for the French (other than the handful of strings you did manually in Crowdin), so there’s nothing to lose. For the Spanish, on the off-chance that it did something disastrous, I could recover from my copies outside of Crowdin.
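To get a feel for how many strings each tick box might touch before uploading, a quick check along these lines could help. This is only a sketch of my reading of the two options, not how Crowdin works internally; the helper names and the sample units are made up.

```python
from collections import Counter


def same_as_source(units):
    """IDs of units whose translation is identical to the English source."""
    return [uid for uid, source, target in units if target and target == source]


def repeated_sources(units):
    """English strings that occur in more than one unit."""
    counts = Counter(source for _, source, _ in units)
    return sorted(s for s, n in counts.items() if n > 1)


# Invented sample units: (id, source, target).
units = [
    ("q1", "Amen", "Amen"),
    ("q2", "Who is speaking?", "Qui parle ?"),
    ("q3", "Who is speaking?", ""),
    ("q4", "Why?", "Pourquoi ?"),
]

identical = same_as_source(units)
repeats = repeated_sources(units)
print(identical, repeats)
```

If both lists come back short, as expected for Transcelerator, the choice of tick boxes is unlikely to matter much.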

Martin · December 13, 2018, 3:40pm

Dear Tom, did you or did I?

I tried before lunch today to upload and had the attempt crash with some vague error message. Now I want to upload again and find that suddenly the counter has jumped up from 0% to 99%. So was that my upload, or was it you who uploaded what I had sent you by e-mail?

I will filter on the Crowdin site and will see what might be the missing 1%.

My next steps would be:

adopt Luke and proofread that book over the coming weeks on Crowdin
or rather first learn how to proofread on Crowdin
recruit Francophone consultants and colleague-translators to adopt one book each and proofread
maybe recruit some educated friends in France to look at Crowdin and see whether they can help fix the machine translation into natural French where needed
beg you for some more documentation on the localisation (a long time ago you mentioned translating the contents of TxlQuestionWords.xml and maybe other files)
beg you for a little more documentation on the use of Transcelerator, especially some examples or screenshots of the Biblical Terms Rendering Selection Rules and of the Question Adjustments Window and on anything that needs treatment “in the guts of the PT8 installation”
I am offering to translate all written documentation from English to French (because I will learn from that)
If people find that my machine-translated draft for En>Fr is not entirely useless, I might take time and polish my personal notes on my workflow with OmegaT, Rainbow, DeepL Pro, Regex Tools and Crowdin. This will not be for end-users but for fellow nerds who fear nothing.

Please let me know your ideas too on what needs to happen to make Transcelerator fully available for francophone users. I only noticed the latest version 1.3.9.0 today (sorry) and updated. Nothing obvious has changed on the GUI, but I guess technically the tool is already ready for more languages, right?

Thank you for your help in all this. Greetings.