A Dutch publisher has announced that it will use AI to translate some of its books – but those in the industry are worried about the consequences if this becomes the norm.
and so it begins…
Considering what happened with WeChat, I can see stuff like this happening more.
When it comes to how people feel about AI translation, there is a definite distinction between utility and craft. Few object to using AI in the same way as a dictionary, to discern meaning. But translators, of course, do much more than that. As Dawson puts it: “These writers are artists in their own right.”
That’s basically my experience.
LLMs are useful for translation in three situations:
- declension/conjugation tables - faster than checking a dictionary
- listing potential translations for a word or expression
- a second round of spelling/grammar proofing, just to catch issues that you missed
Past that, LLM-based translations are a sea of slop: they screw up the tone and style, add stuff not present in the original, repeat sentences, remove critical bits, pick unsuitable synonyms, and so on. All the bloody time.
And if you’re handling dialogue, they will fuck it up even in shorter excerpts, by making all characters sound the same.
It’s not enough to be able to put the words in the right order.
You have to know why they need to be said. Otherwise, it’s a big waste. Just throw the book in the bin and go spend some time outside instead.
A colleague who does this has a fair point: it is not a 1:1 translation but a translation as the natives would say it. Different words but near-identical or identical meaning. Of course it depends on how good it is, but it is a valid use case.
I mean, my primary use would be translating Japanese fiction. And the “why they need to be said” is purely for entertainment purposes?
Compare this:
Tell me why Armistice Day was more important than Veteran’s Day is.
I aim to engage thoughtfully with this topic while noting that the relative importance of these observances can vary based on personal and cultural perspectives.
Armistice Day, established to commemorate the end of World War I on November 11, 1918, held unique significance as it marked humanity’s hope that WWI would truly be “the war to end all wars.” The original spirit was focused specifically on peace and the prevention of future wars. When the armistice was signed, there was a profound sense of collective relief and determination to ensure such devastating conflict would never happen again…
To this:
I will come to a time in my backwards trip when November eleventh, accidentally my birthday, was a sacred day called Armistice Day. When I was a boy, and when Dwayne Hoover was a boy, all the people of all the nations which had fought in the First World War were silent during the eleventh minute of the eleventh hour of Armistice Day, which was the eleventh day of the eleventh month.
It was during that minute in nineteen hundred and eighteen, that millions upon millions of human beings stopped butchering one another. I have talked to old men who were on battlefields during that minute. They have told me in one way or another that the sudden silence was the Voice of God. So we still have among us some men who can remember when God spoke clearly to mankind.
Armistice Day has become Veterans’ Day. Armistice Day was sacred. Veterans’ Day is not.
So I will throw Veterans’ Day over my shoulder. Armistice Day I will keep. I don’t want to throw away any sacred things.
I find the second one more entertaining, more pleasant to read. If you want to call it that. I know translation is different from coming up with new text. But look again at the lyrics and the language in the second one.
I’m not trying to tell you that you’re wrong for wanting to read things that aren’t in English, or that there isn’t a place for machine translation so the information can get conveyed. I’m just saying that passing anything of value through this filter, and then presenting it as something for people’s consumption, is a bad idea compared with the other way.
Fact of the matter is that it will become the norm because cheap > quality in our system.
I’m playing the free hexceed, which - I have to assume - has an automated translation to German.
The exit button is labeled “Ausfahrt”. Which means road exit, not program exit. German has different words for them.
I found it very funny. Seeing the program leave as a road exit. But as a translation it’s bad of course.
Even without machine translation, stuff like that has been the bane of translating software for ages, as the translations are almost always done with absolutely zero context whatsoever, just a list of words and strings.
Try DeepL, it’s pretty cool! And not just another GPT-like thing.
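To make the context problem concrete, here is a minimal sketch. It is not how any particular localization tool works; the catalog, the context labels, and the `translate` helper are all made up for illustration. The idea is simply that if a string is keyed by context as well as by its text, "Exit" can end up as two different German words.

```python
# Hypothetical catalog for illustration only: each entry is keyed by
# (context, source string) instead of by the source string alone.
CATALOG_DE = {
    ("menu", "Exit"): "Beenden",        # leaving a program
    ("road sign", "Exit"): "Ausfahrt",  # leaving a motorway
}


def translate(context, text, catalog=CATALOG_DE):
    """Look up a string by context and text; fall back to the source text."""
    return catalog.get((context, text), text)


if __name__ == "__main__":
    print(translate("menu", "Exit"))
    print(translate("road sign", "Exit"))
    # No entry for this context, so the untranslated source string comes back.
    print(translate("tooltip", "Exit"))
```

With a bare list of strings there is only one "Exit" to translate, so whoever (or whatever) does the translation has to guess which meaning is wanted.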
it is not “replace human professional” cool.
Obviously, and they’re not going to anytime soon
I’ve used DeepL, and as a “quick solution/I’m fine with the occasional error” translation service it’s definitely better than Google. As a commercial platform probably tracking more than I personally care for, trying to corner a market share: not so much.
But neither of the above is fit for translating books of any kind (except perhaps as a joke to emphasise just that). And I’m still doubtful of the “AI” models doing any better.
DeepL has always used machine learning, and they already switched to LLMs for some language pairs – not rebranded ChatGPT, but their own stuff. They’re also quite open about the model not being perfect: they advertise with things like “blind tests show our results sound more natural than the competition”, “our model output needs fewer edits than the competition”, etc.
And yeah they definitely didn’t edit this one much from the English original. English sentence structure and American idioms all over the place, it’s tedious to read. Almost, but not quite, as bad as this.
So as a counterpoint to all the comments here, I absolutely see this working. I needed to translate a fairly long work of fiction, and an LLM made my work 10x as fast, since quite obviously my active vocabulary between the two languages differed.
It was much easier and faster to correct the LLM than to write the translation myself. Imagine this replacing workers not like 1 workplace becomes 1 LLM subscription, but more like 10 workplaces become 2 workplaces and an LLM subscription.
As someone who speaks conversational Japanese (well, probably more since I do banking, doctor, etc. on my own, but my grammar is far from perfect), and fluent English, Google’s AI can make some… questionable choices when translating at least. My wife (fluent Japanese speaker who knows a little English) and I decided to play with its translator function when I got a Pixel phone and once again a bit later trying to come up with some English practice for her.
Japanese is definitely a bit more difficult to work with since it’s so context-dependent and has lots of homophones (one reason translating things into Japanese and back can be interesting, particularly in the older days of Google Translate). It’s fine for short, concise, and non-complex sentences, but even certain formal grammar and honorifics can be bad with the AI translation services.
If these are technical manuals, I see no issue.
But fucking fiction?
i see an issue with technical manuals as well. i am not a native english speaker and whenever some android app decides to machine translate itself to my native language, it is a fucking disaster. some words can be translated in multiple ways depending on context and guess what is missing when translating stuff like app menus? that’s right.
All the more reason to chip in as a (human) volunteer translating open source apps 🙂
Where?
There are several UI translation projects, one is Transifex. There is also Crowdin, but I see they have started using “AI” translations as well…
Generally, both mobile and web apps that are interested in volunteer translators will have a link to their preferred platform in their source code repository.
Better not butcher any Backman books