Machine Learning Translation and the Google Translate Algorithm

The basic principles of machine translation engines

Google Machine Translation

Every day we use different technologies without even knowing exactly how they work. In fact, it’s not easy to understand engines powered by machine learning. The Statsbot team wants to make machine learning clear by telling data stories in this blog. Today, we’ve decided to explore machine translators and explain how the Google Translate algorithm works.


Years ago, it was very time consuming to translate text from an unknown language. Using a simple dictionary for word-for-word translation was hard for two reasons: 1) the reader had to know the grammar rules and 2) had to keep every version of the sentence in mind while translating it as a whole.

Now we don’t need to struggle so much: we can translate phrases, sentences, and even large texts just by putting them into Google Translate. But most people don’t actually care how the engine of machine learning translation works. This post is for those who do care.

Deep learning translation problems

If the Google Translate engine tried to store the translations of even short sentences, it wouldn’t work because of the huge number of possible variations. A better idea might be to teach the computer sets of grammar rules and have it translate sentences according to them. If only it were as easy as it sounds.

If you have ever tried learning a foreign language, you know that there are always a lot of exceptions to the rules. When we try to capture all these rules, exceptions, and exceptions to the exceptions in a program, the quality of translation breaks down.

Modern machine translation systems use a different approach: they extract the rules from text by analyzing a huge set of documents.

Creating your own simple machine translator would be a great project for any data science resume.

Let’s try to investigate what hides in the “black boxes” that we call machine translators. Deep neural networks can achieve excellent results in very complicated tasks (speech and visual object recognition), but despite their flexibility, they can be applied only to tasks where the inputs and targets have fixed dimensionality.

Recurrent Neural Networks

Here is where Long Short-Term Memory networks (LSTMs) come into play, helping us to work with sequences whose length we can’t know a priori.

LSTMs are a special kind of recurrent neural network (RNN), capable of learning long-term dependencies. All RNNs look like a chain of repeating modules.

Unrolled recurrent neural network

So the LSTM transmits data from module to module and, for example, to generate Ht we use not only Xt, but all previous inputs X. To learn more about the structure and mathematical models of LSTMs, you can read the great article “Understanding LSTM Networks.”
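To make this concrete, here is a minimal NumPy sketch of a single LSTM step and of unrolling it over a sequence. The parameter names, shapes, and toy data are ours for illustration, not taken from any particular library:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step. W (4H x D), U (4H x H), and b (4H,) stack the
    parameters of the four gates; names and shapes are illustrative."""
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b        # pre-activations for all gates
    i = sigmoid(z[0:H])                 # input gate: what to write
    f = sigmoid(z[H:2 * H])             # forget gate: what to keep
    g = np.tanh(z[2 * H:3 * H])         # candidate cell values
    o = sigmoid(z[3 * H:4 * H])         # output gate: what to expose
    c_t = f * c_prev + i * g            # new cell state
    h_t = o * np.tanh(c_t)              # new hidden state, our Ht
    return h_t, c_t

# Unrolling over a sequence: Ht depends on Xt and, through (h, c),
# on every earlier input.
D, H = 3, 4
rng = np.random.default_rng(0)
W, U, b = rng.normal(size=(4 * H, D)), rng.normal(size=(4 * H, H)), np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x_t in rng.normal(size=(9, D)):     # a toy 9-step input sequence
    h, c = lstm_step(x_t, h, c, W, U, b)
```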

Bidirectional RNNs

Our next step is bidirectional recurrent neural networks (BRNNs). What a BRNN does is split the neurons of a regular RNN into two directions. One direction is for positive time, or forward states; the other is for negative time, or backward states. The outputs of these two states are not connected to the inputs of the opposite-direction states.

Bidirectional recurrent neural networks

To understand why BRNNs can work better than a simple RNN, imagine we have a sentence of 9 words and want to predict the 5th word. We can let the model see either only the first 4 words, or the first 4 words and the last 4 words. Of course, the quality in the second case would be better.
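A minimal sketch of this idea, again in NumPy with illustrative names and toy data: we run one RNN forward, one backward, and concatenate the two states at every position, so the state for the 5th word summarizes both the first 4 and the last 4 words.

```python
import numpy as np

def rnn(xs, W, U, b, h0):
    """Plain tanh RNN: return the hidden state at every time step."""
    h, hs = h0, []
    for x in xs:
        h = np.tanh(W @ x + U @ h + b)
        hs.append(h)
    return hs

def birnn(xs, fwd, bwd, h0):
    """Run one RNN left-to-right and another right-to-left, then
    concatenate the two states at each position."""
    hs_f = rnn(xs, *fwd, h0)                    # forward states
    hs_b = rnn(xs[::-1], *bwd, h0)[::-1]        # backward states, realigned
    return [np.concatenate([f, b]) for f, b in zip(hs_f, hs_b)]

D, H = 3, 4
rng = np.random.default_rng(1)
def make_params():
    return rng.normal(size=(H, D)), rng.normal(size=(H, H)), np.zeros(H)

xs = list(rng.normal(size=(9, D)))              # a toy 9-word sentence
states = birnn(xs, make_params(), make_params(), np.zeros(H))
print(len(states), states[0].shape)             # 9 positions, each of size 2H
```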

Sequence to sequence

Now we’re ready to move on to sequence to sequence models (also called seq2seq). The basic seq2seq model consists of two RNNs: an encoder network that processes the input and a decoder network that generates the output.

Sequence to sequence model

Finally, we can make our first machine translator!
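Here is a hedged sketch of the encoder/decoder split, with a toy recurrent step standing in for the stacked LSTMs a real system would use; the names, the tiny vocabulary, and the greedy decoding loop are all illustrative assumptions:

```python
import numpy as np

def encode(src_ids, embed, step, state):
    """Encoder: read the whole source sentence and compress it into a
    fixed-size state."""
    for tok in src_ids:
        state = step(embed[tok], state)
    return state

def decode(state, step, embed, W_out, bos, eos, max_len=20):
    """Decoder: starting from the encoder's state, emit one target token
    at a time, feeding each prediction back in as the next input."""
    out, tok = [], bos
    for _ in range(max_len):
        state = step(embed[tok], state)
        tok = int(np.argmax(W_out @ state))     # greedy pick of next token
        if tok == eos:
            break
        out.append(tok)
    return out

# A toy recurrent step; a real system would use stacked LSTMs.
D, H, V = 4, 5, 10                              # embed dim, state dim, vocab size
rng = np.random.default_rng(2)
W, U = rng.normal(size=(H, D)), rng.normal(size=(H, H))
step = lambda x, h: np.tanh(W @ x + U @ h)
embed = rng.normal(size=(V, D))                 # one vector per token id
W_out = rng.normal(size=(V, H))

state = encode([3, 1, 4], embed, step, np.zeros(H))   # a toy source sentence
print(decode(state, step, embed, W_out, bos=0, eos=1))
```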

However, there’s one catch. Google Translate currently supports 103 languages, so we would need 103 × 102 = 10,506 different models, one for each ordered pair of languages. Of course, the quality of these models would vary with the popularity of each language pair and the number of documents available for training. The best we can do is make one neural network that takes any language as input and translates into any other.

Google Translate

That very idea was realized by Google engineers at the end of 2016. The architecture of the network was built on the seq2seq model we have already studied.

The difference is that the encoder and the decoder each consist of 8 layers of LSTMs with residual connections between the layers, plus some tweaks for accuracy and speed. If you want to go deeper, take a look at the article Google’s Neural Machine Translation System.
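The residual connections themselves are a simple idea: each layer’s output is added to its input, which keeps gradients flowing and makes such a deep stack trainable. A toy sketch (our own simplification, not the GNMT code):

```python
import numpy as np

def residual_stack(seq, layers):
    """Each layer maps a sequence of vectors to a sequence of vectors;
    its output is added to its input (x + F(x))."""
    for layer in layers:
        seq = [h + o for h, o in zip(seq, layer(seq))]   # residual add
    return seq

H = 4
rng = np.random.default_rng(3)
def make_layer():
    W = rng.normal(size=(H, H)) * 0.1
    return lambda s: [np.tanh(W @ h) for h in s]

seq = [rng.normal(size=H) for _ in range(6)]                 # a toy 6-step sequence
out = residual_stack(seq, [make_layer() for _ in range(8)])  # 8 layers, as in GNMT
```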

The main thing about this approach is that now the Google Translate algorithm uses only one system instead of a huge set for every pair of languages.

The system requires a “token” at the beginning of the input sentence which specifies the language you’re trying to translate the phrase into.

This improves translation quality and enables translation even between pairs of languages the system has never seen together during training, a method termed “Zero-Shot Translation.”
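In practice this just means prepending an artificial token to the source text. A tiny illustrative helper (the <2es>-style token format follows the examples in Google’s paper; the function itself is our own):

```python
def add_target_token(source_sentence, target_lang):
    """Prepend an artificial token naming the target language, so a single
    multilingual model knows what to produce. The <2xx> format follows the
    examples in Google's paper; this helper is our own illustration."""
    return f"<2{target_lang}> {source_sentence}"

print(add_target_token("Hello, how are you?", "es"))   # <2es> Hello, how are you?
```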

What makes a better translation?

When we talk about improvements and better results from the Google Translate algorithm, how can we correctly evaluate whether one candidate translation is better than another?

It’s not a trivial problem. For some commonly used sentences we have sets of reference translations from professional translators, which, of course, differ from one another.

There are a lot of approaches that partly solve this problem, but the most popular and effective metric is BLEU (bilingual evaluation understudy). Imagine we have two candidates from machine translators:

Candidate 1: Statsbot makes it easy for companies to closely monitor data from various analytical platforms via natural language.

Candidate 2: Statsbot uses natural language to accurately analyze businesses’ metrics from different analytical platforms.

Although they have the same meaning, they differ in quality and structure.

Let’s look at two human translations:

Reference 1: Statsbot helps companies closely monitor their data from different analytical platforms via natural language.

Reference 2: Statsbot allows companies to carefully monitor data from various analytics platforms by using natural language.

Obviously, Candidate 1 is better, sharing more words and phrases with the references than Candidate 2 does. This is the key idea of the simple BLEU approach: we compare n-grams of the candidate with n-grams of the reference translations and count the number of matches, independent of their position. We use only n-gram precisions, because calculating recall is difficult with multiple references, and the final score is the geometric average of the n-gram scores.
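Here is a minimal Python sketch of this simple BLEU idea, using the candidates and references above: modified n-gram precision with counts clipped per reference, combined by a geometric mean. (Real BLEU also applies a brevity penalty and smoothing; we skip both.)

```python
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, references, n):
    """Share of candidate n-grams found in some reference, clipping each
    n-gram's count by its maximum count in any single reference."""
    cand = Counter(ngrams(candidate, n))
    max_ref = Counter()
    for ref in references:
        for gram, cnt in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], cnt)
    clipped = sum(min(cnt, max_ref[gram]) for gram, cnt in cand.items())
    return clipped / max(sum(cand.values()), 1)

def simple_bleu(candidate, references, max_n=2):
    """Geometric mean of the 1..max_n n-gram precisions."""
    ps = [modified_precision(candidate, references, n) for n in range(1, max_n + 1)]
    if min(ps) == 0:
        return 0.0
    prod = 1.0
    for p in ps:
        prod *= p
    return prod ** (1.0 / max_n)

refs = [
    "Statsbot helps companies closely monitor their data from different analytical platforms via natural language".split(),
    "Statsbot allows companies to carefully monitor data from various analytics platforms by using natural language".split(),
]
cand1 = "Statsbot makes it easy for companies to closely monitor data from various analytical platforms via natural language".split()
cand2 = "Statsbot uses natural language to accurately analyze businesses' metrics from different analytical platforms".split()

# We use max_n=2 here because such short sentences share almost no 4-grams.
print(round(simple_bleu(cand1, refs), 3))   # higher: Candidate 1 matches more
print(round(simple_bleu(cand2, refs), 3))
```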


Now you have a sense of what goes on inside the complex engine of machine learning translation. Next time you translate something with Google Translate, imagine how many millions of documents it analyzed before giving you the best language version.
