Launching into Computer Science- Unit 3



Unit 3

The task for this unit was to investigate and explore the potential of Google Translate.

Google Translate currently uses AI to perform translations, and as AI improves over time, it may become possible to generate translations with near-perfect accuracy. Originally launched in 2006 with the intent of allowing users to find more information, Google Translate performed translations by using human-made translations as a basis for deciding how to interpret unseen batches of text (Google, 2006). This method, known as Statistical Machine Translation (SMT), was the main system used for translating between languages. It had multiple shortcomings, however, and often could not produce clear, understandable translations. To improve translation accuracy, in 2016 Google replaced the statistical machine translation system with one built on artificial intelligence- specifically, an LSTM neural network (Google, 2016). Adopting a neural network meant that instead of having researchers attempt to model the relationships between languages by hand, a computer could discover those relationships without human intervention. As a result of this new system, translation quality has improved significantly; however, there is still room for improvement. This post will provide a critical analysis of the technology behind Google Translate, and will discuss ways of improving this technology.
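To make the contrast concrete, the statistical approach can be caricatured as lookup over human-made parallel data. The sketch below is purely illustrative and not Google's actual system: the phrase table and its entries are invented for this example, whereas a real SMT system learns millions of scored phrase pairs from aligned corpora.

```python
# Toy caricature of phrase-based statistical machine translation:
# translation as greedy substitution from a table of phrase pairs
# extracted from human-made translations.

phrase_table = {
    "neural networks": "neurale netwerke",
    "are currently": "is tans",
    "one of the best methods": "een van die beste metodes",
}

def translate(sentence: str) -> str:
    """Greedily replace known phrases; unseen text passes through unchanged."""
    out = sentence.lower()
    for src, tgt in phrase_table.items():
        out = out.replace(src, tgt)
    return out

print(translate("Neural networks are currently one of the best methods"))
# -> "neurale netwerke is tans een van die beste metodes"
print(translate("quantum computers"))
# -> "quantum computers" (no matching phrase, so nothing is translated)
```

The second call shows the core weakness that motivated the move to neural models: anything outside the memorized phrase pairs cannot be translated at all, and stitching phrases together ignores sentence-level context.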

Google Translate has received multiple benefits from the adoption of artificial intelligence, implemented in its proprietary model, known as Google Neural Machine Translation (GNMT). The most notable benefit of this adoption is that translation quality has improved by 60%; furthermore, the creators of this technology found that in certain cases, their machine-made translations were nearly identical to human-made translations (Wu et al., 2016). The creators also noted that human translators tend to vary in their understanding of sentences, leading to inconsistency between translations. This implies another benefit of GNMT- it produces more consistent translations, as it has a single understanding developed through training. These benefits demonstrate the potential for translation to become a fully automated, instantaneous, on-demand service, which could become even more useful in conjunction with other technologies. For example, real-time translation can be combined with speech recognition, text-to-speech and video calling, enabling workers to collaborate while using their native languages.

Although Google Translate’s technology has created many new benefits, there are challenges to achieving this outcome. Most of these challenges relate to the availability of resources in different languages (Google, 2020). Digitization of material is a significant obstacle- as someone who speaks a language with a smaller pool of resources available (Afrikaans), I have noticed that translation accuracy for Afrikaans varies drastically: certain sentences are translated well, while other translations fail to properly interpret idiomatic expressions, identify the correct gender, and so on. One example I tested changed the meaning of a sentence completely. I translated a sentence from English to Afrikaans; the Afrikaans output was “Neurale netwerke, by die oomblik, is een van die beste metode wat kan gebruik wees vir vertalings- dit kan vertaling doen op ‘n vlak wat amper dieselfde as mens is.”, which in English essentially means “Neural Networks are currently one of the best methods that can be used for translations- it can translate at a level similar to humans”. The Afrikaans version that came from Google Translate turned the final clause into a negative, meaning that if I were to translate it back to English, the result would be something like “… able to do translations at a level that is not similar to a human level”, instead of “able to do translations at a level similar to humans”.
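The experiment above is essentially a round-trip (back-translation) check: translate English to Afrikaans, translate the result back, and compare. The toy below reproduces the idea with two tiny stand-in dictionaries rather than a real translation model; the deliberately buggy English-to-Afrikaans entry is invented here to show how a negation can slip in and flip a sentence's meaning.

```python
# Round-trip consistency check with deliberately buggy stand-in tables.
# In practice each step would call a real translation model.

en_to_af = {"similar": "nie dieselfde nie"}   # buggy entry: introduces a negation
af_to_en = {"nie dieselfde nie": "not similar"}

def translate(text: str, table: dict) -> str:
    """Substitute every known phrase from the given direction's table."""
    for src, tgt in table.items():
        text = text.replace(src, tgt)
    return text

original = "a level similar to humans"
round_trip = translate(translate(original, en_to_af), af_to_en)

print(round_trip)              # -> "a level not similar to humans"
print(round_trip == original)  # -> False: the negation flipped the meaning
```

Because back-translation only needs the two directions of the same system, it is a cheap way for users of low-resource languages to spot exactly this kind of meaning reversal.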

Currently, neural networks are the strongest method available for performing translations at a level that is on par with humans, and consequently, I believe that research in this area should be maximized to improve machine translations. Many of the issues faced by Google Translate can be solved by accumulating training data. Google has reported that translations are less accurate for languages which do not have a wide variety of resources available for GNMT to train on (Google, 2020). Society is becoming increasingly digitized, which naturally gives Google a way to address this area. However, the digitization of existing literature is a major concern. For lower-resource languages, resources such as paperback novels and textbooks are crucial for machine translation to learn different styles of writing (e.g. formal or scientific). This is something that could be addressed with technologies such as OCR, but humans would still produce more accurate data. Investing in low-resource languages by helping communities digitize their language’s works could be a viable option. If this is not addressed, a lot of literature may be lost to time, and in addition, Google’s translations will never be as accurate as they could be.

References

Google. (2006) Statistical machine translation live. Available from: [Accessed 11 February 2021].

Google. (2016) A Neural Network for Machine Translation, at Production Scale. Available from: [Accessed 11 February 2021].

Google. (2020) Recent Advances in Google Translate. Available from: [Accessed 11 February 2021].

Wu, Y., Schuster, M., Chen, Z., Le, Q., Norouzi, M., Macherey, W., Krikun, M., Cao, Y., Gao, Q., Macherey, K., Klingner, J., Shah, A., Johnson, M., Liu, X., Kaiser, Ł., Gouws, S., Kato, Y., Kudo, T., Kazawa, H., Stevens, K., Kurian, G., Patil, N., Wang, W., Young, C., Smith, J., Riesa, J., Rudnick, A., Vinyals, O., Corrado, G., Hughes, M. and Dean, J. (2016) Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. Available from: [Accessed 11 February 2021].