Translating with MT systems
After a whole term learning about Human Language Technologies and machine (or machine-aided) translation, it is time to put all this into practice.
The following are some examples (more or less valid) of translations I produced with a computer.
- “Me gusta más el arte impresionista que el expresionista” => “M’agrada més l’art impressionista que l’expressionista” (Spanish-Catalan, with Automatic Translation Server).
- “Quisiera ser tan alta como la luna” => “Ich wurde mögen Essenz als alta als la luna” (Spanish-German, with Intertram).
- “I would like to go to the cinema with you” => “Я хотел бы пойти к кино с Вами” (English-Russian, with Promt).
- “Quiero irme a dormir en este mismo momento” => “Je veux aller endormir dans le même moment” (Spanish-French, with Promt).
- “I would like to go to your house” => “Ich möchte zu Ihrem Haus gehen” (English-German, with Free2Professional Translation).
After trying some of the possibilities of machine-aided translation, I would say that it can be of some help if you have no knowledge of the language into which you need to translate your sentences; in that case, a dictionary may not be enough.
But, as far as possible, I prefer to attempt my own translation. Though my sentences may be a bit wrong, at least the mistakes would be human ones.
Use it, but do not abuse it: that is my conclusion.
Hans Uszkoreit
Hans Uszkoreit is Professor of Computational Linguistics at the Department of Computational Linguistics and Phonetics of Saarland University in Saarbrücken, and one of the best-known scholars in Human Language Technologies.
According to the short CV on his website, he is also Scientific Director at the German Research Center for Artificial Intelligence (DFKI), where he leads the Language Technology Lab, and a professor in the Computer Science Department.
You can find out more about Uszkoreit’s extensive professional work on his official website: publications, projects, invited talks, research interests, etc.
Terminology on HLT
- According to Wikipedia, machine translation refers to “a sub-field of computational linguistics that investigates the use of computer software to translate text or speech from one natural language to another.”
- Machine-aided translation: what does this similar term mean? Also known as “computer-assisted translation”, it is “a form of translation wherein a human translator translates texts using computer software designed to support and facilitate the translation process.” (Wikipedia.)
In other words, the basic difference between these terms is that “machine translation” refers to a machine or a program set to translate on its own, while “machine-aided translation” refers to a human translator who uses software as an aid.
- Content Management Systems provide “mechanisms for storage and retrieval of content data, but it may also give support for indexing of documents, distributed document editing, version management, and generation of different views and guided tours.”
It is easy to see that content nowadays needs to be managed in several languages, and that is the purpose of Multilingual Content Management.
In general, the term “translation technology” covers the other terms we have already seen, since work in this field also aims “to promote data exchange standards which allow translation tools to share data.”
International Meetings on HLT
Now that we have learnt what Human Language Technologies are and read what some professionals say on the subject, it is worth talking about meetings and conferences on HLT:
- 45th Annual Meeting of the Association for Computational Linguistics, held in Prague (June 23rd-30th.)
- Human Language Technology Conference 2007, in Rochester (NY.)
- XXIII Congreso de la SEPLN, in Spain (10th, 11th and 12th September 2007.)
These three are among the best and most remarkable examples of meetings on our subject. Held in very different locations (Prague, the USA and Spain), these conferences are perfect occasions to learn more about Human Language Technologies from scholars.
Each conference’s website offers information about accommodation, organization, other events, and so on. Apart from attending the meeting, it would be a very interesting opportunity to visit the city where it is held!
Research Centers on HLT
There are several teams and groups of professionals seeking to develop these Human Language Technologies.
For example, the DFKI (the German Research Center for Artificial Intelligence) has a research lab, led by Hans Uszkoreit, which works in this area of knowledge.
These are some of the team’s activities, in summary, which help us to understand the meaning and importance of HLT for us as language students:
- “We develop and improve core language technologies for information extraction, mono- and cross-lingual information access, morphological processing (…) and other advanced software functionalities.”
- “We design and implement processing components (…) for the maintenance of terminologies and dictionaries.”
- “We create resources for R&D such as lexicons, grammars, test suites, and discourse models.”
- “We provide consulting on the potentials and ramifications of language technology.”
Other organizations whose websites are worth visiting include the European Association for Machine Translation (EAMT) and the Human Language Technology Research Institute (HLTRI).
What is Human Language Technologies?
The topic of our subject is a current issue. I am going to develop it through articles on several conferences, organizations, etc., but first of all it is necessary to clarify the aim and the nature of HLT.
According to the expert Hans Uszkoreit, in What is Human Language Technologies?, HLT “comprises computational methods, computer programs and electronic devices that are specialized for analyzing, producing or modifying texts and speech.”
In less technical words, HLT is basically concerned with developing methods for the digital analysis and translation of texts.
There are other opinions on the subject that are worth reading. As an example, read Ron Cole’s article on HLT.
Digital Humanities and the Portal Andrés de Poza
How far can we imagine technology advancing?
For many of us it is already clear that new technologies, such as the Internet, are here to stay, becoming for human beings a third hand that extends what the other two can do. What is more, if this new and skilful limb were amputated, our daily life would be so affected that we would unmistakably feel the situation as a return to the age of the caves.
These are technologies, then, that have in a sense been humanized, since they now inevitably occupy an important place in our worlds.
However, I have always thought that in our field, that of philologists, it may be harder to find an application for new technologies, and even that the two might appear to be incompatible practices. But there are already people who have been working on this for some time.
On 17 November, we first-year English Philology students had the opportunity to attend a lecture at our own university whose title announced precisely this topic: Digital Humanities.
At the talk, given by Professor Carmen Isasi, the Portal Andrés de Poza was presented: a project born at the University of Deusto for the “creation of a website specializing in editions of texts with multiple versions, generated both in processes of translation of European literary texts and in processes of transmission of documents, especially in the Basque-Romance sphere”.
Over the course of the lecture, it became clear how relevant this novel kind of tool could become for new literary studies, since it offers the possibility, among others, of comparing on a single computer screen distinct and very different texts, in different languages and from different geographical areas.
The Portal Andrés de Poza, like any other initiative of this kind, is beginning to emerge as one of the best weapons that disciplines devoted to literary study, such as Comparative Literature, will wield in the 21st century.
I am sure that we, as future philologists, will follow very closely the advances that new technologies, in continuous and relentless development, can offer us in the field that concerns us. We are witnesses that Literature and Technology are joining today in a symbiosis from which both sides will be able to benefit.
The result remains to be seen.
Metadata and the Semantic Web (Issue 3)
We have already talked about the so-called Web 2.0, the interactive web, but there is another important term we should know something about: the Semantic Web, which can be defined as the intelligent web.
The Semantic Web is “an evolution of the Tim Berners-Lee’s World Wide Web in which information is machine processable (rather than being only human oriented), thus permitting browsers or other software agents to find, share and combine information more easily”. This is the point that really interests us: this intelligent web works with metadata and, what is more, processes it.
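To make “find, share and combine information” a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the statements are stored as plain (subject, predicate, object) tuples rather than in any real Semantic Web format, and the predicate names `held_in` and `located_in` and the two “sites” are invented for this example.

```python
# Statements as (subject, predicate, object) triples.
# The predicate names and the two "sites" are hypothetical, made up for this sketch.
statements_from_site_a = [
    ("45th ACL Annual Meeting", "held_in", "Prague"),
    ("HLT Conference 2007", "held_in", "Rochester"),
]
statements_from_site_b = [
    ("Prague", "located_in", "Czech Republic"),
    ("Rochester", "located_in", "United States"),
]


def objects_of(triples, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def countries_of(event, triples):
    """Combine two kinds of statement: event -> city, then city -> country."""
    countries = []
    for city in objects_of(triples, event, "held_in"):
        countries.extend(objects_of(triples, city, "located_in"))
    return countries


# Neither site alone says which country the meeting is in; together they do.
combined = statements_from_site_a + statements_from_site_b
print(countries_of("45th ACL Annual Meeting", combined))
```

Because each statement is labelled with what it means, a program can join information published in different places without a human reading either page.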
The use of semantic technologies is gradually increasing, and it is also a much-discussed topic. This year’s “pre-eminent meeting place for the growing community of developers, entrepreneurs, technology architects and researchers who are building software and systems based on semantic technologies” is going to be held in California. More information is available on the website of the 2007 Semantic Technology Conference.
Upcoming events on Metadata (Issue 3)
For those who want to learn more about metadata, its characteristics and applications, there are annual conferences on the topic.
You will not have to wait long to attend the DAMA International Symposium and WILSHIRE Meta-Data Conference, which is going to take place in March in Boston (Massachusetts, USA).
But the International Conference on Dublin Core and Metadata Applications, in August in Singapore, looks more attractive to me, if you don’t mind travelling a bit further! Its website provides other interesting information about the exotic Asian city, such as accommodation and tourist and cultural attractions. It seems to be a meeting worth attending!
Group E – Metadata & Metacontents
A piece of information given out of context, for example the number 16086364, is meaningless. Something more must be known about that data in order to understand that it represents one individual’s ID card number, and that extra information is what we call metadata. This concept of metadata is relevant in a variety of fields of computer science.
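The ID card example can be sketched in a few lines of Python. This is only an illustration: the class name `DescribedValue` and the metadata keys `type` and `refers_to` are hypothetical, chosen for this sketch and not taken from any standard.

```python
from dataclasses import dataclass, field


@dataclass
class DescribedValue:
    """A raw value together with the metadata that gives it meaning."""
    value: str
    metadata: dict = field(default_factory=dict)

    def describe(self) -> str:
        # Without metadata the value cannot be interpreted.
        if not self.metadata:
            return f"{self.value} (meaning unknown)"
        details = ", ".join(f"{key}: {val}" for key, val in self.metadata.items())
        return f"{self.value} ({details})"


# The same number, first on its own and then with (hypothetical) metadata attached.
bare = DescribedValue("16086364")
id_card = DescribedValue(
    "16086364",
    {"type": "ID card number", "refers_to": "one individual"},
)

print(bare.describe())
print(id_card.describe())
```

The data is identical in both cases; only the metadata tells a reader, or a program, what the number stands for.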
Some definitions are needed to understand accurately what all these terms mean:
- Data: in a very broad sense, it refers to “numbers, characters, images or other outputs from devices to convert physical quantities into symbols processed by a human or input into a computer or transmitted to another human or computer”. Data processing occurs in stages, from raw data to processed data (Wikipedia, visited: 01/12/2007).
- Metadata: the most common definition of the term is “data about data”, which can be expanded as “structured, encoded data that describe characteristics of information-bearing entities to aid in the identification, discovery, assessment, and management of the described entities” (Wikipedia, visited: 01/12/2007).
- Content: in computing, it means “the ‘stuff’ that makes up a website”, such as words, pictures, images or sounds. In other words, “the ‘information’ a website provides” (Web-Designz.com, visited: 01/12/2007).
- Metacontent: the information relating to a document’s content, such as its title, author, size, date, change history, keywords, etc. Metacontent can be used for searching and filtering information, and for administering documents (Joaquín Bravo Montero, visited: 01/12/2007).
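As a rough illustration of how metacontent can be used for searching and filtering, the sketch below keeps a few records that describe documents (title, author, size, date, keywords) and selects among them without ever opening the documents themselves. All the records, field names and the function `filter_by_metacontent` are invented for this example.

```python
from datetime import date

# Hypothetical metacontent records: information about documents, not the documents themselves.
documents = [
    {"title": "What is HLT?", "author": "Group E", "size_kb": 12,
     "date": date(2007, 1, 10), "keywords": ["hlt", "definition"]},
    {"title": "Metadata notes", "author": "Group E", "size_kb": 30,
     "date": date(2007, 1, 12), "keywords": ["metadata", "semantic web"]},
    {"title": "MT experiments", "author": "Another group", "size_kb": 8,
     "date": date(2006, 12, 20), "keywords": ["machine translation"]},
]


def filter_by_metacontent(records, author=None, keyword=None):
    """Return the titles of records matching the given author and/or keyword."""
    matches = []
    for record in records:
        if author is not None and record["author"] != author:
            continue
        if keyword is not None and keyword not in record["keywords"]:
            continue
        matches.append(record["title"])
    return matches


print(filter_by_metacontent(documents, author="Group E"))
print(filter_by_metacontent(documents, keyword="metadata"))
```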
Metalanguage, in computing terms, refers to the markup languages developed so that computers can process data and information. Three examples of these are:
- HTML (http://www.w3.org/MarkUp/).
- XML (http://www.w3.org/XML/).
- SGML (http://www.w3.org/MarkUp/SGML/).
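To show what such a language adds, here is a small sketch using Python’s standard `xml.etree.ElementTree` module. The XML fragment and its tag names (`person`, `name`, `id_card`) are made up for illustration; the point is only that the tags label each value, so a program can find it by meaning.

```python
import xml.etree.ElementTree as ET

# A made-up XML fragment: the tag names describe what each value means.
xml_text = """
<person>
  <name>Example Person</name>
  <id_card>16086364</id_card>
</person>
"""

root = ET.fromstring(xml_text.strip())

# Because the value is labelled, a program can look it up by its tag.
id_number = root.find("id_card").text
labels = [child.tag for child in root]

print(id_number)
print(labels)
```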
The new Internet, the Web 2.0, offers us endless possibilities, but there is something missing. In the words of the W3C, it is “a part of the Web which contains information about information – labeling, cataloging and descriptive information structured in such a way that allows Web pages to be properly searched and processed in particular by computers. In other words, what is now very much needed on the Web is metadata”.
Itziar Pascuas, Leire Rodríguez, Marta Rodríguez, Daniel Vergara, Janire Zalbidea and Paula Zumalacárregui (English Studies).