Starting Time: 2003
Status: Concluded in 2007
To automatically induce linguistic knowledge useful for machine translation (transfer rules and bilingual dictionaries) from PoS-tagged and lexically aligned parallel corpora. A further goal of the project is to develop a simple machine translation system that translates source sentences into target sentences using the induced resources.
Project's Features
Machine translation experiments were carried out on three languages, Brazilian Portuguese (pt), Spanish (es) and English (en), combined in two language pairs (pt-es and pt-en), and showed reasonable results.
Computational resources:
Induction Systems available at SourceForge
Linguistic resources:
Helena de Medeiros Caseli (PhD Student, 2003-2007)
Maria das Graças Volpe Nunes (supervisor, 2003-2007)
Mikel L. Forcada (foreign supervisor, 2004-2005)
Financial Support
FAPESP: 2004-2007
CAPES (Sandwich): 2004-2005
Helena de Medeiros Caseli
Caseli, H.M.; Nunes, M.G.V.; Forcada, M.L. Automatic induction of bilingual resources from aligned parallel corpora: application to shallow-transfer machine translation. Machine Translation. v. 1, p. 227-245, 2008.
Caseli, H.M.; Nunes, M.G.V.; Forcada, M.L. From free shallow monolingual resources to machine translation systems: easing the task. In Proceedings of the Workshop on Mixing Approaches to Machine Translation (MATMT08). San Sebastian, Spain: University of the Basque Country, 2008. v. 1. p. 41-48.
Caseli, H.M.; Nunes, M.G.V. Automatic induction of bilingual lexicons for machine translation. International Journal of Translation. v. 19, p. 29-43, 2007.
Caseli, H.M.; Nunes, M.G.V. Automatic induction of translation lexicons from aligned parallel corpus. In Anais do XXVII Congresso da Sociedade Brasileira de Computação - V Workshop em Tecnologia da Informação e da Linguagem Humana (TIL). p. 1669-1678. Rio de Janeiro - RJ, Brazil, 2007. PDF
Caseli, H.M. Indução de léxicos bilíngües e regras para a tradução automática. Tese de Doutorado. ICMC-USP, Abril, 2007. 158 p. PDF (versão defendida) PDF (versão revisada)
Caseli, H.M.; Nunes, M.G.V. Automatic transfer rule induction from parallel corpora. In Proceedings of the International Joint Conference IBERAMIA/SBIA/SBRN 2006 - 3rd Workshop on MSc dissertations and PhD thesis in Artificial Intelligence (WTDIA'2006). Ribeirão Preto, Brazil, October 23-28, 2006. PDF
Caseli, H.M.; Nunes, M.G.V. Anali: uma ferramenta de análise morfossintática. Série de Relatórios Técnicos do ICMC, 285 (NILC-TR-06-09), Outubro 2006. 44 p. ZIP
Caseli, H.M.; Nunes, M.G.V.; Forcada, M.L. Evaluating the LIHLA lexical aligner on Spanish, Brazilian Portuguese and Basque parallel texts. Procesamiento del Lenguaje Natural, v. 35, Granada, Spain, pp. 237-244, 2005. ISSN 1135-5948. Also in Cadernos de Computação, v. 6, n. 2, ICMC-USP, pp. 149-163, October 2005. PDF
Caseli, H.M.; Nunes, M.G.V.; Forcada, M.L. O Alinhador Lexical LIHLA: Experimentos com o Português do Brasil. In Caderno de resumos do V Encontro de Corpora, pp. 21-22. São Carlos -- SP, Brasil. 24 e 25 de novembro de 2005.
Caseli, H.M.; Nunes, M.G.V.; Forcada, M.L. LIHLA: A lexical aligner based on language-independent heuristics. In Proceedings of the V Encontro Nacional de Inteligência Artificial (ENIA), pp. 641-650. São Leopoldo -- RS, Brazil. July 25-19, 2005. PDF
Caseli, H.M.; Nunes, M.G.V.; Forcada, M.L. LIHLA: Shared task system description. In Proceedings of the ACL Workshop on Building and Using Parallel Texts: Data-Driven Machine Translation and Beyond, pp. 111-114. Ann Arbor, Michigan, USA. June 29-30, 2005. PDF
Caseli, H.M.; Scalco, M.A.G.; Nunes, M.G.V. Manual para marcação de alinhamentos lexicais. Série de Relatórios Técnicos do ICMC, 256 (NILC-TR-05-09), Abril 2005. 21 p. ZIP, ZIP (English version)
Caseli, H.M. Regras de tradução automática induzidas de textos paralelos envolvendo o português do Brasil. Monografia de Qualificação. ICMC-USP, Agosto, 2004. 67 p. PDF
Related Publications
