3rd Workshop on Example-Based Machine Translation

**UPDATE** Submission deadline extended until September 30th

Call for Papers
"Going open-source to revive EBMT"

Hosted by the Centre for Next Generation Localisation (CNGL)
Dublin City University, Dublin, Ireland
November 12–13, 2009


  • Mikel L. Forcada (DLSI, Universitat d'Alacant and CNGL, Dublin City University)
  • Andy Way (CNGL, Dublin City University)

Following two successful EBMT workshops, in 2001 at MT Summit VIII in Santiago de Compostela, Spain, and in 2005 at MT Summit X in Phuket, Thailand, this is the third workshop of its kind. Much has happened since 2005. The last few years have witnessed a decline in example-based machine translation (EBMT) research, and statistical machine translation (SMT) has almost completely taken over the corpus-based machine translation arena, with many EBMT practitioners moving to hybrid approaches that integrate EBMT with other methods, mostly (but not only) SMT. The lack of a clear definition of EBMT has also contributed to its low visibility. In fact, research that would once have been considered EBMT has been published without the EBMT label.

Is the success of SMT due to the fact that it is the best way to do corpus-based machine translation, or is it because many SMT software packages are readily available to researchers under free/open-source licences that allow use as well as collaborative improvement? Shouldn't EBMT practitioners start to think about putting together their tools, their engines and their data and releasing them under open licences to extend their use both in academia and in industry?

The pressure on machine translation researchers to prove their results through detailed empirical evaluation is growing. But the validity of empirical results hinges on reproducibility. Turning our experimental research into packages and tools that other researchers can use and improve is a challenge, but it is not infeasible, as SMT practitioners have shown.

Papers addressing this or other aspects of a prospective strengthening of EBMT research and its real-world applications would be very welcome. As in previous editions, we expect papers on:

  • descriptions of 'pure' EBMT systems
    • knowledge resources used
    • representation of numeric and symbolic knowledge
  • descriptions of 'hybrid' systems
    • integration of EBMT with rule-based methods
    • integration of EBMT with statistical methods
  • (semi-)automatic preparation of existing mono/bi/multilingual corpora for EBMT
    • extraction of bi/multilingual texts from the web
    • preparation of treebanks for EBMT
    • bi/multilingual alignment/bracketing/parsing
    • inference of mono/bi/multilingual grammars
    • inference of bi/multilingual transfer rules
  • evaluation of EBMT results and/or comparison with other MT systems
  • considerations on domain-(in)dependence of EBMT systems
  • computational and/or system complexity of EBMT systems

In addition, we plan to have a panel session focusing on the main theme of the workshop, that is, open-sourcing of existing EBMT software to strengthen EBMT research and usage.

Important Dates
  • Paper Submission: 30 September 2009 (extended from 18 September 2009)
  • Notification of acceptance: 30 September 2009
  • Camera Ready Papers due: 22 October 2009
  • Workshop takes place: 12–13 November 2009