Balto-Slavonic Natural Language Processing 2007 (BSNLP 2007)
with Special Theme: Information Extraction and Enabling Technologies

June 29, 2007
Prague, Czech Republic

BSNLP will be held in conjunction with the ACL 2007 conference and is co-organised by the European Commission's Joint Research Centre.

TOPIC AND MOTIVATION:

The recent political and economic changes in Central and Eastern Europe and the related ongoing enlargement of the European Union bring new cultures and languages into focus. Among them, the languages of the Balto-Slavonic group have an outstanding role because of their rich cultural heritage and their widespread use, with over 400 million speakers.

The topic of the workshop is Natural Language Processing (NLP) for the Balto-Slavonic languages, with a focus on Information Extraction (IE) and enabling technologies for this language family. The task of IE is to identify a predefined set of concepts in natural language text. The spectrum of IE tasks ranges from named-entity recognition, through relation extraction and co-reference resolution, to the identification of complex events and cross-document entity profile extraction.

Although a considerable amount of IE-related work exists, most studies concentrate on a few major languages. Research on this topic, as well as on general-purpose NLP tools for Balto-Slavonic languages, is still at an early stage and is progressing relatively slowly. Because of specific phenomena such as the highly inflectional character and relatively free word order of these languages, the construction of IE systems and other language processing tools (question answering, text summarization, machine translation) for them is an intriguing and challenging task.

This workshop can be seen as the follow-up to the successful workshop on Information Extraction for Slavonic and Other Central and Eastern European Languages, held in conjunction with the RANLP 2003 conference.
It is also related to the EACL 2003 workshop on Morphological Processing of Slavic Languages. In particular, we strongly encourage submissions describing systems, resources or solutions that are made available to the wider public, as these would help to promote computational linguistics applications for these languages.

AREAS OF INTEREST include, but are not limited to:

A. Specific challenges for Balto-Slavonic NLP, in particular in the context of IE and enabling technologies
- text segmentation
- morphological analysis
- morphology models
- morpho-syntactic disambiguation
- named-entity recognition
- named-entity disambiguation (e.g., geo-referencing)
- named-entity lemmatisation
- term and keyword extraction
- name variant recognition and merging
- syntactic parsing and chunking
- co-reference resolution
- word sense disambiguation
- corpus-based knowledge acquisition

B. Multilingual IE frameworks and techniques applied to these languages
- tools and resources (those freely available for research purposes will be preferred)
- experience with, and evaluation of, linguistic data and processing resources
- comparative evaluation between languages

C. IE solutions for these languages:
- scenario template filling / event extraction
- relation extraction
- automatic pattern learning
- corpus studies and statistical techniques for IE
- IE from Web sources
- IE-based ontology population
- IE evaluation
- IE techniques for Question Answering and Answer Extraction
- utilisation of IE-based techniques in other NLP applications

INTENDED AUDIENCE

The goal of this workshop is to bring together researchers and practitioners working on NLP for Balto-Slavonic languages, in particular on IE and core technologies supporting IE for these languages. The workshop will provide an opportunity to exchange ideas and experience, to discuss hard-to-tackle problems in this field of research, and to make available resources more widely known.

SUBMISSION

Papers should describe original work and should indicate the state of completion of the reported results. In particular, any overlap with previously published work should be clearly mentioned. Submissions will be judged on correctness, novelty, technical strength, clarity of presentation, usability, and significance/relevance to the workshop.

Submissions should follow the two-column format of the ACL 2007 main-conference proceedings and should not exceed eight (8) pages, including references. We recommend using either the LaTeX style file or the Microsoft Word style file.

The reviewing will be blind. Therefore, the paper should not include the authors' names and affiliations. Furthermore, self-citations and other references that could reveal the authors' identity should be avoided.

Submission will be electronic. The only accepted format for submitted papers is Adobe PDF. Papers must be submitted no later than April 9, 2007 (extended from April 1; the call is now closed) using the submission webpage. Submissions will be reviewed by three members of the Program Committee. Authors of accepted papers will receive guidelines on how to produce camera-ready versions of their papers for inclusion in the ACL workshop proceedings.

IMPORTANT DATES

- Workshop Paper Submission deadline: April 9 (extended from April 1). The call is now closed.
- Notification of Acceptance: April 30 (originally April 25)
- Camera-ready Version: May 9
- Workshop: June 29, 2007

LOCATION

Prague, the capital of the Czech Republic, is located in the centre of Europe. It is one of the most valuable historical city reserves in Europe. The historical core of the city is listed in the UNESCO World Cultural and Natural Heritage Register. The workshop itself will take place in the TOP HOTEL Praha, located in the quiet neighbourhood of the Prague 4 district, only 15-20 minutes from the historic centre of Prague.
Prague is easily reachable by car, bus or train from Central Europe (only a 3-hour drive from Vienna or Budapest, or 4 hours from Berlin or Munich), by cheap flights from the rest of Europe, and by several direct flights from overseas.

FURTHER INFORMATION

For further information, please write to any staff member or check the workshop web page.

PROGRAM COMMITTEE

- Tania Avgustinova (University of Saarland / DFKI, Germany)
- Kalina Bontcheva (University of Sheffield, UK)
- Tomaz Erjavec (Jozef Stefan Institute, Slovenia)
- Vaclav Kubon (Charles University Prague, Czech Republic)
- Anna Kupsc (University Paris III, France)
- Ruta Marcinkeviciene (Vytautas Magnus University, Kaunas, Lithuania)
- Agnieszka Mykowiecka (Polish Academy of Sciences, Poland)
- Jakub Piskorski (Joint Research Centre, Italy)
- Bruno Pouliquen (Joint Research Centre, Italy)
- Hristo Tanev (Joint Research Centre, Italy)
- Marko Tadic (University of Zagreb, Croatia)
- Agata Savary (University of Tours, France)
- Kiril Simov (Bulgarian Academy of Sciences, Bulgaria)
- Wojciech Skut (Google Inc., USA)
- Ralf Steinberger (Joint Research Centre, Italy)
- Dusko Vitas (University of Beograd, Serbia)
- Roman Yangarber (University of Helsinki, Finland)

PROGRAM COMMITTEE CHAIRS

- Jakub Piskorski (Joint Research Centre, Italy)
- Hristo Tanev (Joint Research Centre, Italy)

ORGANIZING COMMITTEE

- Jakub Piskorski (Joint Research Centre, Italy)
- Bruno Pouliquen (Joint Research Centre, Italy)
- Hristo Tanev (Joint Research Centre, Italy)
- Ralf Steinberger (Joint Research Centre, Italy)

