PULS Project Publications

    Edited Collections

  1. The 6th Workshop on Balto-Slavic Natural Language Processing   
    Tomaž Erjavec, Jakub Piskorski, Lidia Pivovarova, Jan Šnajder, Josef Steinberger, Roman Yangarber (eds.)
    Proceedings of the Workshop, EACL-2017
    (2017) Valencia, Spain

  2. The 5th Workshop on Balto-Slavic Natural Language Processing   
    Jakub Piskorski, Lidia Pivovarova, Jan Šnajder, Hristo Tanev, Roman Yangarber (eds.)
    Proceedings of the Workshop, RANLP-2015
    (2015) Hissar, Bulgaria

  3. The 4th Biennial International Workshop on Balto-Slavic Natural Language Processing   
    Jakub Piskorski, Lidia Pivovarova, Hristo Tanev, Roman Yangarber (eds.)
    Proceedings of the Workshop, ACL-2013
    (2013) Sofia, Bulgaria

  4. Multi-source, Multilingual Information Extraction and Summarization
    Thierry Poibeau, Horacio Saggion, Jakub Piskorski, Roman Yangarber (eds.)
    Theory and Applications of Natural Language Processing. Springer-Verlag (2012)
    Berlin, Heidelberg

    Conference Papers, Journal Articles, Book Chapters

  5. Grouping business news stories based on salience of named entities    (pdf)
    Llorenç Escoter, Lidia Pivovarova, Mian Du, Anisia Katinskaia, Roman Yangarber.
    Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL) (2017) Valencia, Spain

  6. HCS at SemEval-2017 Task 5: Sentiment detection in business news using convolutional neural networks    (pdf)
    Lidia Pivovarova, Llorenç Escoter, Arto Klami, Roman Yangarber.
    Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017) (2017) Vancouver, Canada

  7. The First Cross-Lingual Challenge on Recognition, Normalization and Matching of Named Entities in Slavic Languages   (pdf)
    Jakub Piskorski, Lidia Pivovarova, Jan Šnajder, Josef Steinberger, Roman Yangarber.
    Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing
    EACL (2017) Valencia, Spain

  8. Toward Never Ending Language Learning for Morphologically Rich Languages   (pdf)
    Kseniya Buraya, Lidia Pivovarova, Sergey Budkov, Andrey Filchenkov.
    Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing
    EACL (2017) Valencia, Spain

  9. PULS: natural language processing for business intelligence   (pdf)
    Mian Du, Lidia Pivovarova, Roman Yangarber.
    Proceedings of the 2016 Workshop on Human Language Technology
    (2016) New York, USA

  10. Tracking interactions across business news, social media, and stock fluctuations
    Ossi Karkulahti, Lidia Pivovarova, Mian Du, Jussi Kangasharju, Roman Yangarber.
    Proceedings of European Conference on Information Retrieval (ECIR). Springer International Publishing (2016) Padua, Italy

  11. Acquisition of domain-specific patterns for single document summarization and information extraction.   (pdf)
    Mian Du, Roman Yangarber.
    Proceedings of the Second International Conference on Artificial Intelligence and Pattern Recognition (2015) Shenzhen, China

  12. Large-scale Multi-Label Text Classification for an Online News Monitoring System   (pdf)
    Master's Thesis: Matthew Pierce   (2015) University of Helsinki, Department of Computer Science

  13. Improving Supervised Classification Using Information Extraction   (pdf)
    Mian Du, Matthew Pierce, Lidia Pivovarova, Roman Yangarber
    NLDB 2015
    Springer Verlag, Lecture Notes in Computer Science, LNCS Volume 9103 (2015) Passau, Germany

  14. Supervised Classification Using Balanced Training   (pdf)
    Mian Du, Matthew Pierce, Lidia Pivovarova, Roman Yangarber
    SLSP 2014: International Conference on Statistical Language and Speech Processing
    Springer Verlag, Lecture Notes in Artificial Intelligence (LNAI), LNCS Volume 7978 (2014) Grenoble, France

  15. MDL-based Models for Transliteration Generation   (pdf)
    Javad Nouri, Lidia Pivovarova, Roman Yangarber.
    SLSP 2013: International Conference on Statistical Language and Speech Processing
    Springer Verlag, Lecture Notes in Artificial Intelligence (LNAI), LNCS Volume 7978 (2013) Tarragona, Spain

  16. Combined analysis of news and Twitter messages    (pdf)
    Mian Du, Ossi Mikael Karkulahti, Jussi Kangasharju, Lidia Pivovarova, Roman Yangarber.
    RANLP-2013 workshop on Semantic Web and Information Extraction
    (2013) Hissar, Bulgaria

  17. Adapting the PULS event extraction framework to analyze Russian text    (pdf)
    Lidia Pivovarova, Mian Du, Roman Yangarber.
    At ACL: 4th Biennial Workshop on Balto-Slavic Natural Language Processing
    (2013) Sofia, Bulgaria

  18. Automatic detection of stable grammatical features in N-grams   (pdf)
    Mikhail Kopotev, Lidia Pivovarova, Natalia Kochetkova, Roman Yangarber.
    The 9th Workshop on Multiword Expressions (MWE 2013), co-located with NAACL/HLT: Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2013) Atlanta, GA

  19. Event representation across genre   (pdf)
    Lidia Pivovarova, Silja Huttunen, Roman Yangarber.
    Workshop on EVENTS: Definition, Detection, Coreference, and Representation, co-located with NAACL/HLT: Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2013) Atlanta, GA

  20. An Overview of Internet biosurveillance   (pdf)
    DM Hartley, NP Nelson, RR Arthur, P Barboza, N Collier, N Lightfoot, JP Linge, E van der Goot, A Mawudeku, LC Madoff, L Vaillant, R Walters, R Yangarber, J Mantero, CD Corley, JS Brownstein.
    (2013) Journal of Clinical Microbiology and Infection, 19(6), Wiley

  21. Evaluation of epidemic intelligence systems integrated in the early alerting and reporting project for the detection of A/H5N1 influenza events   (pdf)
    Barboza P, Vaillant L, Mawudeku A, Nelson NP, Hartley DM, Madoff LC, Linge JP, Collier N, Brownstein JS, Yangarber R, Astagneau P.
    (2013) In PLoS One Journal, 8(3)

  22. Improving performance quality and user experience in the PULS News Mining system   (pdf)
    Master's Thesis: Mian Du   (abstract)
    (2012) University of Helsinki, Department of Computer Science

  23. Techniques for Multilingual Security-related Event Extraction from Online News   (abstract)
    Martin Atkinson, Mian Du, Jakub Piskorski, Hristo Tanev, Roman Yangarber, Vanni Zavarella.
    In Computational Linguistics—Applications (A. Przepiórkowski, M. Piasecki, K. Jassem, P. Fuglewicz, eds.) Studies in Computational Intelligence, Vol. 458
    (2012) Springer Verlag

  24. Information Extraction: Past, Present and Future   (pdf)
    Jakub Piskorski, Roman Yangarber.
    Survey Chapter in "Multi-source, Multilingual Information Extraction and Summarization", Theory and Applications of Natural Language Processing (T. Poibeau et al., eds.).
    Springer-Verlag (2012) Berlin, Heidelberg

  25. Predicting Relevance of Event Extraction for the End User   (abstract, pdf)
    Silja Huttunen, Arto Vihavainen, Mian Du, Roman Yangarber.
    In "Multi-source, Multilingual Information Extraction and Summarization", Theory and Applications of Natural Language Processing (T. Poibeau et al., eds.).
    Springer-Verlag (2012) Berlin, Heidelberg

  26. Tietojenkäsittelytiede: Tiedoneristäminen ("Information Extraction", in Finnish)
    Silja Huttunen.
    Invited chapter in "Genreanalyysi—tekstilajitutkimuksen käsikirja"
    ("The Handbook of Genre Analysis and Text-Type Research")
    (Heikkinen, V., Voutilainen, E., Lauerma, P., Tiililä, U. & Lounela, M., eds.).
    Gaudeamus Helsinki University Press (2012) Helsinki
    (Kotimaisten kielten keskuksen julkaisuja 169).

  27. Building support tools for Russian-language information extraction   
    Mian Du, Peter von Etter, Mikhail Kopotev, Mikhail Novikov, Natalia Tarbeeva, Roman Yangarber.
    BSNLP-2011: Balto-Slavonic Natural Language Processing (2011) Plzeň, Czech Republic.
    Springer-Verlag, Lecture Notes in Computer Science, Volume 6836. Series: Text, Speech and Dialogue.

  28. User-Oriented Information Extraction   (pdf)   (HTML)
    Master's Thesis: Peter von Etter
    University of Helsinki, Department of Computer Science (2011)

  29. Event Relevance in Information Extraction   (pdf)
    Master's Thesis: Arto Vihavainen
    University of Helsinki, Department of Computer Science (2011)

  30. Multilingual real-time event extraction for border security intelligence gathering   
    Martin Atkinson, Jakub Piskorski, Erik van der Goot, Roman Yangarber.
    Counterterrorism and Open Source Intelligence. Springer Lecture Notes in Social Networks, Vol. 2. (Uffe Kock Wiil, editor).
    (2011) pp. 355-390

  31. Relevance prediction in information extraction using discourse and lexical features
    Silja Huttunen, Arto Vihavainen, Peter von Etter, Roman Yangarber.
    Nodalida-2011: Nordic Conference on Computational Linguistics
    (2011) Riga, Latvia

  32. Assessment of utility in Web mining for the domain of Public Health    (pdf)
    Peter von Etter, Silja Huttunen, Arto Vihavainen, Matti Vuorinen, Roman Yangarber.
    In Proceedings of LOUHI-2010: the Second Louhi Workshop on Text and Data Mining of Health Documents, at the NAACL/HLT Conference,
    (2010) Los Angeles, California

  33. MedISys—Medical Information System   
    Jens P. Linge, Ralf Steinberger, Flavio Fuart, Stefano Bucci, Jenya Belyaeva, Monica Gemo, Delilah Al-Khudhairy, Roman Yangarber, Erik van der Goot.
    In Advanced ICTs for Disaster Management and Threat Detection: Collaborative and Distributed Frameworks. Eleana Asimakopoulou, Nik Bessis (eds.),
    (2010) IGI Global Press

  34. Real-time Text Mining in Multilingual News for the Creation of a Pre-frontier Intelligence Picture    (pdf)
    Jakub Piskorski, Martin Atkinson, Jenya Belyaeva, Vanni Zavarella, Silja Huttunen, Roman Yangarber.
    In Proceedings of the 16th Conference on Knowledge Discovery and Data Mining (KDD-2010); ACM SIGKDD Workshop on Intelligence and Security Informatics.
    (2010) Washington, DC

  35. Filtering news for epidemic surveillance: towards processing more languages with fewer resources   
    Gael Lejeune, Antoine Doucet, Roman Yangarber, Nadine Lucas.
    CLIA: Fourth International Workshop On Cross Lingual Information Access, at COLING 2010
    (2010) Beijing, China

  36. Utility evaluation of tools for collaborative development and maintenance of ontologies   
    Alex Norta, Roman Yangarber, Lauri Carlson.
    VORTE-2010: Joint 5th International Workshop on Vocabularies, Ontologies and Rules for The Enterprise / International Workshop on Metamodels, Ontologies and Semantic Technologies (MOST) at EDOC-2010: the Fourteenth IEEE International Conference On Enterprise Computing
    (2010) Vitória, ES, Brazil

  37. News mining for border security intelligence    (pdf)
    Jakub Piskorski, Martin Atkinson, Jenya Belyaeva, Vanni Zavarella, Silja Huttunen, Roman Yangarber.
    In IEEE ISI-2010: Intelligence and Security Informatics
    (2010) Vancouver, BC, Canada

  38. The landscape of international event-based biosurveillance    (pdf)
    D Hartley, N Nelson, R Walters, R Arthur, R Yangarber, L Madoff, J Linge, A Mawudeku, N Collier, J Brownstein, G Thinus, N Lightfoot.
    In Emerging Health Threats Journal, 3:e3 (2010)

  39. A proposal for a multilingual epidemic surveillance system.   
    Gael Lejeune, Mohammed Hatmi, Antoine Doucet, Silja Huttunen, Nadine Lucas.
    In Proceedings of MINUCS-2009: Workshop on Mining User-Generated Content for Security, at the UCMedia-2009: ICST Conference on User-Centric Media
    (2009) Venice, Italy

  40. Automated event extraction in the domain of Border Security    (pdf)
    Martin Atkinson, Jakub Piskorski, Hristo Tanev, Erik van der Goot, Roman Yangarber, Vanni Zavarella.
    In Proceedings of MINUCS-2009: Workshop on Mining User-Generated Content for Security, at the UCMedia-2009: ICST Conference on User-Centric Media
    (2009) Venice, Italy

  41. Automatic epidemiological surveillance from on-line news in MedISys and PULS    (pdf)
    Roman Yangarber, Peter von Etter, Ralf Steinberger.
    In Proceedings of IMED-2009: International Meeting on Emerging Diseases and Surveillance
    (2009) Vienna, Austria

  42. Internet surveillance systems for early alerting of health threats    (pdf)
    Jens P. Linge, Ralf Steinberger, Thomas P. Weber, Roman Yangarber, Erik van der Goot, Delilah H. Al-Khudhairy, Nikolaos I. Stilianakis.
    In Eurosurveillance Journal, 14(13)
    (2009) Stockholm, Sweden

  43. Text mining from the Web for Medical Intelligence   (pdf)   (abstract)
    Ralf Steinberger, Flavio Fuart, Erik van der Goot, Clive Best, Peter von Etter, Roman Yangarber.
    In: Mining Massive Data Sets for Security, D. Perrotta, J. Piskorski, F. Soulié-Fogelman & R. Steinberger (eds.): IOS Press
    (2008) Amsterdam, The Netherlands

  44. Content Collection and Analysis in the Domain of Epidemiology   (pdf)
    Roman Yangarber, Peter von Etter, Ralf Steinberger.
    In Proceedings of DrMED-2008: International Workshop on Describing Medical Web Resources, at MIE-2008: the 21st International Congress of the European Federation for Medical Informatics
    (2008) Göteborg, Sweden

  45. Combining information retrieval and information extraction for medical intelligence    (pdf)
    Roman Yangarber, Ralf Steinberger, Clive Best, Peter von Etter, Flavio Fuart, David Horby.
    Mining Massive Data Sets for Security, NATO Advanced Study Institute
    (2007) Gazzada, Italy

  46. Combining Information about Epidemic Threats from Multiple Sources    (pdf)
    Roman Yangarber, Clive Best, Peter von Etter, Flavio Fuart, David Horby, Ralf Steinberger.
    In Proceedings of Multi-source, Multilingual Information Extraction and Summarization at RANLP-2007.
    (2007) Borovets, Bulgaria

  47. Verification of Facts across Document Boundaries    (pdf)
    Roman Yangarber.
    In Proceedings of IIIA-2006: International Workshop on Intelligent Information Access
    (2006) Helsinki, Finland

  48. Confidence measuring and data improvement of extracted information from disease outbreak reports   (pdf)
    Master's Thesis: Lauri Jokipii   (html)
    University of Helsinki, Department of Computer Science (2006)