Publications

2023

  1. Pęzik, Piotr, Agnieszka Mikołajczyk, Adam Wawrzyński, Filip Żarnecki, Bartłomiej Nitoń, and Maciej Ogrodniczuk. ‘Transferable Keyword Extraction and Generation with Text-to-Text Language Models’. In Computational Science – ICCS 2023, edited by Jiří Mikyška, Clélia de Mulatier, Maciej Paszynski, Valeria V. Krzhizhanovskaya, Jack J. Dongarra, and Peter M.A. Sloot, 398–405. Cham: Springer Nature Switzerland, 2023.
  2. Ogrodniczuk, Maciej, Piotr Pęzik, Marek Łaziński, and Marcin Miłkowski. ‘Language Report Polish’. In European Language Equality: A Strategic Agenda for Digital Language Equality, edited by Georg Rehm and Andy Way, 191–94. Cham: Springer International Publishing, 2023. https://doi.org/10.1007/978-3-031-28819-7_29.
  3. Deckert, Mikołaj, Piotr Pęzik, and Raffaele Zago, eds. Language, Expressivity and Cognition. London New York Oxford New Delhi Sydney: Bloomsbury Academic, 2023.
  4. Deckert, Mikołaj, Krzysztof Hejduk, and Piotr Pęzik. ‘A Phraseological Perspective on Evaluation: The Covid-19 Vaccination in Polish Web-Based News’. In Language, Expressivity and Cognition. London New York Oxford New Delhi Sydney: Bloomsbury Academic, 2023.
  5. Deckert, Mikołaj, Piotr Pęzik, and Raffaele Zago. ‘Constructing Emotion in Contemporary Discourses: A Taste for Expressivity’. In Language, Expressivity and Cognition. London New York Oxford New Delhi Sydney: Bloomsbury Academic, 2023.

2022

  1. Pęzik, Piotr, Agnieszka Mikołajczyk, Adam Wawrzyński, Bartłomiej Nitoń, and Maciej Ogrodniczuk. ‘Keyword Extraction from Short Texts with a Text-to-Text Transfer Transformer’. In Recent Challenges in Intelligent Information and Database Systems, edited by Edward Szczerbicki, Krystian Wojtkiewicz, Sinh Van Nguyen, Marcin Pietranik, and Marek Krótkiewicz, 1716:530–42. Communications in Computer and Information Science. Singapore: Springer Nature Singapore, 2022. https://doi.org/10.1007/978-981-19-8234-7_41.
  2. Mojedano Batel, Andrea, Mitchell Abrams, and Piotr Pęzik. ‘Native Dialect Influence Detection (NDID): Differentiating between Mexican and Peninsular L1 Spanish in L2 English’. Language and Law / Linguagem e Direito 9, no. 1 (22 November 2022). https://ojs.letras.up.pt/index.php/LLLD/article/view/12829.
  3. Bevendorff, Janek, Berta Chulvi, Elisabetta Fersini, Annina Heini, Mike Kestemont, Krzysztof Kredens, Maximilian Mayerl, Piotr Pęzik et al. ‘Overview of PAN 2022: Authorship Verification, Profiling Irony and Stereotype Spreaders, Style Change Detection, and Trigger Detection: Extended Abstract’. In Advances in Information Retrieval: 44th European Conference on IR Research, ECIR 2022, Stavanger, Norway, April 10–14, 2022, Proceedings, Part II, 331–38. Berlin, Heidelberg: Springer-Verlag, 2022. https://doi.org/10.1007/978-3-030-99739-7_42.
  4. Cichosz, Anna, Piotr Pęzik, Maciej Grabski, Sylwia Karasińska, Michał Adamczyk, Paulina Rybińska, and Aneta Ostrowska. A Frequency Dictionary of Old English Prose for Learners of Old English and Historical Linguists. Wydawnictwo Uniwersytetu Łódzkiego, 2022. https://doi.org/10.18778/8220-899-3.
  5. Váradi, Tamás, Bence Nyéki, Svetla Koeva, Marko Tadić, Vanja Štefanec, Maciej Ogrodniczuk, Bartłomiej Nitoń, Piotr Pęzik et al. ‘Introducing the CURLICAT Corpora: Seven-Language Domain Specific Annotated Corpora from Curated Sources’. In Proceedings of the Language Resources and Evaluation Conference, 100–108. Marseille, France: European Language Resources Association, 2022. http://www.lrec-conf.org/proceedings/lrec2022/pdf/2022.lrec-1.11.pdf.
  6. Pęzik, Piotr, Gosia Krawentek, Sylwia Karasińska, Paweł Wilk, Paulina Rybińska, Anna Cichosz, Angelika Peljak-Łapińska, Mikołaj Deckert, and Michał Adamczyk. ‘DiaBiz – an Annotated Corpus of Polish Call Center Dialogs’. In Proceedings of the Language Resources and Evaluation Conference, 723–26. Marseille, France: European Language Resources Association, 2022. http://www.lrec-conf.org/proceedings/lrec2022/pdf/2022.lrec-1.76.pdf.

2021

  1. Lewandowska-Tomaszczyk, Barbara, and Piotr Pęzik. ‘Emergent Impoliteness and Persuasive Emotionality in Polish Media Discourses’. Russian Journal of Linguistics 25, no. 3 (2021): 685–704. https://doi.org/10.22363/2687-0088-2021-25-3-685-704.
  2. Mikołajczyk, Agnieszka, Adam Wawrzyński, Piotr Pęzik, Michał Adamczyk, Adam Kaczmarek, and Wojciech Janowski. ‘Punctuation Restoration from Read Text’. In Proceedings of the PolEval 2021 Workshop, 21–31. Institute of Computer Science, Polish Academy of Sciences, 2021. http://poleval.pl/files/poleval2021.pdf.
  3. Pappagari, Raghavendra, Piotr Żelasko, Agnieszka Mikołajczyk, Piotr Pęzik, and Najim Dehak. ‘Joint Prediction of Truecasing and Punctuation for Conversational Speech in Low-Resource Scenarios’, 2021. https://arxiv.org/abs/2109.06103.
  4. Pęzik, Piotr. ‘Exploring the Valency of Collocational Chains’. In Formulaic Language. Theories and Methods. Phraseology and Multiword Expressions 5. Language Science Press, 2021. https://doi.org/10.5281/ZENODO.4727665.

2020

  1. Pęzik, Piotr. ‘Budowa i zastosowania korpusu monitorującego MoncoPL’. Forum Lingwistyczne, no. 7 (2020): 133–50. https://doi.org/10.31261/fl.2020.07.11.
  2. Pęzik, Piotr. ‘Korpusowe Narzędzia Weryfikacji Frazeostylistycznej Tłumaczeń’. Konińskie Studia Językowe, 2019.
  3. Majewska-Tworek, Anna, Monika Zaśko-Zielińska, and Piotr Pęzik. ‘„Polszczyzna Mówiona Miast” – Kontynuacja Badań z Lat 80. XX Wieku z Wykorzystaniem Narzędzi Lingwistyki Cyfrowej’. Forum Lingwistyczne, no. 7 (20 November 2020): 71–87. https://doi.org/10.31261/FL.2020.07.06.

2001-2019

  1. Ogrodniczuk, Maciej, Rafał L. Górski, Marek Łaziński, and Piotr Pęzik. ‘From the National Corpus of Polish to the Polish Corpus Infrastructure’. Jazykovedný Časopis, no. 2 (2019): 315–323. https://doi.org/10.2478/jazcas-2019-0061.
  2. Piotrowski, Mateusz, Wojciech Janowski, and Piotr Pęzik. “A Bidirectional LSTM-CRF Network with Subword Representations, Character Convolutions and Morphosyntactic Features for Named Entity Recognition in Polish.” Proceedings of the PolEval 2018 Workshop, 2018, 93.
  3. Pęzik, Piotr. “Increasing the Accessibility of Time-Aligned Speech Corpora with Spokes Mix,” 4297–4300. Miyazaki, Japan, 2018. http://www.lrec-conf.org/proceedings/lrec2018/pdf/888.pdf.
  4. Pęzik, Piotr. Facets of Prefabrication. Perspectives on Modelling and Detecting Phraseological Units. Łódź: Wydawnictwo Uniwersytetu Łódzkiego, 2018.
  5. Lew, Michał, and Piotr Pęzik. “A Sequential Child-Combination Tree-LSTM Network for Sentiment Analysis.” In Human Language Technologies as a Challenge for Computer Science and Linguistics, 397–401. Poznań, 2017. http://ltc.amu.edu.pl/book/papers/PolEval2-2.pdf.
  6. Molenda, Marek, Piotr Pęzik, and John Osborne. “Self-Repetitions in Learners’ Spoken Language: A Corpus-Based Study.” In Learner Corpus Research, New Perspectives and Applications, 1st ed. Bloomsbury Academic, 2017.
  7. Pęzik, Piotr. “Experimental Applications of Dependency-Based Phraseology Extraction.” In Language, Corpora and Cognition, edited by Piotr Pęzik and Jacek Waliński, 29–55. Peter Lang, 2017.
  8. Cichosz, Anna, Jerzy Gaszewski, and Piotr Pęzik. Element Order in Old English and Old High German Translations. Nowele Supplement Series, Volume 28. Amsterdam ; Philadelphia: John Benjamins Publishing Company, 2016.
  9. Pęzik, Piotr. “Exploring Phraseological Equivalence with Paralela.” In Polish-Language Parallel Corpora, edited by Ewa Gruszczyńska and Agnieszka Leńko-Szymańska, 67–81. Warsaw: Instytut Lingwistyki Stosowanej UW, 2016.
  10. Pęzik, Piotr, and Mikołaj Deckert. “Time-Discretising Adverbials. Distributional Evidence of Conceptualisation Patterns.” In Conceptualizations of Time, edited by Barbara Lewandowska-Tomaszczyk, 295–316. Human Cognitive Processing 52. Amsterdam ; Philadelphia: John Benjamins Publishing Company, 2016.
  11. Pęzik, Piotr. “Spokes – a Search and Exploration Service for Conversational Corpus Data.” In Selected Papers from CLARIN 2014, 99–109. Linköping Electronic Conference Proceedings. Linköping University Electronic Press, Linköpings universitet, 2015. http://www.ep.liu.se/ecp_article/index.en.aspx?issue=116;article=009.
  12. Pęzik, Piotr. “Using N-Gram Independence to Identify Discourse-Functional Lexical Units in Spoken Learner Corpus Data.” International Journal of Learner Corpus Research 1, no. 2 (2015): 242–55. doi:10.1075/ijlcr.1.2.03pez. http://www.jbe-platform.com/content/journals/10.1075/ijlcr.1.2.03pez
  13. Molenda, Marek, and Piotr Pęzik. “Extending the Definition of Confluence. A Corpus-Based Study of Advanced Learners’ Spoken Language.” In Insights into Technology Enhanced Language Pedagogy, 2015.
  14. Pęzik, Piotr. “Graph-Based Analysis of Collocational Profiles.” In Phraseologie Im Wörterbuch Und Korpus (Phraseology in Dictionaries and Corpora), edited by Vida Jesenšek and Peter Grzybek, 227–43. ZORA 97. Maribor, Bielsko‑Biała, Budapest, Kansas, Praha: Filozofska fakulteta, 2014. http://www.ff.um.si/zalozba-in-knjigarna/ponudba/zbirke-in-revije/zora/priloge/Zora97zasplet.pdf#page=229.
  15. Deckert, Mikołaj, and Piotr Pęzik. “Degrees of Propositionality in Construals of Time Quantities.” Research in Language 12, no. 4 (January 1, 2014). doi:10.1515/rela-2015-0003. http://www.degruyter.com/view/j/rela.2014.12.issue-4/rela-2015-0003/rela-2015-0003.xml
  16. Rehm, Georg, Hans Uszkoreit, Sophia Ananiadou, Núria Bel, Audronė Bielevičienė, Lars Borin, António Branco, (…) Piotr Pęzik (…) et al. “The Strategic Impact of META-NET on the Regional, National and International Level.” In LREC 2014 Proceedings, 1517–24, 2014.
  17. Pęzik, Piotr. “Wybrane aspekty reprezentatywności małych i średnich korpusów.” In Na Tropach Korpusów. W Poszukiwaniu Optymalnych Zbiorów Tekstów, edited by Wojciech Chlebda, 45–58. Opole, 2013.
  18. Pęzik, Piotr. “Paradygmat Dystrybucyjny W Badaniach Frazeologicznych. Powtarzalność, Reprodukcja I Idiomatyzacja.” In Metodologie Językoznawstwa. Ewolucja Języka, Ewolucja Teorii Językoznawczych., edited by Piotr Stalmaszczyk, 141–60. Wydawnictwo Uniwersytetu Łódzkiego, 2013.
  19. Lewandowska-Tomaszczyk, Barbara, Mirosław Bańko, Rafał L. Górski, Marek Łaziński, Piotr Pęzik, and Adam Przepiórkowski. “Narodowy Korpus Języka Polskiego: Geneza I Dzień Dzisiejszy.” In Narodowy Korpus Języka Polskiego: Geneza I Dzień Dzisiejszy, edited by Adam Przepiórkowski, Mirosław Bańko, R. L. Górski, and Barbara Lewandowska-Tomaszczyk, 3–10, 2012.
  20. Ogrodniczuk, Maciej, Radovan Garabík, Svetla Koeva, Cvetana Krstev, Piotr Pęzik, Tibor Pintér, Adam Przepiórkowski, et al. “Central and South-European Language Resources in META-SHARE,” 2012. http://infoteka.bg.ac.rs/index.php?where=107&Jezik=engleski.
  21. Ogrodniczuk, Maciej, Piotr Pęzik, and Adam Przepiórkowski. “Towards a Comprehensive Open Repository of Polish Language Resources.” In Proceedings of the Eighth International Conference on Language Resources and Evaluation, LREC 2012, 3593–97. Istanbul: ELRA, 2012.
  22. Pęzik, Piotr. “Towards the PELCRA Learner English Corpus.” In Corpus Data across Languages and Disciplines, edited by Piotr Pęzik, 28:33–42. Łódź Studies in Language. Peter Lang, 2012.
  23. Przepiórkowski, Adam, Mirosław Bańko, Marek Łaziński, Rafał Górski, Barbara Lewandowska-Tomaszczyk, and Piotr Pęzik. “Practical Applications of the National Corpus of Polish.” Prace Filologiczne LXIII (2012): 231–39.
  24. Pęzik, Piotr, ed. Corpus Data across Languages and Disciplines. Vol. 28. Łódź Studies in Language. Peter Lang, 2012. http://www.peterlang.com/index.cfm?event=cmp.ccc.seitenstruktur.detailseiten&seitentyp=produkt&pk=70630&concordeid=262547.
  25. Pęzik, Piotr. “Język mówiony w NKJP.” In Narodowy Korpus Języka Polskiego, edited by Adam Przepiórkowski, Mirosław Bańko, Rafał Górski, and Barbara Lewandowska-Tomaszczyk, 37–47. Warszawa: Wydawnictwo Naukowe PWN, 2012. http://nkjp.pl/settings/papers/NKJP_ksiazka.pdf.
  26. Pęzik, Piotr. “NKJP w warsztacie tłumacza.” In Narodowy Korpus Języka Polskiego, edited by Adam Przepiórkowski, Mirosław Bańko, Rafał Górski, and Barbara Lewandowska-Tomaszczyk, 301–11. Warszawa: Wydawnictwo Naukowe PWN, 2012. http://nkjp.pl/settings/papers/NKJP_ksiazka.pdf.
  27. Pęzik, Piotr. “Wyszukiwarka PELCRA dla danych NKJP.” In Narodowy Korpus Języka Polskiego, edited by Adam Przepiórkowski, Mirosław Bańko, Rafał Górski, and Barbara Lewandowska-Tomaszczyk, 253–79. Warszawa: Wydawnictwo Naukowe PWN, 2012. http://nkjp.pl/settings/papers/NKJP_ksiazka.pdf.
  28. Pęzik, Piotr. “Providing Corpus Feedback for Translators with the PELCRA Search Engine for NKJP.” In Explorations across Languages and Corpora: PALC 2009, edited by Stanisław Goźdź-Roszkowski, 135–44. Łódź Studies in Linguistics. Frankfurt am Main; New York: Peter Lang, 2011.
  29. Pęzik, Piotr, Maciej Ogrodniczuk, and Adam Przepiórkowski. “Parallel and Spoken Corpora in an Open Repository of Polish Language Resources.” In Proceedings of the 5th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, edited by Zygmunt Vetulani, 511–15, 2011.
  30. Przepiórkowski, Adam, Mirosław Bańko, Rafał Górski, Barbara Lewandowska-Tomaszczyk, and Piotr Pęzik. “National Corpus of Polish.” In Proceedings of the 5th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, edited by Zygmunt Vetulani, 259–63, 2011.
  31. Thompson, Paul, John McNaught, Simonetta Montemagni, Nicoletta Calzolari, Riccardo del Gratta, Vivian Lee, Simone Marchi, Piotr Pęzik et al. “The BioLexicon: A Large-Scale Terminological Resource for Biomedical Text Mining.” BMC Bioinformatics 12 (2011): 397. doi:10.1186/1471-2105-12-397.
  32. Pęzik, Piotr. “Computational and Corpus Linguistics.” In New Ways to Language, edited by Barbara Lewandowska-Tomaszczyk, 433–60. Łódź: Wydawnictwo Uniwersytetu Łódzkiego, 2010.
  33. Rebholz-Schuhmann, D, S Kavaliauskas, and Piotr Pęzik. “PaperMaker: Validation of Biomedical Scientific Publications.” Bioinformatics (Oxford, England) 26, no. 7 (April 1, 2010): 982–84. doi:10.1093/bioinformatics/btq060.
  34. Grego, Tiago, Piotr Pęzik, Francisco M. Couto, and Dietrich Rebholz-Schuhmann. “Identification of Chemical Entities in Patent Documents.” In Distributed Computing, Artificial Intelligence, Bioinformatics, Soft Computing, and Ambient Assisted Living, edited by Sigeru Omatu, Miguel P. Rocha, José Bravo, Florentino Fernández, Emilio Corchado, Andrés Bustillo, and Juan M. Corchado, 5518:942–49. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009. http://www.springerlink.com/index/10.1007/978-3-642-02481-8_144.
  35. Pęzik, Piotr. “Extraction of Multiword Expressions for Corpus-Based Discourse Analysis.” In Studies in Cognitive Corpus Linguistics, edited by Barbara Lewandowska-Tomaszczyk and Katarzyna Dziwirek. Frankfurt am Main; New York: P. Lang, 2009.
  36. Pęzik, P., A. Jimeno-Yepes, and D. Rebholz-Schuhmann. “Using Biomedical Terminological Resources for Information Retrieval.” In Information Retrieval in Biomedicine: Natural Language Processing for Knowledge Integration, edited by Violaine Prince and Mathieu Roche, 58–77. IGI Global, 2009. http://services.igi-global.com/resolvedoi/resolve.aspx?doi=10.4018/978-1-60566-274-9.
  37. Przepiórkowski, Adam, Rafał L. Górski, Marek Łaziński, and Piotr Pęzik. “Recent Developments in the National Corpus of Polish.” In NLP, Corpus Linguistics, Corpus Based Grammar Research: Proceedings of the Fifth International Conference, Smolenice, Slovakia, 25–27 November 2009, edited by Jana Levická and Radovan Garabík, 302–9. Brno: Tribun, 2009. http://nlp.ipipan.waw.pl/~adamp/Papers/2009-slovko-nkjp/.
  38. Trieschnigg, Dolf, Piotr Pęzik, Vivian Lee, Franciska de Jong, Wessel Kraaij, and Dietrich Rebholz-Schuhmann. “MeSH Up: Effective MeSH Text Classification for Improved Document Retrieval.” Bioinformatics (Oxford, England) 25, no. 11 (June 1, 2009): 1412–18. doi:10.1093/bioinformatics/btp249.
  39. Trieschnigg, D., P. Pezik, V. Lee, F. de Jong, W. Kraaij, and D. Rebholz-Schuhmann. “Response to Comment on ‘MeSH-up: Effective MeSH Text Classification for Improved Document Retrieval.’” Bioinformatics 25, no. 20 (August 24, 2009): 2772–2772. doi:10.1093/bioinformatics/btp484.
  40. Waagmeester, Andra, Piotr Pęzik, Susan Coort, Franck Tourniaire, Chris Evelo, and Dietrich Rebholz-Schuhmann. “Pathway Enrichment Based on Text Mining and Its Validation on Carotenoid and Vitamin A Metabolism.” Omics: A Journal of Integrative Biology 13, no. 5 (2009): 367–79. doi:10.1089/omi.2009.0029.
  41. Kim, Jung-Jae, Piotr Pęzik, and Dietrich Rebholz-Schuhmann. “MedEvi: Retrieving Textual Evidence of Relations between Biomedical Concepts from Medline.” Bioinformatics (Oxford, England) 24, no. 11 (June 1, 2008): 1410–12. doi:10.1093/bioinformatics/btn117.
  42. Pęzik, P., A. Jimeno-Yepes, V. Lee, and D. Rebholz-Schuhmann. “Static Dictionary Features for Term Polysemy Identification.” In Building and Evaluating Resources for Biomedical Text Mining, LREC Workshop, 2008.
  43. Rebholz-Schuhmann, D., P. Pezik, V. Lee, J.J. Kim, R. del Gratta, Y. Sasaki, J. McNaught, et al. “BioLexicon: Towards a Reference Terminological Resource in the Biomedical Domain.” The 16th Annual International Conference on Intelligent Systems for Molecular Biology (ISMB-2008), 2008.
  44. Jimeno, A., P. Pezik, and D. Rebholz-Schuhmann. “Information Retrieval and Information Extraction in Trec Genomics 2007.” The Sixteenth Text REtrieval Conference (TREC 2007) Proceedings. NIST Special Publication: SP, 2007, 274–274.
  45. Pęzik, Piotr. “Lexis, the Lexicon, Terms, Idioms and Co-Occurrence Statistics - a Case Study.” edited by Jacek Waliński, Krzysztof Kredens, and Stanisław Goźdź-Roszkowski. Lang, 2007.
  46. Waliński, Jacek, and Piotr Pęzik. “Web Access Interface to the PELCRA Referential Corpus of Polish.” edited by Jacek Waliński, Krzysztof Kredens, and Stanisław Goźdź-Roszkowski, 65–86. Lang, 2007.
  47. Hardie, A., E. Levin, and P. Pezik. “Analiza Morfologiczno-Składniowa Korpusów.” In Podstawy Językoznawstwa Korpusowego, edited by B. Lewandowska-Tomaszczyk, 75–94. Wydawnictwo Uniwersytetu Łódzkiego, 2005.
  48. Wilson, Andrew, and P. Pęzik. “Systemy Anotacji Korpusów Językowych, Korpusów Równoległych i Porównywalnych.” In Podstawy Językoznawstwa Korpusowego, edited by B. Lewandowska-Tomaszczyk, 61–74. Wydawnictwo Uniwersytetu Łódzkiego, 2005.
  49. Uzar, Rafał, Piotr Pęzik, and Eric Levin. “Developing Relational Databases for Corpus Linguistics.” In Practical Applications in Language and Computers, edited by Barbara Lewandowska-Tomaszczyk. Frankfurt am Main; New York: Peter Lang, 2004.
  50. Abiteboul, Serge, Peter Buneman, and Dan Suciu. Dane W Sieci WWW: Od Relacji Do Modelu Semistrukturalnego i XML. Translated by Paweł Brągoszewski, Piotr Pęzik, and Sławomir Dzieniszewski, 2001.