Publications
Romein, C.A., Kırmızıaltın, S., Reshef, R. et al. From research proposal to project management. A guide from the Transkribus community on planning and executing workflows for researchers and GLAM-professionals. Int J Digit Humanities (2025). https://doi.org/10.1007/s42803-025-00107-7
Romein, C.A., Rabus, A., Leifert, G. et al. Assessing advanced handwritten text recognition engines for digitizing historical documents. Int J Digit Humanities 7, 115–134 (2025). https://doi.org/10.1007/s42803-025-00100-0
Romein, C. A. (2024). State of the Field: Digital Legal History. Journal for Digital Legal History, 3(1). https://doi.org/10.21825/dlh.91695
Kont, Jülide, Elving, Wim, Broersma, Marcel and Bozdağ, Çiğdem. "What makes audiences resilient to disinformation? Integrating micro, meso, and macro factors based on a systematic literature review" Communications, 2024. https://doi.org/10.1515/commun-2023-0078
Kont, J., Bozdağ, Ç., Elving, W., & Broersma, M. (2026). So Emotional? The Role of Emotions for Young Adults’ Resilience to Disinformation. Media and Communication, 14, Article 11398. https://doi.org/10.17645/mac.11398
van den Hoogen, J., Hudson, D. & Atzmueller, M. (2026). A comparison of graph construction techniques for applying graph signal processing to soil moisture networks. Discov Computing 29, 17. https://doi.org/10.1007/s10791-025-09862-1
Eijkelboom, I., Schulp, A. S., Amkreutz, L., Verheul, D., Verschoof-van der Vaart, W., Van der Vaart-Verschoof, S., Hogeweg, L., Brunink, D., Mol, D., Peeters, H., & Wesselingh, F. (2025). Making sense of fossils and artefacts: a review of best practices for the design of a successful workflow for machine learning-assisted citizen science projects. PeerJ, 13(2), Article e18927.
Krause, L., Heemskerk, E., de Boer, V., & Cheah, W. S. (2026, April 20). Towards Polyvocal Metadata Collection for Colonial Collections: a Pilot Study in Kuching, Malaysia. DH Benelux 2026, Maastricht, Netherlands. https://doi.org/10.5281/zenodo.19663429
Bakker, J., & Kamps, J. (2024, November). Cochrane-auto: An Aligned Dataset for the Simplification of Biomedical Abstracts. The Third Workshop on Text Simplification, Accessibility and Readability (TSAR 2024), Miami, Florida, USA. https://doi.org/10.18653/v1/2024.tsar-1.5
Bakker, J., Yüksel, G., & Kamps, J. (2024, September). University of Amsterdam at the CLEF 2024 SimpleText Track. Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2024), Grenoble, France. https://doi.org/10.5281/zenodo.14886987
Bakker, J., & Kamps, J. (2024, May). Beyond Sentence-level Text Simplification: Reproducibility Study of Context-Aware Document Simplification. The Workshop on DeTermIt! Evaluating Text Difficulty in a Multilingual Context @ LREC-COLING 2024 (DeTermIt), Torino, Italia. https://doi.org/10.5281/zenodo.14886721
Bram Bakker and Iris Hendrickx (2025, November 21). Evaluation of large language models on hierarchical entity matching for cultural heritage metadata. Anthology of Computers and the Humanities, 3:1180–1197. https://doi.org/10.63744/UKsLY7DKNPvA
Peeters, S., Romein, C. A., & Weber, A. (2025, November 21). Towards a NAvigator Tool for Dutch Verbaal-Archives: Leveraging Nineteenth-Century Archival Logic for Keyword Search. Anthology of Computers and the Humanities, 3, 1169–1179. https://doi.org/10.63744/byky4ladhcbb
Jan Bakker and Jaap Kamps. 2025. Section-Level Simplification of Biomedical Abstracts. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 13819–13833, Suzhou, China. Association for Computational Linguistics. https://aclanthology.org/2025.emnlp-main.697/
Taiki Papandreou, Jan Bakker, and Jaap Kamps. 2025. Medical Text Simplification From Jargon Detection to Jargon-Aware Prompting. Proceedings of the Fourth Workshop on Text Simplification, Accessibility and Readability (TSAR 2025), pages 36–46, Suzhou, China. Association for Computational Linguistics. https://aclanthology.org/2025.tsar-1.3/
Bram Bakker, Xiaomeng Wang (2025, October 25-26). Improving Text-to-image Retrieval of News Articles, Class-aware Fine-tuning of CLIP and the Use of Lead Text. MediaEval 2025 Workshop, Dublin, Ireland. https://2025.multimediaeval.com/paper16.pdf
Xiaomeng Wang, Bram Bakker (2025, October 25-26). Diffusion-Based Approaches for NewsImage Generation: A Comparative Study of SDXL Variants. MediaEval 2025 Workshop, Dublin, Ireland. https://2025.multimediaeval.com/paper17.pdf
Bruno N. Sotic and Jaap Kamps (2025, September 15). What Makes a User Click on a News Item? Understanding News Values of Visual Content in News Recommendation. Linking Theory and Practice of Digital Libraries: 29th International Conference on Theory and Practice of Digital Libraries, TPDL 2025, Tampere, Finland, September 23–26, 2025, Proceedings. Springer-Verlag, Berlin, Heidelberg, 321–339. https://doi.org/10.1007/978-3-032-05409-8_19
Kreefft-Libiu A., Helms F., Selçuk C., Bakker J. and Kamps J. (2025, September 9-12). University of Amsterdam at the CLEF 2025 JOKER Track (Vol. 4038, pages 2821-2828) CEUR-WS. https://ceur-ws.org/Vol-4038/paper_224.pdf
Bakker J., Vendeville B., Ermakova L. and Kamps J. (2025, September 9-12). Overview of the CLEF 2025 SimpleText Task 1: Simplify Scientific Text (Vol. 4038, pages 4167-4185). https://ceur-ws.org/Vol-4038/paper_344.pdf
Vendeville B., Bakker J., Azarbonyad H., Ermakova L. and Kamps J. (2025, September 9-12). Overview of the CLEF 2025 SimpleText Task 2: Identify and Avoid Hallucination (Vol. 4038, pages 4186-4204). https://ceur-ws.org/Vol-4038/paper_345.pdf
Papandreou T., Bakker J., Kamps J. (2025, September 9-12). University of Amsterdam at the CLEF 2025 SimpleText Track (Vol. 4038, pages 4356-4362) CEUR-WS. https://ceur-ws.org/Vol-4038/paper_359.pdf
Karlgren, J., Engels, M. I., Barrett, M., Gunti, R. R., Hoveyda, M., Nadalic Sotic, B., Kamps, J., Koistinen, M., & Zosa, E. (2025, September 9-12). Overview and Joint Report of the Robustness and Consistency Task of the ELOQUENT 2025 Lab for Evaluating Generative Language Model Quality: Notebook for the ELOQUENT Lab at CLEF 2025. In CLEF 2025 Working Notes (Vol. 4038, pages 1306-1319). CEUR-WS. https://ceur-ws.org/Vol-4038/paper_104.pdf
Sotić, Bruno N., & Kamps, J. (2025, September 9-12). Evaluating the Influence of Stylistic Prompt Variations on Semantic Interpretation. In CLEF 2025 Working Notes (Vol. 4038, pages 1443-1448). CEUR-WS. https://ceur-ws.org/Vol-4038/paper_115.pdf
Ermakova, L., Azarbonyad, H., Bakker, J., Vendeville, B., & Kamps, J. (2025, September). Overview of the CLEF 2025 SimpleText Track: Simplify Scientific Text (and Nothing More). Conference and Labs of the Evaluation Forum (CLEF 2025), Madrid, Spain. https://doi.org/10.1007/978-3-032-04354-2_23
Liana Ermakova, Hosein Azarbonyad, Jan Bakker, Benjamin Vendeville, and Jaap Kamps. 2025. Simplification de Textes Scientifiques (et Rien de Plus). Rapport sur l’Action CLEF 2025 SimpleText. In Actes de la 20e Conférence en Recherche d’Information et Applications (CORIA), pages 230–232, Marseille, France. ATALA & ARIA. https://aclanthology.org/2025.jeptalnrecital-coria.20/
Ermakova, L., Azarbonyad, H., Bakker, J., Vendeville, B., & Kamps, J. (2025, April). CLEF 2025 SimpleText Track: Simplify Scientific Text (and Nothing More). 47th European Conference on Information Retrieval (ECIR 2025), Lucca, Italy. https://doi.org/10.1007/978-3-031-88720-8_63
Bakker, J., Papandreou-Lazos, T., & Kamps, J. (2024, November). Biomedical Text Simplification Models Trained on Aligned Abstracts and Lay Summaries. The Thirty-Third Text REtrieval Conference (TREC 2024), Gaithersburg, MD, USA. https://trec.nist.gov/pubs/trec33/papers/UAmsterdam.plaba.pdf
Bakker, J., & Kamps, J. (2024, November). Cochrane-auto: An Aligned Dataset for the Simplification of Biomedical Abstracts. The Third Workshop on Text Simplification, Accessibility and Readability (TSAR 2024), Miami, Florida, USA. https://aclanthology.org/2024.tsar-1.5/
Bakker, J., Yüksel, G., & Kamps, J. (2024, September). University of Amsterdam at the CLEF 2024 SimpleText Track. Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2024), Grenoble, France. https://ceur-ws.org/Vol-3740/paper-310.pdf
Bakker, J., & Kamps, J. (2024, May). Beyond Sentence-level Text Simplification: Reproducibility Study of Context-Aware Document Simplification. The Workshop on DeTermIt! Evaluating Text Difficulty in a Multilingual Context @ LREC-COLING 2024 (DeTermIt), Torino, Italia. https://aclanthology.org/2024.determit-1.3/
Romein, C.A. (June 2025). ‘Kun je die 200.000 teksten niet gewoon zelf lezen?’ Hoe digital humanities historisch onderzoek veranderen. (Column) Skript Historisch Tijdschrift, 47.2, pp. 32-34.
Romein, C. A. (2025). Handleiding Citizen Scientists Collectie Overijssel i.s.m. HAICu/ UTwente Archiefdeel: Resoluties van de Staten van Overijssel. Zenodo. https://doi.org/10.5281/zenodo.15401991
11 June 2024: https://pro.europeana.eu/post/haicu-using-ai-to-access-connect-and-analyse-heritage-collection
van den Broek, R. (2025, February). Hoe dragen online erfgoedcollecties bij aan het journalistieke werk? https://www.journalismlab.nl/onderzoek/hoe-dragen-online-erfgoedcollecties-bij-aan-het-journalistieke-werk/
A Posters
Bi, Q., Yi, J., Salah, A., Veltkamp, R. (2025, December 17). Modeling Visual Flow Leakage to Alleviate Hallucination in Large Vision-Language Models. HAICu-Day 2025, Hilversum. Zenodo. https://zenodo.org/records/18135988
Mahadeshwar, R.; Caselli, T.; van Cranenburgh, A. & Nissim, M. Evaluating the Impact of Source Diversity for RAG in Historical Research. HAICu-Day 2025, Hilversum. Zenodo. https://doi.org/10.5281/zenodo.17986713
Romein, C. A. (2025). Making Provincial Archives Accessible: ML/AI-Driven Transformation of the Overijssel Resolutions (1578-1795). HAICu-Day 2025, Hilversum. Zenodo. https://doi.org/10.5281/zenodo.17606759
Romein, C. A., van Schuijlenburg, K., Peeters, S., & Weber, A. (2025, December 4-5). From Annotation to Insight: Human-in-the-Loop Machine Learning for Historical Archives in HAICu WP2. Fantastic Futures 2025 (FF2025), London. Zenodo. https://doi.org/10.5281/zenodo.17606825
Bakker, J. (2025, November 5) Section-Level Simplification of Biomedical Abstracts. Presented at the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP 2025) in Suzhou, China. Also presented at the pre-EMNLP session of the Computational Linguistics Seminar at the University of Amsterdam (October 28) and the 2025 HAICu day in Hilversum (December 17) https://doi.org/10.48448/7yjf-5g39
van Schuijlenburg, K., Romein, C. A., Wolf, B., Weggeman, S., Peeters, S., Dhali, M. A., Dijkstra, K., Weber, A., Schomaker, L. (2025). The HAICu Project (WP2). Digital Humanities conference 2025 (DH2025), Lisbon. Zenodo. https://doi.org/10.5281/zenodo.15829129
Mahadeshwar, R.; Caselli, T.; van Cranenburgh, A. & Nissim, M. Uncovering under-represented perspectives from cultural heritage data using a neuro-symbolic approach. Accepted for 45th TABU Dag 2025 (12-13 June) and CLIN 2025 (12 September) https://zenodo.org/records/15672710
Romein, C. A., Mooijweer, J., Wolf, B. J., Romein, J. C., Weggeman, S., & Weber, A. (2025). Making an Archive Accessible: AI and the Overijssel Case Study. DH Benelux 2025, Amsterdam, the Netherlands. Zenodo. https://doi.org/10.5281/zenodo.15502521
Zhao Y. and Hollink L. LLM for Contextual Contentiousness Classification in Historical Dutch Newspapers. Accepted for presentation at Digital Humanities Benelux, Amsterdam, June 3-6, 2025.
Romein, C. A., Mooijweer, J., Wolf, B., Romein, J. C., & Weber, A. (2025). Digital Pathways to Regional Heritage: AI Solutions for Archive Access in Overijssel. Digital Humanities in the Nordic and Baltic Countries (DHNB), Tartu. Zenodo. https://doi.org/10.5281/zenodo.14922615
Nababan, C., & de Boer, V. (2026). An Ontology for Traditional Knowledge Labels. DH Benelux 2026, Maastricht. Zenodo. https://doi.org/10.5281/zenodo.19237706
B Presentations
Mahadeshwar, R.; Caselli, T.; van Cranenburgh, A. & Nissim, M. Evaluating the Impact of Source Diversity for RAG in Historical Research. HAICu-Day 2025, Hilversum. Zenodo. https://doi.org/10.5281/zenodo.17986713
Romein, C. A. (2025, December 17). HackaLOD: HAICu - SCOPE. HAICu-Dag, Hilversum. Zenodo. https://doi.org/10.5281/zenodo.17980797
Romein, C. A. (2025, December 17). Reading between the Lines: Computational Analysis of Overijssel's Resolutions (1578-1795). Masterclass Jo Guldi, Hilversum. Zenodo. https://doi.org/10.5281/zenodo.17980678
Bakker, J. (2025, December 16) Section-Level Simplification of Biomedical Abstracts. Presented online and at the Masterclass Jo Guldi, Hilversum. https://doi.org/10.48448/7yjf-5g39
Romein, C. A. (2025, November 27-28). De Stemmen van Overijssel: digitale verkenning van provincial bestuur a.d.h.v. de Resoluties van de Staten van Overijssel (1578-1795). Nederlands-Belgische Rechtshistorische Dagen, Nijmegen. Zenodo. https://zenodo.org/records/17980633
Steven Claeyssens (2025, November 12): Digitised Newspapers at the KB, the National Library of the Netherlands: How AI Is Challenging the Status Quo. Online symposium - Analysing Digitised Newspapers Using AI: Projects, Problems, Perspectives
Sotić, Bruno N., & Kamps, Jaap. (2025, October 27). What Makes a User Click on a News Item? Understanding News Values of Visual Content in News Recommendation, DIR 2025, Nijmegen. https://informagus.nl/dir2025/schedule
van den Broek, R. (2025, October 23). Historical Sources: How Journalists Use the Past to Shape the Present at the Netherlands Media Studies Conference, organised by RMeS https://www.rmes.nl/netherlands-media-studies-conference-organised-by-rmes/
Bakker, J. (2025, October 17). Guest lecture on the Automatic Simplification of Scientific Texts. Presented at the Machine Learning and Language Models course at the University of Amsterdam.
Bakker, J. (2025, September 10). University of Amsterdam at the CLEF 2025 SimpleText Track. Presented at the CLEF 2025 SimpleText Track in Madrid, Spain. https://doi.org/10.5281/zenodo.17752126
Kamps, J. (2025, September 10). Overview of the CLEF 2025 SimpleText Task 1: Simplify Scientific Text. Presented at the CLEF 2025 SimpleText Track in Madrid, Spain. https://doi.org/10.5281/zenodo.17752369
Romein, C. A. (2025, July 4). (Keynote) Het toepassen van ML/AI bij het ontsluiten van cultureel erfgoed – over het HAICu onderzoeksproject. Universitair Platform Informatiedienstverlening en Recordsmanagement (UPIR), Enschede, the Netherlands. Zenodo. https://doi.org/10.5281/zenodo.15808791
Borrini, O. (2025, July 3). Exploring Relationship Extraction on Holocaust Archives using Reasoning Models with Reinforcement Learning. Presented at the Deep Culture keynote at the National Humane AI Network, Groningen
Zhao, Y. (2025, April 16). Probing Large Language Models for annotating contentious terms in Dutch Historical Archives. ICT.Open, Utrecht.
Hollink, L. (2025, March 21). Omstreden termen in Cultureel Erfgoed Data. Congres historische kranten in het AI-tijdperk. National Library of the Netherlands, The Hague
Postma, E. (2025, March 21). Negeer de AI hype, omarm de mogelijkheden van AI! Congres historische kranten in het AI-tijdperk. National Library of the Netherlands, The Hague
Romein, C. A. (2025, February 24). Collectie Overijssel en Transkribus gebruik binnen HAICu. AI/ Betrouwbaarheid/Authenticiteit, Zwolle. Zenodo. https://doi.org/10.5281/zenodo.14923008
van den Broek, R. (2025, February 3-4). Journalism and Historical Context: Everyday Use in News Production at the RMeS Winter School & Graduate Symposium, organised by RMeS https://www.rmes.nl/rmes-winter-school-graduate-symposium-2025-26/
Postma, E. (2024, December 2). Deep Embeddings for HAICu https://doi.org/10.5281/zenodo.14929761
Romein, C. A. (2024, December 2). Making an Archive accessible: AI & the Overijssel Case Study (WP2). HAICu Day, Den Haag. Zenodo. https://doi.org/10.5281/zenodo.14922899
Bakker, J. (2024, November 15). Cochrane-auto: An Aligned Dataset for the Simplification of Biomedical Abstracts. Presented at the Third Workshop on Text Simplification, Accessibility and Readability (TSAR 2024), Miami, Florida, USA. https://doi.org/10.5281/zenodo.17751587
Romein, C. A. (2024, November 14). HAICu: Overijsselse Archievenberaad. Overijssels Archievenberaad, Kampen. Zenodo. https://doi.org/10.5281/zenodo.14922711
Postma, E. (2024, November 5). Analyzing artworks with AI. ETernal HERitage (ETHER) POLIN Museum of the History of Polish Jews in Warsaw, Warsaw
Kamps, J. (2024, September 9). University of Amsterdam at the CLEF 2024 SimpleText Track. Presented at the CLEF 2024 SimpleText Track in Grenoble, France. https://doi.org/10.5281/zenodo.17751333
13 June 2024: https://www.clariah.nl/events/clariah-conference-2024
25 March 2024: https://haicu.science/updates/ai-and-heritage-applicability-and-future-of-ai-within-the-heritage-sector
Weber, A. and J. Mooijweer (2024, March 6). De digitale ontsluiting van de rekesten aan de Staten van Overijssel, Seriële stemmen van de gewone mensen. Rekesten als archiefbron 2024, Meertens Institute Amsterdam, https://research.utwente.nl/en/activities/de-digitale-ontsluiting-van-de-rekesten-aan-de-staten-van-overijs
Weber, A. (2023, November 16). Pitch about HAICu and WP2 at CWI Amsterdam, CWI Lectures on Digital Cultural Heritage 2023, https://www.cwi.nl/en/news/cwis-lectures-on-digital-cultural-heritage/
C Events
8 April 2026 - 4th HAICu PhDs and postdocs day at the Groninger Archieven (Groningen)
31 January 2026 - Ice Age and Citizen Science Symposium (IJstijdsymposium) at Naturalis (Leiden)
17 December 2025 - Annual Conference Day at the Netherlands Institute for Sound and Vision (Hilversum)
16 December 2025 - Masterclass by Jo Guldi at the Netherlands Institute for Sound and Vision (Hilversum)
3 November 2025 - 3rd HAICu PhDs and PDs meeting at Naturalis (Leiden)
31 October-1 November 2025 - HackaLOD at the Dutch Openluchtmuseum (Arnhem) - HAICu SCOPE Team
29 September 2025 - Histories of Ordinary People Innovation Lab meeting at the National Archives (The Hague)
19 September 2025 - Deep Journalism Innovation Lab meeting at the Netherlands Institute for Sound and Vision (Hilversum)
30 June 2025 - 2nd HAICu PhDs and postdocs day (online)
12 May 2025 - 1st HAICu DB live meeting in Utrecht
9 April 2025 - 1st HAICu PhDs and postdocs day at the Collectie Overijssel
2 December 2024 - HAICu Workshop: Transformative AI meets challenges from the humanities https://www.haicu.science/events/haicu-workshop
8-9 November 2024 https://netwerkdigitaalerfgoed.nl/nieuws/hackalod-2024-een-nacht-vol-innovatie-met-erfgoeddata/
02 February 2024 - HAICu Kick-off https://haicu.science/updates/haicu-project-officially-started
D Datasets
Romein, C. A. (Annemieke), & Romein, J. C. (2025). Handmatige Layout Analyse NL-ZlHCO_0003.1 (Collectie Overijssel - Resoluties van de Staten van Overijssel) [Data set]. https://doi.org/10.5281/zenodo.15010938
Bakker, J. and Kamps, J. (2025) Cochrane-sections: A Dataset for the Section-level Simplification of Biomedical Abstracts. https://github.com/JanB100/cochrane-sections
Bakker, J. and Kamps, J. (2024) Cochrane-auto: An Aligned Dataset for the Simplification of Biomedical Abstracts. https://github.com/JanB100/cochrane-auto