Beyond Ctrl-F: how to semi-automate searches in numerous files using regular expressions, Poppler, GREP and Python
In this hands-on workshop, Liliana Melgar (GKG) covers the basics of automated searching. The training is aimed at humanities researchers and teachers who have little or no experience with the command line/terminal or with Python.
As a researcher, you may need to query a relatively large number of digital files stored on your own computer for a quick exploration (for example, to find out whether a term is mentioned, or whether the name of a person appears in your files). You may also need to query your files as part of a systematic study of concepts in your corpus (be it letters, journal articles, diaries, etc.). All these files can be queried at once, and in a flexible way, by combining tools available on your own computer. This workshop offers hands-on work with regular expressions, Poppler, GREP, and Python under the guidance of an experienced instructor.
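To give a flavour of what searching many files at once can look like, here is a minimal Python sketch. It is only an illustration, not the workshop material: the function name `search_files`, the folder name `my_corpus`, and the example pattern are all made up for this example, and it assumes your files are already plain text (PDFs would first need to be converted, for instance with Poppler's `pdftotext`).

```python
import re
from pathlib import Path


def search_files(folder, pattern, suffix=".txt"):
    """Return (file name, line number, line) for every line matching pattern."""
    # Compile once; IGNORECASE makes "Colour" and "colour" both match.
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    # rglob walks the folder and all of its subfolders.
    for path in sorted(Path(folder).rglob(f"*{suffix}")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((path.name, number, line.strip()))
    return hits


if __name__ == "__main__":
    # "my_corpus" is a placeholder folder name (an assumption for this sketch).
    # The pattern matches both spellings, "color" and "colour", as whole words.
    for name, number, line in search_files("my_corpus", r"\bcolou?r\b"):
        print(f"{name}:{number}: {line}")
```

Unlike Ctrl-F, a single regular expression here covers spelling variants and runs over every text file in the folder in one go.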
The workshop is a follow-up to the CDH webinar Beyond “Ctrl-F”: automating searches in large textual corpora of October 29, 2021, after which a number of participants indicated that they would like a follow-up workshop. You do not need to have attended the webinar to participate. It is a hands-on tutorial to which you can bring your own computer and your own files, and you will receive support in installing and using these tools.
Admission is free, but the number of participants is limited to 8 so that the instructor can give everyone enough attention. Please register as soon as possible (first come, first served). If you are unable to attend, please cancel your registration by sending an email to CDH@uu.nl so that another participant can take your place.
Please note: the Centre for Digital Humanities aims to promote digital literacy amongst staff members and therefore compensates attendance at these courses in terms of DCU (22 hours = 1 DCU). The DCUs will be automatically settled with your department at the end of the course.