Project Name: Folktales As Classifiable Texts
Start date: January 16, 2012
End date: January 15, 2016
FACT (Folktales As Classifiable Texts) is a project funded by the NWO
CATCH program. In the FACT project, HMI cooperates with the DB group
and the Meertens Institute to study new possibilities for researchers
from humanities disciplines (folktale and narratology researchers,
documentalists, etc.) to explore folktales based on annotations and
links generated by data-driven methods.
To this end, FACT will develop software enabling the computer to
automatically enrich a corpus of Dutch folktales with metadata such as
names, genre, type, and a summary. In addition, FACT represents the
first effort to systematically apply and evaluate various clustering
techniques on a very large (40,000+) and diverse collection of
folktales.
The algorithms developed in the project will be integrated into a
user-friendly platform that supports annotation as well as exploratory
research into variability in oral and written transmission, using XML
database technology to model all folktale data (both annotations and
the text of the tale itself) in one unifying framework.
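To illustrate what keeping annotations and tale text in one XML document could look like, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The element names (tale, annotations, annotation, text), the build_tale helper, and the sample values are hypothetical and chosen only for illustration; they are not the schema used in FACT.

```python
import xml.etree.ElementTree as ET


def build_tale(tale_id, text, annotations):
    """Return one <tale> element holding both the text and its annotations.

    The element and attribute names are hypothetical, not the FACT schema.
    """
    tale = ET.Element("tale", attrib={"id": tale_id})
    meta = ET.SubElement(tale, "annotations")
    for name, value in annotations.items():
        ET.SubElement(meta, "annotation", attrib={"name": name}).text = value
    ET.SubElement(tale, "text").text = text
    return tale


# Toy record with made-up values.
tale = build_tale(
    "demo-1",
    "Once upon a time a fox met a crow.",
    {"genre": "fable", "type": "hypothetical-type-label"},
)

# Serialize the whole record as a single XML string.
xml_string = ET.tostring(tale, encoding="unicode")

# Annotations and text can be queried from the same document.
genre = tale.find("annotations/annotation[@name='genre']").text
body = tale.find("text").text
```

Because the metadata and the tale text live in the same tree, a single query can combine both, for example selecting the text of every tale that carries a given genre annotation.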
A large part of the scientific research in FACT will deal with the
pros and cons of human classification versus computerized clustering
for investigating variation in (oral) transmission. By using document
clustering, we hope to discover
relationships between documents that cannot be readily identified by
human annotators. The main challenge will be to make the computer
decide which texts are related and which are not. This is not a
black-or-white issue: folktales may be related to each other on
different dimensions and to varying degrees. Will the computer be able
to recognize the cultural DNA of tales, and make a distinction between
different types (no kinship) and versions of the same type (kinship)?
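The idea that tales are related to varying degrees, rather than simply related or unrelated, can be sketched with a graded similarity score. The example below uses TF-IDF weighting and cosine similarity, a common baseline for comparing documents before clustering them; it is an assumption for illustration, not the method developed in FACT. The three English toy texts and all function names are invented for this sketch.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict of word -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    # Smoothed IDF: words that occur in every document keep a small weight.
    idf = {w: math.log((1 + n) / (1 + c)) + 1 for w, c in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({w: tf[w] * idf[w] for w in tf})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented toy texts: two versions of one tale type and one unrelated tale.
tales = [
    "A girl in a red hood meets a wolf in the forest.",
    "A wolf meets a little girl with a red cap in the woods.",
    "A miller's son inherits a clever cat that wears boots.",
]

vectors = tfidf_vectors(tales)
sim_versions = cosine(vectors[0], vectors[1])  # same type, different wording
sim_types = cosine(vectors[0], vectors[2])     # different types
```

In this toy setting the two versions score higher than the unrelated pair, but the scores lie on a continuum. Deciding where "same type" ends is the black-or-white decision that the text describes as the main challenge, and a word-overlap measure like this one captures only one of the dimensions on which tales may be related.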