

The portfolio includes all works related in some way to the /DH.arc centre: projects born and maintained at the /DH.arc centre, projects inherited from prior centres (e.g. Multimedia Centre CRR-MM), and projects born in a different context that are led by /DH.arc members.

Projects at /DH.arc

ARTchives - The Inventory of Art Historians’ Archives

Keywords: Arts, Creative industries, Knowledge organization, Data collection, Knowledge Discovery

Description: The Inventory of Art Historians’ Archives aims at gathering, describing and reusing information on archives produced by notable art historians around the world. The objective is to provide a flexible service for collecting data on art history related topics and to serve accurate Linked Open Data. A web application built on the resulting data will provide scholars with new, expressive means for discovering data.

Researchers: Marilena Daquino, Francesca Tomasi, Francesca Mambelli

Funded by: Federico Zeri Foundation

Status: About to start


Keywords: automatic analysis of scholarly documents, web-based interfaces, workflow definition and execution

Description: The project aims at analysing existing tools for automatic text analysis, so as to develop a prototype Web-based application that combines these tools into execution workflows by means of an intuitive Web interface.

Researchers: Ivan Heibi, Silvio Peroni

External collaborators: Paolo Ferri (University of Bologna, Italy), Luca Pareschi (University of Bologna, Italy)

Funded by: University of Bologna

Status: Ongoing

mAuth - mining Authoritativeness in Art History

Keywords: Arts and Photography, Authoritativeness, Information Retrieval

Description: mAuth is a tool for art historians, data collection managers, and curious users who want to collect information (historians' opinions, motivations, bibliographic references, and images) about the history of authorship attributions of Modern Art artworks (15th-16th centuries). It is based on a semantic crawler that harvests authorship attributions in the Web of Data and returns the list of contradictory statements sorted by their authoritativeness.

Researchers: Marilena Daquino, Francesca Tomasi

Status: Ongoing
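The ranking step described for mAuth can be illustrated with a minimal sketch. It does not reproduce mAuth's crawler or its authoritativeness criteria: the `Attribution` record, the numeric `score` field, and the sample values are all hypothetical, chosen only to show how contradictory attributions for the same artwork could be grouped and sorted.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribution:
    """One authorship attribution (hypothetical record layout)."""
    artwork: str   # identifier of the artwork
    artist: str    # artist the artwork is attributed to
    source: str    # who made the attribution
    score: float   # hypothetical authoritativeness score, higher is better


def contradictory_attributions(attributions):
    """Group attributions by artwork and keep only the artworks that are
    attributed to more than one artist, most authoritative statement first."""
    by_artwork = defaultdict(list)
    for attribution in attributions:
        by_artwork[attribution.artwork].append(attribution)

    ranked = {}
    for artwork, group in by_artwork.items():
        if len({a.artist for a in group}) > 1:  # sources disagree
            ranked[artwork] = sorted(group, key=lambda a: a.score, reverse=True)
    return ranked


# Placeholder data, not taken from any real dataset.
sample = [
    Attribution("artwork-A", "artist-X", "source-1", 0.4),
    Attribution("artwork-A", "artist-Y", "source-2", 0.9),
    Attribution("artwork-B", "artist-Z", "source-1", 0.7),
    Attribution("artwork-B", "artist-Z", "source-3", 0.5),
]
ranked = contradictory_attributions(sample)
```

Here only `artwork-A` is reported, because its sources disagree on the artist; `artwork-B` has two statements that agree and is left out.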

OpenCitations Enhancement Project

Keywords: infrastructure, open citation data, interfaces for citation data

Description: The project aimed at making the datasets made available by OpenCitations more useful to the academic community both by significantly expanding the volume of citation data held within them, and by developing novel data visualizations and query services over the stored data.

Researchers: Marilena Daquino, Ivan Heibi, Silvio Peroni

External collaborators: David Shotton (University of Oxford, UK)

Funded by: Alfred P. Sloan Foundation

Status: Finished

Open Biomedical Citations in Context Corpus

Keywords: in-text reference pointers, open citation data, citation contexts

Description: The project aims at making the OpenCitations Corpus (OCC) more useful to the academic community by significantly expanding the kinds of citation data held within the Corpus, so as to provide data for each individual in-text reference and its semantic context, making it possible to distinguish references that are cited only once from those that are cited multiple times, to see which references are cited together (e.g. in the same sentence), to determine in which section of the article references are cited (e.g. Introduction, Methods), and, potentially, to retrieve the function of the citation.

Researchers: Ivan Heibi, Silvio Peroni

External collaborators: Vincent Larivière (Université de Montréal, Canada), David Shotton (University of Oxford, UK), Ludo Waltman (Leiden University, The Netherlands)

Funded by: Wellcome Trust

Status: About to start
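The kinds of questions that per-pointer citation data in the Open Biomedical Citations in Context Corpus is meant to answer can be sketched as follows. This is an illustration only, not the OpenCitations data model: the `InTextPointer` record, its field names, and the sample values are assumptions made for the example.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class InTextPointer:
    """One in-text reference pointer (hypothetical record layout)."""
    reference: str  # identifier of the cited bibliographic reference
    section: str    # section of the citing article, e.g. "Introduction"
    sentence: int   # article-wide index of the sentence holding the pointer


def citation_counts(pointers):
    """Count how many times each reference is cited in the article."""
    return Counter(p.reference for p in pointers)


def co_cited_pairs(pointers):
    """Return the pairs of references cited together in the same sentence."""
    by_sentence = defaultdict(set)
    for p in pointers:
        by_sentence[p.sentence].add(p.reference)
    pairs = set()
    for references in by_sentence.values():
        pairs.update(combinations(sorted(references), 2))
    return pairs


def sections_by_reference(pointers):
    """Map each reference to the set of sections in which it is cited."""
    sections = defaultdict(set)
    for p in pointers:
        sections[p.reference].add(p.section)
    return dict(sections)


# Placeholder data for a single citing article.
pointers = [
    InTextPointer("ref-1", "Introduction", 0),
    InTextPointer("ref-2", "Introduction", 0),
    InTextPointer("ref-1", "Methods", 5),
    InTextPointer("ref-3", "Methods", 7),
]
counts = citation_counts(pointers)
cited_once = sorted(ref for ref, n in counts.items() if n == 1)
```

With one record per pointer rather than one per cited reference, single versus repeated citations, co-citation within a sentence, and the citing section all fall out of simple grouping operations.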


Keywords: Digital Philology, Italian Literature, Authorial Philology, Alessandro Manzoni, I Promessi Sposi, Collodi, Le avventure di Pinocchio.

Description: PHILOEDITOR is a digital platform for reading texts that exist in multiple authorial drafts; it displays the different versions together with the categories of variants marked by the editor. Texts currently available: I Promessi Sposi by Alessandro Manzoni (1827/1840) and Le avventure di Pinocchio by Carlo Collodi (1883/1890).

Researchers: Paola Italia, Francesca Tomasi, Fabio Vitali, Claudia Bonsi, Angelo Di Iorio, Teresa Gargano, Ersilia Russo.

Funded by: University of Bologna.

Status: Ongoing.

Semantic Digital Edition of Paolo Bufalini’s notebook

Keywords: Contemporary literature, Knowledge organization, Intertextuality, Intratextuality, Text encoding, Semantic Digital Edition

Description: The Semantic Digital Edition of Paolo Bufalini’s notebook aims at creating a scholarly edition enhanced by Semantic Web technologies. The digital edition, initially encoded in TEI/XML, focuses on inter- and intra-textual aspects. The RDF version of the edition aims at reconstructing the broad library owned by the scholar and enriching the available information with data extracted from external sources.

Researchers: Francesca Giovannetti, Marilena Daquino, Francesca Tomasi, Silvio Peroni

Status: Ongoing

Semantic Digital Edition of Vespasiano da Bisticci’s Letters

Keywords: Renaissance, Creative industries, Semantic Digital Edition

Description: The Semantic Digital Edition of Vespasiano da Bisticci’s letters aims at highlighting the network of relations between intellectuals, copyists, and customers that led to the creation of a number of manuscripts during the Renaissance. The edition focuses on these relations, and its semantic version aims at highlighting aspects related to the evolution of creative industries over time.

Researchers: Francesca Tomasi, Marilena Daquino

Status: Ongoing


Keywords: Arts and Photography, Ontology Development, Linked Open Data

Description: Zeri & LODE is a project to present cataloguing data belonging to art historical photo archives by using Semantic Web technologies. It produced a number of ontologies for representing the Arts and Photography domain (OAEntry Ontology, FEntry Ontology and HiCO Ontology) and produced Linked Open Data according to the developed models.

Researchers: Marilena Daquino, Francesca Mambelli, Francesca Tomasi, Silvio Peroni, Fabio Vitali

Status: Ongoing

SSE - Smart Structured Editor

Keywords: Templates, Authoring, Collaborative Editing, Technical documentation, Versioning

Description: SSE is a Web platform that allows users to easily (i) produce structured content enriched with metadata that eases document tracking and exploration and (ii) share content fragments via templating. Originally developed for writing technical documentation, the system is flexible and extensible to other domains and content types.

Researchers: Fabio Vitali, Angelo Di Iorio, Alessandro Caponi

Funded by: Alstom s.p.a., Regione Emilia Romagna, University of Bologna

Status: Ongoing.


Keywords: Document visualization, information interfaces and presentation, reading patterns

Description: DocuDipity is an interactive Web-based tool to support the exploration and analysis of heterogeneous document collections. It supports scholars by combining a sequential reading interface with alternative visualizations such as SunBurst and tree-based views. 

Researchers: Francesco Poggi, Angelo Di Iorio, Silvio Peroni, Fabio Vitali, Paolo Ciancarini

Funded by: University of Bologna

Status: Ongoing.

Hosted Projects


Keywords: Diary, War, WWI, Europeana, Digital Edition EVT, TEI XML, StoryMap

Description: The project aims to revive and renew the personal stories of ordinary people involved in WWI. The Europeana Collection 1914-1918 preserves a great number of diaries from the trenches, and this material forms the core of the research. Eight diaries in French and Italian and the letters written by Isaac Rosenberg to Laurence Binyon will be processed. They are available in two versions: StoryMaps and an EVT digital edition.

Researcher: Saverio Vita

Funded by: Europeana Foundation, Research Grant 2018

Status: Ongoing

ArCo - The Italian Cultural Heritage Knowledge Graph

Keywords: Arts, Creative Industries, Knowledge Graphs, Semantic Web, Linked Open Data

Description: ArCo is a joint project by the Italian Ministry of Cultural Heritage's (MiBAC) agency ICCD (Istituto Centrale per il Catalogo e la Documentazione), and the Semantic Technology Lab (STLab) of ISTC-CNR, a partner of DHARC (some STLab researchers are hosted by the FICLIT Department). ArCo, based on ICCD norms for the description of cultural heritage, has designed an ontology network, and a knowledge graph of 800,000 Italian cultural entities, with relations to artists, places, institutions, techniques, etc.

Researchers: Valentina Presutti (CNR project coordinator), Valentina Carriero, Andrea Nuzzolese (STLab), Aldo Gangemi. Open to DHDK students for internships.

Funded by: ICCD

Status: ongoing, first stable release available

Framester: a large-scale factual-linguistic knowledge graph based on frame semantics

Keywords: Semantic Interoperability, Lexicons, Frame Semantics, Knowledge Graphs, Semantic Web, Linked Open Data

Description: Framester is a large factual-linguistic knowledge graph, based on a formalisation of Fillmore's frame semantics, and hosted on GitHub. It offers accurate alignments and formal representations for lexical resources such as FrameNet, WordNet, VerbNet, BabelNet, etc., as well as alignments to factual graphs such as DBpedia and YAGO, and to foundational ontologies such as DOLCE. Framester creates a highly connected knowledge graph, enabling full-fledged OWL querying and reasoning, and fostering large-scale semantic interoperability. It has a dedicated SPARQL endpoint, including a RESTful API. Originally designed by the RCLN group at LIPN, Paris Nord University, it is currently maintained jointly by CNR's STLab and DHARC.

Researchers: Aldo Gangemi (coordinator), Mehwish Alam, Luigi Asprino, Valentina Presutti (CNR). Open to DHDK students for internships.

Funded by: MARIO Project (EU H2020 programme), EFL LabEx (French Ministry of Research)

Status: ongoing, stable release available

FRED: state-of-the-art knowledge graph extraction from text

Keywords: Knowledge Extraction, Natural Language Processing, Frame Semantics, Knowledge Graphs, Linked Open Data

Description: FRED is a state-of-the-art tool for automatically extracting knowledge graphs from English text (other languages are supported via automated translation) and linking them to existing knowledge. It can be considered a machine reader for the Semantic Web: it is able to parse natural language text and transform it into linked data. It is implemented in Python, and is available as a REST service as well as a Python library suite [fredlib]. FRED's background theories include Combinatory Categorial Grammar, Discourse Representation Theory, Frame Semantics, and Ontology Design Patterns. FRED leverages Natural Language Processing components for performing Named Entity Resolution (using Stanbol and TagMe), Coreference Resolution (using CoreNLP), and Word Sense Disambiguation to DBpedia, WordNet, VerbNet, and FrameNet. All FRED graphs include textual annotations and represent textual segmentation, expressed by means of EARMARK and NIF. Originally developed in the EU FP7 IKS project by CNR's STLab, it was later improved in the RCLN group at LIPN, Paris Nord University, and is currently maintained jointly by STLab and DHARC. FRED has been extended for Aspect-based Sentiment Analysis (Sentilo), type induction (Tìpalo), knowledge reconciliation (Mergilo), and synthetic relation extraction (Legalo).

Researchers: Aldo Gangemi (coordinator), Mehwish Alam, Luigi Asprino, Andrea Nuzzolese, Valentina Presutti (CNR), and Diego Reforgiato Recupero (University of Cagliari). Open to DHDK students for internships.

Funded by: FP7 IKS Project

Status: ongoing, stable release available as a web service