Cultural Heritage Enriched with AI-Generated Image Captions: the Case of Raphael’s “Allegory”

Author: Cédric Bhihe, researcher at the Barcelona Supercomputing Center

Raffaello Sanzio, also known as “Raphael”, was an Italian Renaissance painter, frescoist, sculptor, architect, poet and archaeologist, born in 1483. Despite his early death at age 37, he was a very successful and productive artist who left a large number of masterpieces. The frescoed Raphael Rooms, like many of his other works, are found in the Vatican Palace and are widely considered central pieces of his career.

The Allegory, also called The Vision of a Knight or The Dream of Scipio, is one of his most famous paintings. It was completed between 1503 and 1504 and is currently held at the National Gallery in London. The painting shows a dreaming knight flanked by two women who represent two concepts: Virtue and Pleasure. The most probable source for this allegory is thought to be a passage in the Punica, a poem written by the Latin poet Silius Italicus to narrate the Second Punic War.

Our common cultural heritage provides us with art collections thankfully replete with such images from the past. The main objective of the Saint George on a Bike project, led by the Barcelona Supercomputing Center (BSC), in partnership with the Europeana Foundation, is to enrich the descriptive annotations of images automatically, based on AI algorithms. Downstream goals include making the scene contents of such images “indexable” (referenced by search engines) and accessible to vision-impaired users, via richer descriptions. Whether paintings, etchings, prints, drawings or sketches, such as Raphael’s Allegory, all can benefit from annotation enrichments with the automatic description of depicted scenes.

Figure 1. Sketch for Allegory, Raffaello Sanzio (1503 – 1504)

In the sketch above (Figure 1), five important objects are detected: person_1, sword_1, person_2, knight_1 and helmet_1. The algorithmic tools deduce that the figure at the bottom center of the representation is a reclining knight and that he is the central subject of the scene. The two figures to the right and left are detected as standing persons, located very slightly behind the dreaming knight. Finally, the processing accounts for two crucial facts: the left figure holds a sword vertically at shoulder level, and the knight wears a helmet, whose bounding box (i.e. the green rectangle labeled “helmet_1”) fully overlaps that of the knight himself.
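One way to read "fully overlaps" in terms of bounding boxes is as a containment test: the fraction of the helmet's box that lies inside the knight's box. The following is a minimal sketch under stated assumptions, not the project's actual code: the `Box` class, the `containment` helper and all coordinates are hypothetical, chosen only to mirror the labels in Figure 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box; coordinates are hypothetical image units."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


def intersection_area(a: Box, b: Box) -> float:
    """Area shared by two boxes (0.0 if they do not intersect)."""
    width = min(a.x_max, b.x_max) - max(a.x_min, b.x_min)
    height = min(a.y_max, b.y_max) - max(a.y_min, b.y_min)
    return max(0.0, width) * max(0.0, height)


def containment(inner: Box, outer: Box) -> float:
    """Fraction of `inner`'s area that lies inside `outer` (0.0 to 1.0)."""
    if inner.area() == 0.0:
        return 0.0
    return intersection_area(inner, outer) / inner.area()


# Illustrative coordinates only; they are NOT measured from the sketch.
knight = Box("knight_1", 30, 60, 80, 95)
helmet = Box("helmet_1", 35, 62, 50, 75)
sword = Box("sword_1", 12, 20, 18, 55)

helmet_on_knight = containment(helmet, knight)  # helmet box inside knight box
sword_on_knight = containment(sword, knight)    # sword box is elsewhere
```

With these made-up coordinates the helmet's box lies entirely within the knight's, while the sword's box does not touch it, which is the kind of geometric cue that lets a system associate the helmet with the knight rather than with another figure.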

In keeping with the Saint George on a Bike project’s goals, identifying individual objects in an image, e.g. a sketch or a painting, is but one crucial step toward complex scene description and the semantic disambiguation of object labels. Simply detecting a sword and a helmet in a scene (such as Raphael’s Allegory) is clearly not the same as also detecting that a reclining knight wears the helmet, while the sword is held in a certain manner by another figure. An algorithmic (automated) approach that reveals how an image’s detected objects combine, through their relative positions and interconnected visual cues, to produce meaning (whether symbolic or literal) in a complex composition is the subject of intense research in Machine Learning and Artificial Intelligence.
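The step from isolated labels to relations between objects can be illustrated with a deliberately simple rule-based sketch. It is an assumption made for illustration, not the method used by the project (which the article describes only in general terms): each sword or helmet is attached to the figure whose bounding box contains most of it, producing (subject, predicate, object) triples. The `Detection` type, the `PREDICATE_FOR` table, the threshold and all coordinates are hypothetical.

```python
from typing import NamedTuple


class Detection(NamedTuple):
    """A detected object with a hypothetical axis-aligned bounding box."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def overlap_fraction(inner: Detection, outer: Detection) -> float:
    """Fraction of `inner`'s box area that lies inside `outer`'s box."""
    width = min(inner.x_max, outer.x_max) - max(inner.x_min, outer.x_min)
    height = min(inner.y_max, outer.y_max) - max(inner.y_min, outer.y_min)
    inner_area = (inner.x_max - inner.x_min) * (inner.y_max - inner.y_min)
    if inner_area <= 0:
        return 0.0
    return max(0.0, width) * max(0.0, height) / inner_area


def kind(label: str) -> str:
    """Strip the instance suffix: 'knight_1' -> 'knight'."""
    return label.rsplit("_", 1)[0]


# Hypothetical rule tables, chosen only to match the labels in Figure 1.
FIGURES = {"person", "knight"}
PREDICATE_FOR = {"sword": "holds", "helmet": "wears"}


def relations(detections, threshold=0.5):
    """Attach each attribute object to the figure that contains most of it."""
    figures = [d for d in detections if kind(d.label) in FIGURES]
    triples = []
    for obj in detections:
        predicate = PREDICATE_FOR.get(kind(obj.label))
        if predicate is None:
            continue  # not an attribute object (e.g. a figure)
        best = max(figures, key=lambda f: overlap_fraction(obj, f), default=None)
        if best is not None and overlap_fraction(obj, best) >= threshold:
            triples.append((best.label, predicate, obj.label))
    return triples


# Illustrative coordinates only; they are NOT measured from the sketch.
scene = [
    Detection("person_1", 5, 15, 30, 90),
    Detection("sword_1", 12, 20, 18, 55),
    Detection("person_2", 75, 15, 98, 90),
    Detection("knight_1", 25, 60, 80, 95),
    Detection("helmet_1", 35, 62, 50, 75),
]
triples = relations(scene)
```

Hand-written rules like these break down quickly on real artworks, where boxes overlap ambiguously and the same object can play different roles, which is one reason learned or probabilistic approaches to relational reasoning remain an active research area.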

Saint George on a Bike participates in that effort by building algorithms that will eventually generalize the ability of computing machines to describe not only the components of images, but also the way those components interact with one another within a scene representation. The specialized domains and time period of cultural heritage imagery afford computational AI experts the possibility to integrate expert knowledge from librarians, art historians and others. This will soon fuel further advances in complex scene analysis, based on probabilistic relational reasoning techniques.

Saint George on a Bike: Training AI to be aware of cultural heritage contexts

Automatic image captioning is a process that allows already trained models running on commodity computers to generate textual descriptions from an image. It is a burgeoning reality in a handful of other areas, such as classifying image contents on social media. However, to date, no AI system has been built and trained to help describe cultural heritage images while factoring in the time period and scene composition rules for sacred iconography from the 14th to the 18th centuries.

As part of the Saint George on a Bike project, researchers at BSC build and train AI systems to help cultural heritage institutions describe and classify their art pieces automatically. In the end, both casual users and cultural heritage professionals will benefit from better access to collections and a better experience navigating collection catalogs. They will owe this to richer artwork annotations, leading to improved image scene indexation and search capabilities, obtained with the help of a specialized AI system.

To learn more about the Saint George on a Bike project, visit https://saintgeorgeonabike.eu/.
