Newspapers collect information about cultural, political and social events in a more detailed way than any other public record. Since their beginnings in the 17th century, they have recorded billions of events, stories and names, in almost every language, every country, every day. Newspapers have always been an important medium for the dissemination of public and political opinions, literary works, essays and art. This thematic wealth places them centre stage for anyone interested in European cultural heritage.
During the last few decades, tens of millions of newspaper pages from European libraries have been digitised and made available online. National libraries aim to intensify their digitisation efforts in the coming years in response to the large demand for access to historical newspapers. Whilst the broad public shows general interest in this historical and cultural resource, it is also of crucial importance for many humanities scholars.
NewsEye, funded by the European Union’s Horizon 2020 research and innovation programme, is a research project advancing the state of the art and introducing new concepts, methods and tools for digital humanities by providing enhanced access to historical newspapers for a wide range of users. With the tools and methods created by NewsEye, crucial user groups will be able to investigate views and perspectives on historical events and developments. As a consequence, the project aims to change the way European digital heritage data is (re)searched, accessed, used and analysed.
Impacts & Results
The main objective of the NewsEye project is to develop methods and tools for effective exploration and exploitation of the rich resource of newspapers by means of new technologies and 'big data' approaches, combining the 'close' and 'distant reading' methods of Digital Humanities.
This will improve the methods that researchers, experts and the general public use to study European cultural heritage.
NewsEye is therefore developing a seamlessly integrated armoury of tools and methods that will improve users’ capability to access, analyse and use the content in digital libraries of historical newspapers.
For this purpose, several tools have been implemented:
- Text Recognition & Article Separation - enriches digitised newspaper data with both article separation and classification information, as well as further textual information and full text transcripts at the article level.
- Semantic text enrichment - produces semantic annotations to ease access and facilitate advanced systematic analyses of newspaper collections.
- Dynamic text analysis - develops methods to automatically find topics, trends, viewpoints and exceptions in the corpus being studied, both within a specified context and in comparison between contrasting contexts.
- Personal Research Assistant - is the user's intelligent and transparent aid, using the enriched texts and dynamic text analysis tools to carry out a series of analysis steps and explain the results to the user. The Assistant aims to extract content from a dynamic query for an initial report, which is presented to the user in natural language. The Assistant then aims either to continue the investigation autonomously or to let users select viewpoints, articles and keywords that interactively refine the targeted query.