The idea

MarcoPolo is a website for exploring the words used on Twitter by the political parties and their candidates in Spain from October 2015 to June 2016. This period includes two general election campaigns (December 2015 and June 2016).

This project assumes a parallel between semantic frames (cognitive linguistics) and co-occurrences (corpus linguistics). Frames allow us to explore how meanings are constructed, which depends crucially on how words are used: with what other words, in which contexts, and so on.

Marcos is a visual representation of the meanings of words as used by each political party.


The co-occurrences have been obtained from more than 116,000 messages published by the five parties that received the most votes in Spain (and by their candidates) during the nine months under study. The number of tweets varies between parties and candidates because they published at different rates (as shown in the following table).

For the results presented in the application, the candidates' tweets have been pooled with those of their parties. Because the messages are so short, the larger combined samples yield more reliable statistics.

Account                  Tweets
Podemos                  26,207
IU                       20,108
Ciudadanos               19,725
PSOE                     18,347
PP                       13,358
A. Garzón (IU)            4,538
P. Sánchez (PSOE)         4,510
M. Rajoy (PP)             4,296
A. Rivera (C’s)           4,296
P. Iglesias (Podemos)     1,625
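
The pooling of candidate and party tweets can be illustrated with the figures in the table. This is only a minimal sketch, not the project's actual code: the dictionary names and the `pooled_totals` helper are hypothetical, while the counts and the candidate-to-party assignments are those listed in the table.

```python
# Tweets per party account, taken from the table above.
PARTY_TWEETS = {
    "Podemos": 26207,
    "IU": 20108,
    "Ciudadanos": 19725,
    "PSOE": 18347,
    "PP": 13358,
}

# Tweets per candidate account, with the party each candidate belongs to.
CANDIDATE_TWEETS = {
    "A. Garzón": ("IU", 4538),
    "P. Sánchez": ("PSOE", 4510),
    "M. Rajoy": ("PP", 4296),
    "A. Rivera": ("Ciudadanos", 4296),
    "P. Iglesias": ("Podemos", 1625),
}


def pooled_totals(party_tweets, candidate_tweets):
    """Add each candidate's tweet count to the count of their party."""
    totals = dict(party_tweets)  # copy so the input is not modified
    for party, count in candidate_tweets.values():
        totals[party] += count
    return totals


for party, total in pooled_totals(PARTY_TWEETS, CANDIDATE_TWEETS).items():
    print(f"{party}: {total}")
```

With these figures the pooled corpus adds up to 117,010 tweets, which is consistent with the "more than 116,000 messages" mentioned above.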

What does MarcoPolo show?

In the upper left you will find a search box where you can enter the word you want to look up. Below it there is a corpus selector where you can choose which parties you are interested in (all are selected by default).

Once a word has been chosen, the following information will appear:

  • A detailed analysis carried out by the research team (only for the most frequent words).
  • Charts in which the co-occurrences are classified according to their relationship with the word searched for. For a noun like “education”, for example, there will be a chart with its most common modifiers (such as “public education”), another with the verbs with which it is used as a subject (“education X”), etc.
    • The outer gray circle represents the most common cases in the general corpus
      • The words are arranged clockwise according to their frequency
      • The size of each sector represents the frequency of its word
      • Hovering over a sector displays the word together with its frequency
      • Each sector of the circle is a link to the corresponding word, to make navigation easier
    • The concentric circles show the same information for each candidate/party. In this way, it is easy to see which relationships are particular to each political party. Circles are arranged from the outside inwards according to the number of samples they have with the chosen word
  • Tables showing the frequencies that have been used to draw the charts
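
The arrangement of the circles described above can be sketched as a small data-preparation step. This is an illustration under stated assumptions, not the code behind the site: the `build_rings` function is hypothetical, the general corpus is assumed to be the sum of the selected parties, circles are assumed to run from most to fewest samples starting from the outside, ties are broken alphabetically, and the example counts are invented.

```python
from collections import Counter


def build_rings(counts_by_party):
    """Prepare ring data for one searched word and one relation.

    counts_by_party maps each party to {co-occurring word: frequency}.
    Returns the sector order and a list of (label, values) rings,
    listed from the outermost ring inwards.
    """
    # Assumption: the general corpus is the sum of the selected parties.
    general = Counter()
    for counts in counts_by_party.values():
        general.update(counts)

    # Sectors: words sorted by descending general frequency (ties alphabetical).
    sector_order = [w for w, _ in sorted(general.items(), key=lambda kv: (-kv[1], kv[0]))]

    # Rings: parties sorted by their number of samples with the chosen word.
    party_order = sorted(
        counts_by_party,
        key=lambda p: (-sum(counts_by_party[p].values()), p),
    )

    rings = [("general", [general[w] for w in sector_order])]
    for party in party_order:
        rings.append((party, [counts_by_party[party].get(w, 0) for w in sector_order]))
    return sector_order, rings


# Invented counts for the modifiers of "education" in two fictional corpora.
example = {
    "Party A": {"public": 5, "quality": 2},
    "Party B": {"public": 1, "free": 3},
}
sectors, rings = build_rings(example)
print(sectors)
for label, values in rings:
    print(label, values)
```

Each ring uses the same sector order as the outer circle, so a word that a party never uses simply gets a value of zero in that party's ring.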

How have we done it?

The tweets were downloaded automatically through the Twitter API at the moment of their publication. As a result, we have been able to keep them even when they were later removed from Twitter.

The co-occurrences have been obtained using the Sketch Engine tool. The most relevant co-occurrences have been identified with the logDice score (as explained in Adam Kilgarriff, Vít Baisa, Jan Bušta, Miloš Jakubíček, Vojtěch Kovář, Jan Michelfeit, Pavel Rychlý, Vít Suchomel 2014. The Sketch Engine: ten years on. In Lexicography 1 (1): 7-36).
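
As a rough illustration of the score, logDice is usually defined as 14 + log2(2·f_xy / (f_x + f_y)), where f_xy is the number of times the two words co-occur and f_x and f_y are their individual frequencies. The sketch below implements that general formula only. The function name and the example counts are hypothetical, and Sketch Engine itself computes the frequencies within a given grammatical relation, which is not modelled here.

```python
import math


def log_dice(f_xy, f_x, f_y):
    """Return the logDice score for a pair of words.

    f_xy: co-occurrence frequency of the two words
    f_x, f_y: individual frequencies of each word
    """
    if f_xy <= 0 or f_x <= 0 or f_y <= 0:
        raise ValueError("all frequencies must be positive")
    return 14 + math.log2(2 * f_xy / (f_x + f_y))


# Two words that always appear together reach the theoretical maximum of 14.
print(log_dice(10, 10, 10))
# Halving the share of co-occurrences lowers the score by one point.
print(log_dice(1, 2, 2))
```

Because the score depends only on the relative frequencies of the two words, and not on the size of the corpus, it lends itself to comparing corpora of different sizes, such as those of the different parties.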

Results are arranged in the tables and in the charts from the highest to the lowest frequency of use. The website has been built with HTML5, CSS3, JavaScript, and jQuery. In addition, we have used two open source libraries: Bootstrap for the layout and Chart.js for the charts.

The research

The information shown in Marcos has already been used for research. The publications are listed on the project research page.

The MarcoPolo data can be used freely for other research, on the sole condition that their use is acknowledged with a reference to:

Ruiz-Sánchez, A. & Alcántara-Plá, M. 2018. “Las campañas electorales en las redes sociales. El ejemplo de Twitter en España”, in El análisis del discurso político: géneros y metodologías, EUNSA.