Below we reproduce part of the large dataset on Russian state-controlled media coverage of protests that we constructed using our content analysis dictionary. This subset contains the content analysis results and metadata for 2,519 news stories about protest in Russia between 2011 and 2013.
The dataset covers six Russian state-controlled media sources including their online outlets:
- НТВ (NTV)
- Россия 1 (Russia 1)
- Первый канал (Channel 1)
- Известия (Izvestia)
- Российская газета (Russian Gazette)
- Комсомольская правда (Komsomolskaya Pravda)
For more detail, please see the list of sources.
Definitions of variables in the dataset are as follows:
- tid
- Text identifier
- sid
- Source identifier
- gid
- Source group identifier
- vid
- Media outlet identifier
- mid
- Media type identifier
- date
- Date of publication
- length
- Size of text (total number of words)
- country
- The most strongly associated country (primary country)
- country_score
- The level of association with the primary country
- country2
- The second most strongly associated country (secondary country)
- country2_score
- The level of association with the secondary country
- loc
- Unused source identifier (identical to sid)
- score
- Document score coded by dictionary
- n
- Number of entry words in text
- sd
- Standard deviation of document score
- se
- Standard error of document score
- n_noise
- Number of протестант* (protestant*) and протестир* (test*)
- n_signal
- Number of протест* (protest*)
- n_kw1
- Number of оранжев* (orange*)
- n_kw2
- Number of маргинал* (marginal*)
- n_kw3
- Number of революц* (revolution*)
- n_kw4
- Number of отечеств* (fatherland*)
- n_kw5
- Number of фашист* (fascist*)
- n_kw6
- Number of русск* (Russian*)
- n_kw7
- Number of нация* (nation*)
- n_kw8
- Number of майдан* (Maidan*)
- n_kw9
- Number of развал* (collapse*)
- n_1000
- Number of выбор* (election*)
- hash
- Unique string generated from title
- density
- Density of entry words in text
- time
- Days since January 1, 2011
- title
- Title of news story
- cluster
- Cluster identifier
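
Several of the variables above are derived from others, and a short sketch may help when checking a record by hand. The Python below is illustrative only and is not the code used to build the dataset. It assumes that `time` counts from zero on January 1, 2011, that `density` is `n` divided by `length`, and that `se` is `sd` divided by the square root of `n`. The function names and the sample record are hypothetical.

```python
import math
from datetime import date

# Reference day named in the definition of `time`.
EPOCH = date(2011, 1, 1)


def days_since_epoch(published: date) -> int:
    """Days elapsed since January 1, 2011 (assumed to be day 0)."""
    return (published - EPOCH).days


def density(n: int, length: int) -> float:
    """Assumed definition: entry words per word of text (n / length)."""
    return n / length if length > 0 else 0.0


def standard_error(sd: float, n: int) -> float:
    """Assumed definition: sd / sqrt(n); undefined when n is zero."""
    return sd / math.sqrt(n) if n > 0 else float("nan")


def has_signal(n_signal: int) -> bool:
    """True when the text contains at least one протест* match."""
    return n_signal > 0


# Hypothetical record, not taken from the dataset.
row = {"date": date(2011, 12, 10), "length": 400, "n": 25,
       "sd": 0.5, "n_signal": 3, "n_noise": 1}

print(days_since_epoch(row["date"]))       # 343
print(density(row["n"], row["length"]))
print(standard_error(row["sd"], row["n"]))
print(has_signal(row["n_signal"]))         # True
```

If the published values of `time`, `density` or `se` differ from what these functions return, the assumptions above do not match the dataset's actual definitions and should be adjusted.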