We are delighted to announce that our innovative framing analysis tool is now publicly available in the LSS package. We analyzed the Russian media’s framing of street protests using a system developed in Python, but we have ported the trained model to R to make it more accessible. We previously called it a ‘dictionary’, but we now call it a ‘model’ because it is a fitted Latent Semantic Scaling model. If you apply the model to your own news stories, you can easily produce plots very similar to those in our papers. According to this model, a news article frames street protests as “freedom of expression” when its score is high and as “social disorder” when its score is low.
The following code produces the plot.
    devtools::install_github("koheiw/LSS")
    require(quanteda)
    require(LSS)

    # Pre-processing
    corp <- readRDS('Data/data_corpus_integrum.RDS')
    toks <- tokens(corp, remove_punct = TRUE)
    mt <- dfm(toks, remove = stopwords('ru')) %>%
        dfm_trim(min_termfreq = 5)

    # Framing analysis
    pred <- as.data.frame(predict(data_textmodel_lss_russianprotests, newdata = mt, density = TRUE))
    pred$date <- docvars(mt, 'date')
    pred <- subset(pred, density > quantile(density, 0.25))

    # Visualization
    plot(pred$date, pred$fit, pch = 16, col = rgb(0, 0, 0, 0.02),
         ylim = c(-1, 1), ylab = 'Protest framing', xlab = 'Date')
    lines(lowess(pred$date, pred$fit, f = 0.05), col = 1)
    abline(h = 0)
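The filtering and smoothing steps in the code above can be tried without the corpus. The following is only a minimal sketch in base R: the data frame `pred_toy` and all of its values are simulated for illustration and are not output of the model. It assumes, as in the code above, that the prediction has the columns `fit` and `density`, and that `density` indicates how frequent the model's terms are in each document.

```r
# Simulated stand-in for the data frame returned by predict();
# pred_toy and every value in it are hypothetical.
set.seed(1)
n <- 500
pred_toy <- data.frame(
  date = as.Date("2012-01-01") + sample(0:364, n, replace = TRUE),
  fit = rnorm(n, mean = 0, sd = 0.3),
  density = runif(n)
)

# Drop the quarter of documents with the lowest density,
# mirroring subset(pred, density > quantile(density, 0.25))
cutoff <- quantile(pred_toy$density, 0.25)
kept <- subset(pred_toy, density > cutoff)

# Smooth the scores over time; lowess() works on numeric x
# and returns x in ascending order with the smoothed y
trend <- lowess(as.numeric(kept$date), kept$fit, f = 0.05)
trend$x <- as.Date(trend$x, origin = "1970-01-01")
```

The small span `f = 0.05` makes the trend line follow short-term changes closely; a larger value would give a smoother but less responsive curve.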
When you apply the model to your own data, please be aware that it was fitted on a corpus of Russian news stories (TV and newspapers) from 2011-2014. This means that the model might not perform as intended on texts from outside that period or on non-media texts. Depending on how you collect your data, you should also consider applying a geographical classifier, which is also available as an R package. We are currently improving the Russian seed dictionary, and it will be available soon as part of the package.
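One way to respect the period restriction is to subset the corpus before tokenization. The sketch below uses a toy corpus; the object `corp_toy`, its texts and its dates are made up for illustration, and it assumes that your corpus has a document variable named `date` stored as a `Date`.

```r
library(quanteda)

# Toy corpus with a hypothetical 'date' document variable
corp_toy <- corpus(
  c(d1 = "text one", d2 = "text two", d3 = "text three"),
  docvars = data.frame(
    date = as.Date(c("2010-06-01", "2012-03-15", "2015-01-10"))
  )
)

# Keep only documents dated within the period the model was fitted on
corp_in <- corpus_subset(
  corp_toy,
  date >= as.Date("2011-01-01") & date <= as.Date("2014-12-31")
)
ndoc(corp_in)
```

Documents outside the period can still be scored, but their scores should be interpreted with more caution.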