## News Cohesiveness Index datasets

These datasets accompany the article "News Cohesiveness: an Indicator of Systemic Risk in Financial Markets" (last updated 12.02.2014):

- document-entity matrices (csv format) zip

- R code that generates figures from the article and corresponding data zip

- Matlab code for analysis and corresponding data zip

## World Bank indicators datasets

We plan to consolidate a FOC-specific database containing different types of country-level socio-economic indicators and to produce datasets targeting specific problems. At the moment we have assembled several datasets by combining a subset of World Bank indicators with IMF banking crises data.

WB indicators+banking crises (WBiBC) datasets: Several datasets have been assembled by combining socio-economic indicators from the World Bank data repository with IMF country-level crisis episodes (banking, currency and sovereign debt crises, based on the IMF Working Paper by Laeven and Valencia, "Systemic Banking Crises: A New Database", 2008). The datasets are in tab-delimited format (tsv) and can be read into any statistics or data mining tool. They are constructed as country-panel data, with the indicators' aggregated values and trends within intervals of 3-5 years. These explanatory variables are coupled with a crisis/non-crisis label for the particular interval, giving a typical set of positive and negative cases that can be subjected to diverse data mining or statistical analyses.
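As a minimal sketch of reading such a tab-delimited file, the Python standard library `csv` module is sufficient. The column names and sample rows below (`country`, `interval`, `indicator_a_mean`, `indicator_a_trend`, `crisis`) are assumptions made for illustration only; the actual headers in the WBiBC files may differ.

```python
import csv
import io

# Hypothetical sample in a tab-delimited layout; the real column names
# and values in the WBiBC files may differ.
SAMPLE_TSV = (
    "country\tinterval\tindicator_a_mean\tindicator_a_trend\tcrisis\n"
    "Aland\t1990-1992\t2.5\t-0.1\t1\n"
    "Bland\t1995-1997\t3.0\t0.2\t0\n"
)


def read_panel(handle):
    """Return the rows of a tab-delimited file as a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


# For a real file, pass open(path, newline="") instead of the StringIO.
rows = read_panel(io.StringIO(SAMPLE_TSV))

# Split into positive (crisis) and negative (non-crisis) cases,
# assuming the label is stored as "1" / "0".
positives = [r for r in rows if r["crisis"] == "1"]
negatives = [r for r in rows if r["crisis"] == "0"]
print(len(positives), len(negatives))
```

All values are read as strings, so numeric indicator columns would need an explicit `float(...)` conversion before analysis.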

The procedure for constructing the datasets is illustrated in the diagram below:

Procedure for construction of datasets: For positive examples we take the three years before each crisis and calculate 9 values for each of the 105 indicators. For negative examples we do the same, the only difference being that the reference year is one of the non-crisis years that is exactly 10 years away from the nearest crisis year.
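The year selection described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the example crisis years and the observation range are hypothetical, and the 9 values computed per indicator are not reproduced here because their definition is not given in this text.

```python
def positive_windows(crisis_years, window=3):
    """Map each crisis year to the `window` years that precede it."""
    return {y: list(range(y - window, y)) for y in sorted(crisis_years)}


def negative_reference_years(crisis_years, all_years, gap=10):
    """Return non-crisis years whose nearest crisis year is exactly `gap` years away."""
    crisis = set(crisis_years)
    if not crisis:
        return []
    candidates = []
    for y in all_years:
        if y in crisis:
            continue
        if min(abs(y - c) for c in crisis) == gap:
            candidates.append(y)
    return candidates


# Hypothetical crisis years for one country and a hypothetical data range.
crises = [1991, 2008]
years = range(1970, 2013)

pos = positive_windows(crises)
neg_refs = negative_reference_years(crises, years)
# Negative examples use the same three-year window before a reference year.
neg = {y: list(range(y - 3, y)) for y in neg_refs}
print(pos)
print(neg)
```

Note that a year such as 2001 is 10 years from the 1991 crisis but only 7 years from the 2008 one, so it does not qualify under the nearest-crisis reading used in this sketch.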

## 105 World Bank indicators 2008

## 105 World Bank indicators 2012

- Banking+Currency+Debt tsv
- Banking+Currency tsv
- Banking tsv
- Banking+Debt tsv
- Currency+Debt tsv
- Currency tsv
- Debt tsv

## IMF Financial crisis episodes database 2012

- Banking xls
- Banking+Currency xls
- Banking+Currency+Debt xls
- Banking+Debt xls
- Currency xls
- Currency+Debt xls
- Debt xls

## RCA data

This is a dataset on the RCA (Revealed Comparative Advantage) of countries that export a particular product. It is based on the C. Hidalgo dataset published here. RCA is defined as:

$$ RCA(c,i) = \frac{ \frac{x(c,i)}{\sum_i x(c,i)} }{ \frac{\sum_c x(c,i)}{\sum_{i,c} x(c,i)} } $$

where $x(c,i)$ is the value of the exports of country $c$ in the $i$-th good. The numerator is the share of good $i$ in the total exports of country $c$, and the denominator is the share of good $i$ in total world exports. For the description of the product codes used in this dataset please refer to the original sources here and here.
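The RCA definition can be sketched in a few lines of Python. The function name `rca`, the dictionary layout and the toy export values are assumptions for illustration; they do not reflect the format of the RCA data file itself.

```python
from collections import defaultdict


def rca(exports):
    """Compute RCA(c, i) for every (country, product) pair.

    `exports` maps (country, product) -> export value x(c, i).
    Assumes every country and every product has a positive export total.
    """
    country_total = defaultdict(float)  # sum over i of x(c, i)
    product_total = defaultdict(float)  # sum over c of x(c, i)
    world_total = 0.0                   # sum over i, c of x(c, i)
    for (c, i), x in exports.items():
        country_total[c] += x
        product_total[i] += x
        world_total += x

    result = {}
    for (c, i), x in exports.items():
        share_in_country = x / country_total[c]
        share_in_world = product_total[i] / world_total
        result[(c, i)] = share_in_country / share_in_world
    return result


# Toy data: two hypothetical countries and two hypothetical goods.
toy_exports = {
    ("A", "wine"): 30.0,
    ("A", "cars"): 70.0,
    ("B", "wine"): 10.0,
    ("B", "cars"): 90.0,
}
toy_rca = rca(toy_exports)
for key in sorted(toy_rca):
    print(key, round(toy_rca[key], 3))
```

In this toy example wine makes up 30% of country A's exports but only 20% of world exports, so A's RCA in wine is above 1, which is the usual reading of a country exporting more than its "fair share" of a good.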

- RCA data txt

## Datasets for FOC school

These are the RapidMiner datasets that will be used in the FOC school in Lucca (last updated 12.10.2012).

- datasets zip