European Migration, Asylum, and Policy: Gaining Perspective on the Migrant Crisis

Migrants in Hungary, September 2015

How big was the migrant crisis really?

One thing we know for sure: bigger than expected.

Germany welcomed nearly double the asylum seekers it anticipated in 2015, totaling over one million. By early 2016, they bashfully admitted that they had lost track of about 130,000 of them. While well-intentioned, the open arms of Germany’s Willkommenskultur towards asylum seekers quickly became fatigued.

And they weren’t the only ones.

Razor-wire fences were built along the borders of Greece and Bulgaria, blocking incoming migrants from Turkey. Migrants crossing from Croatia to Slovenia were pepper sprayed. Migrant smuggling boats capsized. The black market boomed. People died.

How did European governments respond?

  • Poland’s new government said no to the EU’s proposal of migrant quotas.
  • Hungary called for a global migrant quota.
  • Sweden said they’ve done their part, and other European countries need to step up.
  • Slovakia said, “Who caused problems in North Africa? Slovakia? No!”
  • Czechia said most refugees were economic migrants, and that quotas aren’t a solution.
  • The UK said we need to help those in war zones, “not the ones who are strong and rich enough to come to Europe.”

The guilt-tripping, shielding, and deflecting by world leaders suggests these were tough times.

Let’s see what the data has to say.

Data setup

I’m using Python in Jupyter notebooks.

You can see the full code rendered here and the raw data here.

import numpy as np
import pandas as pd
from functools import reduce
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
import matplotlib.patches as pat
from matplotlib import cm

The immigration data and most of the data that follows come from the European Commission’s Eurostat database.

# TOTAL IMMIGRATION DATA
# https://ec.europa.eu/eurostat/cache/metadata/en/migr_immi_esms.htm
df_imm_total = pd.read_excel('migr_imm2ctz.xlsx', sheet_name='total')
df_imm_total
df_imm_total.info()

Data cleaning

After setting the index, coding missing values, and deleting rows with no data, I can see there’s only a handful of missing values left to impute.

# Set index
df_imm_total.set_index('GEO/TIME', inplace=True)

# Replace colons with missing value NaNs and set datatype
df_imm_total = df_imm_total.apply(pd.to_numeric, errors='coerce')

# Drop rows with no data
df_imm_total.dropna(how='all', inplace=True)

# Explore remaining missing values
df_imm_total[df_imm_total.isnull().sum(axis=1) > 0]

They’re from Bulgaria and Belgium in the early years.

I’m going to impute those NaNs by backfilling, fix Germany’s name, and strip the column names of any extraneous characters.

# Backfill missing values
df_imm_total.bfill(axis=1, inplace=True)

# Rename Germany
df_imm_total.index = [s.replace('Germany (until 1990 former territory of the FRG)', 'Germany') for s in df_imm_total.index]

# Strip column names
df_imm_total.columns = [col.strip() for col in df_imm_total.columns]

df_imm_total.head()

Looking good.
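To make those cleaning steps concrete, here is a minimal sketch on a made-up table (the country labels and numbers are invented for illustration, not Eurostat values). It also shows a limitation of backfilling on its own: a gap in the last column stays empty, because there is nothing to its right to copy from.

```python
import pandas as pd

# Hypothetical toy table mimicking the Eurostat layout: ':' marks missing data
# and one column header carries a stray trailing space.
toy = pd.DataFrame(
    {"2008 ": [":", "10"], "2009": ["5", "12"], "2010": ["7", ":"]},
    index=["Country A", "Country B"],
)

# ':' cannot be parsed as a number, so errors='coerce' turns it into NaN
toy_clean = toy.apply(pd.to_numeric, errors="coerce")

# Backfill along each row: a gap takes the next available year's value
toy_filled = toy_clean.bfill(axis=1)

# Strip extraneous spaces from the column names
toy_filled.columns = [col.strip() for col in toy_filled.columns]

print(toy_filled)
```

Country A's missing 2008 value is filled from 2009, while Country B's missing 2010 value remains NaN. Chaining a forward fill after the backfill would cover that trailing case.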

Data cleaning function

Since I’ll be working with many other similarly formatted Eurostat datasets, I’ll create a cleaning function that does the above cleaning tasks.

# Eurostat dataset cleaning function
countries = df_imm_total.index.tolist()

def clean_eurostat_excel(file_name, sheet):
    data = pd.read_excel(file_name, sheet_name = sheet, index_col=0)

    # rename Germany
    data.index = [str(s).replace('Germany (until 1990 former territory of the FRG)', 'Germany') for s in data.index]

    # replace ':' with NaNs, set datatype, and impute missing data with backfill/frontfill
    data = data.apply(pd.to_numeric, errors='coerce').bfill(axis=1).ffill(axis=1)
    
    # drop rows with no data
    data.dropna(how='all', inplace=True)

    # strip column names of extraneous spaces
    data.columns = [col.strip() for col in data.columns]
    
    # reduce rows to country list (reindex keeps countries missing from a dataset as NaN rows)
    data = data.reindex(countries)

    return data

# TOTAL EMIGRATION DATA
# https://ec.europa.eu/eurostat/cache/metadata/en/migr_immi_esms.htm
df_emi_total = clean_eurostat_excel('migr_emi1ctz.xlsx', 'total')

df_emi_total.head()

Testing the cleaning function on another dataset–emigration–was a success.

Data transformation to long-form function

Some graph types require the data to be pivoted to its long-form version. I’m going to create another function that will work across all of the Eurostat datasets.

# Pivot transformation to long-form for visual analysis
def df_to_longform(df, data_col_name):
    
    df = pd.melt(df.reset_index(), id_vars='index')

    df.columns=('country','year', data_col_name)

    df.sort_values(by=['country','year'], inplace=True)

    df['year'] = df['year'].astype(str)

    return df

dfg_imm_total = df_to_longform(df_imm_total, 'immigrants')

dfg_imm_total.head()

Testing the function on our immigration dataset yields another success.
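For readers new to `pd.melt`, here is a small sketch of the same wide-to-long reshaping on an invented two-country table. The labels and numbers are made up; only the reshaping pattern mirrors the function above.

```python
import pandas as pd

# Hypothetical wide-form table: one row per country, one column per year
wide = pd.DataFrame({"2008": [1, 3], "2009": [2, 4]}, index=["A", "B"])

# reset_index() moves the unnamed index into a column called 'index',
# which melt then keeps as the identifier for each (year, value) pair
long = pd.melt(wide.reset_index(), id_vars="index")
long.columns = ["country", "year", "immigrants"]
long = long.sort_values(["country", "year"]).reset_index(drop=True)

print(long)
```

Each country-year combination becomes its own row, which is the shape seaborn's grid functions expect.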

Ready to rock and roll.

Immigration and emigration in Europe

I’m going to combine the long-form dataframes of both the immigration and emigration datasets for our first two graphs.

# Emigration data to long-form
dfg_emi_total = df_to_longform(df_emi_total, 'emigrants')

# Combining long-form dataframes to graph
dfg = pd.merge(dfg_imm_total, dfg_emi_total, on=['country', 'year'])

dfg.head()
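One thing worth checking after a merge like this: `pd.merge` defaults to an inner join, so any country-year pair present in only one table is silently dropped. The sketch below, on invented data, uses the `indicator=True` option of an outer merge to list such rows. The variable names `audit` and `unmatched` are hypothetical helpers, not part of the original notebook.

```python
import pandas as pd

# Invented long-form tables; country B has no emigration rows
imm = pd.DataFrame({
    "country": ["A", "A", "B"],
    "year": ["2008", "2009", "2008"],
    "immigrants": [10, 12, 7],
})
emi = pd.DataFrame({
    "country": ["A", "A"],
    "year": ["2008", "2009"],
    "emigrants": [4, 5],
})

# Default inner join keeps only pairs found in both tables
inner = pd.merge(imm, emi, on=["country", "year"])

# Outer join with an indicator column reveals what the inner join dropped
audit = pd.merge(imm, emi, on=["country", "year"], how="outer", indicator=True)
unmatched = audit[audit["_merge"] != "both"]

print(len(inner), "rows kept;", len(unmatched), "row(s) without a match")
```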
# Style
sns.set_style('whitegrid')

# Pairplot by country
pal = sns.cubehelix_palette(10, start=0.3, rot=-0.8)
g = sns.pairplot(dfg, vars=['immigrants', 'emigrants'],
            height=5, palette=pal, kind='reg', hue='country')

# Title
plt.subplots_adjust(top=0.85)
g.fig.suptitle('Immigration & Emigration by Country', size=16, weight='demi'); 

In this pairplot, we can see the immigration-emigration relationships for each of the thirty-two European countries in the decade from 2008 to 2017.

Here’s how Eurostat defines immigration and emigration:

Immigration: the action by which a person establishes his or her usual residence in the territory of a Member State for a period that is, or is expected to be, of at least 12 months, having previously been usually resident in another Member State or a third country.

Emigration: the action by which a person, having previously been usually resident in the territory of a Member State, ceases to have his or her usual residence in that Member State for a period that is, or is expected to be, of at least 12 months.

International Migration statistics, Eurostat

From the pairplot above, it’s clear migration flows vary widely between countries. This is likely due to both migration ‘supply factors’–such as each country’s immigration policy and geographic location–as well as migration ‘demand factors’, namely the migrant demand for each country.

# Pairplot by years
pal = sns.cubehelix_palette(10, start=0.3, rot=-0.8)
g = sns.pairplot(dfg, vars=['immigrants', 'emigrants'],
            height=5, palette=pal, kind='reg', hue='year')

# Title
plt.subplots_adjust(top=0.85)
g.fig.suptitle('Immigration & Emigration by Year', size=16, weight='demi'); 

When we graph the data by year rather than by country, we see all slopes trend positive.

We’d expect to see a slope of one if the immigration and emigration of each country cancelled each other out. Each year has a slope less than one, implying that for all years, there are more people immigrating into Europe than emigrating out of it.

The key takeaway from this graph is in the slope changes over the decade. On the bottom left grid, you can see the starting point in 2009–the tan line. As the years go by, the slopes decline, which means more immigrants per emigrant than before.
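To make the slope reasoning concrete, here is a small sketch with invented numbers (not Eurostat data) in which emigration is always half of immigration plus a constant. A least-squares line fitted with `np.polyfit`, similar in spirit to the regression lines drawn by `kind='reg'`, recovers a slope of about 0.5.

```python
import numpy as np

# Invented flows for four hypothetical countries in a single year
immigrants = np.array([100.0, 200.0, 300.0, 400.0])
emigrants = 0.5 * immigrants + 10.0

# Fit emigrants (y) against immigrants (x) with a straight line
slope, intercept = np.polyfit(immigrants, emigrants, 1)

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

A slope below one means emigration grows more slowly than immigration across countries. Strictly speaking the intercept matters too, so the slope is best read as a rough guide rather than an exact balance.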

Looking at the darkest lines, representing the last years of the decade, we can see the slope pivoting back towards where it was before, closer to the tan slopes. This implies after an influx of migration into Europe, the immigration-emigration rate moved back towards the pre-migrant-crisis rate.

Time to break out the country-level graphs.

Total immigration by country

How many immigrants did Germany–the EU’s de facto leader–receive? What about Sweden, which said it did its part during the migrant crisis? Or the UK, conveniently across the Channel?

Let’s look at the number of immigrants each country received from 2008 to 2017 using Seaborn’s FacetGrid and Matplotlib’s Pyplot.

# Initialize FacetGrid object
g = sns.FacetGrid(dfg_imm_total, col='country', col_wrap=4, 
                  height=2,aspect=2).set(xticks=np.arange(0,10,3))

# Create plot
g.map(plt.plot, 'year', 'immigrants',  color='darkorange')
g.map(plt.fill_between, 'year', 'immigrants',  color='darkorange', alpha=0.5)
g.map(plt.axhline, y=0, lw=2, color='darkorange')

# Facet titles
g.set_titles("{col_name}")

# Formatting axes
g.set(yticks=[])
g.despine(bottom=True, left=True)

# Legend
color_key = {'Immigrants': 'darkorange'}
patches =  [pat.Patch(color=v, label=k) for k,v in color_key.items()]

g.fig.legend(handles=patches, bbox_to_anchor =
             [0.5, 0.93]).get_frame().set_edgecolor('1.0')

# Title
plt.subplots_adjust(top=0.85)
g.fig.suptitle('Total Immigration', size=16, weight='demi'); 

The simplicity of these subplots allows us to make relative comparisons of thirty-two countries over a decade–a lot of data in a small amount of space.

Germany sticks out for the volume of immigrants it took in over the decade compared to other countries, in addition to the spike of immigrants in 2015.

The 2015 spike can be seen in Austria, Belgium, and a few other countries too. Italy has a noticeable downward trend while France and the UK are fairly flat over time. The other heavy hitter is Spain, which has an interesting U-shape.

Because we’re working with raw numbers, it’s harder to see trends in the data for smaller countries. Sidebar: We’ll get to per capita and per GDP data visualizations a bit farther down.

So who are these immigrants? How many of them are asylum seekers?

First generation immigrants: reason for migration

# FIRST GENERATION IMMIGRANTS REASONS DATA 2014
# https://ec.europa.eu/eurostat/cache/metadata/en/lfso_14_esms.htm
df_imm_fgen_bythous = clean_eurostat_excel('lfso_14b1dr.xlsx', 'Reason first gen imm 2014')

df_imm_fgen_bythous[df_imm_fgen_bythous.isnull().sum(axis=1) > 0]

This data provides a breakdown of reasons for immigration, surveyed in 2014. A handful of our thirty-two countries did not participate, as the rows of missing values show.

Besides the usual cleaning and transformation to long-form, I’m also going to sort the data by ‘total’ to set the order of countries I want in my graph. Then, I’m going to delete the ‘total’ data for graphing since that will only make the individual reasons harder to see on the graph.

# Drop countries with no data
df_imm_fgen_bythous.dropna(how='all', inplace=True)

# Order dataframe by total first generation immigrants
df_imm_fgen_bythous.sort_values(by=['Total'], inplace=True, ascending=False)

fgen_order = df_imm_fgen_bythous.index

# Drop total - only graphing reasons
df_imm_fgen_bythous.drop('Total', axis=1, inplace=True)

# Dataframe to long-form for graphing
dfg_imm_fgen_bythous = df_to_longform(df_imm_fgen_bythous, 'x')

dfg_imm_fgen_bythous.columns = ['country', 'Reason', 'thousands']

Let’s see what a simple barplot of this data looks like.

# Set plots and style
sns.set(rc={'figure.figsize':(15,30)}, style='whitegrid')

# Plot
ax = sns.barplot(data=dfg_imm_fgen_bythous, x='thousands', y='country', hue='Reason',
                 palette=sns.color_palette('deep', 7), order=fgen_order, 
                 hue_order=['Family reasons', 'Work, no job found before migrating', 
                      'Work, job found before migrating', 
                       'International protection or asylum', 'Education reasons', 
                       'Other', 'No response'])

# Legend
plt.legend(loc='center right')

# Title
ax.set_title('First Generation Immigrants: Reasons for Migration (2014)', fontsize='large', fontweight='demi', y=1.02)
plt.figtext(0.5,0.893, 'Countries in descending order of total immigrants in 2014', ha="center", va="top", fontsize=12, color='grey');

We can see ‘Family reasons’ (blue) is the number one reason for migration for most countries. Economic migrants who couldn’t find a job before migrating (orange) are proportionately high in Italy, Spain, and Greece.

The red represents asylum seekers and refugees, a large proportion for Sweden. We can see what the Swedes meant when they said they’ve taken on a lot of refugees relative to other types of immigrants, pressuring other countries to do the same.

I’m going to split the data into three and make three subplots so we can see the data more clearly for all countries.

# Create boolean criteria for subplots
df_imm_fgen_bythous['Fam500k+'] = df_imm_fgen_bythous['Family reasons'] > 500
df_imm_fgen_bythous['Fam100k+'] = df_imm_fgen_bythous['Family reasons'] > 100

df_imm_fgen_bythous.head()
# Creating sub-dataframes for zoomed-in subplots
df_imm_fgen_bythous_lar = df_imm_fgen_bythous.loc[df_imm_fgen_bythous['Fam500k+'] == True].drop(
    ['Fam500k+', 'Fam100k+'], axis=1)

df_imm_fgen_bythous_med = df_imm_fgen_bythous.loc[(df_imm_fgen_bythous['Fam500k+'] == False) & (
    df_imm_fgen_bythous['Fam100k+'] == True)].drop(
    ['Fam500k+', 'Fam100k+'], axis=1)

df_imm_fgen_bythous_sma = df_imm_fgen_bythous.loc[df_imm_fgen_bythous['Fam100k+'] == False].drop(
    ['Fam500k+', 'Fam100k+'], axis=1)
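The two boolean columns above amount to sorting countries into three size bins. As a compact alternative sketch, `pd.cut` can assign those bins in one step. The series below is invented, and the labels `small`, `medium`, and `large` are hypothetical names chosen for illustration.

```python
import pandas as pd

# Invented 'Family reasons' counts in thousands for three hypothetical countries
fam = pd.Series([1200, 350, 40], index=["X", "Y", "Z"], name="Family reasons")

# Bins are right-closed by default: (-inf, 100], (100, 500], (500, inf),
# which matches the '> 100' and '> 500' thresholds used above
size = pd.cut(
    fam,
    bins=[-float("inf"), 100, 500, float("inf")],
    labels=["small", "medium", "large"],
)

# One sub-series per bin, ready to be plotted separately
groups = {label: fam[size == label] for label in size.cat.categories}

print(size)
```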
# Data to long-form for graphing
dfg_imm_fgen_bythous_lar = df_to_longform(df_imm_fgen_bythous_lar, 'x')
dfg_imm_fgen_bythous_med = df_to_longform(df_imm_fgen_bythous_med, 'x')
dfg_imm_fgen_bythous_sma = df_to_longform(df_imm_fgen_bythous_sma, 'x')

dfg_imm_fgen_bythous_lar.columns = ['country', 'Reason', 'thousands']
dfg_imm_fgen_bythous_med.columns = ['country', 'Reason', 'thousands']
dfg_imm_fgen_bythous_sma.columns = ['country', 'Reason', 'thousands']

dfg_imm_fgen_bythous_lar.head()
# Sort for graphing order
dfg_imm_fgen_bythous_lar.sort_values(by=['thousands'], ascending=False, inplace=True)

dfg_imm_fgen_bythous_med.sort_values(by=['thousands'], ascending=False, inplace=True)

dfg_imm_fgen_bythous_sma.sort_values(by=['thousands'], ascending=False, inplace=True)
# Set plots and style
fig, axs = plt.subplots(3, figsize=(10, 20), gridspec_kw={'height_ratios': [3, 2, 4]})

# Set style
sns.set_style('whitegrid')

# First subplot (> 500k family reasons)
p0 = sns.barplot(data=dfg_imm_fgen_bythous_lar, x='thousands', y='country', hue='Reason',
                 palette=sns.color_palette('deep', 7),
                 hue_order=['Family reasons', 'Work, no job found before migrating', 
                      'Work, job found before migrating', 
                       'International protection or asylum', 'Education reasons', 
                       'Other', 'No response'], ax=axs[0])
# Second subplot (100k to 500k family reasons)
p1 = sns.barplot(data=dfg_imm_fgen_bythous_med, x='thousands', y='country', hue='Reason',
                palette=sns.color_palette('deep', 7),
                 hue_order=['Family reasons', 'Work, no job found before migrating', 
                      'Work, job found before migrating', 
                       'International protection or asylum', 'Education reasons', 
                       'Other', 'No response'], ax=axs[1])

# Third subplot (100k or fewer family reasons)
p2 = sns.barplot(data=dfg_imm_fgen_bythous_sma, x='thousands', y='country', hue='Reason',
                 palette=sns.color_palette('deep', 7),
                 hue_order=['Family reasons', 'Work, no job found before migrating', 
                      'Work, job found before migrating', 
                       'International protection or asylum', 'Education reasons', 
                       'Other', 'No response'], ax=axs[2])


# Formatting
fig.tight_layout()
axs[0].set(xlabel='', ylabel='')
axs[1].set(xlabel='', ylabel='')
axs[2].set(xlabel='First generation immigrants', ylabel='')

# Legend
handles, labels = axs[2].get_legend_handles_labels()
p0.legend(loc='lower right')
p1.legend(handles[:0], labels[:0])
p2.legend(handles[:0], labels[:0])

# Title
plt.subplots_adjust(top=0.98)
fig.suptitle('First Generation Immigrants: Reasons for Migration (2014)', fontsize='large', fontweight='demi', y=1.01);

Much better.

Greece and Malta are the only countries without ‘Family reasons’ (blue) as their top answer, both with ‘Work, no job found before migrating’ (orange) as number one. We can see Greece is having a tough time with the proportion of ‘No response’s (pink) relative to other countries.

Despite Slovakia shirking responsibility for refugees verbally during the migrant crisis, they appear to have taken in an equal proportion of all types of immigrants, in the same vein as Bulgaria and Romania.

As Czechia implied, it’s true there are a lot of economic migrants (orange and green). We can see their aversion to refugees in this graph with such a low proportion taken in (red). Italy, Greece, and Cyprus also took in a minuscule number of refugees–all coastal countries that struggled with being on the front lines of migrant boats.

Yet, we can’t talk about immigration without talking about emigration too.

Total immigration and emigration by country

To get a better picture of migrant movement, let’s look at both immigration and emigration for each country. I’ll use the original immigration graph, overlaying emigration on top.

# Initialize FacetGrid object
g = sns.FacetGrid(dfg, col='country', col_wrap=4, 
                  height=2,aspect=2).set(xticks=np.arange(0,10,3))


# Create immigration plot - set for immigrants
g.map(plt.plot, 'year', 'immigrants', color='darkorange')
g.map(plt.fill_between, 'year', 'immigrants',  color='darkorange', alpha=0.5)

g.map(plt.plot, 'year', 'emigrants', color='darkgrey')
g.map(plt.fill_between, 'year', 'emigrants',  color='darkgrey', alpha=0.5)

# Facet titles
g.set_titles("{col_name}")
g.set_axis_labels(y_var='')

# Formatting axes
g.set(yticks=[])
g.despine(left=True)

# Legend
color_key = {'Immigrants': 'darkorange',
            'Emigrants' : 'darkgrey'}

patches =  [pat.Patch(color=v, label=k) for k,v in color_key.items()]

g.fig.legend(handles=patches, bbox_to_anchor =
             [0.525, 0.93], ncol=2).get_frame().set_edgecolor('1.0')

# Title
plt.subplots_adjust(top=0.85)
g.fig.suptitle('Total Immigration & Emigration', size=16, weight='demi'); 

Now we have a better picture of the net number of migrants Germany absorbed. Many more people were coming in than were going out in the years surrounding the 2015 migrant crisis.
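The "net number" here is simply immigrants minus emigrants. As a minimal sketch on invented figures (the column names follow the ones used above, the values do not come from Eurostat), net migration can be computed per row and then summed per country:

```python
import pandas as pd

# Invented flows for two hypothetical countries over two years
flows = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "year": ["2014", "2015", "2014", "2015"],
    "immigrants": [100, 180, 50, 40],
    "emigrants": [60, 70, 55, 65],
})

# Positive values mean more people arrived than left
flows["net_migration"] = flows["immigrants"] - flows["emigrants"]

# Total net migration per country over the period
net_by_country = flows.groupby("country")["net_migration"].sum()

print(net_by_country)
```

In the overlaid chart, this quantity corresponds to the gap between the orange and grey areas.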

However, 2016 and 2017 show Germany’s immigration going down and its emigration going up. This likely reflects Germany’s steps to both decrease immigration and increase emigration after the migrant crisis.

Italy’s immigration trends downward while its emigration trends upward over the decade–a pattern that may continue–in spite of the migrant crisis. Recall from the previous chart the extraordinarily low proportion of its immigrants who were asylum seekers.

We can see France’s net migration rate looks close to zero with fairly equal amounts of people entering and leaving the country. This contrasts with the UK, which has a fairly consistent, positive net migration rate.

Spain has a spike in emigration and a drop in immigration in 2013. This is likely related to their high unemployment rate around that time, which reached as high as 25% in 2012. Like Italy in the previous graph, Spain’s immigrants were almost entirely economic or familial migrants, lacking in the asylum seeker department.

Emigrants: foreign vs local

Who are these emigrants? What proportion are foreign, and what proportion are local?

# REPORTING COUNTRY EMIGRATION DATA 
# https://ec.europa.eu/eurostat/cache/metadata/en/migr_immi_esms.htm
df_emi_reporting = clean_eurostat_excel('migr_emi1ctz.xlsx', 'reporting_country')

# Pivot transformation to long-form for visual analysis
dfg_emi_reporting = df_to_longform(df_emi_reporting, 'reporting_country_emi')

# Merge for graphing
dfg = pd.merge(dfg, dfg_emi_reporting, on=['country', 'year'])

dfg.isnull().sum()
# Initialize FacetGrid object
g = sns.FacetGrid(dfg, col='country', col_wrap=4, 
                  height=2,aspect=2).set(xticks=np.arange(0,10,3))

# Create emigration plot: total emigrants first, reporting-country emigrants on top
g.map(plt.plot, 'year', 'emigrants', color='cadetblue')
g.map(plt.fill_between, 'year', 'emigrants',  color='cadetblue', alpha=0.5)

g.map(plt.plot, 'year', 'reporting_country_emi', color='darkgrey')
g.map(plt.fill_between, 'year', 'reporting_country_emi',  color='darkgrey', alpha=0.5)

# Facet titles
g.set_titles("{col_name}")
g.set_axis_labels(y_var='')

# Formatting axes
g.set(yticks=[])
g.despine(left=True)

# Legend
color_key = {'Total emigrants': 'cadetblue',
            'Reporting country emigrants' : 'darkgrey'}

patches =  [pat.Patch(color=v, label=k) for k,v in color_key.items()]

g.fig.legend(handles=patches, bbox_to_anchor =
             [0.58, 0.93], ncol=2).get_frame().set_edgecolor('1.0')

# Title
plt.subplots_adjust(top=0.85)
g.fig.suptitle('Foreign vs. Reporting Country Emigration', size=16, weight='demi'); 

The blue ‘total emigrants’ data is the same emigration data shown in grey in the previous graph. The overlapping grey represents ‘locals’, meaning those who left their country of citizenship. Thus, blue with no grey overlap represents the proportion of emigrants that were foreign to the country that they left.

Many countries, such as France, Italy, Poland, Romania, Hungary, and many of the smaller countries like Latvia and Croatia, are hemorrhaging their own citizens relative to foreigners. This also suggests that the foreign immigrants who move to these countries are staying put.
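Put as arithmetic, the foreign portion is total emigrants minus reporting-country emigrants. The sketch below uses invented numbers and the column names from the merge above. `foreign_emi` and `foreign_share` are hypothetical names, and treating everything outside the reporting-country count as "foreign" is an assumption of this sketch.

```python
import pandas as pd

# Invented emigration counts for two hypothetical countries in one year
emi = pd.DataFrame(
    {"emigrants": [1000, 400], "reporting_country_emi": [250, 300]},
    index=["A", "B"],
)

# Emigrants who were not citizens of the country they left (assumed residual)
emi["foreign_emi"] = emi["emigrants"] - emi["reporting_country_emi"]

# Share of all emigrants who were foreign to the country they left
emi["foreign_share"] = emi["foreign_emi"] / emi["emigrants"]

print(emi)
```

Country A resembles the pattern described for Spain (mostly foreign emigrants), while country B resembles countries losing mainly their own citizens.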

Spain has the highest proportion of foreign emigrants, and increasing local emigrants. From previous graphs, we learned that many of Spain’s immigrants were economic migrants–surely they didn’t hang around given Spain’s employment problems.

Many countries have relatively equal proportions of locals and foreigners leaving, such as Germany, the UK, Belgium, and Sweden.

Notably, Germany’s spike in emigration in the years after the migrant crisis appears to be the result of both Germans leaving and foreign emigration.

Time to level the playing field: immigration and emigration in terms of population and GDP.

Immigration & emigration per capita

Exploring the raw data is a great start, but not a great way to compare the economic and social impact of migration between different countries.

Here, I’ll recreate our immigration/emigration per country chart from before, but with migrants per capita data rather than raw, total data.

But first, some cleaning to do on population data. I used United Nations population data over Eurostat’s because it was more complete.

# WORLD POPULATION DATA (IN THOUSANDS)
# https://population.un.org/wpp/Download/Standard/Population/
df_pop = pd.read_excel('world_population.xlsx')
df_pop
# Select subset of dataframe
df_pop = df_pop.iloc[15:, 2:]
df_pop.head()
# Create new column header
df_pop.columns = df_pop.iloc[0]
df_pop = df_pop[1:]

df_pop.head()
# Cleaning up index
df_pop.set_index('Region, subregion, country or area *',inplace=True)
df_pop.drop(['Notes','Country code','Type','Parent code'],axis=1, inplace=True)
# Remove the leftover header-row label (14) and the index name
df_pop.columns.name = None
df_pop.index.name = None

df_pop.head()
# Reduce dataset to select countries and years
df_pop = df_pop.loc[countries, '2008':'2017']

# Coerce placeholders such as ':' to NaN, set datatype, and fill any gaps
df_pop = df_pop.apply(pd.to_numeric, errors='coerce').bfill(axis=1).ffill(axis=1)

df_pop.head()
# Push index into column for graphing
dfi_pop = df_pop.reset_index()

# Drop columns for graphing
dfi_pop = dfi_pop.drop(['2009','2010','2012','2013', '2015', '2016'],axis=1)

dfi_pop.head()
# Set up PairGrid
g = sns.PairGrid(data=dfi_pop.sort_values('2008', ascending=False),
                x_vars=dfi_pop.columns[1:], y_vars=['index'],
                height=10, aspect=0.3)

# Create stripplot
g.map(sns.stripplot, size=10, orient='h', palette=sns.cubehelix_palette(32, start=0.5, rot=-0.8, reverse=True),
     linewidth=1, edgecolor='w')

# Set x-axis limits on all columns
g.set(xlim=(-10000, 100000), xlabel="Population in thousands", ylabel="")

# Column titles
titles = ['2008', '2011', '2014', '2017']

for ax, title in zip(g.axes.flat, titles):

    # Set a different title for each axes
    ax.set(title=title)

    # Make the grid horizontal instead of vertical
    ax.xaxis.grid(False)
    ax.yaxis.grid(True)

# Title
plt.subplots_adjust(top=0.85)
g.fig.suptitle('Population by Country', size=16, weight='demi'); 

sns.despine(left=True, bottom=True)

We can see the five most populous countries are Germany, France, the UK, Italy, and Spain. Following the next most populous countries–Poland, Romania, and the Netherlands–the populations of the remaining countries are substantially smaller.

Cracking onward: let’s see migrants per capita.

# Pivot data to long-form for visual analysis
dfg_pop = df_to_longform(df_pop, 'pop_in_thous')

dfg = pd.merge(dfg, dfg_pop, on=['country', 'year'])

dfg.info()
# Create immigrants/emigrants per capita data
dfg['immigrants_per_capita'] = dfg['immigrants']/dfg['pop_in_thous']
dfg['emigrants_per_capita'] = dfg['emigrants']/dfg['pop_in_thous']

dfg.head()
# Initialize FacetGrid object
g = sns.FacetGrid(dfg, col='country', col_wrap=4, 
                  height=2, aspect=2).set(xticks=np.arange(0,10,3))

# Create immigration plot - set for immigrants
g.map(plt.plot, 'year', 'immigrants_per_capita', color='darkorange')
g.map(plt.fill_between, 'year', 'immigrants_per_capita',  color='darkorange', alpha=0.5)

g.map(plt.plot, 'year', 'emigrants_per_capita', color='darkgrey')
g.map(plt.fill_between, 'year', 'emigrants_per_capita',  color='darkgrey', alpha=0.5)

# Facet titles
g.set_titles("{col_name}")
g.set_axis_labels(y_var='')

# Formatting axes
g.set(yticks=[])
g.despine(left=True)

# Legend
color_key = {'Immigrants per capita': 'darkorange',
            'Emigrants per capita' : 'darkgrey'}

patches =  [pat.Patch(color=v, label=k) for k,v in color_key.items()]

g.fig.legend(handles=patches, bbox_to_anchor =
             [0.6, 0.93], ncol=2).get_frame().set_edgecolor('1.0')

# Title
plt.subplots_adjust(top=0.85)
g.fig.suptitle('Immigration & Emigration Per Capita', size=16, weight='demi'); 

Viewing migration through a per capita lens, Germany doesn’t stick out so much anymore relative to other countries.
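A note on units: because the population data is in thousands, dividing raw migrant counts by `pop_in_thous` gives migrants per 1,000 residents rather than per single person. The quick check below uses invented round numbers to show the difference. It does not affect comparisons between countries, only the scale.

```python
# Invented example: 50,000 immigrants into a country of 10 million people
immigrants = 50_000
pop_in_thous = 10_000  # 10 million people, expressed in thousands

# What the division in the notebook produces: immigrants per 1,000 residents
per_thousand = immigrants / pop_in_thous

# Strict per-person figure, for comparison
per_person = immigrants / (pop_in_thous * 1_000)

print(per_thousand, per_person)
```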

Less populous countries like Cyprus, Iceland, Luxembourg, and Malta have much higher rates of migration per capita than the top five most populous countries.

While the most populous countries get the most press coverage for their raw numbers, this data shows the larger impact migration has on less populous countries.

Immigration and emigration per GDP

A less conventional measure is migration per GDP. This loosely communicates migrants as a function of a country’s wealth, and of the government’s potential wealth.

While many studies suggest economic migration expands the economy, asylum seekers and refugees are–at least initially–an economic cost.

First, I’ll make GDP and GDP per capita graphs.

# GDP IN MILLION EUROS DATA
# https://ec.europa.eu/eurostat/cache/metadata/en/nama10_esms.htm
df_gdp = clean_eurostat_excel('nama_10_gdp.xlsx','GDP_million_euro')

df_gdp2 = df_gdp.drop(['2010', '2011', '2012', '2014', '2015', '2016', '2018'],axis=1)

df_gdp2 = df_gdp2.reset_index()

df_gdp2 = df_gdp2.rename(columns = {'index':'country'})

df_gdp2.head()
# Style
sns.set(style='whitegrid')

# Set up PairGrid
g = sns.PairGrid(data=df_gdp2.sort_values('2009', ascending=False),
                x_vars=df_gdp2.columns[1:], y_vars=['country'],
                height=10, aspect=0.3)

# Create stripplot
g.map(sns.stripplot, size=10, orient='h', palette=sns.cubehelix_palette(32, start=0.5, rot=-0.8, reverse=True),
     linewidth=1, edgecolor='w')

# Set x-axis limits on all columns
g.set(xlim=(-150000, 3500000), xlabel="GDP", ylabel="")

# Column titles
titles =['2009', '2013', '2017']

for ax, title in zip(g.axes.flat, titles):

    # Set a different title for each axes
    ax.set(title=title)

    # Make the grid horizontal instead of vertical
    ax.xaxis.grid(False)
    ax.yaxis.grid(True)

sns.despine(left=True, bottom=True)

# Title
plt.subplots_adjust(top=0.85)
g.fig.suptitle('GDP', size=16, weight='demi');

The five wealthiest countries in terms of GDP are the same as the five most populous countries. Let’s see GDP per capita.

# Pivot data to long-form for visual analysis
dfg_gdp = df_to_longform(df_gdp, 'GDP')

# merge population and GDP data
dfg = pd.merge(dfg, dfg_gdp, on=['country', 'year'])

dfg.head()
# Create GDP per capita variable
dfg['GDP_pc'] = dfg['GDP']/dfg['pop_in_thous']

dfg.info()
# Pivot data into wide-form for graphing
df_gdp_pc = dfg.pivot(index='country', columns='year')['GDP_pc'].reset_index()

df_gdp_pc = df_gdp_pc.drop(['2010', '2011', '2012', '2014', '2015', '2016'],axis=1)

df_gdp_pc.head()
# Style
sns.set(style='whitegrid')

# Set up PairGrid
g = sns.PairGrid(data=df_gdp_pc.sort_values('2009', ascending=False),
                x_vars=df_gdp_pc.columns[1:], y_vars=['country'],
                height=10, aspect=0.3)

# Create stripplot
g.map(sns.stripplot, size=10, orient='h', palette=sns.cubehelix_palette(32, start=0.5, rot=-0.8, reverse=True),
     linewidth=1, edgecolor='w')

# Set x-axis limits on all columns
g.set(xlim=(-10, 170), xlabel="GDP Per Capita", ylabel="")

# Column titles
titles =['2009', '2013', '2017']

for ax, title in zip(g.axes.flat, titles):

    # Set a different title for each axes
    ax.set(title=title)

    # Make the grid horizontal instead of vertical
    ax.xaxis.grid(False)
    ax.yaxis.grid(True)

sns.despine(left=True, bottom=True)

# Title
plt.subplots_adjust(top=0.85)
g.fig.suptitle('GDP Per Capita', size=16, weight='demi'); 

The highest-GDP and most populous countries are in the middle ranks when it comes to GDP per capita. Liechtenstein takes the cake by a good margin, followed by the usual suspects: Luxembourg, Norway, Switzerland, and Denmark.

Over time, we can see countries like Switzerland and Ireland have moved up in the GDP per capita ranks–and especially Iceland, coming from far behind.
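As a sketch of what the calculation and pivot above are doing, the example below uses invented figures for two hypothetical countries. Since GDP is in million euros and population is in thousands, the ratio comes out in thousand euros per person, which is consistent with the 0 to 170 axis range used in the chart.

```python
import pandas as pd

# Invented long-form data: GDP in million euros, population in thousands
long = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "year": ["2009", "2017", "2009", "2017"],
    "GDP": [300_000, 360_000, 20_000, 30_000],
    "pop_in_thous": [10_000, 10_000, 500, 500],
})

# Million euros / thousand people = thousand euros per person
long["GDP_pc"] = long["GDP"] / long["pop_in_thous"]

# Back to wide-form: one row per country, one column per year
wide = long.pivot(index="country", columns="year", values="GDP_pc")

print(wide)
```

The much smaller country B ends up with the higher GDP per capita, the same effect that moves small, wealthy countries to the top of the chart.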

# Create variables immigrants/emigrants per GDP
dfg['immigrants_per_GDP'] = dfg['immigrants']/dfg['GDP']
dfg['emigrants_per_GDP'] = dfg['emigrants']/dfg['GDP']

dfg.info()
# Initialize FacetGrid object
g = sns.FacetGrid(dfg, col='country', col_wrap=4, height=1.8,
                  aspect=2).set(xticks=np.arange(0,10,3))

# Create immigration plot - set for immigrants
g.map(plt.plot, 'year', 'immigrants_per_GDP', color='darkorange')
g.map(plt.fill_between, 'year', 'immigrants_per_GDP',  color='darkorange', alpha=0.4)

g.map(plt.plot, 'year', 'emigrants_per_GDP', color='darkgrey')
g.map(plt.fill_between, 'year', 'emigrants_per_GDP',  color='darkgrey', alpha=0.6)

# Facet titles
g.set_titles("{col_name}")
g.set_axis_labels(y_var='')

# Formatting axes
g.set(yticks=[])
g.despine(left=True)

# Legend
color_key = {'Immigrants per GDP': 'darkorange',
            'Emigrants per GDP' : 'darkgrey'}

patches =  [pat.Patch(color=v, label=k) for k,v in color_key.items()]

g.fig.legend(handles=patches, bbox_to_anchor =
             [0.61, 0.93], ncol=2).get_frame().set_edgecolor('1.0')

# Title
plt.subplots_adjust(top=0.85)
g.fig.suptitle('Immigration & Emigration per GDP', size=16, weight='demi'); 

It’s our usual immigration/emigration graph, but this time immigrants and emigrants per GDP.

The top five most populous and highest-GDP countries–Germany, France, the UK, Italy, and Spain–all have relatively few immigrants and emigrants per GDP. As expected, some of the smaller economies such as Cyprus, Latvia, Lithuania, Malta, and Romania have the most immigrants and emigrants per GDP.

This suggests the economies of smaller nations are more strongly affected by migrant flows than those of larger nations.

Choropleth maps of the 2015 migrant crisis

I’m using GeoPandas and Natural Earth GeoJSON data to create a few choropleth maps of the 2015 migrant crisis.

Similarly to the graphs above, I’ll create one showing total immigrants, another showing immigrants per capita, and the last showing immigrants per GDP.

import geopandas

# GEOJSON POLYGON COUNTRY DATA
# https://datahub.io/core/geo-countries
world_map = geopandas.read_file('countries.geojson')

world_map.head()
# Exploring map axes
world_map.plot(figsize=(30,30))
plt.axis([-13,40,30,75]);
# Check if country lists of data and maps match
diff = list(set(countries) - set(world_map['ADMIN']))

diff
# Find outlier
world_map[world_map['ADMIN'].str.startswith('Cz')]

My immigration data has “Czechia” while the map data lists it as “Czech Republic,” so I’ll rename the map data to match.

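# Rename outlier to match data with map
# (select the row by name rather than by its position in the file)
world_map.loc[world_map['ADMIN'] == 'Czech Republic', 'ADMIN'] = 'Czechia'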

world_map[world_map['ADMIN'].str.startswith('Cz')]
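The name check above can be sketched in plain Python. The lists and the `rename_map` dictionary below are hypothetical stand-ins for the Eurostat country list and the map's `ADMIN` column, and `unmatched` is a helper I made up for illustration, not something from the notebook.

```python
# Hypothetical name lists standing in for the Eurostat data and the map's ADMIN column
data_names = ['Austria', 'Czechia', 'Germany']
map_names = ['Austria', 'Czech Republic', 'Germany', 'Brazil']

# Hypothetical mapping from the map's spelling to the data's spelling
rename_map = {'Czech Republic': 'Czechia'}


def unmatched(data_names, map_names, rename_map=None):
    """Return data names with no counterpart in the map, after optional renaming."""
    rename_map = rename_map or {}
    harmonized = {rename_map.get(name, name) for name in map_names}
    return sorted(set(data_names) - harmonized)


before = unmatched(data_names, map_names)              # names missing before the rename
after = unmatched(data_names, map_names, rename_map)   # names missing after the rename
```

Keeping the renames in a dictionary means a second mismatch only needs one more entry, and in pandas the same dictionary can be passed to `Series.replace`.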

Now that I know all the country names from the Eurostat data and the map data match, time to create and merge the datasets for graphing.

# Map data for total immigrants
europe_map_imm = pd.merge(world_map, df_imm_total, left_on=['ADMIN'], right_index=True)

europe_map_imm
# Map data for immigrants per capita
df_imm_pc = dfg.pivot(index='country', columns='year')['immigrants_per_capita']

europe_map_imm_pc = pd.merge(world_map, df_imm_pc, left_on=['ADMIN'], right_index=True)

europe_map_imm_pc.head(1)
# Map data for immigrants per GDP
df_imm_gdp = dfg.pivot(index='country', columns='year')['immigrants_per_GDP']

europe_map_imm_gdp = pd.merge(world_map, df_imm_gdp, left_on=['ADMIN'], right_index=True)

europe_map_imm_gdp.head(1)
from matplotlib import cm
from matplotlib.colors import ListedColormap, LinearSegmentedColormap
from mpl_toolkits.axes_grid1 import make_axes_locatable

# Set figure and axis
fig, ax = plt.subplots(1, figsize=(15,15))
plt.axis([-13,40,30,75])

# Colormap formatting
oranges_middle = cm.get_cmap('Oranges', 512)
Oranges2 = ListedColormap(oranges_middle(np.linspace(0.25, 0.75, 256)))

# Aligning map layers
ax.set_aspect('equal')
ax.axis('off')

# Legend formatting
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.1)

# Plots
world_map.plot(ax=ax, color='silver', edgecolor='white')

europe_map_imm.plot(ax=ax, column='2015', cmap=Oranges2, legend=True, cax=cax)

# Title
ax.set_title('European Immigration 2015', fontdict={'fontsize':'28', 'fontweight':'3'});
# Set figure and axis
fig, ax = plt.subplots(1, figsize=(15,15))
plt.axis([-13,40,30,75])

# Colormap formatting
oranges_middle = cm.get_cmap('Oranges', 512)
Oranges2 = ListedColormap(oranges_middle(np.linspace(0.25, 0.75, 256)))

# Aligning map layers
ax.set_aspect('equal')
ax.axis('off')

# Legend formatting
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.1)

# Plots
world_map.plot(ax=ax, color='silver', edgecolor='white')

europe_map_imm_pc.plot(ax=ax, column='2015', cmap=Oranges2, legend=True, cax=cax)

# Title
ax.set_title('Immigrants per Capita 2015', fontdict={'fontsize':'28', 'fontweight':'3'});
# Set figure and axis
fig, ax = plt.subplots(1, figsize=(15,15))
plt.axis([-13,40,30,75])

# Colormap formatting
oranges_middle = cm.get_cmap('Oranges', 512)
Oranges2 = ListedColormap(oranges_middle(np.linspace(0.25, 0.75, 256)))

# Aligning map layers
ax.set_aspect('equal')
ax.axis('off')

# Legend formatting
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.1)

# Plots
world_map.plot(ax=ax, color='silver', edgecolor='white')
# Use the per-GDP map data (previously this reused the per-capita variable name)
europe_map_imm_gdp.plot(ax=ax, column='2015', cmap=Oranges2, legend=True, cax=cax)

# Title
ax.set_title('Immigrants per GDP 2015', fontdict={'fontsize':'28', 'fontweight':'3'});

The three maps emphasize different ways to view the 2015 migrant crisis.

While Germany’s deep orange stands out in the raw immigration map, the second map shows the effect of population and geography. Many countries farther to the east and south, which served as the gateways into Europe, experienced a higher influx of immigrants per capita in 2015. The final map, showing immigrants per GDP in 2015, is colored similarly to the per capita map.

With the wealthier and more populous European nations farther into the European peninsula, it’s likely eastern European countries experienced greater social and economic impacts than their larger, wealthier counterparts.

Asylum seekers

Here’s how Eurostat defines asylum seekers and refugees:

Asylum seeker: First-time asylum applications are country-specific and imply no time limit. Therefore, an asylum seeker can apply for first time in a given country and afterward again as first-time applicant in any other country. If an asylum seeker lodge again an application in the same country after any period of time, (s)he is not considered again a first-time applicant.

Refugee:  person granted refugee status (as defined in Art.2(e) of Directive 2011/95/EC within the meaning of Art.1 of the Geneva Convention relating to the Status of Refugees of 28 July 1951, as amended by the New York Protocol of 31 January 1967) or person granted subsidiary protection (as defined in Art.2(g) of Directive 2011/95/EC and person covered by a decision granting authorisation to stay for humanitarian reasons under national law concerning international protection.

International Migration statistics, Eurostat
# ASYLUM APPLICATIONS DATA 
# https://ec.europa.eu/eurostat/cache/metadata/en/migr_asyapp_esms.htm
df_asylum = clean_eurostat_excel('migr_asylum_apps.xlsx', 'Data')

dfg_asylum = df_to_longform(df_asylum, 'asylum_apps')

# Explore remaining missing values
dfg_asylum[dfg_asylum.isnull().sum(axis=1) > 0]

It’s always nice to have no nulls.
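For readers less familiar with the null check above, here is a small sketch on a hypothetical long-form frame (the rows are made up). It shows that counting nulls per row and comparing to zero selects the same rows as asking whether any value in the row is null.

```python
import numpy as np
import pandas as pd

# Hypothetical long-form frame with one missing value
toy = pd.DataFrame({
    'country': ['A', 'A', 'B'],
    'year': ['2014', '2015', '2015'],
    'asylum_apps': [100.0, np.nan, 250.0],
})

# Filter used in the notebook: rows where the count of nulls is above zero
rows_by_count = toy[toy.isnull().sum(axis=1) > 0]

# Equivalent filter: rows where any column is null
rows_by_any = toy[toy.isnull().any(axis=1)]
```

An empty result from either filter is what "no nulls" looks like.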

df_asylum.head()
# ASYLUM DECISIONS, OUTGOING
# https://ec.europa.eu/eurostat/cache/metadata/en/migr_dub_esms.htm
df_asyl_accepted_outgoing = clean_eurostat_excel('migr_dubdo.xlsx', 'Accepted - outgoing')

dfg_asyl_accepted_outgoing = df_to_longform(df_asyl_accepted_outgoing, 'asylum_accepted_outgoing')

# ASYLUM DECISIONS, INCOMING
# https://ec.europa.eu/eurostat/cache/metadata/en/migr_dub_esms.htm
df_asyl_accepted_incoming = clean_eurostat_excel('migr_dubdi.xlsx', 'Accepted - incoming')

dfg_asyl_accepted_incoming = df_to_longform(df_asyl_accepted_incoming, 'asylum_accepted_incoming')

# Explore remaining missing values
print(dfg_asyl_accepted_incoming[dfg_asyl_accepted_incoming.isnull().sum(axis=1) > 0])
print(dfg_asyl_accepted_outgoing[dfg_asyl_accepted_outgoing.isnull().sum(axis=1) > 0])
# Merge multiple into graphing dataframe
dfs = [dfg, dfg_asylum, dfg_asyl_accepted_outgoing, dfg_asyl_accepted_incoming]

dfg = reduce(lambda left, right: pd.merge(left, right, on=['country', 'year'],
                                        how='inner'), dfs)     

dfg.info()
dfg.head(1)
# Initialize FacetGrid object
g = sns.FacetGrid(dfg, col='country', col_wrap=4, height=1.8,
                  aspect=2).set(xticks=np.arange(0,10,3))


# Create immigration plot - set for immigrants
g.map(plt.plot, 'year', 'immigrants', color='darkorange')
g.map(plt.fill_between, 'year', 'immigrants',  color='darkorange', alpha=0.4)

g.map(plt.plot, 'year', 'asylum_apps', color='darkcyan')
g.map(plt.fill_between, 'year', 'asylum_apps',  color='darkcyan', alpha=0.5)

# Facet titles
for ax in g.axes:
    g.set_titles("{col_name}")
    g.set_axis_labels(y_var= '')

# Formatting axes
g.set(yticks=[])
g.despine(left=True)

# Legend
color_key = {'Immigrants': 'darkorange',
            'Asylum applications' : 'darkcyan'}

patches =  [pat.Patch(color=v, label=k) for k,v in color_key.items()]

g.fig.legend(handles=patches, bbox_to_anchor =
             [0.56, 0.93], ncol=2).get_frame().set_edgecolor('1.0')

# Title
plt.subplots_adjust(top=0.85)
g.fig.suptitle('Immigrants & Asylum Seekers', size=16, weight='demi');

Here, we can see the difference between the number of immigrants and asylum seekers. Keep in mind these are only the asylum applications, not the decisions on those applications.
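The chart is drawn from the merged `dfg`, which was built by folding several frames together with an inner join. As a rough sketch of what that choice implies, the hypothetical frames below (made-up values, with `merge_all` as an illustrative helper) show that an inner join keeps only the country-year pairs present in every frame, while an outer join keeps all pairs and leaves gaps as NaN.

```python
from functools import reduce

import pandas as pd

# Hypothetical long-form frames; the third one lacks the ('B', '2015') pair
frames = [
    pd.DataFrame({'country': ['A', 'B'], 'year': ['2015', '2015'], 'immigrants': [10, 20]}),
    pd.DataFrame({'country': ['A', 'B'], 'year': ['2015', '2015'], 'asylum_apps': [1, 2]}),
    pd.DataFrame({'country': ['A'], 'year': ['2015'], 'asylum_accepted_incoming': [5]}),
]


def merge_all(frames, how):
    """Fold a list of frames into one by merging on country and year."""
    return reduce(
        lambda left, right: pd.merge(left, right, on=['country', 'year'], how=how),
        frames,
    )


inner = merge_all(frames, 'inner')   # keeps only keys present in every frame
outer = merge_all(frames, 'outer')   # keeps every key, filling gaps with NaN
```

If a country seems to be missing from a facet grid, comparing the inner and outer results is a quick way to see which source dropped it.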

# Initialize FacetGrid object
g = sns.FacetGrid(dfg, col='country', col_wrap=4, height=2.2,
                  aspect=1.6).set(xticks=np.arange(0,10,3))

# Plot asylum applications, then accepted incoming applications on top
g.map(plt.plot, 'year', 'asylum_apps', color='darkcyan')
g.map(plt.fill_between, 'year', 'asylum_apps',  color='darkcyan', alpha=0.5)

g.map(plt.plot, 'year', 'asylum_accepted_incoming', color='indianred')
g.map(plt.fill_between, 'year', 'asylum_accepted_incoming',  color='indianred', alpha=0.5)


# Facet titles
for ax in g.axes:
    g.set_titles("{col_name}")
    g.set_axis_labels(y_var= '')

# Formatting axes
g.set(yticks=[])
g.despine(left=True)

# Legend
color_key = {'Asylum applications' : 'darkcyan',
            'Accepted incoming applications': 'indianred'}

patches =  [pat.Patch(color=v, label=k) for k,v in color_key.items()]

g.fig.legend(handles=patches, bbox_to_anchor =
             [0.61, 0.93], ncol=2).get_frame().set_edgecolor('1.0')

# Title
plt.subplots_adjust(top=0.85)
g.fig.suptitle('Asylum Applications: Total vs. Accepted', size=16, weight='demi');

Carrying on the blue asylum applications from the previous graph into the graph above, we can now see how many of those applications were accepted.

There’s data for both “accepted outgoing Dublin requests” and “accepted incoming Dublin requests.”

An accepted “outgoing” request means one European nation received the asylum application, asked another country to take responsibility for the applicant, and had that request accepted, as part of the Dublin Regulation:

The Dublin Regulation aims to “determine rapidly the Member State responsible [for an asylum claim]”[1] and provides for the transfer of an asylum seeker to that Member State.

Dublin Regulation, Wikipedia

An accepted “incoming” decision on an asylum application is the other end of the stick–the receiving end.

In short, “outgoing” implies that the asylum seeker is no longer that country’s responsibility, even though they processed the application, while “incoming” implies that the asylum seeker is their responsibility.

I chose “incoming” for the above graph (red) to show which countries were ultimately responsible for taking asylum seekers in under these requests. As you can see, accepted incoming requests are small relative to the number of asylum applications.

Let’s look at this same data summed up across the decade to see how many asylum seekers each country took in.

# Group by country and year for graphing
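# numeric_only=True keeps string columns (such as year or country) out of the sums
dfg_grouped_country = dfg.groupby(['country']).sum(numeric_only=True).reset_index()

dfg_grouped_year = dfg.groupby(['year']).sum(numeric_only=True).reset_index()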
# Split by-country dataframe for clearer graphing
dfg_grouped_country['250k+'] =  dfg_grouped_country['asylum_apps'] > 250000
dfg_grouped_country['50k+'] =  dfg_grouped_country['asylum_apps'] > 50000


dfg_grouped_country.sort_values('asylum_apps', ascending=False, inplace=True)
# Create sub-dataframes
dfg_grouped_country_lar = dfg_grouped_country.loc[dfg_grouped_country['250k+']]
dfg_grouped_country_med = dfg_grouped_country.loc[dfg_grouped_country['50k+'] &
                                                  ~dfg_grouped_country['250k+']]
dfg_grouped_country_sma = dfg_grouped_country.loc[~dfg_grouped_country['50k+']]

dfg_grouped_country_sma.head()
# Initialize figure
fig, ax = plt.subplots(figsize=(8, 13))

# Plot asylum apps
sns.barplot(data = dfg_grouped_country.sort_values('asylum_apps', ascending=False), x='asylum_apps', y='country',
            label='Asylum applications', color='darkcyan', ci=None)

# Plot asylum apps accepted, incoming
sns.barplot(data = dfg_grouped_country.sort_values('asylum_apps', ascending=False), x='asylum_accepted_incoming', y='country',
            label='Accepted incoming applications', color="indianred", ci=None)

# Legend
ax.legend(bbox_to_anchor = [0.5, 0.95])

# Axis label
ax.set(ylabel='',
       xlabel='Applications')
sns.despine(left=True, bottom=True)

# Title
fig.subplots_adjust(top=0.93)
fig.suptitle('Asylum Applications: Total vs. Accepted Incoming (2009 - 2017)', size=14, weight='demi');

Despite Italy’s relatively low migration numbers, they took in the most incoming asylum applicants over the decade.

Let’s zoom in on the smaller countries.

# Set style (before creating the figure so it applies to the new axes)
sns.set_style('whitegrid')

# Set figure and axes
fig, axs = plt.subplots(3, figsize=(10, 15), gridspec_kw={'height_ratios': [1, 1.43, 2.14]})

# First subplot (>250k asylum apps)
sns.barplot(data = dfg_grouped_country_lar, y='country', x='asylum_apps',
            label='Total asylum applications', color="darkcyan", ci=None, ax=axs[0])

sns.barplot(data = dfg_grouped_country_lar, y='country', x='asylum_accepted_incoming',
            label='Accepted incoming applications', color="indianred", ci=None, ax=axs[0])

# Second subplot(<250k asylum apps)
sns.barplot(data = dfg_grouped_country_med, y='country', x='asylum_apps',
            label='Total asylum applications', color="darkcyan", ci=None, ax=axs[1])

sns.barplot(data = dfg_grouped_country_med, y='country', x='asylum_accepted_incoming',
            label='Accepted incoming applications', color="indianred", ci=None, ax=axs[1])

# Third subplot(<50k asylum apps)
sns.barplot(data = dfg_grouped_country_sma, y='country', x='asylum_apps',
            label='Total asylum applications', color="darkcyan", ci=None, ax=axs[2])

sns.barplot(data = dfg_grouped_country_sma, y='country', x='asylum_accepted_incoming',
            label='Accepted incoming applications', color="indianred", ci=None, ax=axs[2])


# Formatting
fig.tight_layout()
axs[0].set(xlabel='', ylabel='')
axs[1].set(xlabel='', ylabel='')
axs[2].set(xlabel='', ylabel='')

# Legend
axs[2].legend(bbox_to_anchor = [0.97, 0.13])

# Title
plt.subplots_adjust(top=0.95)
fig.suptitle('Asylum Applications: Total vs. Accepted Incoming (2009 - 2017)', size=16, weight='demi');

It appears that while larger countries processed more applications than smaller countries, approved asylum seekers were dispersed.
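The three size bands used for the subplots above were built from two boolean columns. As an alternative sketch, and not what the notebook does, the same bands can be assigned in one step with `pd.cut`. The totals below are hypothetical, and the band labels are names I chose for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical totals for illustration only
toy = pd.DataFrame({
    'country': ['A', 'B', 'C', 'D'],
    'asylum_apps': [900_000, 250_000, 60_000, 4_000],
})

# Right-closed bins reproduce the notebook's rules:
# small <= 50k, medium in (50k, 250k], large > 250k
toy['size_band'] = pd.cut(
    toy['asylum_apps'],
    bins=[-np.inf, 50_000, 250_000, np.inf],
    labels=['small', 'medium', 'large'],
)

bands = dict(zip(toy['country'], toy['size_band'].astype(str)))
```

Note that a country with exactly 250,000 applications lands in the medium band, matching the strict `> 250000` comparison in the original code.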

Let’s look at this same data, except aggregate it across countries to see how many accepted incoming applications there were per year.

# Initialize figure
fig, ax = plt.subplots(figsize=(15, 10))

# Plot asylum apps
sns.set_color_codes("pastel")
sns.barplot(data = dfg_grouped_year, x='year', y='asylum_apps',
            label="Total asylum applications", color="darkcyan", ci=None)

# Plot asylum apps accepted, incoming
sns.set_color_codes('muted')
sns.barplot(data = dfg_grouped_year, x='year', y='asylum_accepted_incoming',
            label="Accepted incoming applications", color="indianred", ci=None)

# Legend
ax.legend(bbox_to_anchor = [0.25, 0.95]) #loc='upper left'

# Axis labels (years run along x, application counts along y)
ax.set(ylabel='Applications', xlabel='')
sns.despine(left=True, bottom=True)

# Title
plt.subplots_adjust(top=0.95)
fig.suptitle('Accepted (Incoming) & Total Asylum Applications', size=20, weight='demi');

The chart above shows total asylum applications and accepted incoming applications for our thirty-two countries.

It’s notable that asylum applications have dropped fairly sharply since 2015, while the number of accepted incoming asylum applications has not dropped as sharply. Still, it’s clear that in the years leading up to 2015, applications grew faster than acceptances.
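One way to put a number on that comparison is the share of accepted incoming applications among all applications per year. The sketch below uses hypothetical country-year rows (made-up values); the column names follow the notebook, while `accepted_share` is a name I introduced.

```python
import pandas as pd

# Hypothetical country-year rows for illustration only
toy = pd.DataFrame({
    'country': ['A', 'B', 'A', 'B'],
    'year': ['2014', '2014', '2015', '2015'],
    'asylum_apps': [100, 300, 400, 600],
    'asylum_accepted_incoming': [10, 30, 20, 30],
})

# Sum only the two count columns across countries for each year
by_year = toy.groupby('year')[['asylum_apps', 'asylum_accepted_incoming']].sum()

# Accepted incoming applications as a share of all applications
by_year['accepted_share'] = by_year['asylum_accepted_incoming'] / by_year['asylum_apps']
```

A falling share alongside rising applications would be the numerical counterpart of the widening gap between the two bars.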

# Initialize FacetGrid object
g = sns.FacetGrid(dfg, col='country', col_wrap=4, height=2.2,
                  aspect=1.6).set(xticks=np.arange(0,10,3))

# Plot accepted outgoing applications, then accepted incoming applications
g.map(plt.plot, 'year', 'asylum_accepted_outgoing', color='mediumpurple')
g.map(plt.fill_between, 'year', 'asylum_accepted_outgoing',  color='mediumpurple', alpha=0.5)

g.map(plt.plot, 'year', 'asylum_accepted_incoming', color='cadetblue')
g.map(plt.fill_between, 'year', 'asylum_accepted_incoming',  color='cadetblue', alpha=0.5)


# Facet titles
for ax in g.axes:
    g.set_titles("{col_name}")
    g.set_axis_labels(y_var= '')

# Formatting axes
g.set(yticks=[])
g.despine(left=True)

# Legend
color_key = {'Accepted outgoing applications' : 'mediumpurple',
            'Accepted incoming applications': 'cadetblue'}

patches =  [pat.Patch(color=v, label=k) for k,v in color_key.items()]

g.fig.legend(handles=patches, bbox_to_anchor =
             [0.645, 0.93], ncol=2).get_frame().set_edgecolor('1.0')

# Title
plt.subplots_adjust(top=0.85)
g.fig.suptitle('Accepted Applications: Outgoing vs Incoming', size=16, weight='demi');

Here’s our first look at the “outgoing” side: the cases where one country asked another to take responsibility for an asylum seeker via the Dublin Regulation, and the other country accepted.

Unsurprisingly, Germany–the number one asylum application processor–transferred many asylum seekers to other European countries. Some countries, like France and Switzerland, have been transferring more approved applications out than taking them in, while other countries, such as Poland and Italy, have been taking more in than transferring out.
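The in-versus-out comparison can be reduced to a single net figure per country. This is a minimal sketch on hypothetical totals (made-up values); `net_incoming` is a column name I introduced and does not appear in the notebook.

```python
import pandas as pd

# Hypothetical totals for illustration only
toy = pd.DataFrame({
    'country': ['A', 'B', 'C'],
    'asylum_accepted_outgoing': [900, 100, 50],
    'asylum_accepted_incoming': [300, 400, 50],
})

# Positive values mean more accepted incoming than accepted outgoing
toy['net_incoming'] = toy['asylum_accepted_incoming'] - toy['asylum_accepted_outgoing']

net_receivers = toy.loc[toy['net_incoming'] > 0, 'country'].tolist()
net_senders = toy.loc[toy['net_incoming'] < 0, 'country'].tolist()
```

Applied to the real data, countries like Poland and Italy would be expected among the net receivers, and France and Switzerland among the net senders.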

Let’s take a look at accepted incoming asylum applications per capita.

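# Create new variables for graphing
# Assuming pop_in_thous is population in thousands (as the name suggests),
# the first ratio works out to accepted incoming applications per million inhabitants
dfg['asylum_accepted_incoming_per_capita'] = dfg['asylum_accepted_incoming']/dfg['pop_in_thous']*1000
dfg['asylum_accepted_incoming_per_GDP'] = dfg['asylum_accepted_incoming']/dfg['GDP']*1000000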

dfg.head()
# Pivot data into wide-form for graphing
df_asylum_inc_pc = dfg.pivot(index='country', columns='year')['asylum_accepted_incoming_per_capita'].reset_index()
df_asylum_inc_pgdp = dfg.pivot(index='country', columns='year')['asylum_accepted_incoming_per_GDP'].reset_index()

df_asylum_inc_pc.head()
df_asylum_inc_pc = df_asylum_inc_pc.drop(['2009', '2010', '2011', '2012'],axis=1)
df_asylum_inc_pgdp = df_asylum_inc_pgdp.drop(['2009', '2010', '2011', '2012'],axis=1)

df_asylum_inc_pc.max()
# Style
sns.set(style='whitegrid')

# Set up PairGrid
g = sns.PairGrid(data=df_asylum_inc_pc.sort_values('2015', ascending=False),
                x_vars=df_asylum_inc_pc.columns[1:], y_vars=['country'],
                height=10, aspect=0.3)

# Create stripplot
g.map(sns.stripplot, size=10, orient='h', palette=sns.cubehelix_palette(32, start=0.5, rot=-0.8, reverse=True),
     linewidth=1, edgecolor='w')

# Set x-axis limits on all columns
g.set(xlim=(-100, 1900), xlabel="Accepted per capita", ylabel="")

# Column titles
titles =['2013', '2014', '2015', '2016', '2017']

for ax, title in zip(g.axes.flat, titles):

    # Set a different title for each axes
    ax.set(title=title)

    # Make the grid horizontal instead of vertical
    ax.xaxis.grid(False)
    ax.yaxis.grid(True)

sns.despine(left=True, bottom=True)

# Title
plt.subplots_adjust(top=0.85)
g.fig.suptitle('Accepted Incoming Asylum Applications per Capita', size=16, weight='demi')
plt.figtext(0.5,0.95, 'Countries in descending order of accepted applications per capita in 2015', ha="center", va="top", fontsize=14, color='grey');

Since the migrant crisis was in 2015, I sorted the data in descending order of that year. We can look to the two years before and after to see how many asylum seekers each country took in relative to its population.

Malta, a tiny country and the top receiver of incoming asylum applicants per capita in the year of the crisis, actually took in fewer incoming asylum seekers relative to its population in 2015 than in the two previous years. Bulgaria and Croatia took in many more asylum seekers per capita in 2016 than in the previous years.

Italy, the country that accepted the most incoming asylum applicants over the whole period, ranks third for incoming asylum seekers relative to its population. Notably, out of the top five most populous and highest-GDP countries, Italy is the only one to break the top ten on this graph, followed by Germany in fourteenth place.

Understandably, larger countries’ rates appear to change little over the five years compared to smaller countries’ rates since each accepted asylum seeker has a larger impact per capita in less populous countries.
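A quick worked example shows why the same number of accepted applicants moves a small country's rate so much more. The function below mirrors the notebook's scaling under the assumption that `pop_in_thous` is population in thousands; the function name and both populations are hypothetical.

```python
def accepted_per_million(accepted, pop_in_thous):
    """Accepted applications per million inhabitants.

    Mirrors accepted / pop_in_thous * 1000, assuming pop_in_thous
    is the population expressed in thousands.
    """
    return accepted / pop_in_thous * 1000


# Hypothetical countries receiving the same 500 accepted applicants
small = accepted_per_million(500, 500)       # population of 500,000
large = accepted_per_million(500, 80_000)    # population of 80 million
```

Here the small country's rate is 160 times the large country's, so a swing of a few hundred people is very visible for the former and nearly invisible for the latter.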

Taking in asylum seekers is a short-run economic burden for countries. Let’s look at accepted incoming asylum seekers per GDP.

# Style
sns.set(style='whitegrid')

# Set up PairGrid
g = sns.PairGrid(data=df_asylum_inc_pgdp.sort_values('2015', ascending=False),
                x_vars=df_asylum_inc_pgdp.columns[1:], y_vars=['country'],
                height=10, aspect=0.3)

# Create stripplot
g.map(sns.stripplot, size=10, orient='h', palette=sns.cubehelix_palette(32, start=0.5, rot=-0.8, reverse=True),
     linewidth=1, edgecolor='w')

# Set x-axis limits on all columns
g.set(xlim=(-3000, 98000), xlabel="Accepted per GDP", ylabel="")

# Column titles
titles =['2013', '2014', '2015', '2016', '2017']

for ax, title in zip(g.axes.flat, titles):

    # Set a different title for each axes
    ax.set(title=title)

    # Make the grid horizontal instead of vertical
    ax.xaxis.grid(False)
    ax.yaxis.grid(True)

sns.despine(left=True, bottom=True)

# Title
plt.subplots_adjust(top=0.85)
g.fig.suptitle('Accepted Incoming Asylum Applications per GDP', size=16, weight='demi')
plt.figtext(0.5,0.95, 'Countries in descending order of accepted applications per GDP in 2015', ha="center", va="top", fontsize=14, color='grey');

Like in the previous graph, you can see the data is in descending order based on 2015.
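The ordering comes from pivoting the long-form data into one column per year and sorting on the 2015 column. Here is a minimal sketch of that step on hypothetical values (the numbers are made up; the column name follows the notebook).

```python
import pandas as pd

# Hypothetical long-form rows for illustration only
toy = pd.DataFrame({
    'country': ['A', 'A', 'B', 'B', 'C', 'C'],
    'year': ['2014', '2015'] * 3,
    'asylum_accepted_incoming_per_GDP': [5.0, 1.0, 2.0, 9.0, 3.0, 4.0],
})

# One row per country, one column per year
wide = toy.pivot(index='country', columns='year',
                 values='asylum_accepted_incoming_per_GDP').reset_index()
wide.columns.name = None

# Sort on the crisis year so the top row is the highest 2015 value
ordered = wide.sort_values('2015', ascending=False)
order = ordered['country'].tolist()
```

Because the row order is fixed by 2015 alone, a country can sit high on the chart even if its other years are modest, as country B does here.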

Again, Italy is the only country out of the top five most populous and highest-GDP countries that ranks in the top ten. The UK is notably third-to-last: one of the largest and wealthiest countries, yet among those that accepted the fewest asylum seekers relative to GDP.

Like with accepted incoming asylum seekers per capita, we can see larger changes across the years for smaller nations.

Immigration policy, enforcement, and effectiveness

Here, we’ll examine Eurostat data on third country nationals (TCN) found to be illegally present, leave orders issued to them, and how effective those leave orders were. We’ll also look at data that shows how many people were refused at border crossings.

Third country nationals: Any person who is not a citizen of the Union within the meaning of Article 17 (1) of the Treaty, including stateless persons (see Art. 2.1 (i) of the Council Regulation (EC) no 862/2007).

International Migration statistics, Eurostat
# THIRD COUNTRY NATIONALS (TCN) ILLEGALLY PRESENT
# https://ec.europa.eu/eurostat/cache/metadata/en/migr_eil_esms.htm
df_tcn_illegal_pres = clean_eurostat_excel('migr_illegally_present.xlsx', 'Data')

dfg_tcn_illegal_pres = df_to_longform(df_tcn_illegal_pres, 'illegally_pres')

dfg_tcn_illegal_pres.head(1)
# TCN ILLEGALLY PRESENT GIVEN LEAVE ORDERS DATA
# https://ec.europa.eu/eurostat/cache/metadata/en/migr_eil_esms.htm
df_tcn_leave_order = clean_eurostat_excel('migr_leave_order.xlsx', 'migr_eiord')

dfg_tcn_leave_order = df_to_longform(df_tcn_leave_order, 'leave_order')

dfg_tcn_leave_order.head(1)
# TCN REFUSED AT BORDER CROSSINGS DATA 
# https://ec.europa.eu/eurostat/cache/metadata/en/migr_eil_esms.htm
df_tcn_refused = clean_eurostat_excel('migr_refused_entry.xlsx', 'migr_eirfs')

dfg_tcn_refused = df_to_longform(df_tcn_refused, 'refused_border')

dfg_tcn_refused.head(1)
# TCN ILLEGALLY PRESENT THAT LEFT COUNTRY AFTER LEAVE ORDER DATA
# https://ec.europa.eu/eurostat/cache/metadata/en/migr_eil_esms.htm
df_tcn_returned = clean_eurostat_excel('migr_eirtn.xlsx', 'total_returned')

dfg_tcn_returned = df_to_longform(df_tcn_returned, 'illegal_returned')

dfg_tcn_returned.head(1)
# TCN ILLEGALLY PRESENT THAT LEFT COUNTRY AND EU AFTER LEAVE ORDER DATA
# https://ec.europa.eu/eurostat/cache/metadata/en/migr_eil_esms.htm
df_tcn_returned_third = clean_eurostat_excel('migr_eirtn.xlsx', 'returned_third_country')

dfg_tcn_returned_third = df_to_longform(df_tcn_returned_third, 'illegal_returned_thirdcountry')

dfg_tcn_returned_third.head(1)
# Merge all into graphing dataframe
dfs = [dfg, dfg_tcn_illegal_pres, dfg_tcn_leave_order, dfg_tcn_refused,
       dfg_tcn_returned, dfg_tcn_returned_third]

dfg = reduce(lambda left, right: pd.merge(left, right, on=['country', 'year'],
                                        how='inner'), dfs)     

dfg.info()
dfg.head(1)
# Initialize FacetGrid object
g = sns.FacetGrid(dfg, col='country', col_wrap=4, height=1.8,
                  aspect=2).set(xticks=np.arange(0,10,3))


# Create immigration plot - set for immigrants
g.map(plt.plot, 'year', 'immigrants', color='darkorange')
g.map(plt.fill_between, 'year', 'immigrants',  color='darkorange', alpha=0.4)

g.map(plt.plot, 'year', 'refused_border', color='olivedrab')
g.map(plt.fill_between, 'year', 'refused_border',  color='olivedrab', alpha=0.6)

# Facet titles
for ax in g.axes:
    g.set_titles("{col_name}")
    g.set_axis_labels(y_var= '')

# Formatting axes
g.set(yticks=[])
g.despine(left=True)

# Legend
color_key = {'Total immigrants': 'darkorange',
            'Third country nationals refused at border' : 'olivedrab'}

patches =  [pat.Patch(color=v, label=k) for k,v in color_key.items()]

g.fig.legend(handles=patches, bbox_to_anchor =
             [0.55, 0.93], ncol=2).get_frame().set_edgecolor('1.0')

# Title
plt.subplots_adjust(top=0.85)
g.fig.suptitle('Total Immigrants vs. Third Country Nationals Refused at Border', size=16, weight='demi');

Here’s our same orange ‘total immigration’ data overlaid with the volume of third country nationals refused entry at border crossings.

Third country nationals refused entry at the external border:
Third country nationals formally refused permission to enter the territory of a Member State (see Art. 2.1 (q) and 5.1(a) of the Council Regulation (EC) no 862/2007). The external border is defined as in the Schengen Borders Code (Council Regulation (EC) No 562/2006, more details on Article 2.2). For countries which are not in the Schengen area, the external border is the same as the international border. The grounds for refusal refer to the Annex V part B of the Schengen Border Code, which is an administrative document in use in most of the Member States.


Each person is counted only once within the reference period, irrespective of the number of refusals issued to the same person.

International Migration statistics, Eurostat

Besides a slight uptick in France in the last couple years of the decade and Spain’s consistently high border refusals, these numbers look pretty low across the board.

I am curious as to why Spain’s border refusals are so much higher. I suspect that it has something to do with Spain’s geography–namely its coastline and southern border near Africa. According to data from Frontex, Spain’s shores received twice as many immigrants as Italy and a similar number as Greece in 2018.

Now, let’s compare total immigrants to illegally present third country nationals.

# Initialize FacetGrid object
g = sns.FacetGrid(dfg, col='country', col_wrap=4, height=1.8,
                  aspect=2).set(xticks=np.arange(0,10,3))


# Create immigration plot - set for immigrants
g.map(plt.plot, 'year', 'immigrants', color='darkorange')
g.map(plt.fill_between, 'year', 'immigrants',  color='darkorange', alpha=0.4)

g.map(plt.plot, 'year', 'illegally_pres', color='lightseagreen')
g.map(plt.fill_between, 'year', 'illegally_pres',  color='lightseagreen', alpha=0.6)

# Facet titles
for ax in g.axes:
    g.set_titles("{col_name}")
    g.set_axis_labels(y_var= '')

# Formatting axes
g.set(yticks=[])
g.despine(left=True)

# Legend
color_key = {'Total immigrants': 'darkorange',
            'Illegally present third country nationals' : 'lightseagreen'}

patches =  [pat.Patch(color=v, label=k) for k,v in color_key.items()]

g.fig.legend(handles=patches, bbox_to_anchor =
             [0.55, 0.93], ncol=2).get_frame().set_edgecolor('1.0')

# Title
plt.subplots_adjust(top=0.85)
g.fig.suptitle('Total Immigrants vs. Illegally Present Third Country Nationals', size=16, weight='demi');

Third country nationals found to be illegally present:
Third country nationals who are detected by Member States’ authorities and have been determined to be illegally present under national laws relating to immigration (see Art. 2.1 (r) and 5.1(b) of the Council Regulation (EC) no 862/2007). This category relates to persons who have been found to have entered illegally (for example by avoiding immigration controls or by employing a fraudulent document) and those who may have entered legitimately but have subsequently remained on an illegal basis (for example by overstaying their permission to remain or by taking unauthorised employment).

Only persons who are apprehended or otherwise come to the attention of national immigration authorities are recorded in these statistics. These are not intended to be a measure of the total number of persons who are present in the country on an unauthorised basis.

Each person is counted only once within the reference period.

International Migration statistics, Eurostat

As noted above, the statistics of unauthorized people in the country only include the unauthorized people that the country knows about. It’s not meant to be an educated guess at the actual total number of unauthorized people, which is likely larger.

Greece and Hungary stick out on this chart for their 2015 spikes. Clearly they received many illegally present TCNs as people fled to Europe during the migrant crisis. Since their total immigration numbers are relatively low and stable over the same time period, this suggests many people traveled through Greece and Hungary rather than immigrating there.

# Initialize FacetGrid object
g = sns.FacetGrid(dfg, col='country', col_wrap=4, height=1.8,
                  aspect=2).set(xticks=np.arange(0,10,3))


# Plot asylum applications, then illegally present third country nationals
g.map(plt.plot, 'year', 'asylum_apps', color='goldenrod')
g.map(plt.fill_between, 'year', 'asylum_apps',  color='goldenrod', alpha=0.4)

g.map(plt.plot, 'year', 'illegally_pres', color='lightseagreen')
g.map(plt.fill_between, 'year', 'illegally_pres',  color='lightseagreen', alpha=0.6)

# Facet titles
for ax in g.axes:
    g.set_titles("{col_name}")
    g.set_axis_labels(y_var= '')

# Formatting axes
g.set(yticks=[])
g.despine(left=True)

# Legend
color_key = {'Asylum applications': 'goldenrod',
            'Illegally present third country nationals' : 'lightseagreen'}

patches =  [pat.Patch(color=v, label=k) for k,v in color_key.items()]

g.fig.legend(handles=patches, bbox_to_anchor =
             [0.57, 0.93], ncol=2).get_frame().set_edgecolor('1.0')

# Title
plt.subplots_adjust(top=0.85)
g.fig.suptitle('Asylum Applications vs. Illegally Present Third Country Nationals', size=16, weight='demi');

Carrying on the same turquoise data from the previous chart, we can visually compare illegally present TCNs to asylum applications.

Again, this supports the idea that many people were traveling through Greece and Hungary, as they had more illegally present TCNs than asylum applications.

Germany, Sweden, and Italy stand out for their higher asylum applications relative to illegally present TCNs in the last few years of the decade.
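The visual comparison can be turned into a rough indicator: the ratio of illegally present TCNs to asylum applications. The sketch below uses hypothetical counts, and the threshold of 1 is my own simplification rather than an established measure; `present_to_apps` is a column name I introduced.

```python
import pandas as pd

# Hypothetical yearly counts for illustration only
toy = pd.DataFrame({
    'country': ['A', 'B', 'C'],
    'asylum_apps': [1_000, 50_000, 0],
    'illegally_pres': [9_000, 10_000, 100],
})

# Avoid division by zero by treating zero applications as missing
apps = toy['asylum_apps'].where(toy['asylum_apps'] > 0)
toy['present_to_apps'] = toy['illegally_pres'] / apps

# More detections than applications hints at transit rather than settlement
likely_transit = toy.loc[toy['present_to_apps'] > 1, 'country'].tolist()
```

Since the detection figures only cover people who came to the authorities' attention, a ratio like this is best read as a hint and not as a measurement.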

# Initialize FacetGrid object
g = sns.FacetGrid(dfg, col='country', col_wrap=4, height=1.8,
                  aspect=2).set(xticks=np.arange(0,10,3))


# Plot leave orders, then illegally present third country nationals
g.map(plt.plot, 'year', 'leave_order', color='slateblue')
g.map(plt.fill_between, 'year', 'leave_order',  color='slateblue', alpha=0.5)

g.map(plt.plot, 'year', 'illegally_pres', color='lightseagreen')
g.map(plt.fill_between, 'year', 'illegally_pres',  color='lightseagreen', alpha=0.5)

# Facet titles
for ax in g.axes:
    g.set_titles("{col_name}")
    g.set_axis_labels(y_var= '')

# Formatting axes
g.set(yticks=[])
g.despine(left=True)

# Legend
color_key = {'Illegally present third country nationals (TCNs)' : 'lightseagreen',
            'Illegally present TCNs given leave order': 'slateblue'}

patches =  [pat.Patch(color=v, label=k) for k,v in color_key.items()]

g.fig.legend(handles=patches, bbox_to_anchor =
             [0.56, 0.93], ncol=2).get_frame().set_edgecolor('1.0')

# Title
plt.subplots_adjust(top=0.85)
g.fig.suptitle('Illegally Present Third Country Nationals & Those Given Leave Orders', size=16, weight='demi');

When a country finds an illegally present TCN, it will often issue a leave order requiring that person to leave the country.

Third country nationals ordered to leave:
Third country nationals found to be illegally present who are subject to an administrative or judicial decision or act stating that their stay is illegal and imposing an obligation to leave the territory of the Member State (see Art. 7.1 (a) of the Regulation).  


These statistics do not include persons who are transferred from one Member State to another under the mechanism established by the Dublin Regulation (Council Regulation (EC) No 343/2003 and (EC) No 1560/2003, for these cases see related Dublin Statistics).


Each person is counted only once within the reference period, irrespective of the number of notices issued to the same person.

International Migration statistics, Eurostat

For countries like Germany, Greece, Hungary, and Austria, there were more illegally present TCNs that the governments knew about than there were leave orders being handed out to them. Most other countries appear to have given leave orders to most of their illegal TCN population.

But how effective were the leave orders? Did people leave?

# Initialize FacetGrid object
g = sns.FacetGrid(dfg2, col='country', col_wrap=4, height=1.8,
                  aspect=2).set(xticks=np.arange(0,10,3))


# Plot leave orders against returns following a leave order
g.map(plt.plot, 'year', 'leave_order', color='slateblue')
g.map(plt.fill_between, 'year', 'leave_order',  color='slateblue', alpha=0.6)

g.map(plt.plot, 'year', 'illegal_returned', color='crimson')
g.map(plt.fill_between, 'year', 'illegal_returned',  color='crimson', alpha=0.6)

# Facet titles (set_titles applies to every facet, so no loop is needed)
g.set_titles("{col_name}")
g.set_axis_labels(y_var='')

# Formatting axes
g.set(yticks=[])
g.despine(left=True)

# Legend
color_key = {'Illegally present third country nationals given leave order': 'slateblue',
            'Leave order recipients that left' : 'crimson'}

patches =  [pat.Patch(color=v, label=k) for k,v in color_key.items()]

g.fig.legend(handles=patches, bbox_to_anchor =
             [0.525, 0.93], ncol=2).get_frame().set_edgecolor('1.0')

# Title
plt.subplots_adjust(top=0.85)
g.fig.suptitle('Illegally Present Third Country Nationals: Leave Orders and Response', size=16, weight='demi');

Eurostat uses “returned” TCNs to mean TCNs who have left the country:

Third country nationals returned following an order to leave:
Third country nationals who have in fact left the territory of the Member State, following an administrative or judicial decision or act stating that their stay is illegal and imposing an obligation to leave the territory (see Art. 7.1 (b) of the Council Regulation (EC) no 862/2007). On a voluntary basis Member States provide Eurostat with a subcategory which relates to third country nationals returned to a third country only.

Persons who left the territory within the year may have been subject to an obligation to leave in a previous year. As such, the number of persons who actually left the territory may be greater than those who were subject to an obligation to leave in the same year.

The EIL statistics based on Council Regulation (EC) no 862/2007 include forced returns and assisted voluntary returns. Unassisted voluntary returns are included where these are reliably recorded. Data do not include persons who are transferred from one Member State to another under the mechanism established by the Dublin Regulation (Council Regulation (EC) No 604/2013 and Council Regulation (EC) No 1560/2003 amended by Council Regulation (EC) 118/2014, for these cases see related Dublin Statistics).

Each person is counted only once within the reference period.

International Migration statistics, Eurostat

Unfortunately, data for Switzerland isn’t available for this one.

Germany has continued to ramp up the number of leave orders it issues. What’s interesting is that Germany appears to have had a near-100% ‘leave rate’ (the number of people who left divided by the number of leave orders issued) until 2017, the last year of the decade shown.

While it’s new for a substantial number of illegal TCNs to disobey leave orders in Germany, it’s fairly normal elsewhere, like France, Italy, Belgium, and Norway.

Still, we have all kinds of relationships here:

  • Countries with fairly consistent ‘leave rates,’ where the two variables seem to track together (France, Norway, Sweden)
  • Countries with inconsistent ‘leave rates,’ where the two variables sometimes converge and other times diverge (Germany, Greece, Spain)
  • Countries with high ‘leave rates,’ where most people who get a leave order do leave the country (Sweden, Poland, and many smaller countries)
  • Countries with low ‘leave rates,’ where a substantial number of TCNs who get a leave order do not leave the country (Belgium, France, Greece, Italy, Spain, and more recently Germany and the UK)
  • Rising rates of leave orders given out (Germany), flat rates of leave orders given out (France, Austria, Norway), and falling rates of leave orders given out (Spain, Greece, the Netherlands)

Factors like geography, government enforcement, reputation of consequences, and even delivery of the leave order could affect these varying patterns over time.
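To put a number on the ‘leave rate’ described above, here is a minimal sketch. It assumes a frame shaped like dfg2, with country, year, leave_order, and illegal_returned columns. The toy frame and the add_leave_rate helper are hypothetical, and the values are invented for illustration rather than taken from Eurostat.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dfg2: these values are invented, not Eurostat data.
toy = pd.DataFrame({
    'country': ['A', 'A', 'B', 'B'],
    'year': [2016, 2017, 2016, 2017],
    'leave_order': [100, 200, 50, 0],
    'illegal_returned': [90, 100, 60, 5],
})

def add_leave_rate(df):
    """Return a copy with leave_rate = returns / leave orders for each row."""
    out = df.copy()
    # Mask years with zero leave orders so the ratio is NaN instead of inf.
    orders = out['leave_order'].where(out['leave_order'] != 0)
    out['leave_rate'] = out['illegal_returned'] / orders
    return out

rates = add_leave_rate(toy)
print(rates[['country', 'year', 'leave_rate']])
```

A rate above 1 is possible, since Eurostat notes that people who left in a given year may have received their leave order in a previous year.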

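# Sum leave orders and returns per country across all years
# (select the two columns so 'year' and any text columns are not summed)
dfg3 = (dfg2.groupby('country')[['leave_order', 'illegal_returned']]
            .sum().reset_index())

# Relationship between leave orders and returns: the scatter around the
# regression line widens as leave orders grow (heteroskedastic)
g = sns.jointplot(data=dfg3, x='leave_order', y='illegal_returned',
                  kind='reg', color='crimson');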

If we sum the total leave orders and the total returns following a leave order for each country across time and fit a basic linear regression, we can see the relationship is heteroskedastic: the spread of the points around the line grows as the number of leave orders grows. Heteroskedasticity violates one of the assumptions of ordinary least squares regression, so we can’t trust the standard errors, or any significance tests, that rely on this fit.
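The heteroskedasticity above is judged by eye from the joint plot. One way to check it more formally is a studentized (Koenker) Breusch-Pagan test, sketched below with NumPy only. This is not part of the original analysis: the koenker_bp_stat helper is hypothetical, and the data are synthetic, generated so that the noise grows with x.

```python
import numpy as np

def koenker_bp_stat(x, y):
    """Studentized Breusch-Pagan LM statistic for a one-regressor OLS fit."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    # Step 1: fit y on x by ordinary least squares and square the residuals.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid_sq = (y - X @ beta) ** 2
    # Step 2: regress the squared residuals on x and compute R^2.
    gamma, *_ = np.linalg.lstsq(X, resid_sq, rcond=None)
    ss_res = np.sum((resid_sq - X @ gamma) ** 2)
    ss_tot = np.sum((resid_sq - resid_sq.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    # LM = n * R^2, compared against a chi-square with 1 degree of freedom.
    return len(x) * r2

# Synthetic data (not Eurostat): one series with noise proportional to x,
# one with constant noise, for comparison.
rng = np.random.default_rng(0)
x = np.linspace(1, 100, 200)
y_hetero = 0.5 * x + rng.normal(scale=0.2 * x)
y_homo = 0.5 * x + rng.normal(scale=5.0, size=x.size)

CHI2_1DF_5PCT = 3.841  # 5% critical value of chi-square with 1 df
stat_hetero = koenker_bp_stat(x, y_hetero)
stat_homo = koenker_bp_stat(x, y_homo)
print(stat_hetero, stat_homo, CHI2_1DF_5PCT)
```

A statistic above the critical value suggests the residual variance depends on x. With only a few dozen countries in dfg3, a test like this has limited power, so it complements the visual check rather than replacing it.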

# Initialize FacetGrid object
g = sns.FacetGrid(dfg2, col='country', col_wrap=4, height=2.8,
                  aspect=2).set(xticks=np.arange(0,10,3))

# Plot all returns against returns to a third country
g.map(plt.plot, 'year', 'illegal_returned', color='crimson')
g.map(plt.fill_between, 'year', 'illegal_returned',  color='crimson', alpha=0.6)

g.map(plt.plot, 'year', 'illegal_returned_thirdcountry', color='darkcyan')
g.map(plt.fill_between, 'year', 'illegal_returned_thirdcountry',  color='darkcyan', alpha=0.6)

# Facet titles (set_titles applies to every facet, so no loop is needed)
g.set_titles("{col_name}", size=18)
g.set_axis_labels(y_var='')

# Formatting axes
g.set(yticks=[])
g.despine(left=True)

# Legend
color_key = {'Left country that gave leave order': 'crimson',
            'Left Europe altogether' : 'darkcyan'}

patches =  [pat.Patch(color=v, label=k) for k,v in color_key.items()]

g.fig.legend(handles=patches, bbox_to_anchor =
             [0.58, 0.93], ncol=2, fontsize='xx-large').get_frame().set_edgecolor('1.0')

# Title
plt.subplots_adjust(top=0.85)
g.fig.suptitle('Illegally Present Third Country Nationals That Left After Given Leave Order', size=26, weight='demi');

Finally, our last graph: TCNs that left the European country they were in versus the subset of TCNs that left Europe entirely.

On a voluntary basis, Eurostat further collects information on those persons who are recorded as having returned to a third country (as opposed to being returned to another EU Member State).

International Migration statistics, Eurostat

The red is the total of those who left following a leave order, carried over from the previous graph, while the overlapping dark cyan represents the subset recorded as having returned to a third country, that is, those who left Europe altogether. Any red area not covered by the dark cyan therefore represents illegally present TCNs who moved to another EU Member State following a leave order.

For the most part, there appear to be only a few narrow red slivers. This implies that, of those who receive a leave order and do leave, most leave the EU Member State area entirely. A much smaller proportion are EU Member State ‘hoppers’. Those who do hop from one EU Member State to the next after being ordered to leave appear to come largely from France and the UK, with fewer coming from Germany, Sweden, and Spain.
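The size of those red slivers can be estimated by subtracting third-country returns from all returns. The sketch below assumes the illegal_returned and illegal_returned_thirdcountry columns used in the chart. The toy frame and the derived column names are hypothetical, and the values are invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in for dfg2: these values are invented, not Eurostat data.
toy = pd.DataFrame({
    'country': ['A', 'A', 'B', 'B'],
    'year': [2016, 2017, 2016, 2017],
    'illegal_returned': [100, 120, 40, 60],
    'illegal_returned_thirdcountry': [90, 110, 40, 30],
})

cols = ['illegal_returned', 'illegal_returned_thirdcountry']
totals = toy.groupby('country')[cols].sum()

# Returns not recorded as going to a third country (the red slivers).
totals['not_third_country'] = (totals['illegal_returned']
                               - totals['illegal_returned_thirdcountry'])
totals['not_third_country_share'] = (totals['not_third_country']
                                     / totals['illegal_returned'])

print(totals.sort_values('not_third_country_share', ascending=False))
```

Because Member States report the third-country subcategory on a voluntary basis, a missing or incomplete value would inflate this difference, so the share is best read as an upper bound on ‘hopping’.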

With the changing tides of immigration and emigration in Europe, we’ve seen a wide picture: total numbers, per capita, and even per GDP. We’ve seen how large the migrant crisis of 2015 was in terms of migrants, illegally present third-country nationals, and asylum seekers. Government responses such as border refusals and leave orders, along with how well those orders were obeyed, showed us how each country is its own microcosm.