Análisis de Datos Electorales 🏛️️

Entrega Grupal Software Estadístico II 💻

👥 Adolfo Alarcón, Gonzalo Blanca, Pablo Galarón, Mario Gómez, Itziar Carro, Hugo Alonso, Raúl Palomo

Librerías Necesarias y carga de los datos

rm(list = ls())

library(tidyverse)
library(lubridate)
library(forcats)
library(glue)
library(mapSpain)
library(waffle)
library(ggplot2)
library(viridis)
library(plotly)

# NO TOQUES NADA
election_data <- read_csv(file = "./data/datos_elecciones_brutos.csv")
cod_mun <- read_csv(file = "./data/cod_mun.csv")
surveys <- read_csv(file = "./data/historical_surveys.csv")
abbrev <- read_csv(file = "./data/siglas.csv")

Filtrar años de 2008 a 2019

Debes trabajar únicamente en el período de tiempo que incluye las elecciones desde 2008 hasta las últimas elecciones de 2019.

election_data <- 
  election_data |> 
  filter(between(anno, 2008, 2019))

election_data |> 
  distinct(anno)

# A tibble: 5 × 1
   anno
  <dbl>
1  2008
2  2011
3  2015
4  2016
5  2019

Queremos datos en TIDYDATA

Los datos deben ser convertidos a tidydata donde sea apropiado.

Cada variable en una única columna ❌
Cada individuo en una fila diferente ✅
Cada celda con un único valor ✅
Cada dataset en un tibble ✅
Si queremos cruzar múltiples tablas debemos tener una columna común ❌

Queremos datos en TIDYDATA

election_data

# A tibble: 48,737 × 471
   tipo_eleccion  anno mes   vuelta codigo_ccaa codigo_provincia
   <chr>         <dbl> <chr>  <dbl> <chr>       <chr>           
 1 02             2008 03         1 14          01              
 2 02             2008 03         1 14          01              
 3 02             2008 03         1 14          01              
 4 02             2008 03         1 14          01              
 5 02             2008 03         1 14          01              
 6 02             2008 03         1 14          01              
 7 02             2008 03         1 14          01              
 8 02             2008 03         1 14          01              
 9 02             2008 03         1 14          01              
10 02             2008 03         1 14          01              
# ℹ 48,727 more rows
# ℹ 465 more variables: codigo_municipio <chr>,
#   codigo_distrito_electoral <dbl>, numero_mesas <dbl>, censo <dbl>,
#   participacion_1 <dbl>, participacion_2 <dbl>, votos_blancos <dbl>,
#   votos_nulos <dbl>, votos_candidaturas <dbl>, `BERDEAK-LOS VERDES` <dbl>,
#   ARALAR <dbl>, `PARTIDO OBRERO SOCIALISTA INTERNACIONALISTA` <dbl>,
#   `ALTERNATIVA MOTOR Y DEPORTES` <dbl>, `PARTIDO HUMANISTA` <dbl>, …

Queremos datos en TIDYDATA

Conjunto de datos ELECCIONES

Con pivot_longer -> cada variable estará en una única columna
Eliminamos las variables que tomen un único valor porque INFORMACIÓN = VARIANZA
Creamos Código de municipio con glue para luego poder cruzar tablas

election_data_tidy <-
  election_data |> 
  pivot_longer(cols =  "BERDEAK-LOS VERDES":"COALICIÓN POR MELILLA", 
               names_to = "Partido", 
               values_to = "votos",
               values_drop_na = TRUE) |> 
  select(-c(tipo_eleccion, vuelta, codigo_distrito_electoral)) |> 
  mutate(fecha = ymd(glue("{anno}{mes}01")), 
         cod_mun= glue("{codigo_ccaa}-{codigo_provincia}-{codigo_municipio}"))

election_data_tidy <- 
  election_data_tidy |> left_join(cod_mun, by = "cod_mun")

Queremos datos en TIDYDATA

surveys

# A tibble: 3,753 × 59
   type_survey date_elec  id_pollster pollster media    field_date_from
   <chr>       <date>     <chr>       <chr>    <chr>    <date>         
 1 national    1982-10-28 pollster-1  PSOE     <NA>     1982-10-28     
 2 national    1982-10-28 pollster-2  IDEAL    <NA>     1982-10-28     
 3 national    1982-10-28 pollster-3  SOFEMASA EL PAÍS  1982-10-16     
 4 national    1982-10-28 pollster-4  GRUPO 16 GRUPO 16 1982-10-09     
 5 national    1982-10-28 pollster-3  SOFEMASA EL PAÍS  1982-10-01     
 6 national    1982-10-28 pollster-4  GRUPO 16 GRUPO 16 1982-09-25     
 7 national    1982-10-28 pollster-3  SOFEMASA EL PAÍS  1982-09-24     
 8 national    1982-10-28 pollster-5  AP       <NA>     1982-09-27     
 9 national    1982-10-28 pollster-6  GALLUP   <NA>     1982-09-06     
10 national    1982-10-28 pollster-4  GRUPO 16 GRUPO 16 1982-09-05     
# ℹ 3,743 more rows
# ℹ 53 more variables: field_date_to <date>, exit_poll <lgl>, size <dbl>,
#   turnout <dbl>, UCD <dbl>, PSOE <dbl>, PCE <dbl>, AP <dbl>, CIU <dbl>,
#   PA <dbl>, `EAJ-PNV` <dbl>, HB <dbl>, ERC <dbl>, EE <dbl>, CDS <dbl>,
#   FN <dbl>, PAD <dbl>, PRD <dbl>, MUC <dbl>, IU <dbl>, CG <dbl>, PAR <dbl>,
#   AIC <dbl>, UV <dbl>, EA <dbl>, PP <dbl>, LV <dbl>, ARM <dbl>, PDP <dbl>,
#   CC <dbl>, PAP <dbl>, BNG <dbl>, ICV <dbl>, EH <dbl>, UPYD <dbl>, …

Queremos datos en TIDYDATA

Conjunto de datos ENCUESTAS

Con pivot_longer -> cada variable estará en una única columna
Eliminamos las variables que tomen un único valor porque INFORMACIÓN = VARIANZA

surveys_tidy <- 
  surveys |> 
  pivot_longer(cols= -("type_survey":"turnout"),
               names_to= "Partido",
               values_to="Estimación",
               values_drop_na = TRUE) |> 
  select(-type_survey)

Reestructuración de los Partidos Políticos

Warning

No confundir el PSA con el PSOE, no tienen nada que ver, tener en cuenta para la recodificación solo los partidos que tengán en la tabla abbrev las siglas de PSOE, hay que unir con la tabla abrevv y así con todos.
Cuidado con los signos como . , ; ’. Tenemos que quitarlos porque sino nos dificultan la recodificación.

election_data_tidy <- 
  election_data_tidy |> 
  left_join(abbrev, by = c("Partido" = "denominacion"))

election_data_tidy <- 
  election_data_tidy |> 
  mutate(siglas = str_replace_all(siglas, pattern = "[\\.:\\+\\']", replacement = ""))

election_data_tidy <- 
  election_data_tidy |>
  mutate("siglas" =
           case_when(str_detect(siglas, "PSOE") ~ "PSOE",
                     str_detect(siglas, "PP") ~ "PP",
                     str_detect(siglas, "CS|C´S") ~ "CS",
                     str_detect(siglas, "PNV") ~ "PNV",
                     str_detect(siglas, "BNG") ~ "BNG",
                     str_detect(siglas, "PODEMOS|EB-B|IU") ~ "PODEMOS",
                     str_detect(siglas, "ERC|ESQUERRA") ~ "ERC",
                     str_detect(siglas, "BILDU|EA|ARALAR") ~ "BILDU",
                     str_detect(siglas, "M PAÍS") ~ "MP",
                     str_detect(siglas, "VOX") ~ "VOX",
                     str_detect(siglas, "CIU") ~ "CIU",
                     TRUE ~ "OTROS"))

surveys_tidy <- 
  surveys_tidy |>
  mutate("siglas" =
           case_when(str_detect(Partido, "PSOE") ~ "PSOE",
                     str_detect(Partido, "PP") ~ "PP",
                     str_detect(Partido, "CS|C´S") ~ "CS",
                     str_detect(Partido, "PNV") ~ "PNV",
                     str_detect(Partido, "BNG") ~ "BNG",
                     str_detect(Partido, "PODEMOS|EB-B|IU") ~ "PODEMOS",
                     str_detect(Partido, "ERC|ESQUERRA") ~ "ERC",
                     str_detect(Partido, "BILDU|EA|ARALAR") ~ "BILDU",
                     str_detect(Partido, "M PAÍS") ~ "MP",
                     str_detect(Partido, "VOX") ~ "VOX",
                     str_detect(Partido, "CIU") ~ "CIU",
                     TRUE ~ "OTROS"))

Filtrar en el dataset Encuestas

Debes descartar las encuestas que:

se refieran a elecciones anteriores a 2008
sean a pie de urna
tamaño muestral desconocido o inferior a 500.
tenga 1 día o menos de trabajo de campo.

surveys_tidy <- 
  surveys_tidy |> 
  drop_na(size) |> 
  filter(exit_poll == FALSE & size >= 500 & year(date_elec) >= 2008 & as.Date(field_date_to) - as.Date(field_date_from) > 1)

Ítems Obligatorios

Pregunta 1

¿Qué partido fue el ganador en los municipios con más de 100.000 habitantes (censo) en cada una de las elecciones?

ganadores_municipios <- 
  election_data_tidy |> 
  filter(censo > 100000) |> 
  slice_max(votos, n=1, by = c(fecha, cod_mun), with_ties =  FALSE) |>
  select(c(fecha, cod_mun, municipio, siglas, votos))

ganadores_municipios

# A tibble: 282 × 5
   fecha      cod_mun   municipio                   siglas  votos
   <date>     <glue>    <chr>                       <chr>   <dbl>
 1 2008-03-01 14-01-059 Vitoria-Gasteiz             PSOE    56349
 2 2008-03-01 07-02-003 Albacete                    PP      49909
 3 2008-03-01 17-03-014 Alicante/Alacant            PP      91117
 4 2008-03-01 17-03-065 Elche/Elx                   PP      67584
 5 2008-03-01 01-04-013 Almería                     PP      49463
 6 2008-03-01 10-06-015 Badajoz                     PP      43205
 7 2008-03-01 04-07-040 Palma                       PP      78133
 8 2008-03-01 09-08-015 Badalona                    PSOE    56931
 9 2008-03-01 09-08-019 Barcelona                   PSOE   356264
10 2008-03-01 09-08-101 Hospitalet de Llobregat, L' PSOE    70566
# ℹ 272 more rows

resumen <- 
  ganadores_municipios |> 
  summarise("municipios_ganados" = n(), .by = c(fecha, siglas))

resumen

# A tibble: 29 × 3
   fecha      siglas  municipios_ganados
   <date>     <chr>                <int>
 1 2008-03-01 PSOE                    20
 2 2008-03-01 PP                      21
 3 2008-03-01 OTROS                    6
 4 2011-11-01 PP                      38
 5 2011-11-01 PSOE                     5
 6 2011-11-01 OTROS                    2
 7 2011-11-01 PODEMOS                  1
 8 2011-11-01 PNV                      1
 9 2015-12-01 PODEMOS                  5
10 2015-12-01 PP                      32
# ℹ 19 more rows

municipios_ganados <-
  ggplot(resumen) +
  geom_col(aes(x = fct_reorder(siglas, -municipios_ganados), y = municipios_ganados, fill = siglas)) +
  scale_fill_manual(values = c("#ffbf41", "#808080","#2E8B57", "#5f457c", "#1e4b8f", "#e30613", "#008000")) +
  facet_wrap(~fecha, scales = "free_y") +
  labs(
    title = "Nº municipios (>100k habitantes) ganados por partido por fecha",
    x = "Partido",
    y = "Número de Municipios Ganados",
    fill = "Partido") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) + theme(plot.title = element_text(family = "serif",
    size = 19, face = "bold"), legend.text = element_text(size = 10),
    legend.title = element_text(size = 13,
        family = "serif"), panel.background = element_rect(fill = "aliceblue"),
    plot.background = element_rect(fill = "antiquewhite")) + theme(legend.text = element_text(family = "serif"))

Gráfico 📊

ggplotly(municipios_ganados)

Pregunta 2

¿Qué partido fue el segundo cuando el primero fue el PSOE? ¿Y cuando el primero fue el PP?

analizar_segundos_lugares <- function(data, partido_ganador) {
  data |>
    group_by(fecha, cod_mun) |>
    slice_max(votos, n = 2) |>
    filter(siglas[1] == partido_ganador) |>
    filter(siglas != partido_ganador) |>
    ungroup() |>
    group_by(siglas) |>
    count() |> 
    ungroup()
}

Ejercicio12_2a <- analizar_segundos_lugares(election_data_tidy, "PSOE") 

Ejercicio12_2a

# A tibble: 7 × 2
  siglas      n
  <chr>   <int>
1 BILDU       2
2 ERC         2
3 OTROS      46
4 PNV         2
5 PODEMOS    56
6 PP        320
7 VOX        12

Ejercicio12_2b <- analizar_segundos_lugares(election_data_tidy, "PP")

Ejercicio12_2b

# A tibble: 4 × 2
  siglas      n
  <chr>   <int>
1 OTROS      65
2 PODEMOS    53
3 PSOE     1874
4 VOX        27

pal1 <- c('#808080', '#2E8B57', '#5f457c', '#1e4b8f', '#008000')

preg_2 <- 
ggplot(Ejercicio12_2a) +
  geom_waffle(aes(fill = siglas, values = n), make_proportional = TRUE, alpha = 0.7, n_rows = 5) +
  scale_fill_manual(values = pal1) +
  coord_equal() +
  labs(title = "Proporción de partidos en segundo lugar cuando ganó el PSOE") +
  theme_minimal() + 
  theme(legend.position = "bottom", legend.direction = "horizontal") + theme(plot.title = element_text(family = "serif",
    size = 19, face = "bold"), legend.text = element_text(family = "serif"),
    legend.title = element_text(size = 13,
        family = "serif"), panel.background = element_rect(fill = NA))

pal2 <- c('#808080','#5f457c', '#e30613', '#008000')

preg_2_a <- 
ggplot(Ejercicio12_2b) +
  geom_waffle(aes(fill = siglas, values = n), make_proportional = TRUE, alpha = 0.7, n_rows = 5) +
  scale_fill_manual(values = pal2) +
  coord_equal() +
  labs(title = "Proporción de partidos en segundo lugar cuando ganó el PP") +
  theme_minimal() +
  theme_minimal() + theme(legend.position = "bottom", legend.direction = "horizontal") + theme(plot.title = element_text(family = "serif",
    size = 19, face = "bold"), legend.text = element_text(family = "serif"),
    legend.title = element_text(size = 14,
        family = "serif"), panel.background = element_rect(fill = NA))

Waffle 📊

preg_2

Waffle 📊

preg_2_a

Pregunta 3

¿A quién beneficia la baja participación?

election_data_tidy <-
  election_data_tidy |> mutate("participacion_total"= participacion_1 + participacion_2)

correlacion <- 
  election_data_tidy |> 
  group_by(siglas) |> 
  summarise(correlacion = cor(participacion_total, votos))

correlacion

# A tibble: 10 × 2
   siglas  correlacion
   <chr>         <dbl>
 1 BILDU         0.348
 2 BNG           0.903
 3 ERC           0.892
 4 MP            0.989
 5 OTROS         0.333
 6 PNV           0.970
 7 PODEMOS       0.924
 8 PP            0.945
 9 PSOE          0.957
10 VOX           0.738

partido_colores <- c(
  'PSOE' = '#e30613',
  'PP' = '#1e4b8f',
  'VOX' = '#008000',
  'PODEMOS' = '#5f457c',
  'ERC' = '#ffcc00',
  'PNV' = '#2E8B57',
  'BNG' = '#4682B4',
  'OTROS' = '#808080',
  'CS' = '#EB6109',
  'MP' = '#6ad9c4',
  'BILDU' = '#83C441')

preg_3 <- 
  ggplot(correlacion) +
  geom_bar(aes(x = fct_reorder(siglas, correlacion, .desc = TRUE), 
               y = correlacion, fill = siglas), 
           stat = "identity") +
  scale_fill_manual(values = partido_colores, na.value = "grey") +
  theme_minimal() +
  labs(
    title = "Correlación entre Participación Total y Votos por Partido",
    x = "Partido",
    y = "Correlación")  + theme(plot.title = element_text(family = "serif",
    size = 18.5, face = "bold"), legend.text = element_text(size = 9.2,
    family = "serif"), legend.title = element_text(size = 14,
    family = "serif"), panel.background = element_rect(fill = "aliceblue"),
    plot.background = element_rect(fill = "antiquewhite"))

Gráfico de Barras 📊

ggplotly(preg_3)

Cuanta más correlación haya, mejor le viene que haya muchos votos al aumentar la participación, también aumentan los votos para ese partido. Los partidos con correlaciones bajas son aquellos que podrían beneficiarse más de una baja participación.

Pregunta 4

¿Cómo analizar la relación entre censo y voto? ¿Es cierto que determinados partidos ganan en las zonas rurales?

Seleccionar las columnas necesarias

datos <- 
  election_data_tidy |> 
  select(anno, mes, cod_mun, siglas, votos, censo, participacion_1, participacion_2)

datos

# A tibble: 688,226 × 8
    anno mes   cod_mun   siglas  votos censo participacion_1 participacion_2
   <dbl> <chr> <glue>    <chr>   <dbl> <dbl>           <dbl>           <dbl>
 1  2008 03    14-01-001 OTROS       9  1838             677            1008
 2  2008 03    14-01-001 OTROS      27  1838             677            1008
 3  2008 03    14-01-001 OTROS       1  1838             677            1008
 4  2008 03    14-01-001 OTROS       1  1838             677            1008
 5  2008 03    14-01-001 OTROS       1  1838             677            1008
 6  2008 03    14-01-001 OTROS       2  1838             677            1008
 7  2008 03    14-01-001 PP        238  1838             677            1008
 8  2008 03    14-01-001 PP        238  1838             677            1008
 9  2008 03    14-01-001 PODEMOS    61  1838             677            1008
10  2008 03    14-01-001 BILDU      85  1838             677            1008
# ℹ 688,216 more rows

Agrupar por partido político y sumar los votos totales

votos_totales_por_partido <- 
  datos |> 
  group_by(siglas) |> 
  summarise(
    votos_totales = sum(votos, na.rm = TRUE),  #Sumar votos totales por partido
    censo_total = sum(censo, na.rm = TRUE)    #Sumar el censo asociado
  ) |> 
  ungroup()

votos_totales_por_partido

# A tibble: 10 × 3
   siglas  votos_totales censo_total
   <chr>           <dbl>       <dbl>
 1 BILDU          823271    12288326
 2 BNG           1257080    20454711
 3 ERC           7730216    77118044
 4 MP              52195     4785424
 5 OTROS        53468777  3402770741
 6 PNV           3973630    20655616
 7 PODEMOS      37846149   389281512
 8 PP           81963125   343372463
 9 PSOE         76583616   357600555
10 VOX           6421604   114750880

votos_totales_por_partido <- 
  votos_totales_por_partido |> 
  mutate(relacion_votos_censo = votos_totales / censo_total)  #Proporción votos/censo

votos_totales_por_partido

# A tibble: 10 × 4
   siglas  votos_totales censo_total relacion_votos_censo
   <chr>           <dbl>       <dbl>                <dbl>
 1 BILDU          823271    12288326               0.0670
 2 BNG           1257080    20454711               0.0615
 3 ERC           7730216    77118044               0.100 
 4 MP              52195     4785424               0.0109
 5 OTROS        53468777  3402770741               0.0157
 6 PNV           3973630    20655616               0.192 
 7 PODEMOS      37846149   389281512               0.0972
 8 PP           81963125   343372463               0.239 
 9 PSOE         76583616   357600555               0.214 
10 VOX           6421604   114750880               0.0560

Ahora vamos a clasificar los municipios en urbano o rural

datos <- datos |> mutate("tipo" = case_when(
  censo < 10000 ~ "rural",
  TRUE ~"urbano"
))

datos

# A tibble: 688,226 × 9
    anno mes   cod_mun  siglas votos censo participacion_1 participacion_2 tipo 
   <dbl> <chr> <glue>   <chr>  <dbl> <dbl>           <dbl>           <dbl> <chr>
 1  2008 03    14-01-0… OTROS      9  1838             677            1008 rural
 2  2008 03    14-01-0… OTROS     27  1838             677            1008 rural
 3  2008 03    14-01-0… OTROS      1  1838             677            1008 rural
 4  2008 03    14-01-0… OTROS      1  1838             677            1008 rural
 5  2008 03    14-01-0… OTROS      1  1838             677            1008 rural
 6  2008 03    14-01-0… OTROS      2  1838             677            1008 rural
 7  2008 03    14-01-0… PP       238  1838             677            1008 rural
 8  2008 03    14-01-0… PP       238  1838             677            1008 rural
 9  2008 03    14-01-0… PODEM…    61  1838             677            1008 rural
10  2008 03    14-01-0… BILDU     85  1838             677            1008 rural
# ℹ 688,216 more rows

Calcular votos totales y proporción por tipo de zona

votos_por_zona <- datos |> 
  group_by(siglas, tipo) |> 
  summarise(
    votos_totales = sum(votos, na.rm = TRUE),
    censo_total = sum(censo, na.rm = TRUE),
    relacion_votos_censo = votos_totales / censo_total
  ) |> 
  ungroup()

votos_por_zona

# A tibble: 20 × 5
   siglas  tipo   votos_totales censo_total relacion_votos_censo
   <chr>   <chr>          <dbl>       <dbl>                <dbl>
 1 BILDU   rural         218196     3200106              0.0682 
 2 BILDU   urbano        605075     9088220              0.0666 
 3 BNG     rural         457910     7178571              0.0638 
 4 BNG     urbano        799170    13276140              0.0602 
 5 ERC     rural        2456583    17925959              0.137  
 6 ERC     urbano       5273633    59192085              0.0891 
 7 MP      rural           6661      769169              0.00866
 8 MP      urbano         45534     4016255              0.0113 
 9 OTROS   rural       12050991   660305161              0.0183 
10 OTROS   urbano      41417786  2742465580              0.0151 
11 PNV     rural        1155454     4912864              0.235  
12 PNV     urbano       2818176    15742752              0.179  
13 PODEMOS rural        8561596   103309655              0.0829 
14 PODEMOS urbano      29284553   285971857              0.102  
15 PP      rural       22784871    91396349              0.249  
16 PP      urbano      59178254   251976114              0.235  
17 PSOE    rural       22094209    91936613              0.240  
18 PSOE    urbano      54489407   265663942              0.205  
19 VOX     rural        1524008    27075262              0.0563 
20 VOX     urbano       4897596    87675618              0.0559

Pregunta 5

¿Cómo calibrar el error de las encuestas (recordemos que las encuestas son de intención de voto a nivel nacional)?

#agregamos el total de votos para cada elección.
total_votos <- 
  election_data_tidy |> 
  summarise("votos_elección" = sum(votos), .by = fecha)

#agregamos el total de votos que recibió cada partido en las distintas elecciones.
votos_partidos <-
  election_data_tidy |> 
  summarise("votos_totales_partido" = sum(votos), .by = c(fecha, siglas))

#calculamos el porcentaje de voto que tuvo cada partido para cada elección.
porcentaje_votos_partido <-
  votos_partidos |>  left_join(total_votos, by = "fecha") 
porcentaje_votos_partido <- 
  porcentaje_votos_partido |>  
  mutate("porcentaje_voto" = votos_totales_partido / votos_elección * 100,
         "elec_year" = year(fecha))

porcentaje_votos_partido

# A tibble: 49 × 6
   fecha      siglas  votos_totales_partido votos_elección porcentaje_voto
   <date>     <chr>                   <dbl>          <dbl>           <dbl>
 1 2008-03-01 OTROS                 5199699       45756461          11.4  
 2 2008-03-01 PP                   17479126       45756461          38.2  
 3 2008-03-01 PODEMOS               1190531       45756461           2.60 
 4 2008-03-01 BILDU                   52987       45756461           0.116
 5 2008-03-01 PSOE                 20223962       45756461          44.2  
 6 2008-03-01 PNV                    606528       45756461           1.33 
 7 2008-03-01 ERC                    584760       45756461           1.28 
 8 2008-03-01 BNG                    418868       45756461           0.915
 9 2011-11-01 PP                   19894569       41853567          47.5  
10 2011-11-01 PNV                    647182       41853567           1.55 
# ℹ 39 more rows
# ℹ 1 more variable: elec_year <dbl>

Calculamos el error absoluto entre el porcentaje de voto que recibió cada partido y las estimaciones.

error <- 
  surveys_tidy |> mutate("elec_year" = year(date_elec)) |>
  inner_join(porcentaje_votos_partido, by = c("elec_year" = "elec_year",
                                            "Partido" = "siglas"))  |> 
  mutate("error_abs" = abs(Estimación - porcentaje_voto)) |> 
  relocate(error_abs, .before = field_date_from)

error

# A tibble: 7,165 × 18
   date_elec  id_pollster pollster media error_abs field_date_from field_date_to
   <date>     <chr>       <chr>    <chr>     <dbl> <date>          <date>       
 1 2008-03-09 pollster-49 GESOP    EL P…    1.20   2008-03-05      2008-03-07   
 2 2008-03-09 pollster-49 GESOP    EL P…    3.72   2008-03-05      2008-03-07   
 3 2008-03-09 pollster-49 GESOP    EL P…    0.800  2008-03-05      2008-03-07   
 4 2008-03-09 pollster-49 GESOP    EL P…    1.60   2008-03-04      2008-03-06   
 5 2008-03-09 pollster-49 GESOP    EL P…    3.72   2008-03-04      2008-03-06   
 6 2008-03-09 pollster-49 GESOP    EL P…    0.400  2008-03-04      2008-03-06   
 7 2008-03-09 pollster-49 GESOP    EL P…    1.80   2008-03-03      2008-03-05   
 8 2008-03-09 pollster-49 GESOP    EL P…    3.72   2008-03-03      2008-03-05   
 9 2008-03-09 pollster-49 GESOP    EL P…    0.800  2008-03-03      2008-03-05   
10 2008-03-09 pollster-48 DEMOMÉT… TELE…    0.0991 2008-02-29      2008-03-02   
# ℹ 7,155 more rows
# ℹ 11 more variables: exit_poll <lgl>, size <dbl>, turnout <dbl>,
#   Partido <chr>, Estimación <dbl>, siglas <chr>, elec_year <dbl>,
#   fecha <date>, votos_totales_partido <dbl>, votos_elección <dbl>,
#   porcentaje_voto <dbl>

Calculamos el error medio de cada encuesta para cada elección.

errores_encuestas <- 
  error |> 
  summarise("empresa" = unique(pollster),
            "error" = mean(error_abs), .by = c(date_elec, id_pollster))

errores_encuestas

# A tibble: 116 × 4
   date_elec  id_pollster empresa                 error
   <date>     <chr>       <chr>                   <dbl>
 1 2008-03-09 pollster-49 GESOP                   1.28 
 2 2008-03-09 pollster-48 DEMOMÉTRICA             0.503
 3 2008-03-09 pollster-18 SIGMA DOS               1.66 
 4 2008-03-09 pollster-51 OBRADOIRO DE SOCIOLOXÍA 0.818
 5 2008-03-09 pollster-45 NOXA                    1.35 
 6 2008-03-09 pollster-25 DYM                     1.61 
 7 2008-03-09 pollster-38 SONDAXE                 0.776
 8 2008-03-09 pollster-53 METROSCOPIA             1.59 
 9 2008-03-09 pollster-55 APPEND                  1.11 
10 2008-03-09 pollster-41 ASEP                    1.87 
# ℹ 106 more rows

Pregunta 6

¿Qué casas encuestadoras acertaron más y cuáles se desviaron más de los resultados?

Vemos cuales fueron las 3 empresas que acertarón más para cada elección

empresas_mas_aciertos <-
  errores_encuestas |>
  slice_min(error, n = 3, by = date_elec)

Vemos cuales fueron las 3 empresas que se desviarón más de los resultados para cada elección

empresas_menos_aciertos <-
  errores_encuestas |> 
  slice_max(error, n = 3, by = date_elec)

Gráfico 📊 de las empresas con más aciertos

grafico_menor_error <- 
  ggplot(empresas_mas_aciertos) +
  geom_col(aes(x = fct_reorder(empresa, -error), y = error, fill = empresa)) +
  facet_wrap(~date_elec, scales = "free_x") +
  labs(title = "Empresas con mayor acierto por elección",
       subtitle = "(Menor error cometido)", 
       x = "Empresa", y = "Error cometido") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 5)) + theme(plot.title = element_text(family = "serif",
    size = 19.5, face = "bold"), legend.text = element_text(family = "serif"),
    legend.title = element_text(family = "serif"),
    panel.background = element_rect(fill = "aliceblue"),
    plot.background = element_rect(fill = "antiquewhite"))

Gráfico 📊 de las empresas que cometen más errores

grafico_mayor_error <-
  ggplot(empresas_menos_aciertos) +
  geom_col(aes(x = fct_reorder(empresa, -error), y = error, fill = empresa)) +
  facet_wrap(~date_elec, scales = "free_x") +
  labs(title = "Empresas con mayor error por elección",
       x = "Empresa", y = "Error cometido") +
  theme_minimal()+
  theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 5)) + theme(plot.title = element_text(family = "serif",
    size = 20, face = "bold"), legend.text = element_text(size = 9,
    family = "serif"), legend.title = element_text(size = 15,
    family = "serif"), panel.background = element_rect(fill = "aliceblue"),
    plot.background = element_rect(fill = "antiquewhite"))

ggplotly(grafico_menor_error)

ggplotly(grafico_mayor_error)

Calculamos el error medio de las encuestas

error_medio_encuestas <-
  errores_encuestas |> 
  summarise("empresa" = unique(empresa),
            "error_medio" = mean(error), .by = id_pollster)

error_medio_encuestas

# A tibble: 45 × 3
   id_pollster empresa                 error_medio
   <chr>       <chr>                         <dbl>
 1 pollster-49 GESOP                         3.17 
 2 pollster-48 DEMOMÉTRICA                   0.503
 3 pollster-18 SIGMA DOS                     2.44 
 4 pollster-51 OBRADOIRO DE SOCIOLOXÍA       2.20 
 5 pollster-45 NOXA                          3.01 
 6 pollster-25 DYM                           2.98 
 7 pollster-38 SONDAXE                       2.32 
 8 pollster-53 METROSCOPIA                   2.54 
 9 pollster-55 APPEND                        1.11 
10 pollster-41 ASEP                          6.26 
# ℹ 35 more rows

Gráfico 📊 para errores totales

errores_totales <- 
  ggplot(error_medio_encuestas) +
  geom_col(aes(x = fct_reorder(empresa, -error_medio), y = error_medio, fill = "steelblue")) +
  labs(title = "Error medio de cada empresa",
       x = "Empresa", y = "Error medio") +
  theme(legend.position = "none") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 5)) + theme(plot.title = element_text(family = "serif",
    size = 20, face = "bold"), panel.background = element_rect(fill = "aliceblue"),
    plot.background = element_rect(fill = "antiquewhite"))

ggplotly(errores_totales)

Mapa Municipios

Mapa con los partidos ganadores en cada municipio en las elecciones del año 2019.

Creamos nuestro RESUMEN

datos_mapa <- 
  election_data_tidy |> 
  filter(anno == 2019) |> 
  mutate(fecha = ymd(glue("{anno}{mes}01")), 
         cod_mun= glue("{codigo_provincia}{codigo_municipio}")) |> 
  filter(siglas != "OTROS") |> 
  summarise("votos_por_mun" = sum(votos), .by = c(siglas, cod_mun)) |> 
  slice_max(votos_por_mun, n = 1, by = cod_mun) 

datos_mapa

# A tibble: 8,184 × 3
   siglas cod_mun votos_por_mun
   <chr>  <glue>          <dbl>
 1 PNV    01001            1550
 2 PNV    01002            8284
 3 PNV    01003            1082
 4 PNV    01004            1342
 5 PP     01006              78
 6 PNV    01008             672
 7 PNV    01009             964
 8 PNV    01010            2938
 9 PNV    01011             236
10 PNV    01013             566
# ℹ 8,174 more rows

Creamos la geometría del mapa con un Join

mapa_muni <- 
  esp_get_munic() |> 
  left_join(datos_mapa, by = c("LAU_CODE" = "cod_mun"))

mapa_muni

Simple feature collection with 8184 features and 9 fields
Geometry type: GEOMETRY
Dimension:     XY
Bounding box:  xmin: -13.21926 ymin: 34.70178 xmax: 4.32409 ymax: 43.7889
Geodetic CRS:  ETRS89
First 10 features:
   codauto ine.ccaa.name cpro ine.prov.name cmun                 name LAU_CODE
1       01     Andalucía   04       Almería  001                 Abla    04001
2       01     Andalucía   04       Almería  002             Abrucena    04002
3       01     Andalucía   04       Almería  003                 Adra    04003
4       01     Andalucía   04       Almería  004            Albánchez    04004
5       01     Andalucía   04       Almería  005            Alboloduy    04005
6       01     Andalucía   04       Almería  006                Albox    04006
7       01     Andalucía   04       Almería  007              Alcolea    04007
8       01     Andalucía   04       Almería  008             Alcóntar    04008
9       01     Andalucía   04       Almería  009 Alcudia de Monteagud    04009
10      01     Andalucía   04       Almería  010              Alhabia    04010
   siglas votos_por_mun                       geometry
1    PSOE          1272 POLYGON ((-2.77744 37.23836...
2    PSOE          1458 POLYGON ((-2.88984 37.09213...
3      PP         12470 POLYGON ((-2.93161 36.75079...
4    PSOE           434 POLYGON ((-2.13138 37.29959...
5    PSOE           664 POLYGON ((-2.70077 37.09674...
6    PSOE          5630 POLYGON ((-2.15335 37.54576...
7    PSOE           846 POLYGON ((-3.05663 36.88506...
8      PP           546 POLYGON ((-2.65344 37.33238...
9      PP           200 POLYGON ((-2.27371 37.2416,...
10   PSOE           902 POLYGON ((-2.5425 36.97485,...

Por último, graficamos 📊

mapa_municipio <- 
ggplot(mapa_muni) + 
  geom_sf(aes(fill = siglas), color = NA) + 
  scale_fill_manual(
    values = c(
      'PSOE' = '#e30613',  
      'PP' = '#1e4b8f',    
      'VOX' = '#008000',   
      'PODEMOS' = '#5f457c',  
      'ERC' = '#ffcc00',  
      'PNV' = '#2E8B57',   
      'BNG' = '#4682B4',   
      'OTROS' = '#808080'  
    )
  ) + 
  labs(
    title = "Partido Ganador por Municipio en las Elecciones de 2019",
    fill = "Partido"
  ) +
  theme_minimal() + theme(axis.title = element_text(family = "serif"),
    axis.text = element_text(family = "mono"),
    axis.text.x = element_text(family = "NimbusRom"),
    axis.text.y = element_text(family = "Palatino"),
    plot.title = element_text(family = "serif",
        size = 20, face = "bold"), legend.text = element_text(family = "serif"),
    legend.title = element_text(family = "serif"),
    panel.background = element_rect(fill = "aliceblue"),
    plot.background = element_rect(fill = "antiquewhite")) + theme(legend.title = element_text(size = 16))

mapa_municipio

Mapa Provincias

Mapa ganador por provincia en las elecciones de 2019

RESUMEN

datos_mapa_prov <- 
  election_data_tidy |> 
  filter(anno == 2019) |> 
  group_by(codigo_provincia, siglas) |> 
  summarise(votos_totales = sum(votos, na.rm = TRUE)) |> 
  filter(siglas != "OTROS") |> 
  slice_max(votos_totales, n = 1)

datos_mapa_prov

# A tibble: 52 × 3
# Groups:   codigo_provincia [52]
   codigo_provincia siglas votos_totales
   <chr>            <chr>          <dbl>
 1 01               PNV           160664
 2 02               PSOE          288714
 3 03               PSOE          995704
 4 04               PSOE          373792
 5 05               PP            128578
 6 06               PSOE          597712
 7 07               PP            381292
 8 08               ERC          2657292
 9 09               PSOE          255362
10 10               PSOE          354756
# ℹ 42 more rows

GEOMETRÍA

mapa_provi <- 
  esp_get_prov() |> 
  left_join(datos_mapa_prov, by = c("cpro" = "codigo_provincia"))

mapa_provi

Simple feature collection with 52 features and 25 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -13.21924 ymin: 34.70209 xmax: 4.320511 ymax: 43.78793
Geodetic CRS:  ETRS89
First 10 features:
   codauto cpro iso2.prov.code nuts.prov.code ine.prov.name iso2.prov.name.es
1       01   04          ES-AL          ES611       Almería           Almería
2       01   11          ES-CA          ES612         Cádiz             Cádiz
3       01   14          ES-CO          ES613       Córdoba           Córdoba
4       01   18          ES-GR          ES614       Granada           Granada
5       01   21           ES-H          ES615        Huelva            Huelva
6       01   23           ES-J          ES616          Jaén              Jaén
7       01   29          ES-MA          ES617        Málaga            Málaga
8       01   41          ES-SE          ES618       Sevilla           Sevilla
9       02   22          ES-HU          ES241        Huesca            Huesca
10      02   44          ES-TE          ES242        Teruel            Teruel
   iso2.prov.name.ca iso2.prov.name.ga iso2.prov.name.eu cldr.prov.name.en
1               <NA>              <NA>              <NA>           Almería
2               <NA>              <NA>              <NA>             Cádiz
3               <NA>              <NA>              <NA>           Córdoba
4               <NA>              <NA>              <NA>           Granada
5               <NA>              <NA>              <NA>            Huelva
6               <NA>              <NA>              <NA>              Jaén
7               <NA>              <NA>              <NA>            Málaga
8               <NA>              <NA>              <NA>           Seville
9               <NA>              <NA>              <NA>            Huesca
10              <NA>              <NA>              <NA>            Teruel
      cldr.prov.name.es    cldr.prov.name.ca           cldr.prov.name.ga
1  Provincia de Almería  Província d’Almeria        Provincia de Almería
2    Provincia de Cádiz   Província de Cadis          Provincia de Cádiz
3  Provincia de Córdoba Província de Còrdova        Provincia de Córdoba
4  Provincia de Granada Província de Granada        Provincia de Granada
5   Provincia de Huelva  Província de Huelva         Provincia de Huelva
6     Provincia de Jaén    Província de Jaén           Provincia de Xaén
7   Provincia de Málaga  Província de Màlaga         Provincia de Málaga
8  Provincia de Sevilla Província de Sevilla        Provincia de Sevilla
9   Provincia de Huesca     Província d’Osca Provincia de Huesca - Uesca
10  Provincia de Teruel   Província de Terol         Provincia de Teruel
      cldr.prov.name.eu prov.shortname.en prov.shortname.es prov.shortname.ca
1  Almeríako probintzia           Almería           Almería           Almeria
2    Cádizko probintzia             Cádiz             Cádiz             Cadis
3  Kordobako probintzia           Córdoba           Córdoba           Còrdova
4  Granadako probintzia           Granada           Granada           Granada
5   Huelvako probintzia            Huelva            Huelva            Huelva
6     Jaéngo probintzia              Jaén              Jaén              Jaén
7   Málagako probintzia            Málaga            Málaga            Màlaga
8  Sevillako probintzia           Seville           Sevilla           Sevilla
9   Huescako probintzia            Huesca            Huesca              Osca
10  Teruelgo probintzia            Teruel            Teruel             Terol
   prov.shortname.ga prov.shortname.eu nuts2.code nuts2.name nuts1.code
1            Almería           Almería       ES61  Andalucía        ES6
2              Cádiz             Cádiz       ES61  Andalucía        ES6
3            Córdoba           Kordoba       ES61  Andalucía        ES6
4            Granada           Granada       ES61  Andalucía        ES6
5             Huelva            Huelva       ES61  Andalucía        ES6
6               Xaén              Jaén       ES61  Andalucía        ES6
7             Málaga            Málaga       ES61  Andalucía        ES6
8            Sevilla           Sevilla       ES61  Andalucía        ES6
9             Huesca            Huesca       ES24     Aragón        ES2
10            Teruel            Teruel       ES24     Aragón        ES2
   nuts1.name  siglas votos_totales                       geometry
1         SUR    PSOE        373792 MULTIPOLYGON (((-1.640825 3...
2         SUR PODEMOS        813108 MULTIPOLYGON (((-5.144902 3...
3         SUR    PSOE        617560 MULTIPOLYGON (((-4.275964 3...
4         SUR    PSOE        671664 MULTIPOLYGON (((-2.357043 3...
5         SUR    PSOE        378208 MULTIPOLYGON (((-6.195137 3...
6         SUR    PSOE        588300 MULTIPOLYGON (((-2.536215 3...
7         SUR    PSOE        953146 MULTIPOLYGON (((-4.33173 37...
8         SUR    PSOE       1590158 MULTIPOLYGON (((-5.590053 3...
9     NORESTE    PSOE        157664 MULTIPOLYGON (((-0.74093 42...
10    NORESTE    PSOE         88554 MULTIPOLYGON (((0.197914 41...

GRAFICAMOS 📊

partido_colores <- c(
  'PSOE' = '#e30613',
  'PP' = '#1e4b8f',
  'VOX' = '#008000',
  'PODEMOS' = '#5f457c',
  'ERC' = '#ffcc00',
  'PNV' = '#2E8B57',
  'BNG' = '#4682B4',
  'OTROS' = '#808080')

mapa_provincias <- 
ggplot(mapa_provi) +
  geom_sf(aes(fill = siglas), color = "white") + 
  scale_fill_manual(values = partido_colores) +
  labs(title = "Partido Ganador por Provincia en España (2019)",
       fill = "Partido") +
  theme_minimal() + theme(plot.title = element_text(family = "serif",
    size = 22, face = "bold"), legend.text = element_text(family = "serif"),
    legend.title = element_text(size = 16,
        family = "serif"), panel.background = element_rect(fill = "aliceblue"),
    plot.background = element_rect(fill = "antiquewhite"))

mapa_provincias

PSOE VS PP (medios de comunicación)

Qué medios de comunicación estimaron que las elecciones del 2019 las ganaba el PP vs qué medios de comunicación estimarón que ganaba el PSOE.

Filtrar los datos para el año 2019 y los partidos PP y PSOE

medios <- surveys_tidy |> 
  filter(year(date_elec) == 2019 & Partido %in% c("PP", "PSOE")) |> 
  select(c(date_elec, media, Partido, Estimación))

medios

# A tibble: 920 × 4
   date_elec  media       Partido Estimación
   <date>     <chr>       <chr>        <dbl>
 1 2019-04-28 EL ESPAÑOL  PSOE          27.9
 2 2019-04-28 EL ESPAÑOL  PP            17.9
 3 2019-04-28 COPE        PSOE          26.8
 4 2019-04-28 COPE        PP            18.3
 5 2019-04-28 RTVE–FORTA  PSOE          28.1
 6 2019-04-28 RTVE–FORTA  PP            17.8
 7 2019-04-28 ELECTOMANÍA PSOE          25.3
 8 2019-04-28 ELECTOMANÍA PP            20.2
 9 2019-04-28 ELECTOMANÍA PSOE          25.6
10 2019-04-28 ELECTOMANÍA PP            20.1
# ℹ 910 more rows

Dividir los datos entre PP y PSOE

pp_data <- medios |> filter(Partido == "PP")
pp_data

# A tibble: 460 × 4
   date_elec  media           Partido Estimación
   <date>     <chr>           <chr>        <dbl>
 1 2019-04-28 EL ESPAÑOL      PP            17.9
 2 2019-04-28 COPE            PP            18.3
 3 2019-04-28 RTVE–FORTA      PP            17.8
 4 2019-04-28 ELECTOMANÍA     PP            20.2
 5 2019-04-28 ELECTOMANÍA     PP            20.1
 6 2019-04-28 EL ESPAÑOL      PP            18.2
 7 2019-04-28 ELECTOMANÍA     PP            19.9
 8 2019-04-28 ELDIARIO.ES     PP            23.5
 9 2019-04-28 EL CONFIDENCIAL PP            19.8
10 2019-04-28 EL ESPAÑOL      PP            18.4
# ℹ 450 more rows

psoe_data <- medios |> filter(Partido == "PSOE")
psoe_data

# A tibble: 460 × 4
   date_elec  media           Partido Estimación
   <date>     <chr>           <chr>        <dbl>
 1 2019-04-28 EL ESPAÑOL      PSOE          27.9
 2 2019-04-28 COPE            PSOE          26.8
 3 2019-04-28 RTVE–FORTA      PSOE          28.1
 4 2019-04-28 ELECTOMANÍA     PSOE          25.3
 5 2019-04-28 ELECTOMANÍA     PSOE          25.6
 6 2019-04-28 EL ESPAÑOL      PSOE          29.5
 7 2019-04-28 ELECTOMANÍA     PSOE          27.4
 8 2019-04-28 ELDIARIO.ES     PSOE          28.9
 9 2019-04-28 EL CONFIDENCIAL PSOE          29.8
10 2019-04-28 EL ESPAÑOL      PSOE          30  
# ℹ 450 more rows

Unir los datos por ‘media’ y ‘date_elec’ para poder comparar las estimaciones

comparacion <- pp_data |> 
  inner_join(psoe_data, by = c("media", "date_elec"), 
             suffix = c("_PP", "_PSOE"))

comparacion

# A tibble: 14,884 × 6
   date_elec  media      Partido_PP Estimación_PP Partido_PSOE Estimación_PSOE
   <date>     <chr>      <chr>              <dbl> <chr>                  <dbl>
 1 2019-04-28 EL ESPAÑOL PP                  17.9 PSOE                    27.9
 2 2019-04-28 EL ESPAÑOL PP                  17.9 PSOE                    29.5
 3 2019-04-28 EL ESPAÑOL PP                  17.9 PSOE                    30  
 4 2019-04-28 EL ESPAÑOL PP                  17.9 PSOE                    30.5
 5 2019-04-28 EL ESPAÑOL PP                  17.9 PSOE                    31  
 6 2019-04-28 EL ESPAÑOL PP                  17.9 PSOE                    30.8
 7 2019-04-28 EL ESPAÑOL PP                  17.9 PSOE                    30.5
 8 2019-04-28 EL ESPAÑOL PP                  17.9 PSOE                    30.2
 9 2019-04-28 EL ESPAÑOL PP                  17.9 PSOE                    29.9
10 2019-04-28 EL ESPAÑOL PP                  17.9 PSOE                    30.1
# ℹ 14,874 more rows

Filtrar los medios donde PP tiene mayor estimación que PSOE
Filtrar los medios donde PSOE tiene mayor estimación que PP

resultado_pp <-
  comparacion |> 
  filter(Estimación_PP > Estimación_PSOE)

resultado_pp

# A tibble: 48 × 6
   date_elec  media       Partido_PP Estimación_PP Partido_PSOE Estimación_PSOE
   <date>     <chr>       <chr>              <dbl> <chr>                  <dbl>
 1 2019-04-28 ELDIARIO.ES PP                  23.8 PSOE                    23.7
 2 2019-04-28 ELDIARIO.ES PP                  23.8 PSOE                    23.7
 3 2019-04-28 ELDIARIO.ES PP                  23.8 PSOE                    23.7
 4 2019-04-28 ELDIARIO.ES PP                  23.8 PSOE                    23.7
 5 2019-04-28 ELDIARIO.ES PP                  23.9 PSOE                    23.7
 6 2019-04-28 ELDIARIO.ES PP                  23.9 PSOE                    23.7
 7 2019-04-28 ELDIARIO.ES PP                  23.9 PSOE                    23.7
 8 2019-04-28 ELDIARIO.ES PP                  23.9 PSOE                    23.7
 9 2019-04-28 ELDIARIO.ES PP                  23.8 PSOE                    23.7
10 2019-04-28 ELDIARIO.ES PP                  23.8 PSOE                    23.7
# ℹ 38 more rows

resultado_psoe <- 
  comparacion |> 
  filter(Estimación_PP < Estimación_PSOE)

resultado_psoe

# A tibble: 14,820 × 6
   date_elec  media      Partido_PP Estimación_PP Partido_PSOE Estimación_PSOE
   <date>     <chr>      <chr>              <dbl> <chr>                  <dbl>
 1 2019-04-28 EL ESPAÑOL PP                  17.9 PSOE                    27.9
 2 2019-04-28 EL ESPAÑOL PP                  17.9 PSOE                    29.5
 3 2019-04-28 EL ESPAÑOL PP                  17.9 PSOE                    30  
 4 2019-04-28 EL ESPAÑOL PP                  17.9 PSOE                    30.5
 5 2019-04-28 EL ESPAÑOL PP                  17.9 PSOE                    31  
 6 2019-04-28 EL ESPAÑOL PP                  17.9 PSOE                    30.8
 7 2019-04-28 EL ESPAÑOL PP                  17.9 PSOE                    30.5
 8 2019-04-28 EL ESPAÑOL PP                  17.9 PSOE                    30.2
 9 2019-04-28 EL ESPAÑOL PP                  17.9 PSOE                    29.9
10 2019-04-28 EL ESPAÑOL PP                  17.9 PSOE                    30.1
# ℹ 14,810 more rows

Seleccionar los medios únicos que favorecen a PP
Seleccionar los medios únicos que favorecen a PSOE

media_con_mas_pp <- 
  resultado_pp |> 
  select(media) |> distinct()

media_con_mas_psoe <- 
  resultado_psoe |> 
  select(media) |> distinct()

media_con_mas_pp

# A tibble: 1 × 1
  media      
  <chr>      
1 ELDIARIO.ES

media_con_mas_psoe

# A tibble: 22 × 1
   media          
   <chr>          
 1 EL ESPAÑOL     
 2 COPE           
 3 RTVE–FORTA     
 4 ELECTOMANÍA    
 5 ELDIARIO.ES    
 6 EL CONFIDENCIAL
 7 LA RAZÓN       
 8 ABC            
 9 ESDIARIO       
10 EL PAÍS        
# ℹ 12 more rows

Análisis de Datos Electorales 🏛️️

Librerías Necesarias y carga de los datos

Filtrar años de 2008 a 2019

Queremos datos en TIDYDATA

Queremos datos en TIDYDATA

Queremos datos en TIDYDATA

Queremos datos en TIDYDATA

Queremos datos en TIDYDATA

Reestructuración de los Partidos Políticos

Filtrar en el dataset Encuestas

Ítems Obligatorios

Pregunta 1

Pregunta 2

Pregunta 3

Pregunta 4

Pregunta 5

Pregunta 6

Mapa Municipios

Mapa Provincias

PSOE VS PP (medios de comunicación)

GRACIAS 😊😊😊

👥 Adolfo Alarcón, Gonzalo Blanca, Pablo Galarón, Mario Gómez, Itziar Carro, Hugo Alonso, Raúl Palomo