Phoenician colonization
R-bloggers 2025-09-17
I was reading Phoenician colonization from its origin to the 7th century BC (Manzano-Agugliaro et al. 2025) and thought it was an interesting dataset, but alas: it is split across four tables, hidden behind a JavaScript redirect (wtf Taylor & Francis?), and uses DMS (degrees, minutes, seconds) coordinates, typos and special characters included… So not easily reusable.
Let’s go build an accessible dataset.
Config
library(readr)
library(purrr)
library(dplyr)
library(stringr)
library(ggplot2)
library(forcats)
library(janitor)
library(sf)
library(rnaturalearth)
library(glue)
library(parzer)
library(leaflet)

sf_use_s2(FALSE)
knitr::knit_hooks$set(crop = knitr::hook_pdfcrop)
Data
We need to download the CSVs manually (parts 1, 2, 3 and 4) because there is an anti-scraping mechanism… Then a little cleaning and coordinate parsing with the very nice {parzer} package lets us build a spatial object with {sf}.
# One CSV per table of the paper, named after its starting century
sources <- list(
  c_10_bce = "data_raw/T0001-10.1080_17445647.2025.2528876.csv",
  c_09_bce = "data_raw/T0002-10.1080_17445647.2025.2528876.csv",
  c_08_bce = "data_raw/T0003-10.1080_17445647.2025.2528876.csv",
  c_07_bce = "data_raw/T0004-10.1080_17445647.2025.2528876.csv")

phoenician <- sources |>
  imap(\(f, c) {
    read_csv(f) |>
      mutate(century_start_bce = parse_number(c))  # "c_10_bce" -> 10
  }) |>
  list_rbind() |>
  clean_names() |>
  # Replace the Unicode minus sign and the decimal comma before parsing
  mutate(lon = parse_lon(str_replace(longitude_e, "−", "-")),
         lat = parse_lat(str_replace(latitude_n, ",", "."))) |>
  st_as_sf(coords = c("lon", "lat"), crs = "EPSG:4326")
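To make the cleaning step less magic, here is a minimal sketch on made-up strings (not rows from the paper). The helper `clean_coord()` is hypothetical: it simply applies both replacements used above (Unicode minus, decimal comma) to any coordinate string before handing it to {parzer}.

```r
library(stringr)
library(parzer)

# Hypothetical helper: normalise the two glitches seen in the tables
clean_coord <- function(x) {
  x |>
    str_replace("\u2212", "-") |>  # Unicode minus sign -> ASCII hyphen-minus
    str_replace(",", ".")          # decimal comma -> decimal point
}

# Made-up inputs: decimal degrees with glitches, and degrees + decimal minutes
raw_lon <- c("\u22125.5", "45W54.2356")
raw_lat <- c("36,5", "45N54.2356")

lon <- parse_lon(clean_coord(raw_lon))
lat <- parse_lat(clean_coord(raw_lat))

data.frame(raw_lon, lon, raw_lat, lat)
```

Strings that {parzer} cannot make sense of should come back as `NaN` with a warning, so a `sum(is.na(lon))` afterwards is a cheap check.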
Maps
The resulting layer, mapped on a Natural Earth background, looks good.
# Natural Earth countries, clipped to the buffered extent of the settlements
world <- ne_countries() |>
  st_intersection(
    phoenician |>
      st_bbox() |>
      st_as_sfc() |>
      st_buffer(4, joinStyle = "MITRE", mitreLimit = 10))

phoenician |>
  ggplot() +
  geom_sf(data = world) +
  # Reverse the factor so the oldest century comes first in the legend
  geom_sf(aes(color = fct_rev(as_factor(century_start_bce)))) +
  theme_void() +
  labs(title = "Phoenician colonies",
       subtitle = "10th c. BCE - 7th c. BCE",
       color = "from\n(century BCE)",
       caption = glue("data doi:10.1080/17445647.2025.2528876 https://r.iresmi.net/ {Sys.Date()}")) +
  theme_minimal() +
  theme(plot.caption = element_text(size = 6),
        plot.background = element_rect(fill = "white"))
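A map is a good first check, but it can be complemented by a coarse range test. The sketch below uses made-up points and a bounding box I picked arbitrarily around the Mediterranean and the nearby Atlantic coast (both are assumptions, not values from the paper); anything falling outside is probably a typo or a swapped latitude/longitude.

```r
library(sf)
sf_use_s2(FALSE)  # planar predicates, as in the rest of the post

# Made-up points; the third one has lon and lat swapped on purpose
pts <- data.frame(name = c("a", "b", "c"),
                  lon = c(-6.3, 35.2, 36.5),
                  lat = c(36.5, 33.3, -6.3)) |>
  st_as_sf(coords = c("lon", "lat"), crs = "EPSG:4326")

# Arbitrary plausibility box (assumption)
box <- st_bbox(c(xmin = -10, ymin = 28, xmax = 38, ymax = 46),
               crs = st_crs(4326)) |>
  st_as_sfc()

# TRUE when the point lies inside the box
inside <- st_within(pts, box, sparse = FALSE)[, 1]

# Rows worth a second look
pts[!inside, ]
```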

You want more interactivity? Using {leaflet}…
phoenician |>
  leaflet() |>
  addTiles(attribution = r"(
           <a href="https://r.iresmi.net/">r.iresmi.net</a>.
           data: Manzano-Agugliaro et al. 2025. doi:10.1080/17445647.2025.2528876;
           map: <a href="https://www.openstreetmap.org/copyright/">OpenStreetMap</a>)") |>
  addCircleMarkers(
    # Popup: name, starting century and, when known, the later permanence
    popup = ~ glue("<b>{settlement}</b><br /><br /> from {century_start_bce}th c. BCE \\
                   {if_else(!is.na(centuries_of_subsequent_permanence), paste0('<br />to ', centuries_of_subsequent_permanence), '')}"),
    clusterOptions = markerClusterOptions())
Export
We can build a clean GeoPackage (and a CSV just in case):
# GeoPackage, with the source citation stored in the layer metadata
phoenician |>
  st_write(
    "data/phoenician_settlements.gpkg",
    layer = "phoenician_settlements",
    layer_options = c(
      "IDENTIFIER=Phoenician colonization from its origin to the 7th century BC",
      glue("DESCRIPTION=Data from: Manzano-Agugliaro, F., Marín-Buzón, C., Carpintero-Lozano, S., & López-Castro, J. L. (2025). \\
           Phoenician colonization from its origin to the 7th century BC. Journal of Maps, 21(1). \\
           https://doi.org/10.1080/17445647.2025.2528876 Available on https://doi.org/10.5281/zenodo.17141060 Extracted on {Sys.Date()} – https://r.iresmi.net/posts/2025/phoenician")),
    delete_layer = TRUE,
    quiet = TRUE)

# CSV, with the parsed coordinates as plain WGS84 columns
phoenician |>
  select(-c(latitude_n, longitude_e)) |>
  bind_cols(st_coordinates(phoenician)) |>
  rename(lon_wgs84 = X, lat_wgs84 = Y) |>
  st_drop_geometry() |>
  write_csv("data/phoenician_settlements.csv")
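To check that an export like this reads back correctly, here is a small round trip on a toy layer written to a temporary file. The data and the `IDENTIFIER` text are hypothetical, not the real settlements; only the layer name mirrors the one used above.

```r
library(sf)

# Toy layer standing in for the real one (made-up values)
toy <- data.frame(settlement = c("A", "B"),
                  century_start_bce = c(10, 8),
                  lon = c(-6.3, 35.2),
                  lat = c(36.5, 33.3)) |>
  st_as_sf(coords = c("lon", "lat"), crs = "EPSG:4326")

gpkg <- tempfile(fileext = ".gpkg")

st_write(toy, gpkg,
         layer = "phoenician_settlements",
         layer_options = "IDENTIFIER=Toy layer",
         quiet = TRUE)

# List the layers, then read the one we wrote
st_layers(gpkg)
back <- st_read(gpkg, layer = "phoenician_settlements", quiet = TRUE)

# Same number of features, attributes kept, still lon/lat
nrow(back) == nrow(toy)
st_is_longlat(back)
```

Note that the geometry column may come back under a different name than in the original object, so it is safer to compare attribute columns by name than to compare `names()` wholesale.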
And lastly we store them in a public repository: they are now available on Zenodo and therefore even have a DOI, doi:10.5281/zenodo.17141060
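References
Manzano-Agugliaro, F., Marín-Buzón, C., Carpintero-Lozano, S., & López-Castro, J. L. (2025). Phoenician colonization from its origin to the 7th century BC. Journal of Maps, 21(1). https://doi.org/10.1080/17445647.2025.2528876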