---
title: "A quick start guide to rKolada"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{A quick start guide to rKolada}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

This vignette provides a quick start guide to get up and running with rKolada as fast as possible. For a more comprehensive introduction to the rKolada package, see [Introduction to rKolada](introduction-to-rkolada.html).

```{r setup}
library("rKolada")
```

In this guide we walk you through five steps: downloading, searching and describing Kolada metadata, then using the search results to download data from Kolada and plot it.

Kolada stores data in three dimensions: **KPI** (what is measured), **municipality/region** (where), and **year** (when). To download data you specify at least two of these three. See `vignette("introduction-to-rkolada")` for a deeper explanation.

### 1. Get metadata

Kolada contains five different types of metadata entities:

1. `kpi`: Key Performance Indicators, the measures themselves (e.g. "Share with post-secondary education")
1. `municipality`: Municipalities and regions, identified by numeric codes (e.g. Stockholm = `"0180"`)
1. `ou`: Operating Units, sub-units within municipalities (e.g. a specific school or elderly home)
1. `kpi_groups`: Thematic groupings of KPIs
1. `municipality_groups`: Thematic groupings of municipalities

To obtain data using `rKolada` it is usually a good idea to start by exploring metadata. `rKolada` comes with convenience functions for each of the five entities listed above. These functions are all named `get_[entity]()` and can be called as follows.
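As a small illustration of the `get_[entity]()` naming pattern, the base R sketch below builds the five function names from the entity names in the list above. It assumes that every entity maps one-to-one onto an exported function of that name; the variable names `entities` and `getter_names` are made up for this example and are not part of `rKolada`.

```r
# Entity names as listed above
entities <- c("kpi", "municipality", "ou", "kpi_groups", "municipality_groups")

# Assumption: each entity has a matching get_[entity]() function
getter_names <- paste0("get_", entities)

print(getter_names)
```
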
The `cache` parameter allows you to temporarily store results on disk to avoid repeated calls to the API in case you need to re-run your code:

```{r, echo = FALSE}
kpis <- rKolada:::kpi_df
munic <- rKolada:::munic
```

```{r, eval = FALSE}
kpis <- get_kpi(cache = FALSE)
munic <- get_municipality(cache = FALSE)
```

If you have already familiarised yourself with the Kolada API (e.g. by reading the [official docs on GitHub](https://github.com/Hypergene/kolada)) you can access the full metadata API using `get_metadata()`.

### 2. Search metadata

Metadata tables are stored as regular `tibble`s, so you can start inspecting them by simply viewing them in RStudio. For example, the KPI metadata we downloaded looks like this:

```{r}
dplyr::glimpse(kpis)
```

`rKolada` also comes with a set of convenience functions that simplify the task of exploring KPI metadata. `kpi_search()` filters a table of KPIs using a search term, and `kpi_minimize()` removes columns from the KPI metadata table that carry no information distinguishing one KPI from another:

```{r}
# Get a list of KPIs matching a search for "bruttoregionprodukt" (Gross regional product)
kpi_res <- kpis |>
  kpi_search("bruttoregionprodukt") |>
  # "K" = data available at municipality level (not just region)
  kpi_search("K", column = "municipality_type") |>
  kpi_minimize(remove_monotonous_data = TRUE)

dplyr::glimpse(kpi_res)
```

Let's say we are interested in retrieving data for three Swedish municipalities. We thus want to create a table containing metadata about these municipalities:

```{r}
munic_res <- munic |>
  # type "K" = municipality (kommun), vs "L" for region (län)
  municipality_search("K", column = "type") |>
  # Only keep Stockholm, Gothenburg and Malmö
  municipality_search(c("Stockholm", "Göteborg", "Malmö"))

dplyr::glimpse(munic_res)
```

### 3. Describe KPIs

In addition to the information provided about every KPI in the `title` and `description` columns of a KPI table, `kpi_bind_keywords()` can be used to create a rough summary of every KPI by adding a number of _keyword_ columns. The function `kpi_describe()` prints a human-readable summary of a table of KPIs. For instance, by setting the `knitr` chunk option `results='asis'`, the following code renders a Markdown section that is automatically included in the HTML of this web page:

```{r, echo = TRUE, results='asis'}
kpi_res |>
  kpi_bind_keywords(n = 4) |>
  kpi_describe(max_n = 1, format = "md", heading_level = 4, sub_heading_level = 5)
```

### 4. Get data

Once we have settled on which KPIs we are interested in, the next step is to download actual data from Kolada. Use `get_values()` to do this. To download data from the Kolada API you need to provide at least two of the following parameters:

1. `kpi`: One or a vector of several KPI IDs
1. `municipality`: One or a vector of several municipality IDs or municipality group IDs
1. `period`: The years for which data should be downloaded.

The ID tags for KPIs and municipalities can be extracted using the convenience functions `kpi_extract_ids()` and `municipality_extract_ids()`:

```{r, echo = FALSE}
kld_data <- rKolada:::kld_data
```

```{r, eval = FALSE}
kld_data <- get_values(
  kpi = kpi_extract_ids(kpi_res),
  municipality = municipality_extract_ids(munic_res),
  period = 1990:2019,
  simplify = TRUE
)
```

`simplify = TRUE` does two things: it replaces numeric IDs with readable names (e.g. `"0180"` becomes `"Stockholm"`) and drops columns that are only used internally.

### 5. Inspect and visualise results

Finally, time to inspect our data:

```{r}
# Visualise results
library("ggplot2")

ggplot(kld_data, aes(x = year, y = value)) +
  # One line per municipality, coloured by name
  geom_line(aes(color = municipality)) +
  # Separate panel per KPI (stacked vertically)
  facet_grid(kpi ~ .) +
  # Thousand separators for readability
  scale_y_continuous(labels = scales::comma) +
  labs(
    title = "Gross Regional Product",
    subtitle = "Yearly development in Sweden's three\nmost populous municipalities",
    x = "Year",
    y = "",
    caption = values_legend(kld_data, kpis) # Auto-generated KPI legend
  )
```

Note the use of the helper function `values_legend()` to produce a legend containing the names and keys of all KPIs included in the graph.

## Next steps

- **Deeper walkthrough**: `vignette("introduction-to-rkolada")` covers the full data model, metadata groups, and more complex workflows.
- **ggplot2 reference**: for more on visualisation.

## Related packages

If you work with data from PX-Web APIs (Statistics Sweden, Statistics Norway, Statistics Finland, etc.), see [pixieweb](https://lchansson.github.io/pixieweb/), a sibling package that follows the same design principles as rKolada.