A Minimal Introduction to GIS (in R)
2025-02-28
1 Overview
This is a minimal introduction to GIS and handling spatial data in R compiled for the Biological Sciences BSc(Honours) class at the University of Cape Town.
1.1 General
The goal is to give you a very brief introduction to Geographic Information Systems (GIS) in general and some familiarity with handling spatial data in R. GIS is a field of research that many people dedicate their entire lives to, yet we only have a week, so this really is a minimalist introduction. I’ll focus on giving you a broad overview and some idea of how to teach yourself (using R).
The core outcomes I hope you’ll come away with:
- Some familiarity with GIS and what it can help you achieve
- Some familiarity with GIS jargon and technical terms
- Some awareness of the common problems and pitfalls when using GIS
- Some familiarity with handling spatial data in R
- Some hints and resources to help you teach yourself R
- Some idea of how to help yourself or find help when you inevitably come unstuck…
These course notes borrow or paraphrase extensively from Adam Wilson’s GEO 511 Spatial Data Science course, Manny Gimond’s Intro to GIS & Spatial Analysis and the 2020 series of GIS Lecture Lunches by Thomas Slingsby and Nicholas Lindenberg from UCT Library’s GIS Support Unit.
Other very valuable resources include:
- Lovelace et al.’s online book Geocomputation with R
- Ryan Garnett’s cheatsheet for library(sf)
- Pebesma and Bivand’s Spatial Data Science
All code, images, etc can be found here. I have only used images etc that were made available online under a non-restrictive license (Creative Commons, etc) and have attributed my sources. Content without attribution is my own and shared under the license below. If there is any content you find concerning with regard to licensing, or that you find offensive, please contact me. Any feedback, positive or negative, is welcome!
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
1.2 Using this resource
I’ve included quite a bit of demonstration R code in Chapters 7 and 8. To run the code you will need:
- to have working R and RStudio installations - this tutorial may help if needed. To have it integrated with GitHub, you can follow this tutorial.
- to install the following R packages (you can copy and run the code):
- required:
install.packages(c("tidyverse", "sf", "lwgeom", "terra", "stars", "exactextractr"))
- optional, but handy for some visualizations:
install.packages(c("cowplot", "hrbrthemes", "knitr", "leaflet", "htmltools", "rosm", "ggspatial", "rnaturalearth", "mapview", "tmap"))
- to download the datasets discussed in section 7.1
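If you are not sure which of the required packages you already have, the following is a minimal sketch for checking before you install anything. The helper name `missing_packages()` is my own invention for illustration (it is not part of any package); it simply compares a list of names against what `installed.packages()` reports.

```r
# Return the names in `pkgs` that are not installed in the current R library.
# NOTE: `missing_packages()` is a hypothetical helper written for this example.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# The required packages listed above
required <- c("tidyverse", "sf", "lwgeom", "terra", "stars", "exactextractr")
to_install <- missing_packages(required)

if (length(to_install) == 0) {
  message("All required packages are already installed.")
} else {
  message("Still to install: ", paste(to_install, collapse = ", "))
  # Uncomment the next line to install only the missing ones:
  # install.packages(to_install)
}
```

The `install.packages()` call is left commented out so that running the check never changes your library by accident.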
1.3 Module and Project details
Lectures
Lectures will be held in the mornings from 10:00AM to 12:00PM, Monday to Thursday, in BIOLT1.
Friday will be lightning talk presentation day - also 10:00AM to 12:00PM in BIOLT1.
I will be available 1-2:30PM on Wednesday and 2-3PM on Thursday afternoon as a “help desk” to help you refine your projects and troubleshoot issues.
Projects
Afternoons are self-study time where you will incrementally develop your own individual GIS project in R and RMarkdown or Quarto. The project will count for 70% of your mark for the module and is due on Thursday the 6th of March.
You will need to do your analyses in an RMarkdown (.Rmd) file, which will create an HTML document when you knit it. Make sure all your code chunks are visible. Please submit your project as a link to a public Git repository that contains the .Rmd and .html files. In the README of your Git repo, provide a link to the knitted HTML file via https://htmlview.glitch.me/ or https://raw.githack.com/ (see the end of the README in this repo as a demonstration). This allows the rendered HTML document to be viewed in a browser without having to download it.
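As a rough sketch of what "visible code chunks" means in practice: the `echo` chunk option controls whether the source code of a chunk is printed in the knitted document. Setting it once near the top of your .Rmd applies it to every chunk. The file name `my_project.Rmd` below is a placeholder I made up, and the example assumes you have the knitr and rmarkdown packages (RStudio normally offers to install them the first time you knit).

```r
# Typically placed in the first ("setup") chunk of the .Rmd file.
if (requireNamespace("knitr", quietly = TRUE)) {
  # echo = TRUE prints the source code of every chunk in the knitted output
  knitr::opts_chunk$set(echo = TRUE)
}

# Knitting from the console instead of using the RStudio "Knit" button.
# "my_project.Rmd" is a hypothetical file name; replace it with your own.
# rmarkdown::render("my_project.Rmd", output_format = "html_document")
```

If an individual chunk sets `echo = FALSE` in its own header, that overrides the global setting, so check your chunk headers too.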
Note that the repo does not have to be reproducible (although you’re welcome to make it so), but the code should be visible, and you should state what you set out to do and what datasets you used (citing their sources and/or providing links to where you got them from).
Do not upload large or sensitive data to the repository - GitHub won’t handle large data. If you want/need to have the data in your local version of the Git repo on your laptop, then put it in a folder named data or similar and add a line to your .gitignore file with the name of the folder, e.g. data/* - see example here. You can just download the example file and add it to your Git repo if you didn’t create a .gitignore when you created the repo.
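If you prefer to do this from R rather than editing the file by hand, here is a minimal sketch using only base R. The function name `add_to_gitignore()` and the folder `demo_repo` are made up for illustration; the function appends a pattern to a .gitignore file (creating the file if needed) and does nothing if the pattern is already listed.

```r
# Append `pattern` to the .gitignore in `repo_dir`, unless it is already there.
# NOTE: `add_to_gitignore()` is a hypothetical helper written for this example.
add_to_gitignore <- function(pattern, repo_dir = ".") {
  path <- file.path(repo_dir, ".gitignore")
  existing <- if (file.exists(path)) readLines(path, warn = FALSE) else character(0)
  if (!pattern %in% existing) {
    writeLines(c(existing, pattern), path)
  }
  invisible(path)
}

# Demonstration in a temporary folder so that no real repository is touched.
demo_repo <- file.path(tempdir(), "demo_repo")
dir.create(demo_repo, showWarnings = FALSE)
add_to_gitignore("data/*", demo_repo)
readLines(file.path(demo_repo, ".gitignore"))
```

To use it on your own project you would call `add_to_gitignore("data/*")` from the top-level folder of your repo. Note that .gitignore only stops Git from picking up files it is not already tracking, so add the line before you commit the data folder.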
The assignment is essentially about setting up a GIS workflow, and it will be assessed on whether you’ve absorbed the content of the lectures. The topic and datasets used will be up to you, but again, this is a great opportunity to get a kick start on your Honours projects. If your Honours project doesn’t require GIS (which is unlikely) then either help a buddy or let your curiosity roam wild! This is a teaching exercise, so it doesn’t even have to be based on biological data, but it would help you to explore some of the data sources suggested below.
The project objectives, broken down as daily goals to help you pace yourselves this week:
- Day 1:
- Define a question that requires GIS, the kinds of data you’ll need to address the question, and describe in words what you think you would need to do with the data to get there.
- Find and describe the datasets you’ll need for your analysis (type, source, how created, etc).
- Day 2:
- Describe the GIS workflow you think you’ll need to perform your analysis (in words and/or a figure). Reconsider and refine your datasets.
- Day 3:
- Translate your workflow into the R functions you think you’ll need to use and begin coding and running the GIS workflow in R.
- Day 4:
- Keep working on the code.
- Develop your 1-slide lightning talk presentation.
- Day 5:
- Present your lightning talk
Lightning talks:
You’ll do lightning presentations (30% of your mark for the module) on your GIS projects (1 slide, 2-minute presentation, 1 minute for questions). Don’t worry, you won’t lose marks if your projects are not yet complete! We just want to know what you’re doing your project on, what datasets you’re using, and what you plan to do with them. Please read the instructions in the Google slide deck and add your slide.
Examples from previous years:
Here are some sources of local data to help you get started. Feel free to look for others! If you find good ones, let me know and I’ll add them.
- SANBI’s “Biodiversity GIS” - https://bgis.sanbi.org/
- SANBI’s Botanical Database of South Africa (BODATSA) - http://newposa.sanbi.org/
- SAEON’s Data Catalogue - http://catalogue.saeon.ac.za/
- City of Cape Town’s Open Data Portal - https://odp.capetown.gov.za/
- iNaturalist - https://www.inaturalist.org/ (accessible from R - see section 7.7)
- GBIF - https://www.gbif.org/ (accessible from R - see rgbif)
- Google Datasets Search - https://datasetsearch.research.google.com/
- The STAC Index - mostly for access to Cloud-Optimized GeoTIFFs (COGs)
Make sure to check the data use policies and confirm that you have permission to use the datasets!