A Minimal Introduction to GIS (in R)
This is a minimal introduction to GIS and handling spatial data in R compiled for the Biological Sciences BSc(Honours) class at the University of Cape Town.
The goal is to give you a very brief introduction to Geographic Information Systems (GIS) in general and some familiarity with handling spatial data in R. GIS is a field of research that many people dedicate their entire lives to, yet we only have a week, so this really is a minimalist introduction. I’ll focus on giving you a broad overview and some idea of how to teach yourself (using R).
The core outcomes I hope you’ll come away with:
- Some familiarity with GIS and what it can help you achieve
- Some familiarity with GIS jargon and technical terms
- Some awareness of the common problems and pitfalls when using GIS
- Some familiarity with handling spatial data in R
- Some hints and resources to help you teach yourself R
- Some idea of how to help yourself or find help when you inevitably come unstuck…
These course notes borrow or paraphrase extensively from Adam Wilson’s GEO 511 Spatial Data Science course, Manny Gimond’s Intro to GIS & Spatial Analysis and the 2020 series of GIS Lecture Lunches by Thomas Slingsby and Nicholas Lindenberg from UCT Library’s GIS Support Unit.
Other very valuable resources include:
All code, images, etc can be found here. I have only used images etc that were made available online under a non-restrictive license (Creative Commons, etc) and have attributed my sources. Content without attribution is my own and shared under the license below. If there is any content you find concerning with regard to licensing, or that you find offensive, please contact me. Any feedback, positive or negative, is welcome!
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
To follow along, you will need:
- to have working R and RStudio installations - this tutorial may help if needed
- to install the following R packages (you can copy and run the code):
install.packages(c("tidyverse", "sp", "rgdal", "raster", "sf", "lwgeom", "terra", "stars", "exactextractr"))
- optional, but handy for some visualizations:
install.packages(c("cowplot", "hrbrthemes", "knitr", "leaflet", "htmltools", "rosm", "ggspatial", "rnaturalearth", "mapview", "tmap"))
- to download the datasets discussed in section 7.1
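If you are not sure which of the core packages listed above are already on your machine, a quick check along these lines may help. This is only a sketch: the variable names `core_pkgs` and `missing_pkgs` are my own, and it reports what is missing rather than installing anything.

```r
# Core packages from the install.packages() call above
core_pkgs <- c("tidyverse", "sp", "rgdal", "raster", "sf", "lwgeom",
               "terra", "stars", "exactextractr")

# Compare against the packages currently installed in your R library
missing_pkgs <- setdiff(core_pkgs, rownames(installed.packages()))

if (length(missing_pkgs) > 0) {
  # Note: rgdal was archived from CRAN in 2023, so it may be listed here
  # on newer setups even after running install.packages()
  message("Not yet installed: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All core packages are installed.")
}
```

You could then pass `missing_pkgs` to `install.packages()` to install only what is needed.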
Lectures will be held in the mornings from 10:00 to 12:00, Monday to Thursday, in BIOLT1. Thursday afternoon, 2-4PM, will be presentation day (in person, ESKOM willing…).
Afternoons are self-study time where you will incrementally develop your own individual GIS project in R and RMarkdown or Quarto. The project will count for 70% of your mark for the module and will be due on Thursday the 16th March. You will need to submit a .Rmd or .qmd file and the stitched HTML notebook. You’re welcome to do this in a Git repository (nudge nudge), now that we’ve completed the Reproducible Research module.
The assignment is essentially about setting up a GIS workflow, and it will be assessed on whether you’ve absorbed the content of the lectures. The topic and datasets used will be up to you.
Pro tip: Use this as an opportunity to get a kick start on your Honours projects. If your Honours project doesn’t require GIS (which is unlikely) then either help a buddy or let your curiosity roam wild! This is a teaching exercise, so it doesn’t even have to be based on biological data, but it would help you to explore some of the data sources suggested below.
The project objectives, broken down as daily goals to help you pace yourselves:
- Day 1:
  - Define a question that requires GIS, the kinds of data you’ll need to address the question, and describe in words what you think you would need to do with the data to get there.
  - Find and describe the datasets you’ll need for your analysis (type, source, how created, etc).
- Day 2:
  - Describe the GIS workflow you think you’ll need to perform your analysis (in words and/or a figure). Reconsider and refine your datasets.
- Day 3:
  - Translate your workflow into the R functions you think you’ll need to use and begin coding and running the GIS workflow in R.
- Day 4:
  - Until the 16th March: Iterate over the previous steps until done!
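To give a feel for what "translating a workflow into R functions" can look like, here is a minimal sketch using the sf package. It is not a template for your project: the dataset is just the North Carolina counties demo layer that ships with sf, and the steps (read, check the coordinate reference system, reproject, calculate areas, buffer, select) and object names are illustrative assumptions on my part.

```r
library(sf)

# 1. Read: a demo polygon layer bundled with the sf package
counties <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# 2. Inspect: check the coordinate reference system before doing anything else
print(st_crs(counties))

# 3. Reproject: move to a projected CRS with units in metres
#    (EPSG:32119 is NAD83 / North Carolina)
counties_m <- st_transform(counties, 32119)

# 4. Derive: polygon areas, converted from square metres to square kilometres
counties_m$area_km2 <- as.numeric(st_area(counties_m)) / 1e6

# 5. Analyse: 10 km buffer around the first county, then select every
#    county that intersects the buffer (spatial subsetting)
buf <- st_buffer(counties_m[1, ], dist = 10000)
nearby <- counties_m[buf, ]

# 6. Report: a quick summary of the result
print(nrow(nearby))
print(summary(nearby$area_km2))
```

Each numbered comment corresponds to one step you might first have written down in words or drawn as a box in a workflow figure.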
I will be available 2-3PM on Wednesday afternoon and 10-11AM on Friday morning as a “help desk” to help you refine your projects and troubleshoot issues.
Lightning talks: you’ll do online lightning presentations (30% of your mark for the module) on your GIS projects (1 slide, a 2-minute presentation, 1 minute for questions) on Thursday afternoon (2PM, 9th March). Don’t worry, you won’t lose marks if your projects are not yet complete! We just want to know what you’re doing your project on, what datasets you’re using, and what you plan to do with them. Please read the instructions in the Google slide deck and add your slide.
Some sources of local data to help you get started. Feel free to look for others! If you find good ones, let me know and I’ll add them.
- SANBI’s “Biodiversity GIS” - https://bgis.sanbi.org/
- SANBI’s Botanical Database of South Africa (BODATSA) - http://newposa.sanbi.org/
- SAEON’s Data Catalogue - http://catalogue.saeon.ac.za/
- City of Cape Town’s Open Data Portal - https://odp.capetown.gov.za/
- iNaturalist - https://www.inaturalist.org/ (accessible from R - see section 7.7)
- Google Datasets Search - https://datasetsearch.research.google.com/
Make sure to check the data use policies and confirm that you have permission to use the datasets!