Chapter 1 Introduction
You are here because creating reproducible and accessible R code is important to your job with Birds Canada and ultimately to bird conservation.
The goal of this guidebook is to give you (analytically savvy staff, contractors, and partners) the tools needed to develop reproducible R code, along with step-by-step instructions for archiving old code and sharing existing code and relevant documents on GitHub. Later chapters discuss working collaboratively using Git version control and GitHub. But do not fret: we will start with the basics.
While some of the instructions in this book are general guidelines or best practices, others are mandatory for staff and contractors, to ensure that analytical scripts are properly stored and accessible for future use.
1.1 Why is this Important?
Someone unfamiliar with your project should be able to look at your R code, data files, and outputs and understand in detail what you did and why. This someone will very likely be your future self! We often discover flaws in our initial preparation of the data, gain access to new data, or decide that the parameterization of a particular model was not correct. This means that the analysis you did last week, month, or year will probably need to be redone. If you have organized and documented your work clearly, repeating the analysis will be much easier.
Staff turnover is also a reality in all organizations. If you leave Birds Canada for greener pastures, your colleagues or replacements will need to know where to find your scripts, files, and outputs. This helps ensure that knowledge is not lost and time is not wasted recreating your workflows.
Last, funders, partners, and publishers increasingly require that data and analytical scripts be made available, so that the FAIR principles of data management and stewardship are followed: Findability, Accessibility, Interoperability, and Reusability. This is also a priority for Birds Canada as we strive to make our work more open and transparent.
1.2 Prerequisites
This book assumes that you have a basic understanding of R and RStudio. Whether or not you are new to R, we highly recommend that you become familiar with R for Data Science by Garrett Grolemund and Hadley Wickham. Their book covers how to import, visualize, and summarize data in R using the tidyverse collection of R packages. It also provides an invaluable framework for organizing your workflow to create clean, reproducible code.
1.3 Acknowledgements
Dr. Steffi LaZerte provided valuable comments and edits to improve the clarity and usability of this book.
This R book will be updated as needed. If you have any comments or feedback, please submit them to dethier@birdscanada.org.