Before class
- Remind students to set up a GitHub account and email the instructor their username.
- Set up the class organization on GitHub.
- Add students’ usernames to the organization with “create repo” permissions and reply to their email with a link to the organization.
For class
- Download Gaeta_etal_CLC_data.csv.
- Arrange to have a teaching partner attend class and push the following code for the ‘Collaborating’ demo.
library(ggplot2)
ggplot(fish_data_cat, aes(x = scalelength, fill = length_cat)) +
  geom_histogram()
- Open the following links in a browser and zoom in to make the images fill the screen.
Introduction
Motivation
- Who has a directory on their computer with a bunch of filenames
- Get rid of messy folders and track changes to things like data files and code in a more manageable way.
Benefits of version control
- Track changes (but better)
- Tracks every change ever made in groups called commits
- Every commit stores the full state of all of your files at that time
- Never lose anything
- Revert or restore to any commit
- Easily unbreak your code/data/manuscript
- No more file name changes
- Collaboration
- Work on things simultaneously
- See what changes others have made
- Everyone has the most recent version of everything
Version control using Git & RStudio
Create a Git repo
- Navigate to GitHub in a web browser and log in.
- Click the + at the upper right corner of the page and choose New repository.
- Choose the class organization (e.g., fish497-2018) as the Owner of the repo.
- Fill in a Repository name that follows the form FirstnameLastname.
- Select Initialize this repository with a README.
- Click Create Repository.
Connect to the Git repo in RStudio
- From the new GitHub repository, click the green Clone or download button, then click the Copy to clipboard button.
- In RStudio, File -> New Project -> Version Control -> Git
- Paste the copied URL in Repository URL:.
- Leave Project directory name: blank; it is automatically given the repo name.
- Choose where to Create project as subdirectory of:.
- Click Create Project.
- Check to make sure you have a Git tab in the upper right window.
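For reference, the New Project -> Version Control -> Git dialog does roughly what git clone does on the command line. The following is a minimal sketch, not the lesson's required workflow. It assumes a local bare repository (origin.git in a temporary directory) as a stand-in for the GitHub repo so that it runs offline.

```shell
# Stand-in for the GitHub repo: a bare repository in a temporary directory (assumption).
work=$(mktemp -d)
git init --bare --quiet "$work/origin.git"

# Clone it. The second argument is the project directory to create.
# Cloning an empty repository prints a warning, which is expected here.
git clone --quiet "$work/origin.git" "$work/FirstnameLastname"

# The clone remembers where it came from under the remote name "origin".
git -C "$work/FirstnameLastname" remote -v
```

With a real repository, the first argument to git clone would be the URL copied from the Clone or download button.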
First commits
Commit data
- Download the data file Gaeta_etal_CLC_data.csv to your project directory.
- Git -> Select Gaeta_etal_CLC_data.csv.
- Commit with message.
Add fish size and growth rate data
- History:
- One commit
- Changes too large to see
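The stage, commit, and History steps above have approximate command-line equivalents, which can be run from the RStudio Terminal tab. This is a minimal sketch under stated assumptions: it uses a throwaway repository, a placeholder identity, and a stand-in file with placeholder contents rather than the real data.

```shell
# Throwaway repository so the sketch runs anywhere (assumption: not the class repo).
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Example Student"        # placeholder identity
git config user.email "student@example.com"   # placeholder identity

# Stand-in for the downloaded data file (placeholder contents, not the real data).
printf 'placeholder\n' > Gaeta_etal_CLC_data.csv

# Checking the box in the Git tab stages the file ...
git add Gaeta_etal_CLC_data.csv
# ... and the Commit button records it with a message.
git commit --quiet -m "Add fish size and growth rate data"

# The History view corresponds to the log, which lists one commit so far.
git log --oneline
```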
Commit R script
- Read in data to new R script.
fish_data = read.csv("Gaeta_etal_CLC_data.csv")
- Save as fish-analysis.R.
- Git -> Select fish-analysis.R.
- Changes in staged files will be included in next commit.
- Can also see changes by selecting Diff
- Commit with message.
Start script comparing fish length and scale size
- History:
- Two commits
- See what changes were made to fish-analysis.R
Building a history
- fish-analysis.R doesn’t currently show on the Git tab
- No saved changes since last commit
- Add some more code to fish-analysis.R
- Create new categorical size column
library(dplyr)
fish_data_cat = fish_data %>%
  mutate(length_cat = ifelse(length > 200, "big", "small"))
- Save fish-analysis.R.
- Now we see the file on the Git tab.
- M indicates that it’s been modified.
- To commit these changes, we need to stage the file.
- Check the box next to fish-analysis.R.
- Commit with message.
Add categorical fish length column
- History:
- Three commits
- Each fish-analysis.R commit shows the additions we made in that commit.
- Modify this code in fish-analysis.R
- Change category cut-off size
fish_data_cat = fish_data %>%
  mutate(length_cat = ifelse(length > 300, "big", "small"))
- Save file -> stage -> commit
Change size cutoff for new column
- Green sections for added lines, red for deleted
- Git works line by line.
- The previous version of the line is shown as deleted.
- The new version of the line is shown as added.
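The line-by-line behaviour described above can also be seen with git diff. This is a minimal sketch under assumptions: a throwaway repository, a placeholder identity, and a two-line stand-in for fish-analysis.R. Only the line with the cut-off is changed.

```shell
# Throwaway repository with a placeholder identity (assumptions for the sketch).
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Example Student"
git config user.email "student@example.com"

# Two-line stand-in for the script, committed with the original cut-off.
cat > fish-analysis.R <<'EOF'
fish_data_cat = fish_data %>%
  mutate(length_cat = ifelse(length > 200, "big", "small"))
EOF
git add fish-analysis.R
git commit --quiet -m "Add categorical fish length column"

# Change the cut-off from 200 to 300 on the second line only.
cat > fish-analysis.R <<'EOF'
fish_data_cat = fish_data %>%
  mutate(length_cat = ifelse(length > 300, "big", "small"))
EOF

# The old version of the changed line is prefixed with "-" (red in RStudio),
# the new version with "+" (green); the unchanged first line is shown as context.
git diff fish-analysis.R
```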
Do Exercise 2 - First Solo Commit, Exercise 3 - Second Solo Commit, and Exercise 4 - Commit Multiple Files.
The instructor should also do the exercises.
Git as a time machine
Experiment with impunity
fish_data_cat = fish_data %>%
  mutate(length_cat = ifelse(length > 300, "large", "small"))
- Save and show changes are staged
- More -> Revert -> Yes
- Get previous state of a file
- History -> select commit -> View file @ ...
Delete with impunity
- Close the upper left window showing fish-analysis.R.
- Choose the Files tab in the lower right window.
- Select fish-analysis.R -> Delete -> Yes
- Stage deleted file -> More -> Revert -> Yes
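The two "impunity" demos above roughly correspond to restoring a file from the last commit on the command line. This is a minimal sketch under assumptions: a throwaway repository, a placeholder identity, and a one-line stand-in script. It uses git checkout HEAD -- <file>, which restores both the staged and the working copy of the file; RStudio's Revert may not run this exact command internally.

```shell
# Throwaway repository with a placeholder identity (assumptions for the sketch).
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Example Student"
git config user.email "student@example.com"

# Commit a one-line stand-in for the script.
printf 'fish_data = read.csv("Gaeta_etal_CLC_data.csv")\n' > fish-analysis.R
git add fish-analysis.R
git commit --quiet -m "Start script comparing fish length and scale size"

# Experiment: overwrite and stage the script, then discard the change.
printf '# an experiment that did not work out\n' > fish-analysis.R
git add fish-analysis.R
git checkout HEAD -- fish-analysis.R

# Delete the file, stage the deletion, then bring it back the same way.
git rm --quiet fish-analysis.R
git checkout HEAD -- fish-analysis.R

# Print the file as it was in a given commit (here the most recent one, HEAD).
git show HEAD:fish-analysis.R
```

Replacing HEAD with a commit ID from git log shows or restores the file as of that earlier commit.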
GitHub Remotes
- So far we’ve worked with a local Git repository.
- One of the big benefits of version control is easy collaboration.
- To do this, we synchronize our local changes with a remote repository called origin.
- Our remote repository is on GitHub.
- By far the most popular hosted version control site
- Public and private hosted repositories
- Private free for students and academics
- https://education.github.com/
Push to a remote
- Push sends your recent commits to the origin remote.
- Before a Push your commits show in your local history but not on the remote.
Show local commit history and lack of history in remote.
- To Push to your remote, select the Push button at the top of the Git tab.
- Now your changes and commit history are also stored on the remote.
Show local commits now on origin.
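The before-and-after state of origin described above can be sketched on the command line. This is a minimal illustration under assumptions: a local bare repository stands in for GitHub, and the identity and file contents are placeholders.

```shell
# Local bare repository as a stand-in for the GitHub remote (assumption).
work=$(mktemp -d)
git init --bare --quiet "$work/origin.git"
git clone --quiet "$work/origin.git" "$work/local"
cd "$work/local"
git config user.name "Example Student"        # placeholder identity
git config user.email "student@example.com"   # placeholder identity

# Make one local commit (placeholder file contents, not the real data).
printf 'placeholder\n' > Gaeta_etal_CLC_data.csv
git add Gaeta_etal_CLC_data.csv
git commit --quiet -m "Add fish size and growth rate data"

# Before pushing: the commit is in the local log, but origin lists no branches.
git log --oneline
git ls-remote --heads origin

# Push the current branch to origin and remember it as the upstream branch.
git push --quiet -u origin HEAD

# After pushing: origin lists a branch pointing at the same commit.
git ls-remote --heads origin
```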
Have students Slack a link to their repo once they have finished Pushing Changes.
The instructor should then commit the following code to their repo with the commit message:
Plot histogram of scale length by fish categorical size
library(ggplot2)
ggplot(fish_data_cat, aes(x = scalelength, fill = length_cat)) +
  geom_histogram()
Collaborating
- Big advantage to remotes is easy collaboration
- Avoids emailing files and shared folders where you are never sure if you actually have the most recent version
- Makes it easy to see what collaborators have done
- Automatically combines non-overlapping changes
- While I’ve been talking, a collaborator has added a plot of scale size and fish length to the code.
Show origin with collaborator commit.
Add collaborator local repo to diagram and pull arrow from origin to locals.
- Pull the changes from the remote repo with the Pull button on the Git tab
Show updates to history following Pull and run code
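The collaborator demo can be rehearsed offline with two clones of one remote. This is a minimal sketch under assumptions: a local bare repository stands in for GitHub, the directory names you and collaborator are hypothetical, and the identities are placeholders.

```shell
# Local bare repository as a stand-in for the GitHub remote (assumption).
work=$(mktemp -d)
git init --bare --quiet "$work/origin.git"

# Your copy: commit the start of the script and push it.
git clone --quiet "$work/origin.git" "$work/you"
cd "$work/you"
git config user.name "Example Student"
git config user.email "student@example.com"
printf 'fish_data = read.csv("Gaeta_etal_CLC_data.csv")\n' > fish-analysis.R
git add fish-analysis.R
git commit --quiet -m "Start script comparing fish length and scale size"
git push --quiet -u origin HEAD

# The collaborator's copy: add the plotting code and push it.
git clone --quiet "$work/origin.git" "$work/collaborator"
cd "$work/collaborator"
git config user.name "Example Collaborator"
git config user.email "collaborator@example.com"
cat >> fish-analysis.R <<'EOF'
ggplot(fish_data_cat, aes(x = scalelength, fill = length_cat)) +
  geom_histogram()
EOF
git commit --quiet -am "Plot histogram of scale length by fish categorical size"
git push --quiet origin HEAD

# Back in your copy: Pull fetches the collaborator's commit and updates your files.
cd "$work/you"
git pull --quiet
git log --oneline
```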
Do Tasks 3-6 in Exercise 6 - Pulling and Pushing.
- Collaborating on GitHub can get more complex with “forks” and “branches”.
Optional: Redraw diagram with local, origin, and upstream. Arrows from origin to/from upstream are pull requests and merges.
Show an example of a working repository with branches and forks. Navigate to pull requests.