Forks and Pull Requests in GitHub - ProfHacker - The Chronicle of Higher Education

ProfHacker 2013-03-26

Today we’ll continue our series of postings on GitHub. In the first posting I introduced GitHub, pointed to some of the previous postings here at ProfHacker that have talked about it, and went through the steps of setting up a basic repository. Last week, we looked at the most common workflow for working with GitHub as a version control platform for text, and showed how you could directly edit text files through the GitHub website, instead of in your offline copy of a repository.

From what we have seen so far, GitHub is a place to sync a repository of texts and publicly share these text files for free, and it provides a complete solution for version control of text-based projects. I mentioned in a previous posting that private repositories require a paid account, but the education liaison at GitHub, John Britton, recently pointed out to me that it is possible for students and educators to get a special educational account that bypasses some of the restrictions of the free account.

These features alone already make GitHub interesting, but would not suffice for me to boldly predict that its innovations may have a significant impact on collaborative writing in the future. As I mentioned in the first posting of this series, GitHub allows a form of collaboration without collaborating. If Facebook and Twitter are social networks based upon mutual or asymmetrical relationships between users (“those you went to school with” and “those you wish you went to school with” as the two networks are occasionally described), then GitHub is a social network which allows the creation of relationships between texts through a process of replication. While users on GitHub can “follow” each other, as you would on Twitter, you can also “star” projects that you are interested in or “watch” their progress over time. Any public repository can also be very easily “cloned,” which downloads the project to your computer, complete with the hidden “.git” directory that contains its full history.

Forking

The real “social” move on GitHub is not simple cloning. It begins with the “fork.” Unlike a simple clone, a “fork” is a special form of replication on GitHub that retains a connection to its originating repository. If there is a project you like, such as a syllabus, an article, some code, the German Bundesgesetze, or official datasets from the city of Chicago, and you want to adopt, modify, and then share any changes you make to that repository, all you have to do is press the “fork” button on the top right of the repository’s main page. It will then be copied to your own list of repositories using a different icon to indicate that it is a fork. A link back to the original is shown whenever you navigate to your fork. Unlike giving a “star” to a project, which is similar to a Facebook “like” button, when you fork a GitHub project, you are making a somewhat different statement. You are not just saying, “I could use what you wrote,” since a clone is sufficient for that purpose. You are saying, “I could use your text and want to improve it.”

As with a downloaded clone, a forked repository lives independently of the project it was replicated from. If I fork, for example, some Zotero Workshop materials by Zotero wizard Sebastian K., that fork would live on in my GitHub account, along with a history of all its edits, even if Sebastian decides to take his repositories down. It is thus important to remember that when you put something on GitHub or any open git repository, you are not just releasing a text into the wild, but its entire genealogy. This may be unnerving to many, but to others it is a powerful embrace of transparency. Of course, the genealogy exists only at the level of the “commit,” and if you continually rework a new paragraph before “committing” a complete version to a draft in the repository, the history will only show the final addition of the paragraph, not the struggles of its composition. If your first commit of a new text document is a complete first draft, no record of the many revisions you made up to that point will be saved, but future drafts committed to the repository will reflect aggregate changes.
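To make the point about commit-level genealogy concrete, here is a minimal sketch in Python. It is a toy model, not how git itself stores anything: the DraftHistory class and its method names are my own invention for illustration, and the comparison uses Python's standard difflib module rather than git's diff.

```python
import difflib


class DraftHistory:
    """Toy model of a repository that records only committed snapshots."""

    def __init__(self):
        self.commits = []  # (message, text) tuples, oldest first

    def commit(self, message, text):
        """Save the text as it stands right now, with a short message."""
        self.commits.append((message, text))

    def changes(self, older, newer):
        """Return the lines added or removed between two commits (by index)."""
        old_lines = self.commits[older][1].splitlines()
        new_lines = self.commits[newer][1].splitlines()
        return [
            line
            for line in difflib.ndiff(old_lines, new_lines)
            if line.startswith(("+ ", "- "))
        ]


history = DraftHistory()
history.commit("First draft", "Introduction.")

# Three rounds of reworking happen in the editor, but none is committed.
working_copy = "Introduction.\nA clumsy new paragraph."
working_copy = "Introduction.\nA better new paragraph."
working_copy = "Introduction.\nThe final new paragraph."

history.commit("Add second paragraph", working_copy)

changes = history.changes(0, 1)
print(changes)
```

Only the final wording of the paragraph shows up in the comparison between the two commits. The clumsy and better versions were never committed, so the history has no trace of them.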

The Pull Request

Once you have your fork, you can use the GitHub client on your own computer to download it, modify it, and commit back to your account. You can also happily edit it directly online as described in my last posting. There is no need for you to ever interact with the original project, except to abide by any attribution or other license requirements of the original text. For example, let us say Lincoln’s GitHub syllabus for a course on “Nineteenth-Century American Religion: A Digital History Seminar” caught my eye. I could fork his syllabus and drastically change his course policies, keep half of the readings he selected that I liked, and then proceed to build my own course on a related topic but perhaps with a different scope. In the future, when others go looking for some good starter syllabi to build up their own course, they then have the option of starting off by forking Lincoln’s original, or forking my fork, depending on what they thought fit their needs better.
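For readers who prefer the command line to the GitHub client, the download, modify, and commit cycle corresponds roughly to the git commands below. This is a sketch under assumptions: the account name, repository name, file name, and commit message are placeholders I made up, and the contribution_plan function is only a convenient way to list the commands. The script prints the commands without running them; you would run the first, edit the file, and then run the rest.

```python
from typing import List


def contribution_plan(user: str, repo: str, filename: str, message: str) -> List[List[str]]:
    """List the git commands for one clone, edit, commit, push cycle on a fork."""
    url = f"https://github.com/{user}/{repo}.git"
    return [
        ["git", "clone", url],                         # download the fork with its full history
        ["git", "-C", repo, "add", filename],          # stage the file you edited
        ["git", "-C", repo, "commit", "-m", message],  # record the change locally
        ["git", "-C", repo, "push"],                   # send the commit back to your account
    ]


# All four arguments are hypothetical placeholders.
plan = contribution_plan("your-username", "syllabus", "README.md", "Revise course policies")

for command in plan:
    print(" ".join(command))
```

The -C option tells git to run inside the named directory, which here is the folder that the clone step creates.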

As this took place, Lincoln would note in his GitHub account that his “fork” count for the syllabus repository went up, and he could visit my repository to see what kinds of interesting things I have been doing with it, but we wouldn’t necessarily ever come into direct communication…

…which would be a shame, since the real power of GitHub happens when forks talk back. The way they talk back is the “pull request.” The pull request is the tool that allows a complete outsider to contribute to your work. More surgical than a suggestion sent to you in the form of a comment on a blog entry or an email, a pull request arrives in your GitHub account as a very detailed request to “merge” a proposed change from someone into your original text. The “commit message” is usually where the user who made the request describes and justifies the proposed change. If you like the change, it takes just a simple click to merge it into your original, not unlike accepting a proposed edit in Word’s track changes. To make your own pull request, fork a project, make a change, commit the change, then return to the main page of your fork and press the “pull request” button to propose your change back to the original author. An even quicker way is to “edit” a document directly on a user’s GitHub project page. When you save the changes, the project will be forked if you have not already done so, and you will immediately be presented with an option to make a pull request.
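The relationship between an original, a fork, and a pull request can be sketched as a small data model. This is an illustration of the idea only, not GitHub's API or git's internals: the Repo and PullRequest classes, their fields, and the owner names are all hypothetical, and a commit is reduced to its message.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Repo:
    owner: str
    name: str
    commits: List[str] = field(default_factory=list)  # commit messages, oldest first
    parent: Optional["Repo"] = None  # set only on forks: the link back to the original
    fork_count: int = 0

    def fork(self, new_owner: str) -> "Repo":
        """Copy the full history and keep a connection to the original."""
        self.fork_count += 1
        return Repo(new_owner, self.name, list(self.commits), parent=self)

    def commit(self, message: str) -> None:
        self.commits.append(message)


@dataclass
class PullRequest:
    source: Repo
    target: Repo
    description: str

    def proposed_commits(self) -> List[str]:
        """Commits present in the fork but not yet in the original."""
        return [c for c in self.source.commits if c not in self.target.commits]

    def merge(self) -> None:
        """The one-click accept: add the proposed commits to the original."""
        self.target.commits.extend(self.proposed_commits())


original = Repo("author", "syllabus", ["Initial syllabus"])
my_fork = original.fork("reader")
my_fork.commit("Fix typo in week 3 reading")

request = PullRequest(my_fork, original, "Corrects a misspelled title")
proposed = request.proposed_commits()
request.merge()

print(original.fork_count, proposed, original.commits)
```

The fork carries the whole history and remembers where it came from, which is what lets the pull request work out exactly which changes the original is missing.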

In the programming world, the word “fork” was once a sad word, because it described that often tragic moment when an open source software project split into factions. Each faction would take its code and march in a different direction, splitting the collaborative labor of a community of coders. Some code would get incorporated back into each fork but it was not usually a smooth process. With GitHub, the idea of a “fork” has a very different feel. It is a kind of currency of legitimacy. Stars are great, but when you see your project is getting forked, it is a sign that your text is alive and potentially evolving out there, in the crowd. These forks may never call home with proposed pull requests to improve your original, but they still represent a pool of potential contributors to your own project or, if you are losing interest in your creation, potential successors who will carry on the flame. If they do go in a different direction and become more popular than your original, you can still take pride in having started the ball rolling, then either continue to develop and improve your own version or decide to contribute with your own pull requests. Whatever you decide, your contributions, every line of text you composed or changed, are there, preserved (unless more advanced git commands are used to change the record of the past) in the history of the repository.

In my next posting I’ll talk about some resources related to git and GitHub, then follow with a posting on the limits of GitHub for writing and scholarship in the humanities.

Have you ever forked and issued pull requests to a GitHub project? Also, as mentioned in the last posting, please feel free to mention things you would like to see covered in future postings of this series.

Image “Collabocats” in the Octodex collection by GitHub under terms here.