19 June 2013

Github supports publishing web pages and blog posts with the Jekyll rendering engine: you simply create a repository named GitHubUserName.github.io in your account. Github also supports organizations, which bring together git repositories, groups of users, and unified management of the two. Each repository in an organization can have its own web pages at a URL like http://OrganizationName.github.io/ProjectName. I'll describe how I did this for CAEN's Github projects.

1 Github Pages

Creating web pages for projects (or repositories) within a Github organization is documented in many places on the web (you can Google for it), but this is how I created web pages processed by Jekyll for CAEN's Github Organization.

The general steps to create web pages for CAEN's repos, like http://caen.github.io/hadoop, are:

  1. Create a special branch of the repository called gh-pages that will hold all of the web content and none of the actual project content
  2. Add and configure the Jekyll files in this new branch
  3. Add some content, either plain old web pages or blog posts or both

2 Making a gh-pages branch

The first step is to create an orphan branch in your Git repository and remove all of your content from it. The steps to do this are:

  1. First create the orphan branch called gh-pages
git checkout --orphan gh-pages

This will create the branch and switch to it. You can type git status to make sure you are in the newly created branch.

  2. Remove everything from the gh-pages branch in preparation for adding your web content.
git rm -rf .
  3. Commit the now-empty branch. Nothing is staged after the removal, so a plain git commit has nothing to record; use git commit --allow-empty so that the branch actually gets its first commit. You can then confirm that your project content still exists by switching back to the master branch (git checkout master) and typing ls. After you've satisfied yourself that your git rm didn't delete your work, switch back to the gh-pages branch (git checkout gh-pages).
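
The whole sequence can be tried safely in a throwaway repository before touching a real project. The sketch below is only an illustration: the temporary directory, the README.txt file, the demo identity, and the commit messages are all made up, and --allow-empty is used because nothing is staged once the branch has been emptied.

```shell
# Throwaway repository so that nothing real is at risk (hypothetical demo paths).
demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo User"          # placeholder identity for the demo commits
git config user.email "demo@example.com"

# Stand-in for the actual project content.
echo "project content" > README.txt
git add README.txt
git commit -q -m "Project content on master"

# Create the orphan branch and empty it.
git checkout -q --orphan gh-pages
git rm -rf -q .
git commit -q --allow-empty -m "Start empty gh-pages branch"

# Record what each branch contains and which branch is checked out.
pages_files=$(git ls-tree -r --name-only gh-pages)
master_files=$(git ls-tree -r --name-only master)
current_branch=$(git rev-parse --abbrev-ref HEAD)
echo "gh-pages: [$pages_files]  master: [$master_files]  on: $current_branch"
```

Listing each branch with git ls-tree gives the same reassurance as switching to master and typing ls, without leaving the gh-pages branch.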

3 Adding Jekyll to your branch

3.1 Get the Jekyll files

Now that you have an empty directory, you can add the default Jekyll files to it. The following example:

  1. Clones the Jekyll Bootstrap code into the gh-pages branch
  2. Moves all of the Jekyll files from the Jekyll directory to the top level directory of the gh-pages branch of your project repository
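  3. Removes the leftover jekyll-bootstrap directory (it still holds hidden files, such as the clone's own .git directory, which the * pattern does not match)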
  4. Adds all of the Jekyll files to this branch of your Git repository
$ git clone https://github.com/plusjade/jekyll-bootstrap.git
Cloning into 'jekyll-bootstrap'...
remote: Counting objects: 1813, done.
remote: Compressing objects: 100% (940/940), done.
remote: Total 1813 (delta 855), reused 1674 (delta 760)
Receiving objects: 100% (1813/1813), 524.41 KiB | 0 bytes/s, done.
Resolving deltas: 100% (855/855), done.
$ mv jekyll-bootstrap/* .
$ \rm -rf jekyll-bootstrap/
$ git add *
$ git commit -m "Adding Jekyll files to gh-pages branch"
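
One caveat with the commands above: in most shells, * does not match names that begin with a dot, so hidden files (for example a .gitignore, if the clone contains one) stay behind in jekyll-bootstrap/ and are deleted along with it; git add * skips hidden files for the same reason. The sketch below shows the behavior with a made-up directory (fake-bootstrap and its files are invented for the demo, not taken from Jekyll Bootstrap) and one way to pick up a hidden file by name.

```shell
# Demo in a scratch directory; fake-bootstrap is a hypothetical stand-in for the clone.
work=$(mktemp -d)
cd "$work"
mkdir fake-bootstrap
touch fake-bootstrap/index.md fake-bootstrap/_config.yml fake-bootstrap/.gitignore

# The glob moves the visible files only.
mv fake-bootstrap/* .
left_behind=$(ls -A fake-bootstrap)
echo "left behind: $left_behind"

# Move the hidden file by name before removing the directory.
mv fake-bootstrap/.gitignore .
rm -rf fake-bootstrap
ls -A
```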

3.2 Configuring Jekyll

The _config.yml settings for project pages within an organization differ from those for a user's own Jekyll site. We'll use CAEN's hadoop Github project as our example.

The first set of edits to _config.yml is advisable for any Jekyll configuration; it sets the title, tagline, author, and email fields used by the themes.

# Themes are encouraged to use these universal variables
# so be sure to set them if your theme uses them.
title : CAEN Hadoop
tagline: Big Data, little data by little data
author :
  name : CAEN
  email : hadoop-support@umich.edu

The next edit is to set the production_url variable by following the instructions in the _config.yml file:

# Finally if you are pushing to a GitHub project page, include the project name at the end.
#
production_url : http://caen.github.io/hadoop

Continuing to follow the instructions in the _config.yml file, the BASE_PATH is set:

#   A GitHub Project site exists in the `gh-pages` branch of one of your repositories.
#  REQUIRED! Set BASE_PATH to: http://username.github.io/project-name
BASE_PATH : http://caen.github.io/hadoop

I like to turn off comments, analytics, and sharing at the start, only turning them back on when their supporting infrastructure is prepared:

comments :
  provider : false
analytics :
  provider : false
sharing :
  provider : false

After all of the edits are made, you should commit the changes to _config.yml with

git commit _config.yml -m "local edits to _config.yml"
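
Before committing, it can help to confirm that production_url and BASE_PATH hold the URL you expect. The following is a rough sketch, not part of Jekyll Bootstrap: it writes a small sample file (sample_config.yml, a made-up name) with the values from this article and compares the two settings using sed. Pointing cfg at the real _config.yml should work the same way, assuming each key appears once on an uncommented line.

```shell
# Sample file with the values used in this article (sample_config.yml is a made-up name).
cfg=$(mktemp -d)/sample_config.yml
cat > "$cfg" <<'EOF'
# REQUIRED! Set BASE_PATH to: http://username.github.io/project-name
production_url : http://caen.github.io/hadoop
BASE_PATH : http://caen.github.io/hadoop
EOF

# Print the value after "key :" for lines that begin with the key
# (optionally indented); comment lines start with "#" and are skipped.
get_value() {
  sed -n "s/^[[:space:]]*$1[[:space:]]*:[[:space:]]*//p" "$cfg"
}

production_url=$(get_value production_url)
base_path=$(get_value BASE_PATH)

if [ "$production_url" = "$base_path" ]; then
  result="match"
else
  result="differ"
fi
echo "production_url=$production_url BASE_PATH=$base_path ($result)"
```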

4 Creating and publishing the project pages

Once you have the configuration set, you should remove the sample files that come with Jekyll:

git mv index.md index.md-orig
git rm _posts/core-samples/2011-12-29-jekyll-introduction.md
git commit -m "removed sample Jekyll files"

4.1 HTML Pages

Then you can simply add Jekyll+HTML files to the top level of the gh-pages branch; those files have the format:

---
layout: page
title: CAEN Hadoop
tagline: <br>Big Data, little data by little data
---
<dl>
  <dt><a href="hadoop-user.html">Hadoop User Documentation</a></dt>
    <dd>This is the documentation for people to use Hadoop and its
    friends like Hive, Pig, Sqoop, etc.</dd>
</dl>

The HTML following the Jekyll header (between the dashed lines) is all of the HTML that would be found between the <body> tags.
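
Jekyll only treats the header as front matter when it starts on the very first line of the file, with the dashes at the left margin. As a sketch, a page like the one above can be written from the shell with a here-document and then checked; the scratch directory and the file name index.html are just choices for this example.

```shell
# Write a minimal Jekyll page; site_dir is a scratch directory for the demo.
site_dir=$(mktemp -d)
cat > "$site_dir/index.html" <<'EOF'
---
layout: page
title: CAEN Hadoop
tagline: <br>Big Data, little data by little data
---
<dl>
  <dt><a href="hadoop-user.html">Hadoop User Documentation</a></dt>
    <dd>This is the documentation for people to use Hadoop and its
    friends like Hive, Pig, Sqoop, etc.</dd>
</dl>
EOF

# The front matter must open on line 1 and be closed by a second dashed line.
first_line=$(head -n 1 "$site_dir/index.html")
dash_lines=$(grep -c '^---$' "$site_dir/index.html")
echo "first line: $first_line, dashed lines: $dash_lines"
```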

4.2 Blog Posts

Creating blog posts is described in Blogging with Emacs org-mode and Github Pages, and the same process applies here. You'll also want to look at the original index.md file (renamed to index.md-orig above) to see the Jekyll code that automatically lists blog posts from the _posts directory, and copy that code into your index.html file.

4.3 Pushing pages to Github

Once you have your HTML pages and blog posts created, add them to the repository with git add *.html, commit them with git commit, and push them to Github. The first time you push, you have to name the new branch with git push origin gh-pages; after that you can push with a simple git push. (With current Git defaults, adding -u to that first push sets the upstream that a plain git push relies on.)

After giving Github a few minutes to process the Jekyll into pages, you can visit your pages at http://YourOrganization.github.io/YourRepo; in the case of the CAEN Hadoop pages, this is http://caen.github.io/hadoop.
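
The push sequence can be rehearsed without touching Github by using a local bare repository as the remote. This is only a sketch under that assumption: fake-github.git, the demo identity, and the page content are invented for the example. It uses -u on the first push, which records origin/gh-pages as the upstream so that later plain git push calls know where to go.

```shell
# A local bare repository stands in for Github (hypothetical paths).
root=$(mktemp -d)
git init -q --bare "$root/fake-github.git"

git init -q "$root/work"
cd "$root/work"
git symbolic-ref HEAD refs/heads/gh-pages   # start directly on gh-pages for brevity
git config user.name "Demo User"            # placeholder identity for the demo commits
git config user.email "demo@example.com"
git remote add origin "$root/fake-github.git"

# Add a page and commit it.
echo "<p>Hello</p>" > index.html
git add index.html
git commit -q -m "Add first page"

# First push: name the branch and set its upstream.
git push -q -u origin gh-pages

# Later pushes can be a plain git push.
echo "<p>Hello again</p>" > index.html
git commit -q -a -m "Update page"
git push -q

# Compare the branch tip on both sides.
local_head=$(git rev-parse gh-pages)
remote_head=$(git --git-dir="$root/fake-github.git" rev-parse gh-pages)
echo "local $local_head, remote $remote_head"
```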

5 Working with Projects that already have Github pages

If your organization (in my case, CAEN) has projects that already have Github pages set up, you simply clone the project and then track the gh-pages branch, which lets you see it, push it back to regenerate the web content, and so on.

First, identify a project with a gh-pages branch by either asking the owner or looking through the project's branches on github.com. Once you find one, these steps will get you the web content and Jekyll configuration:

git clone https://github.com/YourOrg/YourProj.git   # clone the base project
cd YourProj                                         # go into the project directory
git checkout --track origin/gh-pages                # track the gh-pages branch

Now you should see the HTML files in the top directory of the gh-pages branch and the posts in the _posts directory. To switch back to the project, check out the master branch with git checkout master.
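
To see what tracking does without a network connection, the same steps can be run against a local repository. In this sketch, upstream-proj stands in for the Github project, and every path, file, identity, and commit message is invented for the demo.

```shell
# Build a small stand-in for the Github project with both branches (hypothetical content).
root=$(mktemp -d)
git init -q "$root/upstream-proj"
cd "$root/upstream-proj"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "code" > main.txt
git add main.txt
git commit -q -m "Project content"
git checkout -q --orphan gh-pages
git rm -rf -q .
echo "<p>Docs</p>" > index.html
git add index.html
git commit -q -m "Web content"
git checkout -q master

# The steps from this section, pointed at the local stand-in.
cd "$root"
git clone -q "$root/upstream-proj" YourProj
cd YourProj
git checkout -q --track origin/gh-pages

# Report the current branch, its upstream, and the files now visible.
branch=$(git rev-parse --abbrev-ref HEAD)
upstream=$(git rev-parse --abbrev-ref 'gh-pages@{upstream}')
files=$(ls)
echo "on $branch tracking $upstream with files: $files"
```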