September 2020

Creating a blog in Node.js using Git to store our blog posts

See the full code - Programmer skill required: Basic git knowledge

Welcome to my very first blog post! Today I'd like to share step by step how I created this very blog from scratch, and how you can too using Node.js and Git.

Requirements

I've wanted a blog for a while, and I had basic considerations for this blogging software:

  1. It could take either Markdown or HTML

  2. I can create and edit posts from any machine (that I trust) without SSH

  3. Publishing posts is a one-step action with no server downtime (basic Continuous Deployment)

  4. It can be customizable (I'm a programmer!)

I thought, why not just store my blog posts in a public git repo? I would still need a server to pull and serve my latest blog posts from this remote git server.

Great! Let's get to it.

Node.js

I like starting a web service as basic as possible. First, let's start with a Node server that can respond to requests and turn Markdown into HTML. Name this file app.js.

const markdown = require('markdown-it')();
const express = require('express');
const app = express();
const port = 3000;

// Placeholder Markdown to render until the posts come from Git
const contents = '# Hello world';

app.get('/', (req, res) => {
  const html = markdown.render(contents);
  res.send("<html><body>" + html + "</body></html>");
});

app.listen(port, () => console.log(`blog running on http://localhost:${port}`))

The first two lines tell us we're using two libraries, markdown-it and express, so don't forget to run these commands in your terminal to install them and save them to our package.json file:

> npm install --save markdown-it
> npm install --save express

If you don't yet have Node.js and npm installed, the above commands will fail. You can download both from the official website, www.nodejs.org.

More modern Node backend frameworks exist, like koa or sails, but for this example we'll stick to the battle-tested express framework.

Now, let's run our server. In the terminal, type:

> node app.js

Navigating to localhost:3000 in the browser should reveal our marked-down hello world page.

[Screenshot: markdown hello world page]

Awesome! If we were content with simplicity we could maintain our blog by expanding this line: const html = markdown.render(contents);

It would work okay, but we'd be missing basics like styling, our code wouldn't be modular, and publishing a new blog post would require a full server restart to take effect. If we deployed our server on a cloud computing service, it would take quite a bit of effort to change the file on the server and restart it manually.

Git

Isomorphic-git is a powerful git library for Node.js that supports async/await, which we'll explore shortly. Let's go ahead and install it into our project.

> npm install --save isomorphic-git

And in our app.js file, add this starting on line 5 next to our other dependencies:

const git      = require('isomorphic-git');
const http     = require('isomorphic-git/http/node');
const fs       = require('fs');

Next, these helper functions can also be added to our app.js:

async function clone() {
  const remote = 'https://github.com/siriusastrebe/blog.git';
  const dir = './blog';
  console.log(`Git - Clone ${remote} to ${dir}`);

  await git.clone({
    fs,
    http,
    dir: dir,
    url: remote
  });
}

async function pull() {
  const dir = './blog';
  console.log(`Git - Pull ${dir}`);

  await git.pull({
    fs,
    http,
    dir: dir,
    ref: 'master',
    author: {name: 'siriusastrebe', email: ''},
  });
}

These two functions will clone and pull our remote repository. Eventually you will want to change the remote repository const remote = 'https://github.com/siriusastrebe/blog.git'; to your own repo. But for now we can keep it as a pre-populated blog repo. You can take a look at its contents just by copy/pasting https://github.com/siriusastrebe/blog.git into your browser window.

Note that the function declarations say async function. This tells Node.js that these functions will be waiting on long tasks, like network requests. You can see exactly which lines they wait on: they're marked with an await. Node.js lets other pending code execute while it waits on these long tasks. It also means that each function returns a Promise. If that sounds confusing, don't worry; promises are easy to work with inside async functions.
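To make that concrete, here's a minimal sketch using only built-in JavaScript. The slowTask() and shout() helpers are hypothetical stand-ins for a network request and are not part of the blog's code.

```javascript
// Hypothetical stand-in for a slow task such as a network request.
function slowTask(label, ms) {
  return new Promise((resolve) => setTimeout(() => resolve(label), ms));
}

// An async function always returns a Promise, even though the body
// reads like ordinary top-to-bottom code.
async function shout(label) {
  const value = await slowTask(label, 10); // waits here without blocking Node.js
  return value.toUpperCase();
}

const pending = shout('hello');
console.log(pending instanceof Promise); // true

// The resolved value is available through .then() or another await.
pending.then((text) => console.log(text)); // HELLO
```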

So far these functions are never called. When our server starts, we want to make sure our blog git repository is cloned into our directory. Keeping it up to date is nice as well, so let's pull the latest contents of our git repository on each request.

Remove these lines from our app.js:

- app.get('/', (req, res) => {
-   const html = markdown.render(contents);
-   res.send("<html><body>" + html + "</body></html>");
- });
- 
- app.listen(port, () => console.log(`blog running on http://localhost:${port}`))

Replace it with these lines:

clone().then(pull).then(runServer);

function runServer() {
  app.get('/', async (req, res) => {
    await pull()

    const dir = await fs.promises.readdir('./blog/posts/');

    const promises = dir.map(async (d) => {
      return fs.promises.readFile('./blog/posts/' + d, 'utf8');
    });

    const contents = await Promise.all(promises);

    res.send(markdown.render(contents.join('\n\n')));
  });

  app.listen(port, () => console.log(`blog running on http://localhost:${port}`));
}

What's going on here? await pull() makes sure our git repository is up to date with the remote master branch. The next line calls readdir, which gives us an array of filenames inside the directory. The .map() function is a little trickier, but it just returns an array of promises, each one promising to readFile().

How do we wait for all of these promises to be fulfilled? With the next line, const contents = await Promise.all(promises); Each read starts as soon as readFile() is called; a promise doesn't wait for an await or a .then() before doing its work. Promise.all() combines them into a single promise that resolves once every read has succeeded, or rejects as soon as one of them fails.
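The readdir, map, and Promise.all() steps can be tried on their own with nothing but Node's standard library. This is only a sketch: readPosts() is a hypothetical helper name, and a temporary directory stands in for ./blog/posts/.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: read every file in a directory.
async function readPosts(dir) {
  // Sort the names so the result order does not depend on the filesystem.
  const names = (await fs.promises.readdir(dir)).sort();

  // Each readFile() call returns a promise for that file's contents.
  const reads = names.map((name) =>
    fs.promises.readFile(path.join(dir, name), 'utf8')
  );

  // Promise.all() resolves to an array of results in the same order as `reads`.
  return Promise.all(reads);
}

// Demo: a temporary directory stands in for ./blog/posts/.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'posts-'));
fs.writeFileSync(path.join(demoDir, 'a.md'), '# First');
fs.writeFileSync(path.join(demoDir, 'b.md'), '# Second');

const demoPosts = readPosts(demoDir);
demoPosts.then((contents) => console.log(contents)); // [ '# First', '# Second' ]
```

If any single read fails, the combined promise rejects, so a real handler would probably want a try/catch around the await.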

Alright, let's test it out. Restart the server using the command:

> node app.js

[Screenshot: markdown barely working git blog]

What do we see? This post! We're making great progress but we're not quite there yet.

Final Touches

Scrolling down reveals more posts, but they're sorted oldest to newest. We want it the other way around. We could also use some styling. Let's give it some by adding an HTMLFormat() function with this CSS, inspired by better...website.com.
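Reversing the order comes down to a descending sort on the filenames. Here's a small sketch of that idea; the filenames are made up, and it assumes post names start with a date so that alphabetical order matches chronological order.

```javascript
// Hypothetical filenames: a date prefix makes name order match date order.
const filenames = ['2020-08-first.md', '2020-10-third.html', '2020-09-second.md'];

// Copy before sorting so the original listing is left untouched,
// then compare in reverse so the newest (largest) name comes first.
const newestFirst = [...filenames].sort((a, b) => (a < b ? 1 : a > b ? -1 : 0));

console.log(newestFirst);
// [ '2020-10-third.html', '2020-09-second.md', '2020-08-first.md' ]
```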

Here's the full code in all its glory:

const markdown = require('markdown-it')();
const express  = require('express');
const app      = express();
const port     = 3000;
const git      = require('isomorphic-git');
const http     = require('isomorphic-git/http/node');
const fs       = require('fs');

let lastPull;

// Fresh repository pull
clone().then(pull).then(runServer);

function runServer() {
  app.get('/', async (req, res) => {
    // Not used yet; read from the query string (?page=N) for future pagination
    const page = Number(req.query.page) || 0;

    await poll()
    const dir = await fs.promises.readdir('./blog/posts/');

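    // Sort a copy of the listing in descending order so the newest post comes first
    const sorted = [...dir].sort((a, b) => a < b ? 1 : -1);

    const promises = sorted.map(async (d) => {
      return fs.promises.readFile('./blog/posts/' + d, 'utf8');
    });

    const contents = await Promise.all(promises);

    // Pass the sorted names so each post lines up with its own filename
    res.send(HTMLFormat(contents, sorted));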
  });

  app.listen(port, () => console.log(`blog running on http://localhost:${port}`));
}

// ----------------------------------------------------------------
// Helper functions
// ----------------------------------------------------------------
async function clone() {
  const remote = 'https://github.com/siriusastrebe/blog.git';
  const dir = './blog';
  console.log(`Git - Clone ${remote} to ${dir}`);

  await git.clone({
    fs,
    http,
    dir: dir,
    url: remote,
    singleBranch: true
  });
}

async function pull() {
  const dir = './blog';
  lastPull = new Date();
  console.log(`Git - Pull ${dir}`);

  await git.pull({
    fs,
    http,
    dir: dir,
    ref: 'master',
    author: {name: 'siriusastrebe', email: ''},
    singleBranch: true
  });
}

async function poll() {
  if (new Date() - lastPull > 60000) {
    return await pull();
  }
}

function HTMLFormat(posts, filenames) {
  // Keep the stylesheet in <head> so the page is valid HTML
  let html = "<!DOCTYPE html><html><head>";
  html += "<style type=\"text/css\">body{margin:40px auto;max-width:800px;font-size:18px;color:#333;padding:0 10px;}h1,h2,h3{line-height:1.2}pre{background:lightyellow;overflow:auto;padding:0 20px}</style>";
  html += "</head><body>";

  posts.forEach((post, i) => {
    html += formatFile(post, filenames[i]);
  });

  html += "</body></html>";
  return html;
}

function formatFile(contents, filename) {
  if (filename.endsWith('.md')) {
    // Markdown posts are rendered to HTML
    return markdown.render(contents);
  } else if (filename.endsWith('.html')) {
    // HTML posts are served as-is
    return contents;
  }
  // Skip any other file type instead of appending "undefined" to the page
  return '';
}

I've also added a poll() function, which limits pulls from the remote git server to at most one per minute. This is crucial in the event of a Denial of Service attack; otherwise every request to our web server would result in a request to GitHub. The DoS'ed would become the DoS'er.
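The throttling idea can also be sketched in isolation. createThrottle() and its injectable clock are hypothetical names used for illustration, not part of the blog's code; the fake clock is only there so the behaviour can be seen without waiting a real minute.

```javascript
// Hypothetical helper: wrap `task` so it runs at most once per `intervalMs`.
// `now` is injectable so the demo below can use a fake clock.
function createThrottle(task, intervalMs, now = Date.now) {
  let lastRun = -Infinity; // the task has never run yet

  return async function throttled() {
    if (now() - lastRun < intervalMs) {
      return false; // too soon: skip the task
    }
    // Record the time before awaiting, so requests that arrive while a
    // slow task is still running do not each trigger it again.
    lastRun = now();
    await task();
    return true;
  };
}

// Demo with a fake clock and a counter standing in for pull().
let fakeTime = 0;
let pulls = 0;
const throttledPull = createThrottle(async () => { pulls += 1; }, 60000, () => fakeTime);

async function demo() {
  await throttledPull(); // runs (first call)
  fakeTime = 30000;
  await throttledPull(); // skipped (30 seconds since the last run)
  fakeTime = 61000;
  await throttledPull(); // runs (61 seconds since the last run)
  return pulls;
}

const demoResult = demo();
demoResult.then((count) => console.log(count)); // 2
```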

Excellent work. We now have a fully operational blog just like that. Adding a new post is as simple as committing a new markdown file to our Git repo. We could still use things like page headers, search, pagination, and code cleanliness considerations like moving our CSS and HTML templating to separate files. I'll be exploring those topics in upcoming posts.

With fewer than 100 lines of code, we've created ourselves a beautiful minimalist blog, ready for production.

Other considerations

  • Markdown and HTML are supported, but what about other formats? LaTeX is a great format that we could utilize. What about PDFs?

  • One option I considered was using GitHub Pages for a blog, but that would limit the blog to static pages. One person built theirs using GitHub Pages and Jekyll, though Jekyll adds another step that needs to be automated for Continuous Deployment. What if you skipped the Jekyll step and instead used client-side JavaScript on a GitHub Pages site to make AJAX/fetch queries to your own GitHub Pages? You could maintain directory listings and implement previous/next buttons. There are a lot of great options down this rabbit hole, but without a true backend server we can't implement interactive elements like comments, voting, or comprehensive search.

  • Isomorphic-git actually supports using Git in the browser! In their notes they state,

    "Unfortunately, due to the same-origin policy by default isomorphic-git can only clone from the same origin as the webpage it is running on. This is terribly inconvenient, as it means for all practical purposes cloning and pushing repos must be done through a proxy."

    Despite the limitation, the idea of accessing the contents of a git repo directly from the browser is quite novel, and could be used as an alternative to traditional REST APIs for sharing data between server and client, similar to PouchDB or Firebase.