Monday, May 21, 2018

Documentation for today's programmer

When creating documentation, whether project, lab, or technical, you will have run into the challenge of needing to move that documentation to different formats: PDF is a popular one, but perhaps also HTML, particularly if your company uses a wiki for such things. At nsquared, we found that moving documents between formats can get frustrating, not only because they do not always come across cleanly, but also because once the copies differ, you have to maintain several separate documents. Time to solve this using freely available tools: Markdown, Pandoc, and PowerShell.

The solution is reasonably simple. You can still write your document in your favourite word processor; however, keep the formatting to a minimum (avoid anything more complex than bold, italics, and hyperlinks). You can add images too, but do not do it in your word processor. Once your file is ready, save it as a .docx so that we can get underway in earnest. The first part covers converting your document to the Markdown format.

Steps to convert from docx to Markdown:
1. Download and install Visual Studio Code.
  • We will use this to edit your document later; it will very soon become your go-to program.
2. Download and run the Pandoc installer (choose the latest Windows ‘x86_64.msi’ file).
  • Pandoc is a freely available program that handles the conversion of your documents. It supports a host of output formats, including docx, HTML, Markdown, PDF, LaTeX, and plain text, to name a few.
3. If you are running Windows 10, you will already have PowerShell available. This solution is written for Windows, though it is transferable to an Apple Mac via Terminal. Once you are ready, launch PowerShell (found by typing 'power' into the search box of the Windows menu).
4. You now need to navigate PowerShell to the location of your document. Generally, it will start in your user folder (C:\Users\YourAccount). You can use the 'cd' command, followed by the path to your document's folder, to get there quickly:
  • Type: cd 'C:\Users\YourAccount\Documents'
  • Make sure to replace the path with the location of your document (the example above uses the 'Documents' folder).
5. With the above completed, the path shown in the PowerShell prompt will match what you just typed. This means PowerShell now runs commands from that location.
6. Now it is time to utilise Pandoc. In PowerShell, type: pandoc 'YourDoc.docx' -f docx -t markdown -s -o 'YourNewDoc.md'
  • Replace 'YourDoc' with the name of your current document, and 'YourNewDoc' with the name you want for the converted file.
  • To learn about the options available in Pandoc, visit its documentation page.
7. With that all in place, press the Enter key on your keyboard to run the command.
8. Your document will be converted to Markdown. The original .docx is retained, though you will not need it by the end, so do with it what you like.
9. You have successfully converted your document from .docx to Markdown. The next part is to update your document using Markdown.
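
The PowerShell part of the steps above (steps 4 to 7) can be sketched as one short script. This is only a sketch under the same placeholder names used in the steps; the variable names are illustrative, and the commented-out --extract-media variant is an optional extra that is not part of the original steps.

```powershell
# Placeholder folder and file names from the steps above; replace with your own.
$docFolder = 'C:\Users\YourAccount\Documents'
$source    = 'YourDoc.docx'
$target    = 'YourNewDoc.md'

# Check that PowerShell can find Pandoc. If this reports an error, open a new
# PowerShell window so that it picks up the freshly installed program.
Get-Command pandoc

# 'cd' is an alias for Set-Location.
Set-Location -Path $docFolder

# -f = input format, -t = output format, -s = standalone file, -o = output file.
pandoc $source -f docx -t markdown -s -o $target

# Optional variant (not part of the original steps): also extract any images
# embedded in the .docx into a 'media' folder next to the Markdown file.
# pandoc $source -f docx -t markdown -s --extract-media=. -o $target

# Confirm that the converted file exists.
Get-Item -Path $target
```

Keeping the names in variables at the top means only those three lines need changing for the next document.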


Steps to update your Markdown document for easy conversion:
1. Open Visual Studio Code. Once open, you will be presented with a (mostly empty) window. This might look partly familiar if you have used any other Visual Studio program; Visual Studio Code is the lightweight version, yet it is remarkably powerful and allows you to write easily in many programming languages.
2. Click File > Open File, and then browse to your new Markdown file.
3. With your file now open in Visual Studio Code, you will notice that it looks very plain. This is the power of Markdown: it uses only the most basic formatting, which allows it to be converted easily into other formats.
4. Here is an excellent cheat sheet of how Markdown works. Have it at the ready for the next few steps.
5. Now you will need to open the preview page (which shows you what your document will look like with Markdown applied). With your document open, navigate to the top right, and click the split window icon with the magnifying glass in front of it.
6. You will be presented with a panel to the right of your Markdown document, showing what the output will be. You will notice that all the Markdown tags (#/*/---/```) are gone, and just plain text appears, with light formatting.
7. Now, using the cheat sheet as a guide, update your Markdown document so that it presents how you would like it.
8. With your Markdown complete, save your changes and close Visual Studio Code; you are ready for further conversion!

Steps to convert your Markdown to HTML:
1. Open PowerShell once more, and navigate to your Markdown document:
  • Remember to use the ‘cd’ command and the path to your document's folder: cd 'C:\Users\YourAccount\Documents'
2. Once PowerShell is in the same location as your Markdown document, use the following command to convert from Markdown to HTML:
  • pandoc 'YourNewDoc.md' -f markdown -t html -s -o 'YourNewWebpage.html'
3. Your document is now in HTML! If you had images and set them up correctly (according to the cheat sheet), they will have come across cleanly, creating a ‘media’ folder along the way, to use with your new webpage.

As you will realise, from now on you simply need to maintain your Markdown document and convert it whenever you need to; as mentioned previously, this works for PDF and docx too, so you can always produce those formats if you need them.
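
As a rough illustration of that single-source idea, the sketch below produces all three formats from the same Markdown file. The output file names are placeholders, and the PDF line rests on an assumption: Pandoc's default PDF route goes through pdflatex, so a LaTeX distribution must already be installed for it to work.

```powershell
# Placeholder name; run this from the folder that holds the Markdown file.
$markdown = 'YourNewDoc.md'

# Web page, using the same command as in the HTML steps.
pandoc $markdown -f markdown -t html -s -o 'YourNewWebpage.html'

# Word document. Pandoc infers the output format from the .docx extension.
pandoc $markdown -f markdown -s -o 'YourNewDoc.docx'

# PDF. Assumption: a LaTeX distribution (for example MiKTeX) is installed,
# because Pandoc's default PDF route goes through pdflatex.
pandoc $markdown -f markdown -s -o 'YourNewDoc.pdf'
```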

This is only the start of your Markdown journey; expect to go further. Using Pandoc and PowerShell, you could put together a PowerShell script that automatically converts your latest Markdown document to HTML, so that you can keep working on your documents without worrying about the export process. This is an excellent workflow and may help you increase efficiency!
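
One possible shape for such a script is sketched below. It is not the script used at nsquared; the function names Get-HtmlPath and Convert-LatestMarkdown are hypothetical, and "latest" is assumed to mean the most recently modified .md file in a folder.

```powershell
# Hypothetical helper: works out the .html name for a given Markdown file.
function Get-HtmlPath {
    param([Parameter(Mandatory)][string]$MarkdownPath)
    [System.IO.Path]::ChangeExtension($MarkdownPath, '.html')
}

# Hypothetical helper: converts the most recently edited .md file in a folder.
function Convert-LatestMarkdown {
    param([string]$Folder = '.')

    # Pick the Markdown file with the newest LastWriteTime.
    $latest = Get-ChildItem -Path $Folder -Filter '*.md' -File |
        Sort-Object -Property LastWriteTime -Descending |
        Select-Object -First 1

    if (-not $latest) {
        Write-Warning "No Markdown files found in '$Folder'."
        return
    }

    $htmlPath = Get-HtmlPath -MarkdownPath $latest.FullName

    # Same Pandoc command as in the HTML steps, with the names filled in.
    pandoc $latest.FullName -f markdown -t html -s -o $htmlPath

    # Pandoc returns exit code 0 when the conversion succeeds.
    if ($LASTEXITCODE -eq 0) {
        Write-Host "Converted $($latest.Name) to $htmlPath"
    }
}
```

After loading the functions, a call such as Convert-LatestMarkdown -Folder 'C:\Users\YourAccount\Documents' would convert the newest Markdown file in that folder and write the HTML file next to it.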

Elliot Moule
