The Music of the Algorithms

Thoughts on Computing, the Universe, and Everything

Embrace good typography on the Web

Published on January 9, 2016. Typography, LaTeX, Web

The modern web supports beautiful graphics, responsive layouts, interactivity, even 3D graphics, but typography usually remains in the background. With this site I wanted to experiment with the tools made available by the latest HTML5 and CSS3 technologies to obtain print-level typography for my blog posts.

Typography on the web is often overlooked. Web sites often seem not to care about the readability of their content, and I think this has to do with the perceived difficulty of obtaining something different from the default, with results consistent across browsers. I am not an expert in web technology at all. However, when I first thought about redesigning my personal blog, which resulted in the site you are currently reading, I chose to keep the layout simple, focusing instead on exploiting the available web technologies to obtain the best possible typographical results. This post surveys what I wanted to do and what I’ve learnt trying to do it.

Fonts

In the digital typography world, when one asks for quality, the answer is \(\LaTeX\). I love it. One of its major drawbacks, however, is that it is by design intrinsically tied to the paper medium. Some attempts to adapt \(\TeX\) and \(\LaTeX\) to other media such as web pages exist, but they are clumsy at best. What a web site can do, however, is embed the same typographical elements and guidelines that make \(\LaTeX\) output look so good.

The first element in this regard is the font used to render the text. Thanks to this website, I’ve managed to use Computer Modern, the same font used by default in \(\LaTeX\) documents. I use it in both serif and sans-serif variants, and it looks incredibly good. This has been possible thanks to the CSS3 standard, which now supports custom web fonts. The supported font format, of course, varies between browsers, but fortunately the font comes nicely packaged in all the different formats needed to support the major browsers. As a rule of thumb, all the web site elements are rendered in sans-serif, while the paragraphs of the content are set in serif.
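
As a rough sketch of what embedding such a font involves, a CSS3 @font-face rule along these lines declares a web font and then applies it. The family names, the file paths, and the article selector are placeholders assumed for illustration, not necessarily those of the actual font package or of this site:

```css
/* Hypothetical family names and file paths: adapt them to the downloaded package. */
@font-face {
  font-family: "Computer Modern Serif";
  src: url("fonts/cmunrm.woff") format("woff"),
       url("fonts/cmunrm.ttf") format("truetype");
  font-weight: normal;
  font-style: normal;
}

@font-face {
  font-family: "Computer Modern Sans";
  src: url("fonts/cmunss.woff") format("woff"),
       url("fonts/cmunss.ttf") format("truetype");
  font-weight: normal;
  font-style: normal;
}

/* Site elements in sans-serif, with a generic fallback. */
body {
  font-family: "Computer Modern Sans", sans-serif;
}

/* Content paragraphs in serif ("article" is an assumed container element). */
article p {
  font-family: "Computer Modern Serif", serif;
}
```

Listing several formats in src lets each browser pick the first one it understands, which is why a package shipping all the formats is convenient.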

In addition to the main font, in a Computer Science blog it is also important to typeset source code correctly and nicely. I’ve chosen Hack, an open source monospace font designed to render source code on the screen. You can see how it looks later in this post. It is, by the way, also the font that I regularly use in my text editors.

Paragraph layout

Another important aspect of a good-looking document is paragraph justification. Web sites tend not to justify text, leaving everything left-aligned, which looks very bad to me. Justification is, however, not easy at all to get right. Although the CSS3 standard natively supports text justification (it suffices to specify the text-align: justify property), it is completely useless without proper support for hyphenation. Justified but non-hyphenated paragraphs look very weird, because the browser has to put too much space between words.

Unfortunately, while support for automatic hyphenation of text theoretically exists (with the hyphens: auto property), current browsers do not support it consistently. What they do support is hyphens: manual, which is the default. Manual hyphenation consists in putting a soft hyphen character (&shy; in HTML) at the right points within words, in order to tell the browser where each word can be broken. This mechanism works well in any browser, but it means that all the text of the page has to be pre-hyphenated and filled with those characters. This is, I think, the reason why nobody does it, but thanks to Hakyll, the preprocessing software tool that I used to create this website, this is not so hard after all (more on this in a later post).
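
Putting the two pieces together, the paragraph styling could be sketched as follows. This is a minimal sketch, not necessarily the exact rule used on this site; the vendor-prefixed variants are an assumption about what older browsers may need:

```css
p {
  /* Stretch every line to the full width of the text column. */
  text-align: justify;

  /* Break words only at explicit soft hyphens (&shy;) in the markup.
     "manual" is the initial value, so this merely makes the intent explicit. */
  -webkit-hyphens: manual;
  -moz-hyphens: manual;
  -ms-hyphens: manual;
  hyphens: manual;
}
```

With this rule the quality of the justified text depends entirely on where the soft hyphens were inserted beforehand.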

The difference from how the same paragraph would be rendered by \(\LaTeX\) is still visible, since the algorithms used by the browser to lay out paragraphs in real time do not produce optimal line breaks, but the result is definitely worth the effort.

A few other minor details about paragraphs are worth mentioning. As you can see, this post is typeset as is usual in books and papers, with no space between paragraphs. To help the eye recognize paragraph breaks, a small indentation is put at the beginning of each paragraph instead. The first paragraph, on the other hand, does not need to be visually separated from the title, so it is not indented. This is achieved quite easily in CSS by specifying the text-indent property with an adjacent sibling selector:

p + p {
  text-indent: 15px;
}

Spacing between lines is left to the default values set by Bootstrap. A really nice touch, suggested by The Elements of Typographic Style Applied to the Web, is that of synchronizing the rhythm of the text paragraphs with that of the menu items in the sidebar. What does this mean? If you look at the beginning of the post, you can notice that the spacing between the elements of the page is such that the paragraph lines on the left are vertically aligned with the menu items on the right. This detail is important to reduce visual clutter, and it is ensured by carefully specifying every vertical distance as a function of the line height computed by Bootstrap from the font size.
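
To make the idea concrete, here is a hedged sketch of such a vertical rhythm. It assumes Bootstrap 3 defaults (14px text with a computed line height of 20px); the .sidebar class name and the heading sizes are hypothetical, chosen only so that every vertical distance is a whole multiple of the 20px row:

```css
/* Assumed baseline row: 20px (14px text at Bootstrap 3's default line height). */
p,
.sidebar li {            /* .sidebar is a hypothetical class name */
  font-size: 14px;
  line-height: 20px;     /* 1 row per line of text */
  margin-top: 0;
  margin-bottom: 0;
}

/* A heading occupies a whole number of rows: 20 + 40 + 20 = 80px = 4 rows. */
h2 {
  font-size: 24px;
  line-height: 40px;     /* 2 rows */
  margin-top: 20px;      /* 1 row */
  margin-bottom: 20px;   /* 1 row */
}

/* Code blocks keep the same row height, with one empty row above and below. */
pre {
  line-height: 20px;
  margin: 20px 0;
  padding: 0;
  border: 0;
}
```

As long as every element's total height is a multiple of the row, text in the main column and in the sidebar stays on the same grid no matter what comes in between.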

Mathematics typesetting

Mathematical typesetting is what made LaTeX so heavily used in scientific environments. What can be done in a web page to get similar results? We can simulate LaTeX from scratch, of course! That is what the authors of MathJax must have thought when they designed this awesome library. All the mathematical text and equations on this site are typeset by MathJax. All I had to do to enable it was to load the script from the CDN:

<script
  type="text/javascript"
  src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS_HTML.js">
</script>

Following this tag, MathJax automatically intercepts any math delimiters in the HTML source code (by default \(...\) for inline math and a pair of double dollar signs, $$...$$, for displayed equations) and translates the contents into a typeset equation using a full-fledged implementation of the TeX math layout algorithm.

Things to improve

Considering the fact that I haven’t touched a line of HTML/CSS in almost a decade, I’m pretty satisfied with the results of my experiment. However, there are still some details to fix. The first item on the TODO list is to fix the horizontal scrolling of source code listings, which currently does not work well, as you can see above.
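
A minimal sketch of one possible fix for the scrolling, assuming the problem comes from the framework's default pre styles forcing long lines to wrap (an assumption, not something verified on this site):

```css
pre {
  overflow-x: auto;     /* show a horizontal scrollbar only when a line overflows */
  white-space: pre;     /* keep each source line on a single line */
  word-wrap: normal;    /* undo any break-word rule inherited from the framework */
  word-break: normal;   /* undo any break-all rule inherited from the framework */
}

pre code {
  white-space: inherit; /* let the inner code element follow the pre block */
}
```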

Then, the next big thing is a CSS sheet for printing posts. It should be rather easy, but it takes time to craft the details. Another problem is that I am not fully satisfied with the spacing between paragraphs and section titles, and with that of code blocks and images. Also, even if the hyphenation mechanism works very well together with the text justification, the automatic hyphenation of words is not perfect. The algorithm, implemented by a Haskell library used by Hakyll, does not take into account some small details, such as the fact that words should not be broken right after the first syllable, as happens for the word be-tween some lines above. I’m afraid this can only be fixed by looking at the library’s source code, but I will never have enough spare time to do it. For everything else, any suggestion is welcome!