HTML escaping: the great equalizer. Even kings must escape their HTML, lest they be exposed to XSS attacks. If you have some web development experience under your belt, you are probably already familiar with this. Everyone else, pay close attention to the next section.
What is HTML escaping?
In HTML, there are a handful of super special characters, namely: <, >, &, and ". When using these characters, it's important to let HTML know whether you want to use them normally or in their special capacity.
Say I have an image tag.
<img src="photo.jpg" alt="Indonesian Phrase of the Day: "Selamat pagi"" />
Do you see the problem? The first double quote inside the attribute value could be interpreted as the closing quote for the alt attribute, and not as a quotation mark. The correct version of this is:
<img src="photo.jpg" alt="Indonesian Phrase of the Day: &quot;Selamat pagi&quot;" />
Or what about the following somewhat contrived sentence:
If 3<x and x>y and y<4 then x and y must not be integers.
Are we opening a tag, x, with two HTML5-style attributes, "and" and "x"? No, but it might look like that to a computer.
The good news is that modern browsers are actually quite intelligent. In most cases, the browser understands what you meant, even when you don't properly escape your HTML. Of course, if we are using content submitted from untrusted sources, the issue becomes more serious. Say we collect user-submitted comments and put them in a paragraph tag, like so:
<p>{{ comment }}</p>
This is fine, unless the user submits a comment like this:
<script>something terrible...</script>
Whoops. Moral of the story: anytime you insert content into a template, whether it's from a trusted or untrusted source, better safe than sorry. Escape your HTML. For how to do that, continue below.
How to escape HTML in Liquid
You can escape HTML using the escape filter. So the following:
{{ 'html & escaping "go" <together>' | escape }}
Becomes:
html &amp; escaping &quot;go&quot; &lt;together&gt;
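To connect this back to the comment example from earlier, here is a minimal sketch of the same paragraph with the escape filter applied. The comment value is hard-coded purely for illustration; in a real template it would come from user-submitted content.

```liquid
{% comment %} Hypothetical value, assigned inline only for illustration. {% endcomment %}
{% assign comment = '<script>alert("hi")</script>' %}

{% comment %}
  The line below renders as:
  <p>&lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt;</p>
{% endcomment %}
<p>{{ comment | escape }}</p>
```

The browser then displays the script tag as literal text instead of running it.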
Escaping our site
Let's revisit our site and use HTML escaping where necessary. Where is it necessary? Anywhere content is coming from a back-office editable field that isn't already in an HTML format.
Open the app/views/pages/index.liquid page. First, let's replace the keywords and description meta tags with the code below.
<meta name="keywords" content="{{ site.meta_keywords | escape }} {{ page.meta_keywords | escape }}">
{% if page.meta_description %}
<meta name="description" content="{{ page.meta_description | escape }}">
{% else %}
<meta name="description" content="{{ site.meta_description | escape }}">
{% endif %}
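If your version of Liquid includes the default filter (an assumption worth checking against your setup), the if/else block for the description could be collapsed into a single line. This is only a sketch of an alternative, not a required change.

```liquid
{% comment %}
  Falls back to the site-wide description when the page has none,
  then escapes whichever value was chosen. Keep escape as the last
  filter so it applies to the final value.
{% endcomment %}
<meta name="description" content="{{ page.meta_description | default: site.meta_description | escape }}">
```

One behavioral difference to be aware of: default also falls back when the page description is an empty string, whereas the if tag treats an empty string as truthy and would output an empty description.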
Next, replace the title tag with the code below.
<title>
{{ page.title | escape }} - {{ site.name | escape }} |
{% if page.seo_title %}
{{ page.seo_title | escape }}
{% else %}
{{ site.seo_title | escape }}
{% endif %}
</title>
Two more places need fixing. The site title needs to be escaped.
<h1>{{ site.name | escape }}</h1>
As does the page heading.
<h2>{{ page.title | escape }}</h2>
Some of the editable content on the about page needs to be escaped. We don't need to escape the about_body field, because it uses the rich HTML editor. That means its content will already be in HTML format. However, the photo's caption uses raw text, so that must be escaped.
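The difference between the two kinds of field can be seen in a small sketch. Both values below are hypothetical and assigned inline only for illustration; in the real templates they come from the back-office fields.

```liquid
{% comment %} Hypothetical values standing in for back-office content. {% endcomment %}
{% assign rich_body = '<p>We love <strong>batik</strong>.</p>' %}
{% assign raw_caption = 'Sunrise & coffee at "Warung Kopi"' %}

{% comment %}
  Already HTML: output it as-is. Escaping it would show the
  <p> and <strong> tags to the visitor as literal text.
{% endcomment %}
{{ rich_body }}

{% comment %}
  Plain text: escape it. The line below renders as:
  <p>Sunrise &amp; coffee at &quot;Warung Kopi&quot;</p>
{% endcomment %}
<p>{{ raw_caption | escape }}</p>
```

Leaving the rich field unescaped assumes that only trusted editors can write to it from the back office.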
Let's escape the alt attribute's content in snippets/img_box.liquid so we don't have to worry about it again on other pages.
<div class="img-box">
<a href="{{ photo | resize: '800x>' }}">
<img src="{{ photo | resize: size }}" alt="{{ caption | escape }}" />
</a>
<p>{{ caption | escape }}</p>
</div>
The final portion of the site that needs escaping is the site title in the footer snippet.
<p>
© {{ site.name | escape }} {{ now | localized_date: '%Y' }}
</p>
Finishing up
This mini-chapter went by quickly. Let's save our changes to git.
$ git add app/views/pages/index.liquid
$ git add app/views/snippets/footer.liquid app/views/snippets/img_box.liquid
$ git commit -m "Added proper HTML escaping."
Layering HTML escaping onto the beginning tutorials would have been a little tedious, so I put off covering it until now. Now that it's out of the way, we can forge ahead to more exciting waters: content models.