Browser Mechanics In My Own Words, Part 2: The Browser is An Interpreter

Posted July 15, 2018 in Research, Tutorials+Tips

So far in our story of the browser, the browser has received bytes of data from a network HTTP request and decoded them into Unicode code points, as web standards require. Our ultimate goal, remember, is to “turn bytes into trees” so that the browser has the right instructions for rendering. If you’d like to step back and read it, Part 1 is here.

What’s next? What does a browser do with a bunch of U+74 U+68 U+69 U+73? Well…it parses those units in order to make meaning from them. That’s a big topic, and as I was researching, I had a very cool aha! moment that really helped solidify my mental model for HTML and CSS as programming languages. So, before we get into the details of parsing I want to take a small detour and set the stage.
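To make those code points concrete, here is a minimal sketch in Python (my choice for illustration, not something a browser uses for this). It decodes four UTF-8 bytes and prints each character's code point. The variable names are made up for the example.

```python
# Four bytes, as they might arrive over the network (UTF-8 encoded text).
raw_bytes = bytes([0x74, 0x68, 0x69, 0x73])

# Decoding turns the bytes into a string of Unicode code points.
text = raw_bytes.decode("utf-8")

# ord() gives the numeric code point of each character.
code_points = [f"U+{ord(ch):X}" for ch in text]

print(text)         # this
print(code_points)  # ['U+74', 'U+68', 'U+69', 'U+73']
```

At this stage the browser only knows which characters it has. It still doesn't know what they mean.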

As we discussed in Part 1, the bytes a browser receives are decoded to HTML, CSS, and JS. The thing is, those languages are not something a computer’s hardware can execute directly. This is a fundamental problem in computer science: the need to transform the code we humans understand into instructions a computer can execute.

A metaphor

Think of a time when you read some “legalese” or other dense, jargon-heavy text that was difficult or impossible to understand. Maybe you thought, “Ugh, I wish this was written in plain language!”. This is the same way a computer feels (except computers don’t have feelings…right?) when it sees the instructions we write in programming languages. A computer can’t do anything with the code we write as-is; there must be an additional step that reads that code for the computer.

In computer science, the notion of reading refers to translating that jargon-heavy text into plain language. Instead of plain language, however, a computer wants machine code, a.k.a. binary 1s and 0s, which we humans often write out as hexadecimal bytes to make them easier to read. Yep, those bytes are the same format we saw coming to the browser through HTTP.
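One thing worth seeing for yourself: binary and hexadecimal are two ways of writing down the same byte. A tiny Python sketch, using the byte for the letter “t” as an arbitrary example:

```python
value = 0x74  # the byte for the letter "t"

as_binary = format(value, "08b")  # eight binary digits
as_hex = format(value, "02x")     # two hexadecimal digits

print(as_binary)  # 01110100
print(as_hex)     # 74

# Both strings describe the same number.
print(int(as_binary, 2) == int(as_hex, 16))  # True
```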

So, all code we write, no matter the language, must eventually become numerical machine code that hardware can understand. The further the language is from machine code, the higher level the language is; on the flip side, languages that are closer to machine code are lower-level languages. Program instructions written in HTML and CSS are very high-level, while instructions in Rust or C++, languages used in browser development, are lower-level and “closer to the metal”, so to speak.

Interpreters vs. Compilers

There are two main terms used to describe the process for translating instructions written in a programming language to instructions in machine code: compiler and interpreter.

If you are in the web development world, like me, you may have heard the term “compiler” in the context of Sass and other pre-processors; the Sass compiler transforms the Sass we write into CSS the browser can read. I can’t say I have a specific web development association with the term interpreter, however.

Let’s take a look at a few definitions for both, starting with compiler:

A compiler takes entire program and converts it into object code which is typically stored in a file.

Geeks for Geeks

A compiler is computer software that transforms computer code written in one programming language (the source language) into another programming language (the target language).

Wikipedia

Sure, that makes sense and fits right in with my understanding of a Sass compiler. Now, some definitions for interpreter:

An interpreter directly executes instructions written in a programming or scripting language without previously converting them to an object code or machine code.

Geeks for Geeks

In computer science, an interpreter is a computer program that directly executes, i.e. performs, instructions written in a programming or scripting language, without requiring them previously to have been compiled into a machine language program.

Wikipedia

Wait…what? Wouldn’t that be impossible? Isn’t translation to machine language fundamental to how computers work? Does an interpreter skip that step?! No, it certainly doesn’t skip that step, but it’s nuanced.

A compiler has a tangible, static outcome: it takes an input of instructions written in a programming language and outputs a translation of those instructions, usually machine code, into a different file. A compiler does not execute your code; an interpreter does.

So, an interpreter both reads and executes a programmer’s instructions, and – based on my research – the steps in that process vary quite a bit from language to language. Historically, the difference was that a compiler read and compiled an entire program, while an interpreter went through a program line-by-line, reading and executing each in turn.
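Here is a rough sketch of that difference, written in Python. The three-instruction “language” (set, add, show) is something I made up for this example, and so are the function names. The toy compiler translates the whole program into Python source text and stops there, while the toy interpreter reads and executes one line at a time.

```python
PROGRAM = """\
set x 2
add x 3
show x
"""

def compile_program(source: str) -> str:
    """Translate the whole program into Python source text, without running it."""
    out = []
    for line in source.splitlines():
        op, name, *rest = line.split()
        if op == "set":
            out.append(f"{name} = {int(rest[0])}")
        elif op == "add":
            out.append(f"{name} += {int(rest[0])}")
        elif op == "show":
            out.append(f"print({name})")
        else:
            raise ValueError(f"unknown instruction: {op}")
    return "\n".join(out)

def interpret_program(source: str) -> list:
    """Read and execute the program line by line; return the values it showed."""
    variables = {}
    shown = []
    for line in source.splitlines():
        op, name, *rest = line.split()
        if op == "set":
            variables[name] = int(rest[0])
        elif op == "add":
            variables[name] += int(rest[0])
        elif op == "show":
            shown.append(variables[name])
            print(variables[name])
        else:
            raise ValueError(f"unknown instruction: {op}")
    return shown

compiled = compile_program(PROGRAM)
print(compiled)                      # translated text; nothing has run yet
result = interpret_program(PROGRAM)  # prints 5 as it executes
```

The compiler's output is a new piece of text that something else still has to run. The interpreter's output is the program's behavior itself.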

“Interpreter” in today’s array of programming languages, however, seems to be a bit of a catch-all term. It’s not a traditional compiler? It’s an interpreter…and note that many interpreters contain compilers!

When used in the context of describing programming languages, one could say that Sass is a compiled programming language in the same way that C++ is a compiled programming language. Python is an interpreted programming language. JavaScript is compiled and interpreted.

What about HTML and CSS? Hmm…I don’t want to answer that question quite yet.

I like this sentiment from Matt Esch on this StackOverflow thread: Remember, using the terms compiled and interpreted to describe a programming language is describing the implementation of the programming language, not characteristics of language itself.

Back to metaphors

In our jargon-y text metaphor at the beginning of this post, a compiler would be a friend who wrote out a translation of the jargon and handed you the plain language version to read. An interpreter would be a friend who read and translated the text line by line for you.

You could also watch this awesome, vintage video about aliens as compilers and interpreters!

What does this have to do with a browser?

Oh gosh, everything! The idea for a browser didn’t just “poof” out of nowhere. Sir Tim Berners-Lee, a computer scientist and inventor of the World Wide Web, was very aware of compilers and interpreters. The term compiler was coined all the way back in the 1950s by Grace Hopper, and Steve Russell implemented the first Lisp interpreter shortly thereafter.

The concept of an interpreter has everything to do with the browser…because that’s what it is! At a high level, Sir Tim Berners-Lee built an interpreter for HTML. The browser does a lot of other things, too, such as providing an interface for browsing.

Via HTTP requests, our computers receive program instructions – from anyone, anywhere (wow!) – that the browser reads and executes on our own, personal computers. HTML, CSS, and JavaScript are the programming languages interpreted by the browser. The web is a domain for sharing programs written in these languages. So cool!
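To get a feel for “reads and executes”, here is a very loose sketch using Python's built-in html.parser module. The TinyRenderer class and its rendering rules are hypothetical and nothing like what a real browser does. It only shows the idea of reading markup and acting on it as it goes.

```python
from html.parser import HTMLParser

class TinyRenderer(HTMLParser):
    """Hypothetical toy: 'executes' a few tags by producing plain-text lines."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._current_tag = None

    def handle_starttag(self, tag, attrs):
        # Remember which element we are inside of.
        self._current_tag = tag

    def handle_endtag(self, tag):
        self._current_tag = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._current_tag == "h1":
            self.lines.append(text.upper())   # headings are shouted
        elif self._current_tag == "li":
            self.lines.append("* " + text)    # list items get a bullet
        else:
            self.lines.append(text)

renderer = TinyRenderer()
renderer.feed("<h1>Hello</h1><p>Welcome to my page.</p><ul><li>One</li><li>Two</li></ul>")
renderer.close()
print("\n".join(renderer.lines))
```

The same HTML could be “executed” by a different renderer in a completely different way, which is part of what makes it a language rather than a picture.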

Interestingly, browsers today also contain compilers called just-in-time (JIT) compilers that offer the flexibility of interpreters along with the speed of compilation. JIT compilers compile and cache the source code in parts, rather than compiling all of it at once. I have a feeling this will come up more in a different post, so for now I will leave you with Lin Clark’s excellent, illustrated explanation of JIT compilers and a list of resources for TurboFan, a compiler included in Google’s JavaScript engine, V8.
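The “compile in parts and cache” idea can be sketched in a few lines of Python. This is only an illustration under big assumptions: real JIT compilers watch for frequently run code and emit machine code, while Python's built-in compile() produces a bytecode object. The run_snippet function and the cache are names I made up.

```python
compiled_cache = {}  # source text -> compiled code object
compile_count = 0    # how many times we actually had to compile

def run_snippet(source: str, variables: dict):
    """Compile a snippet the first time it is seen, then reuse the cached result."""
    global compile_count
    code = compiled_cache.get(source)
    if code is None:
        code = compile(source, "<snippet>", "eval")
        compiled_cache[source] = code
        compile_count += 1
    return eval(code, {}, variables)

results = [
    run_snippet("price * quantity", {"price": 3, "quantity": n})
    for n in range(4)
]

print(results)        # [0, 3, 6, 9]
print(compile_count)  # 1
```

The snippet runs four times but is only compiled once, and only that one snippet was compiled rather than the whole program up front.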

(Thanks to Sarah for cluing me in on JITs ^^!)

A question and an answer

When I wrote this post, I thought: wouldn’t websites be a lot faster if HTML, CSS, and JS were compiled on servers, then bytes of machine code were sent over HTTP, immediately executable on our personal computers? Why do we send bytes of source code that must be interpreted? Why isn’t the browser more like an HTML, CSS, and JS compiler?

My first thought was that this is what keeps the web open. If we sent machine code to browsers, there would be no “View Source”. Another reason, of more dire consequence, is security. What if there was a bunch of malicious nonsense in those bytes and the receiving computer just ran it, no questions asked? Further, different machines have different processors, so sending executable machine code would mean that all processors would have to understand the same instructions, and that’s just not how it is.

Those reasons, plus the existence of JIT compilers and the fact that parsing isn’t all that cumbersome, mean we won’t be seeing HTML/CSS compilers anytime soon.

(Thanks to Emilio for clearing that ^^ up for me!)

With that, I’m going to close up Part 2! As of now, the plan is for Part 3 to be a deep dive into the steps in interpreting HTML, CSS, and JS (and many other programming languages!): parsing, tokenization, and tree creation.
