KA Engineering

We're the engineers behind Khan Academy. We're building a free, world-class education for anyone, anywhere.

Babel in Khan Academy's i18n Toolchain

by Kevin Barabash on Jun 22, 2015

We've been using ES6 (along with JSX) for some time at Khan Academy. Right now, we're using jstransform to compile our ES6 and JSX code to ES5, but we'd like to switch to babel. Some of the reasons for doing this include:

  • better support for ES6 + ES7
  • allows us to use eslint, making it easier for open source contributors to lint their code and run the tests in projects such as perseus.

i18n Workflow

Our i18n workflow on the frontend uses a custom plugin for jstransform which converts certain JSXElements into special function calls.

input:

<$_ first="Hayao" last="Miyazaki">
    Hello, %(first)s %(last)s!
</$_>
<$i18nDoNotTranslate>var x = 5;</$i18nDoNotTranslate>

desired output:

$_({ first: "Hayao", last: "Miyazaki" },
    "Hello, %(first)s %(last)s!"
);
$i18nDoNotTranslate("var x = 5;");

While babel has support for JSX, it transforms all JSXElements into calls to React.createElement(). This would result in the following incorrect output:

actual output:

React.createElement(
    $_,
    { first: "Hayao", last: "Miyazaki" },
    "Hello, %(first)s %(last)s!"
);
React.createElement($i18nDoNotTranslate, null,
    "var x = 5;");

Plugin

Before we can switch to babel, we need to customize babel's output when it encounters a <$_> or <$i18nDoNotTranslate> tag. We can use babel's plugin architecture to do this.

It's relatively straightforward. Each plugin is a node module that exports a single function, which returns a babel.Transformer instance. babel.Transformer takes two arguments: the name of the transformer as a string and an object containing callbacks.

module.exports = function (babel) {
    var t = babel.types;
    return new babel.Transformer("i18n-plugin", {
        JSXElement: function (node, parent, scope, file) {
            // inspect node, parent, scope, etc.
            // construct a tree and return its root

            // example:
            // construct a new "CallExpression"
            // assumes callee and args exist
            var call = t.callExpression(callee, args);

            // copy the location from the source node
            // so that line numbers can be maintained
            call.loc = node.loc;
            return call;
        }
    });
};

After the JavaScript source is parsed, babel will run the callback on each node of the specified type that it finds in the AST. An AST (Abstract Syntax Tree) is a tree structure where each node represents a part of the syntactic structure of a piece of code, such as statements, expressions, identifiers, literals, etc. Each key in the callbacks object should be one of the node types listed in the babel source. This list of nodes extends Mozilla's original Parser API.

Some notes about the example:

  • babel.types provides functions for creating new nodes
  • babel also supports calling on exit, or calling on both enter and exit if needed
  • full source code for the plugin is available in Khan/i18n-babel-plugin.
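
To make the traversal model more concrete, here is a minimal sketch of a type-keyed visitor in plain JavaScript. It is not babel's implementation: the traverse helper, the simplified node shapes, and the "name" and "text" fields on the JSXElement node are assumptions made for illustration, loosely modeled on the Parser API node format. Callbacks run after a node's children have been visited, and a returned node replaces the original.

```javascript
// Hypothetical, simplified traversal; NOT babel's actual implementation.
// Walks the tree depth-first and calls the visitor registered for each
// node's type. If the visitor returns a node, it replaces the original.
function traverse(node, visitors) {
    if (Array.isArray(node)) {
        return node.map(function (child) {
            return traverse(child, visitors);
        });
    }
    // Anything without a string "type" (strings, numbers, loc objects)
    // is not an AST node in this sketch, so return it untouched.
    if (node === null || typeof node !== "object" ||
            typeof node.type !== "string") {
        return node;
    }
    // Visit children first, then the node itself.
    Object.keys(node).forEach(function (key) {
        node[key] = traverse(node[key], visitors);
    });
    var visit = visitors[node.type];
    var replacement = visit ? visit(node) : undefined;
    return replacement === undefined ? node : replacement;
}

// A hand-written tree standing in for parser output (simplified shapes).
var tree = {
    type: "Program",
    body: [{
        type: "ExpressionStatement",
        expression: {
            type: "JSXElement",
            name: "$i18nDoNotTranslate",
            text: "var x = 5;",
            loc: { start: { line: 4, column: 0 } }
        }
    }]
};

var result = traverse(tree, {
    JSXElement: function (node) {
        return {
            type: "CallExpression",
            callee: { type: "Identifier", name: node.name },
            arguments: [{ type: "Literal", value: node.text }],
            // copy the location so line numbers can be maintained
            loc: node.loc
        };
    }
});
```

After the call, result.body[0].expression is the new CallExpression node, carrying the location of the JSXElement it replaced.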

Matching Output

When developing this plugin it was important that we match the output we were getting from jstransform so that babel could be a drop-in replacement without having to modify other parts of our build chain. In particular we needed to ensure that we were maintaining both line numbers in compiled code as well as whitespace within translation strings.

Line Numbers

Maintaining line numbers is important because not all of our build chain is source map aware. In particular kake, our custom build system, does not know how to deal with source maps. Babel's "retainLines" option takes care of this for us.

We did however find one issue with "retainLines". If a method call had 3 or more arguments then Babel would ignore "retainLines" and pretty print it so that each argument was on a separate line. Babel's maintainer sebmck was quite responsive and provided an update within a couple of hours.

Whitespace

As for whitespace within localized strings, any change in whitespace effectively produces a different string, which would then need to be re-translated for all of our localized sites.

To make sure that our Babel plugin produces calls to $_() with the same strings as jstransform, we needed to compare all of the JavaScript strings. One of our build steps generates a .pot file (used by Gettext http://en.wikipedia.org/wiki/Gettext) containing all of the strings on the site that need to be localized. We generated .pot files using both the jstransform and babel workflows and compared them using a Python script.

The script uses polib to parse the .pot files generated by the two workflows and iterate through the entries. It looks at the occurrences property to pick out the items that came from JavaScript and creates a dict from msgid->entry.

example.pot:

#: modules/user/views_handler_filter_user_name.inc:29
msgid "Enter a comma separated list of user names."
msgstr ""
#: modules/user/views_handler_filter_user_name.inc:112
msgid "Unable to find user: @users"
msgid_plural "Unable to find users: @users"
msgstr[0] ""
msgstr[1] ""

We then compared the two dicts and looked for differences in occurrences or strings. There were a few discrepancies in line numbers which had to be investigated manually. It turned out that the jstransform line numbers were off by a line from the source line numbers. While this was not an issue, there were quite a few strings that weren't the same. Close inspection of these revealed that the differences were differences in whitespace.
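
The comparison step can be sketched roughly as follows. The original script was written in Python with polib; this version uses plain JavaScript objects as stand-ins for parsed entries so that it stays in the same language as the rest of the examples. The indexByMsgid and diffEntries names, the entry shape, and the sample data are all hypothetical.

```javascript
// Hypothetical stand-in for the comparison script described above.
// Each entry mimics a parsed .pot entry: a msgid plus a list of
// [file, line] occurrences.
function indexByMsgid(entries) {
    var index = new Map();
    entries.forEach(function (entry) {
        // Keep only entries with at least one occurrence in a JS/JSX file.
        var fromJs = entry.occurrences.some(function (occ) {
            return /\.jsx?$/.test(occ[0]);
        });
        if (fromJs) {
            index.set(entry.msgid, entry);
        }
    });
    return index;
}

function diffEntries(oldIndex, newIndex) {
    var report = { missing: [], added: [], movedLines: [] };
    oldIndex.forEach(function (entry, msgid) {
        if (!newIndex.has(msgid)) {
            // String produced by the old workflow but not the new one.
            report.missing.push(msgid);
        } else if (JSON.stringify(entry.occurrences) !==
                JSON.stringify(newIndex.get(msgid).occurrences)) {
            // Same string, but reported at a different file or line.
            report.movedLines.push(msgid);
        }
    });
    newIndex.forEach(function (entry, msgid) {
        if (!oldIndex.has(msgid)) {
            report.added.push(msgid);
        }
    });
    return report;
}

// Made-up sample data: one whitespace mismatch, one line-number mismatch,
// and one non-JavaScript entry that should be ignored.
var jstransformEntries = [
    { msgid: "hello, world!", occurrences: [["a.jsx", "1"]] },
    { msgid: "Sign up", occurrences: [["b.js", "10"]] },
    { msgid: "Log in", occurrences: [["page.html", "3"]] }
];
var babelEntries = [
    { msgid: "hello,  world!", occurrences: [["a.jsx", "1"]] },
    { msgid: "Sign up", occurrences: [["b.js", "11"]] }
];

var report = diffEntries(
    indexByMsgid(jstransformEntries),
    indexByMsgid(babelEntries)
);
```

A whitespace difference shows up as one string in "missing" and a near-identical one in "added", which matches the kind of mismatch described above.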

Various patterns of carriage returns and spaces were producing the differences in whitespace. Creating test cases (and fixes) for a few of these situations and then re-running our string comparison script allowed us to quickly narrow the large number of mismatched strings down to a relatively small set of test cases. Below is one of the fixtures used by the test harness, which compiles input.jsx using our babel plugin and compares the output against expected.js.

test/fixtures/i18n-line-feed/input.jsx:

 1 var a = <$_>hello,
 2         world!
 3         </$_>;
 4 var b = <$_>
 5 
 6         hello,
 7         world!</$_>;
 8 var c = <$_>
 9         {"hello, "}
10         world!
11         </$_>;
12 var d = <$_>
13 hello, world!</$_>;

test/fixtures/i18n-line-feed/expected.js:

 1 var a = $_(null, "hello, world!");
 2 
 3 
 4 var b = $_(null, "hello, world!");
 5 
 6 
 7 
 8 var c = $_(null,
 9 "hello, ", "world!");
10 
11 
12 var d = $_(null, "hello, world!");
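
The expected strings in this fixture are consistent with the usual JSX convention for whitespace in text: leading whitespace is dropped on every line but the first, trailing whitespace on every line but the last, empty lines are discarded, and the remaining lines are joined with single spaces. A minimal sketch of that rule follows. cleanJSXText is a hypothetical name, and this illustrates the convention rather than reproducing the plugin's actual code.

```javascript
// Hypothetical helper illustrating the JSX text whitespace convention;
// not taken from the plugin's source.
function cleanJSXText(text) {
    var lines = text.split(/\r\n|\n|\r/);
    var kept = [];
    lines.forEach(function (line, i) {
        var cleaned = line.replace(/\t/g, " ");
        // Leading whitespace is only preserved on the first line.
        if (i !== 0) {
            cleaned = cleaned.replace(/^ +/, "");
        }
        // Trailing whitespace is only preserved on the last line.
        if (i !== lines.length - 1) {
            cleaned = cleaned.replace(/ +$/, "");
        }
        // Lines that are empty after trimming are dropped entirely.
        if (cleaned !== "") {
            kept.push(cleaned);
        }
    });
    return kept.join(" ");
}

// The text children of cases a, b and d from the fixture above.
var a = cleanJSXText("hello,\n        world!\n        ");
var b = cleanJSXText("\n\n        hello,\n        world!");
var d = cleanJSXText("\nhello, world!");
```

All three calls produce the same string, "hello, world!", so the three differently formatted elements map to a single translatable string.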

Issues

We also wanted to make sure that all of our JavaScript was being compiled correctly before rolling out these changes to all of our developers. We had already refactored our build scripts to compile our ES6 and JSX files so that we could extract localizable strings.

let

We started with manual testing. The homepage wasn't loading. Uh-oh. Investigation revealed that the compiled code contained the let keyword, which most browsers don't support. What's weird about this is that we didn't use let in any of our source code. Where was it coming from?

In the new build script we specify a whitelist of transformers for babel to use. This list is conservative. We wanted to match the functionality of jstransform and then adopt other features on an "as needed" basis. Here's the initial list of transformers we were using:

  • es6.arrowFunctions
  • es6.classes
  • es6.destructuring
  • es6.parameters.rest
  • es6.templateLiterals
  • es6.spread
  • es7.objectRestSpread

After doing some hunting I found out that some of the es6 transformers actually desugar ES6 to other ES6. In this case the es6.classes transformer was producing code with let.

source.js:

class MyAwesomeClass { ... }

compiled.js:

let MyAwesomeClass = function() { ... }

The fix was pretty simple: add es6.blockScoping to the whitelist.

functionName transformer shadows globals

The next issue we ran into was with a seemingly innocuous method. Here's the full mixin to give some context:

set-interval-mixin.js:

var SetIntervalMixin = {
    componentWillMount: function() {
        this.intervals = [];
    },
    setInterval: function(fn, ms) {
        this.intervals.push(setInterval(fn, ms));
    },
    componentWillUnmount: function() {
        this.intervals.forEach(clearInterval);
    }
};

It adds a setInterval method to other classes and makes sure that the intervals are cleaned up when the component unmounts.

The issue is that setInterval was being transformed to this:

setInterval: function setInterval(fn, ms) {
    this.intervals.push(setInterval(fn, ms));
}

By default babel turns anonymous function expressions into named function expressions. In most cases this wouldn't be an issue, but in this case the named function shadows the global setInterval. When the setInterval method is called on the object it ends up calling itself. The second time it's called, this is bound to window and it blows up.
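
The scoping rule behind this can be shown in isolation. In the sketch below, startTimer is a hypothetical stand-in for a global such as setInterval, and both objects are made up for illustration. The name of a named function expression is bound inside the function's own body, so a reference to that name there resolves to the function itself rather than to the outer function of the same name.

```javascript
// Hypothetical stand-in for a global function such as setInterval.
function startTimer() {
    return "outer startTimer";
}

var anonymousVersion = {
    // Anonymous function expression: `startTimer` in the body
    // resolves to the outer function.
    startTimer: function () {
        return startTimer;
    }
};

var namedVersion = {
    // Named function expression: the name is bound inside the body
    // and shadows the outer function of the same name.
    startTimer: function startTimer() {
        return startTimer;
    }
};

var outer = startTimer;
console.log(anonymousVersion.startTimer() === outer);                 // true
console.log(namedVersion.startTimer() === namedVersion.startTimer);   // true
```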

This issue was fixed after I erroneously reported it as a React bug; Ben Alpert correctly reported it as a babel bug, and Sebastian McKenzie, maintainer of babel, fixed it.

Summary

We're looking forward to using babel so that we can leverage the power of ES6's new features. Babel's plugin architecture is easy to use and helped us maintain our i18n workflow without a lot of work. The minor issues that did crop up were quickly resolved.

Thanks

We'd like to thank babel's maintainer Sebastian McKenzie for the quick turnaround when it came to dealing with issues in babel. Also, Ben Alpert was helpful in pointing out edge cases we hadn't thought about.