Khan Engineering

Khan Engineering

We're the engineers behind Khan Academy. We're building a free, world-class education for anyone, anywhere.


Latest posts

Making Websites Work with Windows High Contrast Mode

Diedra Rater on March 21

Kotlin for Python developers

Aasmund Eldhuset on Nov 29, 2018

Using static analysis in Python, JavaScript and more to make your system safer

Kevin Dangoor on Jul 26, 2018

Kotlin on the server at Khan Academy

Colin Fuller on Jun 28, 2018

The Original Serverless Architecture is Still Here

Kevin Dangoor on May 31, 2018

What do software architects at Khan Academy do?

Kevin Dangoor on May 14, 2018

New data pipeline management platform at Khan Academy

Ragini Gupta on Apr 30, 2018

Untangling our Python Code

Carter J. Bastian on Apr 16, 2018

Slicker: A Tool for Moving Things in Python

Ben Kraft on Apr 2, 2018

The Great Python Refactor of 2017 And Also 2018

Craig Silverstein on Mar 19, 2018

Working Remotely

Scott Grant on Oct 2, 2017

Tips for giving your first code reviews

Hannah Blumberg on Sep 18, 2017

Let's Reduce! A Gentle Introduction to Javascript's Reduce Method

Josh Comeau on Jul 10, 2017

Creating Query Components with Apollo

Brian Genisio on Jun 12, 2017

Migrating to a Mobile Monorepo for React Native

Jared Forsyth on May 29, 2017

Memcached-Backed Content Infrastructure

Ben Kraft on May 15, 2017

Profiling App Engine Memcached

Ben Kraft on May 1, 2017

App Engine Flex Language Shootout

Amos Latteier on Apr 17, 2017

What's New in OSS at Khan Academy

Brian Genisio on Apr 3, 2017

Automating App Store Screenshots

Bryan Clark on Mar 27, 2017

It's Okay to Break Things: Reflections on Khan Academy's Healthy Hackathon

Kimerie Green on Mar 6, 2017

Interning at Khan Academy: from student to intern

Shadaj Laddad on Dec 12, 2016

Prototyping with Framer

Nick Breen on Oct 3, 2016

Evolving our content infrastructure

William Chargin on Sep 19, 2016

Building a Really, Really Small Android App

Charlie Marsh on Aug 22, 2016

A Case for Time Tracking: Data Driven Time-Management

Oliver Northwood on Aug 8, 2016

Time Management at Khan Academy

Several Authors on Jul 25, 2016

Hackathons Can Be Healthy

Tom Yedwab on Jul 11, 2016

Ensuring transaction-safety in Google App Engine

Craig Silverstein on Jun 27, 2016

The User Write Lock: an Alternative to Transactions for Google App Engine

Craig Silverstein on Jun 20, 2016

Khan Academy's Engineering Principles

Ben Kamens on Jun 6, 2016

Minimizing the length of regular expressions, in practice

Craig Silverstein on May 23, 2016

Introducing SwiftTweaks

Bryan Clark on May 9, 2016

The Autonomous Dumbledore

Evy Kassirer on Apr 25, 2016

Engineering career development at Khan Academy

Ben Eater on Apr 11, 2016

Inline CSS at Khan Academy: Aphrodite

Jamie Wong on Mar 29, 2016

Starting Android at Khan Academy

Ben Komalo on Feb 29, 2016

Automating Highly Similar Translations

Kevin Barabash on Feb 15, 2016

The weekly snippet-server: open-sourced

Craig Silverstein on Feb 1, 2016

Stories from our latest intern class

2015 Interns on Dec 21, 2015

Kanbanning the LearnStorm Dev Process

Kevin Dangoor on Dec 7, 2015

Forgo JS packaging? Not so fast

Craig Silverstein on Nov 23, 2015

Switching to Slack

Benjamin Pollack on Nov 9, 2015

Receiving feedback as an intern at Khan Academy

David Wang on Oct 26, 2015

Schrödinger's deploys no more: how we update translations

Chelsea Voss on Oct 12, 2015

i18nize-templates: Internationalization After the Fact

Craig Silverstein on Sep 28, 2015

Making thumbnails fast

William Chargin on Sep 14, 2015

Copy-pasting more than just text

Sam Lau on Aug 31, 2015

No cheating allowed!!

Phillip Lemons on Aug 17, 2015

Fun with slope fields, css and react

Marcos Ojeda on Aug 5, 2015

Khan Academy: a new employee's primer

Riley Shaw on Jul 20, 2015

How wooden puzzles can destroy dev teams

John Sullivan on Jul 6, 2015

Babel in Khan Academy's i18n Toolchain

Kevin Barabash on Jun 22, 2015

tota11y - an accessibility visualization toolkit

Jordan Scales on Jun 8, 2015


Schrödinger's deploys no more: how we update translations

by Chelsea Voss on Oct 12, 2015

If you’re trying to bring the best learning experience to people around the world, it’s important to, well, think about the world.

Khan Academy is translated into Spanish and Turkish and Polish and more – and this includes not only text, but also the articles, exercises, and videos. Thanks to the efforts of translators, learners around the world can use Khan Academy to learn in their language.

"You can learn anything" in several languages

Internationalization is important. Internationalization is also an engineering challenge: it requires infrastructure to mark which strings in a codebase need to exist in multiple different languages, to store and look up the translated versions of those strings, and to show the user a different website accordingly. Additionally, since our translations are crowdsourced, we need infrastructure to allow translators to translate strings, to show translators where their effort is most needed, and to show these translations once they’re ready. There are many moving parts.

When I arrived at Khan Academy at the beginning of this summer, some of these moving parts in our internationalization infrastructure were responsible for most of the time our deploys took to finish. One of the things I accomplished this summer during my internship here was to banish this slowness from our deploy times.

The problem

Whenever we download the latest translation data from Crowdin, which hosts our crowdsourced translations, we rebuild the translation files – files which the Khan Academy webapp can read in order to show translated webpages. The next time an engineer deploys a new version of the webapp, these new translation files are then deployed as well.

Uploading files to Google App Engine, which hosts the Khan Academy website, is usually the slowest part of our deploys; the translation files are big, so translation files in particular are a major contributor to this. So, whenever the latest translations are downloaded and rebuilt, the next deploy would be quite a bit slower while it uploaded the changed files.

Furthermore, since it’s not always the case that translation files have been rebuilt recently, as an engineer it’s hard to tell whether the deploy you’re about to make will be hit with Translations Upload Duty or not. Sometimes deploys would take around 30 minutes, sometimes they would take around 75 minutes or more:

Graph of deploy times, before translation server

The previous state of affairs – 30-minute deploys punctuated by 75-minute deploys. There are a couple of lulls in this graph where deploys are consistently near 30 minutes: these reflect times when our download from Crowdin was not working.

The fix

We decided to rearrange the infrastructure around this so that instead of uploading translation files to Google App Engine (GAE) along with the rest of the webapp, we would upload the translation files to Google Cloud Storage (GCS) in a separate process and then modify the webapp to read the files from there.

Implementing this required making a few different changes, and the changes had to be coordinated in such a way as to keep internationalized sites up and running throughout the entire process:

  1. Upload the translation files to GCS whenever they're updated.
  2. Change the webapp to read translations from GCS instead of from GAE.
  3. Stop uploading the now-unnecessary translation files to GAE.

These steps by themselves are enough to implement the change to make deploys faster, but we also want to make sure that this project won't break anything – so instead, the steps look something like:

  1. Upload the translation files to GCS whenever they're updated. Measure everything. (How much will this cost? How fast will the upload be?)
  2. Change the webapp to read translations from GCS instead of from GAE. Measure everything. (How much slower is this than reading from disk? Will requests be slower?)
  3. Stop uploading the now-unnecessary translation files to GAE. Measure everything. (Do translated sites still work?)

One thing I experimented with while working on this project was keeping a lab notebook of sorts in a Google Doc; this was where I went to record everything I learned, from little commands that might be useful later, to dependencies I had to install, to all of the measurements I ended up making. This was a good decision. This habit and these notes did in fact turn out to be useful, frequently.

Mythbusters' Adam Savage: "Remember kids, the only difference between screwing around and science is writing it down."

The consequences

Deploys are faster!

I deployed the last piece of this project on August 20, 2015. Deploy times have been more consistent since: the graph of deploy times is free of the spikes that previously indicated translation uploads.

Graph of deploy times, after translation server

Before and after; the change happened on 8/20.

Translations can be updated independently!

Now we don't require an engineer to deploy new code in order to change the translations that appear on Khan Academy – translations are updated by a separate job. Also, this opens up new possibilities – we can now do exciting things like updating our languages independently of each other, and make it so that the time between when a translator makes a translation and when that translation shows up on the main site becomes even shorter. Our internationalization efforts will be able to push forward even faster!