Khan Engineering

We're the engineers behind Khan Academy. We're building a free, world-class education for anyone, anywhere.

Using static analysis in Python, JavaScript and more to make your system safer

by Kevin Dangoor on July 26

“Linting” source code to look for errors is nothing new (the original “lint” tool turned forty this year!), but most places I worked prior to Khan Academy didn’t use linters as extensively as we do here. So, I thought I’d share a bit about a few of our custom linters in hopes that others may invest a little time to prevent more bugs.

For more than just formatting

JavaScript programmers will be familiar with tools like ESLint and Python programmers may be familiar with pylint. Teams spend time arguing about what their code style is and then configure the linters to enforce that consistent style. For many folks, I’d imagine that their primary experience with linters is something along the lines of “oh, that’s the tool that complains when I put my brace in the wrong place.”

I’m a big fan of Prettier, which has completely eliminated both formatting errors and discussion of code formatting for our JavaScript code. Even with code formatting being a “solved problem”, we rely more on linters than ever. They serve the important purpose of maintaining code quality by preventing known bad patterns from sneaking in.

Staying in sync

In our web application, we’ve got code written in JavaScript, Python, and Kotlin. One bit of complexity that naturally comes from having multiple languages is that sometimes you’ll need to keep files in sync. Imagine that we’ve duplicated a small bit of logic between two of the languages, or that there’s an interface shared between the two that must be changed in tandem. Sure, we do our best to minimize those cases, but they can be hard to avoid entirely.

Our code_syncing_lint.py linter is set up to handle this problem. It defines this comment format:

# sync-start:<tag> <filename>
# sync-end:<tag>

tag is a name you give to the block of code. The linter ensures that if a commit changes a tagged block in one file, the same commit also changes the block with the same tag name in filename.
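As a rough illustration of the idea (not the actual code_syncing_lint.py), here is a minimal Python sketch. The regular expressions, the function names find_sync_blocks and unsynced_tags, and the shape of their arguments are all assumptions made for this example. Because it only searches for the marker text, it doesn't care whether the comment starts with # or //.

```python
import re

# Hypothetical patterns; the real linter may parse the markers differently.
SYNC_START = re.compile(r"sync-start:(?P<tag>\S+)\s+(?P<filename>\S+)")
SYNC_END = re.compile(r"sync-end:(?P<tag>\S+)")


def find_sync_blocks(text):
    """Return {tag: (target_filename, start_line, end_line)} for one file.

    Line numbers are 1-indexed and include the marker comments themselves.
    """
    blocks = {}
    open_blocks = {}  # tag -> (target filename, line number of sync-start)
    for lineno, line in enumerate(text.splitlines(), start=1):
        start = SYNC_START.search(line)
        if start:
            open_blocks[start.group("tag")] = (start.group("filename"), lineno)
            continue
        end = SYNC_END.search(line)
        if end and end.group("tag") in open_blocks:
            filename, first = open_blocks.pop(end.group("tag"))
            blocks[end.group("tag")] = (filename, first, lineno)
    return blocks


def unsynced_tags(text, changed_lines, changed_tags_by_file):
    """Return (tag, target) pairs whose block changed here but not in target.

    changed_lines: set of line numbers this commit touched in this file.
    changed_tags_by_file: {filename: set of tags changed in that file}.
    """
    problems = []
    for tag, (target, first, last) in sorted(find_sync_blocks(text).items()):
        touched = any(first <= n <= last for n in changed_lines)
        if touched and tag not in changed_tags_by_file.get(target, set()):
            problems.append((tag, target))
    return problems
```

In this sketch, the caller is expected to work out which lines a commit touched (for example from the diff) and pass them in; that part is left out.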

Frontend best practices

Khan Academy has been around for several years now and JavaScript development has changed a lot over those years. Where we used to use Handlebars and LESS files for defining our client-side views, we now use React with Aphrodite. At this point, if someone creates a new Handlebars or LESS file in our repository, they’re working against the direction we’re pushing for our frontend.

So, we have a linter that makes sure that no new files of those types are created. As much as possible, when we introduce a new lint rule, we fix up all of the lint discovered by the rule. Rewriting a Handlebars template as React components is a non-trivial change, though, so this linter comes complete with a whitelist of the pre-existing files we haven’t yet managed to get rid of.
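A linter like this can be very small. The following is a minimal sketch rather than our real code: the extension list, the whitelist entries, and the name lint_deprecated_files are all hypothetical.

```python
import os

# Hypothetical values for illustration only.
DEPRECATED_EXTENSIONS = {".handlebars", ".less"}
WHITELIST = {
    "templates/old_header.handlebars",
    "stylesheets/legacy.less",
}


def lint_deprecated_files(paths, whitelist=WHITELIST):
    """Yield an error for each deprecated-type file that is not whitelisted."""
    for path in paths:
        ext = os.path.splitext(path)[1].lower()
        if ext in DEPRECATED_EXTENSIONS and path not in whitelist:
            yield f"{path}: new {ext} files are not allowed"
```

Shrinking the whitelist as old files are rewritten means the rule gets stricter over time without anyone having to touch the linter itself.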

You can lint images, too

One of our engineers, Colin Fuller, noticed that some of the images on our team page looked off. The page is a grid of photos cropped to the same size, but some of the photos looked stretched. During our February Healthy Hackathon, Colin wrote a linter that double-checks that every team photo is the same size. The linter is only about 50 lines of Python, comments included.
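To give a feel for how little code this takes, here is a minimal sketch of such a check. It is not Colin’s linter: it assumes the photos are PNG files and reads the dimensions straight out of the file header using only the standard library. The names png_size and lint_photo_sizes and the expected size are hypothetical, and a real linter would more likely use an image library that handles other formats.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data):
    """Return (width, height) read from the IHDR chunk of PNG file contents."""
    # Layout: 8-byte signature, 4-byte chunk length, b"IHDR", then
    # width and height as big-endian unsigned 32-bit integers.
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    return struct.unpack(">II", data[16:24])


def lint_photo_sizes(photos, expected_size):
    """photos: {path: file bytes}. Return errors for photos of the wrong size."""
    errors = []
    for path, data in sorted(photos.items()):
        width, height = png_size(data)
        if (width, height) != expected_size:
            errors.append(
                f"{path}: is {width}x{height}, "
                f"expected {expected_size[0]}x{expected_size[1]}"
            )
    return errors
```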

Avoiding tests that accidentally overwrite others

Have you ever copy/pasted an existing unit test to create a new test case for a different variation? Sometimes, all you want to do is call the function under test with different arguments, so copy/paste is the easy solution.

Imagine you have a piece of a test class that looks like this, after a copy/paste:

def test_foo1(self):
    self.assertEqual(1, foo("one"))

def test_foo1(self):
    self.assertEqual(2, foo("two"))

It’s pretty easy to forget to change the test method name and not notice that the original test will no longer be called. We have a linter that watches for this.

Avoiding problems with third-party libraries

Third-party libraries don’t always work the way we’d want them to. Many times, the right answer is to put up a pull request for the library. But what happens if the behavior of the library is by design?

We’ve got a case of that in our codebase. We use persistgraphql to extract GraphQL queries from our client-side code so that we can allow only specific queries to run. The problem we ran into is that persistgraphql reformats the queries in a way that works fine for their main use case, but could make the queries not match up with what our server-side code expects. Our solution is a linter that guarantees that the query in the JavaScript code will exactly match the query expected by the server.

You can import this but you can’t import that

As part of our Great Python Refactoring, we instituted new rules about which code was allowed to import which other code, to help avoid future similar tangles. We could impose restrictions via Python import hooks, but we wouldn’t find out about those problems until runtime. Our components_lint.py linter ensures at commit time that we aren’t breaking the rules.

Of course, the Great Python Refactoring didn’t take care of every bothersome import, but the linter has a simple whitelist that we’ll whittle down as we continue to clean things up.
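To show the general shape of a commit-time import check, here is a minimal sketch built on Python’s ast module. It is not components_lint.py: the package names, the rule table FORBIDDEN_IMPORTS, the whitelist entries, and the function names are all hypothetical, and it only considers absolute imports.

```python
import ast

# Hypothetical rules: top-level package -> packages it must not import from.
FORBIDDEN_IMPORTS = {
    "content": {"users", "coaches"},
}
# Hypothetical whitelist of (importing module, imported module) pairs.
WHITELIST = {("content.legacy", "users.models")}


def imported_modules(source):
    """Yield (module_name, lineno) for each absolute import in the source."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                yield alias.name, node.lineno
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            yield node.module, node.lineno


def lint_imports(module_name, source):
    """Return error messages for imports that break FORBIDDEN_IMPORTS."""
    forbidden = FORBIDDEN_IMPORTS.get(module_name.split(".")[0], set())
    errors = []
    imports = sorted(imported_modules(source), key=lambda pair: pair[1])
    for imported, lineno in imports:
        if imported.split(".")[0] not in forbidden:
            continue
        if (module_name, imported) in WHITELIST:
            continue
        errors.append(f"{module_name}:{lineno}: may not import {imported}")
    return errors
```

Because the check only parses source text and never executes it, it can run on every commit without importing any application code.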

Tip: Keep linters fast with regular expressions

Yes, yes, we all know you can’t parse HTML with a regular expression. When you need to be correct 100% of the time, you need to use a proper parser. But many times a regular expression will suffice and be much faster. While regular expression syntax can seem quite obscure (or become a maintenance nightmare), regexes can actually be easier to understand than code designed to traverse a parse tree.

As an added bonus, linters like our code_syncing_lint mentioned earlier can work on Python, JavaScript, and Kotlin files without needing three separate parsers and a whole lot more code.

But beware! It’s easy for a regex to not properly handle legitimate, real world code files, so just keep that tradeoff in mind. Linters are like any other code, though, so you can write unit tests to verify the expected cases as we have for ours.

Lint all the things!

Once you have the basic hooks in place to run linters automatically, you’ll doubtless find many ways in which you can prevent common sorts of bugs from creeping into your system. ESLint has rules to help you with frequent JavaScript mistakes, but I’d bet there are other potential pitfalls that are unique to your environment.

I hope this tour of some of our linters gives you ideas for some of your own.

Thanks to Ben Kraft, Craig Silverstein, and Amos Latteier for their suggestions of good example linters from our code, and Scott Grant for editing advice.