"Linting" source code to look for errors is nothing new (the original “lint” tool turned turned forty this year!), but most places I worked prior to Khan Academy didn’t use linters as extensively as we do here. So, I thought I’d share a bit about a few of our custom linters in hopes that others may invest a little time to prevent more bugs.
For more than just formatting
Staying in sync
Our code_syncing_lint.py linter is set up to handle this problem. It defines this comment format:
# sync-start:<tag> <filename>
# sync-end:<tag>
tag is a name you give to the block of code. The linter ensures that if a commit changes the block in one file, it also changes the block with the same tag name in filename.
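As a rough illustration of the idea, here is a minimal sketch of such a check. It is not the actual code_syncing_lint.py: the regular expressions, the function names `find_sync_blocks` and `check_sync`, and the assumption that the caller already knows which line numbers a commit touched are all hypothetical.

```python
import re

# Hypothetical patterns for the comment format described above.
SYNC_START = re.compile(r"#\s*sync-start:(\S+)\s+(\S+)")
SYNC_END = re.compile(r"#\s*sync-end:(\S+)")


def find_sync_blocks(text):
    """Map each tag to (target_filename, first_line, last_line), 1-indexed."""
    blocks = {}
    open_blocks = {}  # tag -> (target filename, line of the sync-start comment)
    for lineno, line in enumerate(text.splitlines(), start=1):
        start = SYNC_START.search(line)
        end = SYNC_END.search(line)
        if start:
            open_blocks[start.group(1)] = (start.group(2), lineno)
        elif end and end.group(1) in open_blocks:
            target, first = open_blocks.pop(end.group(1))
            blocks[end.group(1)] = (target, first, lineno)
    return blocks


def check_sync(contents, changed_lines):
    """Return error strings for blocks changed without their counterpart.

    contents: {filename: file text}
    changed_lines: {filename: set of line numbers touched by the commit}
    """
    errors = []
    for filename, text in contents.items():
        touched = changed_lines.get(filename, set())
        for tag, (target, first, last) in find_sync_blocks(text).items():
            if not any(first <= n <= last for n in touched):
                continue  # this block was not edited in the commit
            other = find_sync_blocks(contents.get(target, "")).get(tag)
            other_touched = changed_lines.get(target, set())
            if other is None or not any(
                other[1] <= n <= other[2] for n in other_touched
            ):
                errors.append(
                    f"{filename}: block '{tag}' changed, but not in {target}"
                )
    return errors
```

In a real linter the changed line numbers would come from the version control diff; passing them in as a dictionary keeps the sketch easy to test.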
Frontend best practices
So, we have a linter that makes sure that no new files of those types are created. As much as possible, when we introduce a new lint rule, we fix up all of the lint discovered by the rule. Rewriting a Handlebars template as React components is a non-trivial change, though, so this linter comes with a whitelist of the pre-existing files we haven’t yet managed to get rid of, which are allowed to break the rule.
You can lint images, too
One of our engineers, Colin Fuller, noticed that some of the images on our team page looked off. The page is a grid of photos cropped to the same size, but some of the photos looked stretched. During our February Healthy Hackathon, Colin wrote a linter that double-checks that every team photo is the same size. The linter is only about 50 lines of Python, comments included.
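To give a feel for how small such a check can be, here is a sketch that is not Colin’s linter. It assumes the photos are PNG files so that the dimensions can be read with only the standard library, and the names `png_size` and `lint_photo_sizes`, along with the expected size passed in by the caller, are made up for the example.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data):
    """Return (width, height) read from the IHDR chunk of PNG bytes."""
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are big-endian 32-bit integers right after "IHDR".
    return struct.unpack(">II", data[16:24])


def lint_photo_sizes(photos, expected):
    """Return error strings for photos whose size differs from `expected`.

    photos: {filename: raw PNG bytes}
    expected: (width, height) every photo should have (an assumed input)
    """
    errors = []
    for name, data in sorted(photos.items()):
        width, height = png_size(data)
        if (width, height) != expected:
            errors.append(
                f"{name}: {width}x{height}, "
                f"expected {expected[0]}x{expected[1]}"
            )
    return errors
```

A version for JPEG or other formats would need an image library to read the dimensions, but the comparison logic would stay the same.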
Avoiding tests that accidentally overwrite others
Have you ever copy/pasted an existing unit test to create a new test case for a different variation? Sometimes, all you want to do is call the function under test with different arguments, so copy/paste is the easy solution.
Imagine you have a piece of a test class that looks like this, after a copy/paste:
    def test_foo1(self):
        self.assertEqual(1, foo("one"))

    def test_foo1(self):
        self.assertEqual(2, foo("two"))
It’s pretty easy to forget to change the test method name and not notice that the original test will no longer be called. We have a linter that watches for this.
Avoiding problems with third-party libraries
Third-party libraries don’t always work the way we’d want them to. Many times, the right answer is to put up a pull request for the library. But what happens if the behavior of the library is by design?
You can import this but you can’t import that
As part of our Great Python Refactoring, we instituted new rules about which code was allowed to import which other code, to help avoid future similar tangles. We could impose restrictions via Python import hooks, but we wouldn’t find out about those problems until runtime. Our components_lint.py linter ensures at commit time that we aren’t breaking the rules.
Of course, the Great Python Refactoring didn’t take care of every bothersome import, but the linter has a simple whitelist that we’ll whittle down as we continue to clean things up.
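A check of this kind might look roughly like the sketch below. It is not components_lint.py: the package names, the rule table `FORBIDDEN_IMPORTS`, the `WHITELIST` entries, and the function names are all hypothetical, chosen only to show the shape of an import rule with a shrinking whitelist.

```python
import ast

# Hypothetical rule: code in the key package may not import these packages.
FORBIDDEN_IMPORTS = {"shared": {"webapp"}}

# Hypothetical whitelist of (file, imported module) pairs that predate the
# rule and have not been cleaned up yet.
WHITELIST = {("shared/legacy.py", "webapp.users")}


def imported_modules(source):
    """Return (module_name, lineno) for every absolute import in source."""
    modules = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.extend((alias.name, node.lineno) for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules.append((node.module, node.lineno))
    return modules


def lint_imports(path, source):
    """Return error strings for imports that break FORBIDDEN_IMPORTS."""
    package = path.split("/")[0]
    banned = FORBIDDEN_IMPORTS.get(package, set())
    errors = []
    for module, lineno in imported_modules(source):
        if module.split(".")[0] in banned and (path, module) not in WHITELIST:
            errors.append(f"{path}:{lineno}: {package} may not import {module}")
    return errors
```

Because the whitelist is plain data, removing an entry once the offending import is gone is a one-line change, and the linter then guards against it coming back.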
Tip: Keep linters fast with regular expressions
Yes, yes, we all know you can’t parse HTML with a regular expression. When you need to be correct 100% of the time, you need to use a proper parser. But many times a regular expression will suffice and be much faster. While regular expression syntax can seem quite obscure (or become a maintenance nightmare), regexes can actually be easier to understand than code designed to traverse a parse tree.
As an added bonus, linters like our
But beware! It’s easy for a regex to mishandle legitimate, real-world code files, so keep that tradeoff in mind. Linters are like any other code, though, so you can write unit tests to verify the expected cases, as we have for ours.
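Here is a small sketch of what a regex-based linter and its tests can look like. The rule itself (flagging a bare `except:`) and the names `BARE_EXCEPT` and `lint_bare_except` are invented for illustration and are not one of our linters. The last test pins down a known false positive, which is exactly the kind of tradeoff a regex brings.

```python
import re
import unittest

# Matches a line that consists of a bare "except:" clause.
BARE_EXCEPT = re.compile(r"^\s*except\s*:")


def lint_bare_except(source):
    """Return the 1-indexed line numbers that contain a bare except."""
    return [
        lineno
        for lineno, line in enumerate(source.splitlines(), start=1)
        if BARE_EXCEPT.match(line)
    ]


class BareExceptLintTest(unittest.TestCase):
    def test_flags_bare_except(self):
        source = "try:\n    f()\nexcept:\n    pass\n"
        self.assertEqual([3], lint_bare_except(source))

    def test_allows_typed_except(self):
        source = "try:\n    f()\nexcept ValueError:\n    pass\n"
        self.assertEqual([], lint_bare_except(source))

    def test_known_false_positive_in_string(self):
        # A regex cannot tell a multi-line string from real code, so this
        # line is flagged even though it is only text inside a string.
        source = 's = """\nexcept:\n"""\n'
        self.assertEqual([2], lint_bare_except(source))
```

Running the file with `python -m unittest` exercises all three cases. A parser-based linter would avoid the false positive, at the cost of more code and more time per file.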
Lint all the things!
I hope this tour of some of our linters gives you ideas for some of your own.
Thanks to Ben Kraft, Craig Silverstein, and Amos Latteier for their suggestions of good example linters from our code, and Scott Grant for editing advice.