The Changelog

Open Source moves fast. Keep up.

Side-by-side highlighted command line diffs #

Jeff Kaufman’s icdiff takes advantage of your terminal’s ability to display colors to show you the differences between similar files without getting in the way.


It’s not meant to replace the built-in diff command, but complement it.

Prophet: a Python microframework for financial markets #

Prophet strives to let the programmer focus on modeling financial strategies, portfolio management, and analyzing backtests. It achieves this by having few functions to learn to hit the ground running, yet being flexible enough to accommodate sophistication.

Looks great for anybody dipping their toe into financial market software. Nice double entendre, too.

CloudTunes: your web-based music player for the cloud #

Great idea and execution from Jakub Roztočil:

CloudTunes provides a unified interface for music stored in the cloud (YouTube, Dropbox, etc.) and integrates with Facebook and Musicbrainz for metadata, discovery, and social experience. It is similar to services like Spotify, except instead of local tracks and the fixed Spotify catalog, CloudTunes uses your files stored in Dropbox and music videos on YouTube.


Collect your thoughts and notes without leaving the command line #

jrnl is a great little text-based journaling tool with a command line interface. Why plain text files? I love this tidbit from the readme:

you can put them into a Dropbox folder for instant syncing and you can be assured that your journal will still be readable in 2050, when all your fancy iPad journal applications will long be forgotten.

At first blush, the interface looks really well thought out. I don’t journal much, but jrnl just might get me started.

docopt gets CLI argument parsing right

Brilliant ideas can be painfully obvious in retrospect. They’ll leave you thinking, “Why didn’t I think of that before?!” Docopt is that for parsing CLI arguments.

Clean your HTML with Bleach #

When developing for the web, a time will come when you’ll need to sanitize HTML. If you need to do this in Python, then you should check out Bleach.

Bleach is an HTML sanitizing library that escapes or strips markup and attributes based on a white list. Bleach can also linkify text safely, applying filters that Django’s urlize filter cannot, and optionally setting rel attributes, even on links already in the text.

Even if all you want to do is apply rel='nofollow' to the links in user generated content, Bleach has you covered. So, check it out the next time you need to clean some HTML.
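
To make the whitelist idea concrete, here is a minimal sketch built only on the standard library's html.parser. It is not Bleach's implementation or API: the names ALLOWED_TAGS, ALLOWED_ATTRS, SAFE_SCHEMES, WhitelistSanitizer, and sanitize are all made up for illustration, and a real sanitizer handles many more edge cases.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist, chosen for this example only.
ALLOWED_TAGS = {"a", "b", "em", "i", "strong"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_SCHEMES = ("http://", "https://", "mailto:")


class WhitelistSanitizer(HTMLParser):
    """Escape tags that are not whitelisted and drop unlisted attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            # Show the tag as text instead of letting the browser interpret it.
            self.out.append(escape(self.get_starttag_text()))
            return
        rendered = ""
        for name, value in attrs:
            value = value or ""
            if name not in ALLOWED_ATTRS.get(tag, set()):
                continue
            if name == "href" and not value.startswith(SAFE_SCHEMES):
                continue  # drop javascript: and other unexpected schemes
            rendered += ' {}="{}"'.format(name, escape(value))
        if tag == "a":
            rendered += ' rel="nofollow"'
        self.out.append("<{}{}>".format(tag, rendered))

    def handle_endtag(self, tag):
        closing = "</{}>".format(tag)
        self.out.append(closing if tag in ALLOWED_TAGS else escape(closing))

    def handle_data(self, data):
        self.out.append(escape(data))


def sanitize(html):
    """Return html with only whitelisted tags and attributes left active."""
    parser = WhitelistSanitizer()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(sanitize('<a href="http://example.com" onclick="evil()">x</a>'))
```

The design choice worth noticing is that unknown markup is escaped rather than silently removed, so readers still see what the user typed. Stripping instead would be a small change in handle_starttag and handle_endtag.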

ngxtop: real-time metrics for nginx server #

ngxtop is shaping up to be one of those tools that I didn’t even know I needed, but now I don’t know how I ever lived without it.

ngxtop parses your nginx access log and outputs useful, top-like, metrics of your nginx server.

Need we say more? Check the readme for some nice examples of what this Python script is capable of.
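
If you are curious what "parse the access log and summarize it" boils down to, here is a rough standard-library sketch. It assumes nginx's default "combined" log format. LINE_RE, summarize, and SAMPLE are names invented for this example, not anything from ngxtop itself.

```python
import re
from collections import Counter

# Matches the leading fields of nginx's "combined" log format.
LINE_RE = re.compile(
    r'(?P<remote_addr>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<bytes_sent>\d+)'
)


def summarize(lines, top=3):
    """Count requests per path and per status class (2xx, 4xx, ...)."""
    paths = Counter()
    statuses = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not look like access-log entries
        paths[match.group("path")] += 1
        statuses[match.group("status")[0] + "xx"] += 1
    return paths.most_common(top), dict(statuses)


# Made-up sample entries, for illustration only.
SAMPLE = [
    '127.0.0.1 - - [10/Oct/2013:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "curl/7.30.0"',
    '10.0.0.2 - - [10/Oct/2013:13:55:37 +0000] "GET /missing HTTP/1.1" 404 162 "-" "curl/7.30.0"',
    '127.0.0.1 - - [10/Oct/2013:13:55:38 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "curl/7.30.0"',
]

top_paths, status_classes = summarize(SAMPLE)
print(top_paths, status_classes)
```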

Easily Build Mac OS X Status Bar Apps With Python #

From time to time, the thought has occurred to me that it would be cool if I could build simple native apps with Python. So, I was excited when I found rumps.

Ridiculously Uncomplicated Mac os x Python Statusbar apps

You can’t make full-blown apps, but if you’ve ever had a status bar app idea, you can use rumps to build it.

Bunch lets you use a Python dict like it’s an Object #

Sometimes, in Python, I wish I could access dicts as if they were objects. Bunch makes it easy to do that.

A Bunch is a Python dictionary that provides attribute-style access (a la JavaScript objects).

Bunch acts like an object and a dict.

>>> from bunch import Bunch
>>> b = Bunch()
>>> b.hello = 'world'
>>> b.hello
'world'
>>> b['hello'] += "!"
>>> b.hello
'world!'

And it even plays nice with serialization.

>>> b = Bunch(foo=Bunch(lol=True), hello=42, ponies='are pretty!')
>>> import json
>>> json.dumps(b)
'{"ponies": "are pretty!", "foo": {"lol": true}, "hello": 42}'

This approach isn’t for everything, but if you want a dict that acts like an object, check out Bunch.
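
For the curious, attribute-style access on a dict takes surprisingly little code. The sketch below is a stand-in written for this post, not Bunch's source: AttrDict is a hypothetical name, and it skips conveniences a real library would add, such as converting nested dicts.

```python
import json


class AttrDict(dict):
    """A dict whose keys can also be read and written as attributes."""

    def __getattr__(self, name):
        # Only called when normal lookup fails, so dict methods such as
        # .keys() and .items() keep working.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self[name] = value

    def __delattr__(self, name):
        try:
            del self[name]
        except KeyError:
            raise AttributeError(name)


b = AttrDict()
b.hello = "world"
b["hello"] += "!"
print(b.hello)        # world!
print(json.dumps(b))  # still a plain dict as far as json is concerned
```

Because the class subclasses dict, anything that accepts a dict (json.dumps included) accepts it unchanged.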

Dominate HTML in Python #

Have you ever wished that you had a sweet little API to generate HTML in Python? Dominate is probably what you are looking for.

Dominate is a Python library for creating and manipulating HTML documents using an elegant DOM API.

Now, I’m a self-admitted HTML purist, but look at how the dominate API works.

from dominate.tags import ul, li

# Named "menu" rather than "list" to avoid shadowing the built-in.
menu = ul()
for item in range(4):
    menu += li('Item #', item)

If done correctly, HTML generators can blend in with your code nicely.

Check out Dominate the next time you’re looking for a nice native HTML generator API for Python.

Can you use Python 3? #

Good question. It’s a long road to Python 3, but it’s a little easier to navigate now with the release of caniusepython3.

This script takes in a set of dependencies and then figures out which of them are holding you up from porting to Python 3.

It’s a simple script which makes it just a little easier to use Python 3.

The output of the script will tell you how many (implicit) dependencies you need to transition to Python 3 in order to allow you to make the same transition. It will also list what projects have no explicit dependency blocking their transition so you can ask them to consider starting a port to Python 3.

Want to run SQL on a CSV file? #

Now you can with q, a Python lib.

q allows performing SQL-like statements on tabular text data.

It seems this idea isn’t restricted to Python either. TextQL is a project written in Go that promises to do roughly the same thing.
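
If you only need this occasionally, the standard library gets you part of the way there. The sketch below loads CSV text into an in-memory SQLite table and runs a query against it. It is not how q or TextQL work internally, and query_csv, the table name data, and the sample rows are all invented for the example.

```python
import csv
import io
import sqlite3


def query_csv(csv_text, sql, table="data"):
    """Load CSV text (first row is the header) into SQLite and run sql on it."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]

    # Quote identifiers so headers with spaces or keywords still work.
    columns = ", ".join('"{}"'.format(name.replace('"', '""')) for name in header)
    placeholders = ", ".join("?" for _ in header)

    conn = sqlite3.connect(":memory:")
    try:
        conn.execute('CREATE TABLE "{}" ({})'.format(table, columns))
        conn.executemany(
            'INSERT INTO "{}" VALUES ({})'.format(table, placeholders), body
        )
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


SAMPLE_CSV = "name,ext,size\na.csv,csv,10\nb.txt,txt,20\nc.csv,csv,30\n"

print(query_csv(SAMPLE_CSV, "SELECT ext, COUNT(*) FROM data GROUP BY ext ORDER BY ext"))
```

Every column is loaded as text here, so numeric comparisons would need an explicit CAST in the query.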

You always need another Python task queue #

I kid; diversity is the key to a healthy ecosystem. Huey is a simple offline Python task queue that has relatively few dependencies.

a lightweight alternative: written in python, no deps outside the standard lib except Redis (or you can roll your own backend), and support for Django.

Sometimes a little goes a long way. Check out Huey if you need a lightweight Python task queue. If you need more features, I would recommend RQ or Celery.

Generate 4 language bindings for your API in one Go #

You just built an API and want to make sure everyone can use it. Building libraries in every language isn’t only going to be hard, it’s going to take a lot of time. Time you don’t have. This is where Alpaca can help.

You define your API according to the format, alpaca builds the API libraries along with their documentation. All you have to do is publish them to their respective package managers.

Right now it can generate API clients in PHP, Python, Ruby, and JavaScript. You can see examples of the generated client libraries here. I can’t speak to the quality of all the generated language bindings, but I took a cursory look at the Python lib and it looks good. Looks like Alpaca could save us all a lot of time.

Show a progress bar for long running loops with tqdm #

I can’t tell you how many times I’ve kicked off a long-running process only to kill it and add in a progress indicator. I probably should have come up with something standard a while ago, but now I don’t have to. tqdm offers a ready-made solution.

Instantly make your loops show a progress meter – just wrap any iterator with tqdm(iterator), and you’re done!

Can’t say much more about it, but if you have had this problem in the past you might want to check out tqdm.
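
The "wrap any iterator" trick is easy to picture as a generator that passes items through while printing a counter. Here is a bare-bones sketch of that idea. The progress function and its parameters are hypothetical and far simpler than what tqdm actually does.

```python
import sys
import time


def progress(iterable, total=None, out=sys.stderr):
    """Yield items unchanged while writing a running counter to out."""
    if total is None:
        try:
            total = len(iterable)
        except TypeError:
            total = None  # generators and other unsized iterables
    start = time.time()
    for count, item in enumerate(iterable, 1):
        yield item
        elapsed = time.time() - start
        if total:
            out.write("\r{}/{} ({:.0%}) {:.1f}s".format(
                count, total, count / total, elapsed))
        else:
            out.write("\r{} items {:.1f}s".format(count, elapsed))
        out.flush()
    out.write("\n")


# Usage: the loop body does not change, only the iterable is wrapped.
squares = [n * n for n in progress(range(5))]
```

The carriage return ("\r") rewrites the same terminal line on each update, which is what makes it look like a meter instead of a scrolling log.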

Speed up AWS S3 by 2000x with this transparent proxy #

Amazon S3 works pretty well, is cheap, and is not too slow. It is employed as a blob store by so many companies that it’s practically the de facto solution. So, if you could speed up S3 I am sure it would have a pretty big impact. That is exactly what MimicDB is trying to do.

By maintaining a transactional record of every API call to S3, MimicDB provides a local, isometric key-value store of data on S3. MimicDB stores everything except the contents of objects locally. Tasks like listing, searching and calculating storage usage on massive amounts of data are now fast and free.

The readme says that, on average, tasks like those are 2000x faster using MimicDB. It also reduces the number of API calls to S3, thus reducing the cost. If you use S3 heavily, MimicDB looks like it could be an interesting addition to your stack.
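
The core idea, as I read it, is to record metadata locally on every write so that listing and sizing never touch the remote store. The toy below illustrates that pattern with plain dicts. MirroredBucket and its methods are invented for this sketch, the remote dict merely stands in for S3, and none of it reflects MimicDB's real API or storage backend.

```python
class MirroredBucket:
    """Write through to a remote store while keeping metadata locally."""

    def __init__(self, remote):
        self.remote = remote  # any dict-like object standing in for S3
        self.index = {}       # key -> size in bytes; object contents stay remote

    def put(self, key, data):
        self.remote[key] = data
        self.index[key] = len(data)

    def delete(self, key):
        del self.remote[key]
        self.index.pop(key, None)

    def list(self, prefix=""):
        # Answered entirely from the local index: no remote call needed.
        return sorted(key for key in self.index if key.startswith(prefix))

    def total_size(self):
        return sum(self.index.values())


bucket = MirroredBucket(remote={})
bucket.put("logs/a.txt", b"hello")
bucket.put("logs/b.txt", b"hi")
bucket.put("img/c.png", b"\x89PNG")
print(bucket.list("logs/"), bucket.total_size())
```

The catch with any mirror like this is that writes made outside the wrapper are invisible to the index, so it only stays accurate if every call goes through it.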

Build newsfeeds with Feedly (not the RSS reader) #

Feedly (no, not that Feedly) is a Python lib that provides a high-level abstraction for building news feeds.

Feedly is a Python library, which allows you to build newsfeed and notification systems using Cassandra and/or Redis.

If you are building a social stream, at some point a query like SELECT * FROM updates WHERE user_id IN (people user follows) ORDER BY id DESC stops working. At that point you need to build something a little more advanced. Feedly gives you those tools.
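
One common alternative to that query is to copy each new activity into every follower's feed when it is published ("fan-out on write"), so reading a feed becomes a cheap lookup. The in-memory sketch below illustrates the pattern only. FanOutFeeds is a hypothetical class, and I am assuming this general approach rather than describing Feedly's actual API.

```python
from collections import defaultdict, deque


class FanOutFeeds:
    """Push each activity to followers' feeds at publish time."""

    def __init__(self, max_length=100):
        self.followers = defaultdict(set)
        # Each feed is capped; the oldest entries fall off the right end.
        self.feeds = defaultdict(lambda: deque(maxlen=max_length))

    def follow(self, follower, followee):
        self.followers[followee].add(follower)

    def publish(self, author, activity):
        for follower in self.followers[author]:
            self.feeds[follower].appendleft(activity)  # newest first

    def feed(self, user, limit=10):
        return list(self.feeds[user])[:limit]


feeds = FanOutFeeds()
feeds.follow("bob", "alice")
feeds.publish("alice", "post 1")
feeds.publish("alice", "post 2")
print(feeds.feed("bob"))  # ['post 2', 'post 1']
```

The trade-off is more work and storage at write time in exchange for fast reads, which is why a production system would do the fan-out in background workers against a real datastore.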

Thumbor is a self hosted thumbnail-as-a-service #

Thumbor is pretty impressive. Not only does it take something like thumbnailing, which is always painful, and make it easy, it also has cool image operations out of the box.

It also features a VERY smart detection of important points in the image for better cropping and resizing, using state-of-the-art face and feature detection algorithms

It even sports an easy-to-use, URL-based API.


If you’re like me and think thumbnailing is a pain, check out Thumbor.

Newspaper delivers Instapaper style article extraction #

Newspaper lets anyone do article extraction like Instapaper and Pocket.

Newspaper is a Python 2 library for extracting & curating articles from the web.
It wants to change the way people handle article extraction with a new, more precise layer of abstraction.

Besides “read later” services, there’s a growing number of APIs that provide article extraction as a service, such as diffbot. Those services are great, but it’s nice that newspaper is open source and hackable.

For instance, when I first checked out newspaper it only had plain text article extraction. Sometimes, though, I want the original markup of the article with some sanitization. It helps to have the paragraphs, links, and headers accurately represent the article. So, I forked the project, made some changes, and the maintainer codelucas was responsive and worked with me to get my changes merged in.

If you want a place to start working on article extraction, Newspaper looks like a good bet.