Streaming JSON with Flask

I have a SQLAlchemy query (which can behave as an iterator) that could return a large result set. The first version of the code was very simple. Release objects have a to_dict() method which returns a dictionary, so I append each one to a list and jsonify the result:

# releases = <SQLAlchemy query object>

output = []
for r in releases:
    output.append(r.to_dict())

return jsonify(releases=output), 200

(context on github)

This result set could potentially grow to the point where fitting it in memory would be impractical – with only a thousand releases there is already a significant lag before we start getting results.

Unfortunately, Flask’s jsonify() function doesn’t support streaming, so we have to do it manually as described in the Flask documentation. I thus came up with a simple generator like so:

# query = <something>

def generate():
    yield '{"releases": ['
    for release in query:
        yield json.dumps(release.to_dict()) + ', '
    yield ']}'

return Response(generate(), content_type='application/json')

The problem is that trying to json.loads() the output of this results in “ValueError: No JSON object could be decoded”, because the last element in the list is followed by a trailing comma, which JSON does not allow. No .join() for us!
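To make the failure concrete, here is a minimal sketch that reproduces it without Flask or a database. The name naive_stream and the sample dictionaries are made up for illustration; they stand in for the generator above and for the output of to_dict().

```python
import json


def naive_stream(items):
    """Mimic the first generator: every element is followed by a comma."""
    yield '{"releases": ['
    for item in items:
        yield json.dumps(item) + ', '
    yield ']}'


# Join the chunks the way a client would see them after receiving the response.
body = ''.join(naive_stream([{"id": 1}, {"id": 2}]))

try:
    json.loads(body)
    parsed_ok = True
except ValueError:
    # On Python 3 the concrete exception is json.JSONDecodeError,
    # which is a subclass of ValueError.
    parsed_ok = False
```

The comma after the last element is what makes the parser give up.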

Thus we need to detect the last iteration, and omit the comma.

How does one do this? I found a handy answer on stackoverflow, which describes using what is called a “lagging generator”. On each yield we return the previous iteration, which allows us to look ahead.
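In isolation, the lagging idea might look something like the sketch below. The helper name lagged and the (item, is_last) pair it yields are my own choices for illustration, not taken from the stackoverflow answer.

```python
def lagged(iterable):
    """Yield (item, is_last) pairs by staying one step behind the source."""
    it = iter(iterable)
    try:
        prev = next(it)
    except StopIteration:
        return  # empty input: nothing to yield
    for current in it:
        # A newer item exists, so the previous one cannot be the last.
        yield prev, False
        prev = current
    # The source is exhausted, so whatever we are holding is the last item.
    yield prev, True


pairs = list(lagged("abc"))
```

Because the generator only ever holds one item back, it works on any iterator without needing to know its length in advance.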

So I modified the generator, and came up with the following:

def generate():
    releases = iter(query)
    prev_release = next(releases)  # get first result

    yield '{"releases": ['

    # Iterate over the releases
    for release in releases:
        yield json.dumps(prev_release.to_dict()) + ', '
        prev_release = release

    # Now yield the last iteration without comma but with the closing brackets
    yield json.dumps(prev_release.to_dict()) + ']}'

Now we can detect the last iteration and omit the comma, substituting the closing brackets instead.

There’s just one problem. When the length of the query result is zero (a reasonable situation), the first next(releases) call will raise StopIteration before we’ve outputted any JSON. Code that expects a valid JSON document will thus fail.

The solution is therefore to catch the first StopIteration, yield a valid “empty” JSON result set, and end the generator there. Note that on Python 3.7 and later, raising StopIteration inside a generator is converted into a RuntimeError (PEP 479), so a plain return is the safe way to stop. The final solution is thus:

def generate():
    """
    A lagging generator to stream JSON so we don't have to hold everything in memory

    This is a little tricky, as we need to omit the last comma to make valid JSON,
    thus we use a lagging generator, similar to http://stackoverflow.com/questions/1630320/
    """
    releases = iter(query)
    try:
        prev_release = next(releases)  # get first result
    except StopIteration:
        # StopIteration here means the length was zero, so yield a valid releases doc and stop
        yield '{"releases": []}'
        return

    # We have some releases. First, yield the opening json
    yield '{"releases": ['

    # Iterate over the releases
    for release in releases:
        yield json.dumps(prev_release.to_dict()) + ', '
        prev_release = release

    # Now yield the last iteration without comma but with the closing brackets
    yield json.dumps(prev_release.to_dict()) + ']}'

return Response(generate(), content_type='application/json')

(github link)
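For comparison, here is a sketch of a different technique from the lagging generator: emit the separator before every element except the first, so no look-ahead is needed and the empty case falls out naturally. The function name stream_json_list and its parameters are hypothetical, and it takes plain dictionaries rather than release objects.

```python
import json


def stream_json_list(key, dicts):
    """Stream {"<key>": [...]} chunk by chunk, with commas between elements."""
    yield '{' + json.dumps(key) + ': ['
    first = True
    for d in dicts:
        if not first:
            # Separator goes in front of every element except the first.
            yield ', '
        first = False
        yield json.dumps(d)
    yield ']}'


empty_doc = ''.join(stream_json_list("releases", []))
full_doc = ''.join(stream_json_list("releases", iter([{"id": 1}, {"id": 2}])))
```

In the Flask view this could be passed to Response() in the same way as generate(), for example with (r.to_dict() for r in query) as the second argument. The trade-off is one extra small chunk per element.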

Archival Storage Part 1: The Problems

All of us have data which has value beyond our own lives. My parents’ generation have little record of their childhoods, other than the occasional photo album, but what little records there are, are cherished. My own childhood was well preserved, thanks to the efforts of my mother. Each of my brothers and I has a stack of photo albums, with dates and milestones meticulously documented.

Today, we are generating a massive amount of data. While the majority of it will not be of interest to future generations, I believe preserving a small, selective record of it, akin to the photo albums my mother created, would be immensely valuable to my relatives and descendants – think of your great-grandparents’ jewellery, a photo album of your childhood that your parents created, or the immigration papers of your ancestors.

Modern technology allows us to document our lives in vivid detail, but the data is transient by nature. For example, this blog is run on a Linode server – if I die, the bill doesn’t get paid and Linode deletes it. If Linode goes away, I have to be there to move it to a new server. If Flickr goes away, my online photos are lost. If Facebook goes away, all that history is lost. Laptops and computers are replaced regularly, and the backups created by previous computers may not be readable by future ones, unless we carry over all the data each time.

In part one of this series (this article) I document the problems of common backup solutions for archival storage, with reference to my own set-up. In part two, I’ll detail my “internet research” into optical BD-R media and how it solves these problems, and in part three I’ll deal with checksums and managing data for archival (links will be added when done).

Part 1 is fairly technical, so if you just want safe long-term storage, install and configure Crashplan, and skip to part 2.

Kiwis, London, and expectations

Recently there’s been a conversation in the expat community about Kiwis making the move to London. Alex Hazlehurst’s article, which set out to dispel the myth that finding a job in London is easy for Kiwis, attracted a fair bit of commentary (she also has a nicely designed blog here). Some of it was nice, some not so nice, and one reply was well written but somewhat condescending.

This conversation is not about people coming for an extended holiday. It is not about coming to London on the two-year visa, with nothing but travel plans and maybe a bit of bar or temp work here and there. It’s about young Kiwis moving to London to start or continue their careers, as I and many others have done.

Ubuntu Home Server 14.04

I had grand intentions.

This home server article was to be a detailed masterpiece, a complete documentation of my home server setup.

It hasn’t turned out that way, and many pieces are missing. It turns out that writing a detailed article on setting up a server is much harder than just doing it! So what you see here is what I finally managed to publish, 5 months after actually building it. I hope you find it useful, and I don’t rule out the possibility that I may update parts of it in future.

Canon EOS – From 40D to 70D

It was time to upgrade. The 40D has been a trooper, but it hasn’t seen much use recently. Whether it’s the inconvenience of its compact flash memory cards, or just sheer size and weight, I have seldom felt the need to travel with it.

The 40D was released 8 years ago in 2007, which is a very long time in technology. I bought mine at the start of 2009, just after the release of the 50D, which carried a 30% price premium over the older model. I’ve never regretted my decision to go with the 40D, and I’ve had over 30,000 shutter actuations out of it, most of those with my favourite lens – a Canon EF-S 10-22mm f/3.5-4.5 USM. Nowadays, you can get a much cheaper 10-18mm lens which is slower (f/4.5-5.6), but has an STM motor and image stabilisation, making it much better for video. If you don’t already have an ultra-wide angle lens, seriously, get this one.

In the past year though, I’ve probably shot fewer than 100 frames with the 40D, favouring a much smaller Panasonic GX1 with a Samyang 7.5mm fish-eye and the excellent 20mm f/1.7 pancake. These lenses are as sharp as they are useful, but when using the little GX1 I don’t really feel like I’m “doing photography”. The experience is that of using a point & shoot, but the pictures I can take with it are almost as good as the 40D (better in low light with the f/1.7 pancake actually). Lugging the SLR doesn’t really make sense.

Looking through some of my old photos recently made me realise how much I miss the Canon 10-22mm lens. The fish-eye is fun, and even has a wider field of view, but images from rectilinear lenses afford more creative flexibility in my opinion; the fish-eye look is distinctive, and not one you really want to characterise all your images.

Good news for Z3 Compact owners

There has been some rather good news for Z3 Compact (Z3C) owners over the past couple of weeks. Firstly, Cyanogenmod started releasing nightly CM12 builds for the Z3C. But more importantly, a root exploit was released.

The thing that galled me most about the Z3 was that unlocking the bootloader permanently erased DRM keys which are required for some functionality. Usually this functionality is superfluous (I never intend to purchase any protected content from the Sony store), but in the case of the Z3C, erasing the DRM keys makes the camera worse in low light.

Unfortunately, unlocking the bootloader is required to install firmware from sources other than Sony, which means I can only do what Sony officially sanctions, unless I want to sacrifice camera performance.

I don’t believe I should have to make that choice.

This exploit restores the balance, but upgrading to Cyanogenmod while retaining the DRM keys is a fairly lengthy process:

  • Downgrade to an older, exploitable firmware version (before October 2014) with Flashtool
  • Run the giefroot root exploit
  • Back up the TA (trim area) partition with Backup TA (this saves the DRM keys so you can truly revert to stock)
  • Flash the rom of your choice, safe in the knowledge that it will always be possible to revert to factory condition!

Note that the DRM keys (probably Sony’s camera app as well) can’t be used with Cyanogenmod, so the camera will still be theoretically inferior to the stock Sony firmware. But this does allow me to revert to factory condition, or stick with firmware derived from Sony’s if I am not happy with the trade-off. Previously, this wouldn’t have been possible!

The Z3 is a great piece of hardware (read my brief review here), but Sony’s software and hostile DRM have been sore points. Now, I can finally have the phone I wanted (not to mention paid for).

Ubuntu 14.04 – No USB keyboard after upgrading kernel

After upgrading my Ubuntu 14.04 LTS install from Linux kernel 3.13 to 3.16, USB input devices, particularly my keyboard, stopped working.

On rebooting to an older kernel, the keyboard worked again. The reason for this is that the base kernel package doesn’t include the usbhid module, which is required for USB input devices.

The solution is to install the linux-image-extra package for your kernel. In my case it was:

sudo apt-get install linux-image-extra-3.16.0-28-generic

You can either do this via ssh, or boot to an older working kernel first.

Afterwards, you should be able to run sudo modprobe usbhid, or simply reboot, and your USB input devices should function correctly.

Sony’s “cancellation” of The Interview’s cinematic release is a shrewd move

Yesterday, Sony Pictures Entertainment (SPE) announced that it has cancelled the theatrical release of Seth Rogen’s “The Interview”, in the wake of terrorist threats.

In light of the decision by the majority of our exhibitors not to show the film The Interview, we have decided not to move forward with the planned December 25 theatrical release. We respect and understand our partners’ decision and, of course, completely share their paramount interest in the safety of employees and theater-goers.

Apparently, one of the demands of the GoP hackers that breached SPE was that Sony should not release this film. I’m not making this shit up.

According to the USA’s own Department of Homeland Security, the threat is, unsurprisingly, not credible. Sony, therefore, has no reason to cancel the theatrical release. Other than… publicity.

The trailer looks bloody awful, and if I were North Korea I wouldn’t take offense at all. Really, it should damage Seth Rogen’s reputation more than Kim Jong-un’s.

Cancelling it, then, is exactly the right thing to do. Except that it really hasn’t been “cancelled”, as the release will eventually be “re-evaluated” once there is “no longer any threat to innocent lives”. Of course, no one would ever want to see a “highly controversial” film which “incited terrorist threats” and “offended an entire nation”.

Talk about making lemonade.