But it made me wonder if you could send an actual QR code over sound. We can generate the image using a QR code library, then scan through it row by row and inverse Fourier transform each row into a stretch of waveform. At the receiving end we Fourier transform back and plot a spectrogram. The clever bit is that we can apply the existing QR-scanning algorithms to that image, and they do all the alignment and error checking for us. Hopefully.
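To make the round trip concrete, here is a minimal sketch of one image row going out as audio samples and coming back. It is not the demo's code: it uses a slow direct DFT rather than an FFT, and `BUFFER_SIZE`, `FIRST_BIN`, `rowToSamples` and `samplesToRow` are names and values made up for illustration.

```javascript
// Encode one image row as a buffer of audio samples, then recover it.
// Each pixel sets the amplitude of one tone that fits a whole number of
// cycles into the buffer, so it lands exactly on one DFT bin.
const BUFFER_SIZE = 256; // samples per image row (hypothetical value)
const FIRST_BIN = 20;    // bin used by the leftmost pixel (hypothetical value)

function rowToSamples(row) {
  const samples = new Float64Array(BUFFER_SIZE);
  row.forEach((brightness, x) => {
    const w = (2 * Math.PI * (FIRST_BIN + x)) / BUFFER_SIZE;
    for (let n = 0; n < BUFFER_SIZE; n++) {
      samples[n] += brightness * Math.cos(w * n);
    }
  });
  return samples;
}

function samplesToRow(samples, width) {
  const row = [];
  for (let x = 0; x < width; x++) {
    const w = (2 * Math.PI * (FIRST_BIN + x)) / samples.length;
    let re = 0, im = 0;
    for (let n = 0; n < samples.length; n++) {
      re += samples[n] * Math.cos(w * n);
      im -= samples[n] * Math.sin(w * n);
    }
    // Scale so that a unit-amplitude tone reads back as 1.
    row.push((2 * Math.hypot(re, im)) / samples.length);
  }
  return row;
}

const row = [1, 0, 1, 1, 0, 0, 1, 0]; // 1 = tone present, 0 = silent
const recovered = samplesToRow(rowToSamples(row), row.length);
console.log(recovered.map((v) => Math.round(v)).join("")); // prints "10110010"
```

Stacking the recovered rows one per buffer gives the spectrogram image that gets handed to the QR reader.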
Here's the demo. It runs in your browser, using the JavaScript QR code reader by Lazar Laszlo and the QR code generator by davidshimjs. (You will need a browser that supports getUserMedia.)
Type a message in the sender and hit 'Play' to hear it as a Fourier-transformed QR code.
Click 'Listen' on the receiver, and you should see a spectrogram of the audio it hears. When you have a clear QR code in view, hit 'Decode' to try and interpret it.
I might play around with the decoding some more to make it more reliable, but of course I have no delusions about this being a viable protocol. I just think it's reaaallly cool.
In this quick demo, the phone was about 4 metres from the desktop speakers. My phone runs the receive program very slowly, so I had to transmit extra slowly (increase the height) for it to appear square.
You can cheat / test this by sending the audio data directly to the receiver, either by connecting a cable between line in and line out, or through software (on Windows, by selecting 'Stereo Mix' as the microphone source). This gives almost perfect transmission.
With actual audio transmission, I found it was best to apply a small amount of blurring. This is done by a square convolution: run through the image and set each pixel to the average of its surrounding pixels. I also made it auto-level the image. This is done by calculating a histogram, integrating it to find the range that holds the central 80% of the pixels, then stretching the image data so that this range spans the full range of intensities.
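Both steps are short enough to sketch. This is my own reconstruction from the description, not the demo's code: it assumes a greyscale image stored as an array of rows with values from 0 to 255, and the function names, the default blur radius and the blur-then-level order are all assumptions.

```javascript
// Box blur: each pixel becomes the mean of its (2r+1) x (2r+1) neighbourhood,
// using only the neighbours that fall inside the image.
function boxBlur(img, radius = 1) {
  const h = img.length, w = img[0].length;
  return img.map((_, y) =>
    Array.from({ length: w }, (_, x) => {
      let sum = 0, count = 0;
      for (let dy = -radius; dy <= radius; dy++) {
        for (let dx = -radius; dx <= radius; dx++) {
          const yy = y + dy, xx = x + dx;
          if (yy >= 0 && yy < h && xx >= 0 && xx < w) {
            sum += img[yy][xx];
            count++;
          }
        }
      }
      return sum / count;
    })
  );
}

// Auto-level: walk the histogram in from each end until the discarded tails
// hold (1 - keep) of the pixels, then stretch what is left to 0..255.
function autoLevel(img, keep = 0.8) {
  const hist = new Array(256).fill(0);
  let total = 0;
  for (const rowPixels of img) {
    for (const v of rowPixels) {
      hist[Math.max(0, Math.min(255, Math.round(v)))]++;
      total++;
    }
  }
  const tail = (total * (1 - keep)) / 2;
  let lo = 0, hi = 255, cum = 0;
  for (let i = 0; i < 256; i++) {
    cum += hist[i];
    if (cum > tail) { lo = i; break; }
  }
  cum = 0;
  for (let i = 255; i >= 0; i--) {
    cum += hist[i];
    if (cum > tail) { hi = i; break; }
  }
  const span = Math.max(1, hi - lo); // avoid dividing by zero on flat images
  return img.map((rowPixels) =>
    rowPixels.map((v) => Math.max(0, Math.min(255, ((v - lo) * 255) / span)))
  );
}

// A washed-out checkerboard: only intensities 100 and 150 are present.
const faint = [
  [100, 150, 100, 150],
  [150, 100, 150, 100],
];
console.log(autoLevel(faint)); // 100 is stretched down to 0, 150 up to 255
```

On a real spectrogram the two would be chained, something like `autoLevel(boxBlur(image))`.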
Blurring / leveling seems to work very well. One last thing I added was a test to see if there are any obvious notch bands. This looks for columns of pixels that, after leveling, have no black pixels, and darkens these columns until they do. This only applies to bands through the middle. If there are notch bands at the sides, it won't know what's noise and what's data, so it's important to change the width/offset to ensure the edges of the code are clear.
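A sketch of how such a notch fix might look, with plenty of guesswork: the `BLACK` threshold, the name `fixNotchBands`, and the choice to darken by stretching each column so its darkest pixel reaches zero are all my assumptions. "Through the middle" is taken here to mean any column lying between the outermost columns that do contain black.

```javascript
const BLACK = 32; // intensities at or below this count as black (hypothetical threshold)

function fixNotchBands(img) {
  const w = img[0].length;
  // Darkest pixel in each column.
  const colMin = Array.from({ length: w }, (_, x) =>
    Math.min(...img.map((rowPixels) => rowPixels[x]))
  );
  const hasBlack = colMin.map((m) => m <= BLACK);
  const first = hasBlack.indexOf(true);
  const last = hasBlack.lastIndexOf(true);
  const out = img.map((rowPixels) => rowPixels.slice()); // leave the input untouched
  for (let x = first + 1; x < last; x++) {
    // Skip healthy columns, and columns that are pure white (nothing to stretch).
    if (first === -1 || hasBlack[x] || colMin[x] >= 255) continue;
    // Stretch the column so its darkest pixel becomes 0 and white stays white.
    for (const rowPixels of out) {
      rowPixels[x] = ((rowPixels[x] - colMin[x]) * 255) / (255 - colMin[x]);
    }
  }
  return out;
}

// Middle column is washed out (darkest pixel is 200); the outer columns are fine.
const notched = [
  [0, 200, 0],
  [255, 255, 255],
];
console.log(fixNotchBands(notched)); // middle column becomes [0, 255]
```

Columns outside the first and last healthy column are left alone, which is the edge case where noise and data can't be told apart.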
By the way, if you want to use other software to plot the spectrogram, make sure you set the scaling to linear, not logarithmic.
The spectrogram originally used the browser's built-in Fourier transform. This, I expect, runs much faster because it's compiled code. However, it's intended for visualizations, and I soon realized that its window function smears things out far too much: every peak ended up with a long tail. The coloured spectrum at the top displays this effect. Implementing our own FFT with rectangular windowing gives a much crisper output. I also switched from the AnalyserNode to a ScriptProcessorNode, which hopefully won't miss entire buffers out at random.
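The effect of the window is easy to reproduce outside the browser. The sketch below compares a rectangular window with a Blackman window, which, as far as I understand the Web Audio spec, is what the AnalyserNode applies. It uses a slow direct DFT, the sizes are arbitrary, and it only covers smearing across frequency, not any smoothing over time.

```javascript
const N = 64;   // buffer length (hypothetical value)
const BIN = 10; // the tone completes exactly 10 cycles per buffer

const tone = Array.from({ length: N }, (_, n) =>
  Math.cos((2 * Math.PI * BIN * n) / N)
);

// Blackman window with alpha = 0.16.
const blackman = (n) =>
  0.42 - 0.5 * Math.cos((2 * Math.PI * n) / N) + 0.08 * Math.cos((4 * Math.PI * n) / N);

// Magnitude of each DFT bin from 0 up to N/2.
function magnitudeSpectrum(samples) {
  const mags = [];
  for (let k = 0; k <= N / 2; k++) {
    let re = 0, im = 0;
    samples.forEach((s, n) => {
      re += s * Math.cos((2 * Math.PI * k * n) / N);
      im -= s * Math.sin((2 * Math.PI * k * n) / N);
    });
    mags.push(Math.hypot(re, im));
  }
  return mags;
}

const rect = magnitudeSpectrum(tone); // rectangular window: samples used as-is
const windowed = magnitudeSpectrum(tone.map((s, n) => s * blackman(n)));

// Magnitude in the two neighbouring bins, relative to the peak bin.
const leak = (m) => (m[BIN - 1] + m[BIN + 1]) / m[BIN];
console.log("rectangular:", leak(rect).toExponential(1));
console.log("blackman:   ", leak(windowed).toFixed(2));
```

With the rectangular window the neighbours hold nothing but rounding error, because the tone sits exactly on a bin. With the Blackman window they jointly carry more than the peak itself, which in the spectrogram means every module of the QR code bleeds into the ones beside it.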
But even the smeared-out image at the top, with some levels adjustment in Photoshop, can be made readable:
Missing buffers mean the image is stretched/squeezed vertically in random places, and this is not the type of transform the QR reader is expecting. Possibly we could replace the algorithms with something more appropriate.
As for transmitting this data down a phone line, I suspect this isn't possible (although I haven't tried it). Phone lines are extremely squeezed in terms of bandwidth (vocoders etc) and anything that doesn't sound like a human voice gets destroyed. Damn those pesky phone companies trying to save bandwidth!
I may try to find the best data rate I can get using a direct audio cable between the devices. I expect I might be able to get into the kilobits per second.
Incidentally, I would recommend reading the wiki page on dial-up modems. They are actually incredible bits of engineering. Dial-up speeds might not sound like much, but remember that phone lines only have about 3 kHz of bandwidth, so pushing it to 56 kbps is insane.
While trying to glue the FFT to the QR code maker, I went through a few stages. First of all I generated white noise, and used the generated image to blank various components. This actually works perfectly, except for the continuously randomized phase components causing discontinuities with each buffer fill. Makes it quite clicky.
Much better to randomize the components initially, and propagate them maintaining constant magnitudes. We then copy/attenuate these components into the FFT. Twas the thought anyway, and it works perfectly. There was, however, a face-palming moment in this development. Each of those components, by definition, completes a whole number of periods per buffer. That is to say, after propagating for a full buffer they will all be back where they started. In other words: we don't need to propagate them at all. Moments like this make me feel like a buffoon, and I almost want to delete the old reverse spectrogram experiments page out of embarrassment.
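The face-palm can be checked numerically. In this sketch (the buffer size, the bins and the name `sampleAt` are made up for illustration) each component gets a random phase once, and the signal one full buffer later is compared with the signal now.

```javascript
const N = 128;           // buffer length in samples (hypothetical value)
const bins = [5, 9, 14]; // active frequency bins (hypothetical values)

// Phases are randomized once, up front, and never touched again.
const phases = bins.map(() => 2 * Math.PI * Math.random());

// Signal value at absolute sample index n, summing all active components.
const sampleAt = (n, ph = phases) =>
  bins.reduce((sum, k, i) => sum + Math.cos((2 * Math.PI * k * n) / N + ph[i]), 0);

// Bin k completes exactly k cycles per buffer, so the signal at n + N
// should equal the signal at n for every n.
let maxDrift = 0;
for (let n = 0; n < N; n++) {
  maxDrift = Math.max(maxDrift, Math.abs(sampleAt(n + N) - sampleAt(n)));
}
console.log("largest difference one buffer later:", maxDrift);

// For contrast: re-randomizing the phases for the next buffer makes the
// waveform jump at the seam, which is where the clicks came from.
const newPhases = bins.map(() => 2 * Math.PI * Math.random());
console.log("jump at the seam:", Math.abs(sampleAt(0, newPhases) - sampleAt(N)));
```

The first number is just floating-point rounding error, so the same phases can be reused for every buffer and consecutive buffers still join without a click.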
Oh well. We can make up for it now by providing a working, generic, reverse spectrogram image program:
Hooray! It works so well, I might post some more on this later.