The Autotune Slide Whistle is barely more than a joke, but it works! It started out as a concept, which I finally prototyped just recently.
Conversely, the Robot MIDI Slide Whistle is a momentous invention almost guaranteed to solve many of the world's problems.
Right from the start I knew it had to respond fast.
We could have used a belt drive and a stepper motor. But it'd need to be beefy to move with the kind of speed and accuracy I wanted. Part of the problem is that the low notes are far apart, but the high notes are close together, so any linear arrangement needs to be both fast enough for the low notes and accurate enough for the high ones.
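To put rough numbers on that, here is a minimal sketch that treats the whistle as an idealised stopped pipe, where the sounding length is a quarter of a wavelength. That model, the speed of sound and the function names are assumptions for illustration only, not measurements from the real whistle.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def plunger_position_mm(midi_note):
    """Idealised tube length for a note, assuming a stopped pipe (L = c / 4f)."""
    freq = 440.0 * 2 ** ((midi_note - 69) / 12)
    return 1000.0 * SPEED_OF_SOUND / (4.0 * freq)


def semitone_spacing_mm(midi_note):
    """Distance the plunger must travel to go from midi_note up one semitone."""
    return plunger_position_mm(midi_note) - plunger_position_mm(midi_note + 1)


if __name__ == "__main__":
    for note in (60, 72, 84):
        print(note, round(semitone_spacing_mm(note), 2), "mm per semitone")
```

Under this model, every octave you go up halves the distance between adjacent semitones, which is why a linear drive has to be both fast and precise.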
The scissor arrangement uses leverage to flatten out the problem. Here's one of my early prototypes:
Those hobby servos are dreadful. Not only do they have poor accuracy and poor repeatability, they have huge variations between units. So sending the 50% signal to both of them does not mean they'll both point in the same direction. In the shot above, I'd pulled out their control circuitry and attempted to improve upon it. The breadboard has an H-bridge driver and there's an STM32 board controlling it, running a PID loop using the internal potentiometers of the servos.
I tried an awful lot of things to get this to work, but even my best efforts were fruitless. Eventually I conceded: it was time to try some serious servos.
The Dynamixel AX-12 is one hell of a servo. It's roughly ten times the price of a regular one, but for that you get a proper control loop that can be tweaked, excellent accuracy and repeatability out of the box, and a whole load of other features like temperature and current load readout. They're controlled completely differently to other servos: a single wire carries addressed packets of data between all servos and the host.
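To give a feel for what an addressed packet looks like, here is a small sketch of the framing as I understand it from the Dynamixel Protocol 1.0 documentation: two 0xFF header bytes, the servo ID, a length, an instruction, any parameters, and an inverted-sum checksum. The helper names are made up for this example, and this is not the code running on the whistle (that is AVR assembly).

```python
WRITE_DATA = 0x03      # Protocol 1.0 "write" instruction
GOAL_POSITION = 0x1E   # AX-12 control table address of the goal position (2 bytes)


def build_packet(servo_id, instruction, params=()):
    """Frame one instruction packet: FF FF ID LEN INSTR PARAMS... CHECKSUM."""
    length = len(params) + 2  # instruction byte + checksum byte + parameters
    body = [servo_id, length, instruction, *params]
    checksum = (~sum(body)) & 0xFF
    return bytes([0xFF, 0xFF, *body, checksum])


def goal_position_packet(servo_id, position):
    """Ask one servo to move to a 10-bit position (0 to 1023)."""
    if not 0 <= position <= 1023:
        raise ValueError("AX-12 goal position must be in 0..1023")
    low, high = position & 0xFF, position >> 8  # little-endian
    return build_packet(servo_id, WRITE_DATA, [GOAL_POSITION, low, high])


if __name__ == "__main__":
    print(goal_position_packet(1, 512).hex(" "))
```

Because every packet carries an ID, both servos can hang off the same wire and still be told to go to different positions.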
The plywood parts, as always, were built in my semi-CAD approach: just draw a couple of lines in the laser software, try it out, edit and cut them again if needed. This is the advantage of owning your own laser cutter. The pivots are dressmaker's pins through laser-drilled holes.
It gives a very good pivot for very little effort. The hole is produced just thin enough that there is absolutely no play at all.
As for holding the whistle, little clips were cut that press-fit into the support part. A loop of tape is still needed, to stop the whistle rotating.
The need for flowrate control was obvious from the start. With a constant flow rate, only about half of the notes will sound: either the top notes are just a breathy noise, or the bottom notes jump to a higher harmonic. With an electric fan, we can just modulate the power to it and get a range of flowrates. There's a fair bit of inertia to the fan, so it won't instantly change flowrate, but it's fast enough to keep up with the kind of speeds the plunger is moving.
It's not fast enough to articulate the notes, however. It can adjust flowrates marginally with a good response, but slowing to a complete stop takes the best part of a second. Hence the need for the additional valve on the fipple.
Adjusting the speed is theoretically simple, but I can remember encountering all kinds of problems. A big ol' transistor was used to pulse the 12V supply to it.
The heatsink might have been overkill. Even for the highest notes we only want a fraction of the full 50W output the motor can provide. It's usual when driving an inductive load to suppress the back-EMF with a diode. I found that the amount of noise thrown off this fan was far more than a simple diode could cope with. A lot of work went into isolating the noise from the servos. The cheap 9g servo operating the valve was particularly sensitive – it is an analog servo, after all. Changing motor speed sometimes caused the servo to jump position or get confused.
I added an additional diode and a small capacitor inline on the power cable to the fan motor, as close as I could get it to the source of the noise. This helped a lot more than just adding things to the circuit board.
The main circuit was built with the intention of mounting it onto the back of the slide whistle, to make a compact instrument. The circuit is quite simple: just a microcontroller (ATmega328p), an optoisolator for the MIDI input, a linear regulator for the uC and a small secondary buck regulator for the valve servo. The air pump and the Dynamixel servos run from 12V, provided by the big step-down module on the right.
Seriously, you wouldn't believe the number of graphs I plotted while making this thing.
The goal was to get the thing to play in tune by calibrating it in advance, so it didn't need a feedback loop. Essentially, we have a Python script tell the whistle to try every possible position, and detect the frequency of the sound at each one. Then we invert that to generate a lookup table.
My Python script uses a simple method of pitch detection, but it works very well. Even the built-in microphone on the laptop running the script was good enough for calibrating the whistle. It first filters the data using a Gaussian blur, then counts the number of zero-crossings. NumPy makes this very easy to implement.
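Here is a rough sketch of that blur-then-count idea. It is not the actual calibration script: it is written in plain Python so it runs without any dependencies, and the function names and the blur width (sigma, in samples) are assumptions picked for the example.

```python
import math


def gaussian_kernel(sigma):
    """Normalised Gaussian weights covering +/- 3 sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(samples, sigma):
    """Blur the signal, keeping only the samples where the kernel fully overlaps."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    return [
        sum(s * k for s, k in zip(samples[i - radius:i + radius + 1], kernel))
        for i in range(radius, len(samples) - radius)
    ]


def estimate_frequency(samples, sample_rate, sigma=4.0):
    """Estimate pitch by counting rising zero-crossings of the blurred signal."""
    smoothed = smooth(samples, sigma)
    mean = sum(smoothed) / len(smoothed)
    centred = [s - mean for s in smoothed]  # remove any DC offset first
    rising = sum(1 for a, b in zip(centred, centred[1:]) if a < 0 <= b)
    return rising / (len(centred) / sample_rate)  # one rising crossing per cycle
```

The blur is what makes this workable: without it, hiss and breath noise add spurious crossings and the count comes out far too high.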
But there are a bunch of problems that make generating the lookup table harder.
The first problem is that for different notes, different flowrates are needed. In the middle region, blowing harder or softer will change not just the volume, but also the pitch. At the top end, blowing too gently will not make a sound, and at the low end, blowing too hard will overblow to the harmonic.
Doing several sweeps at constant flowrates looks a little like this:
The hex value is a duty cycle sent to the H-bridge for the fan motor. The higher flowrates cause harmonics at the low end, and the lower flowrates produce gibberish at the top. But this is enough to tell us what the flowrates should be at different positions. There was a bit of handwaving involved, but I came up with a formula to choose a good flowrate for each position.
Once we have a clean sound for any position, we can do some full sweeps. The next problem to solve is the backlash, also called slop or hysteresis. We tell the script to do a very slow sweep from bottom to top, and then another slow sweep from top to bottom. The two curves should show us the extremes of the backlash.
In theory, taking an average of these two curves will give the 'true' position we need to send to the servos, to get a given pitch. That is, assuming the backlash is small enough.
The maximum pitch error of about 2% corresponds to roughly one third of a semitone. One semitone in equal temperament is about 6% difference in pitch.
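As a sanity check on those numbers, here is a small sketch that averages an upward and a downward sweep and converts the worst-case deviation into semitones. The data layout (lists of position and frequency pairs taken at the same positions) and the function names are assumptions for illustration; the real script may well do this differently.

```python
import math


def ratio_to_semitones(ratio):
    """Convert a frequency ratio into equal-tempered semitones."""
    return 12.0 * math.log2(ratio)


def average_sweeps(up, down):
    """Average the measured frequency at each position of two sweeps."""
    return [(pos, (f_up + f_down) / 2.0) for (pos, f_up), (_, f_down) in zip(up, down)]


def worst_error_semitones(up, down):
    """Largest deviation of either sweep from the averaged curve, in semitones."""
    worst = 0.0
    for (_, f_up), (_, f_down), (_, f_avg) in zip(up, down, average_sweeps(up, down)):
        for f in (f_up, f_down):
            worst = max(worst, abs(ratio_to_semitones(f / f_avg)))
    return worst


if __name__ == "__main__":
    print(round(ratio_to_semitones(1.02), 3))           # a 2% error, in semitones
    print(round(ratio_to_semitones(2 ** (1 / 12)), 3))  # one semitone, about a 6% ratio
```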
There are ways of cutting this down, aside from trying to tackle the source of the backlash. We could factor in the previous position when we move to a new note, and digitally compensate for the backlash. This is a lot harder than it sounds, because the backlash is also a function of the speed the servos move. Quite often for large jumps they will overshoot and rebound, so the backlash error is opposite to what you'd expect.
Another option is to have a realtime feedback loop, of course, but I really didn't think that was needed. To be honest, a one-third semitone maximum pitch error on a slide whistle is golden.
Happy with the calibration, the next step is to invert the averaged data to get us a mapping between MIDI note number and servo angle.
I was very pleased with this result. For a first-order correction, that's pretty much a straight line. The size and angle limits of the scissor mechanism were chosen on a whim, a gut feeling maybe, but until I plotted this graph I had no idea if it would work. Satisfying!
A MIDI note number is an integer, a whole number of semitones. In the graph above, we've labelled it MIDI note number but it's a continuous scale, just the logarithm of the original curve and an offset to line it up with the MIDI scale. Really, we need to snap to the nearest whole number, and only use the intermediate numbers when pitch-bend is involved.
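A minimal sketch of that inversion might look like the following. It assumes the averaged calibration data is a list of (servo position, frequency) pairs in which pitch changes monotonically with position, and it uses plain linear interpolation; the names are invented for the example.

```python
import math


def freq_to_midi(freq_hz):
    """Continuous MIDI note number: the logarithm of frequency plus an offset (A4 = 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def position_for_note(curve, note):
    """Interpolate a servo position for a (possibly fractional) note number.

    curve is a list of (position, continuous_note) pairs sorted by note.
    """
    for (p0, n0), (p1, n1) in zip(curve, curve[1:]):
        if n0 <= note <= n1 and n1 > n0:
            t = (note - n0) / (n1 - n0)
            return p0 + t * (p1 - p0)
    raise ValueError("note is outside the calibrated range")


def build_lookup(measurements):
    """Map every whole MIDI note in range to a rounded servo position."""
    curve = sorted(((pos, freq_to_midi(f)) for pos, f in measurements), key=lambda pn: pn[1])
    lowest, highest = math.ceil(curve[0][1]), math.floor(curve[-1][1])
    return {n: round(position_for_note(curve, n)) for n in range(lowest, highest + 1)}
```

Whole notes come straight out of the table, while a pitch-bent note could call position_for_note with a fractional note number instead.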
The source code should not be too hard to follow if you're familiar with AVR assembly. We only care about MIDI note-on, note-off, and pitch-bend. Naturally the pitch-bend resolution is awful, but it's funny to support it.
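If you don't read AVR assembly, the message handling amounts to something like this Python sketch. It is an illustration of standard MIDI byte parsing (status bytes, running status, 14-bit pitch-bend), not a translation of the actual firmware, and the event tuples are a format made up for the example.

```python
def parse_midi(data):
    """Extract note-on, note-off and pitch-bend events from raw MIDI bytes."""
    events = []
    status = None  # last channel status byte, kept for running status
    params = []
    for byte in data:
        if byte >= 0xF8:           # real-time messages can appear anywhere; skip them
            continue
        if byte & 0x80:            # a new status byte
            status = byte if byte < 0xF0 else None  # system messages cancel running status
            params = []
            continue
        if status is None:         # data byte with nothing to attach it to
            continue
        params.append(byte)
        kind, channel = status & 0xF0, status & 0x0F
        needed = 1 if kind in (0xC0, 0xD0) else 2
        if len(params) < needed:
            continue
        if kind == 0x90 and params[1] > 0:
            events.append(("note_on", channel, params[0], params[1]))
        elif kind in (0x80, 0x90):  # note-on with velocity 0 means note-off
            events.append(("note_off", channel, params[0]))
        elif kind == 0xE0:          # 14-bit value, LSB first, centred on 8192
            events.append(("pitch_bend", channel, (params[0] | (params[1] << 7)) - 8192))
        params = []                 # every other message type is silently dropped
    return events
```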
The ATmega chip only has one UART, but in assembly it's easy to just bitbang another UART.
The watchdog timer is used to detect the end of a song. It starts ticking after the last note-off, and if no new notes are played for more than a second, the fan-motor is turned off.
As usual, the source code for everything has been dumped on GitHub.
Staccato notes are fine. The valve is closed, then a note-on arrives and the valve opens. The problems occur when a new note is played directly after another. There's no time for the valve to close and open again, so it just stays open, and the notes are slurred together. This really cuts into the performance and for the first few songs, I manually edited the midi files to shorten any notes that directly preceded others of the same pitch.
But we live in an age of computers!
You probably already think I'm a bit weird for choosing to program in assembly, but things are about to get weirder. Let me tell you about CAL scripts.
Cakewalk Pro Audio 9 is a very early bit of music software. A product of the 90s, it existed before the term DAW had been coined, and well before VSTs were introduced. The "audio" in the name suggests it can do more than just MIDI, but for audio handling I wouldn't recommend it. The point is, in the 90s this was the peak of MIDI software. I somehow got hold of Cakewalk Pro Audio 9 when I was a kid, and it was already out of date, but it set me on my MIDI journey.
I was always amazed by how complete it was. For MIDI, it did everything you could possibly want. And it kept working long after the computers it was designed for had rotted to bits. I think partly this is because it only made basic Windows API calls, instead of depending on complex graphics libraries, so the software still works today, even with the butchered title bars Windows 10 tries to give it.
Cakewalk continued to produce software, and their main product evolved into Sonar, but the newer software eventually dropped support for one of the most interesting bits, the CAL scripts.
Nowadays, if you want to add scripting support to a software product, pretty much everyone chooses Python. If you're a bit more classy, you might still go with Lua. But in 1994, these languages were virtually unknown (Lua launched in 1993, and Python, begun at the end of the 1980s, was first released in 1991 but didn't take off until much, much later). Which is why CAL scripts are based on LISP.
Don't get me wrong. I don't recommend anyone wanting to start MIDI programming should head anywhere near a CAL script, but this type of task is exactly what they were designed for. Being able to highlight a track in the sequencer and process it with the script is lovely, and it shocks me that the developers of modern DAWs don't think people would want that.
Our program logic is simple: find any note that ends immediately before another note of the same pitch, and shorten it by a certain amount. In other words, if the current note starts where the last one ends, shorten the last one.
Despite the enthusiasm with which I pronounce "LISP", CAL scripts are actually awful. There's no documentation or reference material, and there are serious limitations on what you can do. When iterating over note events, it isn't possible to maintain a reference to a previous event. So to "shorten the previous note" is incredibly tricky. The only way I was able to do it was by copying all of the properties of the current event into global variables, then forcibly deleting every event as it comes in, and re-adding it on the next iteration. Ridiculous.
(do
  (int shorten 40)
  (getInt shorten "Shorten amount" 1 200)

  (dword lastNoteKey 128)
  (dword lastNoteVel 0)
  (dword lastNoteDur 0)
  (dword lastEventTime 0)
  (dword lastEventChan 16)
  (dword final 0)

  (forEachEvent
    (if (== Event.Kind NOTE) (= final Event.Time))
  )

  (forEachEvent
    (do
      (if (== Event.Kind NOTE)
        (do
          (if (&& (== lastNoteKey Note.Key) (>= (+ lastEventTime lastNoteDur) (- Event.Time shorten)))
            (= lastNoteDur (- (- Event.Time shorten) lastEventTime))
          )

          (if (!= lastEventChan 16) ; first loop - last event not set
            (insert lastEventTime lastEventChan NOTE lastNoteKey lastNoteVel lastNoteDur)
          )

          (= lastNoteKey Note.Key)
          (= lastNoteVel Note.Vel)
          (= lastNoteDur Note.Dur)
          (= lastEventTime Event.Time)
          (= lastEventChan Event.Chan)

          (if (!= Event.Time final) (delete))
        )
      )
    )
  )
)
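For comparison, here is roughly the same logic as a Python sketch, where keeping a reference to the previous note is trivial. The Note class and function name are invented for this example, times are in ticks, and clamping the duration at zero is my own addition rather than something the CAL script does.

```python
from dataclasses import dataclass


@dataclass
class Note:
    time: int  # start time in ticks
    key: int   # MIDI note number
    vel: int   # velocity
    dur: int   # duration in ticks


def shorten_repeats(notes, shorten=40):
    """Shorten any note that runs into a following note of the same pitch."""
    previous = None
    for note in sorted(notes, key=lambda n: n.time):
        if (previous is not None
                and previous.key == note.key
                and previous.time + previous.dur >= note.time - shorten):
            # Leave a gap of `shorten` ticks so the valve has time to close.
            previous.dur = max(0, note.time - shorten - previous.time)
        previous = note
    return notes


if __name__ == "__main__":
    track = [Note(0, 60, 100, 480), Note(480, 60, 100, 480), Note(960, 62, 100, 480)]
    for n in shorten_repeats(track):
        print(n)
```

Like the CAL version, it only compares each note with the one immediately before it, which is all a monophonic whistle part needs.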
Still, seeing it in multitracked glory does make me want to finish it after all. One day I shall build my army of robot slide whistles.
Stay tuned for more absurdities, only on mitxela.com!