Once upon a time, in the days when computers were mysterious and new, there was no simple way of making electronic musical instruments communicate with each other. Every manufacturer invented their own methods for co-ordinating the various bits and pieces that they sold. These would usually involve fragile bundles of wires passing analogue control voltages from one device to another. On reaching their intended devices, these voltages were amplified, manually scaled and offset in order to render them useful.
The pre-MIDI EMS VCS3 Cricklewood keyboard and Putney synthesiser in various stages of interconnectedness. In those days, it was considered acceptable to name one’s products after unprepossessing but well-to-do corners of Greater London. (Apologies for the suboptimal photography.)
The brutal-looking connector on the 1960s VCS3 is called a Jones connector. It supplies two power voltages and a ground to the keyboard. Two scaled control signals and an envelope trigger are generated and returned on separate terminals. Putney’s backplane has an array of jack sockets that allow control voltages to enter and leave.
In response to this unhelpful situation, the MIDI Manufacturer’s Association [MMA], a consortium of mostly American and Japanese companies, agreed on a universal specification for a digital interface. This specification was driven entirely by two needs: to encourage interoperability between musical devices, and to keep cost to a minimum. The MMA settled on an asynchronous serial interface, because this reduced the complexity and cost of interconnection. It was specified to run at 31.25kHz, a number chosen because it is easily reachable by dividing 1MHz by a power of two. At the time, this choice rendered it incompatible with RS-232 (which can usually provide nothing between 19.2kHz and 38.4kHz), preventing existing computers from transmitting or receiving MIDI without extra hardware. MIDI may have ended up on computers only as an afterthought.
Data was communicated in one direction only over 5-pin DIN connectors, which were ubiquitous in the home audio market, and were therefore about the cheapest multi-pin connectors available. (They were so cheap, in fact, that the MIDI specification wantonly squandered two of the connector’s pins by leaving them unconnected: a move that would not be countenanced today.)
The data that travels on the MIDI interface was elegantly designed to embrace the feature set of contemporary microprocessors. Only 8-bit data was employed and, to save memory, no standard message could exceed three bytes in length. One bit of every byte was reserved to frame the data, giving rise to the 7-bit data limitation that causes annoyance today.
By design, MIDI embraced note gating information, key velocity, and standard controller messages for the pitch bend wheel, sustain pedal, and key aftertouch. A loosely-defined framework of controller messages was also provided so that other data could be conveyed besides this. The provision was made for almost every command to be assigned one of 16 separate channels, intended to allow sixteen different sounds to be controlled independently over the same physical cable.
The first MIDI devices emerged in 1983. Some unintentionally very amusing episodes of Micro Live were created that demonstrated the technology. The rest is history. Synthpop was essentially democratised by this inexpensive serial protocol. Dizzy with the possibilities of MIDI, musicians ganged their synthesisers together and began controlling them from the same keyboard to layer more and more voices, creating fat digital sounds that were very distinctive and dated very quickly. Artists that did not have the resources to employ professional studios with all their pre-MIDI equipment connected their Casio keyboards to home computers, running new software that enabled them to build up note sequences, and then quantise, manipulate, and replay them in a manner that would have been unthinkably expensive by any other means.
Here we are, nearly thirty years later. The processing power and capacity of a computer is around two million times as great as anything available for similar money in 1983. As a consequence, keyboard controllers, synthesisers, sequencers, and signal processing tools have advanced considerably. And yet, amidst all this change, MIDI still reigns supreme. As a basic protocol, it is just about fit for purpose. With our ten digits, two feet, and a culturally-motivated lack of interest in breath controllers, most of us are still trying to do the same things that we’ve always done in terms of musicianship. Although devices now produce much richer MIDI data at a faster rate, this is not a problem because MIDI is conveyed over faster physical interfaces (such as USB) so we can still capture it.
Aside from the musical data, MIDI has another weapon that has ensured its survival: it allows manufacturer-specific data transmissions. These System Exclusive messages opened a portal that allows modern devices to play with computers in ways that MIDI’s creators could not have imagined. To System Exclusive messages, we owe patch storage and editing software, remote software upgrades, and next-generation protocol extensions like Automap.
And yet … and yet, the specification shows its age to anybody who wants to do more or to delve deeper. MIDI is inherently a single-direction protocol, and its 7-bit data limitation results in an obsession with the number 128 that is now painfully restrictive: 128 velocity gradations; 128 programs in a bank; 128 positions a controller can take. Certain aspects of MIDI were poorly defined at the beginning, and remain unresolved three decades later.
Q. Middle C is conveyed by MIDI note number 60. Should we display this note to the user as C3 or C4?
A. Just choose one at random and provide the other as a configuration option.
Q. How much data might I expect a System Exclusive message to convey?
A. Oh dear, you went and bought a finite amount of memory. Good luck designing that MIDI merge box / USB interface.
Q. I’ve got a MIDI Thru splitter that is supposed to be powered from the MIDI OUT port of my keyboard. Why doesn’t it work?
A. Your keyboard manufacturer and your Thru box manufacturer have both bent the specification. If they’ve bent it in opposite directions, then your box won’t work as advertised.
Q. If the user doesn’t understand MIDI channels, and is attempting to transmit MIDI data on one, and receive MIDI data on another, what will happen?
A. The device at one or other end of the cable will end up back at the shop.
Q. I’m designing a new keyboard. Should my device support Active Sensing?
A. I don’t know. Should it?
Apart from all that, a lack of per-note control data annoys the creators of more expressive instruments. The standard’s rigid genuflection to Western 12-tone chromaticism is an irksome limitation to some (particularly those who use terms such as ‘Western 12-tone chromaticism’). The note model cannot properly handle single-note pitch effects such as glissandi. For devices that must accept or transmit a wide variety of control data, including us, the NRPN system constitutes a fairly unpleasant prospect, loaded with parsing irregularities and a padding-to-payload ratio of 2:1.
In retrospect, dealing with MIDI could have been made somewhat easier. The size of a single MIDI instruction depends on the contents of the first byte in a way that is neither obvious nor easy to derive, and the first byte may not necessarily be repeated in subsequent messages, which leads to a fairly onerous parsing process.
The authors of the USB MIDI specification went to the trouble of re-framing all the data into four-byte packages to simplify parsing. Unfortunately, they left a back door open to transmit an individual data byte where this was deemed essential. When is this essential? When you are deliberately trying to send malformed data that’s useless to the device at the other end. Or, to put it another way, never. The inevitable happened: one company now misframes even valid instructions, using this message capriciously to split up standard data into streams of single bytes. The USB MIDI parser thus becomes more, not less, complex, because it has to be able to support both the new four-byte frames and the old-fashioned variable length ones.
In honesty, it’s only slightly inconvenient. The MIDI parser that we embed into our hardware designs is about 640 bytes long. These are 640 very carefully arranged bytes that took several days and a lot of testing to prove, and all they do is allow a device to accept a music protocol invented in the early 1980s, but it might have been a lot worse. Indeed, it is worse once you start trying to respond to the data. Never mind: if even the pettiest problem stings us, we fix it properly. And if any fool could do MIDI properly, we’d all have to find alternative careers.
There have been attempts, and occasionally there still are, to supplant MIDI with an all-new data format, but these seem doomed to obscurity and ultimately to failure. About twenty years ago, there was ZIPI; today, it’s nothing more than a Wikipedia page. mLAN attempted to place MIDI into an inexpensive studio network. In spite of very wide industry support, it had few adopters. With hindsight, the futurologists were wrong and the world took a different turn. Latterly, there’s the HD-MIDI specification, and Open Sound Control [OSC], soon to be re-christened Open Media Control. We’ve looked into these. I cannot remember if we are prevented from discussing our draft of the HD-MIDI spec, but we probably are. My one-sentence review therefore contains nothing that isn’t already in the public domain.
HD-MIDI promises to be improved and more versatile, and does so by adding complexity in ways that not everybody will find useful. OSC suffers from a superset of this problem: it’s anarchy, and deliberately so. The owners of the specification have been so eager to avoid imposing constraints upon it that it has become increasingly difficult for hardware to cope with it. The most orthodox interpretation of the specification has the data payload transmitted via UDP somewhere in the middle of a TCP/IP stack. (You think that MIDI’s 7-bit limitation creates too many processing overheads and data bottlenecks? Wait until you try TCP/IP as a host-to-device protocol!)
Networking protocols are fine for computer software, phone apps, and for boutique home-brew products, but they are somewhat impractical for a mass-market music device. Most musicians are not IT specialists. Those whose savoir faire extends only as far as the concept of MIDI channels cannot be expected to prevail in a world of firewalls, MAC addresses, subnet masks, and socket pairing. Ethernet being the mess that it is, there are at least two simpler ways of interfacing with computers by using old serial modem protocols, but most new operating systems have all but given up supporting these and the burden of configuration is, again, upon the user.
More severely, there is an interoperability problem. OSC lacks a defined namespace for even the most common musical exchanges, to the extent that one cannot use it to send Middle C from a sequencer to a synthesiser in a standardised manner. There are many parties interested in commercialising OSC, and a few have succeeded in small ways, but it wouldn’t be possible to stabilise the specification and reach a wide audience without garnering a consortium of renegade manufacturers for a smash-and-grab raid. The ostensible cost of entry to the OSC club is currently far higher than MIDI, too. Producing a zero-configuration self-powered Ethernet device, as opposed to a bus-powered USB MIDI device of equivalent functionality, would price us out of the existing market, exclude us from the existing MIDI ecosystem, and require a great deal more support software, and to what advantage? For OSC to gain universal acceptance, it will need to be hybridised, its rich control data combined with more regular musical events, embedded together in a stream of – you’ve guessed it. If we’re going to go through all that palaver, and more or less re-invent OSC as a workable protocol in our own club, why would we start with its strictures at all? This brings us back to the MMA, and the original reason for its existence. HD-MIDI, at least, has industry consensus. If it is sufficiently more effective than MIDI 1.0, it may yet form part of a complete next-generation protocol.
For all its shortcomings, we musicians and manufacturers cannot abandon MIDI. We have had thirty years to invent a better protocol and we have singularly failed. Some of us have already lost sight of what makes MIDI great, and we must strive to remind ourselves how we can make it better. Meanwhile, the very simplicity, flexibility, and ubiquity of MIDI 1.0 make it certain to be an important protocol for some time to come. With this in mind, I confidently predict that, in 2023, MIDI will still be indispensible, unimpeachable, and utterly, utterly everywhere.