kII 2.1 at Cybersonica SOUNDWAVES
I am very pleased and happy to announce that an adapted work of kII (and the Kontaktstation!) will be shown at the Cybersonica SOUNDWAVES Exhibition in the Kinetica Museum in London from May 17th to June 30th. kII (2.1) will be adapted to fit the conditions of the exhibition space: the idea is to take the work one step further. There will be one device to be operated from the inside and a second device to be operated from outside the museum’s shop window.
Other participating artists are: Pierre Bastien, Max Eastley, Julie Freeman, Andy Huntington, Rob Mullender, Martin Riches and Peter Vogel. See Kinetica-Museum for more information.
To make the speech output more attractive to the audience and provide a joyful interaction experience, it has been decided that there will also be some other sonic output (beats, pads…). As this is how I am using the machine anyway (as a musical instrument capable of singing in a very uncommon way, beyond the typical trashy robotic pre-notated stuff), I am looking forward to realising the work. This article will serve as a documentation page / diary where the newest developments will be published.
This is the current state of the existing work kII Kempelen 2.0:
Note that I wrote the application for this machine in C – based upon MBHP/MIOS, a non-commercial, open-source, microcontroller-based hardware and software project written by Thorsten Klose. You’ll find all sources, plans and everything else you need to rebuild the machine at http://www.midibox.org.
Thinking about two questions:
1. Will there be music all the time, or does the sensor matrix also control the sound? I think yes, it does.
2. As there’s too little time to set up one device that’s controllable from both the inside and the outside, it may be better to set up two independent devices with different sensors. The biggest problem may be unwanted reflections on the window glass. It’s hard to determine beforehand what will happen, as I cannot experiment on-site.
Over the last two days, I thought about two things I had not considered:
If there are two devices and one computer, I need three line-ins and two MIDI inputs, which I do not have. Everything seems to get very complicated and I don’t intend to spend so much money on additional equipment. Also, I don’t like the idea of a hundred extra cables and devices that all need power and are potential weak points.
I’m also a bit worried about the additional Speech-IC I ordered for the second machine. I have absolutely no idea when it will arrive from overseas.
I’m really thinking about porting the kII speech-output technology to the Mac (using the Mac Speech Engine, with which I have worked before). The sensors would be connected to one or two ACSensorizers (with MIDI-Thru enabled), which already provide better sensor control. That way I could try out different sensor positions for the outside-window version.
This would mean: less electronics, fewer wires and, probably most important, fewer problems – and therefore a faster realisation.
Time is running short, I need to make a decision…
I think I’m going to finish the machine for the inside first (based on the SpeakJet). If I have enough time left, I’ll start with the speech output on the Mac for the second (outside) device; I wanted to port kII to the Mac Speech System anyway, so why not just begin? The most important electronic components have not been delivered yet.
So what’s up with the Mac OS X speech? Why would I want to use that? Well, I discovered that there’s a scripting language for the SpeechServicesManager right after I began working on kII. Its concept is pretty simple, but it unleashes powerful control over the system’s speech output.
If you’re surfing on a Mac right now, please select the following text and choose from the “Services” menu (it’s in the application menu, so when you’re browsing with Safari it’s Safari > Services > Speech > Start Reading Text):
[[inpt TEXT]]
hello[[cmnt PITCH 0..127]]
[[pbas 10]]
hello [[slnc 200]]
[[pbas 100]]
hello
[[pbas 43]][[cmnt TUNE]]
[[inpt TUNE]]
~
2EH {D 2000; P 400.0:0 400.0:40 200.0:60 200:100}[[cmnt EMPHASIS +/-]]
[[inpt PHON]]
[[emph +]] EH
[[slnc 100]]
[[emph -]] EH[[cmnt ACCENT]]
[[inpt PHON]]
@1AEplIHk2EYSIXn. [[slnc 100]]
@2AEplIHk1EYSIXn.[[inpt TEXT]]
Now isn’t that cool?
That’s why I love my mac! :)
Today I’ve set up an Ableton LIVE set with three basic beats and two very strange Operator patches. Then I reordered and enhanced the MIDI/CC output capabilities of the kII application to support a better range of output values and added a Note-On message with the harmonized phoneme output. Then I connected these to the LIVE set to filter the beats and control their volumes. The notes trigger the Operator instruments, which are enhanced with LIVE-internal note-off lengths and some nice arpeggiators. The result is surprisingly good for just one evening!
Available output controls are now:
– Note-On {+chAssignment} (harmonized height, lastPhoneme)
– Jaw (opening state), CC#40
– Tongue (finger roll), CC#41
– Bend (thumb roll), CC#43
– Speed (movement speed), CC#44
There are additional verbose outputs available, but I think these won’t be needed… and so I can save a bit of MIDI bandwidth… (A rough sketch of how these messages get sent follows below.)
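For anyone rebuilding this on MBHP/MIOS: sending these controllers boils down to a few MIOS transmit calls. The following is only a minimal sketch under the assumption that the usual MIOS MIDI functions (MIOS_MIDI_BeginStream / MIOS_MIDI_TxBufferPut / MIOS_MIDI_EndStream) are available – it is not the actual kII source, and the helper names are made up.

#include "cmios.h"   // MIOS function declarations (assumed project setup)

#define CC_JAW    40
#define CC_TONGUE 41
#define CC_BEND   43
#define CC_SPEED  44

// send one control change message (channel 0..15, value 0..127)
void SendCC(unsigned char chn, unsigned char cc, unsigned char value)
{
  MIOS_MIDI_BeginStream();
  MIOS_MIDI_TxBufferPut(0xB0 | (chn & 0x0F));
  MIOS_MIDI_TxBufferPut(cc & 0x7F);
  MIOS_MIDI_TxBufferPut(value & 0x7F);
  MIOS_MIDI_EndStream();
}

// send the note-on with channel assignment (harmonized pitch of the last phoneme)
void SendHarmonizedNote(unsigned char chn, unsigned char note, unsigned char velocity)
{
  MIOS_MIDI_BeginStream();
  MIOS_MIDI_TxBufferPut(0x90 | (chn & 0x0F));
  MIOS_MIDI_TxBufferPut(note & 0x7F);
  MIOS_MIDI_TxBufferPut(velocity & 0x7F);
  MIOS_MIDI_EndStream();
}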
The best thing about this approach is that I now have all the necessary messages to control a Mac OS based speech output for the second device. I could set up a second core with a kII application – even without an IIC-SpeakJet module (!). I have made no final decision about this yet, but if the progress continues at this speed, I am looking forward to a very cool dual-voice music machine…
Today I got the first picture of the exhibition place. Based on that, I scribbled the required cabling. Wild, huh?
Last night I had an idea for the second device (the one to be operated from outside the shop window): I thought about attaching a cut foil to the glass to add some information on the usage (icons), but then it hit me: electrically conductive foil! I think it’s possible to glue the very lightweight sensors somehow to the glass, so the whole sensorMatrix would stick directly to the shop window. The microcontrollers and heavy PCBs would be below the ceiling. Gluing the sensors onto the window would also have the advantage that they are really tight to the glass, which is a vital point of the construction concept. If there’s a gap larger than 1 or 2 cm between the glass and the IR diodes, it’s likely that the IR beam gets reflected by the glass surface and produces wrong readings: the Sharp sensors have a critical reversed reading below the minimum range, so this could end up as a real “auto-kinetic” machine, which might be interesting, but not really what I want :)
There are now several options for how to realise this, but it seems like a great idea…
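On the software side, the ambiguous short-range readings can at least be masked in the application: everything below the minimum range (and beyond the maximum) simply gets discarded. A minimal sketch – the threshold values are placeholders and depend on the sensor model and the AIN resolution:

#define AIN_MIN_VALID  90    // below this the Sharp output curve folds back
#define AIN_MAX_VALID 560    // beyond the usable range: treat as "no hand"

// returns 1 and stores a cleaned value only if the reading is trustworthy
unsigned char FilterSensor(unsigned int raw, unsigned int *out)
{
  if (raw < AIN_MIN_VALID || raw > AIN_MAX_VALID)
    return 0;                    // ignore: probably a glass reflection or too close
  *out = raw - AIN_MIN_VALID;    // normalized to 0..(AIN_MAX_VALID - AIN_MIN_VALID)
  return 1;
}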
I also found a 3M shop near me that has a great variety of conductive foils. Another option would be to stick stripped cables directly to a computer-cut foil. Might be a bit of work, but possibly a lot cheaper than constructing and building a second casing…
I always thought there would be a bit more inside these nifty SHARP distance sensors, so I opened one of them… well, curiosity is now satisfied. Astonishingly, there’s just a small PCB in there, but my tests showed that the lenses are pretty important: without them, no usable readings are possible – and they’re fixed inside the case…
Working on the construction of the case. I used Google SketchUp for the 1:1 construction to make sure I don’t forget anything. It’s pretty bad if you sit in front of an expensive case and realize that the switch doesn’t fit in because an electronic component is bigger than expected :-
I also improved the sensor orientation. Many users had problems positioning their hands correctly within the sensor matrix, so I tried to eliminate some “silent areas” in the array by crossing the beams and tilting the sensors slightly towards the user. I’m pretty proud of the easy solution with the distance-holding sticks :)
I really hoped that Ableton LIVE 6 could now handle sleep/wake issues. LIVE 4 was a pain in the ar$e because it lost all MIDI connections after waking up from sleep, or whenever an application crashed that used virtual MIDI connections to remote-control LIVE. Last year I had to program a launcher application with virtual ports to make sure all programs were started and terminated in the right order. I even had to deal with sending virtual keystrokes to suppress that annoying “do-you-really-want-to-quit-your-changed-file” dialogue… all these workarounds just to enable a seamless and easy “power-on / running / power-off” system! If LIVE were an open-source application, this would probably require just 10 lines of additional code to re-initialize the MIDI devices 8-/
That’s what I really hate about closed-source products…
However, the good thing with LIVE 6 is that you can now disconnect a MIDI device and it finally restores all MIDI port settings when you plug it back in; unfortunately, it still does not handle sleep/wake well: the device simply no longer works after sleeping unless you restart the application – or even worse: it crashes on wakeup.
But at least it’s a bit of progress:
Now the launcher app just has to get notified of an upcoming sleep and quit LIVE beforehand – as well as relaunch it after waking up. The launcher also sends a START system realtime event to immediately start the LIVE set. Chances are good I’ll release my launcher application with source code in a few days. Maybe this is of some use for other people who are trying to use this otherwise very nice application for long-term exhibitions. I mean, you can’t expect the person running an exhibition space to set up a complex MIDI prefs dialogue, can you?
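For the record, the sleep/wake part of such a launcher is quite small on Mac OS X: the IOKit power-management notifications tell you when the machine is about to sleep and when it has powered on again. The sketch below only shows that hook – quitting/relaunching LIVE and sending the MIDI START byte (0xFA) are left as placeholder functions, and the whole thing is just an illustration, not my actual launcher code.

#include <IOKit/pwr_mgt/IOPMLib.h>
#include <IOKit/IOMessage.h>
#include <CoreFoundation/CoreFoundation.h>

static io_connect_t gRootPort;

static void QuitLive(void)     { /* tell LIVE to quit (keystrokes / AppleScript) */ }
static void RelaunchLive(void) { /* relaunch LIVE and send a MIDI START (0xFA)   */ }

static void PowerCallback(void *refcon, io_service_t service,
                          natural_t messageType, void *messageArgument)
{
    switch (messageType) {
        case kIOMessageCanSystemSleep:
            IOAllowPowerChange(gRootPort, (long)messageArgument);  // don't veto sleep
            break;
        case kIOMessageSystemWillSleep:
            QuitLive();                                            // shut LIVE down first
            IOAllowPowerChange(gRootPort, (long)messageArgument);
            break;
        case kIOMessageSystemHasPoweredOn:
            RelaunchLive();                                        // bring everything back up
            break;
    }
}

int main(void)
{
    IONotificationPortRef notifyPort;
    io_object_t           notifier;

    gRootPort = IORegisterForSystemPower(NULL, &notifyPort, PowerCallback, &notifier);
    CFRunLoopAddSource(CFRunLoopGetCurrent(),
                       IONotificationPortGetRunLoopSource(notifyPort),
                       kCFRunLoopCommonModes);
    CFRunLoopRun();
    return 0;
}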
I finished the launcher app. No more MIDI, just sending virtual keystrokes. So far everything works okay (autostart, sleep, wakeup, shutdown). I’ll have to test more thoroughly though…
Testing sleep/wake/shutdown is no fun.
99% of my time I’m waiting 8-
I invested some time in a general document-based project setup with MIDI input (output is possible, but not connected). This may be a fine template for future applications.
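The MIDI input part of such a template boils down to a CoreMIDI client, an input port and a read callback. A stripped-down sketch (not the actual template; error handling and the document plumbing are omitted):

#include <CoreMIDI/CoreMIDI.h>
#include <CoreFoundation/CoreFoundation.h>
#include <stdio.h>

// called by CoreMIDI on its own high-priority thread for every incoming packet list
static void ReadProc(const MIDIPacketList *pktlist, void *readRefCon, void *srcRefCon)
{
    const MIDIPacket *packet = &pktlist->packet[0];
    for (UInt32 i = 0; i < pktlist->numPackets; i++) {
        if (packet->length >= 3)
            printf("status %02X data %02X %02X\n",
                   packet->data[0], packet->data[1], packet->data[2]);
        packet = MIDIPacketNext(packet);
    }
}

int main(void)
{
    MIDIClientRef client;
    MIDIPortRef   inPort;

    MIDIClientCreate(CFSTR("kIII Template"), NULL, NULL, &client);
    MIDIInputPortCreate(client, CFSTR("Input"), ReadProc, NULL, &inPort);

    if (MIDIGetNumberOfSources() > 0)                        // connect the first source found
        MIDIPortConnectSource(inPort, MIDIGetSource(0), NULL);

    CFRunLoopRun();                                          // keep the process alive
    return 0;
}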
Because the second SpeakJet hasn’t arrived yet and I have to wait for some other information, I’ll continue working on a Mac Speech Control Interface (kIII or kX) based on the jaw/tongue principle of kII.
If I can finish it in the next few days, this’ll surely be a nice audio output. I also thought about an AudioUnit, but that’s a bit over my head atm. A stand-alone app will do for now.
Geez, I just got notified that my second submission, the Kontaktstation 3.0, was also selected to be shown at Cybersonica :-D
I just handed that in as a backup and never thought of being selected twice. I feel very honoured. Lucky me. I will open up another post about that…
kIII is talking!
I switched from the NSSpeechManager to the Carbon Speech Manager, which offers far better control over the selected voices.
Features:
– multi-document application (one document per voice); the number of documents (= voices) is only limited by RAM
– MIDI input port & channel selection to enable per-document voice control & triggering
– full voice parameter control (pitch, rate/speed, pitch-modulation, volume)
– phrase editor
– supports all input modes (text, phon, tune)
A first beta-version will be released soon.
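For a taste of how little the Carbon Speech Manager needs, here is a minimal sketch – a bare illustration with the default voice and placeholder values, not the kIII source (compile with -framework ApplicationServices):

#include <ApplicationServices/ApplicationServices.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    SpeechChannel chan;
    const char *phrase = "[[inpt PHON]] AEplIHkEYSIXn [[inpt TEXT]]";

    NewSpeechChannel(NULL, &chan);         // NULL = system default voice

    SetSpeechPitch(chan, Long2Fix(46));    // pitch (placeholder, e.g. derived from MIDI)
    SetSpeechRate(chan, Long2Fix(180));    // speaking rate (placeholder)

    SpeakText(chan, phrase, strlen(phrase));

    while (SpeechBusy())                   // wait until the channel has finished
        usleep(10000);

    DisposeSpeechChannel(chan);
    return 0;
}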
In the meantime, I finished the new ACSensorizer 0.4.3 (tba) for the Kontaktstation. I’m currently working on the phoneme implementation of kIII. The case of kII will be transparent; it looks like it’ll be ready tomorrow:
The case is finished, now I’m trying to stuff it with the electronics. I have to redo some soldering. As it’s likely that the additional equipment (power supply, mixer and computer for the additional LIVE sounds) will be on the floor rather than under the ceiling, I plan to add some plugs, so that I can operate the device either normally or upside down, depending on where the cables go. It’s tricky to get the right position for the sensors in the case :-
Another nice side-effect of the transparent case is that I can now add an LCD inside that shows the current harmony settings. But I must not forget to keep the orientation of the LCD changeable, too! ;)
There is no good news about the kIII with Mac OS speech output. I implemented a SpeakJet-to-MacOS-Speech phoneme table and connected some wider-range distance sensors to the software, but the phonemes are not triggering very smoothly. It sounds more like a cat than a human/robot :( The problem is that I’m currently triggering one phoneme after another, and this isn’t working very well: either there are gaps (too fast), or the speech output lags behind and is still enunciating even though the input stopped long ago. In order to produce gapless phonemes, I’d firstly have to fill a buffer and then process this buffer for enunciation, and secondly I’d have to imitate the SpeakJet’s MSA/SCP input functionality. In MSA mode, events get added to the buffer and are processed first-in-first-out, while SCP commands are executed at once (e.g. clear buffer). Though working with a buffer implies that it’s not realtime anymore. Timing issues are always tricky to solve…
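The buffering idea itself is simple. A rough sketch of what I mean (placeholder names, adjustable queue length – not the kIII implementation): MSA-style phoneme events are queued and spoken first-in-first-out, while SCP-style commands act on the queue immediately.

#define BUFFER_SIZE 32                 // adjustable queue length

static unsigned char queue[BUFFER_SIZE];
static unsigned int  head = 0, tail = 0, count = 0;

// MSA mode: phoneme events get added and are processed first-in-first-out
void EnqueuePhoneme(unsigned char phoneme)
{
    if (count == BUFFER_SIZE) return;          // queue full: drop (or block)
    queue[tail] = phoneme;
    tail = (tail + 1) % BUFFER_SIZE;
    count++;
}

// called whenever the speech engine is idle and ready for the next phoneme
int DequeuePhoneme(unsigned char *phoneme)
{
    if (count == 0) return 0;
    *phoneme = queue[head];
    head = (head + 1) % BUFFER_SIZE;
    count--;
    return 1;
}

// SCP mode: commands are executed at once, e.g. clearing the buffer
void ClearPhonemeBuffer(void)
{
    head = tail = count = 0;
}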
Anyway: I’ll concentrate on finishing the case for kII and solder a second SpeakJet board for the outside device, just in case. If there’s time left, I can try to implement some other input / buffer-based phoneme reproduction method for kIII. It seems a pure SpeakJet emulation on the Mac is not as straightforward as I thought. :- Maybe I should focus instead on the strengths of the Mac speech reproduction and its processing power compared to a tiny microcontroller. I’m also not very confident about the hands open/close concept if the sensors are in front of the user rather than above/below. Let’s see if I get some ideas in the next days.
The case is now completely finished. I added some plugs so I can operate the device the right way up or upside down. It can be completely taken apart. Still, I have to polish out some scratches and glue the PCBs. But in general it’s ready.
The nice thing, however, is that the sensors are pretty well adjusted. It is now possible to operate the device easily with one hand, which was a bit difficult with the 2.0 model, ’cause the sensors pointed straight up/down. The 2.1 sensors point slightly towards the user, so the thumb position is measured more precisely.
Hooray!
After having soldered a completely new SJ-Interface Board, I had everything I needed for a second machine (the device that’s to be operated from the outside). But before saying a final good-bye to the Mac OS Speech based kIII, I wanted to implement a buffering method and see if this could work. I added a phoneme buffer with adjustable size – and it works! Like a charm!
Now I can hear that the kII-generated voice is actually a lot cooler than what one gets from the SpeakJet. I just have to fine-tweak some parameters, but I guess this will be a very cool second option and a worthy version upgrade to kIII :)
I also have some ideas on how to improve the sensor matrix for a non-vertical input. This will be just a quick adaptation for the glass-front device; there seems to be lots of room for improvement with an updated version of kII or a smart ACSensorizer setup, but with just five days remaining, the kII application will do just fine…
Besides implementing phrase support and doing some fine-tuning of the voice output, I mainly worked on a casing for the outside device over the last few days. While waiting for the paint to dry, I recorded a sample of the kIII speech output. Nice language :) Ugalayia Katchaja Puh.