### 4.1 The musical interface

In the electronic music domain, low frequency oscillators are periodic functions used to modulate sound synthesis or effect parameters. In ordinary hardware and software music interfaces, they can be selected from a set of predefined common waveforms (e.g., sawtooth, triangle) that represent the trend of the function within its period *T*. Once triggered, the chosen shape is looped to create cyclic automations on the music parameter, according to the way the image of the periodic function is mapped onto the range of values of the music parameter. Typically, this is done linearly, mapping the minimum and the maximum of the image to the minimum and the maximum parameter values, respectively.
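The linear mapping described above can be sketched as follows. This is a minimal illustration, not the implementation of any specific device; the function names and the example parameter range (200 to 2000, e.g., a filter cutoff in Hz) are assumptions made for the example.

```python
def triangle(phase: float) -> float:
    """Triangle wave over one period; phase in [0, 1), image in [0, 1]."""
    return 2.0 * phase if phase < 0.5 else 2.0 * (1.0 - phase)


def sawtooth(phase: float) -> float:
    """Rising sawtooth over one period; phase in [0, 1), image in [0, 1)."""
    return phase


def lfo_value(shape, t, period, p_min, p_max, f_min=0.0, f_max=1.0):
    """Loop `shape` with the given period and map its image linearly.

    The minimum of the image (f_min) is sent to p_min and the maximum
    (f_max) to p_max, as in the linear mapping described in the text.
    """
    phase = (t % period) / period              # position inside the current cycle
    normalized = (shape(phase) - f_min) / (f_max - f_min)
    return p_min + normalized * (p_max - p_min)


# Hypothetical usage: a triangle LFO with a 1 s period sweeping 200..2000.
cutoff = lfo_value(triangle, t=2.25, period=1.0, p_min=200.0, p_max=2000.0)
```

Because the phase is computed modulo the period, the same shape repeats indefinitely, which is what produces the cyclic automation.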

Some devices include graphic and parametric editors that allow the user to create custom periodic functions. The waveform can be drawn within its period starting from a constant flat line, and then adding breakpoints to arbitrarily change the steepness of the curve. In other editors the period domain is discretized into small intervals, in each of which a constant value for the function can be defined. At high discretization rates, this technique permits a good approximation of any waveform. Both breakpoint-based and interval-based techniques provide graphical feedback on the resulting functions, but this feedback is addressed only to the musician, since it is displayed on the devices she/he is operating. The audience, by contrast, can only perceive the sound that results from the choice of the low frequency oscillators. This lack of information is not crucial in sound synthesis, but it matters considerably when oscillators are used to modulate an effect parameter. In sound synthesis, the complex processing in which oscillators take part can make the function shape and progression hard to discern, hiding its contribution to the output. During effect modulation, on the contrary, the sound-function mapping is often straightforward, so the visual feedback of the oscillator, and its progression over time, can strongly appeal to the audience's sensory and emotional involvement. Furthermore, this decoupling of audio and visual feedback produces a gap between the sonic output and the gestures the artist performs to create or affect sounds, since turning knobs and pressing buttons can hardly be considered a clear metaphor for drawing periodic functions. This lack of a comprehensible connection can be easily perceived during both synthesis and effect modulation.
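The two editing techniques mentioned above (breakpoint-based and interval-based) can be illustrated with a short sketch. The function names and the data layout (a sorted list of `(x, y)` breakpoints, a list of per-interval levels) are assumptions chosen for the example, not the format of any particular editor.

```python
from bisect import bisect_right


def breakpoint_wave(points, x):
    """Piecewise-linear waveform defined by breakpoints.

    `points` is a list of (x, y) pairs sorted by x, with the first pair at
    the start of the period and the last pair at its end.
    """
    xs = [p[0] for p in points]
    i = bisect_right(xs, x) - 1                # segment containing x
    i = max(0, min(i, len(points) - 2))        # clamp to a valid segment
    (x0, y0), (x1, y1) = points[i], points[i + 1]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def interval_wave(levels, x, period):
    """Piecewise-constant waveform: the period is split into len(levels)
    equal intervals, each holding one constant value."""
    i = int((x % period) / period * len(levels))
    return levels[min(i, len(levels) - 1)]     # guard against rounding up


# Hypothetical shapes: a skewed triangle and a four-step staircase.
skewed = [(0.0, 0.0), (0.25, 1.0), (1.0, 0.0)]
steps = [0.0, 1.0, 0.5, 0.25]
```

Increasing the number of entries in `levels` corresponds to the higher discretization rate at which the interval-based technique approximates an arbitrary waveform.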

Exploiting the dynamic features of our robotic arm, we designed a novel haptic interface to create and refine cyclic waveforms. This system permits the physical drawing of the periodic functions that compose oscillators, by directly grasping and moving the robotic arm around a predefined center, arbitrarily varying the radius to affect the chosen music parameter (Figure 1). This approach guarantees a continuous coupling between the visual and the audio output for both the musician and the audience, and a direct metaphor that clarifies the artist's gestures.

As previously introduced, in common devices the periodic waveform is shown on a 2D Cartesian coordinate system, where *f*_{t}(*x*) ∈ [0, 1] and *x* ∈ [0, *T*). The interface we designed works, instead, on a 2D polar coordinate system, where *f*_{t}(*ϑ*) ∈ [0, *R*_{max}] and *ϑ* ∈ [0, 2*π*) (Figure 2). Compared to the use of Cartesian coordinates, this solution highlights the periodicity of the functions, which is represented by the continuous movement in space of the robot's hand; the hand can be grasped during each cycle to arbitrarily change its motion.
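As a small illustration of why the polar representation makes periodicity explicit, the sketch below stores one radius per angular sample and reads the function back at an arbitrary angle. The sampling scheme (uniformly spaced angles, linear interpolation) and the name `radius_at` are assumptions made for the example; the paper does not prescribe them.

```python
import math


def radius_at(radii, theta):
    """Evaluate a polar waveform at angle `theta` (radians).

    `radii` holds samples of the radius at uniformly spaced angles over
    [0, 2*pi). The angle is wrapped modulo 2*pi, and the last sample is
    interpolated with the first one, so the curve closes on itself.
    """
    k = len(radii)
    pos = (theta % (2.0 * math.pi)) / (2.0 * math.pi) * k
    i = int(pos)
    frac = pos - i
    r0 = radii[i % k]
    r1 = radii[(i + 1) % k]                    # wraps from last sample to first
    return r0 + frac * (r1 - r0)
```

In a Cartesian plot the wrap from the end of the period back to its start has to be imagined by the viewer; here it is simply the next point along the closed curve.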

The interface is composed of two elements: a generic controller/input device (e.g., a computer keyboard, a MIDI controller) and the robotic arm. Initially, the robot is in gravity compensation mode, and a given central point in the robot workspace acts as a virtual attractor. A set of forces only allows the user to move the arm along a predefined direction, where *ϑ* = 0, in order to select a suitable radius value. Once the desired value is reached, the user can trigger the robot movement by pushing the controller start button. The robot responds by starting to move around the center in a circular trajectory (initially with constant radius).

From this point on, any local modification of the radius is learnt on-line by the robot, which gradually becomes stiffer during the progressive refinement of the user's trajectory. When the user is satisfied with the resulting trajectory and/or with the audio feedback generated by the related modulation, she/he can release the arm, which will continue moving by repeating the learnt loop.

A haptic interaction occurs between the robot and the user whenever the latter decides to apply a modification to the executed trajectory. By touching the robot, the user experiences a force feedback whose intensity depends on the amplitude of the introduced perturbation (i.e., trajectory modification), through the stiffness and damping parameters of the controller. This force reflects the effort the user has to produce in order to apply the desired perturbation. The haptic feedback guides the user and his/her gestures during the musical task, connecting the performer's physical effort directly to the intensity and the velocity of the modifications to the music output. We believe this may increase the player's awareness of the interface and of its fine usage, and consequently pave the way to novel artistic expression.

### 4.2 Audio/visual setup

We placed the robot in front of a Powerwall (a large 4 × 2 m high-resolution display wall) to provide the user with visual feedback. While the robot is moving, a stereoscopic trail is projected onto the screen to visually represent (with a 3D depth effect) the trajectory of the robot end effector. This superimposition of real and virtual elements in a Hybrid Reality music environment has been proposed in [31], to enhance gestural music composition with interactive visuals. The system records the trail in real time and displays it as a virtual trajectory in the background when the user decides to start modulating another parameter. When the user pushes the button to create a new modulation, the robot stops cycling and moves again towards the center, under the influence of the virtual attractor. While the trail from the previous loop continues to cycle as a virtual trajectory (still affecting the related sound parameter), the robot's current trail color changes. The user can now set the starting radius for the next parameter modulation, creating a new trajectory that dynamically overlaps and intersects with the previous ones. This procedure can be repeated over time, to layer multiple modulations of different parameters and to visually superimpose the related trajectories, each created using the robot (Figure 3). Each trajectory is associated with a virtual memory slot, where the trail is saved, and with a previously selected set of device parameters, which are modulated according to the radius length. Thus, the user can choose which parameters to modulate by selecting the proper slot on the controller. Virtual trajectories saved into virtual memory slots can be stopped or recalled through the controller.

The precise alignment of the stereoscopic trails with the position of the robot's hand was made possible by the bidirectional connection between the system dedicated to robot control and the central workstation, which manages all the hardware and software devices that compose our setup. The main application running on the central workstation is VRMedia [32] XVR, a flexible free software package primarily meant for virtual environment design. Quick to program and extendible with custom modules, XVR uses a UDP connection to receive the current 3D position of the robot's hand, and works as an interface to convert and forward the control signals coming from the external controller.

One of the custom modules we developed for XVR allows receiving and transmitting OSC and MIDI signals from external hardware and software devices. The radius $r$ of both the robot trajectory and the virtual trajectories is translated into a numeric value according to the functions

$$g_{z}(r) = p_{\text{min}}^{w} + m_{z}(r)\left(p_{\text{max}}^{w} - p_{\text{min}}^{w}\right)$$

for OSC, and the functions

$$g_{z}(r) = \left\lfloor p_{\text{min}}^{w} + m_{z}(r)\left(p_{\text{max}}^{w} - p_{\text{min}}^{w}\right) \right\rfloor$$

for MIDI, with $r \in [0, R_{\text{max}}]$ and $m_{z}(r) \in [0, 1]$. The inner functions $m_{z}(r)$ apply an arbitrary mapping between domain and image, $z$ is the number of the current trajectory, and $p_{\text{max}}^{w}$ and $p_{\text{min}}^{w}$ are, respectively, the maximum and the minimum value for the $w$-th parameter. Each trajectory is associated with up to three parameters, $w_{\text{max}} = 3$, which are constantly updated and sent to predefined connected devices. By exploiting standard digital music communication protocols, the robotic interface can be easily integrated with more common electronic setups, making it possible to control different hardware and software devices; an example of such a composite setup was shown during the performance described in Section 5.
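The two mappings can be sketched directly from their definitions. The inner function is left arbitrary in the text, so the linear choice below (`linear_mapping`) is only an assumption for illustration, as are the function names and the example ranges.

```python
import math


def linear_mapping(r_max):
    """One possible inner function m_z: [0, R_max] -> [0, 1] (assumed linear)."""
    return lambda r: min(max(r / r_max, 0.0), 1.0)


def g_osc(r, m, p_min, p_max):
    """Continuous value sent over OSC: p_min + m(r) * (p_max - p_min)."""
    return p_min + m(r) * (p_max - p_min)


def g_midi(r, m, p_min, p_max):
    """Integer value sent over MIDI: the floor of the OSC mapping."""
    return math.floor(g_osc(r, m, p_min, p_max))


# Hypothetical usage with R_max = 0.5 m.
m = linear_mapping(0.5)
osc_value = g_osc(0.25, m, 200.0, 2000.0)      # continuous parameter
midi_value = g_midi(0.25, m, 0, 127)           # 7-bit controller range
```

The only difference between the two outputs is the floor operation, which reflects the integer-valued data carried by MIDI messages.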

### 4.3 Robot setup

The robot employed in this study is a *Barrett WAM*, a back-drivable arm with 7 revolute DOFs, controlled by inverse dynamics solved with the recursive Newton-Euler algorithm [33]. A gravity compensation force is added to the center of mass of each link. Tracking of a desired path in Cartesian space is ensured by a force command $F = m\ddot{\mathbf{x}}$, where $m$ is a virtual mass and $\ddot{\mathbf{x}}$ is a desired acceleration command.

Tracking is performed through a weighted sum of virtual mass-spring-damper subsystems, which is equivalent to a proportional-derivative controller with moving target $\hat{\mu}^{\chi}$:

$$\ddot{\mathbf{x}} = K^{\mathcal{P}}\left(\hat{\mu}^{\chi} - \mathbf{x}\right) - K^{\mathcal{V}}\dot{\mathbf{x}}, \quad \text{with} \quad \hat{\mu}^{\chi} = \sum_{i=1}^{K} h_{i}\,\mu_{i}^{\chi}. \tag{1}$$

The virtual attractors $\mu_{i}^{\chi}$ are initially distributed along a circle, following a trajectory determined by a fixed center $\mathbf{x}^{c}$, an orientation (direction cosine matrix) $\mathbf{R}^{c}$, and a series of $K$ points parameterized in planar polar representation $\{r_{i}, \theta_{i}\}_{i=1}^{K}$.

$\mu_{i}^{\chi}$, $K^{\mathcal{P}}$, and $K^{\mathcal{V}}$ are defined as

$$\mu_{i}^{\chi} = \mathbf{x}^{c} + \mathbf{R}^{c}\begin{bmatrix} r_{i}\cos(\theta_{i}) \\ r_{i}\sin(\theta_{i}) \\ 0 \end{bmatrix}; \quad K^{\mathcal{P}} = \mathbf{R}^{c}\begin{bmatrix} \kappa^{\mathcal{P}} & 0 & 0 \\ 0 & \kappa^{\mathcal{P}} & 0 \\ 0 & 0 & \kappa_{\perp}^{\mathcal{P}} \end{bmatrix}, \quad K^{\mathcal{V}} = \mathbf{R}^{c}\begin{bmatrix} \kappa^{\mathcal{V}} & 0 & 0 \\ 0 & \kappa^{\mathcal{V}} & 0 \\ 0 & 0 & \kappa_{\perp}^{\mathcal{V}} \end{bmatrix}, \tag{2}$$

where $\kappa^{\mathcal{P}}$ and $\kappa^{\mathcal{V}}$ are adaptive stiffness and damping gains in the plane of the circle, and $\kappa_{\perp}^{\mathcal{P}}$ and $\kappa_{\perp}^{\mathcal{V}}$ are constant gains in the direction perpendicular to the circle.
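Equations (1) and (2) can be put together in a short numerical sketch. It is written in plain Python with 3-element lists to stay self-contained; the helper names are hypothetical, and the uniform spacing of the angles $\theta_i$ is an assumption (the text only states that the attractors are initially distributed along a circle). The weights $h_i$ are passed in as given.

```python
import math


def mat_vec(M, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]


def mat_mul(A, B):
    """Product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def attractors(center, R, radii):
    """Attractors of Eq. (2): x_c + R_c [r cos(theta), r sin(theta), 0]^T.
    Angles are assumed uniformly spaced over [0, 2*pi)."""
    K = len(radii)
    out = []
    for i, r in enumerate(radii):
        theta = 2.0 * math.pi * i / K
        rotated = mat_vec(R, [r * math.cos(theta), r * math.sin(theta), 0.0])
        out.append([center[k] + rotated[k] for k in range(3)])
    return out


def gain_matrix(R, k_plane, k_perp):
    """Gain matrix of Eq. (2): R_c times diag(k_plane, k_plane, k_perp)."""
    D = [[k_plane, 0.0, 0.0], [0.0, k_plane, 0.0], [0.0, 0.0, k_perp]]
    return mat_mul(R, D)


def acceleration(x, dx, weights, mus, KP, KV):
    """Acceleration command of Eq. (1) for position x and velocity dx."""
    target = [sum(h * mu[k] for h, mu in zip(weights, mus)) for k in range(3)]
    error = [target[k] - x[k] for k in range(3)]
    p_term = mat_vec(KP, error)
    d_term = mat_vec(KV, dx)
    return [p_term[k] - d_term[k] for k in range(3)]


# Hypothetical usage: circle of radius 0.2 m in the x-y plane, 4 attractors.
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
mus = attractors([0.0, 0.0, 0.0], I3, [0.2] * 4)
KP = gain_matrix(I3, 100.0, 169.0)
KV = gain_matrix(I3, 20.0, 26.0)
acc = acceleration([0.1, 0.0, 0.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0],
                   mus, KP, KV)
```

Multiplying the resulting acceleration by the virtual mass $m$ gives the force command $F = m\ddot{\mathbf{x}}$ introduced at the beginning of this section.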

The variable scalar gains $\kappa^{\mathcal{P}}$ and $\kappa^{\mathcal{V}}$ are defined as

$$\kappa^{\mathcal{P}} = \begin{cases} \kappa_{\text{min}}^{\mathcal{P}} & \text{if } t = 0, \\ \kappa_{\text{min}}^{\mathcal{P}} + \left(\kappa_{\text{max}}^{\mathcal{P}} - \kappa_{\text{min}}^{\mathcal{P}}\right)\dfrac{t}{t_{\text{max}}} & \text{if } t \le t_{\text{max}}, \\ \kappa_{\text{max}}^{\mathcal{P}} & \text{otherwise,} \end{cases} \qquad \kappa^{\mathcal{V}} = 2\sqrt{\kappa^{\mathcal{P}}}. \tag{3}$$
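The gain schedule of Eq. (3) is simple enough to be written out directly. The default values below are the ones reported at the end of this section; the function names are hypothetical.

```python
import math


def stiffness(t, k_min=100.0, k_max=300.0, t_max=60.0):
    """In-plane stiffness of Eq. (3): linear ramp from k_min to k_max
    over t_max seconds, constant afterwards."""
    if t <= 0.0:
        return k_min
    if t <= t_max:
        return k_min + (k_max - k_min) * t / t_max
    return k_max


def damping(k_p):
    """In-plane damping of Eq. (3): twice the square root of the stiffness."""
    return 2.0 * math.sqrt(k_p)


# Hypothetical usage: gains halfway through the refinement phase.
k_p = stiffness(30.0)
k_v = damping(k_p)
```

For a unit virtual mass, a damping gain equal to twice the square root of the stiffness corresponds to critical damping, so the arm returns to the learnt trajectory without oscillating while it gradually becomes stiffer.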

The weights $h_{i}$ in (1) are used to switch between the different subsystems by following a periodic sequence. To ensure smooth and parameterizable transitions, we use a weighting mechanism based on a variant of *variable duration Hidden Markov model* representation [34]. The weights are defined at each iteration $n$ as $h_{i,n} = \frac{\alpha_{i,n}}{\sum_{k=1}^{K}\alpha_{k,n}}$, with initialization given by $\alpha_{i,1} = \pi_{i}$, and recursion given by $\alpha_{i,n} = \sum_{j=1}^{K}\sum_{d=1}^{d_{\text{max}}} \alpha_{j,n-d}\, a_{j,i}\, p_{i}(d)$. $\pi_{i}$ is the initial probability of being in state $i$. $a_{i,j}$ is the transitional probability from state $i$ to state $j$. $p_{i}(d)$ is a parametric state duration probability density function defined by a Gaussian distribution $p_{i}(d) = \mathcal{N}\left(d\,\Delta t;\ \mu_{i}^{\mathcal{D}}, \Sigma_{i}^{\mathcal{D}}\right)$. In particular, the state duration is discretized in intervals indicated with the index $d$. The mechanism shares similarities with the forward variable of a *Hidden Semi-Markov model* [35] in which only state duration information would be used (i.e., spatial information is discarded).
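The recursion for the weights can be sketched as follows. Several details are assumptions made for the example and are not stated in the text: terms with $n - d < 1$ are skipped, the transition matrix is a deterministic cycle (each state leads to the next one), all states share the same duration parameters, and the duration densities are normalized over $d = 1, \dots, d_{\text{max}}$ to keep the values bounded over long runs.

```python
import math


def gaussian(x, mean, var):
    """Univariate Gaussian density N(x; mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def duration_weights(n_steps, pi, a, mu_d, var_d, d_max, dt):
    """Weights h[n][i] obtained from the duration-only forward recursion.

    alpha[0] corresponds to iteration n = 1 (initialization with pi).
    """
    K = len(pi)
    p = [gaussian(d * dt, mu_d, var_d) for d in range(1, d_max + 1)]
    total = sum(p)
    p = [v / total for v in p]                 # assumed normalization over d
    alpha = [list(pi)]
    for n in range(1, n_steps):
        row = []
        for i in range(K):
            s = 0.0
            for d in range(1, d_max + 1):
                if n - d < 0:                  # no history before n = 1
                    break
                for j in range(K):
                    s += alpha[n - d][j] * a[j][i] * p[d - 1]
            row.append(s)
        alpha.append(row)
    return [[v / sum(row) for v in row] for row in alpha]


# Hypothetical usage: 3 states visited cyclically, starting in state 0.
K = 3
cyclic = [[1.0 if i == (j + 1) % K else 0.0 for i in range(K)] for j in range(K)]
h = duration_weights(10, [1.0, 0.0, 0.0], cyclic,
                     mu_d=0.06, var_d=0.02, d_max=5, dt=0.02)
```

Each row of `h` sums to one, so the moving target in Eq. (1) is a convex combination of attractors that shifts gradually from one attractor to the next.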

Parameters $m = 1\ [\text{kg}]$, $\kappa_{\perp}^{\mathcal{P}} = 169\ [\text{N/m}]$, $\kappa_{\perp}^{\mathcal{V}} = 26\ [\text{Ns/m}]$, $\kappa_{\text{min}}^{\mathcal{P}} = 100\ [\text{N/m}]$, $\kappa_{\text{max}}^{\mathcal{P}} = 300\ [\text{N/m}]$, $t_{\text{max}} = 60\ [\text{s}]$, $\mu_{i}^{\mathcal{D}} = 0.06\ [\text{s}]$, $\Sigma_{i}^{\mathcal{D}} = 0.02\ [\text{s}^{2}]$, $d_{\text{max}} = 5$, $K = 100$, and $\Delta t = 0.02\ [\text{s}]$ have been determined empirically based on the robot capabilities and the feedback of the performer.