Handy

Handy is a wearable interface that interprets tangible interaction for intangible experiences, using EMG signals and machine learning. Unlike currently popular hand-tracking interfaces, including those based on LiDAR and infrared cameras, which we have to consciously pose our bodies for, on-body sensing based on EMG signals allows for more intuitive and relaxed interactivity.


Prior Art

Handy

The idea was adapted from my earlier conceptual project, also named Handy, made with two classmates during my sophomore year. That project is a soft robotic device that helps stroke patients rehabilitate the muscles and nervous system of the hand, through closed-loop interactions based on regular motion simulation and movement assistance. It was the sole James Dyson Award National Winner (China, 2019). While that device only listened to binary signals produced by each muscle, the current research pushes further and builds classifier models to interpret more complex interactions.

Learn more on the James Dyson Award website

Motivation

When we move, a lot more is happening than what can be explicitly observed: signals come out of the brain and are converted into electrical inputs that muscles can understand and execute. The signal that we can record on the skin is electromyography (EMG). According to prior research, EMG data can be collected and analyzed to identify certain movements. For example, AlterEgo (A. Kapur et al.) can translate indiscernible silent speech into words using machine learning on the collected data.

But this concept and technology haven't yet been applied to the identification and interpretation of body movements, while current body tracking and hand tracking are mostly based on "off-body" sensors: LiDAR sensors, infrared cameras, or image-based machine learning models like PoseNet. VR headsets such as the Oculus Quest now also use similar technology to enable hand tracking as an alternative controller. These methods, especially those based solely on cameras, are highly accessible, easy to implement, and have an acceptable learning curve, but they also have certain limitations:

Limitations (see the Oculus Developer Documentation):

  • Occlusion, whether of body and hands or of fingers and palm (for more detailed gestures), makes it hard to reliably detect the correct gestures.
  • Cameras and other off-body sensors all have a limited detection area, and any movement outside of that volume will not be recorded.
  • Users tend to intentionally and consciously pose for those sensors, which is harmful to the experience.

To address the above limitations and propose an alternative interface that allows more natural interaction, I used EMG sensors and machine learning models to differentiate hand and finger gestures. Data were collected with an OpenBCI Cyton board and sticky electrodes.

Development

To test the feasibility of the proposal, I set up the equipment and collected data with only one channel (detecting one muscle) connected:

Early board testing

We can clearly tell when the subject used the muscle and when the subject was idle. Similar data would be fed to the machine learning model to discriminate between these states.

Electrodes

One Cyton board can read and send up to 4 channels of EMG data, which means 4 different muscle groups. After learning about the hand and arm muscle groups, I decided to use two channels for early-stage development and simple gestures, and all 4 channels for the final implementation and more complex, detailed movement detection. The electrodes are attached to the upper limb and hand as follows:

2-Channel Implementation: two-channel placement sketch from Wikipedia
4-Channel Implementation: four-channel placement sketch from Wikipedia

After several trials of data collection and testing, I settled on the electrode placement shown below. For the interactions targeted at this stage, scroll up/down and zoom in/out, channels one and two are mainly used to scroll, and channels three and four are mainly used to zoom:

Electrode Placement

Signal Processing

The models were trained with ml5.neuralNetwork. The sampling rate of the OpenBCI Cyton board is around 200 Hz (much higher than the default 60 fps of the p5.js draw() loop), so I receive a lot of data points every second for each channel.
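
As a rough illustration, the ml5 side of the setup looks something like the sketch below; the labels, training options, and function names are my own assumptions, not the exact configuration used (the sketch assumes p5.js and ml5.js are loaded in the page).

```javascript
// Rough sketch of the ml5.neuralNetwork training setup (options, labels, and
// function names are illustrative assumptions, not the exact ones used).
let brain;

function setup() {
  brain = ml5.neuralNetwork({
    task: 'classification', // feature values in, one predicted gesture label out
    debug: true,            // shows the loss curve while training
  });
}

// Called for every processed sample while recording training data
function record(features, label) {
  brain.addData(features, [label]); // features: values derived from one channel's signal
}

function trainModel() {
  brain.normalizeData();
  brain.train({ epochs: 60 }, () => console.log('training finished'));
}

// At runtime, classify the latest feature vector
function predict(features) {
  brain.classify(features, (err, results) => {
    if (err) return console.error(err);
    console.log(results[0].label, results[0].confidence); // most likely gesture
  });
}
```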

I first tried to send the data directly to ml5.neuralNetwork; however, the training performed very poorly no matter how I adjusted training options such as the learning rate or the number of epochs. Then I plotted the signals I received and found the reason: even though it is relatively easy for us to tell the "active" parts of the signal plot from the "idle" ones, the data points are really close to each other and jump back and forth, which makes it hard for the machine learning model to distinguish them (see the figure below the loss curves).

Clean Signal Curve: preprocessed signal received from the Cyton

Signal Curve

Data points at the same position in adjacent periodic signal chunks

Then I realized the "active" part of the signal has "destroyed periods": the signal no longer repeats as it normally does and has a larger amplitude. Thus, instead of sending the absolute value of the data into the model, I sent the following three values (see the sketch after this list):

  • the delta (absolute value) between the point and its predecessor
  • the delta (absolute value) between the point and the point 12 points ahead of it
  • the standard deviation of the point together with the 11 points ahead of it (f = 12)
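
In code, that per-sample feature extraction looks roughly like this (a minimal sketch with my own names, reading "ahead of it" as the preceding samples in the stream, with frame size f = 12):

```javascript
const F = 12; // frame size for the windowed features

// Standard deviation of an array of numbers
function stdDev(values) {
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  const variance = values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
  return Math.sqrt(variance);
}

// samples: raw EMG values of one channel; i: index of the current point (i >= F)
function extractFeatures(samples, i) {
  const deltaPrev = Math.abs(samples[i] - samples[i - 1]);   // delta to the previous point
  const deltaFrame = Math.abs(samples[i] - samples[i - F]);  // delta to the point 12 samples back
  const windowStd = stdDev(samples.slice(i - F + 1, i + 1)); // std of the point and the 11 before it
  return [deltaPrev, deltaFrame, windowStd]; // sent to the model instead of the raw value
}
```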

And the training results and predictions work drastically better than before:

Training Performance

Due to the limited time, I wasn't able to apply appropriate DSP (digital signal processing) filters to the signal streamed from the board, which could help the machine learning model perform much better at signal recognition and gesture classification. The OpenBCI GUI (normally used for quick plotting and connection checking, and based on Processing!) applies the filters automatically and provides options for re-streaming the filtered data, but I haven't figured out the proper way to establish a connection with it.
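
For reference, one typical filter to start with would be a mains-frequency notch applied per channel before feature extraction. Below is a sketch of a standard biquad notch filter (coefficients from the common audio-EQ cookbook formulas); the 200 Hz sample rate, 60 Hz notch frequency, and Q value are assumptions on my part, and this is not the filter chain the OpenBCI GUI applies.

```javascript
// Sketch of a biquad notch filter to suppress mains interference.
// Sample rate, notch frequency, and Q are assumed values.
function makeNotchFilter(sampleRate = 200, notchFreq = 60, q = 4) {
  const w0 = (2 * Math.PI * notchFreq) / sampleRate;
  const alpha = Math.sin(w0) / (2 * q);
  const cosW0 = Math.cos(w0);
  const a0 = 1 + alpha;
  // Normalized biquad coefficients for a notch response
  const b0 = 1 / a0;
  const b1 = (-2 * cosW0) / a0;
  const b2 = 1 / a0;
  const a1 = (-2 * cosW0) / a0;
  const a2 = (1 - alpha) / a0;

  let x1 = 0, x2 = 0, y1 = 0, y2 = 0; // filter state (previous inputs/outputs)
  return function filterSample(x0) {
    const y0 = b0 * x0 + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2;
    x2 = x1; x1 = x0;
    y2 = y1; y1 = y0;
    return y0;
  };
}

// Usage: one filter instance per channel, applied to each incoming sample
// const notchCh1 = makeNotchFilter();
// const filtered = notchCh1(rawSample);
```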

Interface

For the first stage of development, I tried to accomplish the following three interactions with the interface (all of them framed as classification tasks for the models):

Interactions

The third interaction, which intends to create a "number board" inside the palm, hasn't been achieved yet; it is unstable even when recognizing only "1" and "9", the two numbers farthest apart on the board. I'll keep working on it and see whether the results improve after applying the filters or with other methods. I also modularized the system so that each model only needs to handle a very simple, specific task, and it is easy to extend the applications by training new models and adding them to the pipeline.

Model Structure
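
As an illustration of that modular structure, the pipeline could be wired roughly as below; the model file paths, label names, confidence threshold, and handler functions are hypothetical.

```javascript
// Sketch of the modular pipeline: one small model per task, trained and saved
// separately, then loaded and run side by side (paths and labels are hypothetical).
const scrollModel = ml5.neuralNetwork({ task: 'classification' });
const zoomModel = ml5.neuralNetwork({ task: 'classification' });

function setup() {
  scrollModel.load('models/scroll/model.json', () => console.log('scroll model ready'));
  zoomModel.load('models/zoom/model.json', () => console.log('zoom model ready'));
}

// channelFeatures: array of per-channel feature vectors, one per electrode channel
function onFeatures(channelFeatures) {
  // Channels 1-2 drive scrolling
  scrollModel.classify(channelFeatures.slice(0, 2).flat(), (err, results) => {
    if (!err && results[0].confidence > 0.8) handleScroll(results[0].label); // e.g. 'up' | 'down' | 'idle'
  });
  // Channels 3-4 drive zooming
  zoomModel.classify(channelFeatures.slice(2, 4).flat(), (err, results) => {
    if (!err && results[0].confidence > 0.8) handleZoom(results[0].label); // e.g. 'in' | 'out' | 'idle'
  });
}

// Placeholder handlers; a new interaction only needs a new model plus a new handler
function handleScroll(label) { console.log('scroll:', label); }
function handleZoom(label) { console.log('zoom:', label); }
```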

The sensors are meant to be enclosed in a glove or a wearable device such as a smartwatch. More possibilities of interaction and form are to be explored based on the current results.

References

  1. Electromyography - Wikipedia
  2. Muscles of the Upper Limb - Wikipedia
  3. Arm - Wikipedia, Upper Limb - Wikipedia
  4. AlterEgo - MIT Media Lab, Paper
  5. OpenBCI
  6. Techniques of EMG signal analysis: detection, processing, classification and applications
  7. EMG Signal Classification for Human Computer Interaction: A Review (Table 1: Summary of major methods used for EMG classification in the field of HCI)
  8. A new means of HCI: EMG-MOUSE
  9. Design for Hands - Oculus Developer Documentation