David Watkins-Valls, Chaiwen Chou, Caroline Weinberg, Jacob Varley, Kenneth Lyons, Sanjay Joshi, Lynne Weber, Joel Stein, and Peter Allen

Columbia Robotics Lab


Abstract

This work describes a new human-in-the-loop (HitL) assistive grasping system for individuals with varying levels of physical capability. We investigated the feasibility of using four potential input devices with our assistive grasping system interface, using able-bodied individuals to define a set of quantitative metrics that could be used to assess an assistive grasping system. We then used these measurements to create a generalized benchmark for evaluating the effectiveness of any arbitrary input device to a HitL grasping system. The four input devices were a mouse, a speech recognition device, an assistive switch, and a novel sEMG device developed by our group that was placed either on the subject's forearm or behind their ear. These preliminary results provide insight into how different interface devices perform in generalized assistive grasping tasks and also highlight the potential of sEMG-based control for severely disabled individuals.


Video


Results

For the user study, a trial was considered successful if the user grasped an object and carried it to the other location on the table. A trial was counted as a failure if the system could not recognize the object after three attempts or if the arm failed to pick and place the object. Every subject used the mouse first in the experiment, so its success rates were lower than those of the other input devices, which we attribute to unfamiliarity with the interface. The sEMG device was randomly assigned to be placed either behind the subject's ear or on their forearm; users with the device behind their ear were slightly more successful than those with it on their forearm. In the final stage of the trial for each input device, each user was assigned an object from the YCB object database\cite{YCBDataset} to grasp. Users had no difficulty picking up the various sized cubes but showed reduced performance with the YCB object. On average, users successfully picked up an object 94.13% of the time.
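To make the success metric concrete, the sketch below shows one way per-device success rates could be tallied from simple trial logs. The record format and device names are illustrative only and are not taken from our released code.

from collections import defaultdict

# Hypothetical per-trial records: (input_device, object_name, succeeded).
# A trial counts as a success only if the object was grasped and carried
# to the target location; three failed recognition attempts or a failed
# pick-and-place mark it as a failure.
trials = [
    ("mouse", "block_1", True),
    ("mouse", "ycb_object", False),
    ("semg_behind_ear", "ycb_object", True),
    # ... one entry per trial in the study
]

def success_rates(records):
    counts = defaultdict(lambda: [0, 0])  # device -> [successes, total trials]
    for device, _obj, succeeded in records:
        counts[device][1] += 1
        if succeeded:
            counts[device][0] += 1
    return {device: 100.0 * s / n for device, (s, n) in counts.items()}

for device, rate in success_rates(trials).items():
    print(f"{device}: {rate:.2f}% successful")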

The timing results are shown in the table below. The major takeaway is that most users had very little difficulty using any of the four input devices. The YCB object trials took much longer than the block trials because grasp calculation took considerably longer for the YCB objects, which inflated the time users spent waiting to choose a grasp. Occasionally users had difficulty with the interface, such as the sEMG device needing recalibration or a peripheral crashing; in those cases the timer was reset and the subject was asked to redo the trial.

In addition to this added grasp calculation time, users often took much longer with the YCB object because point cloud error sometimes prevented any valid grasps from being calculated. In these circumstances we had the user rerun object recognition on the scene to obtain a better view of the object. Moving forward, we will likely rerun vision automatically whenever no valid grasps are found and improve our grasp-finding methodology. All four devices had similar times across the trials, which shows that the sEMG device is as fast as, and in some cases faster than, the other three input devices, validating it as a useful input device for a HitL system.
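The automatic vision rerun mentioned above could take the form of a simple retry loop around recognition and grasp planning. The sketch below is only illustrative; run_object_recognition and plan_grasps are placeholder callables, not functions from the released crui_mico_ws code.

MAX_VISION_RETRIES = 3

def find_valid_grasps(scene, run_object_recognition, plan_grasps):
    # If grasp planning returns nothing (e.g. due to point cloud error),
    # re-run object recognition for a fresh view of the scene instead of
    # asking the user to trigger recognition manually.
    for _attempt in range(MAX_VISION_RETRIES):
        detected_objects = run_object_recognition(scene)  # segment / recognize the scene
        grasps = plan_grasps(detected_objects)            # returns a (possibly empty) list
        if grasps:
            return grasps
    return []  # fall back to user intervention after the retries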

One important thing to note is that the training times for the mouse and the sEMG device were the longest. The mouse time included an introduction to the system, which inflated the training time in this case; the mouse was also the first device each user had a chance to use. The sEMG device took additional time because it had to be calibrated, and its training also involved finding a valid electrode location, showing users the interface, and walking them through how the device would control it. Training users on the Amazon Echo and the Ultimate Switch took substantially less time, as these devices were easy to use and very similar in use to the mouse. However, once users became familiar with each device, the timings were all comparable.

Overall, the users understood how they were supposed to navigate the system and were aware of what the system was doing at any given point in time. From our user studies we were able to verify that our improvements to the new system had the intended effect, and several new areas for improvement became apparent.

One lesson learned from the experiments was that displaying more of the current scene would be useful to the user; for example, several participants suggested overlaying an image of the scene on top of the user interface. Another suggestion was to show the objects from the user's perspective, since many users were confused by the orientation of the grasp relative to the object and found the display counterintuitive.
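As a rough illustration of the overlay suggestion, a camera image of the scene could be alpha-blended onto the rendered interface. The snippet below is a minimal sketch assuming OpenCV and BGR image arrays; it is not part of the released interface.

import cv2

def overlay_scene(ui_frame, camera_frame, alpha=0.4):
    # Blend a camera image of the scene onto the rendered interface.
    # Both inputs are BGR images (numpy arrays); alpha controls how
    # strongly the camera view shows through the UI.
    h, w = ui_frame.shape[:2]
    camera_resized = cv2.resize(camera_frame, (w, h))
    return cv2.addWeighted(ui_frame, 1.0 - alpha, camera_resized, alpha, 0)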

Activity    | Mouse (15 trials) | Alexa (15 trials) | Switch (15 trials) | sEMG forearm (7 trials) | sEMG behind ear (8 trials) | Average
Block 1     | 100%              | 100%              | 100%               | 100%                    | 100%                       | 100%
Block 2     | 100%              | 100%              | 100%               | 100%                    | 100%                       | 100%
Block 3     | 100%              | 100%              | 100%               | 100%                    | 100%                       | 100%
YCB Object  | 66.67%            | 80%               | 80%                | 71.43%                  | 87.50%                     | 76.53%
Average     | 92%               | 95%               | 95%                | 93%                     | 97%                        | 94.13%
Success rate of grasping objects for each activity and input device, given as the percentage of successful grasps out of the number of trials listed for each device.
Activity          | Mouse (s) | Alexa (s) | Switch (s) | sEMG forearm (s)
explain interface | 169.58    | 80.76     | 78.97      | 314.15
user block 1      | 20.5      | 18        | 20.5       | 18.0
robot block 1     | 63.7      | 40.87     | 48.05      | 53.01
user block 2      | 33.43     | 23        | 38.5       | 27.3
robot block 2     | 60.39     | 53.52     | 49.548     | 50.07
user block 3      | 21.5      | 17        | 28         | 13
robot block 3     | 61.18     | 60.39     | 74.335     | 68.27
user YCB bottle   | 23        | 74.14     | 77.09      | 124.24
robot YCB object  | 73.39     | 40.91     | 67.71      | 65.7
Average times, in seconds, for the user to select an object and grasp, followed by the time taken by the robot to execute the grasp. The robot was consistent, taking roughly 50-70 seconds to execute a grasp.
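For reference, the user-selection and robot-execution phases can be timed separately with a small helper like the one sketched below; the phase names and structure are illustrative rather than taken from our experiment scripts.

import time

class PhaseTimer:
    # Accumulate wall-clock time for the user-selection phase
    # (choosing an object and grasp) separately from the robot-execution phase.
    def __init__(self):
        self.totals = {"user": 0.0, "robot": 0.0}
        self._phase = None
        self._start = None

    def start(self, phase):
        self.stop()                      # close any phase already running
        self._phase = phase
        self._start = time.monotonic()

    def stop(self):
        if self._phase is not None:
            self.totals[self._phase] += time.monotonic() - self._start
            self._phase = None

# Example usage for one block trial:
timer = PhaseTimer()
timer.start("user")    # user selects the object and a grasp
timer.start("robot")   # arm plans and executes the grasp
timer.stop()
print(timer.totals)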

Downloads

Source Code

https://github.com/crlab/crui_mico_ws

Protocol

ModifiedBoxandBlocksProtocol.pdf


Citation

Arxiv link here