Introduction: In our previous work, needle-driving gestures during the vesicourethral anastomosis (VUA) of robotic radical prostatectomy were not only classified based on instrument hand, needle grasp, and wrist rotation, but also associated with tissue tears and outcomes (Fig. 1a). Herein, we train and validate deep learning-based computer vision (CV) to automate the identification of suturing gestures for needle-driving attempts during the VUA.
Methods: Two independent raters manually identified “ground truth” gestures; inter-observer variability was measured with Cohen’s kappa coefficient. An AI agent comprising CV, sequential, and deep learning models was built to distinguish surgical videos of Gesture 1 (G1) vs Gesture 2 (G2), and G1 vs Gesture 7 (G7), during the VUA (Fig. 1b). Step 1: Videos were sorted into training, validation, and testing sets (G1 vs G2: 122 training, 31 validation, 31 test; G1 vs G7: 161 training, 41 validation, 40 test). Steps 2-3: Each video was uniformly sampled at 10 frames per second, and each frame was passed to Inception V3, a 42-layer deep learning network, to extract high-level features. Step 4: The resulting frame-feature sequences were processed with sequential models (Long Short-Term Memory [LSTM], Gated Recurrent Unit [GRU], and Recurrent Neural Network [RNN]). Step 5: The sequential-model output was passed to a neural network that classified the video as a gesture. A chi-square test compared classification performance against a random classifier (50% chance of each gesture).
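A minimal Python sketch of Steps 2-5 is shown below (assuming TensorFlow/Keras and OpenCV; the Inception V3 backbone and the three sequential-model families follow the description above, while layer sizes, the frame cap, and training hyperparameters are illustrative assumptions rather than the authors’ exact configuration):

```python
import cv2
import numpy as np
import tensorflow as tf

# Steps 2-3: frozen Inception V3 backbone (ImageNet weights) as a per-frame
# feature extractor; global average pooling yields one 2048-d vector per frame.
backbone = tf.keras.applications.InceptionV3(
    weights="imagenet", include_top=False, pooling="avg")

def extract_frame_features(video_path, sample_fps=10, max_frames=150):
    """Uniformly sample frames at ~sample_fps and embed each with Inception V3."""
    cap = cv2.VideoCapture(video_path)
    native_fps = cap.get(cv2.CAP_PROP_FPS) or sample_fps
    step = max(int(round(native_fps / sample_fps)), 1)
    frames, idx = [], 0
    while len(frames) < max_frames:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            frame = cv2.resize(frame, (299, 299))      # Inception V3 input size
            frames.append(frame.astype("float32"))
        idx += 1
    cap.release()
    batch = tf.keras.applications.inception_v3.preprocess_input(np.stack(frames))
    return backbone.predict(batch, verbose=0)           # shape: (n_frames, 2048)

# Steps 4-5: a sequential model (LSTM shown; GRU and SimpleRNN are drop-in
# swaps) followed by a dense output layer that emits the binary gesture label.
def build_classifier(seq_model="lstm", max_frames=150, feat_dim=2048):
    rnn_layer = {"lstm": tf.keras.layers.LSTM,
                 "gru": tf.keras.layers.GRU,
                 "rnn": tf.keras.layers.SimpleRNN}[seq_model]
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(max_frames, feat_dim)),
        tf.keras.layers.Masking(mask_value=0.0),         # ignore zero padding
        rnn_layer(64),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # e.g. G1 vs G2
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

In such a setup, zero-padded feature sequences from the training split would be fed to model.fit, with the validation split used for model selection and the held-out test split reserved for the reported accuracies.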
Results: Manual reviewers agreed on 81.20% of “ground truth” gestures (Cohen’s kappa = 0.721, p<0.001). G1 vs G2, which differ in needle grasp (over vs. under), was best distinguished by LSTM (accuracy=90.32%, p=0.002), compared with GRU and RNN (86.67% and 90.00%, respectively; p<0.04). For G1 vs G7, which differ in use of the left vs. right instrument, LSTM and RNN performed best, each achieving 92.50% accuracy (p<0.001), followed by GRU (82.50%, p=0.04).
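For reference, the two statistics reported above can be computed with standard libraries; the sketch below (assuming scikit-learn and SciPy, with hypothetical label vectors and counts, since the study’s exact test configuration may differ) illustrates Cohen’s kappa for inter-rater agreement and a chi-square test of test-set accuracy against the 50/50 random classifier:

```python
import numpy as np
from scipy.stats import chisquare
from sklearn.metrics import cohen_kappa_score

# Inter-observer variability: two raters' gesture labels for the same clips
# (hypothetical labels shown here).
rater_a = np.array([1, 1, 2, 1, 2, 2, 1, 2])
rater_b = np.array([1, 1, 2, 2, 2, 2, 1, 1])
kappa = cohen_kappa_score(rater_a, rater_b)

# Chi-square vs. a random classifier: observed correct/incorrect counts on the
# test set compared with the 50/50 split expected under chance
# (hypothetical counts; 28/31 correct corresponds to ~90.32% accuracy).
n_test, n_correct = 31, 28
chi2, p_value = chisquare(f_obs=[n_correct, n_test - n_correct],
                          f_exp=[n_test / 2, n_test / 2])
print(f"kappa={kappa:.3f}, chi2={chi2:.2f}, p={p_value:.4f}")
```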
Conclusions: Our results demonstrate CV’s ability to recognize the features that distinguish suturing gestures. Future work includes automated detection of each classified gesture and automated risk-assessment feedback based on gesture, tissue location (urethra/bladder neck clock position), and the likelihood of tissue trauma according to our “gestures-to-tissue tear” library.