Introduction: Kinematic performance metrics during robotic surgery have been linked to clinical outcomes. Thus far, the kinematic data we have analyzed, in the form of automated performance metrics (APMs), have been summarized over specific steps of a surgical procedure and focused on economy of movement (i.e., efficiency). Herein, we evaluate, for the first time, raw (unprocessed) kinematic data during robotic simulation, utilizing artificial intelligence (AI) methods to predict Mimic’s composite score and select psychomotor errors.
Methods: Our analysis of raw kinematic data centered on a single needle-driving task (“Basic Suture Sponge”) on the Mimic Technologies FlexVR platform. Eleven participants (surgeons and non-surgeons) completed the simulation exercise 5-11 times each. For each exercise, spatial x, y, z coordinates of the camera and instruments were collected at 30 Hz. These logged data were then used to infer “micro-displacements” along each degree of freedom and the overall spatiotemporal instrument trajectory during the exercise. Psychomotor errors and the composite “M Score” were reported by the simulator. We utilized several sequential deep learning algorithms, including Gated Recurrent Unit (GRU), vanilla Recurrent Neural Network (RNN), and Long Short-Term Memory (LSTM) networks, to map the raw data and predict the composite score and general surgical skills: needle targeting and instrument collisions.
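To make the preprocessing step concrete, the following is a minimal sketch of how per-frame “micro-displacements” and a trajectory summary could be derived from logged 30 Hz x, y, z coordinates. The array layout, function names, and sample trajectory are illustrative assumptions, not the simulator’s actual log format or the study’s pipeline.

```python
import numpy as np

SAMPLE_RATE_HZ = 30  # assumed logging rate, per the Methods description


def micro_displacements(xyz):
    """xyz: (T, 3) array of instrument positions at consecutive frames.
    Returns (T-1, 3) per-frame deltas along each spatial degree of freedom."""
    return np.diff(xyz, axis=0)


def path_length(xyz):
    """Total distance travelled along the trajectory (sum of frame-to-frame
    Euclidean step lengths)."""
    return float(np.linalg.norm(micro_displacements(xyz), axis=1).sum())


# Toy example: three frames moving one unit along x per frame.
traj = np.array([[0.0, 0.0, 0.0],
                 [1.0, 0.0, 0.0],
                 [2.0, 0.0, 0.0]])
print(path_length(traj))  # 2.0
```

Sequences of such per-frame deltas (rather than summary statistics) are the kind of raw input a GRU, RNN, or LSTM can consume one timestep at a time.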
Results: Sixty simulation sessions were divided into training, validation, and testing cohorts of 37, 12, and 11 sessions, respectively. The mean ±SD composite “M score” was 925.75 ±324.85. Predicting composite simulation score: the LSTM network best predicted the composite “M score,” with a mean absolute error of 273.09 (29%). Predicting mistargeting of needle: GRU and RNN learned best from the dataset (accuracy=76%; p=0.38), achieving 6% higher accuracy than the base class ratio in the dataset (30% perfect/near-perfect needle targeting [0-1 misses] vs. 70% imperfect targeting [>1 misses]). Predicting instrument collision: RNN outperformed every other model on this task with 64.7% accuracy (11.1% above the base classifier; p=0.18), followed by LSTM and GRU with 60% and 58.8% accuracy, respectively.
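As a sanity check on how the relative metrics above relate to the absolute numbers, the short calculation below reproduces the ~29% relative error and the 6-percentage-point gain over the majority-class baseline. The values are those reported in this abstract; nothing here is recomputed from raw data.

```python
# Composite-score regression: relative size of the mean absolute error.
mean_m_score = 925.75
mae = 273.09
relative_error = mae / mean_m_score  # ~0.29, i.e., the ~29% reported

# Needle-mistargeting classification: gain over always predicting the
# majority class (70% imperfect targeting).
base_rate = 0.70
model_accuracy = 0.76
improvement = model_accuracy - base_rate  # 0.06 = 6 percentage points

print(round(relative_error, 2), round(improvement, 2))
```

Comparing against the majority-class base rate is the natural baseline here, since a classifier that always predicts “imperfect targeting” would already reach 70% accuracy.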
Conclusions: Our preliminary results suggest that the “micro-displacement” data captured during a simulation exercise can, at minimum, moderately capture aspects of suturing psychomotor skill. With further optimization, AI methods may be able to automate psychomotor skills assessment.