This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Objectives

We assessed the predictive validity of the ProMIS hybrid laparoscopic simulator in a urology residency program.

Methods

Between June 2008 and December 2011, we trained 14 urology residents on ProMIS, measuring 5 basic laparoscopic tasks (peg transfer, pattern cutting, EndoLoop placement, extracorporeal suturing, and intracorporeal suturing). We then compared their last performance on ProMIS with their first performance on a porcine laparoscopic nephrectomy model. Two independent urologic surgeons with laparoscopic experience rated the residents’ performance on the porcine models, and a kappa test with a standardized weight function was used to assess inter-observer bias. The non-parametric Spearman correlation test was used to compare each rater’s cumulative score with the cumulative score obtained on the porcine models in order to test the predictive validity of the ProMIS simulator.

Results

The kappa results showed acceptable agreement between the two observers amongst all domains of the rating scale of performance except for confidence of movement and efficiency. In addition, poor predictive validity of the ProMIS simulator was demonstrated.

Conclusions

We could not demonstrate predictive validity for the ProMIS hybrid simulator in our urology residency program.

Key words

ProMIS, MISTELS, Laparoscopic simulator, MIS training

Introduction

Modern surgical practice has witnessed a major shift towards minimally invasive surgery (MIS), driven by reduced morbidity and shorter hospital stays. This shift has been accompanied by difficulties in surgical training, namely how to properly train and evaluate surgical residents, as it became clear that open surgical skills do not correlate with laparoscopic surgical skills [1]. For ethical, clinical, and logistical reasons, the operating room (OR) is not the ideal place to start learning those basic laparoscopic skills [2,3]. Therefore, there has been an increasing demand for simulators that can train residents, assess their performance on those skills, and predict their performance in the OR.

Many laparoscopic simulators have been introduced over the last decade. There are two types of surgical simulators: virtual reality (VR) simulators and physical (box) trainers [4]. VR simulators have the disadvantage of lacking haptic feedback [5], and in 2012 we demonstrated the lack of construct validity of LapSim (a VR simulator) [6].

ProMIS is an augmented reality simulator that belongs to the second group of simulators. In 2008, we demonstrated the construct validity of ProMIS in differentiating between junior and senior urology residents in our urology residency program [7]. In this article, we sought to prospectively determine the predictive validity of ProMIS by comparing the performance of 14 urology residents in our program on the simulator with their intraoperative performance, assessed here through their performance in the porcine laparoscopic nephrectomy model.

Material and Methods

Fourteen urology residents (PGY 1 to 3) at McGill University were enrolled in the study between June 2008 and December 2011. They underwent an extensive initial orientation before the study began, then practiced on ProMIS for 1 hour weekly, with monthly assessment of their performance. They were trained and assessed on 5 tasks (peg transfer, pattern cutting, EndoLoop placement, extracorporeal suturing, and intracorporeal suturing), based on the McGill Inanimate System for Training and Evaluation of Laparoscopic Skills (MISTELS) tasks, which are widely accredited and validated (Society of American Gastrointestinal and Endoscopic Surgeons, and American College of Surgeons) [8]. The parameters of assessment were total time and smoothness of movement. The ProMIS hybrid simulator consists of a Toshiba® computer with a laparoscopic mannequin, which contains 3 camera tracking systems to identify instrument movement inside the simulator from 3 different angles.

The last assessment of each resident’s performance on ProMIS was compared with their first performance of a porcine laparoscopic nephrectomy. Approval from the McGill institutional board was obtained prior to the study. The operations were performed at the McGill wet labs and recorded on DVDs. Performance was then assessed independently and blindly by two urologic surgeons with experience in MIS. They gave each resident a score from 1 (poor) to 5 (excellent) on 6 pre-defined rating scales of psychomotor skills (Table 1). These rating scales were based on 2 previously published articles on the global assessment of intraoperative laparoscopic skills [5,9].

We prospectively calculated the standardized cumulative score for each rater’s observations, and the performance on the porcine models for each resident. The first part of our statistical analysis assessed the agreement between the two independent urologic surgeons’ ratings of resident performance on the porcine model, using a kappa test with a standardized weight function to assess inter-observer bias, agreement, and disagreement. In general, a kappa value less than 0.2 is considered poor agreement, and a value in the range of 0.81 to 1.0 is considered very good agreement [10]. Box whisker plots displaying the inter-quartile range, median, and mode were also constructed. Secondly, we assessed the predictive validity of ProMIS in predicting resident performance on the porcine model using non-parametric Spearman correlation testing in order to compare each rater’s cumulative score with the cumulative score obtained on the porcine models. All statistical analysis was conducted using STATA version 11 (Stata), and a p-value <0.05 was deemed significant.
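As an illustration of the agreement analysis described above, the following is a minimal Python sketch of a weighted kappa for two raters scoring on the 1 to 5 scale. It is not the STATA routine used in the study. The linear weight function, the function name `weighted_kappa`, and the example ratings are all assumptions made for illustration, since the text does not specify the exact weights.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, categories=(1, 2, 3, 4, 5)):
    """Cohen's kappa with linear disagreement weights (assumed weighting)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(rater_a)
    span = max(categories) - min(categories)

    def weight(i, j):
        # 0 for identical scores, 1 for the widest possible gap.
        return abs(i - j) / span

    # Mean weighted disagreement actually observed between the raters.
    observed = sum(weight(a, b) for a, b in zip(rater_a, rater_b)) / n

    # Weighted disagreement expected by chance from each rater's marginals.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(
        weight(i, j) * count_a[i] * count_b[j]
        for i in categories
        for j in categories
    ) / (n * n)
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - observed / expected


# Hypothetical scores for one rating domain (not data from the study).
rater_1 = [3, 4, 2, 5, 3, 4, 1]
rater_2 = [3, 3, 2, 5, 4, 4, 2]
kappa = weighted_kappa(rater_1, rater_2)
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement. The result can then be read against the thresholds cited in the text.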

Results

As mentioned earlier, data on 14 residents were analyzed. The kappa results demonstrated acceptable agreement between the two observers across all domains of the rating scale of performance except for confidence of movement and efficiency (Table 2). The highest kappa values were observed for bimanual dexterity and tissue handling; the box whisker plots are shown in Figure 1.

To examine the predictive validity of ProMIS in predicting performance on the porcine models, we performed Spearman testing between each of the ProMIS assessment components and the porcine scores. This demonstrated poor correlation across all components (Table 3; all correlation p values >0.05), and therefore poor predictive validity.
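The correlation step can be sketched as follows. This is a minimal Python illustration of Spearman's rank correlation (Pearson correlation computed on average ranks), not the STATA analysis used in the study. The function names and the example values are hypothetical, and the sketch returns only the coefficient, not a p value.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the window over a run of tied values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical values (not data from the study): simulator task times in
# seconds and cumulative porcine rating scores for the same residents.
simulator_time = [210, 180, 250, 195, 230, 170]
porcine_score = [18, 22, 20, 19, 24, 21]
rho = spearman_rho(simulator_time, porcine_score)
```

Because a shorter task time reflects better simulator performance, a simulator with good predictive validity would be expected to give a clearly negative rho between time and porcine score. A coefficient near zero corresponds to the poor correlation reported here.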

Discussion

Given its integration of a motion tracking system with real laparoscopic tasks, the ProMIS system is considered a hybrid system, developed as a bridge between traditional box trainers and VR simulators. This integration allows haptic feedback during performance of laparoscopic tasks, which has been shown to be of significant importance [11], while providing an objective evaluation tool similar to that of VR simulators.

Our previously published results were consistent with other studies in the field establishing the construct validity of hybrid simulators. Bann et al. showed the ability of the ProMIS system to distinguish expert laparoscopic surgeons from novice surgeons [12]. In general surgery, Van Sickle et al. further confirmed the construct validity of the hybrid simulator in distinguishing residents during a simple suturing task [13]. Others have reached the same conclusions [14,15].

This study, however, could not demonstrate predictive validity for this hybrid simulator in the context of a urology residency training program. Of the different types of validity, predictive validity is generally the least studied. In a gynecology residency training program, PGY1 residents who received laparoscopic training on a traditional trainer box performed significantly better on laparoscopic bilateral tubal ligation than their control counterparts [16]. To our knowledge, there have been no other studies in urology or other surgical specialties that directly examined the predictive validity of hybrid simulators.

Our study has limitations. It is possible that we could not demonstrate predictive validity because of our small sample size. Comparing a single final performance on the simulator with a single initial performance on the porcine model is also a limitation; averaging 2-3 performances could be a solution. Another limitation is that the residents were actively involved in a surgical training program, so additional unmeasured exposure to laparoscopic surgery is most likely. We previously showed a lack of predictive validity for ProMIS when we compared a group of medical students’ performance on the hybrid simulator with their later performance on the robotic console, although there was predictive validity in the subset of students who were trained on both ProMIS and LapSim [17]. It is hard to draw directly applicable conclusions from that study, given the small sample size and the fact that the study subjects were medical students rather than residents. In addition, their eventual performance was measured on a robotic rather than a laparoscopic platform. However, it remains possible to examine resident performance on both simulators together, ProMIS and LapSim, and compare it with their intraoperative performance. Further studies are required to investigate this hypothesis and to examine other simulators in order to identify the one that best achieves the goal of teaching residents basic laparoscopic skills and properly evaluating their readiness for real operative performance.

Conclusions

We could not demonstrate predictive validity for the ProMIS hybrid simulator in our urology residency program when compared to intraoperative performance in a porcine model.

Authors' Contribution

AA and TA carried out the literature search, performed the analysis, and prepared the manuscript. RH and TMA prepared the scoring system and performed the analysis. JD designed the study and collected data. MA conceived the idea, designed the study, and edited the final manuscript. All authors read and approved the final manuscript for submission.