Computer-vision based instance counting of surgical instruments

  • James Ireland

Student thesis: Doctoral Thesis

Abstract

Without any ill intent, a complication can occur during a medical procedure that either lengthens or complicates a patient’s recovery after their operation. One possible complication is the unintentional retention of surgical instruments or materials within the patient after the procedure has been completed. For this complication, prevention is the best cure: the situation must be detected in time, while something can still be done about it and before the patient suffers any ill effect. The current best practice proposed by the World Health Organisation (WHO) is the manual counting of instruments before and after a medical procedure to flag any discrepancies. However, these counts are prone to mistakes arising from factors such as operations occurring in the early hours of the morning, the body mass index (BMI) of the patient, increased blood loss, the emergency nature of the procedure, or the attention levels of the staff completing the counts.
Counting of objects has been proposed in the wider computer vision community, but this work is predominantly binary in focus (i.e. a single class/type of instance is counted), whether for crowd counting, crop/livestock monitoring, or counting human cells for patient diagnoses. Computer vision within medical operations focuses mainly on either the phase detection of an operation (i.e. the current stage of the procedure or the state of medical staff energy levels), or highlighting the surgical instruments during their use to aid the surgeon in their task, via frame-wise classification of the instruments present within a frame, localisation, or pixel-wise segmentation. At the time of writing, no research has focused on the possible mitigation of retained surgical instruments and materials through a computer vision-based instance count of the instruments present. The research proposed within this thesis is therefore an exciting and completely new direction for the research community. Specifically, it investigates how computer vision could aid in the count of medical instruments at two important points in time, before and after the operation, to ensure that the instruments used are accounted for before the patient is closed up. As a preliminary step, the thesis focuses on completing the first count.
This thesis presents an investigation into multi-class multi-instance counting, starting with a description of the two datasets (SORT, MSMI) created to evaluate instance counting of medical instruments and materials. This is followed by a benchmark of various popular computer vision approaches for deriving the instance count from a given image. Of these, all methods trained on SORT, with the exception of one, achieved a counting accuracy greater than 90%. Further work expands upon this, leveraging synthetically generated images to aid training on the real-world depictions of medical instruments within MSMI. The experiments showed that initialising from weights pre-trained on synthetic/simulated images (SORT) had a measurable performance benefit in the Direct Regression transfer learning domain adaptation experiments.
Finally, the thesis presents an attempt to mitigate the effect of possible occlusion on the multi-class multi-instance count through the use of multiple cameras placed at different points of view but observing the same scene. However, the results indicate that this comes at a cost to the accuracy of the instance counting method. While this is still an open research area, this body of work takes steps towards a computer vision-based solution that, when fully researched and developed, could aid medical staff in further minimising the possibility of an instrument or material being retained after a medical operation.
Date of Award: 2025
Original language: English
Supervisor: Damith HERATH (Supervisor), Ibrahim RADWAN (Supervisor) & Roland GOECKE (Supervisor)
