Objectives: Because the accuracy of therapeutic ultrasound equipment is important for patient safety and treatment efficacy, this review examines the volume and methodological quality of the available evidence on that accuracy with respect to power output and timing function. Evidence for a causal relationship between ultrasound inaccuracy and machine age, brand, frequency of use and intensity settings is also examined. Methods: A systematic literature search was performed to identify observational studies examining levels of ultrasound machine inaccuracy. Methodological quality was assessed using the Sheffield University hierarchy of evidence and a modified version of the McMaster University critical appraisal tool. Data were pooled using the descriptive statistics (mean, standard deviation, percentage) available in each study. Results: Eighteen of 47 studies were retained for review. Methodological scores ranged from 2 to 15.5 (maximum 20). Approximately two thirds of ultrasound machines (64.6%; SD 23.2; range 14–100%) were found to produce an inaccurate power output. The average percentage of timers found to be inaccurate was 30.1% at 5 min and 22.6% at 10 min. Machine age was the only variable correlated with the level of machine inaccuracy. Discussion: The current review indicates that a significant proportion of ultrasound machines produce inaccurate power outputs or have an inaccurate timing function. This finding has implications for both clinical practice and therapeutic ultrasound research.