INTRODUCTION: The Modified Ashworth Scale is the most widely used clinical scale for measuring increased muscle tone. Reliability is not an immutable property of a scale; it can vary with the variability and composition of the sample to which the scale is administered. The best method to examine how the reliability of a test's scores varies is to conduct a systematic review and meta-analysis of the reliability coefficients obtained in different applications of the test with the data at hand. This systematic review addressed two questions: What is the mean inter- and intra-rater reliability of Modified Ashworth Scale scores in the upper and lower extremities? Which study characteristics affect the reliability of the scores on this scale?
EVIDENCE ACQUISITION: The PubMed, Embase and CINAHL databases were searched from 1987 to February 2015. Two reviewers independently selected empirical studies published in English or Spanish that applied the Modified Ashworth Scale to children, adolescents or adults with spasticity and reported any reliability coefficient computed with the data at hand.
EVIDENCE SYNTHESIS: Thirty-three studies reported at least one reliability estimate of Modified Ashworth Scale scores (N=1,065 participants). For lower extremities and inter-rater agreement, the mean intraclass correlation was ICC+ = 0.686 (95% CI: 0.563 to 0.780) and the mean kappa coefficient was k+ = 0.360 (95% CI: 0.241 to 0.468); for intra-rater agreement, ICC+ = 0.644 (95% CI: 0.543 to 0.726) and k+ = 0.488 (95% CI: 0.370 to 0.591). For upper extremities and inter-rater agreement, ICC+ = 0.781 (95% CI: 0.679 to 0.853) and k+ = 0.625 (95% CI: 0.350 to 0.801); for intra-rater agreement, ICC+ = 0.748 (95% CI: 0.671 to 0.809) and k+ = 0.593 (95% CI: 0.467 to 0.696). The type of design, the study focus, and the number of raters showed statistically significant relationships with the ICC for both lower and upper extremities.
CONCLUSIONS: Inter- and intra-rater agreement for Modified Ashworth Scale scores was satisfactory. Modified Ashworth Scale scores exhibited better reliability when measuring upper extremities than lower extremities. Several study characteristics were statistically associated with the inter-rater reliability of the scores for lower and upper extremities.
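NOTE ON THE POOLED ESTIMATES: The confidence intervals reported under EVIDENCE SYNTHESIS are asymmetric around their point estimates, which is what one would expect if the coefficients had been pooled on a transformed scale and then back-transformed. The abstract does not state which transformation was used, so the sketch below is only an illustrative check under the assumption of Fisher's z transformation (z = atanh(r)). It measures how far each transformed point estimate lies from the midpoint of its transformed interval. The names `REPORTED` and `z_midpoint_gap` are hypothetical and introduced only for this example.

```python
import math

# Pooled estimates as reported in the abstract:
# (label, point estimate, 95% CI lower bound, 95% CI upper bound)
REPORTED = [
    ("lower extremities, inter-rater, ICC", 0.686, 0.563, 0.780),
    ("lower extremities, inter-rater, kappa", 0.360, 0.241, 0.468),
    ("lower extremities, intra-rater, ICC", 0.644, 0.543, 0.726),
    ("lower extremities, intra-rater, kappa", 0.488, 0.370, 0.591),
    ("upper extremities, inter-rater, ICC", 0.781, 0.679, 0.853),
    ("upper extremities, inter-rater, kappa", 0.625, 0.350, 0.801),
    ("upper extremities, intra-rater, ICC", 0.748, 0.671, 0.809),
    ("upper extremities, intra-rater, kappa", 0.593, 0.467, 0.696),
]


def z_midpoint_gap(estimate, lower, upper):
    """Absolute distance, on the Fisher z scale, between the transformed
    point estimate and the midpoint of the transformed interval.

    Assumption (not stated in the abstract): the interval was built
    symmetrically around the estimate on the z = atanh(r) scale.
    """
    z_estimate = math.atanh(estimate)
    z_midpoint = (math.atanh(lower) + math.atanh(upper)) / 2
    return abs(z_estimate - z_midpoint)


if __name__ == "__main__":
    for label, estimate, lower, upper in REPORTED:
        gap = z_midpoint_gap(estimate, lower, upper)
        print(f"{label}: gap on z scale = {gap:.4f}")
```

A gap close to zero for every row is compatible with, but does not prove, back-transformation from Fisher's z; rounding of the reported values to three decimals accounts for small nonzero gaps.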