Measurement Reliability of Swelling in the Acute Ankle Sprain

Background: Swelling and painful restriction of dorsiflexion characterize acute ankle sprain, and require accurate measurement to monitor the effectiveness of intervention. The figure of eight tape method for swelling and the weight-bearing lunge for dorsiflexion are highly reliable in the laboratory, but untested in the less predictable clinical setting. Materials and Methods: We determined intra- and interrater reliability and standard error of measurement (SEM) of both methods in the clinical environment, using 4 physiotherapists as raters. Measurements were taken twice within a session and at a follow-up session from the uninjured ankle in 22 participants with unilateral ankle sprain, and from a randomly selected ankle in 11 uninjured participants. Results: Within-session intrarater reliability was very high for both figure of eight (intraclass correlation coefficient [ICC] = 0.99) and weight-bearing lunge (ICC = 0.97) methods. Between-session inter-rater reliability was also very high (ICC > 0.99). The SEM was small for all measurements: ±0.2cm for figure of eight and ±0.4cm for dorsiflexion lunge methods within a session, and ±0.3cm and ±0.4cm respectively for between-session measurements. Conclusions: Using simple techniques, swelling and dorsiflexion can be measured with high reliability in the clinic by different clinicians, and small changes in status between and within treatments can be detected. Clinical Relevance: Clinically meaningful changes (>0.5cm) can be detected by clinicians with varying levels of expertise and can confidently be attributed to the intervention rather than measurement error.

In addition to pain, the earliest symptoms of ankle sprain are swelling and restricted dorsiflexion range of motion, and these symptoms can persist for years in up to 70% of people after a sprain. 14 To determine efficacy of treatment and monitor progress, it is essential that these impairments be measured reliably and accurately in the clinic.
Swelling secondary to sprain of the lateral ankle ligaments is commonly localized, usually around the lateral malleolus 18,19, but can also accumulate around the subtalar, talocrural and inferior tibiofibular joints. Measurement of swelling therefore needs to specifically include measurement of volume in these areas.
The gold standard for such measurement is the water displacement method 13,16, but this method may be too time-consuming to use efficiently in the clinic. Although an indirect method of measuring ankle swelling, the figure of eight method 13 is time-efficient, cost-effective, and easy to apply in clinical settings (Fig. 1). The figure of eight method is highly reliable for measuring ankle swelling in the laboratory: within-session intra- and interrater reliability (intraclass correlation coefficients [ICC]) ranged from 0.98 to 0.99 for asymptomatic and swollen ankles. 13,16,21 Furthermore, the method correlates highly (r > 0.88) with the water displacement method for both injured 13,16 and uninjured 11 ankles, thereby conferring some validity for the figure of eight method.
Ankle dorsiflexion range of motion (ROM) during weight bearing is also commonly limited following ankle sprain 10,17, with consequent high impact on functional activities such as walking 2,3,5,8, and ascending and descending stairs. 1,12 Restoration of ankle dorsiflexion ROM is therefore a priority of early rehabilitation. 2,3,8 Measurement of dorsiflexion in standing 4 simulates the ROM achieved during these functional tasks. 1,6 This is particularly relevant because the torques applied to the ankle in weight-bearing are clearly greater than in non-weight-bearing, and the resultant measurement may be more indicative of the range available for functional activities. 4,5 Measurement of ankle dorsiflexion ROM using the weight-bearing lunge method has been shown 4 to be highly reliable in the laboratory (between-session intra-rater reliability ICC (3,3) 0.97 to 0.98, within-session inter-rater reliability ICC (2,3) 0.99). However, it is unclear whether these methods are efficient and reliable in uncontrolled clinical environments, where high reliability is essential for monitoring of progress and treatment effects.
The aim of the current study, therefore, was to assess in the clinical environment the reliability of: i) the figure of eight method; and ii) the weight-bearing lunge method for measurements taken within and between sessions. The unaffected ankle of participants was investigated because it is not possible to determine between-session reliability on the injured ankle, given the expected rapid changes in swelling and dorsiflexion range 10 and associated confounding effects of intervention on repeatability. Nevertheless, this information is important clinically because, similar to Phase I and II trials, laboratory results cannot necessarily be generalized to the clinical environment.

Design
A repeated measures design was used to test reliability. When a participant was attending for treatment of an injured ankle, the treating therapist and a second rater measured outcomes on the uninjured ankle before treatment commenced for the affected ankle.
On the first test occasion, raters took two measurements of both ROM and swelling within a 1-hour period. Repeat measurements for between-session reliability were taken approximately one week later. To minimize unblinding, raters used a new data sheet to record each measurement before sealing each data sheet in an opaque envelope for later analysis.

Participants
The raters were four physiotherapists of varying post-graduate experience (range 4-15 years, mean 8.2 years). The participants (patient group) consisted of 15 males and 18 females aged 10 to 76 years (mean, standard deviation [SD]: 28, 14.3 years), recruited from staff and patients attending a physiotherapy and sports injury clinic in Sydney, Australia. Eleven participants (3 male, 8 female) were injury-free and asymptomatic. Twenty-two (12 male, 10 female) had sustained a recent unilateral ankle sprain, fifteen while participating in competitive sport, and seven while walking on an uneven surface.
The asymptomatic contralateral ankle was measured in injured participants and a randomly selected ankle in healthy participants. While raters routinely used the figure of eight and dorsiflexion lunge methods in their clinical practice, they were familiarized with the standardized protocols for measurement before undertaking data collection.

Protocols
i) Figure of eight method for measuring swelling
The protocol used in the current study was based on that described by Mawdsley. 13 Participants were positioned in long-sitting on a bed with the experimental foot resting over the end (Fig. 1). The following standardized landmarks were marked with a pen prior to measurement: a) the point midway over the anterior ankle between the tibialis anterior tendon and lateral malleolus, b) the navicular tuberosity, c) the base of the fifth metatarsal, and d) the inferior tip of the medial malleolus. To blind therapists during measurements, one surface of a double-sided retractable plastic tape measure was blackened, leaving the zero point visible. The rater placed the zero point over the mark on the anterior aspect of the ankle and pulled the tape medially over the navicular tuberosity, and then infero-laterally across the medial arch to the proximal aspect of the base of the fifth metatarsal. The tape was then pulled superiorly and medially over the tarsal bones across the inferior aspect of the medial malleolus, and postero-laterally around the Achilles tendon over the distal lateral malleolus to finish at the zero point. The rater tightened the tape measure and then released tension slightly to ensure there was no indentation of soft tissue. To obtain the measurement, a clip was placed at the point of intersection between the zero and finish points of the tape (Fig. 1). The examiner removed and turned over the tape and recorded the result to the nearest millimeter.

ii) Weight-Bearing lunge for range of ankle dorsiflexion
The weight-bearing lunge used to measure ankle dorsiflexion range was based on that described by Bennell et al. 4 Each participant stood on an apparatus consisting of a horizontal footplate attached to a vertical board (Fig. 2). Participants aligned the great toe and heel of the test leg over a line marked along the center of the footplate. Participants were instructed not to lift the test heel, which the examiner checked by gently palpating for lifting 4, while the participant moved the knee forward into a lunge position until the patella touched the midline of the vertical board.
To prevent forward movement of the great toe as the knee moved forward over the foot, a block was placed in front of the great toe. The measurement recorded was the distance (cm) from the vertical board to the great toe.
Participants were given up to five attempts and the best performance was used for further analysis.

Data Analysis
SPSS for Windows™ was used to calculate ICCs (ICC (1,1) and ICC (2,1)) and 95% confidence intervals (CI) 20 for each method, within and between raters and within and between sessions. The ICC values were interpreted according to the definition of Munro and Page 15: ICC values of 0.00 to 0.25 indicated little, if any, correlation; 0.26 to 0.49 low correlation; 0.50 to 0.69 moderate correlation; 0.70 to 0.89 high correlation; and 0.90 to 1.00 very high correlation.
Table 2 Means (standard deviation) for the 2 measurement occasions and standard error of the measurement (SEM) for figure of eight swelling and weight-bearing lunge dorsiflexion measurements (n = 4 raters).
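As an illustration of the two ICC forms named in the data analysis, the following Python sketch computes ICC (1,1) and ICC (2,1) from analysis of variance mean squares using the standard Shrout and Fleiss formulas, and maps a coefficient onto the interpretation bands listed above. It is not the SPSS procedure used in the study. The function names `icc_forms` and `interpret_icc` and the example tape values are hypothetical.

```python
def icc_forms(scores):
    """Return (ICC(1,1), ICC(2,1)) for a table of repeated measurements.

    scores: list of rows, one row per participant, each row holding
    k measurements (one per rater or per occasion).
    """
    n = len(scores)        # participants
    k = len(scores[0])     # measurements per participant
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for the one-way and two-way ANOVA decompositions
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_within = ss_total - ss_rows
    ss_error = ss_within - ss_cols

    bms = ss_rows / (n - 1)                 # between-participants mean square
    wms = ss_within / (n * (k - 1))         # within-participants mean square
    jms = ss_cols / (k - 1)                 # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))    # residual mean square

    icc11 = (bms - wms) / (bms + (k - 1) * wms)
    icc21 = (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)
    return icc11, icc21


def interpret_icc(icc):
    """Map an ICC onto the Munro and Page bands quoted in the text.

    Values falling between two published bands (e.g. 0.255) are assigned
    to the lower band; that choice is an assumption of this sketch.
    """
    if icc >= 0.90:
        return "very high"
    if icc >= 0.70:
        return "high"
    if icc >= 0.50:
        return "moderate"
    if icc >= 0.26:
        return "low"
    return "little, if any"


# Hypothetical figure of eight values (cm): five participants, two occasions
example = [[52.1, 52.3], [48.7, 48.6], [55.0, 55.2], [50.4, 50.3], [46.9, 47.0]]
icc11, icc21 = icc_forms(example)
print(round(icc11, 3), round(icc21, 3), interpret_icc(icc21))
```

ICC (1,1) treats all disagreement as error, whereas ICC (2,1) separates a systematic rater (or occasion) effect from residual error, so the two coefficients diverge when one rater consistently measures higher than another.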

The SEM 15 was calculated for repeated measurements on the same participant, within and between sessions, and was expressed in the original units of measurement to provide error data in clinically relevant terms. Paired t-tests were used to compare the means for each measurement occasion.
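The text does not state the formula used for the SEM. A commonly used definition, assumed here for illustration only, is SEM = SD × √(1 − ICC), where SD is the standard deviation of the measurements and ICC is the reliability coefficient. The sketch below applies that definition; the function name `sem_from_icc` and the sample values are hypothetical.

```python
import math
import statistics


def sem_from_icc(measurements, icc):
    """Standard error of measurement in the original units.

    Assumes SEM = SD * sqrt(1 - ICC); the study may have used a
    different but related calculation.
    """
    sd = statistics.stdev(measurements)   # sample standard deviation
    return sd * math.sqrt(1.0 - icc)


# Hypothetical figure of eight measurements (cm) and reliability coefficient
tape_cm = [50.0, 52.0, 54.0]
print(round(sem_from_icc(tape_cm, 0.99), 2))
```

Because the result is in centimeters, it can be compared directly with the change a clinician observes on the tape or the lunge board.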

Intra-rater reliability: within session
Measurements of swelling using the figure of eight method and ankle dorsiflexion using the weight-bearing lunge were taken by the same rater one hour apart. For the figure of eight method, ICC (2,1) values were >0.90 (Table 1), consistent with very high correlation. 15 There was no difference in mean swelling between the first and second measurements (p = 0.32). The SEM was 0.2cm (Table 2), indicating that a therapist taking a repeat measurement of swelling after treatment could be confident on 95% of occasions that any reduction >0.4cm (1.96 × SEM) would be due to the treatment. Alternatively, an increase of >0.4cm would indicate that swelling had increased.
For the dorsiflexion lunge method of measuring range of motion, ICC (1,1) values were also >0.90 (Table 1), consistent with very high correlation. There was no difference in mean dorsiflexion range between the first and second measurements (p = 0.5). The SEM was 0.4cm (Table 2). Thus, a therapist taking a repeat measurement after treatment using the weight-bearing lunge could be confident on 95% of occasions that any change in range of motion of >0.8cm could be attributed to treatment.
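The 95% thresholds quoted above follow from multiplying the SEM by 1.96. A minimal Python sketch of that arithmetic is given below; the function names `error_threshold` and `exceeds_measurement_error` and the example readings are hypothetical, and the multiplier is left as a parameter so that a different confidence level could be substituted.

```python
def error_threshold(sem, z=1.96):
    """Smallest change (same units as sem) treated as exceeding error."""
    return z * sem


def exceeds_measurement_error(first, second, sem, z=1.96):
    """True if the change between two readings is larger than z * SEM."""
    return abs(second - first) > error_threshold(sem, z)


# Within-session SEMs reported in the text: 0.2cm (tape), 0.4cm (lunge)
print(round(error_threshold(0.2), 1))   # figure of eight threshold, cm
print(round(error_threshold(0.4), 1))   # weight-bearing lunge threshold, cm

# Hypothetical tape readings (cm) before and after a treatment session
print(exceeds_measurement_error(52.0, 51.5, sem=0.2))
```

With the reported within-session SEMs, the rounded thresholds reproduce the 0.4cm and 0.8cm values given in the text.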

Intra-rater reliability: between session
Measurements of ankle swelling and ankle dorsiflexion were repeated, on average, 6.8 days (range 2-28 days) after the first measurement occasion. For the figure of eight method, ICC (2,1) values were >0.90, consistent with very high correlation (Table 1). There was no difference in mean swelling between measurement occasions (p = 0.29). The SEM was 0.3cm (Table 2), indicating that a therapist taking measurements between treatment sessions could be confident on 95% of occasions that a difference in swelling of >0.7cm between treatments would not be due to error.
For the dorsiflexion lunge method of measuring range of motion, ICC (1,1) values were >0.90, consistent with very high correlation (Table 1). There was no difference in mean dorsiflexion range of motion between sessions (p = 0.2). The SEM was 0.4cm (Table 2), indicating that a therapist taking repeat measurements between occasions could be confident on 95% of occasions that a difference in ROM of >0.8cm would not be due to error.

© The Foot & Ankle Journal, 2008
Inter-rater reliability: between sessions
To determine inter-rater reliability, two different raters made the repeat measurements of swelling and range of motion, on average 6.8 days (range 2-28 days) apart. For the figure of eight method, ICC (1,1) values were >0.90, indicating very high reliability (Table 1). There was no difference in mean swelling (p = 0.2) between the two measurement occasions. The SEM was 0.3cm (Table 2), indicating that a different therapist repeating the measurement one week later could be confident on 95% of occasions that a change in swelling of >0.6cm would not be due to error.
Similarly, for the dorsiflexion lunge method of measuring range of motion, ICC (1,1) values for inter-rater reliability were >0.90, consistent with very high correlation (Table 1). There was no difference in mean range of motion for dorsiflexion (p = 0.09). The SEM was 0.4cm (Table 2), indicating that a therapist taking a repeat measurement one week after the first occasion could be confident on 95% of occasions that any difference in ROM of >0.8cm after treatment would not be due to error.

Discussion
The current results indicate that intra- and interrater reliability were very high for measurements taken in the clinic for both the figure of eight method (ankle swelling) and the weight-bearing lunge method (ankle dorsiflexion).
Whilst previous research has reported acceptable reliability in a well-controlled laboratory environment, the current study demonstrated reliability of these methods in the variable clinical environment. Very high intra- and interrater reliability was observed for measurements repeated within a single session, and after a one-week interval. Therefore, clinicians can use both techniques with confidence within and between sessions to determine the effects of interventions to improve ankle swelling and dorsiflexion range following ankle sprain. The results presented here will assist clinicians with decisions regarding the management of ankle sprain and monitoring progress with treatment.
Despite the inherent differences between the demands of the clinical environment and the laboratory environment, such as time constraints during measurement procedures and a more unpredictable environment 7, the current findings for the figure of eight tape method are comparable to data derived from laboratory studies. Very high intra- and inter-rater reliability (ICC values 0.98-0.99) has been reported using the figure of eight method for injured 13,16 as well as asymptomatic 11,21 ankles. However, previous research has only documented the reliability of the figure of eight method for repeated measurements taken within a single session. 11,13,16 The current study observed very high intra- and interrater reliability both within and between sessions, with a comparable SEM of 0.4 to 0.5cm 11,13, and therefore has demonstrated that different therapists can treat the same patient on different occasions and use the figure of eight method to confidently determine treatment effects. Furthermore, the small SEM observed in the current study informs clinicians that changes in swelling of greater than 0.7cm are more likely due to intervention effects than error. This suggests that the error is considerably less than changes that would be considered clinically worthwhile.
The figure of eight method has been reported to correlate well with water displacement methods for measurements of ankle swelling after lower limb injury. 13,16 While water displacement is the gold standard method for measuring lower limb volume 11,13,16, the method is time-consuming, requiring between 5 and 6 minutes to perform, whereas the figure of eight method requires approximately 30 seconds to perform. 11 Water displacement also requires relatively sophisticated equipment, unlike the tape measure method. Therefore, the figure of eight method is not only a more time-efficient method for measuring ankle swelling, but can also be used without sacrificing reliability, even between raters and between sessions.
Similarly, very high 15 intrarater reliability results have been reported for the weight-bearing lunge method of measuring dorsiflexion in participants with asymptomatic ankles, within and between sessions, and for different raters taking repeat measurements within a single session. 4 The current data recorded in a clinical environment are comparable to previous data collected in a laboratory environment, and again indicate that use of a cost- and time-efficient method does not sacrifice reliability. Clinicians can therefore be confident that the weight-bearing lunge method for measuring dorsiflexion range is robust in the clinical environment. Furthermore, the small SEM suggests that clinicians detecting changes in ROM of greater than 1cm are more likely to be observing intervention effects than error.

Conclusions
Whereas previous studies using the figure of eight method for measuring ankle swelling and the weight-bearing lunge for measuring dorsiflexion have been conducted in laboratory settings, the current study was conducted in a clinical setting characterized by more variable environmental conditions and constraints that replicated conditions likely to be encountered by clinicians during rehabilitation of ankle sprain. In the current study, intra- and interrater reliability for each method remained very high for both within- and between-session data. Therefore, with adequate familiarization, these simple, reliable, and time-efficient methods can be used with confidence by clinicians with varying levels of expertise to assess treatment effects on swelling and ankle dorsiflexion in clinical populations.

Figure 1 The figure of eight tape method for measuring ankle swelling.

Figure 2 The weight-bearing lunge method for measuring ankle dorsiflexion.