The forced swim test (FST) is frequently used to screen for antidepressant-like effects in rodents, but its predictive validity has been repeatedly questioned at both conceptual and practical levels (e.g. Trunnell and Carvalho, 2021). We previously suggested that some of the problems with this test are related to individual variability in the responses of mice to the test and to treatments, which results in poor reproducibility and lower than expected translational value (Einat et al., 2018). To better understand individual variability in the FST, we previously studied the effects of sex and of repeated exposures to the test (Kazavchinsky et al., 2019) and the relations between the FST and other behavioural tests used in the study of affective disorders (Kazavchinsky et al., 2020). Here we used the FST combined with lithium treatment to examine the replicability of the test in three replications conducted under the same conditions. The objectives of the study were (1) to examine the reproducibility of group effects of lithium in the FST and (2) to explore the heterogeneity of the behaviour in control and lithium-treated groups within and across replications. We expected that, as in previous work, lithium treatment would result in a significant reduction in immobility time in the FST, and we hypothesised that the heterogeneity of behaviour across experiments would be larger in the lithium group than in control animals. We conducted three identical experiments in male ICR (CD-1®) mice (N = 60 for experiment 1, N = 45 for experiment 2 and N = 37 for experiment 3). We used males only to reduce sources of variability and because we previously reported sex effects in the FST (Kazavchinsky et al., 2019). Experiments were performed serially.
We utilised our established protocol, including single housing in enriched cages and a standard laboratory setting (12/12 h light/dark cycle, constant temperature of 22 ± 1°C and ad libitum access to food and water). Single housing was used because it is part of this established protocol, which was originally developed with individual housing as required by many of our previous experiments. Lithium was administered orally in food for two weeks prior to a single 6-min FST session, of which the last 4 min were scored for active (swim and struggle) or passive (floating) behaviours using an automated videotracking system (FST, BioBserve, Bonn, Germany) (Kazavchinsky et al., 2019). All experimental procedures followed the Israeli Ministry of Health directives and were approved by the Tel Aviv-Yaffo Academic College IACUC (protocol MTA-2014-08-3). Data for group effects were analysed using two-way ANOVA with experiment and lithium treatment as main factors, followed by post-hoc Bonferroni analysis and by t-tests for individual experiments (lithium versus control). Levene’s test was used to examine heterogeneity of variance. Effect sizes were estimated using Cohen’s d. As expected, chronic administration of oral lithium reduced immobility time in the FST in all experiments. Despite the efforts to equate conditions across experiments, the data indicate a significant difference between experiments, with no interaction between experiment and treatment [Fig. 1; ANOVA: experiment effect, F(2,132) = 12.65, p < 0.001; lithium effect, F(1,132) = 73.23, p < 0.001; interaction, F(2,132) = 2.29, p = 0.11]. It is important to note that the effect of lithium was not only statistically significant but also “clinically significant”, with effect sizes of 0.94 (experiment 1), 2.33 (experiment 2) and 1.6 (experiment 3), ranging between large and very large.
Fig. 1. Means, standard deviations and individual values for the effects of lithium and of replication on immobility time in the FST in male ICR mice. * Statistically significant difference between lithium and control mice within an experiment (p < 0.001). # Statistically significant difference between experiments (p < 0.001).
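To make the effect-size calculation concrete, the following is a minimal sketch of Cohen's d for a control group versus a treated group. It assumes the common pooled-standard-deviation form of d; the text does not state which variant was used. The function name `cohens_d` and the immobility values are hypothetical illustrations, not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(control, treated):
    """Cohen's d using the pooled sample standard deviation (assumed variant)."""
    n1, n2 = len(control), len(treated)
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(control) + (n2 - 1) * variance(treated)) / (n1 + n2 - 2)
    # Positive d means lower immobility in the treated group.
    return (mean(control) - mean(treated)) / sqrt(pooled_var)


# Hypothetical immobility times in seconds (illustration only).
control_s = [120, 130, 110, 140]
lithium_s = [60, 70, 50, 80]
print(round(cohens_d(control_s, lithium_s), 2))
```

By the usual convention, values of d around 0.8 or above are described as large, which is the sense in which the effect sizes above are characterised.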
Interestingly, and in support of the hypothesis, the differences between experiments can be attributed to variability in the lithium response rather than in the behaviour of the control animals. The mean immobility of control animals ranges between 108 and 134 s across experiments, a difference of approximately 25%, and does not differ significantly across experiments [ANOVA across experiments for control groups: F(3,30) = 0.53, p = 0.66]. In contrast, mean immobility time in the lithium groups ranges between 18 and 85 s, a more than 4.5-fold range, with clear statistical significance across experiments [F(3,102) = 39.1, p < 0.0001]. Within experiments, however, and against expectations, the variance of the lithium-treated animals was lower than that of the control mice, with significant differences in experiment 2 [Levene’s test, F(1,41) = 11.58, p = 0.002] and experiment 3 [Levene’s test, F(1,34) = 39.2, p < 0.0001] and a difference in the same direction that did not reach significance in experiment 1 [Levene’s test, F(1,57) = 2.7, p = 0.1].
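The within-experiment comparison of variances can be illustrated with a short sketch of the Levene statistic. It assumes the original mean-centred form of the test (the text does not specify whether mean or median centring was used) and returns only the F-type statistic W, without a p-value. The function name `levene_w` and the example values are hypothetical.

```python
from statistics import mean


def levene_w(*groups):
    """Levene's W statistic (mean-centred variant, assumed) for two or more groups."""
    # Absolute deviation of each observation from its own group mean.
    z = [[abs(y - mean(g)) for y in g] for g in groups]
    n_total = sum(len(g) for g in z)
    k = len(z)
    grand_mean = sum(sum(g) for g in z) / n_total
    # One-way ANOVA on the absolute deviations.
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in z)
    within = sum((v - mean(g)) ** 2 for g in z for v in g)
    # Note: within is zero (division error) if all deviations in every group are identical.
    return (n_total - k) / (k - 1) * between / within


# Hypothetical values (illustration only): the second group is more spread out.
tight = [1, 2, 3, 4, 5]
wide = [1, 3, 5, 7, 9]
print(round(levene_w(tight, wide), 3))
```

W is compared against an F distribution with (k - 1, N - k) degrees of freedom; a larger W indicates a larger difference in spread between groups.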
In general, the present results replicate previous work showing that chronic oral lithium reduces immobility in the FST. Additionally, this study demonstrates that the response to lithium, although always in the same direction, varies significantly across experiments even when efforts are made to maintain similar conditions. These findings are in line with a previous meta-analysis of the effects of a number of antidepressant drugs in the FST, which showed that the FST is valid for a qualitative appraisal of antidepressant-like effects of drugs but may not be accurate enough for quantitative evaluation of these effects (Kara et al., 2018). The reduced heterogeneity of lithium-treated animals within experiments, combined with the increased heterogeneity across experiments, suggests that drug effects in the FST are more susceptible to small differences between experiments than is the baseline behaviour of control animals. We therefore suggest that when using the FST to screen for potential antidepressant-like effects, it is critical to follow established protocols very closely and to take great care to maintain precisely similar conditions across experiments. Specifically, we suggest that some major factors should not be altered in the mouse protocol. These include the duration of the test and the scoring period (6-min test, scoring the last 4 min), the diameter of the cylinder (in the range of 18–20 cm), the temperature of the water (22–24°C), and the light conditions (standard laboratory light might be preferred to dim/red light). These conditions have been repeatedly validated for mice of different strains. Clearly, different protocols are applied to other species, with specific parameters for rats and more variation for non-traditional rodent model animals such as gerbils, fat sand rats, spiny mice, and others.
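As an illustration of how the mouse protocol parameters listed above could be checked before a replication, the following is a minimal sketch. The class `FstProtocol`, its field names and the function `protocol_deviations` are hypothetical, and water temperature is assumed to be in degrees Celsius. Lighting is left out because the text expresses only a preference for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FstProtocol:
    """Hypothetical record of the mouse FST parameters named in the text."""
    test_min: float              # total test duration, minutes
    scored_last_min: float       # scored period at the end of the test, minutes
    cylinder_diameter_cm: float  # inner diameter of the cylinder
    water_temp_c: float          # water temperature, assumed to be in Celsius


def protocol_deviations(p):
    """Return a list of deviations from the suggested mouse protocol."""
    issues = []
    if p.test_min != 6:
        issues.append("test duration should be 6 min")
    if p.scored_last_min != 4:
        issues.append("the last 4 min should be scored")
    if not 18 <= p.cylinder_diameter_cm <= 20:
        issues.append("cylinder diameter should be 18-20 cm")
    if not 22 <= p.water_temp_c <= 24:
        issues.append("water temperature should be 22-24 C")
    return issues


# A run that follows the suggested parameters produces no deviations.
print(protocol_deviations(FstProtocol(6, 4, 19, 23)))
# A wider cylinder is flagged.
print(protocol_deviations(FstProtocol(6, 4, 25, 23)))
```

Recording these values for each replication would make it easier to see whether differences between experiments coincide with any departure from the protocol.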
Statement of interest
None.