Tackling Continual Learning: Our Journey in the CVPR 2021 Challenge

Check out the code on GitHub.

In the rapidly evolving field of AI, one of the biggest hurdles is teaching machines to learn sequentially, like humans do, without forgetting previously acquired knowledge. This is the core challenge of Continual Learning (CL). In 2021, we participated in the CVPR 2021 Workshop’s Continual Learning Challenge to test our approach against this very problem. Our team, Real-DEEL, placed 6th in this competitive event. This post details our method and experience.

The Challenge: Learning Without Forgetting

Final leaderboard result with our team Real-DEEL

The competition, run on the Sequoia framework, was designed to push the boundaries of CL. The main goal was to develop a single method that could perform well across two very different domains:

  1. Supervised Learning (SL): An incremental image classification task on the Synbols dataset, where the model sees new classes over time.
  2. Reinforcement Learning (RL): A task requiring an agent to adapt to changing environments.

The ultimate prize was reserved for the method that achieved the best average performance across both tracks, demanding a truly robust and domain-agnostic solution. Our focus was on the Supervised Learning track.

Our Approach: Dark Experience Replay (DER)

Our entry was built upon the principles of “Dark Experience for General Continual Learning” (DER) (Buzzega et al., 2020), a powerful and efficient rehearsal-based strategy.

The core idea behind rehearsal methods is to store a small subset of past data in a memory buffer. When the model trains on a new task, it “replays” samples from this buffer to refresh its memory of old tasks, thus preventing catastrophic forgetting.
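The DER paper fills its memory buffer with reservoir sampling, which keeps every example seen so far equally likely to be stored. A minimal sketch of such a buffer (the `ReplayBuffer` class and its method names are illustrative, not taken from our der.py):

```python
import random

class ReplayBuffer:
    """Fixed-size buffer filled via reservoir sampling, so every example
    observed so far has an equal probability of being in the buffer."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []      # stored examples (e.g. (input, label) pairs)
        self.num_seen = 0   # total examples observed so far

    def add(self, example):
        if len(self.data) < self.capacity:
            # Buffer not full yet: always store.
            self.data.append(example)
        else:
            # Replace a random slot with probability capacity / (num_seen + 1).
            idx = random.randint(0, self.num_seen)
            if idx < self.capacity:
                self.data[idx] = example
        self.num_seen += 1

    def sample(self, batch_size):
        # Draw a replay batch (capped at the current buffer size).
        return random.sample(self.data, min(batch_size, len(self.data)))
```

At each training step, a batch sampled from this buffer is mixed with the batch from the current task.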

DER enhances this by not just storing the input data (e.g., images) and their labels, but also the logits (the raw output of the model before the final softmax) produced by the model when it first learned that data. When a past experience is replayed, we train the model on a dual objective:

  1. Cross-Entropy Loss: The standard classification loss on the new data.
  2. Dark Experience Loss: A distillation loss (typically MSE or L1) that penalizes differences between the current model’s logits and the stored logits for the replayed data.

This second loss encourages the model to remember how it made its previous predictions, preserving the learned representations more effectively than just using the ground-truth labels. Our implementation, der.py, uses this exact strategy.
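The dual objective above can be sketched in a few lines of NumPy; the function name `der_loss` and the weighting term `alpha` are illustrative (the actual weight is a tuned hyperparameter), and this is a simplified sketch rather than our exact der.py code:

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def der_loss(new_logits, new_labels, replay_logits, stored_logits, alpha=0.5):
    """DER objective: cross-entropy on the current batch plus an MSE
    distillation term pulling the model's logits on replayed samples
    back toward the logits stored when those samples were first seen."""
    probs = softmax(new_logits)
    n = len(new_labels)
    ce = -np.log(probs[np.arange(n), new_labels]).mean()     # classification loss
    dark = np.mean((replay_logits - stored_logits) ** 2)     # dark experience loss
    return ce + alpha * dark
```

When the current model still reproduces its stored logits exactly, the dark term vanishes and only the standard classification loss remains; any drift on replayed samples is penalized quadratically.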

Our final submission tuned a small set of key hyperparameters, chiefly the replay-buffer size and the weight on the distillation loss.

The Results: 6th Place Finish

After an intense competition, our Real-DEEL method secured 6th place on the final leaderboard for the supervised learning track.

The final leaderboard for the Supervised Learning track, showing our Real-DEEL team in 6th place.

Our final scores were:

Final/Average Online Performance: 0.80
Final/Average Final Performance: 0.86
Final/Runtime (seconds): 482
Final/CL Score: 0.87

The plots below visualize our model’s performance. The first chart shows our strong final and online performance scores. The second chart illustrates the steady increase in average accuracy across tasks, demonstrating the model’s ability to learn continually without catastrophic forgetting.

Final performance metrics (left) and test accuracy over time (right).

Finally, the plot below shows the behavior of the “Dark Loss” for several tasks. The spikes occur at the beginning of a new task, as the model’s predictions for replayed data initially diverge. The subsequent decrease shows the model successfully re-aligning with its past knowledge while learning the new task.

Dark Loss Per Task

Conclusion

The CVPR 2021 Continual Learning Challenge was a fantastic experience that validated the strength of rehearsal-based methods like Dark Experience Replay. Our 6th place finish demonstrates that this simple yet powerful technique is a formidable baseline for tackling the complex problem of catastrophic forgetting.

You can find the complete code for our submission on GitHub.

References

  1. Buzzega, P., Boschini, M., Porrello, A., Abati, D., & Calderara, S. (2020). Dark experience for general continual learning: a strong, simple baseline. Advances in Neural Information Processing Systems, 33, 15920–15930.