aPaperADay
18 CapsNets - Conclusion

This paper is still a little bit over my head. Let me list some of the topics I don’t yet understand:

  1. Margin Loss
  2. Shapes of Capsules and why
  3. How to do Reconstruction
  4. Why exactly Dynamic routing is necessary
  5. How memory explodes on large inputs
  6. The Crowding phenomenon
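For item 1, here is a minimal sketch of the margin loss as I read it from the paper: each class k contributes T_k * max(0, m+ - ||v_k||)^2 + lambda * (1 - T_k) * max(0, ||v_k|| - m-)^2, with m+ = 0.9, m- = 0.1 and lambda = 0.5. The function name `margin_loss` and its argument names are my own, and it works on plain Python lists for a single example rather than on batched tensors.

```python
def margin_loss(lengths, target, m_pos=0.9, m_neg=0.1, lam=0.5):
    """Margin loss for a single example (illustrative helper, not the authors' code).

    lengths: capsule output lengths ||v_k||, one per class, each in [0, 1]
    target:  index of the true class
    """
    total = 0.0
    for k, length in enumerate(lengths):
        if k == target:
            # The true class is penalized when its capsule is shorter than m_pos.
            total += max(0.0, m_pos - length) ** 2
        else:
            # Other classes are penalized when their capsule is longer than m_neg,
            # down-weighted by lam.
            total += lam * max(0.0, length - m_neg) ** 2
    return total
```

With these defaults, a confident correct prediction such as lengths [0.95, 0.05, 0.0] with target 0 incurs zero loss, because every capsule is already on the right side of its margin.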

Regarding item 4, although I don’t fully understand it, I have seen a good visualization of the routing algorithm:
[Figure: routing_vis.gif]
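To make the routing idea more concrete, below is a small pure-Python sketch of routing-by-agreement as I understand it from the paper: routing logits start at zero, coupling coefficients are a softmax over those logits, each upper capsule squashes the weighted sum of its incoming predictions, and the logits grow where a prediction agrees with the resulting output. The names `squash`, `softmax`, `route` and the nested-list layout of `u_hat` are my own assumptions, and a real implementation would use batched tensors and learned transformation matrices.

```python
import math


def squash(s):
    """Shrink vector s so its length lies in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0 for _ in s]
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    z = sum(exps)
    return [e / z for e in exps]


def route(u_hat, iterations=3):
    """u_hat[i][j] is the prediction vector from lower capsule i for upper capsule j.

    Returns the upper capsule outputs v and the coupling coefficients c
    that were used in the final iteration.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits start at zero
    for _ in range(iterations):
        # Each lower capsule distributes its output across the upper capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        # Agreement (dot product) between prediction and output raises the logit.
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v, c


# Two lower capsules agree about upper capsule 0 and disagree about upper capsule 1.
u_hat = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[1.0, 0.0], [-1.0, 0.0]],
]
v, c = route(u_hat)
```

In this toy input the predictions for upper capsule 1 cancel out, so after a few iterations both lower capsules send most of their coupling to upper capsule 0, where their predictions agree.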

I think I would have to spend some time in the code in order to really understand the details.

This paper has been difficult to decipher because the closer I get to really understanding these papers, the more I can see how little I know regarding make-or-break details. My study has also been tempered by the reality of how Capsule Nets are used today. In their favor is the SOTA result on MNIST, but they certainly lack the backing that networks like CNNs have. Capsules were introduced quite a while ago (2011).

The paper concludes with an argument that yesterday’s techniques are bound to be beaten by tomorrow’s algorithms, which are designed to improve on the exponential limits imposed by old algorithms. It is impossible to refute the fact that algorithms change, and it is motivating to consider the future. I’m still interested in how capsules may contribute to the future of algorithms.

I think that understanding an algorithm’s position in the future requires understanding its fundamental limitations and the representational assumptions it makes. Capsule networks make a very strong representational assumption about the locality of capsule representation; however, this concern is countered by the possibility that the human vision system makes the same representational assumption.

Although I think it is crucial to understand the fundamental limits and assumptions, these details are the last things I am able to understand well. That motivates me to keep putting time not only into summarizing these papers, but also into learning from others and attempting to code this up myself. Learning all of this perfectly from scratch would take a huge amount of time, so I also need to be efficient in choosing what to study and how to learn.