Perceiver

Based on "Perceiver: General Perception with Iterative Attention", arXiv:2103.03206 [cs.CV]

[Figure: perceiver.png]

"With great flexibility comes great overfitting" 😎

Configuration

Imports

Configuration

Data

Model

Attention

Utilities

Each residual branch is scaled by a learnable weight before it is added to the skip connection, following "ReZero is All You Need: Fast Convergence at Large Depth", arXiv:2003.04887 [cs.LG]
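A minimal sketch of the ReZero idea, in plain Python so it runs without any framework: the output is x + alpha * f(x), with alpha starting at zero so the block begins as the identity. The names `ReZeroResidual` and `toy_sublayer` are hypothetical and not taken from this notebook; in a real model alpha would be a trainable parameter and f an attention or feed-forward sublayer.

```python
import math


class ReZeroResidual:
    """Residual wrapper computing x + alpha * fn(x) (hypothetical helper)."""

    def __init__(self, fn, alpha=0.0):
        self.fn = fn
        # ReZero initializes the residual weight to zero; here it is a plain
        # float, whereas a real model would make it a learnable parameter.
        self.alpha = alpha

    def __call__(self, x):
        fx = self.fn(x)
        return [xi + self.alpha * fi for xi, fi in zip(x, fx)]


def toy_sublayer(x):
    # Stand-in for an attention or MLP sublayer (assumption for illustration).
    return [math.tanh(v) for v in x]


x = [1.0, -2.0, 3.0]
block = ReZeroResidual(toy_sublayer)

y_init = block(x)      # alpha == 0: the block passes x through unchanged
block.alpha = 0.5      # pretend training has moved alpha away from zero
y_trained = block(x)   # now the sublayer output contributes to the result
```

Starting from the identity means signals and gradients pass through a deep stack unchanged at initialization, which is the motivation the ReZero paper gives for this weighting.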

Attention Block

Position Encoding

Head

Perceiver

Training

History

Optimizer

Set up trainer

Start training