ConvMixer

Convolutions Attention MLPs Patches Are All You Need? 🤷‍♂️

ICLR 2022 submission

https://github.com/tmp-iclr/convmixer

ConvMixer:

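from torch.nn import Sequential, Conv2d, GELU, BatchNorm2d, AdaptiveAvgPool2d, Flatten, Linear

# h: hidden dimension, d: depth, k: kernel size, p: patch size, n: number of classes
def ConvMixer(h,d,k,p,n):
    S,C,A=Sequential,Conv2d,lambda x:S(x,GELU(),BatchNorm2d(h))
    R=type('',(S,),{'forward':lambda s,x:s[0](x)+x})
    return S(A(C(3,h,p,p)),*[S(R(A(C(h,h,k,groups=h,padding=k//2))),A(C(h,h,1))) for i
        in range(d)],AdaptiveAvgPool2d((1,1)),Flatten(),Linear(h,n))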
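
The compact definition above is dense, so an expanded version may be easier to follow. The sketch below is one possible readable equivalent, not the authors' reference code: the names `Residual`, `conv_mixer_readable`, and `act_norm` are hypothetical and chosen here for illustration. It assumes an odd `kernel_size`, so that `padding=kernel_size // 2` preserves the spatial size and the residual addition is well defined.

```python
import torch
from torch import nn


class Residual(nn.Module):
    """Adds the input back onto the output of the wrapped module (hypothetical helper)."""

    def __init__(self, fn: nn.Module):
        super().__init__()
        self.fn = fn

    def forward(self, x):
        return self.fn(x) + x


def conv_mixer_readable(dim, depth, kernel_size, patch_size, n_classes):
    """Expanded form of the compact ConvMixer(h, d, k, p, n) definition.

    Assumes kernel_size is odd so the depthwise convolution keeps the
    spatial size unchanged and the residual sum has matching shapes.
    """

    def act_norm():
        # Every convolution is followed by GELU, then BatchNorm.
        return [nn.GELU(), nn.BatchNorm2d(dim)]

    # Patch embedding: a convolution whose kernel size and stride both
    # equal the patch size, so each patch maps to one dim-channel vector.
    layers = [nn.Conv2d(3, dim, kernel_size=patch_size, stride=patch_size), *act_norm()]

    for _ in range(depth):
        layers.append(
            nn.Sequential(
                # Spatial mixing: depthwise convolution (groups=dim) with a residual.
                Residual(
                    nn.Sequential(
                        nn.Conv2d(dim, dim, kernel_size, groups=dim, padding=kernel_size // 2),
                        *act_norm(),
                    )
                ),
                # Channel mixing: pointwise (1x1) convolution.
                nn.Conv2d(dim, dim, kernel_size=1),
                *act_norm(),
            )
        )

    # Classifier head: global average pooling, flatten, linear layer.
    layers += [nn.AdaptiveAvgPool2d((1, 1)), nn.Flatten(), nn.Linear(dim, n_classes)]
    return nn.Sequential(*layers)


# Illustrative values only; they are not taken from the notebook.
model = conv_mixer_readable(dim=32, depth=2, kernel_size=5, patch_size=2, n_classes=10)
logits = model(torch.randn(4, 3, 32, 32))
```

With these illustrative settings, a batch of four 32x32 RGB images yields one row of 10 logits per image.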

Figure: convmixer.png

Configuration

Imports

Configuration

Data

Model

Utilities

ConvMixer

Training

Optimizer

Setup trainer

Start training

Improved model

Model

Training

Configuration

Setup trainer

Start training

Larger model: extended channel mixing

Model

Training

Configuration

Setup trainer

Start training