Preface

Transformers are being applied to more and more vision tasks. The dominant approach is to split the image into patches, form a patch sequence, and feed that sequence directly into a transformer. However, this ignores the structural information inside each patch. This paper therefore proposes a transformer model that exploits sequence information both within and between patches, called Transformer-iN-Transformer, or TNT for short.

The main idea

The TNT model divides an image into a sequence of patches and reshapes each patch into a sequence of pixels. Linear transformations of the patches and of the pixels yield the patch embeddings and the pixel embeddings respectively. Both are then fed into a stack of TNT blocks for learning.
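The splitting and embedding step can be sketched with NumPy as follows. This is a minimal illustration, not the authors' implementation: the function name, the image size, the patch size of 16, the embedding widths, and the random projection matrices `W_pixel` and `W_patch` are all assumptions made for the example.

```python
import numpy as np

def split_image(image, patch_size):
    """Split an (H, W, C) image into patches and each patch into pixels.

    Returns
      patches: (n, p*p*C)  one flattened vector per patch
      pixels:  (n, p*p, C) one pixel sequence per patch
    """
    h, w, c = image.shape
    p = patch_size
    assert h % p == 0 and w % p == 0, "image size must be a multiple of patch size"
    gh, gw = h // p, w // p
    # (gh, p, gw, p, C) -> (gh, gw, p, p, C): group pixels by the patch they belong to
    blocks = image.reshape(gh, p, gw, p, c).transpose(0, 2, 1, 3, 4)
    pixels = blocks.reshape(gh * gw, p * p, c)
    patches = pixels.reshape(gh * gw, p * p * c)
    return patches, pixels

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32, 3))      # hypothetical input image
patches, pixels = split_image(image, 16)      # 4 patches of 256 pixels each

# Hypothetical linear projections (random weights stand in for learned ones).
W_pixel = rng.standard_normal((3, 8)) * 0.02
W_patch = rng.standard_normal((16 * 16 * 3, 64)) * 0.02
pixel_emb = pixels @ W_pixel                  # (4, 256, 8)
patch_emb = patches @ W_patch                 # (4, 64)
```

Each patch thus has two views: a single vector used at the patch level, and a short sequence of pixel vectors used inside the patch.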

Each TNT block consists of an outer transformer block and an inner transformer block.

The outer transformer block models the global correlations among the patch embeddings, while the inner block models the local structural information among the pixel embeddings. Local information is fused into the patch embeddings by linearly mapping the pixel embeddings into the patch embedding space. Position encodings are introduced to preserve spatial information. Finally, the class token is passed through an MLP for classification.
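The data flow through one TNT block can be sketched in NumPy as below. This is a simplified illustration under stated assumptions, not the reference implementation: attention is single-head rather than multi-head, the MLP uses ReLU, layer normalization has no learnable scale or shift, the class token is assumed to sit at index 0 of the patch sequence, and all names, sizes, and random weights are hypothetical.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def init_transformer(rng, dim, hidden, scale=0.02):
    shapes = {"wq": (dim, dim), "wk": (dim, dim), "wv": (dim, dim),
              "wo": (dim, dim), "w1": (dim, hidden), "w2": (hidden, dim)}
    return {k: rng.standard_normal(s) * scale for k, s in shapes.items()}

def self_attention(x, p):
    # Single-head attention (a simplification of multi-head self-attention).
    q, k, v = x @ p["wq"], x @ p["wk"], x @ p["wv"]
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(x.shape[-1])
    return softmax(scores) @ v @ p["wo"]

def mlp(x, p):
    return np.maximum(x @ p["w1"], 0.0) @ p["w2"]

def transformer_block(x, p):
    x = x + self_attention(layer_norm(x), p)   # residual connection
    return x + mlp(layer_norm(x), p)           # residual connection

def tnt_block(pixel_emb, patch_emb, params):
    """pixel_emb: (n, m, c); patch_emb: (n + 1, d), class token at index 0."""
    n, m, c = pixel_emb.shape
    # Inner block: attention among the pixels of each patch independently.
    pixel_emb = transformer_block(pixel_emb, params["inner"])
    # Flatten each patch's pixel embeddings and map them to patch space.
    projected = pixel_emb.reshape(n, m * c) @ params["w_proj"] + params["b_proj"]
    patch_emb = patch_emb.copy()
    patch_emb[1:] += projected                 # the class token receives nothing
    # Outer block: attention among the patches (and the class token).
    patch_emb = transformer_block(patch_emb, params["outer"])
    return pixel_emb, patch_emb

rng = np.random.default_rng(0)
n, m, c, d = 4, 16, 8, 32                      # hypothetical sizes
params = {"inner": init_transformer(rng, c, 4 * c),
          "outer": init_transformer(rng, d, 4 * d),
          "w_proj": rng.standard_normal((m * c, d)) * 0.02,
          "b_proj": np.zeros(d)}
pixel_in = rng.standard_normal((n, m, c))
patch_in = rng.standard_normal((n + 1, d))
pixel_out, patch_out = tnt_block(pixel_in, patch_in, params)
```

The inner block treats the patch index as a batch dimension, so pixels only attend to pixels of the same patch. Information crosses patch boundaries only in the outer block.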

With the TNT model, both global and local structural information can be modeled, which improves feature representation. In terms of both accuracy and computation, TNT performs well on ImageNet and on downstream tasks. For example, TNT-S reaches 81.3% top-1 accuracy on ImageNet with only 5.2B FLOPs, 1.5% higher than DeiT.

Some details

With reference to the architecture figure, the TNT block can be described by several formulas, which use the following notation.

MSA is Multi-head Self-Attention.

MLP is Multi-Layer Perceptron.

LN is Layer Normalization.

Vec is the flatten operation.

The plus sign denotes a residual connection.

The first two formulas describe the inner transformer block, which processes information inside a patch. The third formula linearly maps the information inside the patch into the patch embedding space. The last two formulas describe the outer transformer block, which processes information between patches.
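One way to write these five formulas, consistent with the description and the notation above, is sketched here. The symbols are assumptions introduced for this sketch: Y^i_l denotes the pixel embeddings of patch i after layer l, Z_l the sequence of patch embeddings after layer l (with Z^i its i-th entry), and W, b the weights and bias of the linear mapping. The exact indexing in the original figure may differ.

```latex
\begin{align}
Y'^{\,i}_{l} &= Y^{i}_{l-1} + \mathrm{MSA}\big(\mathrm{LN}(Y^{i}_{l-1})\big) \\
Y^{i}_{l}    &= Y'^{\,i}_{l} + \mathrm{MLP}\big(\mathrm{LN}(Y'^{\,i}_{l})\big) \\
Z^{i}_{l-1}  &\leftarrow Z^{i}_{l-1} + \mathrm{Vec}(Y^{i}_{l})\, W_{l-1} + b_{l-1} \\
Z'_{l}       &= Z_{l-1} + \mathrm{MSA}\big(\mathrm{LN}(Z_{l-1})\big) \\
Z_{l}        &= Z'_{l} + \mathrm{MLP}\big(\mathrm{LN}(Z'_{l})\big)
\end{align}
```

The third line updates each patch embedding in place before the outer block runs, which is how the local information reaches the patch level.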

The position encoding scheme is easiest to understand from the figure below.

The model parameters and computational cost are shown in the table below:
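A minimal sketch of one such scheme follows. It assumes, rather than takes from the text, that the encodings are simply added to the embeddings: one encoding per patch position, and one encoding per pixel position that is shared by all patches. The function name, array names, sizes, and random values are hypothetical.

```python
import numpy as np

def add_position_encodings(pixel_emb, patch_emb, pixel_pos, patch_pos):
    """Add position encodings to both embedding levels.

    pixel_emb: (n, m, c), pixel_pos: (m, c)      shared across the n patches
    patch_emb: (n + 1, d), patch_pos: (n + 1, d) one row per patch, plus class token
    """
    # Broadcasting applies the same (m, c) pixel encoding to every patch.
    return pixel_emb + pixel_pos, patch_emb + patch_pos

rng = np.random.default_rng(0)
n, m, c, d = 4, 16, 8, 32
pixel_emb = rng.standard_normal((n, m, c))
patch_emb = rng.standard_normal((n + 1, d))
pixel_pos = rng.standard_normal((m, c)) * 0.02      # stands in for a learned table
patch_pos = rng.standard_normal((n + 1, d)) * 0.02  # stands in for a learned table

pixel_enc, patch_enc = add_position_encodings(pixel_emb, patch_emb, pixel_pos, patch_pos)
```

Under this assumption the patch encoding tells the outer block where a patch lies in the image, and the pixel encoding tells the inner block where a pixel lies inside its patch.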

Conclusion

This article is part of the technical summary series from the CV Technical Guide public account.
