24.10.23 - Advances in AI: How RMT is Improving Visual Recognition

Nowadays, almost everybody knows that the T in ChatGPT stands for transformer. Transformers were first introduced in the field of natural language processing, then applied to computer vision and, in some settings, proved to outperform Convolutional Neural Networks (CNNs).

In a recent paper published on arXiv, Qihang Fan and colleagues propose a novel method named #RMT, which combines #VisionTransformers (ViT) with a #RetentionNetwork (RetNet), a promising architecture for language models. According to their experiments, RMT outperforms ViT on several benchmarks.

While keeping an eye on recent advances in computer vision algorithms, the Swiss Territorial Data Lab (STDL) recently published on GitHub a first release of its Object Detector, with added support for XYZ raster web services and Docker containers. Last but not least, another hands-on example, on quarry detection, has been added.

Image credit and article can be found here: https://lnkd.in/eDUBbkQE
