Computer Vision

Image classification

★
V3
Image classification from scratch
★
V3
Simple MNIST convnet
★
V3
Image classification via fine-tuning with EfficientNet
V3
Image classification with Vision Transformer
V3
Classification using Attention-based Deep Multiple Instance Learning
V3
Image classification with modern MLP models
V3
A mobile-friendly Transformer-based model for image classification
V3
Pneumonia Classification on TPU
V3
Compact Convolutional Transformers
V3
Image classification with ConvMixer
V3
Image classification with EANet (External Attention Transformer)
V3
Involutional neural networks
V3
Image classification with Perceiver
V3
Few-Shot learning with Reptile
V3
Semi-supervised image classification using contrastive pretraining with SimCLR
V3
Image classification with Swin Transformers
V2
Train a Vision Transformer on small datasets
V2
A Vision Transformer without Attention
V3
Image Classification using Global Context Vision Transformer
V3
Image Classification using BigTransfer (BiT)

Image segmentation

★
V3
Image segmentation with a U-Net-like architecture
V3
Multiclass semantic segmentation using DeepLabV3+
V2
Highly accurate boundaries segmentation using BASNet
V3
Image Segmentation using Composable Fully-Convolutional Networks

Object detection

V2
Object Detection with RetinaNet
V3
Keypoint Detection with Transfer Learning
V3
Object detection with Vision Transformers

3D

V3
3D image classification from CT scans
V3
Monocular depth estimation
★
V3
3D volumetric rendering with NeRF
V3
Point cloud segmentation with PointNet
V3
Point cloud classification

OCR

V3
OCR model for reading Captchas
V3
Handwriting recognition

Image enhancement

V3
Convolutional autoencoder for image denoising
V3
Low-light image enhancement using MIRNet
V3
Image Super-Resolution using an Efficient Sub-Pixel CNN
V3
Enhanced Deep Residual Networks for single-image super-resolution
V3
Zero-DCE for low-light image enhancement

Data augmentation

V3
CutMix data augmentation for image classification
V3
MixUp augmentation for image classification
V3
RandAugment for Image Classification for Improved Robustness

Image & Text

★
V3
Image captioning
V2
Natural language image search with a Dual Encoder

Vision models interpretability

V3
Visualizing what convnets learn
V3
Model interpretability with Integrated Gradients
V3
Investigating Vision Transformer representations
V3
Grad-CAM class activation visualization

Image similarity search

V2
Near-duplicate image search
V3
Semantic Image Clustering
V3
Image similarity estimation using a Siamese Network with a contrastive loss
V3
Image similarity estimation using a Siamese Network with a triplet loss
V3
Metric learning for image similarity search
V2
Metric learning for image similarity search using TensorFlow Similarity
V3
Self-supervised contrastive learning with NNCLR

Video

V3
Video Classification with a CNN-RNN Architecture
V3
Next-Frame Video Prediction with Convolutional LSTMs
V3
Video Classification with Transformers
V3
Video Vision Transformer

Performance recipes

V3
Gradient Centralization for Better Training Performance
V3
Learning to tokenize in Vision Transformers
V3
Knowledge Distillation
V3
FixRes: Fixing train-test resolution discrepancy
V3
Class Attention Image Transformers with LayerScale
V3
Augmenting convnets with aggregated attention
V3
Learning to Resize

Other

V2
Semi-supervision and domain adaptation with AdaMatch
V2
Barlow Twins for Contrastive SSL
V2
Consistency training with supervision
V2
Distilling Vision Transformers
V2
Focal Modulation: A replacement for Self-Attention
V2
Using the Forward-Forward Algorithm for Image Classification
V2
Masked image modeling with Autoencoders
V2
Segment Anything Model with 🤗Transformers
V2
Semantic segmentation with SegFormer and Hugging Face Transformers
V2
Self-supervised contrastive learning with SimSiam
V2
Supervised Contrastive Learning
V2
When Recurrence meets Transformers
V2
Efficient Object Detection with YOLOV8 and KerasCV