10 Open Datasets You Can Use For Computer Vision Projects

Computer vision is driving progress in almost every industry. With the help of computer vision technologies, organisations are changing the way machines work. Big tech companies around the globe now use computer vision in domains like healthcare and autonomous driving, among others. To build a robust deep learning model for computer vision, one must use high-quality datasets in the training phase.

In this article, we list 10 high-quality datasets that one can use for computer vision projects.

1| CIFAR-10

CIFAR-10 is a popular computer vision dataset collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton. It is used for object recognition and consists of 60,000 32×32 colour images in 10 classes, with 6,000 images per class. It is divided into five training batches and one test batch, each with 10,000 images, which means there are 50,000 training images and 10,000 test images.
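The layout described above can be illustrated with a minimal sketch. It assumes the Python version of CIFAR-10, in which each image is stored as a flat row of 3,072 values: 1,024 red, then 1,024 green, then 1,024 blue, each channel in row-major order. The function name `row_to_image` is hypothetical, and plain Python lists are used instead of an array library to keep the example dependency-free.

```python
# Sketch: turn one flat CIFAR-10 row (3,072 values) into a 32x32 grid of RGB tuples.
# Assumes the channel-planar layout of the Python version of the dataset.

SIDE = 32             # images are 32x32 pixels
PLANE = SIDE * SIDE   # 1,024 values per colour channel


def row_to_image(row):
    """Return a 32x32 nested list of (r, g, b) tuples from a flat row."""
    if len(row) != 3 * PLANE:
        raise ValueError(f"expected {3 * PLANE} values, got {len(row)}")
    red = row[:PLANE]
    green = row[PLANE:2 * PLANE]
    blue = row[2 * PLANE:]
    return [
        [(red[y * SIDE + x], green[y * SIDE + x], blue[y * SIDE + x])
         for x in range(SIDE)]
        for y in range(SIDE)
    ]


# Batch arithmetic from the description: five training batches plus one test batch.
TRAIN_IMAGES = 5 * 10_000
TEST_IMAGES = 1 * 10_000
TOTAL_IMAGES = TRAIN_IMAGES + TEST_IMAGES
```

In practice, most people would reshape the whole batch at once with an array library; the nested-list version here only makes the byte order explicit.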

Click here to download.

2| Cityscapes

Cityscapes is an open-sourced large-scale dataset for computer vision projects that contains a diverse set of stereo video sequences recorded in street scenes from 50 different cities. It includes high-quality pixel-level annotations of 5,000 frames in addition to a larger set of 20,000 weakly annotated frames. The dataset is mainly used for training deep neural networks and assessing the performance of vision algorithms on the major tasks of semantic urban scene understanding.

Click here to download.

3| Fashion-MNIST

Fashion-MNIST is an image dataset for Computer Vision which consists of a training set of 60,000 examples and a test set of 10,000 examples. In this dataset, each example is a 28×28 grayscale image, associated with a label from 10 classes. There is an automatic benchmarking system based on Scikit-learn that covers 129 classifiers with different parameters.
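As a rough sketch of how such examples can be read, the snippet below parses the IDX binary layout that Fashion-MNIST shares with the original MNIST files: a big-endian header (magic number 2051 for images, 2049 for labels) followed by unsigned bytes. The function names `parse_idx_images` and `parse_idx_labels` are hypothetical, and the input is assumed to be already decompressed bytes.

```python
import struct

# Class names as listed in the Fashion-MNIST README (list index = label id).
LABEL_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

IMAGE_MAGIC = 2051  # IDX magic number for a 3-dimensional unsigned-byte array
LABEL_MAGIC = 2049  # IDX magic number for a 1-dimensional unsigned-byte array


def parse_idx_images(raw):
    """Parse decompressed IDX image bytes into a list of flat pixel lists."""
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != IMAGE_MAGIC:
        raise ValueError(f"unexpected magic number {magic}")
    size = rows * cols
    if len(raw) != 16 + count * size:
        raise ValueError("payload length does not match the header")
    return [list(raw[16 + i * size:16 + (i + 1) * size]) for i in range(count)]


def parse_idx_labels(raw):
    """Parse decompressed IDX label bytes into a list of integer labels."""
    magic, count = struct.unpack(">II", raw[:8])
    if magic != LABEL_MAGIC:
        raise ValueError(f"unexpected magic number {magic}")
    if len(raw) != 8 + count:
        raise ValueError("payload length does not match the header")
    return list(raw[8:8 + count])
```

The distributed files are gzip-compressed, so a caller would typically read them with `gzip.open(path, "rb").read()` before passing the bytes to these functions. Each parsed image is a flat list of 28 × 28 = 784 grayscale values.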

Click here to download.

4| ImageNet

One of the most popular datasets for computer vision projects, ImageNet is an accessible image database organised according to the WordNet hierarchy. WordNet contains more than 100,000 synsets, and ImageNet aims to provide an average of 1,000 images to illustrate each one. The goal is to offer tens of millions of cleanly sorted images for most of the concepts in the WordNet hierarchy.

Click here to download.

5| IMDB-Wiki Dataset

The IMDB-Wiki dataset is one of the largest open-sourced datasets of face images with gender and age labels for training. It contains a total of 523,051 face images, of which 460,723 are of 20,284 celebrities from IMDb and 62,328 are from Wikipedia.

Click here to download.

6| Kinetics-700

Kinetics-700 is a large-scale, high-quality dataset of YouTube video URLs that covers a diverse range of human-focused actions. The dataset consists of approximately 650,000 video clips and covers 700 human action classes, with at least 600 video clips for each action class. Each clip lasts around 10 seconds and is labelled with a single class.

Click here to download.

7| MS COCO

COCO, or Common Objects in COntext, is a large-scale object detection, segmentation, and captioning dataset. The dataset contains photos of 91 object types that are easily recognisable and has a total of 2.5 million labelled instances in 328k images.
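To give a feel for what labelled instances look like in practice, here is a minimal sketch assuming the COCO JSON annotation layout: top-level `images`, `categories`, and `annotations` lists, where each annotation points at an image and a category and carries a `bbox` in `[x, y, width, height]` form. The `toy` dictionary and the function name `instances_per_image` are made up for illustration.

```python
from collections import defaultdict

# A tiny, made-up annotation dictionary in the COCO JSON layout.
toy = {
    "images": [
        {"id": 1, "file_name": "a.jpg"},
        {"id": 2, "file_name": "b.jpg"},
    ],
    "categories": [
        {"id": 1, "name": "person"},
        {"id": 18, "name": "dog"},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [10.0, 20.0, 50.0, 80.0]},
        {"id": 11, "image_id": 1, "category_id": 18, "bbox": [70.0, 60.0, 40.0, 30.0]},
        {"id": 12, "image_id": 2, "category_id": 1, "bbox": [5.0, 5.0, 100.0, 200.0]},
    ],
}


def instances_per_image(coco):
    """Group category names of labelled instances by image id."""
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    grouped = defaultdict(list)
    for ann in coco["annotations"]:
        grouped[ann["image_id"]].append(names[ann["category_id"]])
    return dict(grouped)
```

With the figures quoted above, 2.5 million instances across 328k images works out to roughly 7.6 labelled instances per image on average.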

Click here to download.

8| MPII Human Pose Dataset

The MPII Human Pose dataset is used for evaluating articulated human pose estimation. The dataset includes around 25K images containing over 40K people with annotated body joints. Each image is extracted from a YouTube video and provided with preceding and following un-annotated frames. Overall, the dataset covers 410 human activities, and each image is provided with an activity label.

Click here to download.

9| Open Images

The Open Images dataset is one of the largest existing datasets with object location annotations. It consists of around 9 million images annotated with image-level labels, object bounding boxes, object segmentation masks, and visual relationships. The dataset contains a total of 16 million bounding boxes for 600 object classes on 1.9 million images.

Click here to download.

10| The 20BN-something-something Dataset V2

The 20BN-Something-Something dataset is a large collection of densely labelled video clips that show humans performing pre-defined basic actions with everyday objects. It was created by a large number of crowd workers and allows ML models to develop a fine-grained understanding of basic actions that occur in the physical world. The dataset contains 220,847 videos in total: 168,913 in the training set, 24,777 in the validation set, and 27,157 in the test set.
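The split sizes quoted above can be sanity-checked with a few lines of arithmetic. This is only an illustrative sketch; the names `SPLITS`, `TOTAL`, and `split_shares` are hypothetical.

```python
# Split sizes as quoted in the dataset description.
SPLITS = {"train": 168_913, "validation": 24_777, "test": 27_157}
TOTAL = 220_847


def split_shares(splits):
    """Return each split's share of all videos, in percent (one decimal)."""
    total = sum(splits.values())
    return {name: round(100 * count / total, 1) for name, count in splits.items()}


shares = split_shares(SPLITS)
```

The three splits add up to the stated total, and the training set accounts for roughly three quarters of the videos.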

Click here to download.


Ambika Choudhury

A Technical Journalist who loves writing about Machine Learning and Artificial Intelligence. A lover of music, writing and learning something out of the box.