How to Save the CIFAR-100 Dataset as Images

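The basic approach is the same as for CIFAR-10, but a few details differ: the training data comes in a single 'train' file, and the labels and class names are read from the 'fine_labels' and 'fine_label_names' keys.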
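One way to see how the CIFAR-100 files are organized is to list the keys stored in each pickled file. The following is only a minimal sketch: list_pickle_keys is a hypothetical helper name, not part of the original script, and it assumes the file holds a pickled dictionary.

```python
import pickle


def list_pickle_keys(path):
    """Return the sorted top-level keys of a pickled dict as text.

    Hypothetical helper for illustration; assumes the file holds a dict.
    """
    with open(path, 'rb') as f:
        obj = pickle.load(f, encoding='bytes')
    # Keys loaded with encoding='bytes' are usually bytes, so decode them.
    return sorted(k.decode('utf8') if isinstance(k, bytes) else str(k) for k in obj)
```

Run from the extracted folder, list_pickle_keys('train') should include data, filenames, and fine_labels, which are the keys the script in this post relies on.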

How to Save the CIFAR-10 Dataset as Images

I needed the individual image files rather than a TensorFlow or PyTorch dataset object, so I am sharing the code I wrote. The steps are: 1. Download the file from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz and..

mindw96.tistory.com

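Download https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz and extract it wherever you like.
In the folder where you extracted it, run the attached py file or the code below. (NumPy must be installed; the code also imports PIL, so Pillow is needed as well.)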
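The download and extraction step can also be scripted instead of done by hand. This is a rough sketch using only the standard library; download_archive and extract_archive are hypothetical helper names introduced here, and the local archive file name is an assumption.

```python
import tarfile
import urllib.request
from pathlib import Path

CIFAR100_URL = 'https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz'


def download_archive(url, archive_path):
    """Download url to archive_path unless the file already exists."""
    archive_path = Path(archive_path)
    if not archive_path.exists():
        urllib.request.urlretrieve(url, str(archive_path))
    return archive_path


def extract_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir; return its top-level names."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(str(archive_path), 'r:gz') as tar:
        top_level = sorted({m.name.split('/')[0] for m in tar.getmembers()})
        tar.extractall(str(dest_dir))
    return top_level
```

For example, calling download_archive(CIFAR100_URL, 'cifar-100-python.tar.gz') and then extract_archive('cifar-100-python.tar.gz', '.') should leave the train, test, and meta files inside the extracted folder. The script in this post opens those files by relative name, so it has to be run from that folder.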

import numpy as np
from PIL import Image
import os
import pickle


def unpickle(file):
    # Load one pickled CIFAR file; its keys come back as bytes (e.g. b'data')
    with open(file, 'rb') as fo:
        batch = pickle.load(fo, encoding='bytes')
    return batch


# Load the fine-grained class names from the meta file
with open('meta', 'rb') as infile:
    data = pickle.load(infile, encoding='latin1')
    classes = data['fine_label_names']

# Create one folder per class (exist_ok lets the script be rerun safely)
for name in classes:
    os.makedirs('./train_image/{}'.format(name), exist_ok=True)
    os.makedirs('./test_image/{}'.format(name), exist_ok=True)

# Trainset Unpacking
# CIFAR-100 keeps the whole training set in a single 'train' file
print('Unpacking Train File')

train_file = unpickle('train')
train_data = train_file[b'data']

# Reshape (50000, 3072) -> (50000, 3, 32, 32)
train_data_reshape = train_data.reshape((-1, 3, 32, 32))

# Move the channel axis last, giving (50000, 32, 32, 3), so images can be saved
train_data_reshape = train_data_reshape.transpose(0, 2, 3, 1)

# List of fine-grained labels, one per image
train_labels = train_file[b'fine_labels']

# List of original file names, one per image
train_filename = train_file[b'filenames']

# Save all 50000 training images in order
for idx in range(len(train_labels)):
    train_label = train_labels[idx]
    train_image = Image.fromarray(train_data_reshape[idx])

    # Save the file into the folder for its class
    train_image.save('./train_image/{}/{}'.format(classes[train_label], train_filename[idx].decode('utf8')))
    
# -----------------------------------------------------------------------------------------

# Testset Unpacking
print('Unpacking Test File')
test_file = unpickle('test')

test_data = test_file[b'data']

# Reshape (10000, 3072) -> (10000, 3, 32, 32)
test_data_reshape = test_data.reshape((-1, 3, 32, 32))

# Move the channel axis last, giving (10000, 32, 32, 3), so images can be saved
test_data_reshape = test_data_reshape.transpose(0, 2, 3, 1)

# List of fine-grained labels, one per image
test_labels = test_file[b'fine_labels']

# List of original file names, one per image
test_filename = test_file[b'filenames']

# Save all 10000 test images in order
for idx in range(len(test_labels)):
    test_label = test_labels[idx]
    test_image = Image.fromarray(test_data_reshape[idx])

    # Save the file into the folder for its class
    test_image.save('./test_image/{}/{}'.format(classes[test_label], test_filename[idx].decode('utf8')))

print('Unpacking Finish')
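The reshape step in the script relies on how each 3072-value row is laid out: 1024 red values, then 1024 green, then 1024 blue, each plane stored row by row. A small check on synthetic data can confirm that the axis reordering produces ordinary RGB pixels. This is only an illustrative sketch; rows_to_images is a hypothetical helper and the fake row is made-up data, not taken from the dataset.

```python
import numpy as np


def rows_to_images(rows):
    """Convert (N, 3072) CIFAR-style rows into (N, 32, 32, 3) uint8 images."""
    rows = np.asarray(rows, dtype=np.uint8)
    # Each row is laid out as [R plane | G plane | B plane], 1024 values each.
    planes = rows.reshape(-1, 3, 32, 32)
    # Move the channel axis last: (N, C, H, W) -> (N, H, W, C).
    return planes.transpose(0, 2, 3, 1)


# Synthetic row: red plane all 10, green plane all 20, blue plane all 30.
fake_row = np.concatenate([np.full(1024, 10), np.full(1024, 20), np.full(1024, 30)])
images = rows_to_images(fake_row[np.newaxis, :])
print(images.shape)      # (1, 32, 32, 3)
print(images[0, 0, 0])   # [10 20 30]
```

A single transpose(0, 2, 3, 1) gives the same result as swapaxes(1, 3) followed by swapaxes(1, 2).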

License: Attribution, NonCommercial, NoDerivatives
