[Required Assignment 1] Multi-Layer Perceptron

*Problem

Implement an MLP on the MNIST dataset using PyTorch.

*Code

Import Library
- numpy
- matplotlib
- torch
- torch.nn as nn
- torch.optim as optim
- torch.nn.functional as F
- Plus device setup (GPU or CPU)
  - device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
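
The imports and device setup listed above can be sketched as a single cell. This is a minimal sketch that assumes PyTorch, NumPy, and Matplotlib are installed; 'cuda:0' (the first GPU) is an assumed device name.

```python
import numpy as np
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F

# Use the first GPU when CUDA is available, otherwise fall back to the CPU.
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print("PyTorch version:[%s], device:[%s]." % (torch.__version__, device))
```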

Dataset
- from torchvision import datasets, transforms
- MNIST dataset (training set: 60,000 images; test set: 10,000 images)

Data Iterator
- Pass the dataset, batch size, and other options to torch.utils.data.DataLoader to create an iterator object
- With shuffle=True, the data is randomly reshuffled at the start of every epoch
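
A minimal sketch of the iterator setup follows. To stay runnable without downloading MNIST, it wraps a small random `TensorDataset` with the same per-sample shape. The batch size of 256 and the helper name `make_iterators` are assumptions, not values taken from the assignment.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def make_iterators(train_set, test_set, batch_size=256):
    """Build (train_iter, test_iter); only the training data is shuffled."""
    train_iter = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    test_iter = DataLoader(test_set, batch_size=batch_size, shuffle=False)
    return train_iter, test_iter

# Stand-in data shaped like MNIST: 1000 grayscale 28x28 images, labels 0-9.
fake_x = torch.rand(1000, 1, 28, 28)
fake_y = torch.randint(0, 10, (1000,))
fake_set = TensorDataset(fake_x, fake_y)
train_iter, test_iter = make_iterators(fake_set, fake_set)

batch_in, batch_out = next(iter(train_iter))
print(batch_in.shape, batch_out.shape)  # one mini-batch of images and labels
print(len(train_iter))                  # number of mini-batches per epoch
```

With the real data, the MNIST training and test datasets would be passed in place of the stand-in set.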

Define the MLP model

class MultiLayerPerceptronClass(nn.Module):
    # Subclassing nn.Module registers the layers' parameters (so M.parameters() and .to(device) work)
    # and lets M(x) dispatch to the forward() defined below.
    """
        Multilayer Perceptron (MLP) Class
    """
    def __init__(self,name='mlp',xdim=784,hdim=256,ydim=10):
        super(MultiLayerPerceptronClass,self).__init__()
        self.name = name
        self.xdim = xdim
        self.hdim = hdim
        self.ydim = ydim
        self.lin_1 = nn.Linear(self.xdim,self.hdim)
        self.lin_2 = nn.Linear(self.hdim,self.ydim)
        self.init_param() # initialize parameters with fresh values (not pre-trained weights)
        
    def init_param(self):
        nn.init.kaiming_normal_(self.lin_1.weight)
        nn.init.zeros_(self.lin_1.bias)
        nn.init.kaiming_normal_(self.lin_2.weight)
        nn.init.zeros_(self.lin_2.bias)

    def forward(self,x):
        net = x
        net = self.lin_1(net)
        net = F.relu(net)
        net = self.lin_2(net)
        # input x -> linear1 (affine) -> activation function -> linear2 -> return
        # no activation after linear2, because its outputs are the logits
        return net

M = MultiLayerPerceptronClass(name='mlp',xdim=784,hdim=256,ydim=10).to(device)
loss = nn.CrossEntropyLoss()
optm = optim.Adam(M.parameters(),lr=1e-3)
print ("Done.")

Create an instance of the MLP class.
Set xdim = 784 (28 * 28), hdim = 256, and ydim = 10 (classes 0-9) to match the MNIST dataset.
Define the parameter initialization and the forward logic.
optimizer = Adam, learning rate = 1e-3, loss function = cross-entropy (classification)
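
The shapes and the choice to return raw logits can be checked with a short sketch. It rebuilds the same two-layer network with `nn.Sequential` (a stand-in for the class above, without the custom initialization) and shows that `nn.CrossEntropyLoss` applies log-softmax internally, which is why no activation follows the second linear layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
# Stand-in for MultiLayerPerceptronClass: 784 -> 256 -> ReLU -> 10.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

# Parameter count: 784*256 + 256 + 256*10 + 10 = 203530
n_param = sum(p.numel() for p in model.parameters())
print(n_param)

# A dummy batch of two flattened images with arbitrary labels.
x = torch.rand(2, 784)
y = torch.tensor([3, 7])
logits = model(x)
print(logits.shape)  # torch.Size([2, 10])

# CrossEntropyLoss on logits equals NLL loss on log-softmax of the logits.
ce = nn.CrossEntropyLoss()(logits, y)
manual = F.nll_loss(F.log_softmax(logits, dim=1), y)
print(torch.allclose(ce, manual))
```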

Train

print ("Start training.")
M.init_param() # initialize parameters
M.train()
EPOCHS,print_every = 10,1
for epoch in range(EPOCHS):
    loss_val_sum = 0
    for batch_in,batch_out in train_iter:
        # Forward path
        y_pred = M(batch_in.view(-1, 28*28).to(device))
        loss_out = loss(y_pred,batch_out.to(device))
        # Update
        optm.zero_grad()      # reset gradient 
        loss_out.backward()      # backpropagate
        optm.step()      # optimizer update
        loss_val_sum += loss_out.item()      # accumulate a Python float, not a graph-attached tensor
    loss_val_avg = loss_val_sum/len(train_iter)
    # Print
    if ((epoch%print_every)==0) or (epoch==(EPOCHS-1)):
        train_accr = func_eval(M,train_iter,device)
        test_accr = func_eval(M,test_iter,device)
        print ("epoch:[%d] loss:[%.3f] train_accr:[%.3f] test_accr:[%.3f]."%
               (epoch,loss_val_avg,train_accr,test_accr))
print ("Done")

Trained for 10 epochs using mini-batches.
Training results
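
The training loop calls `func_eval`, which is not shown in this post. A plausible sketch is below, assuming it returns classification accuracy over an iterator and flattens each 28x28 image the same way the training loop does. The body is an assumption, not the course's actual code, and `AlwaysZero` is a hypothetical toy model used only to exercise it.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def func_eval(model, data_iter, device):
    """Return the fraction of samples in data_iter that the model classifies correctly."""
    was_training = model.training
    model.eval()  # evaluation mode (matters for dropout / batch norm)
    n_total, n_correct = 0, 0
    with torch.no_grad():  # no gradients needed for evaluation
        for batch_in, batch_out in data_iter:
            y_trgt = batch_out.to(device)
            logits = model(batch_in.view(-1, 28*28).to(device))
            y_pred = logits.argmax(dim=1)
            n_correct += (y_pred == y_trgt).sum().item()
            n_total += batch_in.size(0)
    model.train(was_training)  # restore the previous mode
    return n_correct / n_total

class AlwaysZero(nn.Module):
    """Toy model that always predicts class 0."""
    def forward(self, x):
        logits = torch.zeros(x.size(0), 10, device=x.device)
        logits[:, 0] = 1.0
        return logits

demo_x = torch.rand(4, 1, 28, 28)
demo_y = torch.tensor([0, 0, 1, 2])
demo_iter = DataLoader(TensorDataset(demo_x, demo_y), batch_size=2)
demo_accr = func_eval(AlwaysZero(), demo_iter, torch.device('cpu'))
print(demo_accr)  # 0.5 (two of the four labels are class 0)
```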

Test

n_sample = 25
sample_indices = np.random.choice(len(mnist_test.targets), n_sample, replace=False)
test_x = mnist_test.data[sample_indices]
test_y = mnist_test.targets[sample_indices]
with torch.no_grad():
    y_pred = M(test_x.view(-1, 28*28).type(torch.float).to(device)/255.)
y_pred = y_pred.argmax(axis=1)
plt.figure(figsize=(10,10))
for idx in range(n_sample):
    plt.subplot(5, 5, idx+1)
    plt.imshow(test_x[idx], cmap='gray')
    plt.axis('off')
    plt.title("Pred:%d, Label:%d"%(y_pred[idx],test_y[idx]))
plt.show()    
print ("Done")

Use the MNIST test data to check whether the model was trained well.
Consistent with the roughly 98% accuracy seen in the training results, all 25 samples above were classified correctly.
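
Rather than reading the 25 titles by eye, the match rate on the sampled batch can be computed directly. A minimal sketch, where `sample_accuracy` is a hypothetical helper and the two small tensors stand in for `y_pred` and `test_y` from the code above:

```python
import torch

def sample_accuracy(y_pred, y_true):
    """Fraction of predictions that match the labels (both moved to the CPU first)."""
    return (y_pred.cpu() == y_true.cpu()).float().mean().item()

# Stand-in predictions and labels: four of the five entries match.
pred = torch.tensor([7, 2, 1, 0, 4])
label = torch.tensor([7, 2, 1, 0, 9])
acc = sample_accuracy(pred, label)
print("accuracy on sample: %.2f" % acc)
```

A 25-image sample is too small to estimate test accuracy reliably, so the figure reported over the full test iterator remains the more meaningful number.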

*Learning Retrospective

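First, I came to understand the basics of an MLP, and I learned that the loss function, optimizer, forward propagation, and backpropagation can all be implemented concisely with PyTorch.

In last week's required assignment, when implementing mini-batch SGD, I shuffled the dataset myself using the random library. Creating an iterator object with torch's DataLoader makes mini-batching much more convenient.

The reason for initializing the parameters with self.init_param() is to train the model from scratch with freshly initialized weights rather than starting from a pre-trained model.

The lecture provided the base code, which made the assignment easy to complete, but I still need more practice with PyTorch usage and the details of writing the code.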

백chef
