Kwangmin Kim - Tensor Introduction

1 Tensor Flow

pytorch 이전 까지 deep learning을 위해 가장 많이 사용되었던 Framework
2020년 이후로 pytorch를 더 많이 사용하지만 여전히 많은 사람들이 Tensor Flow 사용
데이터 자료형으로 텐서(tensor) 객체를 사용
Tensorflow에서는 텐서(tensor)를 NumPy 배열처럼 사용할 수 있다.
GPU 사용 지원

1.1 GPU 사용 여부 체크하기

GPU를 사용하면 TensorFlow나 pytorch에서 딥러닝 모델을 효과적 구현 가능
각 텐서(tensor)와 연산이 어떠한 장치에 할당되었는지 출력할 수 있다.

import tensorflow as tf
# placement 함수: 각 텐서와 연산이 어떠한 장치에 할당되었는지 출력하기
#tf.debugging.set_log_device_placement(True)

# 텐서 생성
a = tf.constant([
    [1, 1],
    [2, 2]
])
b = tf.constant([
    [5, 6],
    [7, 8]
])

c = tf.matmul(a, b)
print("matrix multiplication: ", c)

#tf.debugging.set_log_device_placement(False)

from tensorflow.python.client import device_lib
# 구체적으로 사용 중인 장치(device) 정보 출력
device_lib.list_local_devices()

1.2 텐서 소개 및 생성 방법

TensorFlow에서의 텐서(tensor)는 기능적으로 넘파이(NumPy)와 매우 유사하다.
기본적으로 다차원 배열을 처리하기에 적합한 자료구조로 이해할 수 있다.
TensorFlow의 텐서는 “자동 미분” 기능을 제공한다.
TensorFlow는 기능적으로 Pytorch와 거의 같음, 하지만 문법이 불편함
TensorFlow 2.0부터는 pytorch와 문법적으로 유사

1.3 Tensor

특징
- 기본적으로 다차원 배열을 처리하기에 적합한 자료구조로 이해할 수 있다
- TensorFlow에서의 텐서(tensor)는 기능적으로 넘파이(NumPy)의 ndarray 객체와 유사
- 기본 python 데이터 유형을 자동 변환 (e.g., list)
- TensorFlow의 텐서는 “자동 미분” 기능을 제공한다.
속성
- 크기 (shape)
- 자료형 (data type)
- 저장된 장치, 가속기 메모리에 상주 가능 (e.g., GPU )
Numpy 배열과 tf.Tensor의 차이점
- 텐서는 가속기 메모리(GPU, TPU)에서 사용 가능
- 텐서는 불변성(immutable)

1.4 Tensor 초기화

# 기본적인 모양(shape), 자료형(data type) 출력

data = [
    [1, 2],
    [3, 4]
]
x = tf.constant(data) # list -> tensor object로 변환
print(x)
print(tf.rank(x)) # 축(axis)의 개수 출력 = 차원의 개수 출력

data = tf.constant("String") # 문자열 (string)도 int형 tensor로 변환 가능
print(data)

# NumPy 배열에서 텐서를 초기화할 수 있다.

## 파이썬의 리스트 넘파이는 compatible하다. 상호보완적으로 교체가 가능

a = tf.constant([5])
b = tf.constant([7])

c = (a + b).numpy()
print(c)
print(type(c))

result = c * 10
tensor = tf.convert_to_tensor(result) # numpy -> tensor
print(tensor)
print(type(tensor))

1.5 텐서(tensor) 객체 생성 (기본 python 데이터 유형)


import numpy as np

print(tf.math.add(1, 2))
print(tf.math.add([1, 2], [3, 4]))
print(tf.math.square(5))
print(tf.math.reduce_sum([1, 2, 3]))

# Operator overloading is also supported
print(tf.math.square(2) + tf.math.square(3))

data = [
    [1,2],
    [3,4]
]
x = tf.constant(data)
print(x)
print(x.shape)
print(x.dtype)
print(tf.rank(x)) # tf.rank() : 축(axis)의 개수 출력 (차원의 개수)

1.6 텐서(tensor) 객체 생성 (numpy)

TensorFlow 연산은 자동으로 NumPy 배열을 텐서(tensor)로 변환
NumPy 연산은 자동으로 텐서(tensor)를 NumPy 배열로 변환

import numpy as np
ndarray = np.ones([3, 3])
ndarray

tensor = tf.math.multiply(ndarray, 42)
tensor
np.add(tensor, 1)
tensor.numpy() # numpy.ndarray
type(tensor.numpy())
ctensor = tf.constant(ndarray)
ctensor

1.7 다른 텐서로부터 텐서 초기화

텐서(tensor) 객체 생성 (tf.Tensor)
tf.ones_like(x) : 값이 1이고 x와 shape & data type이 동일한 텐서 생성


x = tf.constant([
    [5, 7],
    [3, 2]
])

x_ones = tf.ones_like(x)
x_ones
     

x = tf.constant([
    [5.1, 7.0],
    [3.4, 2.1]
])

x_ones = tf.ones_like(x)
x_ones

# tf.random.uniform(shape, dtype) : 랜덤 값으로 원하는 shape과 dtype을 갖는 텐서 생성
x_rand = tf.random.uniform(shape=x.shape, dtype=tf.float32)
x_rand

1.8 텐서(tensor) 사용

특정 차원 접근


tensor = tf.constant([
    [1,2,3,4],
    [5,6,7,8],
    [9,10,11,12]
])

print(tensor[0])       # first row
print(tensor[:, 0])    # first column
print(tensor[..., -1]) # last column

텐서 Concatenate

axis : 어느 축을 기준으로 객체를 이어붙일지 결정

axis=0 : 0번째 축 (=row) axis=1 : 1번째 축 (=column)


tensor = tf.constant([
    [1,2,3,4],
    [5,6,7,8],
    [9,10,11,12]
])

tensor_concat = tf.concat([tensor, tensor, tensor], axis=0) # row
tensor_concat

tensor_concat = tf.concat([tensor, tensor, tensor], axis=1) # column
tensor_concat

1.8.1 형변환 (Type Casting)


a = tf.constant([2])   # dtype: int32
b = tf.constant([5.0]) # dtype: float32

print('a dtype: ', a.dtype, '\nb dtype: ', b.dtype)

#| eval: false

a + b # dtype 불일치 -> InvalidArgumentError 발생


tf.cast(a, tf.float32) + b # a의 dtype을 b의 dtype으로 변환 후 계산

1.8.2 텐서 Shape 변경


x = tf.Variable([1,2,3,4,5,6,7,8])
y = tf.reshape(x, (4,2))           # row=4, col=2
y

1.8.3 x와 y는 서로 다른 객체

x.assign_add([1,1,1,1,1,1,1,1])
print(x) # 1씩 더해짐
print(y) # 값 변화 X

1.8.4 텐서 차원 교환

tf.transpose(a, perm=[], ...) a의 차원 순서를 바꾼다. perm=[2, 1, 0]일 경우, a의 2번째 축을 첫번째로, 1번째 축을 두번째로, 0번째 축을 세번째로 교환하겠다는 의미


a = tf.random.uniform((64, 32, 3))
print(a.shape)

b = tf.transpose(a, perm=[2, 1, 0]) # 차원 자체를 교환
print(b.shape)

1.8.5 사칙연산

element끼리 연산한다


a = tf.constant([
    [1,2],
    [3,4]
])
b = tf.constant([
    [1,2],
    [3,4]
])

print(a + b)
print(a - b)
print(a * b)
print(a / b)

1.8.6 행렬 곱 (matrix multiplication)



a = tf.constant([
    [1,2],
    [3,4]
])
b = tf.constant([
    [1,2],
    [3,4]
])
tf.matmul(a, b)

1.8.7 평균 함수

차원을 축소하며 평균을 계산

tf.reduce_mean(a, axis=0) : 0차원(행)을 축소하여 평균 계산 -> 각 열에 대한 평균
tf.reduce_mean(a, axis=1) : 1차원(열)을 축소하여 평균 계산 -> 각 행에 대한 평균

a = tf.constant([
    [1,2,3,4],
    [5,6,7,8]
])

print(tf.reduce_mean(a))         # a 전체 평균
print(tf.reduce_mean(a, axis=0)) # 각 column에 대한 평균
print(tf.reduce_mean(a, axis=1)) # 각 row에 대한 평균

1.8.8 합계 함수

차원을 축소하며 합계를 계산 (평균과 동일하게 동작)

a = tf.constant([
    [1,2,3,4],
    [5,6,7,8]
])

print(tf.reduce_sum(a))         # a 전체 합계
print(tf.reduce_sum(a, axis=0)) # 각 column에 대한 합계
print(tf.reduce_sum(a, axis=1)) # 각 row에 대한 합계

1.8.9 최대 함수

max() : 원소의 최댓값 반환
argmax() : 최댓값의 index를 반환


a = tf.constant([
    [1,2,3,4],
    [5,6,7,8]
])

print(tf.reduce_max(a))         # a 전체 원소의 최댓값
print(tf.reduce_max(a, axis=0)) # 각 column에 대한 최댓값
print(tf.reduce_max(a, axis=1)) # 각 row에 대한 최댓값
print(tf.argmax(a, axis=0)) # 각 column에 대한 최댓값의 index
print(tf.argmax(a, axis=1)) # 각 row에 대한 최댓값의 index

차원 축소
- squeeze() : 크기가 1인 차원을 제거
차원 확장
- expand_dims() : 크기가 1인 차원을 추가
- 흔히 배치(batch) 차원을 추가하기 위한 목적으로 사용됨
- pytorch에서는 차원 축소 시, unsqueeze() 사용


a = tf.constant([
    [1,2,3,4],
    [5,6,7,8]
])
print('original a shape: ', a.shape)

a = tf.expand_dims(a, 0) # 첫번째 축에 차원 추가
print('add 0th dims: ', a.shape)

a = tf.expand_dims(a, 3) # 세번째 축에 차원 추가
print('add 3rd dims: ', a.shape)


print(tf.squeeze(a).shape)         # 크기가 1인 차원을 모두 제거 
print(tf.squeeze(a, axis=3).shape) # 세번째 차원을 제거


#tf.squeeze(a, axis=1) # 제거하려는 차원의 크기가 1이 아닐 경우 오류 발생

1.9 자동 미분과 기울기

기울기 테이프 (Gradient Tape)
중간 연산들을 테이프에 기록하고 역전파(back propagation)를 수행했을 때 기울기가 계산됨
TensorFlow에서는 변수가 아닌 상수에 대해 기본적으로 기울기를 측정하지 않음 (not watched). 또한 변수여도 학습 가능하지 않으면(not trainable) 자동 미분을 사용하지 않음



x = tf.Variable([3.0, 4.0])
y = tf.Variable([3.0, 4.0])

# with 구문 안에서 진행되는 모든 연산들을 기록
with tf.GradientTape() as tape:
  z = x + y
  loss = tf.math.reduce_mean(z)

dx = tape.gradient(loss, x) # loss가 scalar이므로 계산 가능
print(dx)

TensorFlow에서는 변수가 아닌 상수에 대해 기본적으로 기울기를 측정하지 않음 (not watched). 또한 변수여도 학습 가능하지 않으면(not trainable) 자동 미분을 사용하지 않음



x = tf.linspace(-10, 10, 100) # -10 ~ 10까지 100r개의 데이터 생성

with tf.GradientTape() as tape:
  tape.watch(x) # x에 대해 기울기를 측정할거니까 기록해줘
  y = tf.nn.sigmoid(x)

dx = tape.gradient(y, x)
print(dx)

import matplotlib.pyplot as plt

plt.plot(x, y, 'r', label="y")
plt.plot(x, dx, 'b--', label="dy/dx")
plt.legend()
plt.show()