day 18 linear and logistic regression

16 minute read

linear regression and logistic regression

regression analysis

Regression analysis is a statistical method for finding the relationship between variables. Linear regression rests on four assumptions:

  1. Linearity: The relationship between X and the mean of Y is linear.

  2. Homoscedasticity: The variance of residual is the same for any value of X.

  3. Independence: Observations are independent of each other.

  4. Normality: For any fixed value of X, Y is normally distributed.

Supervised learning consists of classification and regression problems.

  • classification: predict the class y of the data from a set of features x

  • regression: predict a related continuous value y from a set of features x

linear regression

linear model

linear regression is a model that assumes a linear relationship between the input variables (x) and the single output variable (y).

Depending on the number of independent variables, it can be a simple or a multiple linear regression.

\[y=βx+ϵ\]

β is the regression coefficient.

ϵ is the error term between the dependent variable and the value predicted from the independent variable.

These two are the model's parameters; finding a good linear regression model means finding the parameters that best explain the data.

linear regression in machine learning

\[H=Wx+b\]

H: hypothesis

W: weight

b: bias

In machine learning, we often say that we use deep learning to find this regression equation; in other words, we find W and b from the given data.

important notes

Residuals are the differences between the actual values and the values predicted by the regression model.

\[\underset{W,b}{\operatorname{argmin}} \left(\sum_{i=1}^{n} \text{residual}_i^2\right)\]

For n data points, we find the W and b that minimize the sum of squared errors. In machine learning, this sum of squared errors is used as the loss function for estimating the regression coefficients.

We can use R-squared to check how well the model explains the data.
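As a minimal sketch (with made-up toy data, not from the post), the least-squares fit and R² described above can be computed directly with NumPy:

import numpy as np

# toy data: y is roughly 2*x + 1 plus noise (illustrative values only)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# least squares: find W, b minimizing the sum of squared residuals
W, b = np.polyfit(x, y, deg=1)        # degree-1 fit = simple linear regression
y_pred = W * x + b

residuals = y - y_pred
ss_res = np.sum(residuals ** 2)          # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)     # total sum of squares
r_squared = 1 - ss_res / ss_tot          # coefficient of determination

print("W={:.3f}, b={:.3f}, R^2={:.3f}".format(W, b, r_squared))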

preparing data

from sklearn.datasets import load_boston
from sklearn import model_selection
import pandas as pd
import numpy as np

# load the data
boston = load_boston()
data, price = boston['data'], boston['target']
x_train, x_test, y_train, y_test = model_selection.train_test_split(data, price, test_size=0.2)

df = pd.DataFrame(x_train, columns=boston['feature_names'])
print("boston dataset의 차원: ", data.shape)
print("price의 차원", price.shape)
print("boston train dataset의 차원: ", x_train.shape)
print("boston test dataset의 차원: ", x_test.shape)

print("Boston dataset의 예시")
df.head() # data 요약
boston dataset의 차원:  (506, 13)
price의 차원 (506,)
boston train dataset의 차원:  (404, 13)
boston test dataset의 차원:  (102, 13)
Boston dataset의 예시
CRIM ZN INDUS CHAS NOX RM AGE DIS RAD TAX PTRATIO B LSTAT
0 0.01301 35.0 1.52 0.0 0.442 7.241 49.3 7.0379 1.0 284.0 15.5 394.74 5.49
1 2.92400 0.0 19.58 0.0 0.605 6.101 93.0 2.2834 5.0 403.0 14.7 240.16 9.81
2 0.01096 55.0 2.25 0.0 0.389 6.453 31.9 7.3073 1.0 300.0 15.3 394.72 8.23
3 4.34879 0.0 18.10 0.0 0.580 6.167 84.0 3.0334 24.0 666.0 20.2 396.90 16.29
4 0.04462 25.0 4.86 0.0 0.426 6.619 70.4 5.4007 4.0 281.0 19.0 395.63 7.22
print(boston["DESCR"])
.. _boston_dataset:

Boston house prices dataset
---------------------------

**Data Set Characteristics:**  

    :Number of Instances: 506 

    :Number of Attributes: 13 numeric/categorical predictive. Median Value (attribute 14) is usually the target.

    :Attribute Information (in order):
        - CRIM     per capita crime rate by town
        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
        - INDUS    proportion of non-retail business acres per town
        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
        - NOX      nitric oxides concentration (parts per 10 million)
        - RM       average number of rooms per dwelling
        - AGE      proportion of owner-occupied units built prior to 1940
        - DIS      weighted distances to five Boston employment centres
        - RAD      index of accessibility to radial highways
        - TAX      full-value property-tax rate per $10,000
        - PTRATIO  pupil-teacher ratio by town
        - B        1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
        - LSTAT    % lower status of the population
        - MEDV     Median value of owner-occupied homes in $1000's

    :Missing Attribute Values: None

    :Creator: Harrison, D. and Rubinfeld, D.L.

This is a copy of UCI ML housing dataset.
https://archive.ics.uci.edu/ml/machine-learning-databases/housing/


This dataset was taken from the StatLib library which is maintained at Carnegie Mellon University.

The Boston house-price data of Harrison, D. and Rubinfeld, D.L. 'Hedonic
prices and the demand for clean air', J. Environ. Economics & Management,
vol.5, 81-102, 1978.   Used in Belsley, Kuh & Welsch, 'Regression diagnostics
...', Wiley, 1980.   N.B. Various transformations are used in the table on
pages 244-261 of the latter.

The Boston house-price data has been used in many machine learning papers that address regression
problems.   
     
.. topic:: References

   - Belsley, Kuh & Welsch, 'Regression diagnostics: Identifying Influential Data and Sources of Collinearity', Wiley, 1980. 244-261.
   - Quinlan,R. (1993). Combining Instance-Based and Model-Based Learning. In Proceedings on the Tenth International Conference of Machine Learning, 236-243, University of Massachusetts, Amherst. Morgan Kaufmann.
# Example: apply linear regression to each attribute of the Boston dataset
import pandas as pd
from sklearn import datasets
from sklearn import model_selection
from sklearn.linear_model import LinearRegression
from sklearn import metrics
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(10,35))
fig.suptitle('Boston dataset - (X:Y = each attr: price) with R2', fontsize=16, y=0.9)

for i in range(data.shape[1]): # look at the i-th attribute (column) of the boston dataset

    single_attr, attr_name = data[:, i].reshape(-1, 1), boston['feature_names'][i] # data and name of the i-th attribute
    estimator = LinearRegression() # linear regression model

    # fit with x = single_attr and y = price; the model finds W and b internally via least squares
    estimator.fit(single_attr, price)

    # predicted Y values when X is plugged into the regression model fitted above
    pred_price = estimator.predict(single_attr)

    score = metrics.r2_score(price, pred_price) # compute the coefficient of determination (R^2)

    # create the subplot
    ax = fig.add_subplot(7, 2, i+1)
    ax.scatter(single_attr, price) # scatter plot of the actual data
    ax.plot(single_attr, pred_price, color='red') # regression line of the fitted model
    ax.set_title("{} x price, R2 score={:.3f}".format(attr_name, score)) # subplot title
    ax.set_xlabel(attr_name) # x-axis label
    ax.set_ylabel('price') # y-axis label

[Figure: scatter plots of each Boston attribute vs. price with the fitted regression line and R² score]

From the results, we can see that LSTAT (R²=0.544) and RM (R²=0.484) have the highest R² values and explain the data best.

Gradient Descent Algorithm

Choosing the right loss function is important for calculating the correct regression coefficients. However, the loss function cannot always be minimized directly in closed form, so in machine learning we find its minimum iteratively by following the gradient.


\[W := W - \alpha \frac{\partial}{\partial W} J(W)\]

α is the learning rate. A larger value takes bigger steps along the curve and converges faster, but it can overshoot the optimal minimum point.
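Below is a minimal sketch of this update rule for simple linear regression with a mean-squared-error loss J(W, b); the toy data and learning rate are illustrative, not from the post:

import numpy as np

# illustrative toy data
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

W, b = 0.0, 0.0      # initial parameters
alpha = 0.01         # learning rate

for step in range(1000):
    y_pred = W * x + b
    error = y_pred - y
    # gradients of the mean squared error loss J(W, b)
    dW = 2 * np.mean(error * x)
    db = 2 * np.mean(error)
    # gradient descent update: W := W - alpha * dJ/dW
    W -= alpha * dW
    b -= alpha * db

print("W={:.3f}, b={:.3f}".format(W, b))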

Logistic Regression

Logistic regression calculates the probability, between 0 and 1, that a data point belongs to a category. It is a supervised classification algorithm. It is similar to linear regression, but the output y is limited to values between 0 and 1, and it is most often used for binary classification problems.

\[P(y=0 \mid x) = \frac{1}{1 + \exp(-z)}\]


This is the sigmoid function, which is used heavily in deep learning. Applying the log-odds (logit) transformation maps the linear output into a value between 0 and 1. This strengthens the model's classification because it shrinks the ambiguous interval (e.g. 0.3 < p < 0.7) where the class boundary is unclear.

  1. Plug in the actual data to compute the odds and the regression coefficients.

  2. Compute the log-odds, feed it into the sigmoid function, and obtain the probability that the sample belongs to a particular category (a minimal sketch follows this list).

  3. Using a chosen threshold, perform binary classification: output 1 if the probability is at or above the threshold, 0 otherwise.
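Here is a minimal sketch of steps 2 and 3, assuming the regression coefficients W and b have already been estimated; the values below are made up for illustration:

import numpy as np

# assume these coefficients were already fitted (illustrative values)
W = np.array([0.8, -1.5])
b = 0.3

x = np.array([1.2, 0.4])              # a single sample with two features

log_odds = np.dot(W, x) + b           # step 2: compute the log-odds z = Wx + b
prob = 1 / (1 + np.exp(-log_odds))    # sigmoid: probability of belonging to the category

threshold = 0.5                       # step 3: binary decision against a threshold
label = 1 if prob >= threshold else 0
print("log-odds={:.3f}, p={:.3f}, label={}".format(log_odds, prob, label))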

# Logistic regression example: breast cancer dataset
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
import pandas as pd
# load the data
cancer=load_breast_cancer()

# y = 0 (malignant tumor), y = 1 (benign tumor)
cancer_X, cancer_y= cancer.data, cancer['target']
train_X, test_X, train_y, test_y = train_test_split(cancer_X, cancer_y, test_size=0.1, random_state=10) # split into train and test datasets
print("Total number of patients: {}".format(len(cancer_X)))
print("Number of attributes: {}".format(len(cancer_X[0])))
print("Number of patients in the train dataset: {}".format(len(train_X)))
print("Number of patients in the test dataset: {}".format(len(test_X)))
cancer_df = pd.DataFrame(cancer_X, columns=cancer['feature_names'])
cancer_df.head()
Total number of patients: 569
Number of attributes: 30
Number of patients in the train dataset: 512
Number of patients in the test dataset: 57
mean radius mean texture mean perimeter mean area mean smoothness mean compactness mean concavity mean concave points mean symmetry mean fractal dimension ... worst radius worst texture worst perimeter worst area worst smoothness worst compactness worst concavity worst concave points worst symmetry worst fractal dimension
0 17.99 10.38 122.80 1001.0 0.11840 0.27760 0.3001 0.14710 0.2419 0.07871 ... 25.38 17.33 184.60 2019.0 0.1622 0.6656 0.7119 0.2654 0.4601 0.11890
1 20.57 17.77 132.90 1326.0 0.08474 0.07864 0.0869 0.07017 0.1812 0.05667 ... 24.99 23.41 158.80 1956.0 0.1238 0.1866 0.2416 0.1860 0.2750 0.08902
2 19.69 21.25 130.00 1203.0 0.10960 0.15990 0.1974 0.12790 0.2069 0.05999 ... 23.57 25.53 152.50 1709.0 0.1444 0.4245 0.4504 0.2430 0.3613 0.08758
3 11.42 20.38 77.58 386.1 0.14250 0.28390 0.2414 0.10520 0.2597 0.09744 ... 14.91 26.50 98.87 567.7 0.2098 0.8663 0.6869 0.2575 0.6638 0.17300
4 20.29 14.34 135.10 1297.0 0.10030 0.13280 0.1980 0.10430 0.1809 0.05883 ... 22.54 16.67 152.20 1575.0 0.1374 0.2050 0.4000 0.1625 0.2364 0.07678

5 rows × 30 columns

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

LR = LogisticRegression() # logistic regression model
LR.fit(train_X, train_y) # train the logistic regression model on the breast cancer train data
pred = LR.predict(test_X) # predictions from the trained logistic regression model


# summary table comparing the model's predictions with the actual values
print(classification_report(test_y, pred))
              precision    recall  f1-score   support

           0       0.91      0.95      0.93        21
           1       0.97      0.94      0.96        36

    accuracy                           0.95        57
   macro avg       0.94      0.95      0.94        57
weighted avg       0.95      0.95      0.95        57



C:\Users\jwl23\anaconda3\lib\site-packages\sklearn\linear_model\_logistic.py:762: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
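As the warning itself suggests, this can usually be addressed by scaling the features or increasing max_iter. A hedged sketch (not in the original post) reusing the same train/test variables:

from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# standardize the features so lbfgs converges within the default iteration limit;
# alternatively, LogisticRegression(max_iter=10000) also avoids the warning
scaled_LR = make_pipeline(StandardScaler(), LogisticRegression())
scaled_LR.fit(train_X, train_y)
print(scaled_LR.score(test_X, test_y))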

Softmax function

Softmax is used for multinomial logistic regression. We extend the two-class sigmoid to a softmax function, so the model classifies not just two but multiple categories, which makes it useful for multi-class classification.

\[\text{softmax}(x)_i = \frac{e^{x_i}}{\sum_{j=0}^{k} e^{x_j}} \quad (i = 0, 1, \ldots, k)\]
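A minimal NumPy sketch of the softmax formula above; the logits are illustrative:

import numpy as np

def softmax(x):
    # subtract the max for numerical stability; this does not change the result
    e = np.exp(x - np.max(x))
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])    # illustrative log-odds for 3 classes
print(softmax(logits))                # probabilities that sum to 1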

Cross Entropy

Cross entropy is used as the loss function together with the softmax function.

\[H(p,q) = -\sum_{x} p(x) \log q(x)\]

p(x) is the distribution of the real data (the true labels), and q(x) is the predicted softmax distribution.

The closer the predicted distribution q(x) is to the real distribution p(x), the smaller the cross entropy.
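A minimal sketch of H(p, q) with a one-hot p(x) and softmax-style q(x); the numbers are illustrative:

import numpy as np

p = np.array([0.0, 1.0, 0.0])           # true distribution (one-hot label)
q = np.array([0.1, 0.8, 0.1])           # predicted softmax probabilities

cross_entropy = -np.sum(p * np.log(q))  # H(p, q) = -sum p(x) log q(x)
print(cross_entropy)                    # small because q is close to p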

import tensorflow.compat.v1 as tf
import numpy as np
tf.compat.v1.disable_eager_execution()

x = tf.placeholder(tf.float32, [None, 30]) # breast cancer dataset has 30 attributes
y = tf.placeholder(tf.float32, [None, 2]) # malignant/benign labels will be one-hot encoded into 2 dimensions
num_classes = 2 # malignant, benign
lr = 0.0003 # learning rate

W = tf.Variable(tf.random_normal([30, num_classes]), name='weight') # weights
b = tf.Variable(tf.random_normal([num_classes]), name='bias') # bias

logits = tf.matmul(x, W) + b # values corresponding to the log-odds
hypothesis = tf.nn.softmax(logits) # softmax applied to the log-odds

loss_i = tf.nn.softmax_cross_entropy_with_logits(
    logits=logits, labels=y) # p(x)*log(q(x)) term for each sample
loss = tf.reduce_mean(loss_i) # final loss function: mean over all samples

# use the gradient descent method as the optimizer
optimizer = tf.train.GradientDescentOptimizer(learning_rate=lr).minimize(loss)

pred = tf.argmax(hypothesis, 1) # map the larger hypothesis value to 1 and the smaller to 0
correct_pred = tf.equal(pred, tf.argmax(y, 1)) # True if the prediction matches the ground truth, False otherwise
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32)) # True becomes 1 and False 0, so the mean is the accuracy


with tf.Session() as sess:
    sess.run(tf.global_variables_initializer()) # initialize TensorFlow's global variables
    one_hot_train_y = np.eye(num_classes)[train_y] # one-hot encoded train labels
    one_hot_test_y = np.eye(num_classes)[test_y] # one-hot encoded test labels
    
    # iterate over the full train data 1000 times
    for step in range(1000):
        sess.run(optimizer, feed_dict={x: train_X, y: one_hot_train_y}) # train on the train data
        test_loss, test_acc= sess.run([loss, accuracy], feed_dict={x: test_X, y: one_hot_test_y}) # test loss and accuracy measured with the current model
        if step % 10 == 0: # print every 10th step
            print("Step {}, loss: {:.3f} accuracy: {:.3f}".format(
            step, 
            test_loss,
            test_acc
            ))
WARNING:tensorflow:From C:\Users\jwl23\anaconda3\lib\site-packages\tensorflow\python\util\dispatch.py:206: softmax_cross_entropy_with_logits (from tensorflow.python.ops.nn_ops) is deprecated and will be removed in a future version.
Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See `tf.nn.softmax_cross_entropy_with_logits_v2`.

Step 0, loss: 112.585 accuracy: 0.632
Step 10, loss: 30.169 accuracy: 0.649
Step 20, loss: 21.461 accuracy: 0.702
Step 30, loss: 25.634 accuracy: 0.719
Step 40, loss: 26.652 accuracy: 0.719
Step 50, loss: 27.885 accuracy: 0.719
Step 60, loss: 29.207 accuracy: 0.719
Step 70, loss: 30.838 accuracy: 0.719
Step 80, loss: 30.247 accuracy: 0.719
Step 90, loss: 30.529 accuracy: 0.719
Step 100, loss: 26.927 accuracy: 0.789
Step 110, loss: 17.670 accuracy: 0.807
Step 120, loss: 9.863 accuracy: 0.825
Step 130, loss: 9.612 accuracy: 0.825
Step 140, loss: 6.142 accuracy: 0.877
Step 150, loss: 4.726 accuracy: 0.912
Step 160, loss: 4.653 accuracy: 0.912
Step 170, loss: 4.576 accuracy: 0.912
Step 180, loss: 4.507 accuracy: 0.912
Step 190, loss: 4.450 accuracy: 0.912
Step 200, loss: 4.505 accuracy: 0.912
Step 210, loss: 4.485 accuracy: 0.895
Step 220, loss: 4.423 accuracy: 0.895
Step 230, loss: 4.378 accuracy: 0.895
Step 240, loss: 4.340 accuracy: 0.895
Step 250, loss: 4.308 accuracy: 0.895
Step 260, loss: 4.285 accuracy: 0.895
Step 270, loss: 4.286 accuracy: 0.895
Step 280, loss: 4.379 accuracy: 0.895
Step 290, loss: 4.401 accuracy: 0.895
Step 300, loss: 4.382 accuracy: 0.895
Step 310, loss: 4.366 accuracy: 0.895
Step 320, loss: 4.347 accuracy: 0.895
Step 330, loss: 4.335 accuracy: 0.895
Step 340, loss: 4.350 accuracy: 0.895
Step 350, loss: 4.353 accuracy: 0.895
Step 360, loss: 4.324 accuracy: 0.895
Step 370, loss: 4.297 accuracy: 0.895
Step 380, loss: 4.284 accuracy: 0.895
Step 390, loss: 4.376 accuracy: 0.895
Step 400, loss: 29.588 accuracy: 0.667
Step 410, loss: 28.577 accuracy: 0.684
Step 420, loss: 27.005 accuracy: 0.702
Step 430, loss: 7.847 accuracy: 0.860
Step 440, loss: 4.099 accuracy: 0.877
Step 450, loss: 4.077 accuracy: 0.877
Step 460, loss: 4.050 accuracy: 0.877
Step 470, loss: 4.039 accuracy: 0.877
Step 480, loss: 4.029 accuracy: 0.877
Step 490, loss: 4.018 accuracy: 0.877
Step 500, loss: 4.006 accuracy: 0.877
Step 510, loss: 3.994 accuracy: 0.877
Step 520, loss: 3.983 accuracy: 0.877
Step 530, loss: 3.974 accuracy: 0.877
Step 540, loss: 3.971 accuracy: 0.877
Step 550, loss: 3.969 accuracy: 0.877
Step 560, loss: 3.959 accuracy: 0.877
Step 570, loss: 3.947 accuracy: 0.877
Step 580, loss: 3.934 accuracy: 0.877
Step 590, loss: 3.922 accuracy: 0.877
Step 600, loss: 3.910 accuracy: 0.877
Step 610, loss: 3.898 accuracy: 0.877
Step 620, loss: 3.888 accuracy: 0.895
Step 630, loss: 3.880 accuracy: 0.895
Step 640, loss: 3.873 accuracy: 0.895
Step 650, loss: 3.865 accuracy: 0.895
Step 660, loss: 3.856 accuracy: 0.895
Step 670, loss: 3.847 accuracy: 0.895
Step 680, loss: 3.845 accuracy: 0.877
Step 690, loss: 3.848 accuracy: 0.877
Step 700, loss: 3.855 accuracy: 0.877
Step 710, loss: 4.168 accuracy: 0.895
Step 720, loss: 32.773 accuracy: 0.667
Step 730, loss: 28.188 accuracy: 0.684
Step 740, loss: 24.791 accuracy: 0.702
Step 750, loss: 4.705 accuracy: 0.895
Step 760, loss: 4.229 accuracy: 0.895
Step 770, loss: 4.219 accuracy: 0.895
Step 780, loss: 4.212 accuracy: 0.895
Step 790, loss: 4.205 accuracy: 0.895
Step 800, loss: 4.198 accuracy: 0.895
Step 810, loss: 4.191 accuracy: 0.895
Step 820, loss: 4.185 accuracy: 0.895
Step 830, loss: 4.179 accuracy: 0.895
Step 840, loss: 4.173 accuracy: 0.912
Step 850, loss: 4.167 accuracy: 0.912
Step 860, loss: 4.167 accuracy: 0.912
Step 870, loss: 4.163 accuracy: 0.912
Step 880, loss: 4.160 accuracy: 0.912
Step 890, loss: 4.156 accuracy: 0.912
Step 900, loss: 4.153 accuracy: 0.912
Step 910, loss: 4.150 accuracy: 0.912
Step 920, loss: 4.147 accuracy: 0.912
Step 930, loss: 4.144 accuracy: 0.912
Step 940, loss: 4.142 accuracy: 0.912
Step 950, loss: 4.141 accuracy: 0.912
Step 960, loss: 4.141 accuracy: 0.912
Step 970, loss: 4.145 accuracy: 0.895
Step 980, loss: 4.152 accuracy: 0.895
Step 990, loss: 4.170 accuracy: 0.895

We can see that as the steps increase, the loss decreases and the accuracy increases.

Below is the tensorflow v2 code.

import tensorflow as tf
from tensorflow import keras

n_dense=30
n_train_epoch=20
num_classes = 2 # malignant, benign

model=keras.models.Sequential()
model.add(keras.layers.Dense(num_classes, use_bias=True, activation='softmax', input_shape=(30,)))

model.summary()
model.compile(optimizer='adam',
             loss='sparse_categorical_crossentropy',
             metrics=['accuracy'])

# train the model
model.fit(train_X, train_y, epochs=n_train_epoch)

# evaluate the model
test_loss, test_accuracy = model.evaluate(test_X, test_y, verbose=1)
print("test_loss: {} ".format(test_loss))
print("test_accuracy: {}".format(test_accuracy))
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 2)                 62        
=================================================================
Total params: 62
Trainable params: 62
Non-trainable params: 0
_________________________________________________________________
Train on 512 samples
Epoch 1/20
512/512 [==============================] - 0s 84us/sample - loss: 235.4257 - accuracy: 0.6270
Epoch 2/20
512/512 [==============================] - 0s 21us/sample - loss: 202.2681 - accuracy: 0.6270
Epoch 3/20
512/512 [==============================] - 0s 21us/sample - loss: 169.0789 - accuracy: 0.6270
Epoch 4/20
512/512 [==============================] - 0s 21us/sample - loss: 136.4722 - accuracy: 0.6270
Epoch 5/20
512/512 [==============================] - 0s 18us/sample - loss: 101.9908 - accuracy: 0.6270
Epoch 6/20
512/512 [==============================] - 0s 20us/sample - loss: 69.4104 - accuracy: 0.6309
Epoch 7/20
512/512 [==============================] - 0s 21us/sample - loss: 37.6982 - accuracy: 0.5840
Epoch 8/20
512/512 [==============================] - 0s 20us/sample - loss: 22.2549 - accuracy: 0.2129
Epoch 9/20
512/512 [==============================] - 0s 21us/sample - loss: 20.7897 - accuracy: 0.1875
Epoch 10/20
512/512 [==============================] - 0s 23us/sample - loss: 19.4167 - accuracy: 0.1816
Epoch 11/20
512/512 [==============================] - 0s 23us/sample - loss: 18.4999 - accuracy: 0.1914
Epoch 12/20
512/512 [==============================] - 0s 27us/sample - loss: 17.5924 - accuracy: 0.1992
Epoch 13/20
512/512 [==============================] - 0s 18us/sample - loss: 16.6883 - accuracy: 0.2031
Epoch 14/20
512/512 [==============================] - 0s 21us/sample - loss: 15.7293 - accuracy: 0.2500
Epoch 15/20
512/512 [==============================] - 0s 23us/sample - loss: 14.9052 - accuracy: 0.2637
Epoch 16/20
512/512 [==============================] - 0s 21us/sample - loss: 13.9781 - accuracy: 0.2812
Epoch 17/20
512/512 [==============================] - 0s 27us/sample - loss: 13.2065 - accuracy: 0.3164
Epoch 18/20
512/512 [==============================] - 0s 18us/sample - loss: 12.4410 - accuracy: 0.3438
Epoch 19/20
512/512 [==============================] - 0s 25us/sample - loss: 11.7536 - accuracy: 0.3379
Epoch 20/20
512/512 [==============================] - 0s 25us/sample - loss: 11.1255 - accuracy: 0.3867
test_loss: 13.462975719518829 
test_accuracy: 0.3333333432674408


C:\Users\jwl23\anaconda3\lib\site-packages\tensorflow\python\keras\engine\training.py:2426: UserWarning: `Model.state_updates` will be removed in a future version. This property should not be used in TensorFlow 2.0, as `updates` are applied automatically.
  warnings.warn('`Model.state_updates` will be removed in a future version. '

This looks similar to the code used for deep learning classification.

Below is the deep learning model.

import tensorflow as tf
from tensorflow import keras

n_dense=30
n_train_epoch=20
num_classes = 2 # malignant, benign

model=keras.models.Sequential()

# add three dense layers
model.add(keras.layers.Dense(n_dense, input_shape=(30,), use_bias=True))
model.add(keras.layers.Dense(n_dense,  use_bias=True))
model.add(keras.layers.Dense(n_dense,  use_bias=True))

model.add(keras.layers.Dense(num_classes, use_bias=True, activation='softmax'))

model.summary()
model.compile(optimizer='adam',
             loss='sparse_categorical_crossentropy',
             metrics=['accuracy'])

# train the model
model.fit(train_X, train_y, epochs=n_train_epoch)

# evaluate the model
test_loss, test_accuracy = model.evaluate(test_X, test_y, verbose=1)
print("test_loss: {} ".format(test_loss))
print("test_accuracy: {}".format(test_accuracy))
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 30)                930       
_________________________________________________________________
dense_2 (Dense)              (None, 30)                930       
_________________________________________________________________
dense_3 (Dense)              (None, 30)                930       
_________________________________________________________________
dense_4 (Dense)              (None, 2)                 62        
=================================================================
Total params: 2,852
Trainable params: 2,852
Non-trainable params: 0
_________________________________________________________________
Train on 512 samples
Epoch 1/20
512/512 [==============================] - 0s 115us/sample - loss: 76.6753 - accuracy: 0.4805
Epoch 2/20
512/512 [==============================] - 0s 35us/sample - loss: 11.5238 - accuracy: 0.6211
Epoch 3/20
512/512 [==============================] - 0s 29us/sample - loss: 2.6161 - accuracy: 0.7773
Epoch 4/20
512/512 [==============================] - 0s 29us/sample - loss: 1.0152 - accuracy: 0.8789
Epoch 5/20
512/512 [==============================] - 0s 29us/sample - loss: 0.8146 - accuracy: 0.9023
Epoch 6/20
512/512 [==============================] - 0s 27us/sample - loss: 1.0132 - accuracy: 0.8711
Epoch 7/20
512/512 [==============================] - 0s 33us/sample - loss: 0.8232 - accuracy: 0.9004
Epoch 8/20
512/512 [==============================] - 0s 31us/sample - loss: 0.7393 - accuracy: 0.9004
Epoch 9/20
512/512 [==============================] - 0s 29us/sample - loss: 0.7088 - accuracy: 0.8965
Epoch 10/20
512/512 [==============================] - 0s 37us/sample - loss: 0.7585 - accuracy: 0.9004
Epoch 11/20
512/512 [==============================] - 0s 29us/sample - loss: 0.7788 - accuracy: 0.9160
Epoch 12/20
512/512 [==============================] - 0s 37us/sample - loss: 0.8663 - accuracy: 0.8809
Epoch 13/20
512/512 [==============================] - 0s 29us/sample - loss: 0.8939 - accuracy: 0.8809
Epoch 14/20
512/512 [==============================] - 0s 33us/sample - loss: 1.5881 - accuracy: 0.8281
Epoch 15/20
512/512 [==============================] - 0s 23us/sample - loss: 1.1497 - accuracy: 0.8691
Epoch 16/20
512/512 [==============================] - 0s 33us/sample - loss: 0.5884 - accuracy: 0.9141
Epoch 17/20
512/512 [==============================] - 0s 27us/sample - loss: 0.6733 - accuracy: 0.9141
Epoch 18/20
512/512 [==============================] - 0s 31us/sample - loss: 0.5782 - accuracy: 0.9141
Epoch 19/20
512/512 [==============================] - 0s 31us/sample - loss: 0.4742 - accuracy: 0.9199
Epoch 20/20
512/512 [==============================] - 0s 35us/sample - loss: 0.6270 - accuracy: 0.9082
test_loss: 1.072830989695432 
test_accuracy: 0.859649121761322

Because the training set has only 512 samples, there is not much advantage in stacking multiple layers; logistic regression does a good enough job.

summary

[Summary figure]
