Hand-Writing a Neural Network (Without PyTorch Modules) for the MNIST Handwritten Digit Dataset

Tags: PyTorch, machine learning, deep learning, artificial intelligence, neural networks

Overview

This was a small assignment our teacher set in class; I'm sharing it here for the record. It was originally implemented with Paddle, and since, as everyone knows, Paddle's API closely mirrors PyTorch's, anyone who wants to run it with torch can simply swap the Paddle parts for torch. The slightly annoying part is that losses such as BCELoss, Softmax and the like are all built into torch, yet our teacher asked us to write them out ourselves, and the same goes for the backpropagation.

Approach

Softmax

For Softmax, the main thing to pay attention to is its backward pass.

 def value(self, x: np.ndarray) -> np.ndarray:
        n, k = x.shape
        beta = x.max(axis = 1).reshape((n, 1))
        tmp = np.exp(x - beta)
        numer = np.sum(tmp, axis = 1, keepdims = True)
        val = tmp / numer
        return val
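
For reference, each row is mapped by the numerically stabilized softmax

    \mathrm{softmax}(x)_j = \frac{e^{x_j - \beta}}{\sum_{l=1}^{k} e^{x_l - \beta}}, \qquad \beta = \max_{l} x_l

Subtracting the row maximum β changes nothing mathematically (the factor e^{-β} cancels between numerator and denominator) but keeps np.exp from overflowing.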

Softmax's Derivative

def derivative(self, x: np.ndarray) -> np.ndarray:
        n, k = x.shape
        D = np.zeros((k, k, n))
        for i in range(n):
            tmp = x[i:i+1, :]       # the i-th sample, kept as a (1, k) row
            val = self.value(tmp)   # softmax of this single row
            D[:,:,i] = np.diag(val.reshape(-1)) - val.T.dot(val)
        return D
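
For a single input row x with softmax output s, the Jacobian assembled in the loop is

    D_{jl} = \frac{\partial s_j}{\partial x_l} = s_j(\delta_{jl} - s_l), \qquad \text{i.e.}\; D = \mathrm{diag}(s) - s^{\top} s \;\text{for the row vector } s,

which is exactly the np.diag(val.reshape(-1)) - val.T.dot(val) term, stacked into a (k, k, n) array over the n samples.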

CrossEntropyLoss

    def value(self, yhat: np.ndarray, y: np.ndarray) -> float:
        yhat = np.clip(yhat, 0.0001, 0.9999) # clip, as in logistic regression, to avoid log(0)
        los = -np.mean(np.multiply(np.log(yhat), y) + np.multiply(np.log(1 - yhat), (1 - y)))
        return los
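
This is the element-wise (binary) cross entropy, averaged over all n·k entries of the one-hot matrix:

    L(\hat{y}, y) = -\frac{1}{nk}\sum_{i=1}^{n}\sum_{j=1}^{k}\Big[\, y_{ij}\log\hat{y}_{ij} + (1-y_{ij})\log(1-\hat{y}_{ij}) \,\Big]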

CrossEntropyLoss's Derivative

    def derivative(self, yhat: np.ndarray, y: np.ndarray) -> np.ndarray:
        der = (yhat - y) / (yhat * (1 - yhat))
        return der
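
Differentiating the bracketed summand element-wise gives

    \frac{\partial}{\partial \hat{y}_{ij}}\Big[-y_{ij}\log\hat{y}_{ij} - (1-y_{ij})\log(1-\hat{y}_{ij})\Big] = \frac{\hat{y}_{ij} - y_{ij}}{\hat{y}_{ij}\,(1-\hat{y}_{ij})}

Note that the code returns exactly this element-wise term; the 1/(nk) factor coming from the mean in value() is left out.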

CEwithLogitLoss

Unlike the plain cross entropy above, this loss does not take the softmax output yhat; it takes the raw logits (the input to the softmax) together with the labels, folding the softmax into the loss for numerical stability.

    def value(self, logits: np.ndarray, y: np.ndarray) -> float:
        n, k = y.shape
        beta = logits.max(axis = 1).reshape((n, 1))
        tmp = logits - beta
        tmp = np.exp(tmp)
        tmp = np.sum(tmp, axis = 1)
        tmp = np.log(tmp+1.0e-40)
        los = -np.sum(y*logits) + np.sum(beta) + np.sum(tmp)
        los = los / n
        return los
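
This is the softmax cross entropy rewritten with the log-sum-exp trick, so the softmax itself never has to be materialized. For one sample with one-hot label y and logits z,

    -\sum_j y_j \log\hat{y}_j = -\sum_j y_j z_j + \log\sum_l e^{z_l} = -\sum_j y_j z_j + \beta + \log\sum_l e^{z_l - \beta}, \qquad \beta = \max_l z_l,

using \sum_j y_j = 1; the code sums this over the n samples and divides by n.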

CEwithLogitLoss's Derivative

    def derivative(self, logits: np.ndarray, y: np.ndarray) -> np.ndarray:
        n, k = y.shape
        beta = logits.max(axis = 1).reshape((n, 1))
        tmp = logits - beta
        tmp = np.exp(tmp)
        numer = np.sum(tmp, axis = 1, keepdims = True)
        yhat = tmp/numer
        der = (yhat - y) / n
        return der
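
Differentiating the expression above with respect to the logits gives the familiar compact form (including the 1/n from the average over samples):

    \frac{\partial L}{\partial z_{ij}} = \frac{\hat{y}_{ij} - y_{ij}}{n}, \qquad \hat{y} = \mathrm{softmax}(z)

This is the main reason to fold softmax and cross entropy into one loss: backpropagation never needs the full softmax Jacobian.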

Once the pieces above are clear, the rest is straightforward; a quick numerical gradient check (sketched below) is an easy way to verify the hand-written derivatives.
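
As a sanity check, the analytic derivative can be compared against a finite-difference approximation. A minimal sketch, assuming the CEwithLogit class from the complete code below and a small random problem made up purely for illustration:

import numpy as np

# Hypothetical sanity check: compare CEwithLogit.derivative against a
# central finite-difference approximation of CEwithLogit.value.
np.random.seed(0)
n, k = 4, 3
logits = np.random.randn(n, k)
y = np.eye(k)[np.random.randint(0, k, size=n)]   # random one-hot labels

loss = CEwithLogit()
analytic = loss.derivative(logits, y)

eps = 1e-6
numeric = np.zeros_like(logits)
for i in range(n):
    for j in range(k):
        plus, minus = logits.copy(), logits.copy()
        plus[i, j] += eps
        minus[i, j] -= eps
        numeric[i, j] = (loss.value(plus, y) - loss.value(minus, y)) / (2 * eps)

print(np.max(np.abs(analytic - numeric)))   # should be tiny, on the order of 1e-9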

Complete Code

import numpy as np
import struct
import matplotlib.pyplot as plt
import os
from PIL import Image
from sklearn.utils import gen_batches
from sklearn.metrics import classification_report, confusion_matrix
from typing import *
from numpy.linalg import *


train_image_file = './data/data136845/train-images-idx3-ubyte'
train_label_file = './data/data136845/train-labels-idx1-ubyte'
test_image_file = './data/data136845/t10k-images-idx3-ubyte'
test_label_file = './data/data136845/t10k-labels-idx1-ubyte'


def decode_image(path):
    with open(path, 'rb') as f:
        magic, num, rows, cols = struct.unpack('>IIII', f.read(16))
        images = np.fromfile(f, dtype=np.uint8).reshape(-1, 784)
        images = np.array(images, dtype = float)
    return images

def decode_label(path):
    with open(path, 'rb') as f:
        magic, n = struct.unpack('>II',f.read(8))
        labels = np.fromfile(f, dtype=np.uint8)
        labels = np.array(labels, dtype = float)
    return labels

def load_data():
    train_X = decode_image(train_image_file)
    train_Y = decode_label(train_label_file)
    test_X = decode_image(test_image_file)
    test_Y = decode_label(test_label_file)
    return (train_X, train_Y, test_X, test_Y)
trainX, trainY, testX, testY = load_data()

num_train, num_feature = trainX.shape
plt.figure(1, figsize=(20,10))
for i in range(8):
    idx = np.random.choice(range(num_train))
    plt.subplot(2, 4, i+1)
    plt.imshow(trainX[idx,:].reshape((28,28)))
    plt.title('label is %d'%trainY[idx])
plt.show()

# normalize input value between 0 and 1.
trainX, testX = trainX/255, testX/255

# convert all scaler labels to one-hot vectors.
def to_onehot(y):
    y = y.astype(int)
    num_class = len(set(y))
    Y = np.eye((num_class))
    return Y[y]

trainY = to_onehot(trainY)
testY = to_onehot(testY)
num_train, num_feature = trainX.shape
num_test, _ = testX.shape
_, num_class = trainY.shape
print('number of features is %d'%num_feature)
print('number of classes is %d'%num_class)
print('number of training samples is %d'%num_train)
print('number of testing samples is %d'%num_test)
from abc import ABC, abstractmethod, abstractproperty

class Activation(ABC):
    '''
    An abstract class that implements an activation function
    '''
    @abstractmethod
    def value(self, x: np.ndarray) -> np.ndarray:
        '''
        Value of the activation function when input is x.
        Parameters:
          x is an input to the activation function.
        Returns: 
          Value of the activation function. The shape of the return is the same as that of x.
        '''
        return x
    @abstractmethod
    def derivative(self, x: np.ndarray) -> np.ndarray:
        '''
        Derivative of the activation function with input x.
        Parameters:
          x is the input to activation function
        Returns: 
          Derivative of the activation function w.r.t x.
        '''
        return x

class Identity(Activation):
    '''
    Identity activation function. Input and output are identical. 
    '''

    def __init__(self):
        super(Identity, self).__init__()

    def value(self, x: np.ndarray) -> np.ndarray:
        return x
    
    def derivative(self, x: np.ndarray) -> np.ndarray:
        n, m = x.shape
        return np.ones((n, m))
    

class Sigmoid(Activation):
    '''
    Sigmoid activation function y = 1/(1 + e^(-k*x)), where k is the parameter of the sigmoid function
    '''

    def __init__(self, k: float = 1.):
        '''
        Parameters:
          k is the parameter of the sigmoid function.
        '''
        self.k = k
        super(Sigmoid, self).__init__()

    def value(self, x: np.ndarray) -> np.ndarray:
        '''
        Parameters:
          x is a two dimensional numpy array.
        Returns:
          element-wise sigmoid value of the two dimensional array.
        '''
        '''
        #### YOUR CODE BELOW ####
        '''
        val = 1 / (1 + np.exp(np.negative(x * self.k)))
        return val

    def derivative(self, x: np.ndarray) -> np.ndarray:
        '''
        Parameters:
          x is a two dimensional array.
        Returns:
          a two dimensional array whose shape is the same as that of x. The returned value is the elementwise 
          derivative of the sigmoid function w.r.t. x.
        '''
        '''
        #### YOUR CODE BELOW ####
        '''
        val = 1 / (1 + np.exp(np.negative(x * self.k)))
        der = val * (1 - val)
        return der
    
class ReLU(Activation):
    '''
    Rectified linear unit activation function
    '''

    def __init__(self):
        super(ReLU, self).__init__()

    def value(self, x: np.ndarray) -> np.ndarray:
        '''
        #### YOUR CODE BELOW ####
        '''
        val = x*(x>=0)
        return val

    def derivative(self, x: np.ndarray) -> np.ndarray:
        '''
        The derivative of the ReLU function w.r.t. x. Set the derivative to 0 at x=0.
        Parameters:
          x is the input to ReLU function
        Returns:
          elementwise derivative of ReLU. The shape of the returned value is the same as that of x.
        '''
        '''
        #### YOUR CODE BELOW ####
        '''
        der = np.ones(x.shape)*(x > 0)  # derivative taken as 0 at x = 0, as stated above
        return der


class Softmax(Activation):
    '''
    softmax nonlinear function.
    '''

    def __init__(self):
        '''
        There are no parameters in softmax function.
        '''
        super(Softmax, self).__init__()

    def value(self, x: np.ndarray) -> np.ndarray:
        '''
        Parameters:
          x is the input to the softmax function. x is a two dimensional numpy array. Each row is the input to the softmax function
        Returns:
          output of the softmax function. The returned value is with the same shape as that of x.
        '''
        '''
        #### YOUR CODE BELOW ####
        '''
        n, k = x.shape
        beta = x.max(axis = 1).reshape((n, 1))
        tmp = np.exp(x - beta)
        numer = np.sum(tmp, axis = 1, keepdims = True)
        val = tmp / numer
        return val

    def derivative(self, x: np.ndarray) -> np.ndarray:
        '''
        Parameters:
          x is the input to the softmax function. x is a two dimensional numpy array.
        Returns:
          a two dimensional array representing the derivative of softmax function w.r.t. x.
        '''
        n, k = x.shape
        D = np.zeros((k, k, n))
        for i in range(n):
            tmp = x[i:i+1, :]       # the i-th sample, kept as a (1, k) row
            val = self.value(tmp)   # softmax of this single row
            D[:,:,i] = np.diag(val.reshape(-1)) - val.T.dot(val)
        return D

##################################################################################################################
# LOSS FUNCTIONS
##################################################################################################################

class Loss(ABC):
    '''
    Abstract class for a loss function
    '''
    @abstractmethod
    def value(self, yhat: np.ndarray, y: np.ndarray) -> float:
        '''
        Value of the empirical loss function.
        Parameters:
          y_hat is the output of a neural network. The shape of y_hat is (n, k).
          y contains true labels with shape (n, k).
        Returns:
          value of the empirical loss function.
        '''
        return 0

    @abstractmethod
    def derivative(self, yhat: np.ndarray, y: np.ndarray) -> np.ndarray:
        '''
        Derivative of the empirical loss function with respect to the predictions.
        Parameters:
          yhat is the output of a neural network with shape (n, k).
          y contains the true labels with shape (n, k).
        Returns:
          The derivative of the loss function w.r.t. y_hat. The returned value is a two dimensional array with 
          shape (n, k)
        '''
        return yhat

class CrossEntropy(Loss):
    '''
    Cross entropy loss function
    '''

    def value(self, yhat: np.ndarray, y: np.ndarray) -> float:
        '''
        #### YOUR CODE BELOW ####
        '''
        yhat = np.clip(yhat, 0.0001, 0.9999)  # clip, as in logistic regression, to avoid log(0)
        los = -np.mean(np.multiply(np.log(yhat), y) + np.multiply(np.log(1 - yhat), (1 - y)))
        return los

    def derivative(self, yhat: np.ndarray, y: np.ndarray) -> np.ndarray:
        '''
        #### YOUR CODE HERE ####
        '''

        der = (yhat - y) / (yhat * (1 - yhat))
        return der

class CEwithLogit(Loss):
    '''
    Cross entropy loss function with logits (input of softmax activation function) and true labels as inputs.
    '''
    def value(self, logits: np.ndarray, y: np.ndarray) -> float:
        '''
        #### YOUR CODE BELOW ####
        '''
        n, k = y.shape
        beta = logits.max(axis = 1).reshape((n, 1))
        tmp = logits - beta
        tmp = np.exp(tmp)
        tmp = np.sum(tmp, axis = 1)
        tmp = np.log(tmp+1.0e-40)
        los = -np.sum(y*logits) + np.sum(beta) + np.sum(tmp)
        los = los / n
        return los


    def derivative(self, logits: np.ndarray, y: np.ndarray) -> np.ndarray:
        '''
        #### YOUR CODE BELOW ####
        '''
        n, k = y.shape
        beta = logits.max(axis = 1).reshape((n, 1))
        tmp = logits - beta
        tmp = np.exp(tmp)
        numer = np.sum(tmp, axis = 1, keepdims = True)
        yhat = tmp/numer
        der = (yhat - y) / n
        return der
##################################################################################################################
# METRICS
##################################################################################################################

def accuracy(y_hat: np.ndarray, y: np.ndarray) -> float:
    '''
    Accuracy of predictions, given the true labels.
    Parameters:
      y_hat is a two dimensional array. Each row is a softmax output.
      y is a two dimensional array. Each row is a one-hot vector.
    Returns:
      accuracy which is a float number
    '''
    '''
    #### YOUR CODE HERE ####
    '''
    n = y.shape[0]
    acc = np.sum(np.argmax(y_hat, axis = 1) == np.argmax(y, axis = 1)) / n
    return acc
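
# Tiny usage example of accuracy() with hypothetical softmax outputs and one-hot labels,
# just to illustrate the row-wise argmax comparison:
print(accuracy(np.array([[0.1, 0.9], [0.8, 0.2]]),
               np.array([[0.0, 1.0], [1.0, 0.0]])))   # prints 1.0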

# the following code implements a three layer neural network, namely input layer, hidden layer and output layer
digits = 10 # number of classes
_, n_x = trainX.shape
n_h = 64 # number of nodes in the hidden layer
learning_rate = 0.0001

sigmoid = ReLU() # activation function in the hidden layer (a ReLU, despite the variable name)
softmax = Softmax() # nonlinear function in the output layer
loss = CEwithLogit() # loss function
epoches = 2000 

# initialization of W1, b1, W2, b2
W1 = np.random.randn(n_x, n_h)
b1 = np.random.randn(1, n_h)
W2 = np.random.randn(n_h, digits)
b2 = np.random.randn(1, digits)

# training procedure
for epoch in range(epoches):
    Z1 = np.dot(trainX, W1) + b1
    A1 = sigmoid.value(Z1)
    Z2 = np.dot(A1, W2) + b2
    A2 = softmax.value(Z2)
    cost = loss.value(Z2, trainY)

    dZ2 = loss.derivative(Z2, trainY)
    dW2 = np.dot(A1.T, dZ2)
    db2 = np.sum(dZ2, axis = 0, keepdims=True)

    dA1 = np.dot(dZ2, W2.T)
    dZ1 = dA1 * sigmoid.derivative(Z1) 
    dW1 = np.dot(trainX.T, dZ1)
    db1 = np.sum(dZ1, axis = 0, keepdims=True)

    W2 = W2 - learning_rate * dW2
    b2 = b2 - learning_rate * db2
    W1 = W1 - learning_rate * dW1
    b1 = b1 - learning_rate * db1


    if (epoch % 100 == 0):
        print("Epoch", epoch, "cost: ", cost)

print("Final cost:", cost)

# testing procedure
Z1 = np.dot(testX, W1) + b1
A1 = sigmoid.value(Z1)
Z2 = np.dot(A1, W2) + b2
A2 = softmax.value(Z2)

predictions = np.argmax(A2, axis = 1)
labels = np.argmax(testY, axis = 1)

print(confusion_matrix(predictions, labels))
print(classification_report(predictions, labels))

# design a neural network class

class NeuralNetwork():
    '''
    Fully connected neural network.
    Attributes:
      n_layers is the number of layers.
      activation is a list of Activation objects corresponding to each layer's activation function.
      loss is a Loss object corresponding to the loss function used to train the network.
      learning_rate is the learning rate.
      W is a list of weight matrix used in each layer.
      b is a list of biases used in each layer.
    '''

    def __init__(self, layer_size: List[int], activation: List[Activation], loss: Loss, learning_rate: float = 0.01) -> None:
        '''
        Initializes a NeuralNetwork object
        '''
        assert len(activation) == len(layer_size), \
        "Number of sizes for layers provided does not equal the number of activation"
        self.layer_size = layer_size
        self.num_layer = len(layer_size)
        self.activation = activation
        self.loss = loss
        self.learning_rate = learning_rate
        self.W = []
        self.b = []
        for i in range(self.num_layer-1):
            W = np.random.randn(layer_size[i], layer_size[i+1]) #/ np.sqrt(layer_size[i])
            b = np.random.randn(1, layer_size[i+1])
            self.W.append(W)
            self.b.append(b)
        self.A = []
        self.Z = []

    def forward(self, X: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
        '''
        Forward pass of the network on a dataset of n examples with m features. Except for the first layer,
        each layer computes a linear transformation plus a bias, followed by a nonlinear transformation.
        Parameters:
          X is the training data with shape (n, m).
        Returns:
          Z[-1], the input (logits) to the last layer, with shape (n, self.layer_size[-1]).
          A[-1], the output of the last layer, with shape (n, self.layer_size[-1]).
        The full lists A and Z (the output and input of every layer) are stored in self.A and self.Z for use
        in the backward pass.
        '''
        num_sample = X.shape[0]
        A, Z = [], []
        for i in range(self.num_layer):
            if i == 0:
                a = X.copy()
                z = X.copy()
            else:
                a = A[-1]
                z = a.dot(self.W[i-1]) + self.b[i-1]
                a = self.activation[i].value(z)
            Z.append(z)
            A.append(a)
        self.A = A
        self.Z = Z
        return Z[-1], A[-1]

    def backward(self, dLdyhat: np.ndarray) -> None:
        '''
        Backward pass of the network on a dataset of n examples. The derivatives are computed from the end of
        the network to the front.
        Parameters:
          dLdyhat is the derivative of the empirical loss w.r.t. yhat, the output of the neural network, with
            shape (n, self.layer_size[-1]).
        Returns:
          None. The list dZ is stored in self.dLdZ; each array in it is the derivative of the empirical loss
            w.r.t. the input of the corresponding layer, with shape (n, self.layer_size[i]).
        '''
        dZ = []
        for i in range(self.num_layer-1, -1, -1):
            if i == self.num_layer - 1:
                dLdz = dLdyhat
            else:
                dLda = np.dot(dLdz, self.W[i].T)
                dLdz = self.activation[i].derivative(self.Z[i])*dLda# derivative w.r.t. net input for layer i
            dZ.append(dLdz)
        dZ = list(reversed(dZ))
        self.dLdZ = dZ
        return

    def update_weights(self) -> None:
        '''
        Having computed the per-layer derivatives self.dLdZ in the backward pass, update each weight and bias
        with the gradient of the loss summed over the training examples, scaled by the learning rate.
        Returns:
          None. self.W and self.b are updated in place.
        '''
        for i in range(self.num_layer-1):
            a = self.A[i]
            dW = np.dot(a.T, self.dLdZ[i+1])
            db = np.sum(self.dLdZ[i+1], axis = 0, keepdims = True)
            self.W[i] -= self.learning_rate*dW
            self.b[i] -= self.learning_rate*db
        return
    
    def one_epoch(self, X: np.ndarray, Y: np.ndarray, batch_size: int, train: bool = True) -> Tuple[float, float]:
        '''
        One epoch of either the training or the testing procedure.
        Parameters:
          X is the data input, a two dimensional numpy array.
          Y contains the one-hot data labels, a two dimensional numpy array.
          batch_size is the number of samples in each batch.
          train is a boolean value indicating training or testing procedure.
        Returns:
          loss_value is the average loss function value.
          acc_value is the prediction accuracy.
        '''
        n = X.shape[0]
        slices = list(gen_batches(n, batch_size))
        num_batch = len(slices)
        idx = list(range(n))
        np.random.shuffle(idx)
        loss_value, acc_value = 0, 0
        for i, index in enumerate(slices):
            index = idx[slices[i]]
            x, y = X[index,:], Y[index]
            z, yhat = self.forward(x)   # Execute forward pass (use self, not the global model)
            if train:
                dLdz = self.loss.derivative(z, y)         # Calculate derivative of the loss with respect to out
                self.backward(dLdz)     # Execute the backward pass to compute the deltas
                self.update_weights()  # Calculate the gradients and update the weights
            loss_value += self.loss.value(z, y)*x.shape[0]
            acc_value += accuracy(yhat, y)*x.shape[0]
        loss_value = loss_value/n
        acc_value = acc_value/n
        return loss_value, acc_value
def train(model : NeuralNetwork, X: np.ndarray, Y: np.ndarray, batch_size: int, epoches: int) -> Tuple[List[float], List[float]]:
    '''
    Trains the neural network.
    Parameters:
      model is a NeuralNetwork object.
      X is the data input, a two dimensional numpy array.
      Y contains the one-hot data labels, a two dimensional numpy array.
      batch_size is the number of samples in each batch.
      epoches is an integer, representing the number of epoches.
    Returns:
      epoch_loss is a list of float numbers, representing the loss function value in every epoch.
      epoch_acc is a list of float numbers, representing the accuracy in every epoch.
    '''
    loss_value, acc = model.one_epoch(X, Y, batch_size, train = False)
    epoch_loss, epoch_acc = [loss_value], [acc]
    print('Initialization: ', 'loss %.4f  '%loss_value, 'accuracy %.2f'%acc)
    for epoch in range(epoches):
        if epoch%100 == 0 and epoch > 0: # decrease the learning rate
            model.learning_rate = min(model.learning_rate/10, 1.0e-5)
        loss_value, acc = model.one_epoch(X, Y, batch_size, train = True)
        if epoch%10 == 0:
            print("Epoch {}/{}: Loss={}, Accuracy={}".format(epoch, epoches, loss_value, acc))
        epoch_loss.append(loss_value)
        epoch_acc.append(acc)
    return epoch_loss, epoch_acc

# training procedure
num_sample, num_feature = trainX.shape
epoches = 200
batch_size = 512
Loss = []
Acc = []
learning_rate = 1/num_sample*batch_size
np.random.seed(2022)
model = NeuralNetwork([784, 256, 64, 10], [Identity(), ReLU(), ReLU(), Softmax()], CEwithLogit(), learning_rate = learning_rate)
epoch_loss, epoch_acc = train(model, trainX, trainY, batch_size, epoches)

# testing procedure
test_loss, test_acc = model.one_epoch(testX, testY, batch_size, train = False)
z, yhat = model.forward(testX)
yhat = np.argmax(yhat, axis = 1)
y = np.argmax(testY, axis = 1)
print(yhat.shape, y.shape)
print(confusion_matrix(yhat, y))
print(classification_report(yhat, y))
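
# Optional: visualize the training curves returned by train() above
# (a quick sketch reusing the matplotlib import; epoch_loss and epoch_acc come from the training run).
plt.figure(figsize=(10, 4))
plt.subplot(1, 2, 1)
plt.plot(epoch_loss)
plt.xlabel('epoch')
plt.ylabel('training loss')
plt.subplot(1, 2, 2)
plt.plot(epoch_acc)
plt.xlabel('epoch')
plt.ylabel('training accuracy')
plt.show()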

The complete code has two parts: the first builds a three-layer network entirely by hand, and the second reimplements the same logic as a general NeuralNetwork class.

Results

Initialization:  loss 786.6578   accuracy 0.09
Epoch 0/200: Loss=67.51685728447026, Accuracy=0.6034333333333334
Epoch 10/200: Loss=1.5569211739656414, Accuracy=0.6363833333333333
Epoch 20/200: Loss=1.195973422899852, Accuracy=0.6734166666666667
Epoch 30/200: Loss=1.0514650120496702, Accuracy=0.7037666666666667
Epoch 40/200: Loss=0.9529855833743788, Accuracy=0.7264666666666667
Epoch 50/200: Loss=0.9016336932334414, Accuracy=0.7411833333333333
Epoch 60/200: Loss=0.8326483342395156, Accuracy=0.7559
Epoch 70/200: Loss=0.78866026237516, Accuracy=0.7687
Epoch 80/200: Loss=0.7709660201411853, Accuracy=0.77445
Epoch 90/200: Loss=0.726955942954085, Accuracy=0.7852333333333333
Epoch 100/200: Loss=0.8591143145912691, Accuracy=0.7367
Epoch 110/200: Loss=0.5961169773508302, Accuracy=0.8235333333333333
Epoch 120/200: Loss=0.5877743327835988, Accuracy=0.8259833333333333
Epoch 130/200: Loss=0.5841449699894233, Accuracy=0.8258833333333333
Epoch 140/200: Loss=0.5818215242196157, Accuracy=0.8264
Epoch 150/200: Loss=0.5801588790387723, Accuracy=0.8270833333333333
Epoch 160/200: Loss=0.5788907581218291, Accuracy=0.8273833333333334
Epoch 170/200: Loss=0.5778682345964669, Accuracy=0.8274666666666667
Epoch 180/200: Loss=0.577021751676371, Accuracy=0.82755
Epoch 190/200: Loss=0.5762952337956709, Accuracy=0.8273833333333334
(10000,) (10000,)
[[ 891    0   23   10    0   20   25    7   28   10]
 [   0 1062   11    3    4    2    3    7    5    7]
 [   7   12  848   35   20    9   21   27   27    6]
 [   2   18   16  810    7   70    1    8   31   22]
 [   3    1   19    6  757   14   18   11    8  101]
 [  22    6   13   75   10  616   19    2   53   19]
 [  23    2   20    1   35   46  849    0   19    8]
 [   7    3   21   19    6   10    1  872   14   42]
 [  19   22   45   24    6   62   14    9  756   17]
 [   6    9   16   27  137   43    7   85   33  777]]
              precision    recall  f1-score   support

           0       0.91      0.88      0.89      1014
           1       0.94      0.96      0.95      1104
           2       0.82      0.84      0.83      1012
           3       0.80      0.82      0.81       985
           4       0.77      0.81      0.79       938
           5       0.69      0.74      0.71       835
           6       0.89      0.85      0.87      1003
           7       0.85      0.88      0.86       995
           8       0.78      0.78      0.78       974
           9       0.77      0.68      0.72      1140

    accuracy                           0.82     10000
   macro avg       0.82      0.82      0.82     10000
weighted avg       0.82      0.82      0.82     10000
Copyright notice: this is an original article by the blogger, released under the CC 4.0 BY-SA license. Please include the original source link and this notice when reposting.
Original link: https://blog.csdn.net/weixin_44673253/article/details/125434687
