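Keras Cheat Sheet

To keep a record of the basic Keras API, this post walks through a minimal end-to-end machine learning workflow for linear regression.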

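1. Building the model

Define a simple linear regression model, using the Keras module to build and compile it. Taking the simplest single-layer network as an example, set one output node; the number of input nodes equals the number of features.
keras.Sequential(layers=None, trainable=True, name=None) is a class in keras.models that represents a sequential model (layers are stacked linearly, with no cross-layer connections).

Note that the model can be defined without any layers, so how are layers added? Use the model's own method, Sequential.add(layer, rebuild=True), where:

layer: layer instance.

What is a layer instance? A layer consists of a computation function whose inputs and outputs are tensors (the layer's call method) and some state, which is held in TensorFlow variables (the layer's weights). The methods for reading and writing weights that all layers share are:

layer.get_weights(): returns the layer's weights as a list of NumPy arrays.
layer.set_weights(weights): sets the layer's weights from a list of NumPy arrays (with the same shapes as the output of get_weights).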
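As a minimal sketch of these two methods, the snippet below builds a standalone Dense layer and round-trips its weights. The choice of 3 input features is an assumption made purely for illustration.

```python
import numpy as np
import keras

# A Dense layer has no weights until it is built for a given input shape.
layer = keras.layers.Dense(units=1)
layer.build((None, 3))  # batch axis unknown, 3 input features (assumed)

# get_weights returns a list of NumPy arrays: [kernel, bias].
kernel, bias = layer.get_weights()
print(kernel.shape, bias.shape)

# set_weights expects arrays with the same shapes, in the same order.
layer.set_weights([np.ones_like(kernel), np.zeros_like(bias)])
new_kernel, new_bias = layer.get_weights()
```

The kernel has shape (input features, units) and the bias has shape (units,), which is why set_weights must receive arrays of exactly those shapes.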

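A layer needs to be told the input shape only when it is the first layer of the model:

input_shape: a tuple of integers that does not include the sample (batch) axis.

Take the fully connected layer in keras.layers as an example: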

Dense

 keras.layers.Dense(units, activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)

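For a 2D layer like this one, the input size can alternatively be passed as

input_dim: integer, the number of input features.

Next, a few important parameters need to be passed:

units: positive integer, the dimensionality of the output space.
activation: activation function (see activations). If None, no activation is applied (i.e., linear activation: a(x) = x).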
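To make units and activation concrete, here is a NumPy-only sketch of the computation a dense layer performs, activation(x @ kernel + bias). The helper name dense_forward and the sample numbers are hypothetical; this is not the Keras implementation.

```python
import numpy as np

def dense_forward(x, kernel, bias, activation=None):
    """Compute activation(x @ kernel + bias); activation=None means linear."""
    z = x @ kernel + bias
    return z if activation is None else activation(z)

x = np.array([[1.0, 2.0], [3.0, 4.0]])  # 2 samples, 2 input features
kernel = np.array([[0.5], [-1.0]])      # shape (input_dim, units) = (2, 1)
bias = np.array([0.1])                  # shape (units,)

# With activation=None the output is the plain affine map a(x) = x.
linear_out = dense_forward(x, kernel, bias)
print(linear_out)
```

With units=1 the output has one column per sample, which is the shape a single-output linear regression needs.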

To create a model with specified parameters, you can pass a list of layer instances to the Sequential constructor:

Sequential

from keras.models import Sequential
from keras.layers import Dense
num_features = 1
model = Sequential([Dense(1, input_shape=(num_features,))])

Alternatively, simply add the layers to the model with the .add() method:

model = keras.models.Sequential()
# Describe the topography of the model.
# The topography of a simple linear regression model is a single node in a single layer.
model.add(keras.layers.Dense(units=1, input_shape=(num_features,)))

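Before training, the model's topography must be compiled into a program that Keras can execute efficiently. This is done with the Sequential model's compile method, which takes three arguments:

Optimizer (optimizer). It can be the string identifier of an existing optimizer, such as rmsprop or adagrad, or an instance of an Optimizer class, in which case you can customize its arguments.

The classes in keras.optimizers share the common parameters clipnorm and clipvalue, which control gradient clipping.

For an optimizer object, take keras.optimizers.RMSprop(learning_rate=0.001, rho=0.9) as an example:

- learning_rate: float >= 0. The learning rate.
- rho: float >= 0. Decay rate of the moving average of squared gradients in RMSProp.

See: optimizers.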
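To show what learning_rate and rho control, the following is a NumPy sketch of one textbook RMSProp update. The function rmsprop_step and the epsilon value are assumptions for illustration; the Keras implementation may differ in details such as where epsilon is added, and it also supports momentum and centering.

```python
import numpy as np

def rmsprop_step(w, grad, v, learning_rate=0.001, rho=0.9, epsilon=1e-7):
    """One textbook RMSProp step (a sketch, not the Keras source)."""
    # rho is the decay rate of the moving average of squared gradients.
    v = rho * v + (1.0 - rho) * grad ** 2
    # The step is the gradient scaled by the root of that moving average.
    w = w - learning_rate * grad / (np.sqrt(v) + epsilon)
    return w, v

w, v = np.array([1.0]), np.array([0.0])
grad = np.array([2.0])
w, v = rmsprop_step(w, grad, v)
print(w, v)
```

A larger rho makes the moving average react more slowly to new gradients, which smooths the effective step size.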

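Loss function (loss): the objective that the model tries to minimize. It can be the string identifier of an existing loss function, such as categorical_crossentropy or mse, or a loss function itself. See: losses, Compile.
Metrics (metrics): a list of string identifiers of existing metrics, instances of keras.metrics.Metric, or custom metric functions.

For any classification problem you will want to set metrics = ['accuracy']. The example below uses the root mean squared error as the metric: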

keras.metrics.RootMeanSquaredError(name="root_mean_squared_error", dtype=None)

Compile

 model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=my_learning_rate),
                loss="mean_squared_error",
                metrics=[keras.metrics.RootMeanSquaredError()])

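Note: for the built-in metric functions that the metrics argument of compile accepts, see: metrics. The root mean squared error object in the example above is an instance of one of the metric classes described in Metrics - Keras Chinese documentation.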
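Since the loss here is the mean squared error while the metric is its square root, a small NumPy sketch may help keep the two apart. The sample values are made up for illustration.

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.0, 2.0, 3.0, 8.0])

# Mean squared error: the average of the squared residuals.
mse = np.mean((y_true - y_pred) ** 2)
# Root mean squared error: same information, in the units of the target.
rmse = np.sqrt(mse)
print(mse, rmse)
```

The RMSE is often easier to read because it is expressed in the same units as the labels.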

The following program creates a single-layer model with a one-dimensional output.

def build_model(my_learning_rate, num_features):
  """Create and compile a simple linear regression model."""
  # Most simple keras models are sequential.
  model = keras.models.Sequential()

  # Describe the topography of the model.
  # The topography of a simple linear regression model
  # is a single node in a single layer.
  model.add(keras.layers.Dense(units=1,
                               input_shape=(num_features,)))

  # Compile the model topography into code that Keras can efficiently
  # execute. Configure training to minimize the model's mean squared error.
  model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=my_learning_rate),
                loss="mean_squared_error",
                metrics=[keras.metrics.RootMeanSquaredError()])

  return model

build_model

2. Training the model

Keras models are trained on NumPy arrays of input data and labels, using the Sequential model's fit method:

fit

 fit(x=None, y=None, batch_size=None, epochs=1, verbose=1, callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0, steps_per_epoch=None, validation_steps=None, validation_freq=1, max_queue_size=10, workers=1, use_multiprocessing=False)

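Commonly used parameters include:

x: input data. It can be:
- a NumPy array (or array-like), among others
- a dict mapping input names to the corresponding arrays/tensors (if the model has named inputs)
y: target data. Like the input data x, it can be a NumPy array (or arrays), among others.
batch_size: integer or None. Number of samples per gradient update. Defaults to 32 if unspecified. (This description is incomplete.)
epochs: integer. Number of epochs to train the model. An epoch is one pass over the entire x and y data. Note that, together with initial_epoch, epochs is understood as the "final epoch": the model is not trained for epochs additional rounds, but until the epoch with index epochs is reached.

This method returns a History object: its History.epoch attribute is a list of the indices of the training epochs, and its History.history attribute is a dict recording the loss and metric values over successive epochs for training and, if applicable, validation. The dict can be passed directly to the pandas.DataFrame constructor.

For more information, see Sequential model - Chinese documentation.

After training, the model's get_weights() method shows the model parameters (weights, biases, etc.). It returns a flat list whose elements are, in order: the first layer's weights, the first layer's bias, the second layer's weights, the second layer's bias, and so on (a layer's weight matrix has shape (input dimension of the layer, output dimension)).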
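Putting fit, History and get_weights together, here is a small sketch on synthetic data. The data (y = 3x + 2 plus noise), the hyperparameters, and the use of keras.Input as the first entry of the layer list are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd
import keras

keras.utils.set_random_seed(0)

# Synthetic data (assumed for illustration): y = 3x + 2 plus small noise.
x = np.linspace(0.0, 1.0, 64).reshape(-1, 1).astype("float32")
noise = np.random.normal(0.0, 0.01, size=(64, 1))
y = (3.0 * x + 2.0 + noise).astype("float32")

model = keras.Sequential([keras.Input(shape=(1,)), keras.layers.Dense(units=1)])
model.compile(
    optimizer=keras.optimizers.RMSprop(learning_rate=0.01),
    loss="mean_squared_error",
    metrics=[keras.metrics.RootMeanSquaredError()],
)

history = model.fit(x=x, y=y, batch_size=16, epochs=5, verbose=0)

# History.epoch holds the epoch indices; History.history holds the curves.
history_df = pd.DataFrame(history.history)
print(history.epoch)
print(history_df.columns.tolist())

# One Dense layer gives a flat list of two arrays: [kernel, bias].
kernel, bias = model.get_weights()
print(kernel.shape, bias.shape)
```

Each row of history_df corresponds to one epoch, so it can be plotted against history.epoch directly.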

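3. Validating the model

Use the Sequential model's predict_on_batch(x) method, where

x: input data, a NumPy array (a pandas Series can be converted with its Series.values attribute) or a list of arrays (if the model has multiple inputs).

It returns a NumPy array of predictions.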
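A minimal sketch of predict_on_batch follows. To keep the result predictable, the weights are set by hand to y = 2x + 1, which is an assumption for illustration rather than a trained model.

```python
import numpy as np
import keras

model = keras.Sequential([keras.Input(shape=(1,)), keras.layers.Dense(units=1)])

# Hand-set weights (assumed): kernel 2, bias 1, so the model computes 2x + 1.
model.set_weights([
    np.array([[2.0]], dtype="float32"),
    np.array([1.0], dtype="float32"),
])

x_batch = np.array([[1.0], [2.0], [3.0]], dtype="float32")
predictions = model.predict_on_batch(x_batch)
print(predictions.shape)
```

The result has one row per input sample and one column per output unit.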

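To round out the basic Keras usage, this post also walks through a minimal binary classification workflow from start to finish.

1. Setting the random seed

To make the experiment reproducible, set the random seeds of NumPy, the backend, and Python, all with a single line:

Random seed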

 keras.utils.set_random_seed(42)
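As a quick check of what this call does, the sketch below reseeds and draws from Python's random module and from NumPy twice. The helper name draw is hypothetical.

```python
import random

import numpy as np
import keras

def draw():
    """Reseed everything, then draw from Python's and NumPy's generators."""
    keras.utils.set_random_seed(42)
    return random.random(), np.random.rand(3)

py_a, np_a = draw()
py_b, np_b = draw()

# Both draws follow the same seed, so they should match.
print(py_a == py_b, np.array_equal(np_a, np_b))
```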

2. Labeling the data (numeric labels) and splitting it

For a binary classification problem, the labels can be converted to 0 and 1:


 normalized_dataset['Class_Bool'] = (
    normalized_dataset['Class'] == 'Cammeo'
).astype(int)
normalized_dataset.sample(10)

The broadcast comparison in the code above is not covered further here. Series.astype(dtype, copy=None, errors='raise') casts a pandas object to a specified dtype, where

dtype: str, data type, Series or Mapping of column name -> data type
Use a str, numpy.dtype, pandas.ExtensionDtype or Python type to cast entire pandas object to the same type.
copy: bool, return a copy when copy=True.
errors: {'raise', 'ignore'}, default 'raise'
Control raising of exceptions on invalid data for provided dtype.
- raise : allow exceptions to be raised
- ignore : suppress exceptions. On error return original object.
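The comparison-then-cast pattern can be tried on a toy frame. The three rows below, including the second class name, are made up for illustration and are not taken from the dataset used in this post.

```python
import pandas as pd

# Toy data (assumed): two class names, one of which is 'Cammeo'.
df = pd.DataFrame({"Class": ["Cammeo", "Osmancik", "Cammeo"]})

# The comparison is applied element-wise and yields a boolean Series.
is_cammeo = df["Class"] == "Cammeo"

# astype(int) maps True to 1 and False to 0.
df["Class_Bool"] = is_cammeo.astype(int)
print(df)
```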

Splitting the dataset

 number_samples = len(normalized_dataset)
index_80th = round(number_samples * 0.8)
index_90th = index_80th + round(number_samples * 0.1)

shuffled_dataset = normalized_dataset.sample(frac=1, random_state=100)
train_data = shuffled_dataset.iloc[0:index_80th]
validation_data = shuffled_dataset.iloc[index_80th:index_90th]
test_data = shuffled_dataset.iloc[index_90th:]

test_data.head()

To prevent label leakage, the usual practice is to store the features and the labels separately:


 label_columns = ['Class', 'Class_Bool']

train_features = train_data.drop(columns=label_columns)
train_labels = train_data['Class_Bool'].to_numpy()
validation_features = validation_data.drop(columns=label_columns)
validation_labels = validation_data['Class_Bool'].to_numpy()
test_features = test_data.drop(columns=label_columns)
test_labels = test_data['Class_Bool'].to_numpy()

3. Training

Input features

 input_features = [
    'Eccentricity',
    'Major_Axis_Length',
    'Area',
]

A dataclass is a convenient way to store the experiment parameters and the experiment record.

Data classes

import dataclasses

import keras
import numpy as np
import pandas as pd


@dataclasses.dataclass()
class ExperimentSettings:
  """Lists the hyperparameters and input features used to train a model."""

  learning_rate: float
  number_epochs: int
  batch_size: int
  classification_threshold: float
  input_features: list[str]


@dataclasses.dataclass()
class Experiment:
  """Stores the settings used for a training run and the resulting model."""

  name: str
  settings: ExperimentSettings
  model: keras.Model
  epochs: np.ndarray
  metrics_history: pd.DataFrame  # built from History.history in train_model

  def get_final_metric_value(self, metric_name: str) -> float:
    """Gets the final value of the given metric for this experiment."""
    if metric_name not in self.metrics_history:
      raise ValueError(
          f'Unknown metric {metric_name}: available metrics are'
          f' {list(self.metrics_history.columns)}'
      )
    return self.metrics_history[metric_name].iloc[-1]

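The Python syntax above is easy to follow once you are familiar with Annotations Best Practices and dataclasses - Data Classes.

A model can also be created by specifying its inputs and outputs. For example, if a, b and c are Keras tensors, you can write: model = Model(inputs=[a, b], outputs=c).

First, instantiate the input objects:

keras.Input(shape=None, batch_size=None, dtype=None, sparse=None, batch_shape=None, name=None, tensor=None)

where

name: optional name string for the layer. It should be unique within a model (do not reuse the same name twice). It is generated automatically if not provided.
shape: a shape tuple (a tuple of integers or None objects), not including the batch size. For example, shape=(32,) means the expected input is batches of 32-dimensional vectors. Elements of this tuple can be None; a None element represents a dimension whose size is unknown and may vary (e.g., sequence length).

It returns a Keras tensor.

Several input objects (tensors) can be joined into one: an instance of the keras.layers.Concatenate(axis=-1, **kwargs) class takes a list of tensors as input (all of the same shape except along the concatenation axis) and returns a single concatenated tensor.

With the input tensors in hand, the next step is to determine the output tensor. How is it obtained? Feed the input to a layer, and the output the layer computes is the output tensor.

Take the dense layer, the common fully connected neural network layer, as an example. An instance of the Dense class accepts an N-D tensor of shape (batch_size, ..., input_dim) as input; the most common case is a 2D input of shape (batch_size, input_dim). It computes the forward pass and outputs an N-D tensor of shape (batch_size, ..., units). For example, for a 2D input of shape (batch_size, input_dim), the output has shape (batch_size, units).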
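The shape rules above can be checked with a small functional-API sketch. The input names feature_a and feature_b are hypothetical.

```python
import keras

# Two scalar inputs; the batch axis is left out of shape.
input_a = keras.Input(shape=(1,), name="feature_a")
input_b = keras.Input(shape=(1,), name="feature_b")

# Concatenate joins them along the last axis: (None, 1) + (None, 1) -> (None, 2).
concatenated = keras.layers.Concatenate()([input_a, input_b])

# Dense maps (batch_size, input_dim) to (batch_size, units).
output = keras.layers.Dense(units=1, activation="sigmoid")(concatenated)

model = keras.Model(inputs=[input_a, input_b], outputs=output)
print(tuple(concatenated.shape), tuple(output.shape))
```

The None in each printed shape is the batch axis, whose size is only known when data is fed in.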

Creating the model topography

def create_model(
    settings: ExperimentSettings,
    metrics: list[keras.metrics.Metric],
) -> keras.Model:
  """Create and compile a simple classification model."""
  model_inputs = [
      keras.Input(name=feature, shape=(1,))
      for feature in settings.input_features
  ]
  # Use a Concatenate layer to assemble the different inputs into a single
  # tensor which will be given as input to the Dense layer.
  # For example: [input_1[0][0], input_2[0][0]]

  concatenated_inputs = keras.layers.Concatenate()(model_inputs)
  dense = keras.layers.Dense(
      units=1, input_shape=(1,), name='dense_layer', activation=keras.activations.sigmoid
  )
  model_output = dense(concatenated_inputs)
  model = keras.Model(inputs=model_inputs, outputs=model_output)
  # Call the compile method to transform the layers into a model that
  # Keras can execute.  Notice that we're using a different loss
  # function for classification than for regression.
  model.compile(
      optimizer=keras.optimizers.RMSprop(
          settings.learning_rate
      ),
      loss=keras.losses.BinaryCrossentropy(),
      metrics=metrics,
  )
  return model

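Attentive readers may ask: the Dense documentation does not explicitly list name among the arguments of the Dense class, so why can it be used in the example above? In fact, its base class

keras.layers.Layer(activity_regularizer=None, trainable=True, dtype=None, autocast=True, name=None, **kwargs)

includes this argument.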
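A short sketch confirming this behaviour: the name keyword is accepted by Dense and ends up on the layer object.

```python
import keras

# name is handled by the Layer base class, which Dense inherits from.
dense = keras.layers.Dense(units=1, name="dense_layer")
print(dense.name)
print(issubclass(keras.layers.Dense, keras.layers.Layer))
```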

Defining the training function

Training function

def train_model(
    experiment_name: str,
    model: keras.Model,
    dataset: pd.DataFrame,
    labels: np.ndarray,
    settings: ExperimentSettings,
) -> Experiment:
  """Feed a dataset into the model in order to train it."""

  # The x parameter of keras.Model.fit can be a dict mapping input names to
  # arrays, where each array contains the data for one feature.
  features = {
      feature_name: np.array(dataset[feature_name])
      for feature_name in settings.input_features
  }

  history = model.fit(
      x=features,
      y=labels,
      batch_size=settings.batch_size,
      epochs=settings.number_epochs,
  )

  return Experiment(
      name=experiment_name,
      settings=settings,
      model=model,
      epochs=history.epoch,
      metrics_history=pd.DataFrame(history.history),
  )


print('Defined the create_model and train_model functions.')

Starting the main experiment: set the experiment parameters

Experiment 1 parameters

settings = ExperimentSettings(
    learning_rate=0.001,
    number_epochs=60,
    batch_size=100,
    classification_threshold=0.35,
    input_features=input_features,
)

metrics = [
    keras.metrics.BinaryAccuracy(
        name='accuracy', threshold=settings.classification_threshold
    ),
    keras.metrics.Precision(
        name='precision', thresholds=settings.classification_threshold
    ),
    keras.metrics.Recall(
        name='recall', thresholds=settings.classification_threshold
    ),
    keras.metrics.AUC(num_thresholds=100, name='auc'),
]

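In the example above, the metrics argument passed to model.compile is now a list of keras.metrics.Metric instances:

keras.metrics.BinaryAccuracy(name="binary_accuracy", dtype=None, threshold=0.5) computes binary classification accuracy, where

name: (Optional) name of the metric instance as reported in results.
dtype: (Optional) data type of the metric result.
threshold: (Optional) Float, the threshold for deciding whether a prediction value counts as 1 or 0.

keras.metrics.Precision(thresholds=None, top_k=None, class_id=None, name=None, dtype=None) computes precision, where

thresholds: (Optional) A float value, or a Python list/tuple of float values in [0, 1]. If neither thresholds nor top_k is set, the default threshold is 0.5.
top_k: (Optional) An int value, unset by default. If it is not set, predictions are judged against thresholds. If it is set, the label class is looked up among the top_k classes with the highest predicted probability, and the sample counts as a TP when it is found there.
name: as above.

keras.metrics.Recall(thresholds=None, top_k=None, class_id=None, name=None, dtype=None) computes recall, where

top_k: analogous to the above.
name: as above.
thresholds: as above.

keras.metrics.AUC(num_thresholds=200, curve="ROC", summation_method="interpolation", name=None, dtype=None, thresholds=None, multi_label=False, num_labels=None, label_weights=None, from_logits=False)

computes the AUC of the ROC or PR curve, where:

num_thresholds: (Optional) number of thresholds used to discretize the curve. Must be greater than 1.
name: (Optional) as above.

Note that the argument names differ slightly across keras.metrics.Metric classes; the threshold, for example, is threshold in some and thresholds in others.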
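To see how a classification threshold feeds into accuracy, precision and recall, here is a NumPy-only sketch. The helper binary_metrics and the sample scores are hypothetical, and it assumes a prediction is positive when its score is strictly greater than the threshold.

```python
import numpy as np

def binary_metrics(y_true, y_prob, threshold):
    """Accuracy, precision and recall at one threshold (illustrative sketch)."""
    y_pred = (y_prob > threshold).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))  # true positives
    fp = np.sum((y_pred == 1) & (y_true == 0))  # false positives
    fn = np.sum((y_pred == 0) & (y_true == 1))  # false negatives
    accuracy = np.mean(y_pred == y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

y_true = np.array([1, 0, 1, 1, 0])
y_prob = np.array([0.9, 0.4, 0.3, 0.6, 0.1])
acc, prec, rec = binary_metrics(y_true, y_prob, threshold=0.35)
print(acc, prec, rec)
```

Lowering the threshold turns more scores into positive predictions, which tends to raise recall at the cost of precision.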

The experiment itself:

Core code

# Establish the model's topography.
model = create_model(settings, metrics)

# Train the model on the training set.
experiment = train_model(
    'baseline', model, train_features, train_labels, settings
)

Visualization: define a plotting function based on matplotlib:

Plotting function

def plot_experiment_metrics(experiment: Experiment, metrics: list[str]):
  """Plot a curve of one or more metrics for different epochs."""
  plt.figure(figsize=(12, 8))

  for metric in metrics:
    plt.plot(
        experiment.epochs, experiment.metrics_history[metric], label=metric
    )

  plt.xlabel("Epoch")
  plt.ylabel("Metric value")
  plt.grid()
  plt.legend()


print("Defined the plot_experiment_metrics function.")

Drawing the plots

# Plot metrics vs. epochs
plot_experiment_metrics(experiment, ['accuracy', 'precision', 'recall'])
plot_experiment_metrics(experiment, ['auc'])

4. Testing

Model.evaluate(x=None, y=None, batch_size=None, verbose="auto", sample_weight=None, steps=None, callbacks=None, return_dict=False, **kwargs)

where

x: input data. It can be:
- A NumPy array (or array-like), or a list of arrays (if the model has multiple inputs).
- A tensor, or a list of tensors (if the model has multiple inputs).
- A dict mapping input names to the corresponding arrays/tensors (if the model has named inputs).
- Among others.
y: target data (labels). It can be either NumPy array(s) or backend-native tensor(s). (Not exhaustive.)
batch_size: integer or None. Number of samples per batch of computation. Defaults to 32 if unspecified. (This description is incomplete.)
verbose: "auto", 0, 1, or 2. 0 = silent, 1 = progress bar, 2 = single line. "auto" becomes 1 in most cases.
return_dict: if True, loss and metric results are returned as a dict, with each key being the name of the metric. Otherwise they are returned as a list.

It returns the scalar test loss (if the model has a single output and no metrics) or a list of scalars (if the model has multiple outputs and/or metrics). See the evaluate documentation for the full description.
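The difference between the list and dict return values can be seen in a small sketch. The data and the hand-set weights (y = 2x + 1, so the loss is zero) are assumptions for illustration.

```python
import numpy as np
import keras

x = np.array([[0.0], [1.0], [2.0], [3.0]], dtype="float32")
y = np.array([[1.0], [3.0], [5.0], [7.0]], dtype="float32")

model = keras.Sequential([keras.Input(shape=(1,)), keras.layers.Dense(units=1)])
model.compile(
    optimizer="rmsprop",
    loss="mean_squared_error",
    metrics=[keras.metrics.RootMeanSquaredError()],
)
# Hand-set weights (assumed) so that the model reproduces y exactly.
model.set_weights([
    np.array([[2.0]], dtype="float32"),
    np.array([1.0], dtype="float32"),
])

# Default: a list of scalars, the loss first and then each metric.
as_list = model.evaluate(x=x, y=y, verbose=0)
# return_dict=True: the same values keyed by name.
as_dict = model.evaluate(x=x, y=y, verbose=0, return_dict=True)
print(as_list)
print(as_dict)
```

The dict form is convenient when results are compared by metric name, as in the functions that follow.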

Based on this, define an evaluation function

Evaluation function

def evaluate_experiment(
    experiment: Experiment, test_dataset: pd.DataFrame, test_labels: np.ndarray
) -> dict[str, float]:
  features = {
      feature_name: np.array(test_dataset[feature_name])
      for feature_name in experiment.settings.input_features
  }
  return experiment.model.evaluate(
      x=features,
      y=test_labels,
      batch_size=experiment.settings.batch_size,
      verbose=0,  # Hide progress bar
      return_dict=True,
  )

Run the test and compare the training and test results:

Testing and comparison

 def compare_train_test(experiment: Experiment, test_metrics: dict[str, float]):
  print('Comparing metrics between train and test:')
  for metric, test_value in test_metrics.items():
    print('------')
    print(f'Train {metric}: {experiment.get_final_metric_value(metric):.4f}')
    print(f'Test {metric}:  {test_value:.4f}')


# Evaluate test metrics
test_metrics = evaluate_experiment(experiment, test_features, test_labels)
compare_train_test(experiment, test_metrics)

posted on 2024-09-10 18:19 后生那各膊客圆了


ArmRoundMan
