【489】Advanced Deep Learning Best Practices
Reference: Deep Learning with Python, p. 196
1. Beyond the Sequential model: the Keras functional API
The functional API handles architectures that the Sequential model cannot express, for example:
- a multi-input model
- a multi-output (or multi-head) model
1.1 Introduction to the functional API
Both follow the same input-to-output pattern; the two approaches below define identical networks.
from keras.models import Sequential, Model
from keras import layers, Input

# A Sequential model
seq_model = Sequential()
seq_model.add(layers.Dense(32, activation='relu', input_shape=(64,)))
seq_model.add(layers.Dense(32, activation='relu'))
seq_model.add(layers.Dense(10, activation='softmax'))

# The functional-API equivalent of the model above
input_tensor = Input(shape=(64,))
h1 = layers.Dense(32, activation='relu')(input_tensor)
h2 = layers.Dense(32, activation='relu')(h1)
output_tensor = layers.Dense(10, activation='softmax')(h2)

model = Model(input_tensor, output_tensor)
model.summary()
Output:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 64)                0
_________________________________________________________________
dense_1 (Dense)              (None, 32)                2080
_________________________________________________________________
dense_2 (Dense)              (None, 32)                1056
_________________________________________________________________
dense_3 (Dense)              (None, 10)                330
=================================================================
Total params: 3,466
Trainable params: 3,466
Non-trainable params: 0
_________________________________________________________________
When compiling, training, or evaluating such a Model instance, the API is the same as for a Sequential model.
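For instance, a minimal sketch with random placeholder data (the 1000-sample arrays are purely illustrative, shaped only to match the model above):

import numpy as np

model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

# Random arrays purely to exercise the API; not meaningful data
x_train = np.random.random((1000, 64))
y_train = np.random.random((1000, 10))

model.fit(x_train, y_train, epochs=10, batch_size=128)
score = model.evaluate(x_train, y_train)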
1.2 Multi-input models
A typical question-answering model has two inputs: a natural-language question, and a text snippet (such as a news article) that provides the information needed to answer it. The model must then produce an answer; in the simplest case this is a single word, obtained via a softmax over some predefined vocabulary.
Input: question + text snippet
Output: answer (a single word)
from keras.models import Model
from keras import layers
from keras import Input

text_vocabulary_size = 10000
question_vocabulary_size = 10000
answer_vocabulary_size = 500

# Branch 1: encode the reference text
text_input = Input(shape=(None,), dtype='int32', name='text')
embedded_text = layers.Embedding(text_vocabulary_size, 64)(text_input)
encoded_text = layers.LSTM(32)(embedded_text)

# Branch 2: encode the question
question_input = Input(shape=(None,), dtype='int32', name='question')
embedded_question = layers.Embedding(question_vocabulary_size, 32)(question_input)
encoded_question = layers.LSTM(16)(embedded_question)

# Concatenate both encodings and classify over the answer vocabulary
concatenated = layers.concatenate([encoded_text, encoded_question], axis=-1)
answer = layers.Dense(answer_vocabulary_size, activation='softmax')(concatenated)

model = Model([text_input, question_input], answer)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['acc'])
model.summary()
Output:
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
text (InputLayer)               (None, None)         0
__________________________________________________________________________________________________
question (InputLayer)           (None, None)         0
__________________________________________________________________________________________________
embedding_5 (Embedding)         (None, None, 64)     640000      text[0][0]
__________________________________________________________________________________________________
embedding_6 (Embedding)         (None, None, 32)     320000      question[0][0]
__________________________________________________________________________________________________
lstm_5 (LSTM)                   (None, 32)           12416       embedding_5[0][0]
__________________________________________________________________________________________________
lstm_6 (LSTM)                   (None, 16)           3136        embedding_6[0][0]
__________________________________________________________________________________________________
concatenate_3 (Concatenate)     (None, 48)           0           lstm_5[0][0]
                                                                 lstm_6[0][0]
__________________________________________________________________________________________________
dense_6 (Dense)                 (None, 500)          24500       concatenate_3[0][0]
==================================================================================================
Total params: 1,000,052
Trainable params: 1,000,052
Non-trainable params: 0
__________________________________________________________________________________________________
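To train this two-input model, you pass either a list of input arrays (in the order the Inputs were given to Model) or a dict keyed by the Input names 'text' and 'question'. A sketch with random placeholder data (num_samples and max_length are illustrative values, not from the original post):

import numpy as np

num_samples = 1000
max_length = 100

# Random token ids standing in for encoded texts and questions
text = np.random.randint(1, text_vocabulary_size, size=(num_samples, max_length))
question = np.random.randint(1, question_vocabulary_size, size=(num_samples, max_length))

# One-hot answer vectors, matching the softmax / categorical_crossentropy setup
answers = np.zeros((num_samples, answer_vocabulary_size))
answers[np.arange(num_samples), np.random.randint(0, answer_vocabulary_size, num_samples)] = 1

model.fit([text, question], answers, epochs=10, batch_size=128)  # list form
# model.fit({'text': text, 'question': question}, answers,
#           epochs=10, batch_size=128)                           # dict form, keyed by Input names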
1.3 Multi-output models
The functional API can also build models with several output heads, for example a model that reads a person's social-media posts and predicts their age, income, and gender. Training such a model requires the ability to assign a different loss function to each head: age prediction is a scalar regression task while gender prediction is a binary classification task, so they need different loss functions. But gradient descent requires minimizing a single scalar, so these losses must be combined into one value; the simplest way to combine different losses is to sum them.
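The compile and fit calls below reference output heads named 'age', 'income', and 'gender' on a model that is not defined in this post. A minimal sketch of such a three-head model, loosely following the book's social-media-posts example (vocabulary_size, layer widths, and num_income_groups are illustrative):

from keras.models import Model
from keras import layers, Input

vocabulary_size = 50000   # illustrative
num_income_groups = 10    # illustrative

posts_input = Input(shape=(None,), dtype='int32', name='posts')
embedded_posts = layers.Embedding(vocabulary_size, 256)(posts_input)
x = layers.Conv1D(128, 5, activation='relu')(embedded_posts)
x = layers.MaxPooling1D(5)(x)
x = layers.Conv1D(256, 5, activation='relu')(x)
x = layers.GlobalMaxPooling1D()(x)
x = layers.Dense(128, activation='relu')(x)

# Three named heads: naming the output layers is what enables the
# dict forms of loss, loss_weights, and fit targets shown below
age_prediction = layers.Dense(1, name='age')(x)
income_prediction = layers.Dense(num_income_groups, activation='softmax', name='income')(x)
gender_prediction = layers.Dense(1, activation='sigmoid', name='gender')(x)

model = Model(posts_input, [age_prediction, income_prediction, gender_prediction])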
# Compilation options for a multi-output model: multiple losses
# Option 1: a list of losses, in the same order as the outputs
model.compile(optimizer='rmsprop',
              loss=['mse', 'categorical_crossentropy', 'binary_crossentropy'])

# Option 2: a dict keyed by output name (requires named output layers)
# model.compile(optimizer='rmsprop',
#               loss={'age': 'mse',
#                     'income': 'categorical_crossentropy',
#                     'gender': 'binary_crossentropy'})
# Compilation options for a multi-output model: loss weighting
# Option 1: a list of weights, in output order
model.compile(optimizer='rmsprop',
              loss=['mse', 'categorical_crossentropy', 'binary_crossentropy'],
              loss_weights=[0.25, 1., 10.])

# Option 2: a dict keyed by output name
# model.compile(optimizer='rmsprop',
#               loss={'age': 'mse',
#                     'income': 'categorical_crossentropy',
#                     'gender': 'binary_crossentropy'},
#               loss_weights={'age': 0.25,
#                             'income': 1.,
#                             'gender': 10.})
Different losses take values on very different scales, so loss_weights should be set to balance their contributions to the combined loss: here the gender binary crossentropy (typically a small value) is scaled up by 10, while the age MSE (typically much larger) is scaled down to 0.25.
# Feeding data to a multi-output model
# Option 1: a list of target arrays, in output order
model.fit(posts, [age_targets, income_targets, gender_targets],
          epochs=10, batch_size=64)

# Option 2: a dict keyed by output name
# model.fit(posts, {'age': age_targets,
#                   'income': income_targets,
#                   'gender': gender_targets},
#           epochs=10, batch_size=64)
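Note that posts, age_targets, income_targets, and gender_targets are not defined in this post; placeholder versions (random data, only to make the fit call above runnable under the sketch model's assumptions) might look like:

import numpy as np

num_samples = 1000
max_length = 100

posts = np.random.randint(1, vocabulary_size, size=(num_samples, max_length))
age_targets = 100 * np.random.random((num_samples, 1))           # scalar regression targets
income_targets = np.zeros((num_samples, num_income_groups))      # one-hot categorical targets
income_targets[np.arange(num_samples), np.random.randint(0, num_income_groups, num_samples)] = 1
gender_targets = np.random.randint(0, 2, size=(num_samples, 1))  # binary targets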