Early Diagnosis of Alzheimer’s Disease using Deep Learning and Ensemble Learning: Research Log

This post is the research log for Early Diagnosis of Alzheimer’s Disease using Deep Learning and Ensemble Learning.

Week 1

Since I have no biology background, I need to start by learning about Alzheimer's Disease (hereafter AD) itself.

I read the following articles to understand the AD diagnosis process and summarized them:

An initial finding: the hardest part of diagnosing AD is essentially that the root cause of the disease is unknown, so doctors cannot approach the diagnosis from a single fixed direction.

A detailed summary is in AD background reading.

Given this, we can analyze a reasonably large dataset for common features of the condition, to identify the symptoms and physical characteristics typical of AD patients.

This is where deep learning and ensemble learning come in.

To build an initial understanding of the project, I read A Deep Learning Model to Predict a Diagnosis of Alzheimer Disease by Using 18F-FDG PET of the Brain.

The study collected 18F-FDG PET brain images from a patient dataset spanning an average of 75.8 months, and used a deep learning model to assess the specificity and sensitivity of this radiotracer-based imaging for AD.

Given my limited grasp of the terminology and background knowledge, my reading of this paper is not yet thorough and will need further study.


Week 2

I studied the parts of the previous paper that I had not fully understood.

From reading Evaluating Categorical Models II: Sensitivity and Specificity, we learned:

  • Sensitivity

    • the metric that evaluates a model’s ability to predict true positives of each available category
      • \(=\frac{\texttt{True Positives}}{\texttt{True Positives + False Negatives}}\)
  • Specificity

    • the metric that evaluates a model’s ability to predict true negatives of each available category.
      • \(=\frac{\texttt{True Negatives}}{\texttt{True Negatives + False Positives}}\)
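
As a quick illustration of the two definitions (a minimal sketch with made-up labels; only `confusion_matrix` from scikit-learn is assumed), sensitivity and specificity can be computed directly from the binary confusion matrix:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical binary labels: 1 = AD, 0 = healthy control
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

# ravel() unpacks the 2x2 matrix in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

sensitivity = tp / (tp + fn)  # true-positive rate
specificity = tn / (tn + fp)  # true-negative rate
print(sensitivity, specificity)  # -> 0.75 0.8333333333333334
```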

在之前那篇文章中还有 F1 Score 和 ROC curve 的概念:

  • F1 Score (F-score, F-measure)

    • a measure of a test's accuracy
    • Precision: True Positives divided by all predicted positive results
      • \(=\frac{\texttt{True Positives}}{\texttt{True Positives+False Positives}}\)
    • Recall: True Positives divided by all results that should have been positive (actual positives)
      • \(=\frac{\texttt{True Positives}}{\texttt{True Positives+False Negatives}}\)
    • \(F_1\) Score is calculated from the precision and recall of the test by the equation:
      • \(F_1 = \frac{2}{\texttt{recall}^{-1}+\texttt{precision}^{-1}}=2\times \frac{\texttt{precision}\times\texttt{recall}}{\texttt{precision}+\texttt{recall}}=\frac{\text{tp}}{\text{tp}+\frac{1}{2}\text{(fp+fn)}}\)
    • \(F_\beta\) Score is a more general F-score; \(\beta\) is chosen such that recall is considered \(\beta\) times as important as precision.
      • \(F_\beta=(1+\beta^2)\times\frac{\texttt{precision}\times\texttt{recall}}{(\beta^2\times\texttt{precision})+\texttt{recall}}\)
  • ROC curve (receiver operating characteristic curve)

    • The curve plots two parameters: True Positive Rate & False Positive Rate
    • TPR
      • \(=\frac{\text{TP}}{\text{TP+FN}}\)
    • FPR
      • \(=\frac{\text{FP}}{\text{FP+TN}}\)
    • A ROC curve plots TPR vs. FPR at different classification thresholds. The closer a point lies to the top-left corner, the better the prediction.
  • AUC (Area Under the ROC Curve)

    • AUC provides an aggregate measure of performance across all possible classification thresholds. One way of interpreting AUC is as the probability that the model ranks a random positive example more highly than a random negative example.
  • Transfer Learning

    • Generally, TL means storing knowledge gained while solving one problem and applying it to a different but related problem.
      • Reusing data and transferring information from previously learned tasks to the learning of new tasks helps improve sample efficiency, for example of a reinforcement learning agent.
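
To tie the week's metrics together, here is a minimal sketch (made-up labels and scores; scikit-learn's `metrics` module) computing precision, recall, \(F_1\), and AUC on the same hypothetical predictions:

```python
from sklearn.metrics import precision_score, recall_score, f1_score, roc_auc_score

# Hypothetical ground truth, hard predictions, and predicted probabilities
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
y_score = [0.9, 0.4, 0.8, 0.2, 0.1, 0.7, 0.3, 0.6]

p = precision_score(y_true, y_pred)   # tp / (tp + fp)
r = recall_score(y_true, y_pred)      # tp / (tp + fn)
f1 = f1_score(y_true, y_pred)         # harmonic mean of precision and recall
auc = roc_auc_score(y_true, y_score)  # area under the ROC curve
print(p, r, f1, auc)  # -> 0.75 0.75 0.75 0.875
```

Here tp = 3, fp = 1, fn = 1, so precision = recall = \(F_1\) = 0.75, and 14 of the 16 positive/negative score pairs are ranked correctly, giving AUC = 0.875, which matches the ranking interpretation of AUC above.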

There is a Python library for machine learning algorithms called Scikit-Learn.

If I am to do the related programming and research later, proficiency with this package will be essential.
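
As a first taste of the package, a minimal sketch of a typical scikit-learn workflow (using the bundled breast-cancer dataset as a stand-in for real AD data, and a random forest, which is itself an ensemble method):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Load a small built-in binary-classification dataset
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Fit an ensemble of decision trees as a baseline classifier
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Report precision, recall, and F1 per class on held-out data
print(classification_report(y_test, model.predict(X_test)))
```

The fit/predict/report pattern is the same for nearly every estimator in the library, which is why it is worth learning early.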

Tutorial: Introducing-Scikit-Learn.ipynb

posted @ 2022-03-14 09:44  Reywmp