[Python] numpy fillna() for Dataframe

In the store marketing, for many reason, one stock's data can be incomplete:

We can use 'forward fill' and 'backward fill' to fill the gap:

forward fill:

backward fill:

TO do those in code, we can use numpy's 'fillna()' mathod:

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.fillna.html?highlight=fillna#pandas.DataFrame.fillna

复制代码
"""Fill missing values"""

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os

def fill_missing_values(df_data):
    
    df_data.fillna(method='ffill', inplace=True)
    return df_data.fillna(method='bfill', inplace=True)
    


def symbol_to_path(symbol, base_dir="data"):
    """Return CSV file path given ticker symbol."""
    return os.path.join(base_dir, "{}.csv".format(str(symbol)))


def get_data(symbols, dates):
    """Read stock data (adjusted close) for given symbols from CSV files."""
    df_final = pd.DataFrame(index=dates)
    if "SPY" not in symbols:  # add SPY for reference, if absent
        symbols.insert(0, "SPY")

    for symbol in symbols:
        file_path = symbol_to_path(symbol)
        df_temp = pd.read_csv(file_path, parse_dates=True, index_col="Date",
            usecols=["Date", "Adj Close"], na_values=["nan"])
        df_temp = df_temp.rename(columns={"Adj Close": symbol})
        df_final = df_final.join(df_temp)
        if symbol == "SPY":  # drop dates SPY did not trade
            df_final = df_final.dropna(subset=["SPY"])

    return df_final


def plot_data(df_data):
    """Plot stock data with appropriate axis labels."""
    ax = df_data.plot(title="Stock Data", fontsize=2)
    ax.set_xlabel("Date")
    ax.set_ylabel("Price")
    plt.show()


def test_run():
    """Function called by Test Run."""
    # Read data
    symbol_list = ["JAVA", "FAKE1", "FAKE2"]  # list of symbols
    start_date = "2005-12-31"
    end_date = "2014-12-07"
    dates = pd.date_range(start_date, end_date)  # date range as index
    df_data = get_data(symbol_list, dates)  # get data for each symbol

    # Fill missing values
    fill_missing_values(df_data)

    # Plot
    plot_data(df_data)


if __name__ == "__main__":
    test_run()
复制代码

 

posted @   Zhentiw  阅读(3216)  评论(0编辑  收藏  举报
编辑推荐:
· SQL Server 2025 AI相关能力初探
· Linux系列:如何用 C#调用 C方法造成内存泄露
· AI与.NET技术实操系列(二):开始使用ML.NET
· 记一次.NET内存居高不下排查解决与启示
· 探究高空视频全景AR技术的实现原理
阅读排行:
· 阿里最新开源QwQ-32B,效果媲美deepseek-r1满血版,部署成本又又又降低了!
· Manus重磅发布:全球首款通用AI代理技术深度解析与实战指南
· 开源Multi-agent AI智能体框架aevatar.ai,欢迎大家贡献代码
· 被坑几百块钱后,我竟然真的恢复了删除的微信聊天记录!
· AI技术革命,工作效率10个最佳AI工具
历史上的今天:
2016-12-22 [RxJS] Split an RxJS observable conditionally with windowToggle
2016-12-22 [Angular Directive] Write a Structural Directive in Angular 2
2016-12-22 [Compose] 18. Maintaining structure whilst asyncing
2016-12-22 [Angular Directive] Combine HostBinding with Services in Angular 2 Directives
2016-12-22 [Angular Directive] Build a Directive that Tracks User Events in a Service in Angular 2
2015-12-22 [Polymer] Introduction
2015-12-22 [Redux] Implementing combineReducers() from Scratch
点击右上角即可分享
微信分享提示