1. Use pandas.read_csv() to read a .csv file. It returns the contents as a DataFrame.
2. The DataFrame is the core data structure in pandas. It is not a plain matrix: its rows and columns carry labels. The column names are stored separately from the data, so the first row of a DataFrame is the first row of data, not the header row.
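Notes 1 and 2 can be sketched as follows. The CSV text and its column names are made up for illustration, and io.StringIO stands in for a real file path so the example runs without a file on disk.

```python
import io
import pandas as pd

# Hypothetical CSV content; in practice you would pass a file path instead.
csv_text = "name,calories,protein_g\napple,52,0.3\nbanana,89,1.1\n"

# read_csv returns a DataFrame; the header line becomes the column names.
food = pd.read_csv(io.StringIO(csv_text))

print(type(food))             # a pandas DataFrame
print(food.columns.tolist())  # column names, kept apart from the data
print(food.iloc[0])           # first row of data (apple), not the header
```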
3. Pandas uses the row and column labels to provide more context when returning a row or a column from a DataFrame. For example, when you select a row, pandas does not return just the values in that row as a list. It returns a Series object that contains the column labels as well as the corresponding values.
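4. In NumPy, a[99] (or a[99, :]) selects the row at index 99 of a matrix, which is the 100th row because indexing starts at 0; a[99, 0] is only the single element in column 0 of that row. In pandas, a.loc[99] selects the row whose index label is 99 (the 100th row under the default integer index), and that row is displayed as a Series.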
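A small comparison of the two ways of selecting a row, assuming a made-up 100 x 2 array with hypothetical column names x and y. Note that a[99, 0] picks one element, while a[99] picks the whole row.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 100 rows, 2 columns, values 0..199.
a = np.arange(200).reshape(100, 2)
df = pd.DataFrame(a, columns=["x", "y"])

np_row = a[99]       # row at index 99 (the 100th row), values only
np_elem = a[99, 0]   # a single element: row 99, column 0
pd_row = df.loc[99]  # row labelled 99, returned as a Series

print(np_row)
print(np_elem)
print(pd_row)        # shows the column labels next to the values
```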
5. A convenient dtypes attribute for DataFrames returns a Series with the data type of each column.
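For example, with a small made-up DataFrame (the column names are only illustrative). The exact dtype shown for text columns can differ between pandas versions.

```python
import pandas as pd

# Hypothetical data with one text, one integer and one float column.
df = pd.DataFrame({
    "name": ["apple", "banana"],
    "calories": [52, 89],
    "protein_g": [0.3, 1.1],
})

types = df.dtypes  # a Series indexed by column name
print(types)
```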
6. Selecting a subset of the columns of a DataFrame. First convert the column names to a list with .columns.tolist(). Then loop over that list and append every column name that satisfies the requirement to an initially empty list. Finally, index the DataFrame with the list of selected names, as in food_list[[list]], to get a new DataFrame that holds only those columns.
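A minimal sketch of these steps. The data, the column names, and the requirement (names ending in "(g)") are assumptions chosen for illustration; food_list is the DataFrame name used in the note.

```python
import pandas as pd

# Hypothetical data; only the name food_list comes from the note.
food_list = pd.DataFrame({
    "name": ["apple", "banana"],
    "protein_(g)": [0.3, 1.1],
    "sugar_(g)": [10.4, 12.2],
    "calories": [52, 89],
})

# Step 1: column names as a plain Python list.
col_names = food_list.columns.tolist()

# Step 2: keep the names that satisfy the (assumed) requirement.
gram_columns = []
for c in col_names:
    if c.endswith("(g)"):
        gram_columns.append(c)

# Step 3: indexing with a list of names returns a DataFrame.
gram_df = food_list[gram_columns]
print(gram_df.columns.tolist())
```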
7. To normalize a column, divide it by its maximum value, which .max() returns.
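A short sketch with made-up values. It assumes the column is non-negative and its maximum is not zero, so the result lies between 0 and 1.

```python
import pandas as pd

# Hypothetical column of non-negative values.
df = pd.DataFrame({"calories": [50.0, 100.0, 200.0]})

# Divide every value in the column by the column maximum.
normalized = df["calories"] / df["calories"].max()
print(normalized.tolist())
```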
8. Assigning a new column is similar to adding a new key-value pair to a dictionary.
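The two assignments side by side, using hypothetical names (record, df, calories_normalized) chosen only for this example.

```python
import pandas as pd

# Dictionary: assigning to a new key adds the key-value pair.
record = {"calories": 50.0}
record["protein_g"] = 0.3

# DataFrame: assigning to a new column name adds the column.
df = pd.DataFrame({"calories": [50.0, 100.0, 200.0]})
df["calories_normalized"] = df["calories"] / df["calories"].max()

print(list(record.keys()))
print(df.columns.tolist())
```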
9. Sort rows by the values in one or more columns:
DataFrame.sort_values(by, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last')
Parameters:
  by : string name or list of names which refer to the axis items
  axis : index, columns to direct sorting
  ascending : bool or list of bool
  inplace : bool
  kind : {quicksort, mergesort, heapsort}
  na_position : {'first', 'last'}
Returns:
  sorted_obj : DataFrame
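A minimal usage sketch of sort_values with made-up data that includes one missing value. Only by, ascending, na_position, and inplace are exercised here.

```python
import pandas as pd

# Hypothetical data; cherry has a missing calories value.
df = pd.DataFrame({
    "name": ["apple", "banana", "cherry", "date"],
    "calories": [52.0, 89.0, None, 282.0],
})

# Returns a new sorted DataFrame; df itself is left unchanged.
by_cal = df.sort_values("calories", ascending=False, na_position="last")
print(by_cal["name"].tolist())

# With inplace=True the call returns None and modifies the object itself.
df_copy = df.copy()
result = df_copy.sort_values("calories", inplace=True)
print(df_copy["name"].tolist())
```

Rows keep their original index labels after sorting, so by_cal.loc[0] is still the apple row even though it is no longer first.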