python - If test on DataFrame that excludes first column
I have the following DataFrame:

    raw_data_4iftest =
                                                            time         f1         f2
    2082-05-03 00:00:59.961599999  2082-05-03 00:00:59.961599999 -83.820000  29.430000
    2082-05-03 00:02:00.009600000  2082-05-03 00:02:00.009600000 -84.330002  28.940001
    2082-05-03 00:02:59.971200000  2082-05-03 00:02:59.971200000 -84.660004  27.940001
    2082-05-03 00:04:00.019200000  2082-05-03 00:04:00.019200000 -84.699997  -84.69999
    ...
    dtype='datetime64[ns]', length=1440, freq=None, tz=None)

I want to run an if-test on the DataFrame that excludes the 'time' column, i.e. something along the lines of:
    if (*cells*) in raw_data_4iftest != 'datetime64[ns]':  # if the cell is **not** a datetime64 object
        raw_data_iftest = raw_data_4iftest >= 0.05
        raw_data_iftest_num = raw_data_iftest.astype(int)

so that raw_data_iftest_num returns:
    raw_data_iftest_num =
                                                            time  f1  f2
    2082-05-03 00:00:59.961599999  2082-05-03 00:00:59.961599999   0   1
    2082-05-03 00:02:00.009600000  2082-05-03 00:02:00.009600000   0   1
    2082-05-03 00:02:59.971200000  2082-05-03 00:02:59.971200000   0   1

Currently, I'm doing the following:
    raw_data_iftest = raw_data_4iftest >= 0.05
    raw_data_iftest_num = raw_data_iftest.astype(int)

but this gives the output below, which doesn't allow me to perform the manipulations I need on raw_data_iftest_num later in the code:
    raw_data_iftest_num =
                                   time  f1  f2
    2082-05-03 00:00:59.961599999     1   0   1
    2082-05-03 00:02:00.009600000     1   0   1
    2082-05-03 00:02:59.971200000     1   0   1

I'm pretty new to programming in Python (and to using pandas), so any help/input is appreciated.
To select everything but the first column of the DataFrame, df, use
    df.iloc[:, 1:]

or explicitly select the columns you want by name:

    df[['f1', 'f2']]

or remove columns by name:

    df[[col for col in df if col not in ['time']]]

Another alternative is to use select_dtypes to select columns by data type. For example,
    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'time': np.array([1, 2, 3, 4], dtype='<M8[D]'),  # datetime64 column
                       'bbb': [10, 20, 30, 40],
                       'ccc': [100, 50, -30, -50]})

    # keep only the numeric columns
    df.select_dtypes(include=[np.number])

yields
       bbb  ccc
    0   10  100
    1   20   50
    2   30  -30
    3   40  -50

You can also select by excluding columns of dtype datetime64[ns]:
    df.select_dtypes(exclude=['datetime64[ns]'])

which yields the same result in this case.
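As a quick check, the selection approaches above can be compared side by side. This is only a minimal sketch: the three-row frame is made-up data shaped like the question's, `pd.to_datetime` is used just to build a datetime column, and the generic `'datetime64'` string is used in place of `'datetime64[ns]'` so the example does not depend on the time unit.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one datetime column followed by two numeric columns.
df = pd.DataFrame({
    'time': pd.to_datetime(['2082-05-03 00:01:00',
                            '2082-05-03 00:02:00',
                            '2082-05-03 00:03:00']),
    'f1': [-83.82, -84.33, -84.66],
    'f2': [29.43, 28.94, 27.94],
})

by_position = df.iloc[:, 1:]                                # everything after the first column
by_name = df[['f1', 'f2']]                                  # explicit column names
by_filter = df[[col for col in df if col not in ['time']]]  # remove columns by name
by_number = df.select_dtypes(include=[np.number])           # numeric dtypes only
by_not_datetime = df.select_dtypes(exclude=['datetime64'])  # anything but datetimes

# All five selections should contain the same columns and values.
selections = [by_name, by_filter, by_number, by_not_datetime]
all_same = all(s.equals(by_position) for s in selections)
print(all_same)
```

Selecting by position is the shortest, but it silently breaks if the column order changes; selecting by dtype or by name is more robust in that case.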
So instead of

    raw_data_iftest = raw_data_4iftest >= 0.05

you could use

    raw_data_iftest = (raw_data_4iftest.iloc[:, 1:] >= 0.05)

or

    raw_data_iftest = (raw_data_4iftest.select_dtypes(include=[np.number]) >= 0.05)
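Putting it together, the whole thing might look roughly like the sketch below. The data is a made-up three-row stand-in for raw_data_4iftest, and the final `pd.concat` step is an assumption about what the question wants (the 'time' column kept as-is next to the 0/1 flags); leave it out if only the flags are needed.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for raw_data_4iftest (values rounded from the question).
raw_data_4iftest = pd.DataFrame({
    'time': pd.to_datetime(['2082-05-03 00:01:00',
                            '2082-05-03 00:02:00',
                            '2082-05-03 00:03:00']),
    'f1': [-83.82, -84.33, -84.66],
    'f2': [29.43, 28.94, 27.94],
})

# Apply the threshold test to the numeric columns only.
numeric = raw_data_4iftest.select_dtypes(include=[np.number])
raw_data_iftest = numeric >= 0.05

# Convert True/False to 1/0.
raw_data_iftest_num = raw_data_iftest.astype(int)

# Assumed final step: put the untouched 'time' column back in front of the flags.
result = pd.concat([raw_data_4iftest[['time']], raw_data_iftest_num], axis=1)
print(result)
```

Because the comparison never touches the 'time' column, it keeps its datetime values instead of being turned into 1s.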