python - If test on DataFrame that excludes first column
I have the following DataFrame:
    raw_data_4iftest=
                                                           time         f1         f2
    2082-05-03 00:00:59.961599999 2082-05-03 00:00:59.961599999 -83.820000  29.430000
    2082-05-03 00:02:00.009600000 2082-05-03 00:02:00.009600000 -84.330002  28.940001
    2082-05-03 00:02:59.971200000 2082-05-03 00:02:59.971200000 -84.660004  27.940001
    2082-05-03 00:04:00.019200000 2082-05-03 00:04:00.019200000 -84.699997  -84.69999
    dtype='datetime64[ns]', length=1440, freq=None, tz=None)
I want to run an if-test on the DataFrame that excludes the 'time' column, i.e. something along the lines of:

    # pseudocode: apply the test only to cells that are **not** datetime64 objects
    if (*cells*) in raw_data_4iftest != 'datetime64[ns]':
        raw_data_iftest = raw_data_4iftest >= 0.05
        raw_data_iftest_num = raw_data_iftest.astype(int)
so that raw_data_iftest_num returns:

    raw_data_iftest_num=
                                                           time  f1  f2
    2082-05-03 00:00:59.961599999 2082-05-03 00:00:59.961599999   0   1
    2082-05-03 00:02:00.009600000 2082-05-03 00:02:00.009600000   0   1
    2082-05-03 00:02:59.971200000 2082-05-03 00:02:59.971200000   0   1
Currently, I'm doing the following:

    raw_data_iftest = raw_data_4iftest >= 0.05
    raw_data_iftest_num = raw_data_iftest.astype(int)
but this gives the output below, where the 'time' column has also been turned into a flag. That doesn't allow me to perform the manipulations I need on raw_data_iftest_num later in the code:

    raw_data_iftest_num =
                                   time  f1  f2
    2082-05-03 00:00:59.961599999     1   0   1
    2082-05-03 00:02:00.009600000     1   0   1
    2082-05-03 00:02:59.971200000     1   0   1
I'm pretty new to programming in Python (and to using pandas), so any help/input is appreciated.
To select all but the first column of a DataFrame, df, use

    df.iloc[:, 1:]
Or, explicitly select the columns you want by name:

    df[['f1', 'f2']]

or remove columns by name:

    df[[col for col in df if col not in ['time']]]
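As a quick check, the three selections above should return the same two columns on a frame shaped like the one in the question. This is only a sketch: the small `df` below is a hypothetical stand-in for the asker's data, with rounded values and made-up timestamps.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data: one datetime column, two float columns.
df = pd.DataFrame({
    "time": pd.to_datetime([
        "2082-05-03 00:01:00",
        "2082-05-03 00:02:00",
        "2082-05-03 00:03:00",
    ]),
    "f1": [-83.82, -84.33, -84.66],
    "f2": [29.43, 28.94, 27.94],
})

by_position = df.iloc[:, 1:]      # everything except the first column
by_name = df[["f1", "f2"]]        # keep the columns listed explicitly
by_exclusion = df[[col for col in df if col not in ["time"]]]  # drop columns by name

# All three selections hold the same columns and values.
print(by_position.equals(by_name) and by_name.equals(by_exclusion))
```

Selecting by position is the shortest, but it silently depends on 'time' staying the first column; selecting by name keeps working if the column order changes.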
Another alternative is to use select_dtypes to select columns by data type. For example,

    import numpy as np
    import pandas as pd

    # 'time' holds datetime64 values; 'bbb' and 'ccc' are plain integers
    df = pd.DataFrame({'time': np.array([1, 2, 3, 4], dtype='<M8[D]'),
                       'bbb': [10, 20, 30, 40],
                       'ccc': [100, 50, -30, -50]})
    df.select_dtypes(include=[np.number])
yields

       bbb  ccc
    0   10  100
    1   20   50
    2   30  -30
    3   40  -50
You can also select by excluding columns of dtype datetime64[ns]:

    df.select_dtypes(exclude=['datetime64[ns]'])

which yields the same result in this case.
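The two select_dtypes calls only agree when every non-datetime column is numeric. The sketch below illustrates the difference with a hypothetical extra text column, `label`, which is not part of the question's data; it uses the documented `'datetime64'` string to match datetime columns.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "time": pd.to_datetime(["2082-05-03", "2082-05-04"]),
    "label": ["a", "b"],      # hypothetical text column, not in the original data
    "f1": [-83.82, 0.10],
})

# Keep only numeric columns: drops both 'time' and 'label'.
numeric_only = df.select_dtypes(include=[np.number])

# Drop only datetime columns: 'label' survives alongside 'f1'.
no_datetimes = df.select_dtypes(exclude=["datetime64"])

print(list(numeric_only.columns))
print(list(no_datetimes.columns))
```

If the next step is a numeric comparison such as `>= 0.05`, including `np.number` is probably the safer choice, since it leaves out any column the comparison would not make sense for.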
So instead of

    raw_data_iftest = raw_data_4iftest >= 0.05

you could use

    raw_data_iftest = (raw_data_4iftest.iloc[:, 1:] >= 0.05)

or

    raw_data_iftest = (raw_data_4iftest.select_dtypes(include=[np.number]) >= 0.05)
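Putting this together, a minimal end-to-end sketch might look like the following. The data is a hypothetical three-row stand-in for raw_data_4iftest, and the final `pd.concat` step is an assumption about what the asker wants (the desired output in the question keeps the 'time' column next to the 0/1 flags); it is not part of the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for raw_data_4iftest (rounded values, made-up timestamps).
raw_data_4iftest = pd.DataFrame({
    "time": pd.to_datetime([
        "2082-05-03 00:01:00",
        "2082-05-03 00:02:00",
        "2082-05-03 00:03:00",
    ]),
    "f1": [-83.82, -84.33, -84.66],
    "f2": [29.43, 28.94, 27.94],
})

# Run the threshold test on the numeric columns only.
numeric = raw_data_4iftest.select_dtypes(include=[np.number])
raw_data_iftest = numeric >= 0.05
raw_data_iftest_num = raw_data_iftest.astype(int)

# Assumption: re-attach the untouched 'time' column beside the 0/1 flags.
result = pd.concat([raw_data_4iftest[["time"]], raw_data_iftest_num], axis=1)
print(result)
```

Because both pieces share the same index, `pd.concat(..., axis=1)` lines the rows up without any extra alignment work.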