python - If test on DataFrame that excludes first column


I have the following DataFrame:

raw_data_4iftest=
                                                       time         f1         f2
2082-05-03 00:00:59.961599999 2082-05-03 00:00:59.961599999 -83.820000  29.430000
2082-05-03 00:02:00.009600000 2082-05-03 00:02:00.009600000 -84.330002  28.940001
2082-05-03 00:02:59.971200000 2082-05-03 00:02:59.971200000 -84.660004  27.940001
2082-05-03 00:04:00.019200000 2082-05-03 00:04:00.019200000 -84.699997  -84.69999
dtype='datetime64[ns]', length=1440, freq=None, tz=None)

I would like to run an if-test on the DataFrame that excludes the 'time' column, i.e. something along the lines of:

if (*cells*) in raw_data_4iftest != 'datetime64[ns]':    # if cell is **not** a datetime64 object
    raw_data_iftest = raw_data_4iftest >= 0.05
    raw_data_iftest_num = raw_data_iftest.astype(int)

so that raw_data_iftest_num returns:

raw_data_iftest_num=
                                                       time  f1  f2
2082-05-03 00:00:59.961599999 2082-05-03 00:00:59.961599999   0   1
2082-05-03 00:02:00.009600000 2082-05-03 00:02:00.009600000   0   1
2082-05-03 00:02:59.971200000 2082-05-03 00:02:59.971200000   0   1

Currently, I'm doing the following:

raw_data_iftest = raw_data_4iftest >= 0.05
raw_data_iftest_num = raw_data_iftest.astype(int)

but this gives the output below, which doesn't allow me to perform the manipulations I need on raw_data_iftest_num later in the code:

raw_data_iftest_num =
                               time  f1  f2
2082-05-03 00:00:59.961599999     1   0   1
2082-05-03 00:02:00.009600000     1   0   1
2082-05-03 00:02:59.971200000     1   0   1

I'm pretty new to programming in Python (and to using pandas), so any help/input is appreciated.

To select all but the first column of the DataFrame, df, you could use

df.iloc[:, 1:] 

Or, to explicitly select the columns you want by name:

df[['f1', 'f2']] 

Or, to remove columns by name:

df[[col for col in df if col not in ['time']]]
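
As a quick check, here is a minimal sketch showing that the three selections above return the same columns. The three-row frame is made up for illustration (rounded values loosely based on the question), and the names by_position, by_name and by_exclusion are mine, not from the original code.

```python
import pandas as pd

# Hypothetical stand-in for the frame in the question.
df = pd.DataFrame({
    "time": pd.to_datetime(["2082-05-03 00:01:00",
                            "2082-05-03 00:02:00",
                            "2082-05-03 00:03:00"]),
    "f1": [-83.82, -84.33, -84.66],
    "f2": [29.43, 28.94, 27.94],
})

# Everything except the first column, by position.
by_position = df.iloc[:, 1:]

# The wanted columns, by name.
by_name = df[["f1", "f2"]]

# Every column whose name is not in the exclusion list.
by_exclusion = df[[col for col in df if col not in ["time"]]]

print(by_position.equals(by_name) and by_name.equals(by_exclusion))
```

The positional form only works if 'time' really is the first column, so selecting by name is probably the safer choice when the column order might change.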

Another alternative is to use select_dtypes to select columns by data type. For example,

import numpy as np
import pandas as pd
# 'time' holds datetime64 values; 'bbb' and 'ccc' are plain integers
df = pd.DataFrame({'time': np.array([1, 2, 3, 4], dtype='<M8[D]'),
                   'bbb': [10, 20, 30, 40],
                   'ccc': [100, 50, -30, -50]})
df.select_dtypes(include=[np.number])

yields

   bbb  ccc
0   10  100
1   20   50
2   30  -30
3   40  -50

You can also select by excluding columns of dtype datetime64[ns]:

df.select_dtypes(exclude=['datetime64[ns]']) 

which yields the same result in this case.
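
The two calls only agree because every non-datetime column here is numeric. The sketch below uses a small made-up frame with an extra string column (the 'label' column and the variable names are assumptions for illustration) to show where they would differ.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one datetime column, one numeric, one string.
df = pd.DataFrame({
    "time": pd.to_datetime(["2000-01-01", "2000-01-02"]),
    "bbb": [10, 20],
    "label": ["x", "y"],
})

# Keeps only numeric columns, so 'label' is dropped along with 'time'.
numeric_only = df.select_dtypes(include=[np.number])

# Drops only the datetime column, so 'label' survives.
no_datetimes = df.select_dtypes(exclude=["datetime64[ns]"])

print(list(numeric_only.columns))
print(list(no_datetimes.columns))
```

If the next step is a numeric comparison such as >= 0.05, the include=[np.number] form is likely the one you want, since it leaves out any column that cannot be compared with a number.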


So instead of

raw_data_iftest = raw_data_4iftest >= 0.05 

you could use

raw_data_iftest = (raw_data_4iftest.iloc[:, 1:] >= 0.05) 

or

raw_data_iftest = (raw_data_4iftest.select_dtypes(include=[np.number]) >= 0.05) 
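
Putting it together, here is a rough sketch of one way to get the 0/1 frame from the question with the 'time' column left untouched. The sample values are a shortened, rounded version of the data in the question, and re-attaching 'time' with join is my assumption about what is wanted, not something the original code does.

```python
import numpy as np
import pandas as pd

# Hypothetical three-row version of raw_data_4iftest.
raw_data_4iftest = pd.DataFrame({
    "time": pd.to_datetime(["2082-05-03 00:01:00",
                            "2082-05-03 00:02:00",
                            "2082-05-03 00:03:00"]),
    "f1": [-83.82, -84.33, -84.66],
    "f2": [29.43, 28.94, 27.94],
})

# Compare only the numeric columns against the threshold.
raw_data_iftest = raw_data_4iftest.select_dtypes(include=[np.number]) >= 0.05

# Turn True/False into 1/0.
raw_data_iftest_num = raw_data_iftest.astype(int)

# Put the untouched 'time' column back in front (joined on the shared index).
result = raw_data_4iftest[["time"]].join(raw_data_iftest_num)
print(result)
```

Here f1 comes out as all 0 and f2 as all 1, which matches the desired output shown in the question.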
