python - Slicing a pandas dataframe to the first instance of all columns containing a value
I have a simple pandas time series dataframe similar to this:
    In [69]: df
    Out[69]:
                  a    b
    date
    2015-01-01  NaN  NaN
    2015-02-01  1.1  NaN
    2015-03-01  NaN  NaN
    2015-04-01  1.2  NaN
    2015-05-01  1.5  1.2
    2015-06-01  1.6  1.9
    2015-07-01  1.3  NaN
    2015-08-01  1.2  3.0
    2015-09-01  1.1  1.1
What is the best way to obtain the data frame from the first point where there is a value in all columns onwards, i.e. to get the output below programmatically?
    In [71]: df.iloc[4:]
    Out[71]:
                  a    b
    date
    2015-05-01  1.5  1.2
    2015-06-01  1.6  1.9
    2015-07-01  1.3  NaN
    2015-08-01  1.2  3.0
    2015-09-01  1.1  1.1
You can use .first_valid_index() to get the first non-NaN index of each column, then slice from the latest of those indices.
    # data
    # ============================
    df

                  a    b
    date
    2015-01-01  NaN  NaN
    2015-02-01  1.1  NaN
    2015-03-01  NaN  NaN
    2015-04-01  1.2  NaN
    2015-05-01  1.5  1.2
    2015-06-01  1.6  1.9
    2015-07-01  1.3  NaN
    2015-08-01  1.2  3.0
    2015-09-01  1.1  1.1

    # processing
    # ================================
    # get the first valid index of each column,
    # then calculate the max
    first_valid_loc = df.apply(lambda col: col.first_valid_index()).max()
    df.loc[first_valid_loc:]

                  a    b
    date
    2015-05-01  1.5  1.2
    2015-06-01  1.6  1.9
    2015-07-01  1.3  NaN
    2015-08-01  1.2  3.0
    2015-09-01  1.1  1.1
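For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuild the example frame by hand, so the column name "a" and the monthly date index are assumptions based on the printed output, not something taken from the asker's real data. It also shows a possible alternative that is not part of the answer above: picking the first row in which every column is non-NaN at the same time. The two approaches agree on this data, but they can differ if a column that started early happens to be NaN on the row where the last column starts.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame (column name "a" and the index are assumed).
dates = pd.date_range("2015-01-01", periods=9, freq="MS", name="date")
df = pd.DataFrame(
    {
        "a": [np.nan, 1.1, np.nan, 1.2, 1.5, 1.6, 1.3, 1.2, 1.1],
        "b": [np.nan, np.nan, np.nan, np.nan, 1.2, 1.9, np.nan, 3.0, 1.1],
    },
    index=dates,
)

# Approach from the answer: the latest of the per-column first valid labels.
first_valid_loc = df.apply(lambda col: col.first_valid_index()).max()
result = df.loc[first_valid_loc:]

# Alternative (an assumption about what "a value in all columns" means):
# the first row where every column is non-NaN simultaneously.
complete_rows = df.notna().all(axis=1)
first_complete = complete_rows.idxmax()  # label of the first True
alt_result = df.loc[first_complete:]

print(result)
```

Two caveats worth checking before relying on either version: first_valid_index() returns None for a column that is entirely NaN, and idxmax() returns the first label when no row is complete, so both edge cases would need separate handling.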