python - Slicing a pandas dataframe to the first instance of all columns containing a value


I have a simple pandas time series dataframe similar to this:

In [69]: df
Out[69]:
              a    b
date
2015-01-01  NaN  NaN
2015-02-01  1.1  NaN
2015-03-01  NaN  NaN
2015-04-01  1.2  NaN
2015-05-01  1.5  1.2
2015-06-01  1.6  1.9
2015-07-01  1.3  NaN
2015-08-01  1.2  3.0
2015-09-01  1.1  1.1

What is the best way to obtain the data frame from the first point where there is a value in all columns onwards, i.e. to produce this output programmatically?

In [71]: df.ix[4:]
Out[71]:
              a    b
date
2015-05-01  1.5  1.2
2015-06-01  1.6  1.9
2015-07-01  1.3  NaN
2015-08-01  1.2  3.0
2015-09-01  1.1  1.1

You can use .first_valid_index() to get the first non-NaN index of each column, then take the latest of those and slice from it.

# data
# ============================
df

              a    b
date
2015-01-01  NaN  NaN
2015-02-01  1.1  NaN
2015-03-01  NaN  NaN
2015-04-01  1.2  NaN
2015-05-01  1.5  1.2
2015-06-01  1.6  1.9
2015-07-01  1.3  NaN
2015-08-01  1.2  3.0
2015-09-01  1.1  1.1

# processing
# ================================
# first valid index for each column,
# then calculate the max
first_valid_loc = df.apply(lambda col: col.first_valid_index()).max()

df.loc[first_valid_loc:]

              a    b
date
2015-05-01  1.5  1.2
2015-06-01  1.6  1.9
2015-07-01  1.3  NaN
2015-08-01  1.2  3.0
2015-09-01  1.1  1.1
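For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The column names "a" and "b" and the way the frame is rebuilt with pd.date_range are assumptions made for illustration, not taken from the original session. It also assumes every column has at least one non-NaN value. The last line shows a related but different idea: finding the first row in which all columns are non-NaN at the same time, which happens to give the same date for this data.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame; the column names "a" and "b" are assumed.
idx = pd.date_range("2015-01-01", periods=9, freq="MS", name="date")
df = pd.DataFrame(
    {
        "a": [np.nan, 1.1, np.nan, 1.2, 1.5, 1.6, 1.3, 1.2, 1.1],
        "b": [np.nan, np.nan, np.nan, np.nan, 1.2, 1.9, np.nan, 3.0, 1.1],
    },
    index=idx,
)

# First non-NaN index label of each column, then the latest of those labels.
first_valid_loc = df.apply(lambda col: col.first_valid_index()).max()

# Label-based slice from that point onwards (inclusive).
result = df.loc[first_valid_loc:]

# Alternative: first row in which every column holds a value at once.
# Note: if no such row exists, idxmax() returns the first label instead.
first_complete_row = df.notna().all(axis=1).idxmax()
```

The two cut-off points can differ: the max of the first valid indices only requires that each column has started, while the alternative requires a single row with no NaN in it.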
