In [2]:
import pandas as pd

Working with time series data

Creating, indexing, and slicing

In [3]:
dates=['2016-01-01','2016-01-02','2016-01-03']
ts=pd.Series([1,2,3],index=pd.to_datetime(dates))
In [4]:
ts
Out[4]:
2016-01-01    1
2016-01-02    2
2016-01-03    3
dtype: int64
In [6]:
print(ts['2016'])
print(ts['2016-01'])
2016-01-01    1
2016-01-02    2
2016-01-03    3
dtype: int64
2016-01-01    1
2016-01-02    2
2016-01-03    3
dtype: int64
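Partial-string indexing also works with ranges. The following is a minimal sketch that rebuilds the same `ts` and uses `.loc`; the names `window` and `one_day` are introduced here purely for illustration.

```python
import pandas as pd

dates = ['2016-01-01', '2016-01-02', '2016-01-03']
ts = pd.Series([1, 2, 3], index=pd.to_datetime(dates))

# Label-based slicing on a DatetimeIndex includes both endpoints.
window = ts.loc['2016-01-02':'2016-01-03']

# A string at the same resolution as the index (one day) selects a single value.
one_day = ts.loc['2016-01-02']

print(window)
print(one_day)
```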
In [7]:
ts.truncate(after='2016-01-02')
# after= drops every row after the given date; before= drops every row before it
Out[7]:
2016-01-01    1
2016-01-02    2
dtype: int64
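For comparison, here is a small sketch of the `before=` argument, alone and combined with `after=`. It assumes the same three-day series; `tail` and `middle` are illustrative names.

```python
import pandas as pd

dates = ['2016-01-01', '2016-01-02', '2016-01-03']
ts = pd.Series([1, 2, 3], index=pd.to_datetime(dates))

# Drop everything before 2016-01-02 (the boundary date itself is kept).
tail = ts.truncate(before='2016-01-02')

# Keep only the rows between the two boundaries, inclusive.
middle = ts.truncate(before='2016-01-02', after='2016-01-02')

print(tail)
print(middle)
```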
In [9]:
# Lag and lead operations: shift(1) lags the values by one period, shift(-1) leads them
print(ts.shift(1))
print(ts.shift(-1))
2016-01-01    NaN
2016-01-02    1.0
2016-01-03    2.0
dtype: float64
2016-01-01    2.0
2016-01-02    3.0
2016-01-03    NaN
dtype: float64
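A common use of `shift` is computing period-over-period changes. The sketch below assumes the same `ts`; `change` and `pct` are illustrative names, and the built-in `diff()` and `pct_change()` should give the same results.

```python
import pandas as pd

dates = ['2016-01-01', '2016-01-02', '2016-01-03']
ts = pd.Series([1, 2, 3], index=pd.to_datetime(dates))

# Absolute change versus the previous day; the first entry is NaN
# because there is no earlier observation to compare against.
change = ts - ts.shift(1)

# Relative change versus the previous day.
pct = ts / ts.shift(1) - 1

print(change)
print(pct)
```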

Converting between high- and low-frequency data

In [10]:
ts.index.freq
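The cell above prints nothing because an index built with `pd.to_datetime` from a list of strings carries no frequency, so `freq` is `None`. As a rough sketch, the frequency can be inferred or attached explicitly; `daily` is an illustrative name.

```python
import pandas as pd

dates = ['2016-01-01', '2016-01-02', '2016-01-03']
ts = pd.Series([1, 2, 3], index=pd.to_datetime(dates))

# No frequency is attached to an index parsed from plain strings.
print(ts.index.freq)

# Infer the frequency from the spacing of the timestamps.
print(pd.infer_freq(ts.index))

# Conform the series to a daily frequency; the result's index carries freq.
daily = ts.asfreq('D')
print(daily.index.freq)
```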
In [15]:
# 'M' bins by month end (newer pandas versions spell this alias 'ME');
# the how= argument is deprecated, so call the aggregation as a method instead
rts = ts.resample('M').first()
rts
Out[15]:
2016-01-31    1
Freq: M, dtype: int64
In [16]:
# 'MS' bins by month start; the how= argument is deprecated, so call .first() directly
rts = ts.resample('MS').first()
rts
Out[16]:
2016-01-01    1
Freq: MS, dtype: int64
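The cells above only convert high-frequency data to a lower frequency. Below is a hedged sketch of both directions, assuming the same `ts`: a monthly sum for downsampling, and a 12-hour forward fill for upsampling. The offset object `pd.offsets.Hour(12)` is used in place of a string alias, and `monthly_sum` and `up` are illustrative names.

```python
import pandas as pd

dates = ['2016-01-01', '2016-01-02', '2016-01-03']
ts = pd.Series([1, 2, 3], index=pd.to_datetime(dates))

# Downsample: aggregate all daily values within each month.
monthly_sum = ts.resample('MS').sum()

# Upsample: create 12-hour slots and carry the last known value forward.
up = ts.resample(pd.offsets.Hour(12)).ffill()

print(monthly_sum)
print(up)
```

Forward fill is only one choice for the new slots; `bfill()` or `interpolate()` may suit other data better.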