@@ -1157,14 +1157,16 @@ Converting to Python datetimes
11571157
11581158.. _timeseries.resampling:
11591159
1160- Up- and downsampling
1161- --------------------
1160+ Resampling
1161+ ----------
11621162
1163- With 0.8, pandas introduces simple, powerful, and efficient functionality for
1163+ pandas has simple, powerful, and efficient functionality for
11641164performing resampling operations during frequency conversion (e.g., converting
11651165secondly data into 5-minutely data). This is extremely common in, but not
11661166limited to, financial applications.
11671167
1168+ ``resample`` is a time-based groupby, followed by a reduction method on each of its groups.
1169+
11681170See some :ref:`cookbook examples <cookbook.resample>` for some advanced strategies
11691171
11701172.. ipython:: python
@@ -1203,19 +1205,6 @@ end of the interval is closed:
12031205
12041206    ts.resample('5Min', closed='left')
12051207
1206- For upsampling, the ``fill_method `` and ``limit `` parameters can be specified
1207- to interpolate over the gaps that are created:
1208-
1209- .. ipython :: python
1210-
1211- # from secondly to every 250 milliseconds
1212-
1213- ts[:2 ].resample(' 250L' )
1214-
1215- ts[:2 ].resample(' 250L' , fill_method = ' pad' )
1216-
1217- ts[:2 ].resample(' 250L' , fill_method = ' pad' , limit = 2 )
1218-
12191208 Parameters like ``label`` and ``loffset`` are used to manipulate the resulting
12201209labels. ``label`` specifies whether the result is labeled with the beginning or
12211210the end of the interval. ``loffset`` performs a time adjustment on the output
@@ -1240,34 +1229,58 @@ retains the input representation.
12401229(detail below). It specifies how low frequency periods are converted to higher
12411230frequency periods.
12421231
1243- Note that 0.8 marks a watershed in the timeseries functionality in pandas. In
1244- previous versions, resampling had to be done using a combination of
1245- ``date_range ``, ``groupby `` with ``asof ``, and then calling an aggregation
1246- function on the grouped object. This was not nearly as convenient or performant
1247- as the new pandas timeseries API.
12481232
1249- Sparse timeseries
1233+ Upsampling
1234+ ~~~~~~~~~~
1235+
1236+ For upsampling, the ``fill_method`` and ``limit`` parameters can be specified
1237+ to interpolate over the gaps that are created:
1238+
1239+ .. ipython:: python
1240+
1241+    # from secondly to every 250 milliseconds
1242+
1243+    ts[:2].resample('250L')
1244+
1245+    ts[:2].resample('250L', fill_method='pad')
1246+
1247+    ts[:2].resample('250L', fill_method='pad', limit=2)
1248+
1249+ Sparse Resampling
12501250~~~~~~~~~~~~~~~~~
12511251
1252- If your timeseries are sparse, be aware that upsampling will generate a lot of
1253- intermediate points filled with whatever passed as ``fill_method ``. What
1254- ``resample `` does is basically a group by and then applying an aggregation
1255- method on each of its groups, which can also be achieve with something like the
1256- following.
1252+ Sparse timeseries have few points relative to the span of time you are
1253+ resampling over. Naively upsampling a sparse series can generate a large number of
1254+ intermediate values. If you do not specify a fill method (i.e. ``fill_method`` is ``None``),
1255+ these intermediate values are filled with ``NaN``.
1256+
1257+ Since ``resample`` is a time-based groupby, the following is a method to efficiently
1258+ resample only the groups that are not all ``NaN``:
12571259
12581260.. ipython:: python
12591261
1260- def round (t , freq ):
1261- # round a Timestamp to a specified freq
1262- return Timestamp((t.value // freq.delta.value) * freq.delta.value)
1262+    rng = date_range('2014-1-1', periods=100, freq='D') + Timedelta('1s')
1263+    ts = Series(range(100), index=rng)
12631264
1264- from functools import partial
1265+ To resample over the full range of the series:
12651266
1266- rng = date_range(' 1/1/2012' , periods = 100 , freq = ' S' )
1267+ .. ipython:: python
1268+
1269+    ts.resample('3T', how='sum')
1270+
1271+ Instead, we can resample only those groups that contain points:
1272+
1273+ .. ipython:: python
12671274
1268- ts = Series(randint(0 , 500 , len (rng)), index = rng)
1275+    from functools import partial
1276+    from pandas.tseries.frequencies import to_offset
1277+
1278+    def round(t, freq):
1279+        # round a Timestamp down to a multiple of the specified freq
1280+        freq = to_offset(freq)
1281+        return Timestamp((t.value // freq.delta.value) * freq.delta.value)
12691282
1270- ts.groupby(partial(round , freq = offsets.Minute( 3 ) )).sum()
1283+    ts.groupby(partial(round, freq='3T')).sum()
12711284
12721285 .. _timeseries.periods:
12731286
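For readers who want to see why the groupby approach in the last hunk is cheaper, here is a minimal standard-library sketch of the same idea: floor each timestamp to its 3-minute bin and sum per bin, so bins exist only where data exists. It does not use pandas; the names `floor_to`, `sparse_resample_sum`, and `full_bins` are hypothetical and chosen for illustration, and the input mirrors the 100 daily points offset by one second from the diff.

```python
from collections import defaultdict
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def floor_to(t, width):
    """Round a naive datetime down to a multiple of `width` since the epoch."""
    return t - ((t - EPOCH) % width)


def sparse_resample_sum(points, width):
    """Sum values per time bin, creating bins only where data exists."""
    bins = defaultdict(int)
    for t, value in points:
        bins[floor_to(t, width)] += value
    return dict(sorted(bins.items()))


# 100 daily points, each offset by one second, mirroring the series in the diff
start = datetime(2014, 1, 1, 0, 0, 1)
points = [(start + timedelta(days=i), i) for i in range(100)]
width = timedelta(minutes=3)

result = sparse_resample_sum(points, width)

# Number of bins a dense 3-minute grid over the same range would contain
first, last = min(result), max(result)
full_bins = (last - first) // width + 1

print(len(result), "populated bins versus", full_bins, "bins on the full grid")
```

The grouped result holds one entry per populated bin, whereas a dense grid over the same range would carry every 3-minute slot between the first and last point, nearly all of them empty.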