Friday October 29 7:00 PM – Friday October 29 7:30 PM in Talks II

Why Datetimes Need Units: Avoiding a Y2262 Problem & Harnessing the Power of NumPy's datetime64

Christopher Ariza

Prior knowledge:
No previous knowledge expected

Summary

This talk will introduce the NumPy datetime64 datatype, describing its features and performance in comparison to Python's date and datetime objects. Practical examples of working with, and converting between, these types will be provided. The usage of datetime64 with time series data in Pandas and StaticFrame will be compared, illustrating the value of using units with datetime64.

Description

NumPy supports a datetime array datatype called datetime64. Unlike Python's standard library types (datetime and date), datetime64 supports an extensive range of time units, from year to attosecond. Specifying a unit makes the resolution unambiguous, permits narrower typing of time information, and takes full advantage of the time range that fits within the underlying representation (a 64-bit signed integer).
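As a minimal sketch of what units mean in practice (the variable names are illustrative, and the span figures are rough estimates that assume a 365.25-day year), the unit is carried in the dtype, and the stored value is a 64-bit count of that unit since the 1970 epoch:

```python
import numpy as np

# The unit is part of the dtype, so the resolution of each value is explicit.
year = np.datetime64("2021", "Y")
day = np.datetime64("2021-10-29", "D")
second = np.datetime64("2021-10-29T19:00:00", "s")

for value in (year, day, second):
    print(value, value.dtype)

# Each value is stored as a signed 64-bit count of units since the 1970 epoch.
year_count = int(year.astype(np.int64))  # years since 1970
day_count = int(day.astype(np.int64))    # days since 1970-01-01
print(year_count, day_count)

# A coarser unit spends the same 64 bits on a wider span of time.
# Approximate span in years on either side of the epoch (assumes 365.25-day years):
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60
span_years_ns = 2**63 / 1e9 / SECONDS_PER_YEAR  # nanosecond unit
span_years_s = 2**63 / SECONDS_PER_YEAR         # second unit
print(round(span_years_ns), round(span_years_s))
```

With a nanosecond unit the span is only about 292 years either side of 1970, while a second unit covers hundreds of billions of years.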

This talk will introduce datetime64 arrays and describe their features and performance in comparison to Python's date and datetime types. Practical examples of working with, and converting between, these types will be provided. As date and time information is particularly useful for labeled time-series data, the usage of datetime64 in Pandas and StaticFrame indices will be examined. Pandas's exclusive and coercive use of a single unit (nanosecond) will be shown to lead to a "Y2262" problem, because a 64-bit count of nanoseconds since 1970 cannot represent dates past April 11, 2262, and to have other disadvantages compared to StaticFrame's full support for datetime64 units.
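The conversions and the nanosecond limit can be sketched with NumPy alone. This is an illustrative example rather than material from the talk: the variable names are hypothetical, and Pandas and StaticFrame are deliberately left out so that only well-established NumPy behavior is shown.

```python
import datetime
import numpy as np

d = datetime.date(2021, 10, 29)
dt = datetime.datetime(2021, 10, 29, 19, 0)

# Python objects to datetime64: NumPy selects day and microsecond units.
d64 = np.datetime64(d)
dt64 = np.datetime64(dt)
print(d64.dtype, dt64.dtype)

# datetime64 back to Python objects.
d_back = d64.item()    # datetime.date
dt_back = dt64.item()  # datetime.datetime

# Changing the unit is an explicit cast; a coarser unit truncates.
dt64_day = dt64.astype("datetime64[D]")

# The largest value a nanosecond unit can hold falls in the year 2262.
ns_max = np.datetime64(np.iinfo(np.int64).max, "ns")
print(ns_max)  # 2262-04-11T23:47:16.854775807

# A second unit represents dates far beyond that limit.
far = np.datetime64("3000-01-01T00:00:00", "s")

# Casting to a year unit and adding the epoch year recovers the calendar year.
far_year = int(far.astype("datetime64[Y]").astype(np.int64)) + 1970
ns_max_year = int(ns_max.astype("datetime64[Y]").astype(np.int64)) + 1970
print(ns_max_year, far_year)
```

Forcing every value to nanoseconds therefore rules out any date later than 2262, even when the data only needs day or second resolution.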

The audience for this talk is anyone working with NumPy datetime64 or Pandas DatetimeIndex or Timestamp types, or those wanting to better understand the limitations of Python's date and datetime objects, particularly when used in NumPy arrays. Basic familiarity with these types is helpful but not required. This will be an informative presentation with concise code examples and practical tips for working with these types. Audience members will come away with a firm understanding of the limits and opportunities of these types, relevant for anyone working with time series data.