Thread: [Matplotlib-users] record array and date support


I just added support for native plotting of python date and datetime
objects (you can still use plot_date with date2num conversions, but
you no longer have to).  We will continue to convert to floats under
the hood, but the conversion can be handled automagically.  I also
added support for loading CSV files (or general space/tab/comma
delimited files) into numpy record arrays, and the type conversions
(int, float, date, etc...) happen automagically.  The function assumes
there is a header row, and the header strings will be munged to give
valid python attribute names.  It inspects the first checkrows lines
after the header to try to infer the datatype and set the appropriate
conversion function.  It's not entirely bulletproof, but it should
cover a lot of common use cases.
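
To make the header munging and type inference described above a bit
more concrete, here is a rough stdlib-only sketch of the idea.  It is
not the csv2rec implementation: munge_name, parse_date,
infer_converters and load_rows are hypothetical names, the date format
is assumed from the sample data, and the result is a list of tuples
rather than a numpy record array.

```python
import csv
import io
import re
from datetime import datetime


def munge_name(header):
    """Turn a header string into a lowercase, attribute-safe name."""
    name = re.sub(r"[^0-9a-z_]+", "_", header.strip().lower())
    name = name.strip("_")
    if not name or name[0].isdigit():
        name = "_" + name  # attribute names cannot start with a digit
    return name


def parse_date(text):
    # Assumed format: matches '19-Sep-03' in the sample below.
    return datetime.strptime(text, "%d-%b-%y")


# Tried in order, strictest first; str accepts anything, so it is the fallback.
CANDIDATES = (int, float, parse_date, str)


def infer_converters(rows, ncols):
    """Pick, per column, the first candidate that parses every checked row."""
    converters = []
    for col in range(ncols):
        for func in CANDIDATES:
            try:
                for row in rows:
                    func(row[col])
            except ValueError:
                continue  # this candidate failed; try the next one
            converters.append(func)
            break
    return converters


def load_rows(text, checkrows=5):
    """Return (names, records), inferring types from the first checkrows rows."""
    reader = csv.reader(io.StringIO(text))
    names = [munge_name(h) for h in next(reader)]
    rows = list(reader)
    converters = infer_converters(rows[:checkrows], len(names))
    records = [tuple(f(v) for f, v in zip(converters, row)) for row in rows]
    return names, records


SAMPLE = """Date,Open,High,Low,Close,Volume,Adj. Close*
19-Sep-03,29.76,29.97,29.52,29.96,92433800,29.79
18-Sep-03,28.49,29.51,28.42,29.50,67268096,29.34
"""

names, records = load_rows(SAMPLE)
```

Because only the first checkrows rows are inspected, a later row that
does not fit the inferred type will raise in this sketch, which is the
kind of "not entirely bulletproof" case mentioned above.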

Here is an example (svn only)

  from matplotlib.mlab import csv2rec
  from pylab import figure, show

  a = csv2rec('data/msft.csv')
  fig = figure()
  ax = fig.add_subplot(111)
  ax.plot(a.date, a.adj_close, '-')
  fig.autofmt_xdate()
  show()

The autofmt_xdate call is optional; it is a new function that does a
few things you usually want in date plots: it turns off tick labels in
the upper subplots (if any), rotates and right-aligns the tick labels
on the lowest axes, and increases the bottom margin (the
subplots_adjust bottom parameter) to make room for the rotated tick
labels.

Here is what the dtype looks like from the example above.

  In [3]: !head -3 data/msft.csv
  Date,Open,High,Low,Close,Volume,Adj. Close*
  19-Sep-03,29.76,29.97,29.52,29.96,92433800,29.79
  18-Sep-03,28.49,29.51,28.42,29.50,67268096,29.34

  In [4]: a = csv2rec('data/msft.csv')

  In [5]: a.dtype
  Out[5]: dtype([('date', '|O4'), ('open', '<f8'), ('high', '<f8'),
                 ('low', '<f8'), ('close', '<f8'), ('volume', '<i4'),
                 ('adj_close', '<f8')])

  In [6]: a.date[:2]
  Out[6]: array([2003-09-19 00:00:00, 2003-09-18 00:00:00], dtype=object)
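
The date column holds datetime objects, which still have to be mapped
to floats before they can be placed on an axis.  The following is only
a sketch of that kind of conversion: the post does not state the epoch
or units matplotlib uses, so counting days from the proleptic
Gregorian ordinal is an assumption, and to_float_days and
from_float_days are hypothetical helper names.

```python
from datetime import datetime, timedelta

SECONDS_PER_DAY = 86400.0


def to_float_days(dt):
    """Map a datetime to a float: whole days plus the fraction of the day."""
    midnight = dt.replace(hour=0, minute=0, second=0, microsecond=0)
    fraction = (dt - midnight).total_seconds() / SECONDS_PER_DAY
    return dt.toordinal() + fraction


def from_float_days(x):
    """Inverse of to_float_days (subject to float rounding)."""
    whole = int(x)
    return datetime.fromordinal(whole) + timedelta(days=x - whole)


dates = [datetime(2003, 9, 19), datetime(2003, 9, 18, 12)]
floats = [to_float_days(d) for d in dates]
```

With a mapping like this, tick locations and axis limits are plain
float arithmetic, and the inverse is only needed when formatting tick
labels.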

I'll probably add a few performance features to the csv2rec function,
mainly to let you skip columns and supply conversion functions where
desired, because the autodate parser is pretty slow if you want to
parse date strings, but this is enough to make it useful.  Another
useful feature would be support for customizable, type-dependent
NULL value conversion (e.g. convert to numpy.nan for floats,
'0000-00-00' for dates, etc...).
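
As a sketch of what type-dependent NULL conversion could look like,
the snippet below wraps a conversion function so that empty or
NULL-like strings map to a per-type fill value.  None of this is an
existing csv2rec option: with_missing, DEFAULT_MISSING and the list of
null strings are hypothetical, and math.nan stands in for numpy.nan to
keep the example stdlib-only.

```python
import math
from datetime import date, datetime


def parse_date(text):
    # Assumed ISO-style format for this illustration.
    return datetime.strptime(text, "%Y-%m-%d").date()


# Hypothetical per-converter fill values, following the examples in the post.
DEFAULT_MISSING = {float: math.nan, parse_date: "0000-00-00"}


def with_missing(func, null_strings=("", "NA", "NULL"), missing=None):
    """Wrap func so that NULL-like strings return the fill value for func."""
    table = DEFAULT_MISSING if missing is None else missing
    fill = table.get(func)  # None if no fill value is registered

    def convert(text):
        if text.strip() in null_strings:
            return fill
        return func(text)

    return convert


to_float = with_missing(float)
to_date = with_missing(parse_date)

values = [to_float("29.76"), to_float(""), to_date("2003-09-19"), to_date("NULL")]
```

Passing a different missing table is how a caller would customize the
fill values per type in this sketch.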

Record arrays are your friend; have fun!
JDH
