|
From: antonv <vas...@ya...> - 2009-01-03 17:21:38
|
Hi all,
I have a lot of csv files to process, all of them with the same number of columns. The only issue is that each file has a unique column name for the fourth column.
All the csv2rec examples I found are using the r.column_name format to access the data in that column which is of no use for me because of the unique names. Is there a way to access that data using the column number? I bet this should be something simple but I cannot figure it out...
Thanks in advance,
Anton
--
View this message in context: http://www.nabble.com/csv2rec-column-names-tp21267055p21267055.html
Sent from the matplotlib - users mailing list archive at Nabble.com.
|
|
From: Patrick M. <pat...@gm...> - 2009-01-03 17:28:39
|
I'm not sure what you need it for, but I would suggest looking into numpy's loadtxt function. You can use it to load the csv data into numpy arrays and pass the resulting arrays around.
-Patrick
|
|
From: antonv <vas...@ya...> - 2009-01-03 17:39:19
|
I am plotting the data in those csv files. The first 4 columns in the
files have the same title, but the 5th is named after the date and
time, so it is unique in each of the files. As I have about 600 files
to batch process, adjusting my script manually is not an option.
The way I have it for one test file is:
r = mlab.csv2rec('test.csv')
# I know that the column name for the 5th column is 'htsgw_12191800',
# so to read the data in the 5th column I just use:
z = r.htsgw_12191800
What I need is to be able to get that data by specifying the column number,
as that stays the same in all files.
I'll look at numpy, but I hope there is a simpler way.
Thanks,
Anton
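For the batch case described above, one possible sketch that avoids recarrays entirely uses only the standard-library csv module and picks the 5th column by position. The file names, the c1..c4 headers, and the helper name read_fifth_column are made up for illustration; the demo writes two small files to a temporary directory so that it can run on its own.

```python
import csv
import glob
import os
import tempfile


def read_fifth_column(path):
    """Return (header_name, values) for the 5th column (index 4) of a CSV file."""
    with open(path, newline="") as fh:
        reader = csv.reader(fh)
        header = next(reader)  # first row holds the column names
        values = [float(row[4]) for row in reader if row]
    return header[4], values


# Demo data: two hypothetical files whose 5th column name differs per file.
tmpdir = tempfile.mkdtemp()
for stamp, val in (("12191800", 1.5), ("12192100", 2.5)):
    with open(os.path.join(tmpdir, f"run_{stamp}.csv"), "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["c1", "c2", "c3", "c4", f"htsgw_{stamp}"])
        writer.writerow([0, 0, 0, 0, val])
        writer.writerow([1, 1, 1, 1, val + 1])

# Batch loop: the same code works for every file, whatever the 5th column is called.
results = {}
for path in sorted(glob.glob(os.path.join(tmpdir, "*.csv"))):
    name, z = read_fifth_column(path)
    results[name] = z
```

The header name is returned alongside the values, so it is still available for labelling a plot even though it is never used to look the column up.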
|
|
From: Patrick M. <pat...@gm...> - 2009-01-03 17:59:57
|
In my limited opinion, numpy's loadtxt is the way to go. loadtxt
doesn't need the header names. You can read in the arrays like this:
import numpy as np
# read in all 5 columns as text (the header row comes back as text too)
col1, col2, col3, col4, col5 = np.loadtxt(filename, dtype=str, delimiter=',', unpack=True)
or, if you want to skip the column headings and read in just the last
column as a specific data type:
# read in only the 5th column (index 4, usecols is zero-based),
# as a specific dtype, skipping the header row
col5_no_header = np.loadtxt(filename, skiprows=1, usecols=(4,),
                            delimiter=',', dtype=dtype, unpack=True)
-Patrick
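A self-contained sketch of the loadtxt suggestion, for anyone who wants to try it without a file on disk. The CSV text, the c1..c4 header names, and the helper name read_column are assumptions made for this example; only np.loadtxt and its delimiter, skiprows and usecols parameters come from numpy itself.

```python
import io

import numpy as np

# Hypothetical stand-in for one of the CSV files (5 columns, one header row).
csv_text = """c1,c2,c3,c4,htsgw_12191800
10.0,20.0,1,0,1.5
10.5,20.5,1,0,2.5
11.0,21.0,1,0,3.5
"""


def read_column(source, index):
    """Return one numeric column, selected by zero-based position."""
    # skiprows=1 drops the header line, so its name never matters.
    return np.loadtxt(source, delimiter=",", skiprows=1, usecols=(index,))


# The 5th column is index 4.
z = read_column(io.StringIO(csv_text), 4)
```

Passing a file name instead of the StringIO object should work the same way, since loadtxt accepts either.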
|
|
From: antonv <vas...@ya...> - 2009-01-03 18:06:04
|
You're right! I read more about recarrays and they were built specially for
being called by the column name, so I shouldn't have used csv2rec from the
start!
Thanks for the quick responses!
Anton
|
|
From: Pierre GM <pgm...@gm...> - 2009-01-03 20:31:23
|
FYI, I recoded np.loadtxt to handle missing data, automatic name definition and conversion functions, as a merge of np.loadtxt and mlab.csv2rec. You can access the code here:
https://code.launchpad.net/~pierregm/numpy/numpy_addons
Hopefully these functions will make it into numpy at one point or another.
Note also that you are not limited to recarrays: you can use what are called flexible-type arrays, which still let you access individual fields by key, without the overhead of recarrays (where fields can also be accessed as attributes). For example:
>>> x = np.array([(1, 10.), (2, 20.)], dtype=[('A', int), ('B', float)])
>>> x['A']
array([1, 2])
|
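To make the distinction in the message above concrete, here is a small sketch comparing key access on a flexible-type (structured) array with attribute access on a recarray view of the same data. The variable names by_key, rec and by_attr are illustrative only.

```python
import numpy as np

# Flexible-type (structured) array: each row has an int field 'A' and a float field 'B'.
x = np.array([(1, 10.0), (2, 20.0)], dtype=[("A", int), ("B", float)])

# Access a field by key; this works on any structured array.
by_key = x["A"]

# A recarray view of the same data additionally allows attribute access.
rec = x.view(np.recarray)
by_attr = rec.A
```

Both forms return the same column; the view does not copy the data, so the choice is mainly about which access style is more convenient.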
|
From: Gaius H. <ga...@ga...> - 2009-01-03 22:21:01
|
Hi all, Does anyone know if it's possible to make the polar plot look like a 12- or 24-hr clockface? I.e. 0 (or 12) at the top rather than the right, and labelled in 12ths (or 24ths) instead of degrees? Thanks, G |
|
From: Ryan M. <rm...@gm...> - 2009-01-04 01:00:25
|
Pierre GM wrote:
> Note also that you are not limited to recarrays: you can use what's
> called a flexible-type arrays, which still gives the possibility to
> access individual fields by keys, without the overload of recarrays
> (where fields can also be accessed as attributes). For example:
> >>> x=np.array([(1,10.), (2,20.)], dtype=[('A',int),('B',float)])
> >>>x['A']
> array([1, 2])
True, but the problem in this case is that he wants to access by column number,
which you can't really do with recarray or flexible dtype arrays.
Ryan
--
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
|
|
From: Thomas I-N <ti...@by...> - 2009-02-03 13:23:57
|
Hello Anton,
I just had the same problem and came up with the following solution:
a = csv2rec(fname)     # read a csv file into a
a[a.dtype.names[5]]    # access column 6 (index 5) in the file
As a shorthand, you could assign the column names to another attribute of the recarray:
a.cols = a.dtype.names
a[a.cols[5]]           # access column 6 (index 5) in the file
Hope this helps, even though it may not be good coding practice. I am a novice myself...
Best regards,
Thomas
|
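The dtype.names trick above does not depend on csv2rec in particular; it should work on any array with named fields. As a sketch under that assumption, the example below builds such an array with np.genfromtxt (swapped in here for csv2rec) from made-up CSV text, then picks the 5th column by position.

```python
import io

import numpy as np

# Hypothetical file contents; only the 5th column name varies between files.
csv_text = """c1,c2,c3,c4,htsgw_12191800
10.0,20.0,1,0,1.5
10.5,20.5,1,0,2.5
"""

# names=True takes the field names from the header row.
a = np.genfromtxt(io.StringIO(csv_text), delimiter=",", names=True)

# Look up the field name by position, then index the array with it.
fifth_name = a.dtype.names[4]   # 5th column is index 4
fifth = a[fifth_name]
```

Because the name is looked up at run time, the same two lines can be reused for every file in a batch.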