pandas.read_csv() won't read back in complex number dtypes from pandas.DataFrame.to_csv() #9379

Open
jason-s opened this issue Jan 30, 2015 · 5 comments
Labels: Complex (Complex Numbers), Enhancement, IO CSV (read_csv, to_csv)

Comments

jason-s commented Jan 30, 2015

How can I read back DataFrames that contain complex numbers from CSV files that I exported using to_csv()?

test case:

import pandas as pd

data = pd.DataFrame([1+2j, 2+3j, 3+4j], columns=['a'])
print('a=')
print(data['a'])
print('a*2=')
print(data['a'] * 2)

filename = 'testcase1.csv'
data.to_csv(filename)

# Read the file back; column 'a' now holds strings (dtype: object)
print("\nReadback...")
data2 = pd.read_csv(filename)
print(data2['a'])
print(data2['a'] * 2)

output:

a=
0    (1+2j)
1    (2+3j)
2    (3+4j)
Name: a, dtype: complex128
a*2=
0    (2+4j)
1    (4+6j)
2    (6+8j)
Name: a, dtype: complex128

Readback...
0    (1+2j)
1    (2+3j)
2    (3+4j)
Name: a, dtype: object
0    (1+2j)(1+2j)
1    (2+3j)(2+3j)
2    (3+4j)(3+4j)
Name: a, dtype: object

jason-s (Author) commented Jan 30, 2015

hmm... I found some workarounds here:
http://stackoverflow.com/questions/16659818/how-to-read-complex-numbers-from-file-with-numpy

It just seems like read_csv() ought to be able to read back any numeric dtype written by to_csv() right out of the box.
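
A minimal sketch of a converter-based workaround for the test case above. It assumes the complex column is named 'a', as in the test case, and relies on Python's built-in complex() accepting the parenthesized text that to_csv() writes. The in-memory buffer is only a stand-in for testcase1.csv.

```python
import io

import pandas as pd

data = pd.DataFrame([1+2j, 2+3j, 3+4j], columns=['a'])

# Write to an in-memory string instead of a file on disk.
csv_text = data.to_csv(index=False)

# Parse every cell of column 'a' with the built-in complex() constructor,
# which accepts strings such as "(1+2j)".
data2 = pd.read_csv(io.StringIO(csv_text), converters={'a': complex})

# Make the dtype explicit in case the parsed values come back as objects.
data2['a'] = data2['a'].astype('complex128')

# Arithmetic now acts on numbers rather than repeating strings.
doubled = data2['a'] * 2
```

The column name has to be known in advance for this to work, so it does not generalize to arbitrary files.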

jreback added the IO CSV (read_csv, to_csv) and Complex (Complex Numbers) labels Jan 30, 2015
jreback added this to the Someday milestone Jan 30, 2015
jreback (Contributor) commented Jan 30, 2015

It's not implemented at the moment. It should be straightforward. Would you like to give it a shot?

We'd probably need to enhance convert_objects to deal with this.

jason-s (Author) commented Jan 30, 2015

If I weren't overcommitted at work, I would. Sorry. :/

I'm creating a workaround for my specific case; I'm not sure how to generalize it.

shoyer (Member) commented Jan 30, 2015

If I recall correctly, saving/loading complex numbers does work with HDF5 files. Might want to give that a try.

jason-s (Author) commented Jan 30, 2015

This worked for me in my particular case (converting complex numbers in columns whose names match _tf$):

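# %!#@%#!$%#$%! Pandas doesn't read complex numbers back in by default
# we have to do this manually (see http://stackoverflow.com/questions/16659818)

import re
import pandas as pd

class TFConverter(dict):
    """Dict-like converters mapping: any column named *_tf gets the complex converter."""
    column_name_pattern = re.compile(r'_tf$')
    def __getitem__(self, k):
        if k in self:
            return TFConverter.convert
        else:
            raise KeyError(k)
    def __contains__(self, k):
        # Only string column names can match; other keys (e.g. positions) never do
        return isinstance(k, str) and self.column_name_pattern.search(k) is not None
    @staticmethod
    def convert(txt):
        # to_csv() writes complex values as "(1+2j)"; drop the parentheses before parsing
        return complex(txt.strip("()"))

def read_tf_csv(filename, **kwargs):
    return pd.read_csv(filename, converters=TFConverter(), **kwargs)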
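
A variant of the same idea, sketched under a few assumptions: rather than subclassing dict, read the file normally and convert the matching columns afterwards. The helper name read_complex_csv, its pattern parameter, and the sample column names are hypothetical and not part of pandas. Only the _tf suffix comes from the workaround above.

```python
import io
import re

import numpy as np
import pandas as pd


def read_complex_csv(source, pattern=r'_tf$', **kwargs):
    """Read a CSV, then parse columns whose names match `pattern` as complex.

    Hypothetical helper for illustration; not a pandas API.
    """
    df = pd.read_csv(source, **kwargs)
    regex = re.compile(pattern)
    for col in df.columns:
        if isinstance(col, str) and regex.search(col):
            # complex() accepts both "1+2j" and "(1+2j)".
            df[col] = np.array([complex(v) for v in df[col]], dtype='complex128')
    return df


# Usage with an in-memory CSV standing in for a file on disk.
original = pd.DataFrame({'freq': [1.0, 2.0], 'gain_tf': [1+2j, 3-4j]})
restored = read_complex_csv(io.StringIO(original.to_csv(index=False)))
```

Converting after the read avoids depending on how read_csv() looks up keys in the converters mapping, at the cost of an extra pass over each matching column.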
