Interfacing with the pandas package
To make it easy to exchange data between the
Table class and the pandas DataFrame class (the main data structure in pandas), the
Table class includes two methods, to_pandas() and from_pandas().
To demonstrate these, we can create a simple table:
>>> from astropy.table import Table
>>> t = Table()
>>> t['a'] = [1, 2, 3, 4]
>>> t['b'] = ['a', 'b', 'c', 'd']
which we can then convert to a pandas DataFrame:
>>> df = t.to_pandas()
>>> df
   a  b
0  1  a
1  2  b
2  3  c
3  4  d
>>> type(df)
<class 'pandas.core.frame.DataFrame'>
It is also possible to create a table from a DataFrame:
>>> t2 = Table.from_pandas(df)
>>> t2
<Table length=4>
  a      b
int64 string8
----- -------
    1       a
    2       b
    3       c
    4       d
The conversions to/from pandas are subject to the following caveats:
- The pandas DataFrame structure does not support multi-dimensional columns, so
Table objects with multi-dimensional columns cannot be converted to DataFrame.
- Masked tables can be converted, but DataFrame uses
numpy.nan to indicate masked values, so all numerical columns (integer or float) are converted to
floating-point columns in DataFrame, and string columns with missing values are converted to object columns with
numpy.nan values to indicate missing values. For numerical columns, the conversion therefore does not necessarily round-trip if converting back to an Astropy table, because the distinction between
numpy.nan and masked values is lost, and, for example, integer columns will be converted to floating-point.
- Tables with mixin columns cannot currently be converted, but this may be implemented in the future.