ImportError: cannot import name 'KoalasFrame' from 'databricks.koalas'

I get the warnings and error below when I run this import:

from databricks.koalas import KoalasFrame

WARNING:root:Found pyspark version "3.5.0" installed. The pyspark version 3.2 and above has a built-in "pandas APIs on Spark" module ported from Koalas. Try `import pyspark.pandas as ps` instead. 
WARNING:root:'PYARROW_IGNORE_TIMEZONE' environment variable was not set. It is required to set this environment variable to '1' in both driver and executor sides if you use pyarrow>=2.0.0. Koalas will set it for you but it does not work if there is a Spark context already launched.


ImportError                               Traceback (most recent call last)
cnrl\users\yongnual\Data\Spyder_workplace\DTS_dashboard\pandas2_high_performance_testing.ipynb Cell 18 line 1
----> 1 from databricks.koalas import KoalasFrame

ImportError: cannot import name 'KoalasFrame' from 'databricks.koalas' (c:\Anaconda\envs\dash2\lib\site-packages\databricks\koalas\__init__.py)
There are 2 answers

Talha Tayyab On BEST ANSWER

databricks.koalas does not define anything named KoalasFrame, so the import fails. To reproduce:

pip install databricks
pip install koalas
pip install pyspark

from databricks.koalas import KoalasFrame

#Error
ImportError: cannot import name 'KoalasFrame' from 'databricks.koalas'

You can list the names that databricks.koalas actually exports with:

import databricks.koalas
dir(databricks.koalas)

['DataFrame',
 'Index',
 'LooseVersion',
 'MultiIndex',
 'NamedAgg',
 'Series',
 '__all__',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 '__version__',
 '__warningregistry__',
 '_auto_patch',
 'assert_pyspark_version',
 'broadcast',
 'concat',
 'from_pandas',
 'get_dummies',
 'get_option',
 'groupby',
 'isna',
 'isnull',
 'melt',
 'merge',
 'namespace',
 'notna',
 'notnull',
 'option_context',
 'options',
 'os',
 'pandas_wraps',
 'pyarrow',
 'pyspark',
 'range',
 'read_clipboard',
 'read_csv',
 'read_delta',
 'read_excel',
 'read_html',
 'read_json',
 'read_parquet',
 'read_spark_io',
 'read_sql',
 'read_sql_query',
 'read_sql_table',
 'read_table',
 'reset_option',
 'set_option',
 'sql',
 'to_datetime',
 'to_numeric']

I assume that you mean:

from databricks.koalas import DataFrame

https://koalas.readthedocs.io/en/latest/reference/api/databricks.koalas.DataFrame.html
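
If you would rather not scan the dir() output by eye, a small helper can filter it for you. This is only a sketch: find_names is a hypothetical helper name, not part of Koalas, and it relies on nothing beyond the standard library's importlib.

```python
import importlib
from typing import List


def find_names(module_name: str, fragment: str) -> List[str]:
    """Return the public names in a module that contain `fragment` (case-insensitive)."""
    module = importlib.import_module(module_name)
    fragment = fragment.lower()
    # dir() lists every attribute on the module object, including submodules.
    return [name for name in dir(module) if fragment in name.lower()]


# Hypothetical use against Koalas (requires koalas to be installed):
#   find_names("databricks.koalas", "frame")
```

Going by the listing above, the only name containing "frame" is DataFrame, which is why that is most likely the class you were after.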

Alex Ott On

The Koalas package is deprecated because its functionality has been merged into Apache Spark as the pandas API on Spark (this is what the first warning in your output is telling you). Follow the steps in the pandas API on Spark user guide. Instead of importing databricks.koalas, import pyspark.pandas:

import pyspark.pandas as ps

This module has its own DataFrame class.
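
If the same code has to run both on newer Spark installs and on older environments that still only have Koalas, one option is to try the imports in order of preference. The following is a minimal sketch under that assumption; first_importable is a hypothetical helper name, and the only library call it depends on is importlib.import_module from the standard library.

```python
import importlib
from types import ModuleType
from typing import List, Sequence


def first_importable(candidates: Sequence[str]) -> ModuleType:
    """Return the first module in `candidates` that imports without error."""
    errors: List[str] = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:  # also catches ModuleNotFoundError
            errors.append(f"{name}: {exc}")
    raise ImportError("no candidate could be imported: " + "; ".join(errors))


# Hypothetical use (needs pyspark >= 3.2, or koalas as the fallback):
#   ps = first_importable(["pyspark.pandas", "databricks.koalas"])
#   df = ps.DataFrame({"a": [1, 2, 3]})
```

Listing pyspark.pandas first means the deprecated package is only used when nothing newer is available.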