Вопросы по тегу: amazon-emr

(1)

Невозможно определить схему для Parquet. Ее необходимо указать вручную

Я запускаю весь код из EMR Notebooks. spark.version '3.0.1-amzn-0' temp_df.printSchema() root |-- dt: string (nullable = true) |-- AverageTemperature: double (nullable = true) |-- AverageTemperatureUncertainty: double (nullable = true) |-- State: string (nullable = true) |-- Country: string (nullable = true) |-- year: integer (nullable = true) |-- month: integer (nullable = true) |-- day: integer (nullable = true) |-- weekday: integer (nullable = true) temp_df.show(2) +----------+------------------+-----------------------------+-----+-------+----+-----+---+-------+ | dt|AverageTemperature|AverageTemperatureUncertainty|State|Country|year|month|day|weekday| +----------+------------------+-----------------------------+-----+-------+----+-----+---+-------+ |1855-05-01| 25.544| 1.171| Acre| Brazil|1855| 5| 1| 3| |1855-06-01| 24.228| 1.103| Acre| Brazil|1855| 6| 1| 6| +----------+------------------+-----------------------------+-----+-------+----+-----+---+-------+ only showing top 2 rows temp_df.write.parquet(path='s3://project7878/clean_data/temperatures.parquet', mode='overwrite', partitionBy=['year']) spark.read.parquet(path='s3://project7878/clean_data/temperatures.parquet').show(2) An error was encountered: Unable to infer schema for Parquet. It must be specified manually.; Traceback (most recent call last): File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 353, in parquet return self._df(self._jreader.parquet(_to_seq(self._spark._sc, paths))) File "/usr/lib/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in __call__ answer, self.gateway_client, self.target_id, self.name) File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 134, in deco raise_from(converted) File "", line 3, in raise_from pyspark.sql.utils.AnalysisException: Unable to infer schema for Parquet. It must be specified manually.; Я обращался к другим постам в stack overflow, но представленное там решение (проблема из-за записи пустых файлов) ко мне не относится.Пожалуйста, помогите мне. Спасибо!!!
v

vjp

год назад

Ответов: 1

2022 WebDevInsider