IT Nursery

apache-spark

(Why) do we need to call cache or persist on a RDD

by IT Nursery

When a resilient distributed dataset (RDD) is created from a text file or collection (or from another RDD), do we need to call …

Tags apache-spark, rdd, scala

Add JAR files to a Spark job – spark-submit

by IT Nursery

True… it has been discussed quite a lot. However, there is a lot of ambiguity and some of the answers provided … including …

Tags apache-spark, jar, java, scala, spark-submit

Spark performance for Scala vs Python

by IT Nursery

I prefer Python over Scala. But, as Spark is natively written in Scala, I was expecting my code to run faster in the …

Tags apache-spark, performance, pyspark, rdd, scala

How to stop INFO messages displaying on spark console?

by IT Nursery

I’d like to stop the various messages that appear in the Spark shell. I tried to edit the log4j.properties file in order to stop …

Tags apache-spark, log4j, spark-submit

What is the difference between cache and persist?

by IT Nursery

In terms of RDD persistence, what are the differences between cache() and persist() in Spark?

Tags apache-spark, distributed-computing, rdd

Apache Spark: The number of cores vs. the number of executors

by IT Nursery

I’m trying to understand the relationship between the number of cores and the number of executors when running a Spark job on YARN. …

Tags apache-spark, hadoop, hadoop-yarn

Task not serializable: java.io.NotSerializableException when calling function outside closure only on classes not objects

by IT Nursery

Getting strange behavior when calling a function outside of a closure: when the function is in an object, everything works; when the function is in …

Tags apache-spark, scala, serialization

Spark java.lang.OutOfMemoryError: Java heap space

by IT Nursery

My cluster: 1 master, 11 slaves, each node has 6 GB memory. My settings: spark.executor.memory=4g, Dspark.akka.frameSize=512. Here is the problem: First, I read …

Tags apache-spark, out-of-memory

What are workers, executors, cores in Spark Standalone cluster?

by IT Nursery

I read Cluster Mode Overview and I still can’t understand the different processes in the Spark Standalone cluster and the parallelism. Is the …

Tags apache-spark, distributed-computing

How to change dataframe column names in pyspark?

by IT Nursery

I come from a pandas background and am used to reading data from CSV files into a dataframe and then simply changing the column …

Tags apache-spark, apache-spark-sql, pyspark, python

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Content from: Stack Exchange


Copyright © 2023 All Rights Reserved | IT Nursery