Sudheer Keshav Bhat · January 5, 2021
Homebrew and Python 3 are assumed to be installed. Verified on macOS Big Sur 11.6.
brew install java
brew install scala
brew install apache-spark
With Python 3 already installed, install PySpark:
pip install pyspark
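Since several Python interpreters may coexist on one machine (pyenv, Anaconda, the system Python), it can help to confirm that the interpreter you plan to use actually sees the freshly installed package. The following is a minimal sketch using only the standard library; the helper name `is_importable` is hypothetical and not part of PySpark.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if the running interpreter can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Show which interpreter is running, then whether it can find pyspark
    print("Interpreter:", sys.executable)
    print("pyspark importable:", is_importable("pyspark"))
```

If the second line prints `False`, `pip` probably installed the package into a different interpreter than the one running the script.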
echo "export JAVA_HOME=/Library/Java/JavaVirtualMachines/openjdk.jdk/Contents/Home/" >> ~/.zshrc
echo "export JRE_HOME=/Library/Java/JavaVirtualMachines/openjdk.jdk/Contents/Home/" >> ~/.zshrc
echo "export SPARK_HOME=/usr/local/Cellar/apache-spark/3.1.2/libexec" >> ~/.zshrc
echo 'export PATH=/usr/local/Cellar/apache-spark/3.1.2/bin:$PATH' >> ~/.zshrc
echo "export PYSPARK_PYTHON=/Users/sudheerb/.pyenv/shims/python" >> ~/.zshrc
echo "export PYSPARK_DRIVER_PYTHON=jupyter" >> ~/.zshrc
echo "export PYSPARK_DRIVER_PYTHON_OPTS='notebook'" >> ~/.zshrc
source ~/.zshrc
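After sourcing `~/.zshrc`, a quick check that the variables exported above are visible, and that the directory-valued ones point at real directories, can save some debugging later. The sketch below is one possible way to do this; the function name `check_spark_env` and the grouping of variables are assumptions made for illustration, not part of Spark or Homebrew.

```python
import os
from typing import Dict, Mapping

# Variables exported in the steps above; these three are expected to be directories
PATH_VARS = ("JAVA_HOME", "JRE_HOME", "SPARK_HOME")
# These only need to be set to a non-empty value
PLAIN_VARS = ("PYSPARK_PYTHON", "PYSPARK_DRIVER_PYTHON", "PYSPARK_DRIVER_PYTHON_OPTS")


def check_spark_env(env: Mapping[str, str]) -> Dict[str, str]:
    """Return a mapping of variable name to problem description (empty if all is well)."""
    problems = {}
    for name in PATH_VARS + PLAIN_VARS:
        value = env.get(name, "")
        if not value:
            problems[name] = "not set"
        elif name in PATH_VARS and not os.path.isdir(value):
            problems[name] = f"directory does not exist: {value}"
    return problems


if __name__ == "__main__":
    # Report every problem found in the current shell environment
    for name, problem in check_spark_env(os.environ).items():
        print(f"{name}: {problem}")
```

Run it from a new terminal. No output means every variable is set and the three home directories exist. A missing `SPARK_HOME` directory usually means the installed Spark version differs from the `3.1.2` used in the path above.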
Launch pyspark from the directory containing the IPython notebook
➜ pyspark
[I 18:33:57.084 NotebookApp] JupyterLab extension loaded from .../opt/anaconda3/lib/python3.7/site-packages/jupyterlab
[I 18:33:57.085 NotebookApp] JupyterLab application directory is .../opt/anaconda3/share/jupyter/lab
[I 18:33:57.090 NotebookApp] Serving notebooks from local directory: .../spark
[I 18:33:57.090 NotebookApp] The Jupyter Notebook is running at:
[I 18:33:57.090 NotebookApp] http://localhost:8888/?token=....
[I 18:33:57.090 NotebookApp] or http://127.0.0.1:8888/?token=....
[I 18:33:57.090 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 18:33:57.127 NotebookApp]
To access the notebook, open this file in a browser:
file:///..../Library/Jupyter/runtime/nbserver-6161-open.html
Or copy and paste one of these URLs:
http://localhost:8888/?token=....
or http://127.0.0.1:8888/?token=....
Write a sample Spark program and check the result
import pyspark
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('HelloSpark').getOrCreate()
df = spark.sql(''' select 'SQL' as Hello ''')
df.show()
+-----+
|Hello|
+-----+
|  SQL|
+-----+