Important Notice

We have decided to stop maintaining this public GitHub repository.

HANA Vora Spark extensions

These are extensions to Apache Spark developed for SAP HANA Vora. They can be used with any supported Spark version, even without Vora. Note that some features may perform significantly better when the HANA Vora data source is used.

First Steps

Prerequisites

Minimal requirements for the Spark extensions:

  1. Java SE 7 (or later) installed
  2. Maven installed (mvn -version should work)
  3. Spark 1.6.0 or 1.6.1 installed, with SPARK_HOME set to the installation directory (these are the only supported Spark versions; see Troubleshooting below and the sanity check sketched after this list)
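
If you want to double-check some of these prerequisites from a Scala REPL or Spark shell, a minimal sketch is below. It uses only standard library calls; the expected values come from the list above.

// minimal sketch: check two of the prerequisites from plain Scala
println(s"Java version: ${sys.props("java.version")}") // should be 1.7 or later
sys.env.get("SPARK_HOME") match {
  case Some(dir) => println(s"SPARK_HOME = $dir")
  case None      => sys.error("SPARK_HOME is not set")
}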

Building

Build the distribution package with Maven:

mvn clean package

You can also skip the tests by adding the corresponding switch to the command line:

mvn clean package -Dmaven.test.skip=true

Then extract the package to an installation directory of your choice:

export SAP_SPARK_HOME=$HOME/sap-spark-extensions # choose your install dir
mkdir -p $SAP_SPARK_HOME
tar xzpf ./dist/target/spark-sap-extensions-*-dist.tar.gz -C $SAP_SPARK_HOME

Starting an Extended Spark Shell

From the command line, execute:

$SAP_SPARK_HOME/bin/start-spark-shell.sh

Using the Extensions

While the Spark shell starts up with SparkContext (sc) and SQLContext (sqlContext) predefined, you need to instantiate a SapSQLContext to make use of the extensions.

import org.apache.spark.sql._

val voraContext = new SapSQLContext(sc)

// now we can start with some SQL queries
voraContext.sql("SHOW TABLES").show()
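
Since SapSQLContext behaves like a regular SQLContext, the usual DataFrame API is available as well. The following sketch registers a small DataFrame as a temporary table and queries it; the Person case class, the data, and the table name are made up for illustration.

// minimal sketch: register a DataFrame as a temporary table and query it
// (the case class, data, and table name are hypothetical examples)
case class Person(name: String, age: Int)
val people = sc.parallelize(Seq(Person("Alice", 29), Person("Bob", 31)))
val df = voraContext.createDataFrame(people)
df.registerTempTable("people")
voraContext.sql("SELECT name FROM people WHERE age > 30").show()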

Further Documentation

Package Documentation

TODO

Inline Documentation

You can also start at the rootdoc (the root page of the generated Scaladoc) and work your way through the code from there.

Troubleshooting

Wrong Spark Version
java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.optimizer.Optimizer: method <init>()V not found
  at org.apache.spark.sql.extension.ExtendableOptimizer.<init>(ExtendableOptimizer.scala:13)
  at org.apache.spark.sql.hive.ExtendableHiveContext.optimizer$lzycompute(ExtendableHiveContext.scala:93)
  at org.apache.spark.sql.hive.ExtendableHiveContext.optimizer(ExtendableHiveContext.scala:92)
  at org.apache.spark.sql.execution.QueryExecution.optimizedPlan$lzycompute(QueryExecution.scala:43)
  at org.apache.spark.sql.execution.QueryExecution.optimizedPlan(QueryExecution.scala:43)
  ...

This happens when Spark version 1.6.2 is used, due to a change in the Optimizer interface. Because of this compatibility issue, only Spark 1.6.0 and 1.6.1 are supported.
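
To fail fast with a clearer message than the stack trace above, you can check the version of the running SparkContext before instantiating the extended context. A minimal sketch, assuming the predefined sc of a Spark shell:

// minimal sketch: detect an unsupported Spark version up front,
// before it surfaces as a NoSuchMethodError deep in the optimizer
val supported = Set("1.6.0", "1.6.1")
require(supported.contains(sc.version),
  s"Unsupported Spark version ${sc.version}; only 1.6.0 and 1.6.1 are supported")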