Custom Java Jar file with Scala and SBT

Diptiman Chakrabarti
4 min read · Jul 5, 2020

Introduction

With the rise of open source, automation, and microservice architectures, there is a growing need to create reusable code components and use them as “plug and play” pieces in different implementations.

Problem

In Java, multiple custom-built reusable components are created and imported into the main application as and when required. Once a jar file is created for a component, it can be used in any implementation.

But when working in Scala, how do you import custom-built Java jars into a Scala application? Moreover, how do you make sure that when a deployable jar file is created for the Scala app, the Java code is an integral part of that package?

Solution

The following section describes how these problems can be solved.

Apart from Maven, SBT is widely used in the Scala ecosystem for build, test, and deployment. If you are new to SBT in IntelliJ, please refer to this link: https://www.jetbrains.com/help/idea/sbt-support.html
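For orientation, an SBT project created in IntelliJ follows the conventional layout sketched below (directory names are the standard SBT defaults, not taken from the original project). The build file (build.sbt) and the plugin file (project/plugins.sbt) referenced later both live at these locations.

new-Project/
  build.sbt
  project/
    build.properties
    plugins.sbt
  src/
    main/
      scala/
      java/
    test/
      scala/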

Let’s assume a Java jar file named Java_Jar-0.0.1-SNAPSHOT.jar is available, located at /$java_Jar_Path/.

When creating a Scala project with SBT, the basic structure of the build.sbt file looks like:

name := "new-Project"

version := "0.1"

scalaVersion := "2.11.12"

To add a single dependent library to the build file, the following syntax is used:

libraryDependencies += "groupID" % "artifactID" % "version"

If multiple dependencies are added, the operator changes to ++= with a Seq:

libraryDependencies ++= Seq("groupID" % "artifactID" % "version", "groupID2" % "artifactID2" % "version2")

The SBT build file also defines a regular expression, held in a variable, that matches META-INF entries. Every dependency jar carries its own META-INF metadata (manifest, signatures, service files and other packaging details), and these entries frequently collide when many jars are merged into one. The regex defined below is used later in the assembly merge strategy to recognise and handle such entries.

val meta = """META.INF(.)*""".r
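As a quick illustration of what this pattern matches (checked in a Scala REPL; the paths below are examples only), the dot after META also matches the literal hyphen used in real jar entries:

meta.findFirstIn("META-INF/MANIFEST.MF")   // Some(META-INF/MANIFEST.MF)
meta.findFirstIn("com/example/App.class")  // None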

A dependent library is added under libraryDependencies in the build file as, for example:

"org.apache.spark" %% "spark-core" % "2.4.0"

If the dependent library is already available in the deployment environment and is not required as part of the jar file, the dependency is marked as provided:

"org.apache.spark" %% "spark-core" % "2.4.0" % "provided"

sbt-assembly is an SBT plugin used to create fat jars. A fat jar is one where all dependent libraries are packaged inside the built jar, so during execution those libraries are resolved from within the jar instead of from libraries provided by the environment.

To add the sbt-assembly plugin in IntelliJ, create a file “plugins.sbt” under the “project” folder and add the following line to the file:

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")
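The SBT launcher version itself is normally pinned in project/build.properties; the version shown below is illustrative only, not taken from the original project:

sbt.version = 1.2.8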

In this example the sbt-assembly version is 0.14.10. If an error occurs resolving the plugin for your Scala and SBT versions, add the following resolver at the top of the plugins.sbt file:

resolvers += Resolver.bintrayIvyRepo("com.eed3si9n", "sbt-plugins")

Another important point in the build configuration is the assembly merge process. When the fat jar is built, several dependencies can provide files at the same path, so a merge strategy is required to decide which copy wins. In the sample strategy below, duplicate javax.servlet classes and .html files keep the first copy found, reference.conf and other .conf files are concatenated so no configuration is lost, META-INF entries matched by the meta regex are discarded, and everything else falls back to taking the first occurrence:

assemblyMergeStrategy in assembly := {
case PathList("javax", "servlet", xs @ _*) => MergeStrategy.first
case PathList(ps @ _*) if ps.last endsWith ".html" => MergeStrategy.first
case n if n.startsWith("reference.conf") => MergeStrategy.concat
case n if n.endsWith(".conf") => MergeStrategy.concat
case meta(_) => MergeStrategy.discard
case x => MergeStrategy.first
}

Once all of this is done, the configuration file for the fat jar looks similar to the following:

name := "odwedw-Project"

version := "0.1"

scalaVersion := "2.11.12"


libraryDependencies ++= Seq( "org.apache.spark" %% "spark-core" % "2.4.0" ,
"org.apache.spark" %% "spark-sql" % "2.4.0" ,
"org.apache.spark" %% "spark-hive" % "2.4.0" )

libraryDependencies += "Javajar" % "Javajar" % "0.0.1-SNAPSHOT" from "file:/{$JAVA_LOCAL_JAR_PATH}/{$JAR_FILE}"

val meta = """META.INF(.)*""".r
assemblyMergeStrategy in assembly := {
case PathList("javax", "servlet", xs @ _*) => MergeStrategy.first
case PathList(ps @ _*) if ps.last endsWith ".html" => MergeStrategy.first
case n if n.startsWith("reference.conf") => MergeStrategy.concat
case n if n.endsWith(".conf") => MergeStrategy.concat
case meta(_) => MergeStrategy.discard
case x => MergeStrategy.first
}

In the above, replace {$JAVA_LOCAL_JAR_PATH} with the absolute path to the jar file and {$JAR_FILE} with the jar file name.
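Once the jar is referenced this way, classes from it can be used from Scala exactly like any other Java library. As a minimal sketch (the package, class and method names below are hypothetical, since the contents of Java_Jar-0.0.1-SNAPSHOT.jar are not shown), calling a Java utility class from a Scala object might look like:

// hypothetical class packaged inside Java_Jar-0.0.1-SNAPSHOT.jar
import com.example.javajar.StringUtil

object MainApp {
  def main(args: Array[String]): Unit = {
    // Java static methods are called just like Scala object methods
    val cleaned = StringUtil.normalize("  Hello from Scala  ")
    println(cleaned)
  }
}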

If some dependent libraries are required in the fat jar and some are not, it is better to create two separate libraryDependencies entries in the SBT configuration or build file. It might look like the following:

name := "odwedw-Project"

version := "0.1"

scalaVersion := "2.11.12"


libraryDependencies ++= Seq( "org.apache.spark" %% "spark-core" % "2.4.0" % "provided",
"org.apache.spark" %% "spark-sql" % "2.4.0" % "provided",
"org.apache.spark" %% "spark-hive" % "2.4.0" % "provided" )

libraryDependencies += "Javajar" % "Javajar" % "0.0.1-SNAPSHOT" from "file:/{$JAVA_LOCAL_JAR_PATH}/{$JAR_FILE}"

val meta = """META.INF(.)*""".r
assemblyMergeStrategy in assembly := {
case PathList("javax", "servlet", xs @ _*) => MergeStrategy.first
case PathList(ps @ _*) if ps.last endsWith ".html" => MergeStrategy.first
case n if n.startsWith("reference.conf") => MergeStrategy.concat
case n if n.endsWith(".conf") => MergeStrategy.concat
case meta(_) => MergeStrategy.discard
case x => MergeStrategy.first
}

Once % “provided” is specified against a library, that library is not packaged within the application jar file.
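With either configuration, the fat jar is produced with the assembly task provided by sbt-assembly, run from the project root (or from the sbt shell in IntelliJ):

sbt clean assembly

By default the resulting jar is written under target/scala-2.11/ (the Scala binary version from scalaVersion) with “-assembly” in its name.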

Conclusion

The sections above describe how to import multiple external and local libraries into a fat jar using SBT, and how, within a single configuration file, some dependent libraries can be excluded from the jar.

Note: Whenever using Spark with Scala, make sure the Scala and Spark versions are compatible, otherwise the SBT configuration file will throw resolution errors.
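For example, the %% operator appends the Scala binary version to the artifact name, so the dependency and scalaVersion must agree (a minimal illustration):

scalaVersion := "2.11.12"

// resolves to the artifact spark-core_2.11, which is published for Spark 2.4.0
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.4.0"

Spark 2.4.0 is published for Scala 2.11 and 2.12; pointing scalaVersion at a binary version for which the Spark artifacts do not exist makes the dependency unresolvable.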
