Create Docker image with Maven build

There are lots of Maven Docker plugins available to integrate Docker with Maven.

In this example, I am going to show how to build a Docker image while building a Maven project.

Copy the snippet below into your pom.xml file, create a Maven property “docker.image.name” with the appropriate Docker image name, and make sure that the Dockerfile is available in the correct location.

Then run ‘mvn install’, and once it’s done, run ‘docker images’ and check that the Docker image is available in the list of images.

pom.xml:


<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>exec-maven-plugin</artifactId>
  <version>1.4.0</version>
  <executions>
    <execution>
      <id>build-image</id>
      <phase>install</phase>
      <goals>
        <goal>exec</goal>
      </goals>
      <configuration>
        <!-- Invoke "docker build -t=<image name> ." during the install phase -->
        <executable>docker</executable>
        <arguments>
          <argument>build</argument>
          <argument>-t=${docker.image.name}</argument>
          <argument>.</argument>
        </arguments>
      </configuration>
    </execution>
  </executions>
</plugin>
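For reference, a minimal sketch of the pieces the snippet assumes is below. The property value, base image, and JAR name are placeholders, so adjust them for your project:

<properties>
  <!-- Hypothetical image name; change it to suit your project -->
  <docker.image.name>my-app</docker.image.name>
</properties>

And a bare-bones Dockerfile next to the pom.xml:

# Hypothetical base image and JAR name; adjust for your build
FROM openjdk:8-jre
COPY target/my-app.jar /opt/my-app.jar
CMD ["java", "-jar", "/opt/my-app.jar"]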
      

Spark Scala Unit Testing

In this post, I am going to show an example of writing unit test cases for a Spark Scala job and running them with Maven.

Assume that we have a set of XML files containing user information such as first name and last name. Middle name and county name are optional fields, but the XML files do contain empty nodes for these two fields. Our job is to read those files, remove the empty nodes, and write the updated content to a text file, either in a local or a Hadoop environment.

The sample XML content is given below:

 

<persons>
    <person>
        <firstName>Bala</firstName>
        <middleName/>
        <lastName>Samy</lastName>
        <countyName/>
    </person>
    <person>
        <firstName>Bala1</firstName>
        <middleName/>
        <lastName>Samy1</lastName>
        <countyName/>
    </person>
</persons>

 

The Spark Scala code for reading the XML files and removing the empty nodes is given below:


package com

import org.apache.spark.{SparkConf, SparkContext}

import scala.collection.Map

object EmptyTagReplacer {

  def main(args: Array[String]) {

    if (args.length < 2) {
      println("Usage <inputDir> <outputDir>")
      System.exit(1)
    }
    val conf = new SparkConf().setAppName("EmptyTagReplacer")
    val sc = new SparkContext(conf)

    val inFile = args(0)
    val outFile = args(1)

    // Read each XML file as a (fileName, content) pair.
    val input: Map[String, String] = sc.wholeTextFiles(inFile).collectAsMap()
    searchAndReplaceEmptyTags(sc, input, outFile)
    sc.stop()
  }

  def searchAndReplaceEmptyTags(sc: SparkContext, inputXml: Map[String, String], outFile: String):
  scala.collection.mutable.ListBuffer[String] = {

    val outputXml = new scala.collection.mutable.ListBuffer[String]()
    val htmlTags = List("<middleName/>", "<countyName/>")
    inputXml.foreach { case (fileName, content) =>
      var newContent = content
      // Strip each empty tag from the file's content.
      for (tag <- htmlTags) {
        newContent = newContent.replace(tag, "")
      }
      // Write the cleaned content out; an empty outFile (as in the unit test) skips writing.
      if (outFile.nonEmpty) {
        sc.parallelize(List(newContent)).saveAsTextFile(outFile + "/" + fileName)
      }
      outputXml += newContent
    }
    outputXml
  }

  def countTags(sc: SparkContext, xmlRecords: List[String]): List[Int] = {

    val middleNameTagCounter = sc.accumulator(0)
    val countyTagCounter = sc.accumulator(0)
    val middleNameRegex = "<middleName/>".r
    val countyRegEx = "<countyName/>".r
    xmlRecords.foreach { content =>
      middleNameTagCounter += middleNameRegex.findAllIn(content).length
      countyTagCounter += countyRegEx.findAllIn(content).length
    }
    List(middleNameTagCounter.value, countyTagCounter.value)
  }
}

The test case for the above Spark job is given below:



package com

import java.io.File

import com.holdenkarau.spark.testing.SharedSparkContext
import org.apache.commons.io.FileUtils
import org.scalatest.FunSuite

class EmptyTagReplacerTest extends FunSuite with SharedSparkContext {

  test("Empty HTML tag replacer test") {

    // Read the content and create a content map keyed by file name.
    val content: String = FileUtils.readFileToString(new File("./src/test/resources/text-files/xml1"), "UTF-8")

    val contentMap = collection.mutable.Map[String, String]()
    contentMap += ("fileName" -> content)

    // Call searchAndReplaceEmptyTags to remove the empty nodes; an empty outFile skips writing output.
    val outputContent: scala.collection.mutable.ListBuffer[String] =
      EmptyTagReplacer.searchAndReplaceEmptyTags(sc, contentMap, "")

    // After the replacement, no empty tags should remain.
    val counts: List[Int] = EmptyTagReplacer.countTags(sc, outputContent.toList)
    val expected = List(0, 0)
    assert(counts == expected)
  }
}


You have to include the scala-maven-plugin and scalatest-maven-plugin in pom.xml to make this work.
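A minimal sketch of those entries is below, together with the spark-testing-base test dependency that provides SharedSparkContext. The versions here are assumptions, so align them with your Scala and Spark versions:

<!-- Versions below are illustrative; match them to your Scala/Spark versions -->
<dependency>
  <groupId>com.holdenkarau</groupId>
  <artifactId>spark-testing-base_2.11</artifactId>
  <version>2.0.0_0.5.0</version>
  <scope>test</scope>
</dependency>

<plugin>
  <groupId>net.alchim31.maven</groupId>
  <artifactId>scala-maven-plugin</artifactId>
  <version>3.2.2</version>
  <executions>
    <execution>
      <goals>
        <!-- Compile both main and test Scala sources -->
        <goal>compile</goal>
        <goal>testCompile</goal>
      </goals>
    </execution>
  </executions>
</plugin>
<plugin>
  <groupId>org.scalatest</groupId>
  <artifactId>scalatest-maven-plugin</artifactId>
  <version>1.0</version>
  <executions>
    <execution>
      <id>test</id>
      <goals>
        <!-- Run the ScalaTest suites during the test phase -->
        <goal>test</goal>
      </goals>
    </execution>
  </executions>
</plugin>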

Please refer to my GitHub repo to learn more: https://github.com/dkbalachandar/scala-spark-test

How to integrate JaCoCo Code coverage tool with Maven

I used the Cobertura code coverage tool for one of my recent projects, following the steps mentioned in this link: Cobertura Example.

When I tried to upgrade the Java version to 1.8, I ran into some issues due to the usage of lambda expressions, so I looked for another code coverage tool and chose JaCoCo.

Assume that I have a Maven Java project with some unit test cases. I have given the relevant snippet from my pom.xml below. Copy and paste it into the build section and run your build. After the build is done, go to target/site/jacoco-ut/ and open the index.html file to view the code coverage report.


<plugin>
  <groupId>org.jacoco</groupId>
  <artifactId>jacoco-maven-plugin</artifactId>
  <version>0.7.7.201606060606</version>
  <configuration>
    <!-- Add entries here to exclude them from code coverage analysis -->
    <excludes>
      <exclude>**/model/**</exclude>
      <exclude>**/test/**</exclude>
    </excludes>
  </configuration>
  <executions>
    <execution>
      <id>pre-unit-test</id>
      <goals>
        <goal>prepare-agent</goal>
      </goals>
      <configuration>
        <destFile>${project.build.directory}/coverage-reports/jacoco-ut.exec</destFile>
        <propertyName>surefireArgLine</propertyName>
      </configuration>
    </execution>
    <execution>
      <id>post-unit-test</id>
      <phase>test</phase>
      <goals>
        <goal>report</goal>
      </goals>
      <configuration>
        <dataFile>${project.build.directory}/coverage-reports/jacoco-ut.exec</dataFile>
        <outputDirectory>${project.reporting.outputDirectory}/jacoco-ut</outputDirectory>
      </configuration>
    </execution>
    <execution>
      <id>pre-integration-test</id>
      <phase>pre-integration-test</phase>
      <goals>
        <goal>prepare-agent</goal>
      </goals>
      <configuration>
        <destFile>${project.build.directory}/coverage-reports/jacoco-it.exec</destFile>
        <propertyName>failsafeArgLine</propertyName>
      </configuration>
    </execution>
    <execution>
      <id>post-integration-test</id>
      <phase>post-integration-test</phase>
      <goals>
        <goal>report</goal>
      </goals>
      <configuration>
        <dataFile>${project.build.directory}/coverage-reports/jacoco-it.exec</dataFile>
        <outputDirectory>${project.reporting.outputDirectory}/jacoco-it</outputDirectory>
      </configuration>
    </execution>
  </executions>
</plugin>
<plugin>
  <artifactId>maven-surefire-plugin</artifactId>
  <version>2.19.1</version>
  <configuration>
    <argLine>${surefireArgLine}</argLine>
    <excludes>
      <exclude>**/*IntegrationTest*</exclude>
    </excludes>
  </configuration>
</plugin>
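The pre-integration-test execution above prepares a failsafeArgLine property, but the snippet only wires up Surefire. If you also run integration tests, the Failsafe plugin has to pick that property up. A minimal sketch, with the version assumed to match the Surefire version above:

<plugin>
  <artifactId>maven-failsafe-plugin</artifactId>
  <version>2.19.1</version>
  <configuration>
    <!-- Pass the JaCoCo agent settings prepared by the pre-integration-test execution -->
    <argLine>${failsafeArgLine}</argLine>
    <includes>
      <include>**/*IntegrationTest*</include>
    </includes>
  </configuration>
  <executions>
    <execution>
      <goals>
        <goal>integration-test</goal>
        <goal>verify</goal>
      </goals>
    </execution>
  </executions>
</plugin>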

Rest API to produce message to Kafka using Docker Maven Plugin

I have developed a simple REST API that sends incoming messages to Apache Kafka.

I have used Docker Kafka (https://github.com/spotify/docker-kafka) and the Docker Maven Plugin(https://github.com/fabric8io/docker-maven-plugin) to do this.

Before going through this post, familiarize yourself with Docker and Docker Compose.

The Docker Maven Plugin provides a nice way to specify multiple images in pom.xml and link them as necessary. We could also use Docker Compose for this, but I have used the plugin here.
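For illustration, here is a hedged sketch of what such a multi-image configuration could look like with the fabric8 plugin. The plugin version, ports, aliases, and image names are assumptions for this sketch, not the exact configuration from the repo:

<plugin>
  <groupId>io.fabric8</groupId>
  <artifactId>docker-maven-plugin</artifactId>
  <version>0.21.0</version>
  <configuration>
    <images>
      <!-- Kafka/Zookeeper image (spotify/kafka bundles both) -->
      <image>
        <name>spotify/kafka</name>
        <alias>kafka</alias>
        <run>
          <ports>
            <port>2181:2181</port>
            <port>9092:9092</port>
          </ports>
        </run>
      </image>
      <!-- The REST API image, built from the project's Dockerfile and linked to the Kafka container -->
      <image>
        <name>kafka-rest</name>
        <build>
          <dockerFileDir>${project.basedir}</dockerFileDir>
        </build>
        <run>
          <ports>
            <port>8080:8080</port>
          </ports>
          <links>
            <link>kafka</link>
          </links>
        </run>
      </image>
    </images>
  </configuration>
</plugin>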

    1. Clone the project (https://github.com/dkbalachandar/kafka-message-sender)
    2. Go into the kafka-message-sender folder
    3. Run ‘mvn clean install’
    4. Run ‘mvn docker:start’. Then run ‘docker ps’ and make sure that two containers are running; their names are kafka and kafka-rest
    5. Access http://localhost:8080/api/kafka/send?msg=test and confirm that you see ‘message has been sent’ in the browser
    6. Run the below command and make sure that whatever message you sent is available in Kafka [Kafka Command Line Consumer], or consume it via a Flume agent [Kafka Flume Agent Consumer]
docker exec -it kafka /opt/kafka_2.11-0.8.2.1/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning

Java 8 + Cobertura maven plugin – net/sourceforge/cobertura/coveragedata/TouchCollector – No class def found error

I got the below stack trace while upgrading the Java version from 1.7 to 1.8 with the Cobertura Maven plugin version 2.7.

 

java.lang.NoClassDefFoundError: net/sourceforge/cobertura/coveragedata/TouchCollector
    at [org.package.ClassName].__cobertura_init([ClassName].java)
    at [org.package.ClassName].([ClassName].java)
    at [org.package.ClassName]Test.[method]([ClassName]Test.java:113)
Caused by: java.lang.ClassNotFoundException: net.sourceforge.cobertura.coveragedata.TouchCollector

 

I tried various options but nothing worked. I finally solved it by following the workaround given in the link below:

http://www.befreeman.com/2014/09/getting-cobertura-code-coverage-with.html

If you have used lambda expressions, it’s better to use a code coverage tool other than Cobertura.

Refer to my other post, How to integrate JaCoCo Code coverage tool with Maven, to learn how to use the JaCoCo code coverage tool.

Copy File/Directory with Maven resource plugin

We can use the Maven resources plugin to copy files and directories to any folder/path. Please refer to the below snippet from pom.xml, which copies the config files from ${project.build.directory}/config to the /opt/config folder during the package phase, so running ‘mvn package’ performs the copy.


<plugin>
  <artifactId>maven-resources-plugin</artifactId>
  <version>2.7</version>
  <executions>
    <execution>
      <id>copy-resources</id>
      <phase>package</phase>
      <goals>
        <goal>copy-resources</goal>
      </goals>
      <configuration>
        <outputDirectory>/opt/config</outputDirectory>
        <resources>
          <resource>
            <directory>${project.build.directory}/config</directory>
            <filtering>false</filtering>
          </resource>
        </resources>
      </configuration>
    </execution>
  </executions>
</plugin>

Create a Fat JAR

To create a fat JAR that contains all the dependent classes and JARs, use one of the approaches below.

Maven:

Use the maven-assembly-plugin.
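A minimal sketch using the built-in jar-with-dependencies descriptor is below; the main class is a placeholder:

<plugin>
  <artifactId>maven-assembly-plugin</artifactId>
  <configuration>
    <descriptorRefs>
      <!-- Built-in descriptor that bundles all dependencies into the JAR -->
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
    <archive>
      <manifest>
        <!-- Hypothetical main class; replace with your own -->
        <mainClass>com.example.Main</mainClass>
      </manifest>
    </archive>
  </configuration>
  <executions>
    <execution>
      <id>make-assembly</id>
      <phase>package</phase>
      <goals>
        <goal>single</goal>
      </goals>
    </execution>
  </executions>
</plugin>

Run ‘mvn package’ and the fat JAR will be created under target/ with a jar-with-dependencies suffix.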

 

SBT:

Add the sbt-assembly plugin to plugins.sbt as below:


addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.1")

Change the version appropriately and run ‘sbt assembly’ to create the fat JAR.
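If you need to control the output, a couple of optional build.sbt settings are sketched below; the key syntax matches sbt-assembly 0.14.x, and the values are placeholders:

// Name of the generated fat JAR
assemblyJarName in assembly := "my-app.jar"

// Hypothetical main class for the JAR manifest
mainClass in assembly := Some("com.example.Main")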