Wednesday, July 20, 2016

JAXB and Log4j XML Configuration Files

Both Log4j 1.x and Log4j 2.x support use of XML files to specify logging configuration. This post looks into some of the nuances and subtleties associated with using JAXB to work with these XML configuration files via Java classes. The examples in this post are based on Apache Log4j 1.2.17, Apache Log4j 2.6.2, and Java 1.8.0_73 with JAXB xjc 2.2.8-b130911.1802.

Log4j 1.x : log4j.dtd

Log4j 1.x's XML grammar is defined by a DTD instead of a W3C XML Schema. Fortunately, the JAXB implementation that comes with the JDK provides an "experimental,unsupported" option for using DTDs as the input from which Java classes are generated. The following command can be used to run the xjc command-line tool against the log4j.dtd.

    xjc -p dustin.examples.l4j1 -d src -dtd log4j.dtd

The next screen snapshot demonstrates this.

Running the command described above and demonstrated in the screen snapshot leads to Java classes being generated in a Java package in the src directory called dustin.examples.l4j1 that allow for unmarshalling from log4j.dtd-compliant XML and for marshalling to log4j.dtd-compliant XML.

Log4j 2.x : Log4j-config.xsd

Log4j 2.x's XML configuration can be either "concise" or "strict". I need to use "strict" in this post because that is the form whose grammar is defined by the W3C XML Schema file Log4j-config.xsd, and JAXB needs a schema from which to generate Java classes. The following command can be run against this XML Schema to generate Java classes representing Log4j2 strict XML.

    xjc -p dustin.examples.l4j2 -d src Log4j-config.xsd -b l4j2.jxb

Running the above command leads to Java classes being generated in a Java package in the src directory called dustin.examples.l4j2 that allow for unmarshalling from Log4j-config.xsd-compliant XML and for marshalling to Log4j-config.xsd-compliant XML.

In the previous example, I included a JAXB binding file with the option -b followed by the name of the binding file (-b l4j2.jxb). This binding was needed to avoid an error that prevented xjc from generating Log4j 2.x-compliant Java classes with the error message, "Property "Value" is already defined. Use <jaxb:property> to resolve this conflict." This issue and how to resolve it are discussed in A Brit in Bermuda's post Property "Value" is already defined. Use <jaxb:property> to resolve this conflict. The source for the JAXB binding file I used here is shown next.

l4j2.jxb

<jxb:bindings version="2.0"
              xmlns:jxb="http://java.sun.com/xml/ns/jaxb"
              xmlns:xsd="http://www.w3.org/2001/XMLSchema">
   <jxb:bindings schemaLocation="Log4j-config.xsd" node="/xsd:schema">
      <jxb:bindings node="//xsd:complexType[@name='KeyValuePairType']">
         <jxb:bindings node=".//xsd:attribute[@name='value']">
            <jxb:property name="pairValue"/>
         </jxb:bindings>
      </jxb:bindings>
   </jxb:bindings>
</jxb:bindings>

The JAXB binding file just shown allows xjc to successfully parse the XSD and generate the Java classes. The one small price to pay (besides writing and referencing the binding file) is that the "value" attribute of the KeyValuePairType will need to be accessed in the Java class as a field named pairValue instead of value.

Unmarshalling Log4j 1.x XML

A potential use case for working with JAXB-generated classes for Log4j 1.x's log4j.dtd and Log4j 2.x's Log4j-config.xsd is conversion of Log4j 1.x XML configuration files to Log4j 2.x "strict" XML configuration files. In this situation, one would need to unmarshall Log4j 1.x log4j.dtd-compliant XML and marshall Log4j 2.x Log4j-config.xsd-compliant XML.

The following code listing demonstrates how the Log4j 1.x XML might be unmarshalled using the previously generated JAXB classes.

   /**
    * Extract the contents of the Log4j 1.x XML configuration file
    * with the provided path/name.
    *
    * @param log4j1XmlFileName Path/name of Log4j 1.x XML config file.
    * @return Contents of Log4j 1.x configuration file.
    * @throws RuntimeException Thrown if exception occurs that prevents
    *    extracting contents from XML with provided name.
    */
   public Log4JConfiguration readLog4j1Config(final String log4j1XmlFileName)
      throws RuntimeException
   {
      Log4JConfiguration config;
      try
      {
         final File inputFile = new File(log4j1XmlFileName);
         if (!inputFile.isFile())
         {
            throw new RuntimeException(log4j1XmlFileName + " is NOT a parseable file.");
         }

         final SAXParserFactory spf = SAXParserFactory.newInstance();
         final SAXParser sp = spf.newSAXParser();
         final XMLReader xr = sp.getXMLReader();
         
         final JAXBContext jaxbContext = JAXBContext.newInstance("dustin.examples.l4j1");
         final Unmarshaller unmarshaller = jaxbContext.createUnmarshaller();
         final UnmarshallerHandler unmarshallerHandler = unmarshaller.getUnmarshallerHandler();
         xr.setContentHandler(unmarshallerHandler);

         // try-with-resources ensures the file stream is closed after parsing.
         try (final FileInputStream xmlStream = new FileInputStream(log4j1XmlFileName))
         {
            xr.parse(new InputSource(xmlStream));
         }

         final Object unmarshalledObject = unmarshallerHandler.getResult();
         config = (Log4JConfiguration) unmarshalledObject;
      }
      catch (JAXBException | ParserConfigurationException | SAXException | IOException exception)
      {
         throw new RuntimeException(
            "Unable to read from file " + log4j1XmlFileName + " - " + exception,
            exception);
      }
      return config;
   }

Unmarshalling this Log4j 1.x XML was a bit trickier than some XML unmarshalling because of the nature of log4j.dtd's namespace treatment. The approach for dealing with this wrinkle (parsing with a SAX XMLReader and handing the events to JAXB's UnmarshallerHandler) is described in Gik's Jaxb UnMarshall without namespace and in Deepa S's How to instruct JAXB to ignore Namespaces. Using this approach helped avoid the error message:

UnmarshalException: unexpected element (uri:"http://jakarta.apache.org/log4j/", local:"configuration"). Expected elements ...
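
As a minimal, JDK-only sketch of why routing the parse through a SAXParserFactory sidesteps that error: a factory that is not namespace-aware (the default) reports the root element with an empty namespace URI, while a namespace-aware one reports the URI named in the error message. The class name NamespaceAwarenessDemo and the one-line sample XML are illustrative assumptions and are not part of the code listing above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Illustrative only: shows what a SAX handler sees for a namespaced root element. */
final class NamespaceAwarenessDemo
{
   // Tiny stand-in for a Log4j 1.x configuration (no DOCTYPE, to stay self-contained).
   static final String SAMPLE_XML =
      "<log4j:configuration xmlns:log4j=\"http://jakarta.apache.org/log4j/\"/>";

   /** Returns the namespace URI that the parser reports for the root element. */
   static String rootNamespaceUri(final boolean namespaceAware)
   {
      final String[] uriHolder = new String[1];
      try
      {
         final SAXParserFactory spf = SAXParserFactory.newInstance();
         spf.setNamespaceAware(namespaceAware);
         final SAXParser parser = spf.newSAXParser();
         parser.parse(
            new ByteArrayInputStream(SAMPLE_XML.getBytes(StandardCharsets.UTF_8)),
            new DefaultHandler()
            {
               @Override
               public void startElement(
                  final String uri, final String localName,
                  final String qName, final Attributes attributes)
               {
                  // Only the first (root) element is of interest here.
                  if (uriHolder[0] == null)
                  {
                     uriHolder[0] = uri;
                  }
               }
            });
      }
      catch (Exception exception)
      {
         throw new IllegalStateException("Unable to parse sample XML", exception);
      }
      return uriHolder[0];
   }

   public static void main(final String[] arguments)
   {
      System.out.println("Not namespace-aware: '" + rootNamespaceUri(false) + "'");
      System.out.println("Namespace-aware:     '" + rootNamespaceUri(true) + "'");
   }
}
```

Because the unmarshalling listing above never calls setNamespaceAware(true), the UnmarshallerHandler receives elements without the Log4j namespace URI.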

To unmarshall the Log4j 1.x XML that in my case references log4j.dtd on the filesystem, I needed to provide a special Java system property to the Java launcher when running this code with Java 8. Specifically, I needed to specify
     -Djavax.xml.accessExternalDTD=all
to avoid the error message, "Failed to read external DTD because 'file' access is not allowed due to restriction set by the accessExternalDTD property." Additional details on this can be found at NetBeans's FaqWSDLExternalSchema Wiki page.
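
The same property can also be set from code rather than on the command line. The following is a hedged sketch: the class and method names are hypothetical, and it assumes the property is set before any parser or JAXB unmarshaller is created, which I have not verified against every JAXB implementation.

```java
/** Illustrative only: sets the JAXP external-DTD access property from code. */
final class ExternalDtdAccess
{
   /** Name of the system property quoted in the error message. */
   static final String ACCESS_EXTERNAL_DTD = "javax.xml.accessExternalDTD";

   /**
    * Allows external DTDs to be read over any protocol.
    *
    * @return Previous value of the property (null if it was not set).
    */
   static String allowAll()
   {
      return System.setProperty(ACCESS_EXTERNAL_DTD, "all");
   }

   public static void main(final String[] arguments)
   {
      final String previous = allowAll();
      System.out.println(
         ACCESS_EXTERNAL_DTD + " changed from " + previous
            + " to " + System.getProperty(ACCESS_EXTERNAL_DTD));
      // Unmarshalling code (such as readLog4j1Config) would be called after this point.
   }
}
```

The property value is a list of permitted protocols, so a narrower value such as "file" may be enough when the DTD lives on the local filesystem.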

Marshalling Log4j 2.x XML

Marshalling Log4j 2.x XML using the JAXB-generated Java classes is fairly straightforward as demonstrated in the following example code:

   /**
    * Write Log4j 2.x "strict" XML configuration to file with
    * provided name based on provided content.
    *
    * @param log4j2Configuration Content to be written to Log4j 2.x
    *    XML configuration file.
    * @param log4j2XmlFile File to which Log4j 2.x "strict" XML
    *    configuration should be written.
    */
   public void writeStrictLog4j2Config(
      final ConfigurationType log4j2Configuration,
      final String log4j2XmlFile)
   {
      try (final OutputStream os = new FileOutputStream(log4j2XmlFile))
      {
         final JAXBContext jc = JAXBContext.newInstance("dustin.examples.l4j2");
         final Marshaller marshaller = jc.createMarshaller();
         marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
         marshaller.marshal(new ObjectFactory().createConfiguration(log4j2Configuration), os);
      }
      catch (JAXBException | IOException exception)
      {
         throw new RuntimeException(
            "Unable to write Log4j 2.x XML configuration - " + exception,
            exception);
      }
   }

There is one subtlety in this marshalling case that may not be obvious in the just-shown code listing. The classes that JAXB's xjc generated from the Log4j-config.xsd lack any class with @XmlRootElement. The JAXB classes that were generated from the Log4j 1.x log4j.dtd did include classes with this @XmlRootElement annotation. Because the Log4j 2.x Log4j-config.xsd-based Java classes don't have this annotation, the following error occurs when trying to marshal the ConfigurationType instance directly:

MarshalException - with linked exception: [com.sun.istack.internal.SAXException2: unable to marshal type "dustin.examples.l4j2.ConfigurationType" as an element because it is missing an @XmlRootElement annotation]

To avoid this error, I instead (in the marshaller.marshal call of the above code listing) marshalled the result of invoking new ObjectFactory().createConfiguration(ConfigurationType) on the passed-in ConfigurationType instance. That factory method wraps the instance in a JAXBElement, which can be marshalled successfully.

Conclusion

JAXB can be used to generate Java classes from Log4j 1.x's log4j.dtd and from Log4j 2.x's Log4j-config.xsd, but there are some subtleties and nuances involved in successfully generating these Java classes and in using the generated Java classes to marshal and unmarshal XML.

Friday, July 15, 2016

Apache PDFBox Command-line Tools: No Java Coding Required

In the blog post Apache PDFBox 2, I demonstrated use of Apache PDFBox 2 as a library called from within Java code to manipulate PDFs. It turns out that Apache PDFBox 2 also provides command-line tools that can be used directly from the command-line as-is with no additional Java coding required. There are several command-line tools available and I will demonstrate some of them in this post.

The PDFBox command-line tools are executed by taking advantage of PDFBox's executable JAR (java -jar with Main-Class: org.apache.pdfbox.tools.PDFBox). This is the JAR with "app" in its name and, for this particular blog post, is pdfbox-app-2.0.2.jar. The general format used to invoke these tools is java -jar pdfbox-app-2.0.2.jar <Command> [options] [files].

When the executable JAR is executed without arguments, a form of help is provided that lists the available commands. This is shown in the next screen snapshot.

This screen snapshot shows that this version of Apache PDFBox (2.0.2) advertises support for the "Possible commands" of ConvertColorspace, Decrypt, Encrypt, ExtractText, ExtractImages, OverlayPDF, PrintPDF, PDFDebugger, PDFMerger, PDFReader, PDFSplit, PDFToImage, TextToPDF, and WriteDecodedDoc.

Extracting Text: "ExtractText"

The first command-line tool I am looking at extracts text from a PDF. I demonstrated using PDFBox to do this from Java code in my previous blog post. Here, I will use PDFBox to do the same thing directly from the command-line with no Java source code in sight. The following operation extracts the text from the PDF Scala by Example. In my previous post, the Java code accessed this PDF online and used PDFBox to extract text from it. In this case, I've downloaded Scala by Example and am running the PDFBox ExtractText command-line tool against that downloaded PDF stored on my hard drive at C:\pdf\ScalaByExample.pdf.

The command to extract text from the PDF from the command-line using PDFBox is: java -jar pdfbox-app-2.0.2.jar ExtractText C:\pdf\ScalaByExample.pdf. The next two screen snapshots demonstrate running this command and the file it generates. From these screen snapshots, we can see that the text file generated by this command by default has the same name as the source PDF but with a .txt extension. This command supports multiple options including the ability to specify the name of the text file by placing that name after the source PDF's file name and the ability to write the text to the console instead of to a file via the -console flag (from which the output could be redirected). Examples of how to specify a custom text file name and how to direct text to console instead of file are shown next.

  • Explicitly Specifying Text File Name:
    • java -jar pdfbox-app-2.0.2.jar ExtractText C:\pdf\ScalaByExample.pdf C:\pdf\dustin.txt
  • Rendering Text on Console
    • java -jar pdfbox-app-2.0.2.jar ExtractText -console C:\pdf\ScalaByExample.pdf
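
For anyone wanting to drive these same invocations from a script-like Java program, the argument lists above can be assembled as in the following sketch. The class ExtractTextCommand and its build method are hypothetical helpers invented for illustration; only the java -jar arguments themselves come from the commands shown above.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: assembles the ExtractText command lines shown above. */
final class ExtractTextCommand
{
   /**
    * Builds the argument list for PDFBox's ExtractText tool.
    *
    * @param appJar Path/name of the PDFBox "app" JAR.
    * @param pdfFile Source PDF.
    * @param textFile Optional target text file; null to use the default name.
    * @param console true to request the -console flag.
    * @return Command and arguments in the order used in this post.
    */
   static List<String> build(
      final String appJar, final String pdfFile,
      final String textFile, final boolean console)
   {
      final List<String> command = new ArrayList<>();
      command.add("java");
      command.add("-jar");
      command.add(appJar);
      command.add("ExtractText");
      if (console)
      {
         command.add("-console");
      }
      command.add(pdfFile);
      if (textFile != null)
      {
         command.add(textFile);
      }
      return command;
   }

   public static void main(final String[] arguments)
   {
      final List<String> command = build(
         "pdfbox-app-2.0.2.jar", "C:\\pdf\\ScalaByExample.pdf", "C:\\pdf\\dustin.txt", false);
      System.out.println(String.join(" ", command));
   }
}
```

The resulting list could be handed to new ProcessBuilder(command).inheritIO().start() to actually run the tool, assuming java and the JAR can be found from the working directory.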

PDF from Text: "TextToPDF"

When it is desirable to go the other way (start with text as the source and generate a PDF), the command TextToPDF is appropriate. To demonstrate this, I'm using a source text file called doi.txt that contains a portion of the United States Declaration of Independence:

The unanimous Declaration of the thirteen united States of America,

When in the Course of human events, it becomes necessary for one people to dissolve the political bands which have connected them with another, and to assume among the powers of the earth, the separate and equal station to which the Laws of Nature and of Nature's God entitle them, a decent respect to the opinions of mankind requires that they should declare the causes which impel them to the separation.

We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness

With a sample text file in place at C:\pdf\doi.txt, PDFBox's TextToPDF can be run against it. The command is java -jar pdfbox-app-2.0.2.jar TextToPDF C:\pdf\doi.pdf C:\pdf\doi.txt (note that the target PDF is listed as the first argument and the source text file is listed as the second argument). The next three screen snapshots demonstrate running this command and the successful generation of a PDF from the source text file.

Extracting Images from PDFs: "ExtractImages"

The PDFBox command-line tool ExtractImages makes it as easy to extract images from a PDF as the command-line tool "ExtractText" made it to extract text from a PDF. My demonstration of this capability will extract four images from a PDF I created with images from the Black Hills (and surrounding area) of South Dakota that is called BlackHillsSouthDakotaAndSurroundingSights.pdf. A screen snapshot of this PDF is shown next.

PDFBox can be used to extract the four photographs in this PDF with the command java -jar pdfbox-app-2.0.2.jar ExtractImages C:\pdf\BlackHillsSouthDakotaAndSurroundingSights.pdf as demonstrated in the next screen snapshot.

Running this command as shown in the last screen snapshot extracts the four images from the PDF. Each extracted image is named after the source PDF with a hyphen and counting integer appended to the end of the name. The generated images are also JPEG files with .jpg extensions. In this case, the names of the generated files are thus BlackHillsSouthDakotaAndSurroundingSights-1.jpg, BlackHillsSouthDakotaAndSurroundingSights-2.jpg, BlackHillsSouthDakotaAndSurroundingSights-3.jpg, and BlackHillsSouthDakotaAndSurroundingSights-4.jpg and each is displayed next in the form extracted directly from the PDF.
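
A script that post-processes the extracted images needs to know those file names. The sketch below simply mirrors the naming pattern observed in this example (base name, hyphen, counter, .jpg); the helper class is hypothetical, and other PDFs may well yield other extensions, so treat the ".jpg" suffix as an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: reproduces the image file names observed in this post. */
final class ExtractedImageNames
{
   /**
    * Builds names of the form base-1.jpg, base-2.jpg, and so on.
    *
    * @param pdfFileName Name of the source PDF (such as "Example.pdf").
    * @param imageCount Number of images expected to be extracted.
    * @return Expected image file names in extraction order.
    */
   static List<String> expectedNames(final String pdfFileName, final int imageCount)
   {
      final int dotIndex = pdfFileName.lastIndexOf('.');
      final String baseName = dotIndex > 0 ? pdfFileName.substring(0, dotIndex) : pdfFileName;
      final List<String> names = new ArrayList<>();
      for (int counter = 1; counter <= imageCount; counter++)
      {
         names.add(baseName + "-" + counter + ".jpg");
      }
      return names;
   }

   public static void main(final String[] arguments)
   {
      for (final String name : expectedNames("BlackHillsSouthDakotaAndSurroundingSights.pdf", 4))
      {
         System.out.println(name);
      }
   }
}
```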

BlackHillsSouthDakotaAndSurroundingSights-1.jpg BlackHillsSouthDakotaAndSurroundingSights-2.jpg
BlackHillsSouthDakotaAndSurroundingSights-3.jpg BlackHillsSouthDakotaAndSurroundingSights-4.jpg

Encrypting PDF: "Encrypt"

Apache PDFBox makes it easy to encrypt a PDF. For example, I can encrypt the PDF used in the "ExtractImages" example with the following command: java -jar pdfbox-app-2.0.2.jar Encrypt -O DustinWasHere -U DustinWasHere C:\pdf\BlackHillsSouthDakotaAndSurroundingSights.pdf as shown in the next screen snapshot:

Once I've run the encrypt command, I need a password to open this PDF in Adobe Reader:

Decrypting PDF: "Decrypt"

It's just as easy to decrypt this PDF with the command java -jar pdfbox-app-2.0.2.jar Decrypt -password DustinWasHere C:\pdf\BlackHillsSouthDakotaAndSurroundingSights.pdf as shown in the next screen snapshot. The image demonstrates that an InvalidPasswordException is thrown when no password is provided (or the wrong password is provided) for decrypting the PDF. It then shows a successful decryption, after which I'm once again able to open the PDF in Adobe Reader without a password.

Merging PDFs: "PDFMerger"

PDFBox allows multiple PDFs to be merged into a single PDF with the "PDFMerger" command. This is demonstrated in the next screen snapshots by merging the two single-page PDFs mentioned earlier (doi.pdf and BlackHillsSouthDakotaAndSurroundingSights.pdf) into a new PDF called third.pdf with the command java -jar pdfbox-app-2.0.2.jar PDFMerger C:\pdf\doi.pdf C:\pdf\BlackHillsSouthDakotaAndSurroundingSights.pdf C:\pdf\third.pdf.

Splitting PDFs: "PDFSplit"

I can split the third.pdf PDF just created with PDFMerger with the command PDFSplit. This is a particularly simple case because the PDF being split is only two pages. The command is demonstrated with the next screen snapshots.

The snapshots demonstrate that the PDFs split out of third.pdf are called third-1.pdf and third-2.pdf.

Conclusion

In this post, I showed several of the command-line utilities that Apache PDFBox makes available out-of-the-box with no Java coding required. There are a few other command-line utilities available that were not demonstrated here. All of these commands are easily used by running the executable "app" JAR provided with a PDFBox distribution. As command-line utilities, these tools enjoy the advantages of command-line tools including being quick to run and able to be included within scripts and other automated tools. Another benefit of these tools is that, because they are implemented in open source, developers can use the source code for these tools to see how to use the PDFBox APIs in their own applications and tools. Apache PDFBox's command-line tools are freely available and easy-to-use PDF manipulation tools that can be used with no extra Java code being written.

Monday, July 4, 2016

Apache PDFBox 2

Apache PDFBox 2 was released earlier this year and Apache PDFBox 2.0.1 and Apache PDFBox 2.0.2 have since been released. Apache PDFBox is open source (Apache License Version 2) and Java-based (and so is easy to use with a wide variety of programming languages including Java, Groovy, Scala, Clojure, Kotlin, and Ceylon). Apache PDFBox can be used by any of these or other JVM-based languages to read, write, and work with PDF documents.

Apache PDFBox 2 introduces numerous bug fixes in addition to completed tasks and some new features. Apache PDFBox 2 now requires Java SE 6 (J2SE 5 was minimum for Apache PDFBox 1.x). There is a migration guide, Migration to PDFBox 2.0.0, that details many differences between PDFBox 1.8 and PDFBox 2.0, including updated dependencies (Bouncy Castle 1.53 and Apache Commons Logging 1.2) and "breaking changes to the library" in PDFBox 2.

PDFBox can be used to create PDFs. The next code listing is adapted from the Apache PDFBox 1.8 example "Create a blank PDF" in the Document Creation "Cookbook" examples. The referenced example explicitly closes the instantiated PDDocument and probably does so for the benefit of those using a version of Java before JDK 7. For users of Java 7 and later, however, try-with-resources is a better option for ensuring that the PDDocument instance is closed, and it is supported because PDDocument implements AutoCloseable.

Creating (Empty) PDF
/**
 * Demonstrate creation of an empty PDF.
 */
private void createEmptyDocument()
{
   try (final PDDocument document = new PDDocument())
   {
      final PDPage emptyPage = new PDPage();
      document.addPage(emptyPage);
      document.save("EmptyPage.pdf");
   }
   catch (IOException ioEx)
   {
      err.println(
         "Exception while trying to create blank document - " + ioEx);
   }
}

The next code listing is adapted from the Apache PDFBox 1.8 example "Hello World using a PDF base font" in the Document Creation "Cookbook" examples. The most significant change in this listing from that 1.8 Cookbook example is the replacement of deprecated methods PDPageContentStream.moveTextPositionByAmount(float, float) and PDPageContentStream.drawString(String) with PDPageContentStream.newLineAtOffset(float, float) and PDPageContentStream.showText(String) respectively.

Creating Simple PDF with Font
/**
 * Create simple, single-page PDF "Hello" document.
 */
private void createHelloDocument()
{
   final PDPage singlePage = new PDPage();
   final PDFont courierBoldFont = PDType1Font.COURIER_BOLD;
   final int fontSize = 12;
   try (final PDDocument document = new PDDocument())
   {
      document.addPage(singlePage);
      final PDPageContentStream contentStream = new PDPageContentStream(document, singlePage);
      contentStream.beginText();
      contentStream.setFont(courierBoldFont, fontSize);
      contentStream.newLineAtOffset(150, 750);
      contentStream.showText("Hello PDFBox");
      contentStream.endText();
      contentStream.close();  // Stream must be closed before saving document.

      document.save("HelloPDFBox.pdf");
   }
   catch (IOException ioEx)
   {
      err.println(
         "Exception while trying to create simple document - " + ioEx);
   }
}

The next code listing demonstrates parsing text from a PDF using Apache PDFBox. This extremely simple implementation parses all of the text into a single String using PDFTextStripper.getText(PDDocument). In most realistic situations, I'd not want all the text from the PDF in a single String and would likely use PDFTextStripper's ability to more narrowly specify which text to parse. It's also worth noting that while this code listing gets the PDF from online (Scala by Example PDF at http://www.scala-lang.org/docu/files/ScalaByExample.pdf), there are numerous constructors for PDDocument that allow one to access PDFs on file systems and via other types of streams.

Parsing Text from Online PDF

/**
 * Parse text from an online PDF.
 */
private void parseOnlinePdfText()
{
   final String address = "http://www.scala-lang.org/docu/files/ScalaByExample.pdf";
   // try-with-resources closes both the network stream and the PDDocument.
   try (final InputStream pdfStream = new URL(address).openStream();
        final PDDocument documentToBeParsed = PDDocument.load(pdfStream))
   {
      final PDFTextStripper stripper = new PDFTextStripper();
      final String pdfText = stripper.getText(documentToBeParsed);
      out.println("Parsed text size is " + pdfText.length() + " characters:");
      out.println(pdfText);
   }
   catch (IOException ioEx)
   {
      err.println(
         "Exception while trying to parse text from PDF at " + address + " - " + ioEx);
   }
}

The JDK 8 Issue

PDFBox 2 exposes an issue in JDK 8 that is filed under Bug JDK-8041125 ("ColorConvertOp filter much slower in JDK 8 compared to JDK7"). The Apache PDFBox "Getting Started" documentation describes the issue, "Due to the change of the java color management module towards 'LittleCMS', users can experience slow performance in color operations." This same "Getting Started" section provides the work-around: "disable LittleCMS in favour of the old KCMS (Kodak Color Management System)."

The bug appears to have been identified and filed by IDR Solutions in conjunction with their commercial Java PDF library JPedal. Their blog post Major change to Color performance in newer Java releases provides more details related to this issue.

The just-mentioned posts and documentation, including Apache PDFBox 2's "Getting Started" section, explicitly demonstrate use of Java system properties to work around the issue by explicitly specifying use of KCMS (which could be removed at any time) instead of the default LittleCMS. As these sources state, one can either provide the system property to the Java launcher [java] with the -D option [-Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider] or specify the property within the executable code itself [System.setProperty("sun.java2d.cmm", "sun.java2d.cmm.kcms.KcmsServiceProvider");].
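
When the property is set in code, where it is set matters: it presumably needs to happen before any color operations run. The sketch below wraps the System.setProperty call quoted above in a hypothetical helper. Restricting it to Java 8 is my own assumption (the cited sources do not prescribe that guard), made because KCMS "could be removed at any time."

```java
/** Illustrative only: requests KCMS via the system property quoted above. */
final class ColorManagementWorkaround
{
   static final String CMM_PROPERTY = "sun.java2d.cmm";
   static final String KCMS_PROVIDER = "sun.java2d.cmm.kcms.KcmsServiceProvider";

   /**
    * Sets the KCMS property only when running on Java 8.
    *
    * @return true if the property was set; false if left untouched.
    */
   static boolean preferKcmsOnJava8()
   {
      final boolean isJava8 =
         "1.8".equals(System.getProperty("java.specification.version"));
      if (isJava8)
      {
         System.setProperty(CMM_PROPERTY, KCMS_PROVIDER);
      }
      return isJava8;
   }

   public static void main(final String[] arguments)
   {
      // Call this first, before any PDF rendering or other color operations.
      final boolean applied = preferKcmsOnJava8();
      System.out.println("KCMS requested: " + applied);
   }
}
```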

It sounds like this issue is not exclusive to version 2 of Apache PDFBox, but is more commonly seen with Apache PDFBox 2 because version 2 uses dependent constructs more frequently and because it's more likely that someone using Java 8 is also using the newer PDFBox.

The change in JDK 8 of the default implementation associated with property sun.java2d.cmm demonstrates a point I tried to make in my recent blog post Observations From A History of Java Backwards Incompatibility. In that post, I concluded, "Beware of and use only with caution any APIs, classes, and tools advertised as experimental or subject to removal in future releases of Java." It turns out that the Java 2D system properties are in this class. The System Properties for Java 2D Technology page provides this background and warning information regarding use of these properties:

This document describes several unsupported properties that you can use to customize how the 2D painting system operates. You might use these properties to improve performance, fix incorrect rendering, or avoid system crashes under certain configurations. ... Warning: Take care when using these properties. Some of them are unsupported for very practical reasons. ... Since these properties have the sole purpose of enabling or disabling implementation-specific behaviors, they are subject to change or removal without notification. Some properties might work only on the exact product releases for which they are documented.

Conclusion

Apache PDFBox 2 is a relatively easy way to manipulate PDF documents in Java. Its liberal Apache 2 license makes it amenable to a very large audience and its open source nature allows developers to see how to use the libraries it uses underneath the covers and adapt it as needed.

Thursday, June 30, 2016

Our Tools (Sometimes) Lie to Us

My bachelor's degree is in Electrical Engineering and when I started looking for my first post-college job, I had to make the decision whether to work in more traditional electrical engineering careers or in computer science-oriented careers. I had been writing code in BASIC since I was a kid, then Borland Turbo Pascal in my middle school and high school years, then more Pascal and C and C++ in my college years. I had a computer science emphasis as part of my EE degree, but the actual degree was in electrical engineering. There were many factors that influenced my decision to take the computer science fork in the road instead of the electrical engineering fork and one of these was an experience I had in an electrical engineering lab in which a tool lied to me. This post is about how our software development tools sometimes deceive us, though I'm really addressing cases where developers may share some culpability versus software that is intentionally deceptive.

Several of my electrical engineering classes had associated 1-credit-hour labs that often required 10-20 hours of my week to complete. I had one circuits-related lab that was giving me particular trouble and I spent hours trying to get my circuit to work as expected so that I could pass off the lab assignment. After hours of frustration and increasing doubt in my own understanding of the project and associated topics, the teaching assistant realized that the oscilloscope I had been using was faulty. We connected a properly working oscilloscope to the circuit and it immediately passed. I left relieved, but also left with a stronger impression than ever before that I would prefer to make my career in software development rather than in working with circuits or hardware.

It turns out, of course, that our tools in software development sometimes fail us just as the oscilloscope failed me in that lab experience that has perhaps forever traumatized me. I generally have more confidence in my own abilities in software development than I did in the circuit topic being discussed in that lab and many years of software development experience have contributed to that confidence that I did not have after only a couple semesters of study in the circuits topic. I am using the remainder of this post to provide some examples of where software development tools have failed me or those around me, though I emphasize that many of these are as much or more developer issues than tools issues. I think it's better to just blame the tools.

The Case of the Missing Float

I was working on a project in the early days of Oracle's PL/SQL Server Pages (PSP) product and our PSP-based pages were not displaying our primary key column in some of our Oracle 8i Database tables. After working with Oracle Support and having Oracle Support put us in touch with a developer of PSP, it was realized that this early stage of the tool did not expect the PL/SQL type FLOAT. Any column of that type was not rendered in the PSP presentation of the table structure. As I recall, no types other than FLOAT were affected in this way. I don't remember how much time we spent before we realized the specific cause of this issue, but there certainly was some time lost and some questioning of other issues surrounding our DDL statements and database constructs before we realized the issue was with the tool.

The Cases of Database Command-line Tools Misrepresenting Their Databases

When using command-line tools such as psql for PostgreSQL or SQL*Plus for Oracle database, one can occasionally be deceived if not careful. There are multiple ways in which this can happen. These "deceptions" tend not to be issues with the tools themselves as much as misconceptions upon the part of the users of these tools.

Based on a particular user's settings, these command-line tools often don't differentiate null from empty string. Both command-line tools referenced allow users to have a string or character substituted for null in query results to avoid this potential deception. These command-line tools may also not show full precision of some numeric values, but this is again controlled by user settings.

A really easy deception trap for users to fall into when using command-line tools to connect to a database is that of blaming slowness in the command-line tool or its enclosing terminal on the database. For example, if a table with many large (lots of columns) rows is queried, the scrolling results from that query may take quite a bit of time as the user watches them scroll across the screen. It could be easy to blame the query for being slow and taking however long it took for the results to be written to the terminal, but in reality it can be shown that the query is much quicker when its results are spooled to a file instead of the terminal or, better yet, when its timing is measured using the database command-line interface's query performance measuring tools.

One other deception I've seen related to database command-line client tools (or really any database client tool) is when a developer thinks his or her software is not working properly because they cannot see the changes being made to the database in their client. In some cases, this is because the software being tested or debugged has not been allowed to commit its transaction yet and so, in their particular isolation level, the developer should not be able to see the not-yet-committed changes being made with a different database session.

The Case of the Java IDE Classpath Deception

Most Java developers prefer doing the bulk of their development in an IDE. These powerful IDEs make many of us much more productive, but these tools have been known to lie to developers. Perhaps the most common deception in a Java IDE occurs when the IDE maintains a different classpath than the project's command-line-based build (for example, with Gradle, Maven, or Ant). In this case, it's easy for the IDE to report successfully building code that doesn't build from the command-line or vice versa.
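
One way to investigate a suspected classpath mismatch at runtime is to have the code report where a class was actually loaded from and compare that output between an IDE run and a command-line run. The following is a minimal sketch with a hypothetical class name; it uses only standard JDK calls.

```java
import java.security.CodeSource;

/** Illustrative only: reports where classes were loaded from. */
final class ClasspathReport
{
   /**
    * Describes the location (JAR or directory) from which a class was loaded.
    *
    * @param type Class whose origin is of interest.
    * @return Human-readable description of the class's origin.
    */
   static String locationOf(final Class<?> type)
   {
      final CodeSource source = type.getProtectionDomain().getCodeSource();
      final Object location = (source != null && source.getLocation() != null)
         ? source.getLocation()
         : "(bootstrap or unknown location)";
      return type.getName() + " loaded from " + location;
   }

   public static void main(final String[] arguments)
   {
      System.out.println("java.class.path = " + System.getProperty("java.class.path"));
      System.out.println(locationOf(ClasspathReport.class));
      System.out.println(locationOf(String.class));
   }
}
```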

The Case of the Java IDE Compiler Version Deception

Another potential deception associated with use of a Java-based IDE occurs when the IDE uses its own version of a compiler that is not the same as the compiler version used by the command-line build. I alluded to this situation in the blog post NetBeans 7.1's Internal Compiler and JDK 6 Respecting Return Type for Method Overloading.

The Many Cases of Slow-to-Update Tool Presentations

During my software development career, I've been burned multiple times by a tool that is slow to update its presentation or report. This has led me down a wrong road as I investigated a certain issue because I thought a tool was telling me something, but it really hadn't gotten around to telling me that yet. If I'm lucky, I'll eventually see a case presented by the tool where I know the data being shown me by the tool cannot be correct and then, on looking into it further, I realize that I'm still seeing output data from a previous run. When I run into this issue with a particular tool, I like to make sure that I have some field or indicator in the tool's report that will have to be updated for each run of that tool so that I know if the data has been refreshed or not.

A very similar issue can occur with tools that cache results. In such cases, a developer may change things without any noticeable effect in the cached presentation and therefore think his or her changes are ineffectual or won't impact anything. In these cases, the developer is best served by ensuring that data refreshes occur, even if it means forcibly causing the refresh.

Conclusion

This post has looked at situations in which we might want to blame the tools and suggest that they have led us astray. While this is true when the tool is built to intentionally lie or when the tool is broken or immature (as was the case in my first example), most of my examples are of situations in which the tool was actually doing its advertised job and it was developer misuse or misunderstanding of how to use the tool or the tool's limitations that was the real issue.

Software development tools have come a long way and make our jobs easier and make us more productive. However, when not used appropriately or used too carelessly, they can sometimes deceive us or at least contribute to our making some erroneous decisions based on what we think the tools are telling us. The best approach for addressing these potential deceptions by our tools is to understand our tools well, understand how our tools perform their job, and understand our tools' limitations.

Thursday, June 23, 2016

Lombok, AutoValue, and Immutables

I liked Brandon's suggestion of a blog post comparing Project Lombok, AutoValue, and Immutables and this is a post that attempts to do that. I have covered Project Lombok, AutoValue, and Immutables individually with brief overviews, but this post is different in that it highlights the similarities and differences between them.

Lombok, AutoValue, and Immutables share quite a bit in common and I try to summarize these similarities in this single descriptive sentence: Lombok, AutoValue, and Immutables use annotation processing to generate boilerplate code for common operations used by value object classes. The remainder of this post looks at these similarities in more detail and contrasts the three approaches.

Code Generation

Lombok, AutoValue, and Immutables are all designed to generate verbose boilerplate code from concise code representations that focus on the high-level business logic and leave low-level details of implementation to the code generation. Common object methods such as toString(), equals(Object), and hashCode() are important but need to be written correctly. It is easy to make mistakes with these and even when they are written correctly originally (including via IDE generation), they can be neglected when other changes are made to the class that impact them.
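
To illustrate the kind of neglect described above, here is a small hand-written sketch (not generated by any of the three tools) in which a department field was added after equals(Object) and hashCode() were written, and neither method was updated. The Employee class and its fields are hypothetical names used only for this illustration.

```java
import java.util.Objects;

/** Hypothetical hand-written class whose equals/hashCode fell behind its fields. */
final class Employee
{
   private final String lastName;
   private final String firstName;
   private final String department;  // added later; equals/hashCode were never updated

   Employee(final String lastName, final String firstName, final String department)
   {
      this.lastName = lastName;
      this.firstName = firstName;
      this.department = department;
   }

   String department()
   {
      return department;
   }

   @Override
   public boolean equals(final Object other)
   {
      if (this == other)
      {
         return true;
      }
      if (!(other instanceof Employee))
      {
         return false;
      }
      final Employee that = (Employee) other;
      // Bug: the newer 'department' field is silently ignored.
      return lastName.equals(that.lastName) && firstName.equals(that.firstName);
   }

   @Override
   public int hashCode()
   {
      return Objects.hash(lastName, firstName);  // also ignores 'department'
   }
}
```

With this class, two employees who differ only by department are reported as equal, which is exactly the sort of quiet error that regenerating these methods on every compilation is meant to avoid.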

Value Objects

Lombok, AutoValue, and Immutables each support generation of "value objects." While AutoValue strictly enforces generation of value objects, Immutables allows generated objects to be modifiable if @Value.Modifiable is specified, and Lombok supports multiple levels of modification in its generated classes with annotations such as @Setter and @Data.

Beyond Value Objects

AutoValue is focused on generation of value objects and supports generation of fields, constructor/builder, concrete accessor methods, and implementations of common methods equals(Object), hashCode(), and toString() based on the abstract methods in the template class.

Immutables provides capability similar to that provided by AutoValue and adds the ability to generate modifiable classes with @Value.Modifiable. Immutables also offers additional features that include:

Lombok provides value class generation capability similar to AutoValue with the @Value annotation and provides the ability to generate modifiable classes with the @Data annotation. Lombok also offers additional features that include:

Based on Annotations Processing

Lombok, AutoValue, and Immutables all generate more verbose boilerplate code from more concise template code via annotations processing. Each includes a javax.annotation.processing.Processor defined in its JAR file's META-INF/services area as part of the standard annotation processor discovery process that is part of the javac compiler.
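
For readers who have not written an annotation processor, the following is a minimal sketch of what such a processor can look like. It is not taken from Lombok, AutoValue, or Immutables. The annotation name dustin.examples.sketch.Marker and the class names are hypothetical, and the processor only prints a note for each annotated element rather than generating code. In a real project the processor would typically be a public top-level class in its own source file. It is nested here only so that the sketch fits in one listing.

```java
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.Diagnostic;

/** Holder class so that this hypothetical sketch fits in a single listing. */
class ProcessorSketch
{
   /**
    * Hypothetical processor for a hypothetical annotation. To be discovered, its
    * binary class name would be listed on a line of its own in the JAR file
    * META-INF/services/javax.annotation.processing.Processor.
    */
   @SupportedAnnotationTypes("dustin.examples.sketch.Marker")
   public static class MarkerProcessor extends AbstractProcessor
   {
      @Override
      public SourceVersion getSupportedSourceVersion()
      {
         return SourceVersion.latestSupported();
      }

      @Override
      public boolean process(
         final Set<? extends TypeElement> annotations,
         final RoundEnvironment roundEnvironment)
      {
         for (final TypeElement annotation : annotations)
         {
            for (final Element element : roundEnvironment.getElementsAnnotatedWith(annotation))
            {
               // A code-generating processor would write a new source file here instead.
               processingEnv.getMessager().printMessage(
                  Diagnostic.Kind.NOTE,
                  "Found annotated element: " + element.getSimpleName());
            }
         }
         return false;  // do not claim the annotation; other processors may see it too
      }
   }
}
```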

Not All Annotation Processing is the Same

Although Lombok, AutoValue, and Immutables all employ annotation processing via javac, the particulars of how Lombok uses annotation processing are different than how AutoValue and Immutables do it. AutoValue and Immutables use annotation processing in the more conventional sense and generate source from source. The class source code generated by AutoValue and Immutables is not named the same as the template class and, in fact, extends the template class. AutoValue and Immutables both read the template class and generate an entirely new class in Java source with its own name that has all the generated methods and fields. This avoids any name collisions with the template class and makes it fairly easy to mix the template class source code and generated class source code in the same IDE project because they are in fact different classes.

AutoValue's Generation via Annotation Processing

Immutables's Generation via Annotation Processing

Lombok approaches generation via annotations processing differently than AutoValue and Immutables do. Lombok generates a compiled .class file with the same class name as the "template" source code and adds the generated methods to this compiled version. A developer only sees the concise template code when looking at .java files, but sees the compiled .class file with methods not present in the source code when looking at the .class files. The generation by Lombok is not of another source file but rather is of an enhanced compiled version of the original source. There is a delombok option one can use with Lombok to see what the generated source behind the enhanced .class file looks like, but the project is really designed to go straight from concise template source to enhanced compiled class without need or use for the intermediate enhanced source file. The delombok option can be used to see what the generated source would look like or, perhaps more importantly, can be used in situations where it is confusing to the tools to have inconsistent source (concise template .java file) and generated class (enhanced .class file of same name) in the same space.

Lombok's Generation via Annotation Processing

Lombok's approach to annotation processing is less conventional than the approach AutoValue and Immutables employ and some, including Lombok's creator, have called the approach "a hack." A good explanation of the Lombok "trick" or "hack" is contained in neildo's post Project Lombok - Trick Explained, which cites the also informative OpenJDK Compilation Overview.

The main reasons for the controversy surrounding Lombok's approach are closely related and are that it uses non-standard APIs and, because of this, it can be difficult to integrate well with IDEs and other tools that perform their own compilation (such as javadoc). Because AutoValue and Immutables naturally generate source code with new class names, any traditional tools and IDEs can work with the generated source alongside the template source without any major issues.

Summary of Similarities and Differences

Characteristic | Project Lombok | AutoValue | Immutables | Comments
Covered Version | 1.16.8 (2016) | 1.2 (2016) | 2.2.8 (2016) | Version used for this post
My Overview | 2010 | 2016 | 2016 |
Year Originated | 2009 | 2014 | 2014 |
License | MIT (also) | Apache 2 | Apache 2 | All open source
Minimum Java | 1.6 | 1.6 | 1.7 | Oldest supported Java version
Dependencies | ASM (for Eclipse integration) | ASM | (Optional) Runtime Dependency: Guava | Libraries dependent upon (included) at compile time
javax.annotation.processing.Processor | lombok.launch.AnnotationProcessorHider$AnnotationProcessor | com.google.auto.value.processor.AutoAnnotationProcessor; com.google.auto.value.processor.AutoValueBuilderProcessor; com.google.auto.value.processor.AutoValueProcessor | org.immutables.processor.ProxyProcessor | Standard annotation processor specification location
Generated Source Relationship to Template Source | Enhanced generated class replaces template source | Generated source extends template source | Generated source extends template source | Lombok only shows generated source with "delombok" option
Access Generated Source | Specify delombok option | Default | Default | To view/control generated source code
Generated Methods | equals(Object), hashCode(), toString(), construction/builder, accessors, setters | equals(Object), hashCode(), toString(), construction/builder, accessors | equals(Object), hashCode(), toString(), construction/builder, accessors, setters |
Degree of Immutability | Allows full mutability with field-level @Setter but provides @Value when immutability is desired | Enforces strict immutability | "Heavily biased towards immutability" but provides class-level @Value.Modifiable | AutoValue is most opinionated and Lombok is least opinionated
Bonus Features | Resource cleanup; Immutable or Mutable; Sneakily thrown checked exceptions; Object synchronization locks; Logging annotation; More ... | Faithfulness to Value Object concept; Documented Best Practices | Style customization; Serialization (including JSON); Pre-computed hash codes; More... |

Considerations When Choosing

Lombok, AutoValue, and Immutables are similar toolkits that provide similar benefits and any of these three could be used successfully by a wide range of applications. However, there are differences between these toolkits that can be considered when selecting which of them to use.

  • Lombok generates a class with the same package and class name as the template while AutoValue and Immutables generate classes that extend the template class and have their own class name (but same package).
    • Developers who would like the compiled .class file to have exactly the same package and name as the template class will prefer Lombok.
    • Developers who prefer the generated source code always be available and not in conflict in any way with the template source will prefer AutoValue or Immutables.
  • AutoValue is the most opinionated of the three toolkits and Lombok tends to be the least opinionated.
    • Developers wanting the tight enforcement of characteristics of "value objects" are likely to prefer AutoValue. AutoValue does not provide a mechanism for generated classes to be modifiable and enforces several other rules that the other two toolkits do not enforce. For example, AutoValue only allows the template class to be expressed as an abstract class and not as an interface to avoid "[losing] the immutability guarantee ... and ... [inviting] more ... bad behavior." Immutables, on the other hand, does allow interfaces to be used as the templates for code generation.
    • Developers who want to depart from strict immutability or use some of the features AutoValue does not support in the interest of best practices opinions will likely prefer Immutables or Lombok.
  • AutoValue and Immutables use standard annotations processing and Lombok uses a non-standard annotations processing approach.
    • Developers wishing to avoid non-standard dependencies will favor AutoValue or Immutables.
    • Developers wanting to avoid IDE plugins or other special tools outside of javac and basic Java IDE support will favor AutoValue or Immutables.
  • All three toolkits support some level of customization and developers wishing to customize the generated code may want to choose the toolkit that allows them to customize the generated code in the ways they desire.
    • Lombok provides a configuration system that allows for several aspects of the generated code to be adjusted to desired conventions.
    • Immutables provides style customization that allows for several aspects of the generated code to be adjusted to desired conventions.
    • The How Do I? section of AutoValue's User Guide spells out some approaches to customize the code AutoValue generates (typically via use or avoidance of keywords in the template class).
  • AutoValue and Lombok are supported on JDK 1.6, but Immutables requires JDK 1.7.

Conclusion

Lombok, AutoValue, and Immutables share much in common and all three can be used to generate value classes from simple template files. However, they each also offer different advantages and features that may make any one of them more or less appealing to developers than the others based on the developers' individual circumstances.

Saturday, June 18, 2016

Creating Value Objects with Immutables

In response to my recent post AutoValue: Generated Immutable Value Classes, Brandon suggested that it might be interesting to see how AutoValue compares to Project Lombok and Immutables and Kevin seconded this. I agree that this is a good idea, but I am first publishing this post as a brief overview of Immutables because I have already provided similar posts for Lombok and AutoValue.

Immutables 2.2.5 is available from the Maven Central Repository and its license page states "The Immutables toolkit and all required dependencies are covered under The Apache Software License, Version 2.0." The Get started! page states that "Java 7 or higher is required to run the Immutables annotation processor."

Immutables, like AutoValue, uses compile-time annotations to generate the source code for the classes that define immutable objects. Because they both use this approach, both introduce only compile-time dependencies and their respective JARs are not needed on the application's runtime classpath. In other words, the Immutables JARs need to be on the compiler's (javac's) classpath but not on the Java launcher's (java's) classpath.

The code listing for a "template" Person class is shown in the next code listing (Person.java). It looks very similar to the Person.java I used in my AutoValue demonstration.

Person.java
package dustin.examples.immutables;

import org.immutables.value.Value;

/**
 * Represents an individual as part of demonstration of
 * the Immutables project (http://immutables.github.io/).
 */
@Value.Immutable  // concrete extension will be generated by Immutables
abstract class Person
{
   /**
    * Provide Person's last name.
    *
    * @return Last name of person.
    */
   abstract String lastName();

   /**
    * Provide Person's first name.
    *
    * @return First name of person.
    */
   abstract String firstName();

   /**
    * Provide Person's birth year.
    *
    * @return Person's birth year.
    */
   abstract long birthYear();
}

The only differences between this "template" class and the "template" class I listed in my AutoValue post are the name of the package, the Javadoc comments on which product is being demonstrated, and (most significantly) the annotation imported and applied to the class. There is a specific "create" method in the AutoValue example that's not in the Immutables example, but that's only because I didn't demonstrate use of AutoValue's builder, which would have rendered the "create" method unnecessary.

When I appropriately specify use of Immutables on my classpath and use javac to compile the above source code, the annotation processor is invoked and the following Java source code is generated:

ImmutablePerson.java
package dustin.examples.immutables;

import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import javax.annotation.Generated;

/**
 * Immutable implementation of {@link Person}.
 * <p>
 * Use the builder to create immutable instances:
 * {@code ImmutablePerson.builder()}.
 */
@SuppressWarnings("all")
@Generated({"Immutables.generator", "Person"})
final class ImmutablePerson extends Person {
  private final String lastName;
  private final String firstName;
  private final long birthYear;

  private ImmutablePerson(String lastName, String firstName, long birthYear) {
    this.lastName = lastName;
    this.firstName = firstName;
    this.birthYear = birthYear;
  }

  /**
   * @return The value of the {@code lastName} attribute
   */
  @Override
  String lastName() {
    return lastName;
  }

  /**
   * @return The value of the {@code firstName} attribute
   */
  @Override
  String firstName() {
    return firstName;
  }

  /**
   * @return The value of the {@code birthYear} attribute
   */
  @Override
  long birthYear() {
    return birthYear;
  }

  /**
   * Copy the current immutable object by setting a value for the {@link Person#lastName() lastName} attribute.
   * An equals check used to prevent copying of the same value by returning {@code this}.
   * @param lastName A new value for lastName
   * @return A modified copy of the {@code this} object
   */
  public final ImmutablePerson withLastName(String lastName) {
    if (this.lastName.equals(lastName)) return this;
    String newValue = Objects.requireNonNull(lastName, "lastName");
    return new ImmutablePerson(newValue, this.firstName, this.birthYear);
  }

  /**
   * Copy the current immutable object by setting a value for the {@link Person#firstName() firstName} attribute.
   * An equals check used to prevent copying of the same value by returning {@code this}.
   * @param firstName A new value for firstName
   * @return A modified copy of the {@code this} object
   */
  public final ImmutablePerson withFirstName(String firstName) {
    if (this.firstName.equals(firstName)) return this;
    String newValue = Objects.requireNonNull(firstName, "firstName");
    return new ImmutablePerson(this.lastName, newValue, this.birthYear);
  }

  /**
   * Copy the current immutable object by setting a value for the {@link Person#birthYear() birthYear} attribute.
   * A value equality check is used to prevent copying of the same value by returning {@code this}.
   * @param birthYear A new value for birthYear
   * @return A modified copy of the {@code this} object
   */
  public final ImmutablePerson withBirthYear(long birthYear) {
    if (this.birthYear == birthYear) return this;
    return new ImmutablePerson(this.lastName, this.firstName, birthYear);
  }

  /**
   * This instance is equal to all instances of {@code ImmutablePerson} that have equal attribute values.
   * @return {@code true} if {@code this} is equal to {@code another} instance
   */
  @Override
  public boolean equals(Object another) {
    if (this == another) return true;
    return another instanceof ImmutablePerson
        && equalTo((ImmutablePerson) another);
  }

  private boolean equalTo(ImmutablePerson another) {
    return lastName.equals(another.lastName)
        && firstName.equals(another.firstName)
        && birthYear == another.birthYear;
  }

  /**
   * Computes a hash code from attributes: {@code lastName}, {@code firstName}, {@code birthYear}.
   * @return hashCode value
   */
  @Override
  public int hashCode() {
    int h = 31;
    h = h * 17 + lastName.hashCode();
    h = h * 17 + firstName.hashCode();
    h = h * 17 + Long.hashCode(birthYear);
    return h;
  }

  /**
   * Prints the immutable value {@code Person} with attribute values.
   * @return A string representation of the value
   */
  @Override
  public String toString() {
    return "Person{"
        + "lastName=" + lastName
        + ", firstName=" + firstName
        + ", birthYear=" + birthYear
        + "}";
  }

  /**
   * Creates an immutable copy of a {@link Person} value.
   * Uses accessors to get values to initialize the new immutable instance.
   * If an instance is already immutable, it is returned as is.
   * @param instance The instance to copy
   * @return A copied immutable Person instance
   */
  public static ImmutablePerson copyOf(Person instance) {
    if (instance instanceof ImmutablePerson) {
      return (ImmutablePerson) instance;
    }
    return ImmutablePerson.builder()
        .from(instance)
        .build();
  }

  /**
   * Creates a builder for {@link ImmutablePerson ImmutablePerson}.
   * @return A new ImmutablePerson builder
   */
  public static ImmutablePerson.Builder builder() {
    return new ImmutablePerson.Builder();
  }

  /**
   * Builds instances of type {@link ImmutablePerson ImmutablePerson}.
   * Initialize attributes and then invoke the {@link #build()} method to create an
   * immutable instance.
   * <p><em>{@code Builder} is not thread-safe and generally should not be stored in a field or collection,
   * but instead used immediately to create instances.</em>
   */
  static final class Builder {
    private static final long INIT_BIT_LAST_NAME = 0x1L;
    private static final long INIT_BIT_FIRST_NAME = 0x2L;
    private static final long INIT_BIT_BIRTH_YEAR = 0x4L;
    private long initBits = 0x7L;

    private String lastName;
    private String firstName;
    private long birthYear;

    private Builder() {
    }

    /**
     * Fill a builder with attribute values from the provided {@code Person} instance.
     * Regular attribute values will be replaced with those from the given instance.
     * Absent optional values will not replace present values.
     * @param instance The instance from which to copy values
     * @return {@code this} builder for use in a chained invocation
     */
    public final Builder from(Person instance) {
      Objects.requireNonNull(instance, "instance");
      lastName(instance.lastName());
      firstName(instance.firstName());
      birthYear(instance.birthYear());
      return this;
    }

    /**
     * Initializes the value for the {@link Person#lastName() lastName} attribute.
     * @param lastName The value for lastName 
     * @return {@code this} builder for use in a chained invocation
     */
    public final Builder lastName(String lastName) {
      this.lastName = Objects.requireNonNull(lastName, "lastName");
      initBits &= ~INIT_BIT_LAST_NAME;
      return this;
    }

    /**
     * Initializes the value for the {@link Person#firstName() firstName} attribute.
     * @param firstName The value for firstName 
     * @return {@code this} builder for use in a chained invocation
     */
    public final Builder firstName(String firstName) {
      this.firstName = Objects.requireNonNull(firstName, "firstName");
      initBits &= ~INIT_BIT_FIRST_NAME;
      return this;
    }

    /**
     * Initializes the value for the {@link Person#birthYear() birthYear} attribute.
     * @param birthYear The value for birthYear 
     * @return {@code this} builder for use in a chained invocation
     */
    public final Builder birthYear(long birthYear) {
      this.birthYear = birthYear;
      initBits &= ~INIT_BIT_BIRTH_YEAR;
      return this;
    }

    /**
     * Builds a new {@link ImmutablePerson ImmutablePerson}.
     * @return An immutable instance of Person
     * @throws java.lang.IllegalStateException if any required attributes are missing
     */
    public ImmutablePerson build() {
      if (initBits != 0) {
        throw new IllegalStateException(formatRequiredAttributesMessage());
      }
      return new ImmutablePerson(lastName, firstName, birthYear);
    }

    private String formatRequiredAttributesMessage() {
      List<String> attributes = new ArrayList<String>();
      if ((initBits & INIT_BIT_LAST_NAME) != 0) attributes.add("lastName");
      if ((initBits & INIT_BIT_FIRST_NAME) != 0) attributes.add("firstName");
      if ((initBits & INIT_BIT_BIRTH_YEAR) != 0) attributes.add("birthYear");
      return "Cannot build Person, some of required attributes are not set " + attributes;
    }
  }
}

Several observations can be made from examining the generated code (and you'll find that these are remarkably similar to the observations listed for AutoValue in my earlier post):

  • The generated class extends (implementation inheritance) the abstract class that was hand-written, allowing consuming code to use the hand-written class's API without having to know that a generated class was being used.
  • Fields were generated even though no fields were defined directly in the source class; Immutables interpreted the fields from the provided abstract accessor methods.
  • The generated class does not provide "set"/mutator methods for the fields (get/accessor methods). This is not surprising because a key concept of Value Objects is that they are immutable and even the name of this project (Immutables) implies this characteristic. Note that Immutables does provide some ability for modifiable objects with the @Value.Modifiable annotation.
  • Implementations of equals(Object), hashCode(), and toString() are automatically generated appropriately for each field with its type in mind.
  • Javadoc comments on the source class and methods are not reproduced on the generated extension class. Instead, simpler (and more generic) Javadoc comments are supplied on the generated class's methods and more significant (but still generic) Javadoc comments are provided on the builder class's methods.
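
One detail of the generated builder that is worth a closer look is how it tracks required attributes with the initBits bit mask. Each attribute owns one bit, every bit starts out set, each initializer clears its attribute's bit, and build() refuses to proceed while any bit remains set. The following hand-written sketch isolates that technique with just two attributes. The class name RequiredAttributeTracker is hypothetical and the sketch returns a simple String rather than an immutable instance.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, reduced illustration of the bit-mask bookkeeping seen in the generated builder. */
final class RequiredAttributeTracker
{
   private static final long INIT_BIT_LAST_NAME = 0x1L;
   private static final long INIT_BIT_FIRST_NAME = 0x2L;

   /** Both bits set means that neither attribute has been initialized yet. */
   private long initBits = 0x3L;

   private String lastName;
   private String firstName;

   RequiredAttributeTracker lastName(final String newLastName)
   {
      this.lastName = newLastName;
      initBits &= ~INIT_BIT_LAST_NAME;   // clear the bit: lastName is now set
      return this;
   }

   RequiredAttributeTracker firstName(final String newFirstName)
   {
      this.firstName = newFirstName;
      initBits &= ~INIT_BIT_FIRST_NAME;  // clear the bit: firstName is now set
      return this;
   }

   /** Names of the attributes whose bits are still set. */
   List<String> missingAttributes()
   {
      final List<String> attributes = new ArrayList<>();
      if ((initBits & INIT_BIT_LAST_NAME) != 0) attributes.add("lastName");
      if ((initBits & INIT_BIT_FIRST_NAME) != 0) attributes.add("firstName");
      return attributes;
   }

   String build()
   {
      if (initBits != 0)
      {
         throw new IllegalStateException("Required attributes are not set " + missingAttributes());
      }
      return firstName + " " + lastName;
   }
}
```

A single comparison of initBits against zero is enough to know whether anything is missing, and the individual bits are only inspected when an error message has to be assembled.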

As I stated with regard to AutoValue, one of the major advantages of using an approach such as Immutables generation is that developers can focus on the easier, higher-level concepts of what a particular class should support and the code generation ensures that the lower-level details are implemented consistently and correctly. However, there are some things to keep in mind when using this approach.

  • Immutables is most likely to be helpful when the developers are disciplined enough to review and maintain the abstract "source" Java class instead of the generated class.
    • Changes to the generated classes would be overwritten the next time the annotation processing generated the class again or generation of that class would have to be halted so that this did not happen.
    • The "template" abstract class has the documentation and other higher-level items most developers will want to focus on and the generated class simply implements the nitty gritty details.
  • You'll want to set your build/IDE up so that the generated classes are considered "source code" so that the abstract class will compile and any dependencies on the generated classes will compile.
  • Special care must be taken when using mutable fields with Immutables if one wants to maintain immutability (which is typically the case when choosing to use Immutables or Value Objects in general).
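
The last point in the list above deserves a brief illustration. A final field only prevents reassignment of the reference. It does not prevent the referenced object from changing. The following hand-written sketch shows one common way of guarding a mutable collection in a value class, by copying it on the way in and exposing only an unmodifiable view. This is an illustration of the general concern and not a description of what Immutables generates. The Team class is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical hand-written value class that guards a mutable collection. */
final class Team
{
   private final List<String> members;

   Team(final List<String> members)
   {
      // Copy first so later changes to the caller's list cannot leak in,
      // then wrap so callers cannot modify the internal copy.
      this.members = Collections.unmodifiableList(new ArrayList<>(members));
   }

   List<String> members()
   {
      return members;
   }
}
```

Without the copy, code holding the original list could change the contents of a supposedly immutable Team after it was constructed.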

Conclusion

My conclusion can be almost word-for-word the same as for my post on AutoValue. Immutables allows developers to write more concise code that focuses on high-level details and delegates the tedious implementation of low-level (and often error-prone) details to Immutables for automatic code generation. This is similar to what an IDE's source code generation can do, but Immutables's advantage over the IDE approach is that Immutables can regenerate the source code every time the code is compiled, keeping the generated code current. This advantage of Immutables is also a good example of the power of Java custom annotation processing.

Thursday, June 16, 2016

AutoValue: Generated Immutable Value Classes

The Google GitHub-hosted project AutoValue is interesting for multiple reasons. Not only does the project make it easy to write less Java code for "value objects," but it also provides a conceptually simple demonstration of practical application of Java annotation processing. The auto/value project is provided by Google employees Kevin Bourrillion and Éamonn McManus and is licensed with an Apache Version 2 license.

The AutoValue User Guide is short and to the point and this conciseness and simplicity are reflective of the project itself. The User Guide provides simple examples of employing AutoValue, discusses why AutoValue is desirable, provides short answers to common questions in the How Do I... section, and outlines some best practices related to using AutoValue.

The following code listing contains a simple class I have hand-written called Person. This class has been written with AutoValue in mind.

Person.java
package dustin.examples.autovalue;

import com.google.auto.value.AutoValue;

/**
 * Represents an individual as part of demonstration of
 * GitHub-hosted project google/auto/value
 * (see https://github.com/google/auto/tree/master/value).
 */
@AutoValue  // concrete extension will be generated by AutoValue
abstract class Person
{
   /**
    * Create instance of Person.
    *
    * @param lastName Last name of person.
    * @param firstName First name of person.
    * @param birthYear Birth year of person.
    * @return Instance of Person.
    */
   static Person create(String lastName, String firstName, long birthYear)
   {
      return new AutoValue_Person(lastName, firstName, birthYear);
   }

   /**
    * Provide Person's last name.
    *
    * @return Last name of person.
    */
   abstract String lastName();

   /**
    * Provide Person's first name.
    *
    * @return First name of person.
    */
   abstract String firstName();

   /**
    * Provide Person's birth year.
    *
    * @return Person's birth year.
    */
   abstract long birthYear();
}

When using AutoValue to generate full-fledged "value classes," one simply provides an abstract class (interfaces are intentionally not supported) from which AutoValue generates a corresponding concrete extension. This abstract class must be annotated with the @AutoValue annotation, must provide a static method that provides an instance of the value class, and must provide abstract accessor methods of either public or package scope that imply the value class's supported fields.

In the code listing above, the static instance creation method instantiates an AutoValue_Person object, but I have no such AutoValue_Person class defined. This is instead the name of the class that AutoValue will generate when its annotation processing is executed as part of the javac compilation of Person.java. From this, we can see the naming convention of the AutoValue-generated classes: AutoValue_ is prepended to the source class's name to form the generated class's name.

When Person.java is compiled with the AutoValue annotation processing applied as part of the compilation process, the generated class is written. In my case (using AutoValue 1.2 / auto-value-1.2.jar), the following code was generated:

AutoValue_Person.java: Generated by AutoValue
package dustin.examples.autovalue;

import javax.annotation.Generated;

@Generated("com.google.auto.value.processor.AutoValueProcessor")
 final class AutoValue_Person extends Person {

  private final String lastName;
  private final String firstName;
  private final long birthYear;

  AutoValue_Person(
      String lastName,
      String firstName,
      long birthYear) {
    if (lastName == null) {
      throw new NullPointerException("Null lastName");
    }
    this.lastName = lastName;
    if (firstName == null) {
      throw new NullPointerException("Null firstName");
    }
    this.firstName = firstName;
    this.birthYear = birthYear;
  }

  @Override
  String lastName() {
    return lastName;
  }

  @Override
  String firstName() {
    return firstName;
  }

  @Override
  long birthYear() {
    return birthYear;
  }

  @Override
  public String toString() {
    return "Person{"
        + "lastName=" + lastName + ", "
        + "firstName=" + firstName + ", "
        + "birthYear=" + birthYear
        + "}";
  }

  @Override
  public boolean equals(Object o) {
    if (o == this) {
      return true;
    }
    if (o instanceof Person) {
      Person that = (Person) o;
      return (this.lastName.equals(that.lastName()))
           && (this.firstName.equals(that.firstName()))
           && (this.birthYear == that.birthYear());
    }
    return false;
  }

  @Override
  public int hashCode() {
    int h = 1;
    h *= 1000003;
    h ^= this.lastName.hashCode();
    h *= 1000003;
    h ^= this.firstName.hashCode();
    h *= 1000003;
    h ^= (this.birthYear >>> 32) ^ this.birthYear;
    return h;
  }

}

Several observations can be made from examining the generated code:

  • The generated class extends (implementation inheritance) the abstract class that was hand-written, allowing consuming code to use the hand-written class's API without having to know that a generated class was being used.
  • Fields were generated even though no fields were defined directly in the source class; AutoValue interpreted the fields from the provided abstract accessor methods.
  • The generated class does not provide "set"/mutator methods for the fields; only the "get"/accessor methods are implemented. This is an intentional design decision of AutoValue because a key concept of Value Objects is that they are immutable.
  • Implementations of equals(Object), hashCode(), and toString() are automatically generated appropriately for each field with its type in mind.
  • Javadoc comments on the source class and methods are not reproduced on the generated extension class.
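
The first and fourth observations can be illustrated with a minimal sketch. To keep it runnable without AutoValue on the classpath, a hand-written nested class stands in for the generated one; the names PersonSketchDemo, PersonSketch, Impl, and create, as well as the sample values, are hypothetical and are not taken from the post's Person.java.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class PersonSketchDemo {

  /** Stand-in for the hand-written abstract class (hypothetical name). */
  abstract static class PersonSketch {
    abstract String lastName();
    abstract String firstName();
    abstract long birthYear();

    // Callers only see PersonSketch; the concrete subclass stays hidden.
    static PersonSketch create(String lastName, String firstName, long birthYear) {
      return new Impl(lastName, firstName, birthYear);
    }
  }

  /** Hand-written stand-in for the class an annotation processor would generate. */
  private static final class Impl extends PersonSketch {
    private final String lastName;
    private final String firstName;
    private final long birthYear;

    Impl(String lastName, String firstName, long birthYear) {
      this.lastName = Objects.requireNonNull(lastName, "Null lastName");
      this.firstName = Objects.requireNonNull(firstName, "Null firstName");
      this.birthYear = birthYear;
    }

    @Override String lastName() { return lastName; }
    @Override String firstName() { return firstName; }
    @Override long birthYear() { return birthYear; }

    @Override
    public boolean equals(Object o) {
      if (o == this) {
        return true;
      }
      if (o instanceof PersonSketch) {
        PersonSketch that = (PersonSketch) o;
        return lastName.equals(that.lastName())
            && firstName.equals(that.firstName())
            && birthYear == that.birthYear();
      }
      return false;
    }

    @Override
    public int hashCode() {
      return Objects.hash(lastName, firstName, birthYear);
    }
  }

  public static void main(String[] args) {
    PersonSketch a = PersonSketch.create("Flintstone", "Fred", 1960L);
    PersonSketch b = PersonSketch.create("Flintstone", "Fred", 1960L);

    Set<PersonSketch> people = new HashSet<>();
    people.add(a);
    people.add(b);

    System.out.println(a.equals(b));    // true
    System.out.println(people.size());  // 1
  }
}
```

Consuming code here depends only on the abstract type and its factory method, so swapping the hand-written stand-in for a generated class would not change any caller.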

One of the major advantages of using an approach such as AutoValue generation is that developers can focus on the easier, higher-level concepts of what a particular class should support while the code generation ensures that the lower-level details are implemented consistently and correctly. However, there are some things to keep in mind when using this approach, and the Best Practices section of the AutoValue documentation is a good place to read early to find out whether AutoValue's assumptions work for your own case.

  • AutoValue is most likely to be helpful when the developers are disciplined enough to review and maintain the abstract "source" Java class instead of the generated class.
    • Changes made directly to a generated class would be overwritten the next time annotation processing regenerated that class, unless generation of that class were halted to prevent this.
    • The "source" abstract class has the documentation and other higher-level items most developers will want to focus on, and the generated class simply implements the nitty-gritty details.
  • You'll want to set up your build/IDE so that the generated classes are treated as "source code"; otherwise, the abstract class will not compile.
  • Special care must be taken when using mutable fields with AutoValue if one wants to maintain immutability (which is typically the case when choosing to use Value Objects).
  • Review the Best Practices and How do I... sections to make sure none of AutoValue's design assumptions conflict with your needs.
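
The point above about mutable fields can be sketched without AutoValue itself. The following is a minimal illustration, assuming a value class with a collection-typed property; the names DefensiveCopyDemo, Team, and create are hypothetical. The factory method copies the incoming list and wraps the copy as unmodifiable, so neither the caller's original list nor the accessor's return value can be used to change the value object afterward.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class DefensiveCopyDemo {

  /** Hypothetical value class holding a collection-typed property. */
  static final class Team {
    private final String name;
    private final List<String> members;

    private Team(String name, List<String> members) {
      this.name = name;
      this.members = members;
    }

    static Team create(String name, List<String> members) {
      // Copy first so later changes to the caller's list are not visible,
      // then wrap so the accessor cannot be used to mutate the copy.
      return new Team(name, Collections.unmodifiableList(new ArrayList<>(members)));
    }

    String name() { return name; }
    List<String> members() { return members; }
  }

  public static void main(String[] args) {
    List<String> source = new ArrayList<>();
    source.add("Ann");

    Team team = Team.create("Core", source);
    source.add("Bob");  // does not affect the already-created Team

    System.out.println(team.members().size());  // 1
  }
}
```

The same idea would apply in a hand-written factory method on an AutoValue "source" class: do the copying there, before the values reach the generated constructor.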

Conclusion

AutoValue allows developers to write more concise code that focuses on high-level details and delegates the tedious implementation of low-level (and often error-prone) details to AutoValue for automatic code generation. This is similar to what an IDE's source code generation can do, but AutoValue's advantage over the IDE approach is that AutoValue can regenerate the source code every time the code is compiled, keeping the generated code current. This advantage of AutoValue is also a good example of the power of Java custom annotation processing.