Thursday, April 30, 2015

Software Development Lessons Learned from Consumer Experience

Because we software developers are also inevitably consumers of others' software applications, we are undoubtedly influenced in the creation of our own software by the software we use. For example, our opinions of what makes an effective interface for users are influenced by our experiences "on the other side" using someone else's software interface. This is particularly true for web and mobile development, as web applications and mobile applications have become pervasive in our lives.

We are prone to adopt idioms and styles that appeal to us and shun idioms and styles that we don't like. The degree of this influence may vary widely based on the type of software we are developing versus the type of software we use as a consumer (the more alike they are, the stronger the influence). There are times when the influence may be more subconscious and other times when the influence of others' software may be obvious. In this post, I describe a recent experience with an online site that reminded me of some important software development practices to keep in mind when creating software (particularly web applications).

I was recently creating a photobook on a popular photography-related web site. I began by uploading numerous photographs for the book. The web application reported that all but one of the photographs uploaded successfully. It reported that it failed to upload one photograph and recommended that I verify my Internet connection. After verifying my Internet connection, I tried uploading the single photograph several more times without any success. I tried changing the name of the file and changing its file location, but still had no success. I was able to upload other photographs after those failures, but still could not upload the one particular photograph.

I decided to work on the photobook with the photographs that did upload and spent a couple of hours arranging the photographs exactly the way I wanted them. When I tried to save my photobook project, however, the application would not allow me to save because it said it could not save until it finished uploading the photograph that it kept failing to upload. I clicked on the link to save the project several times without success. I could not remove the reference to the offending photograph, and even the Save-As option did not allow me to save because the application thought the photograph was still uploading. Ultimately, I gave up and closed the browser, knowing that my two hours' worth of work was lost.

When I looked carefully at the characteristics of the problematic photograph, I noticed that it was exactly 1 MB (1024 KB) in size. I used some image manipulation software to make a minor change to it that affected its size (made it a bit larger) and it uploaded without incident. I had to start over, but at that point I was able to arrange (again) the photographs where I wanted them and able to save the project as desired.

As a consumer of this software, I learned a few lessons. One lesson is the need to save explicitly and often when using that application because it does not have an implicit save feature and because saving a project seems to be the only way to find out that the software is in an inconsistent state in which no further work on the project can be saved. I also learned to avoid the rare case of attempting to upload an image that is exactly 1 MB to that application.

I was reminded of even more important lessons as a software developer. First, I was reminded of the importance of unit testing, especially boundary conditions. I speculate that the code used by this application looked something like this pseudo code:

if (imageSize < 1024000)
{
   // upload as-is
}
else if (imageSize > 1024000)
{
   // compress and then upload
}

In my speculative pseudo code shown above, an image that is exactly 1024000 bytes satisfies neither condition, so it is never uploaded. It's an easy error to make, especially when in a hurry and when code review is skipped or rushed. Effective unit tests are perfect for driving out this type of bug, and unit tests that test boundary conditions like 1024000 in this case are easy to implement. Often, just writing the unit test for this boundary condition will cause the developer to realize the error of his or her ways.
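To make the boundary concrete, here is a minimal sketch of how such a check might be exercised. Everything in it is hypothetical: the class name, the Action enum, and both methods are my own illustration built on the speculative pseudo code above, not the photobook site's actual code.

```java
/**
 * Hypothetical reconstruction of the speculated upload decision,
 * used only to illustrate testing the boundary value.
 */
final class UploadBoundaryDemo
{
   enum Action { UPLOAD_AS_IS, COMPRESS_THEN_UPLOAD, NONE }

   /** Assumed threshold, taken from the speculative pseudo code. */
   static final long THRESHOLD_BYTES = 1_024_000L;

   /** Mirrors the speculated bug: neither branch handles equality. */
   static Action buggyDecide(final long imageSize)
   {
      if (imageSize < THRESHOLD_BYTES)
      {
         return Action.UPLOAD_AS_IS;
      }
      else if (imageSize > THRESHOLD_BYTES)
      {
         return Action.COMPRESS_THEN_UPLOAD;
      }
      return Action.NONE;
   }

   /** One possible fix: a plain "else" also covers the boundary. */
   static Action fixedDecide(final long imageSize)
   {
      return imageSize < THRESHOLD_BYTES
         ? Action.UPLOAD_AS_IS
         : Action.COMPRESS_THEN_UPLOAD;
   }

   public static void main(final String[] arguments)
   {
      // Exercise the value just below, exactly at, and just above the threshold.
      final long[] sizes = {THRESHOLD_BYTES - 1, THRESHOLD_BYTES, THRESHOLD_BYTES + 1};
      for (final long size : sizes)
      {
         System.out.println(
            size + ": buggy=" + buggyDecide(size) + ", fixed=" + fixedDecide(size));
      }
   }
}
```

A unit test that checks the three sizes used in main (one below, one at, and one above the threshold) would flag the buggy version immediately because the middle case maps to no upload at all.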

Being an irritated and frustrated consumer of this software also reminded me of the importance of planning for "unhappy paths" in the software I develop. Use of this online photobook-creating software would have been less frustrating in this case if one of several options had been implemented. Had the application supported an auto-save feature that reported when it couldn't save, I'd have known that what I was working on wasn't savable. Blogger, which I use for this blog, has such a feature, and when it reports to me that it cannot save, I know to stop adding new content to my post until it can save and I have a chance to copy and paste what I have typed into a file on my local hard drive.

Another option that could have saved me significant frustration would have been a Save-As feature that allowed me to save my project as a different project. I can understand the software being written to not allow me to save while it thinks it's in an inconsistent state (it thinks it's still uploading a photograph), but it should still be able to save the work as a different project. I have seen this allowed in several desktop applications that think the currently loaded document is corrupt or inconsistent but allow me to save that document anyway as a separate document (without overwriting the previous document).
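The following is a rough sketch of how these "unhappy path" options might be modeled. It is purely illustrative: the PhotobookProjectSketch class, its UploadState enum, and all of its methods are names I made up, and I have no knowledge of how the real application tracks uploads. The idea is that an upload that has given up becomes FAILED rather than remaining UPLOADING forever, and that Save-As copies only what actually uploaded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of a project that tracks the state of each upload. */
final class PhotobookProjectSketch
{
   enum UploadState { UPLOADING, UPLOADED, FAILED }

   private final Map<String, UploadState> photos = new LinkedHashMap<>();

   void startUpload(final String photoName)
   {
      photos.put(photoName, UploadState.UPLOADING);
   }

   void markUploaded(final String photoName)
   {
      photos.put(photoName, UploadState.UPLOADED);
   }

   /** Assumed to be called once retries are exhausted, so nothing "uploads" forever. */
   void markFailed(final String photoName)
   {
      photos.put(photoName, UploadState.FAILED);
   }

   /** Saving is blocked only by uploads that are genuinely still in progress. */
   boolean canSave()
   {
      return !photos.containsValue(UploadState.UPLOADING);
   }

   int photoCount()
   {
      return photos.size();
   }

   /** Save-As never waits: it copies only the photographs that did upload. */
   PhotobookProjectSketch saveAsCopy()
   {
      final PhotobookProjectSketch copy = new PhotobookProjectSketch();
      photos.forEach((name, state) ->
      {
         if (state == UploadState.UPLOADED)
         {
            copy.photos.put(name, state);
         }
      });
      return copy;
   }

   public static void main(final String[] arguments)
   {
      final PhotobookProjectSketch project = new PhotobookProjectSketch();
      project.startUpload("first.jpg");
      project.markUploaded("first.jpg");
      project.startUpload("second.jpg");
      System.out.println("Can save while uploading? " + project.canSave());
      System.out.println("Photos in Save-As copy: " + project.saveAsCopy().photoCount());
      project.markFailed("second.jpg");
      System.out.println("Can save after failure reported? " + project.canSave());
   }
}
```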

My frustration with the online photobook creation software reminded me of the importance of keeping the users' experience in mind when writing software. We can write all of the clean, readable, and maintainable software we want, but if it provides a poor user experience, that effort is for naught. This experience also was a good reminder of the importance of thorough testing, especially of "unhappy paths" and boundary conditions.

Packt Publishing's Free Learning Offer

In February of this year, Packt Publishing offered "a free eBook every day" for 18 days. The offer is back again at https://www.packtpub.com/packt/offers/free-learning with the title Getting Started with C++ Audio Programming for Game Development being offered as I write this post (Kinect in Motion – Audio and Visual Tracking by Example was available free yesterday, the first day of this new offer).

The e-mail message I received from Packt about this new offer states, "It's back! And this time for good. Following on from the huge success of our Free Learning campaign, we've decided to open up our free library to you permanently, with better titles than ever." Unlike the offer in February that lasted for 18 days, it sounds like this one may be available indefinitely. The e-mail message further describes the offer, "Each eBook will only be free for 24 hours, so make sure you come back every day to grab your Free Learning fix!"

You need to create a Packt Publishing account if you don't already have one so that you can log in to claim each day's free book. I have had an account for some time, but the most challenging part for me in the previous campaign was remembering to visit the site each day to see which free book was being offered.

Saturday, April 18, 2015

The JDK 8 SummaryStatistics Classes

Three of the new classes introduced in JDK 8 are DoubleSummaryStatistics, IntSummaryStatistics, and LongSummaryStatistics of the java.util package. These classes make quick and easy work of calculating the total number of elements, the minimum value of elements, the maximum value of elements, the average value of elements, and the sum of elements in a collection of doubles, integers, or longs. Each class's class-level Javadoc documentation begins with the same single sentence that succinctly articulates this, describing each as "A state object for collecting statistics such as count, min, max, sum, and average."

The class-level Javadoc for each of these three classes also states of each class, "This class is designed to work with (though does not require) streams." The most obvious reason for the inclusion of these three types of SummaryStatistics classes is to be used with streams that were also introduced with JDK 8.

Indeed, the class-level Javadoc comments of each of the three classes also provide an example of using that class in conjunction with streams of the corresponding data type. These examples demonstrate invoking the respective stream's collect(Supplier, BiConsumer, BiConsumer) method (a mutable reduction terminal stream operation) and passing each SummaryStatistics class's new instance (constructor), accept, and combine methods (as method references) to this collect method as its "supplier", "accumulator", and "combiner" arguments respectively.
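To show what those three roles amount to, the sketch below performs by hand roughly what collect does on the caller's behalf. The class name and the sample values are my own and are used only for illustration; the constructor, accept(double), and combine(DoubleSummaryStatistics) are the actual DoubleSummaryStatistics methods.

```java
import java.util.DoubleSummaryStatistics;

/** Illustrative walk through the supplier, accumulator, and combiner roles. */
final class SupplierAccumulatorCombinerSketch
{
   static DoubleSummaryStatistics summarizeInTwoParts()
   {
      // "supplier": each partial result starts as a new, empty instance.
      final DoubleSummaryStatistics first = new DoubleSummaryStatistics();
      final DoubleSummaryStatistics second = new DoubleSummaryStatistics();

      // "accumulator": accept folds one more value into a partial result.
      first.accept(1.5);
      first.accept(2.5);
      second.accept(4.0);

      // "combiner": combine merges one partial result into another, which
      // is what allows a parallel stream to summarize its pieces separately.
      first.combine(second);
      return first;
   }

   public static void main(final String[] arguments)
   {
      System.out.println("Combined: " + summarizeInTwoParts());
   }
}
```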

The rest of this post demonstrates use of IntSummaryStatistics, LongSummaryStatistics, and DoubleSummaryStatistics. Several of these examples will reference a map of The X-Files television series's seasons to the Nielsen rating for that season's premiere. This is shown in the next code listing.

Declaring and Initializing xFilesSeasonPremierRatings
/**
 * Maps the number of each X-Files season to the Nielsen rating
 * (millions of viewers) for the premiere episode of that season.
 */
private final static Map<Integer, Double> xFilesSeasonPremierRatings;

static
{
   final Map<Integer, Double> temporary = new HashMap<>();
   temporary.put(1, 12.0);
   temporary.put(2, 16.1);
   temporary.put(3, 19.94);
   temporary.put(4, 21.11);
   temporary.put(5, 27.34);
   temporary.put(6, 20.24);
   temporary.put(7, 17.82);
   temporary.put(8, 15.87);
   temporary.put(9, 10.6);
   xFilesSeasonPremierRatings = Collections.unmodifiableMap(temporary);
}

The next code listing uses the map created in the previous code listing, demonstrates applying DoubleSummaryStatistics to a stream of the "values" portion of the map, and is very similar to the examples provided in the Javadoc for the three SummaryStatistics classes. The DoubleSummaryStatistics class, the IntSummaryStatistics class, and the LongSummaryStatistics class have essentially the same fields, methods, and APIs (the only differences being the supported data types). Therefore, even though this and many of this post's examples specifically use DoubleSummaryStatistics (because the X-Files's Nielsen ratings are doubles), the principles apply to the other two integral types of SummaryStatistics classes.

Using DoubleSummaryStatistics with a Collection-based Stream
/**
 * Demonstrate use of DoubleSummaryStatistics collected from a
 * Collection Stream via use of DoubleSummaryStatistics method
 * references "new", "accept", and "combine".
 */
private static void demonstrateDoubleSummaryStatisticsOnCollectionStream()
{
   final DoubleSummaryStatistics doubleSummaryStatistics =
      xFilesSeasonPremierRatings.values().stream().collect(
         DoubleSummaryStatistics::new,
         DoubleSummaryStatistics::accept,
         DoubleSummaryStatistics::combine);
   out.println("X-Files Season Premieres: " + doubleSummaryStatistics);
}

The output from running the above demonstration is shown next:

X-Files Season Premieres: DoubleSummaryStatistics{count=9, sum=161.020000, min=10.600000, average=17.891111, max=27.340000}

The previous example applied the SummaryStatistics class to a stream based directly on a collection (the "values" portion of a Map). The next code listing demonstrates a similar example, but uses an IntSummaryStatistics and uses a stream's intermediate map operation to specify which Function to invoke on the collection's objects for populating the SummaryStatistics object. In this case, the collection being acted upon is a Set<Movie> as returned by the Java8StreamsMoviesDemo.getMoviesSample() method and spelled out in my blog post Stream-Powered Collections Functionality in JDK 8.

Using IntSummaryStatistics with Stream's map(Function)
/**
 * Demonstrate collecting IntSummaryStatistics via mapping of
 * certain method calls on objects within a collection and using
 * lambda expressions (method references in particular).
 */
private static void demonstrateIntSummaryStatisticsWithMethodReference()
{
   final Set<Movie> movies = Java8StreamsMoviesDemo.getMoviesSample();
   IntSummaryStatistics intSummaryStatistics =
      movies.stream().map(Movie::getImdbTopRating).collect(
         IntSummaryStatistics::new, IntSummaryStatistics::accept, IntSummaryStatistics::combine);
   out.println("IntSummaryStatistics on IMDB Top Rated Movies: " + intSummaryStatistics);
}

When the demonstration above is executed, its output looks like this:

IntSummaryStatistics on IMDB Top Rated Movies: IntSummaryStatistics{count=5, sum=106, min=1, average=21.200000, max=49}
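For comparison, here is a sketch of two other routes to the same kind of result: mapping to an IntStream with mapToInt and calling summaryStatistics(), or collecting with Collectors.summarizingInt. Both are standard JDK 8 APIs, but the nested Movie class here is only a hypothetical stand-in with a single field (it is not the Movie class from my earlier post), and the sample ratings are arbitrary.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;

/** Sketch of alternatives to map(Function) followed by three-argument collect. */
final class MapToIntSketch
{
   /** Hypothetical stand-in for the Movie class referenced in the post. */
   static final class Movie
   {
      private final int imdbTopRating;

      Movie(final int imdbTopRating)
      {
         this.imdbTopRating = imdbTopRating;
      }

      int getImdbTopRating()
      {
         return imdbTopRating;
      }
   }

   /** Maps to a primitive IntStream, which offers summaryStatistics() directly. */
   static IntSummaryStatistics viaMapToInt(final List<Movie> movies)
   {
      return movies.stream().mapToInt(Movie::getImdbTopRating).summaryStatistics();
   }

   /** Uses the predefined summarizing collector on the object stream. */
   static IntSummaryStatistics viaCollector(final List<Movie> movies)
   {
      return movies.stream().collect(Collectors.summarizingInt(Movie::getImdbTopRating));
   }

   public static void main(final String[] arguments)
   {
      final List<Movie> movies =
         Arrays.asList(new Movie(1), new Movie(5), new Movie(12));
      System.out.println("mapToInt:       " + viaMapToInt(movies));
      System.out.println("summarizingInt: " + viaCollector(movies));
   }
}
```

The mapToInt route also avoids boxing each int into an Integer along the way, which may matter for large collections.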

The examples so far have demonstrated using the SummaryStatistics classes in their most common use case (in conjunction with data from streams based on existing collections). The next example demonstrates how a DoubleStream can be instantiated from scratch via use of DoubleStream.Builder and then the DoubleStream's summaryStatistics() method can be called to get an instance of DoubleSummaryStatistics.

Obtaining Instance of DoubleSummaryStatistics from DoubleStream
/**
 * Uses DoubleStream.builder to build an arbitrary DoubleStream.
 *
 * @return DoubleStream constructed with hard-coded doubles using
 *    a DoubleStream.builder.
 */
private static DoubleStream createSampleOfArbitraryDoubles()
{
   return DoubleStream.builder().add(12.4).add(13.6).add(9.7).add(24.5).add(10.2).add(3.0).build();
}

/**
 * Demonstrate use of an instance of DoubleSummaryStatistics
 * provided by DoubleStream.summaryStatistics().
 */
private static void demonstrateDoubleSummaryStatisticsOnDoubleStream()
{
   final DoubleSummaryStatistics doubleSummaryStatistics =
      createSampleOfArbitraryDoubles().summaryStatistics();
   out.println("'Arbitrary' Double Statistics: " + doubleSummaryStatistics);
}

The just-listed code produces this output:

'Arbitrary' Double Statistics: DoubleSummaryStatistics{count=6, sum=73.400000, min=3.000000, average=12.233333, max=24.500000}

Of course, similarly to the example just shown, IntStream and IntStream.Builder can provide an instance of IntSummaryStatistics and LongStream and LongStream.Builder can provide an instance of LongSummaryStatistics.
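A brief sketch of the integer and long equivalents follows. The class name and the sample values are arbitrary choices of mine; IntStream.builder(), LongStream.of(long...), and summaryStatistics() are the standard JDK 8 methods.

```java
import java.util.IntSummaryStatistics;
import java.util.LongSummaryStatistics;
import java.util.stream.IntStream;
import java.util.stream.LongStream;

/** Sketch of obtaining Int and Long summary statistics from primitive streams. */
final class IntAndLongStreamSketch
{
   /** Builds an IntStream with IntStream.Builder and summarizes it. */
   static IntSummaryStatistics intStatistics()
   {
      return IntStream.builder().add(3).add(8).add(13).build().summaryStatistics();
   }

   /** Uses LongStream.of as a shorter alternative to the builder. */
   static LongSummaryStatistics longStatistics()
   {
      return LongStream.of(100L, 250L, 400L).summaryStatistics();
   }

   public static void main(final String[] arguments)
   {
      System.out.println("Int statistics:  " + intStatistics());
      System.out.println("Long statistics: " + longStatistics());
   }
}
```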

One doesn't need to have a collection stream or other instance of BaseStream to use the SummaryStatistics classes because they can be instantiated directly and used directly for the predefined numeric statistical operations. The next code listing demonstrates this by directly instantiating and then populating an instance of DoubleSummaryStatistics.

Directly Instantiating DoubleSummaryStatistics
/**
 * Demonstrate direct instantiation of and population of instance
 * of DoubleSummaryStatistics instance.
 */
private static void demonstrateDirectAccessToDoubleSummaryStatistics()
{
   final DoubleSummaryStatistics doubleSummaryStatistics =
      new DoubleSummaryStatistics();
   doubleSummaryStatistics.accept(5.0);
   doubleSummaryStatistics.accept(10.0);
   doubleSummaryStatistics.accept(15.0);
   doubleSummaryStatistics.accept(20.0);
   out.println("Direct DoubleSummaryStatistics Usage: " + doubleSummaryStatistics);
}

The output from running the previous code listing is shown next:

Direct DoubleSummaryStatistics Usage: DoubleSummaryStatistics{count=4, sum=50.000000, min=5.000000, average=12.500000, max=20.000000}

As done in the previous code listing for a DoubleSummaryStatistics, the next code listing instantiates a LongSummaryStatistics directly and populates it. This example also demonstrates how the SummaryStatistics classes provide individual methods for requesting individual statistics.

Directly Instantiating LongSummaryStatistics / Requesting Individual Statistics
/**
 * Demonstrate use of LongSummaryStatistics with this particular
 * example directly instantiating and populating an instance of
 * LongSummaryStatistics that represents hypothetical time
 * durations measured in milliseconds.
 */
private static void demonstrateLongSummaryStatistics()
{
   // This is a series of longs that might represent durations
   // of times such as might be calculated by subtracting the
   // value returned by System.currentTimeMillis() earlier in
   // code from the value returned by System.currentTimeMillis()
   // called later in the code.
   LongSummaryStatistics timeDurations = new LongSummaryStatistics();
   timeDurations.accept(5067054);
   timeDurations.accept(7064544);
   timeDurations.accept(5454544);
   timeDurations.accept(4455667);
   timeDurations.accept(9894450);
   timeDurations.accept(5555654);
   out.println("Test Results Analysis:");
   out.println("\tTotal Number of Tests: " + timeDurations.getCount());
   out.println("\tAverage Time Duration: " + timeDurations.getAverage());
   out.println("\tTotal Test Time: " + timeDurations.getSum());
   out.println("\tShortest Test Time: " + timeDurations.getMin());
   out.println("\tLongest Test Time: " + timeDurations.getMax());
}

The output from this example is now shown:

Test Results Analysis:
 Total Number of Tests: 6
 Average Time Duration: 6248652.166666667
 Total Test Time: 37491913
 Shortest Test Time: 4455667
 Longest Test Time: 9894450

In most examples in this post, I relied on the SummaryStatistics classes' readable toString() implementations to demonstrate the statistics available in each class. This last example, however, demonstrated that each individual type of statistic (number of values, maximum value, minimum value, sum of values, and average value) can be retrieved individually in numeric form.
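One detail worth keeping in mind when retrieving the individual statistics is what they return before any value has been recorded. As documented in the Javadoc, an empty IntSummaryStatistics reports Integer.MAX_VALUE as its minimum, Integer.MIN_VALUE as its maximum, and 0 for its count, sum, and average. The sketch below prints those defaults and shows one way to guard against them; the class name and the describeMinimum helper are my own invention for illustration.

```java
import java.util.IntSummaryStatistics;

/** Sketch of the individual getters on an empty IntSummaryStatistics. */
final class EmptyStatisticsSketch
{
   /** Formats the minimum, or "n/a" when no values have been recorded. */
   static String describeMinimum(final IntSummaryStatistics statistics)
   {
      return statistics.getCount() > 0
         ? String.valueOf(statistics.getMin())
         : "n/a";
   }

   public static void main(final String[] arguments)
   {
      final IntSummaryStatistics empty = new IntSummaryStatistics();
      System.out.println("Count:   " + empty.getCount());    // returned as a long
      System.out.println("Sum:     " + empty.getSum());      // returned as a long
      System.out.println("Average: " + empty.getAverage());  // returned as a double
      System.out.println("Min:     " + empty.getMin());      // Integer.MAX_VALUE when empty
      System.out.println("Max:     " + empty.getMax());      // Integer.MIN_VALUE when empty
      System.out.println("Guarded minimum: " + describeMinimum(empty));
   }
}
```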

Conclusion

Whether the data being analyzed is directly provided as a numeric Stream, is provided indirectly via a collection's stream, or is manually placed in the appropriate SummaryStatistics class instance, the three SummaryStatistics classes can provide useful common statistical calculations on integers, longs, and doubles.