So Many Utils, So Little Time
Will Brisbane, March 10, 2020
Part 1: Node Manipulation Utilities
One of the many joys of working on a platform that utilizes technology like Apache Felix, Apache Sling, and a Java Content Repository is that there is generally no shortage of tools at your disposal to accomplish whatever you need. This can be a double-edged sword, as having too many options can make it unclear which is best for your particular use case, or whether it even matters. Does one utility have all the methods you need? If not, which combination would be best? Is one method better or more efficient than another? These are questions we all find ourselves asking but often lack the time to fully research. In this multipart series we’ll be taking the time to examine a handful of utilities that can be leveraged to accomplish very similar tasks. We will explore the differences between the offerings to determine whether there are any factors worth considering when deciding whether one should be used over another. In this installment we’ll be discussing the JcrUtil, JcrUtils, Node, ResourceResolver, and ResourceUtil APIs.
Chances are good that if you’ve done any sort of programmatic node creation or manipulation from within your Java classes, you’ve made use of at least one of the above APIs. If you’re like me, you took a look at what methods were available and were a little annoyed at the redundancy. All of the above APIs have methods that create and/or retrieve nodes or resources, and most have methods to get or manipulate children, parents, and properties. The exact mix, however, varies from utility to utility.
The Node API is very robust and offers methods to add nodes by name and node type. The API also provides a means to get and set node properties, get children, and remove the node. One caveat of this API is that you need to have a node to start with in order to use it. So if you’re looking to get or create a node at a specific path, you will first have to obtain the parent node some other way, or resolve the resource and call adaptTo(), with null checks in either case, before you can make use of the API. An additional caveat is that a Session.save() must be executed for any changes to be persisted, whereas all of the other utilities either accept the Session or ResourceResolver as a parameter along with an autosave flag or have their own methods of saving.
JcrUtil has methods to copy nodes and resources into a destination parent node. It also has overloaded createPath() methods, which allow the node type of the final node to differ from the intermediate node type used for any nonexistent nodes along the path, and which can create a uniquely named node if a node of the same name already exists at the given path. JcrUtil also contains methods for ensuring that node names are valid under JCR standards and for setting node order. There are methods for setting JCR properties on a node, but in this regard the API is not as user friendly as the Node and JcrUtils APIs. There are also no methods for removing nodes within JcrUtil.
JcrUtils is probably my favorite of the utilities tested. In addition to several overloaded getOrCreateByPath() methods, which replicate the functionality of JcrUtil’s createPath(), JcrUtils has many methods for getting different property types, associated files, and the children of nodes. JcrUtils also provides methods for setting properties, getting references to a given node, and adding files to a given node. As with JcrUtil, there is no remove method available.
ResourceResolver has similar functionality to Node as a “native” API, providing methods for creating, deleting, copying, and getting children of resources. Outside of the basics, the ResourceResolver API lacks some of the more user-friendly convenience methods while still providing access to paths at a resource level. Unlike the previously mentioned APIs, ResourceResolver is able to call its own commit() method to save rather than being dependent on a Session.
Finally, ResourceUtil seems to hold very little utility outside of some convenience methods in line with the offerings of ResourceResolver. Methods for getting the children, parent, name, and property ValueMap of a resource aren’t unique to this utility. The only methods that stand out are isNonExistingResource(), isStarResource(), and isSyntheticResource(). Outside of these, most methods seem replicated within other APIs. One other distinctive element of this API, while limited in use case, is that it provides access to ResourceUtil.BatchResourceRemover, which allows for recursive deletion of resources. Changes made with this utility are saved via a ResourceResolver.commit() call or by passing a ResourceResolver in as a parameter along with an autoCommit flag.
As we can see, there is quite a bit of overlap in what methods are available with each of these APIs. But what about performance? I decided to test each utility using a method they all had in common: creating a node or resource. For the purposes of my performance testing I used a fresh AEM 6.5 instance with Service Pack 2 installed. I then wrote a simple servlet that, when hit, would execute the following calls to simulate node/resource creation:
- JcrUtil.createPath(String absolutePath, String nodeType, Session session)
- JcrUtils.getOrCreateByPath(String absolutePath, String nodeType, Session session)
- Node.addNode(String relPath)
- ResourceResolver.create(Resource parent, String name, Map<String, Object> properties)
- ResourceUtil.getOrCreateResource(ResourceResolver resolver, String path, String resourceType, String intermediateResourceType, boolean autoCommit)
My method created nodes or resources in batches of 1000 and used system time to measure the duration of each batch in milliseconds. Each batch was repeated 1000 times, creating one million nodes or resources per method. To avoid interference from, and possible bottlenecks due to, session saves, the session was refreshed before each iteration. The nodes or resources were created using counter-based node names, and the resource type for all was nt:unstructured.
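The measurement loop described above can be sketched roughly as follows. This is a minimal, hypothetical harness rather than the servlet used for the tests: BatchTimer, timeBatches, and the createOp and resetBetweenIterations callbacks are names invented for illustration, and the in-memory list in main merely stands in for a real creation call and a session refresh.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

// Hypothetical timing harness; names and structure are illustrative assumptions.
final class BatchTimer {

    /**
     * Runs createOp batchSize times per iteration and records how long each
     * batch took in milliseconds. resetBetweenIterations runs before every
     * batch (in the setup described above, this is where the session was refreshed).
     */
    static long[] timeBatches(int iterations, int batchSize,
                              IntConsumer createOp, Runnable resetBetweenIterations) {
        long[] durationsMs = new long[iterations];
        int counter = 0; // drives counter-based names such as "node-" + counter
        for (int i = 0; i < iterations; i++) {
            resetBetweenIterations.run();
            long start = System.currentTimeMillis();
            for (int j = 0; j < batchSize; j++) {
                createOp.accept(counter++);
            }
            durationsMs[i] = System.currentTimeMillis() - start;
        }
        return durationsMs;
    }

    public static void main(String[] args) {
        // Stand-in for node creation: collect names in memory instead of a repository.
        List<String> created = new ArrayList<>();
        // The tests described above used 1000 iterations; 10 keeps this demo light.
        long[] durations = timeBatches(10, 1000,
                n -> created.add("node-" + n),
                () -> { /* e.g. refresh the session here */ });
        System.out.println("Batches timed: " + durations.length);
        System.out.println("Items created: " + created.size());
    }
}
```

Passing the creation call in as a callback lets the same loop time each of the five methods without duplicating the bookkeeping. System.currentTimeMillis() is used here to mirror the system-time approach described above; System.nanoTime() is generally the safer choice for measuring elapsed time.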
Once the data was collected, I calculated the average, minimum, maximum, and standard deviation for each method, then computed z-scores and removed any outliers with a z-score beyond ±3. The results of the testing are summarized below.
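For readers who want to apply the same cleanup to their own timings, the z-score filtering step can be sketched as below. This is an illustration under stated assumptions, not the code used for the tests: ZScoreFilter and its method names are hypothetical, and the use of the population standard deviation (dividing by n) is an assumption, since the sample version would work just as well.

```java
import java.util.Arrays;

// Hypothetical helper: a sketch of z-score based outlier removal.
final class ZScoreFilter {

    static double mean(double[] values) {
        return Arrays.stream(values).average().orElse(Double.NaN);
    }

    // Population standard deviation (divides by n); this choice is an assumption.
    static double standardDeviation(double[] values) {
        double mean = mean(values);
        double sumOfSquares = 0.0;
        for (double v : values) {
            sumOfSquares += (v - mean) * (v - mean);
        }
        return Math.sqrt(sumOfSquares / values.length);
    }

    // Keeps only the values whose z-score lies within [-threshold, +threshold].
    static double[] removeOutliers(double[] values, double threshold) {
        double mean = mean(values);
        double stdDev = standardDeviation(values);
        // With no data or no spread there is nothing to classify as an outlier.
        if (values.length == 0 || stdDev == 0.0) {
            return values.clone();
        }
        return Arrays.stream(values)
                .filter(v -> Math.abs((v - mean) / stdDev) <= threshold)
                .toArray();
    }

    public static void main(String[] args) {
        // Made-up batch durations in milliseconds: twenty typical runs and one spike.
        double[] durations = new double[21];
        Arrays.fill(durations, 10.0);
        durations[20] = 1000.0;
        double[] kept = removeOutliers(durations, 3.0);
        System.out.println("Kept " + kept.length + " of " + durations.length + " samples");
    }
}
```

Note that a ±3 threshold only has something to remove when there are enough samples; with very small data sets no single value can reach a z-score of 3.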
Perhaps unsurprisingly, the Node.addNode() method was the highest performing, followed by ResourceResolver’s create() method. This is unsurprising because these are more “native” methods and the other utilities all make use of one of these two methods in their API, with JcrUtil and JcrUtils using Node.addNode() and ResourceUtil using ResourceResolver.create() after additional processing logic. The JcrUtil.createPath() method was the next most performant, with JcrUtils.getOrCreateByPath() close behind. ResourceUtil brought up the rear by a sizeable percentage. One thing to note about ResourceUtil is that it also set the sling:resourceType in addition to the jcr:primaryType. To me this doesn’t necessarily justify the added processing time, but if both of those properties need to be set to the same value, this may eliminate the need for an additional method call.
Ultimately what you choose will depend on your use case and which utility offers the most bang for your buck. As we probably could have expected, it’s best to use the Node API if you’re doing basic node creation, manipulation, and property setting, as it offers the most expansive set of functionality with the most efficient operations. If you are doing a lot of manipulation and want a more fully featured and user-friendly utility while maintaining a respectable level of efficiency, though, I’d recommend JcrUtils. It has the best balance of competitive performance and robust method offerings to accomplish all the tasks you’re likely to need. JcrUtil comes in close behind with its own balance but would likely need to be combined with Node API usage to accomplish most standard use cases. In any event, ResourceUtil appears to be the least advisable, as its performance is lacking and the unique methods it contains apply only to very specific use cases.
This concludes our first installment of examination of utility APIs. Be sure to check out the next installment where we’ll be examining DAM and asset specific utilities.