While all of the previous subrequests block until they are complete, they do not have to. You can combine multiple steps into an asynchronous pipeline. Instead of getting results back, you get the equivalent of a Future object handle. By joining on these handles, the whole process will block, but each subrequest is potentially scheduled in parallel (and thus takes advantage of extra CPUs); see Listing 1.
In this example, the image is not actually used after it is fetched, but it is still retrieved via HTTP. A proper pipeline would catch errors and exceptions as well; this is just a demonstration of how easy it is to orchestrate asynchronous calls. The logical abstractions can represent the invocation of all manner of elaborate backend processing, but clients are protected from these details and can pick and choose when they want asynchronous or synchronous subrequest handling. Not only is this a tremendously simpler approach than trying to manage all of this yourself, it is likely to scale better too.
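The same future-handle shape can be sketched with plain java.util.concurrent constructs for comparison: each asynchronous call hands back a handle immediately, and the caller blocks only when it joins. The two steps below are hypothetical stand-ins for the image fetch and another subrequest, not NetKernel code.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncPipeline {
    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);

        // Issue both "subrequests"; neither call blocks the caller.
        CompletableFuture<String> image =
            CompletableFuture.supplyAsync(() -> "image-bytes", pool);        // stand-in for the HTTP image fetch
        CompletableFuture<String> entities =
            CompletableFuture.supplyAsync(() -> "extracted-entities", pool); // stand-in for another subrequest

        // Joining blocks here, but the two steps may have run in parallel
        // on separate CPUs in the meantime.
        System.out.println(image.join());
        System.out.println(entities.join());
        pool.shutdown();
    }
}
```

The handles play the role of NetKernel's asynchronous subrequest handles: issue everything first, join at the end.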
An important point is that the results of issuing these calls in a resource-oriented environment are immutable resource representations. In this way, NetKernel is very much a stateless request mechanism; a little bit like REST, a little bit like functional programming languages. It takes a while for OO programmers to change how they think about these ideas, but there are great benefits to doing so.
The point is not that you could not do this with regular Java concurrency constructs, but that it would be significantly more difficult to get right. Having a microkernel-based architecture of URI-addressable behavior is a powerful combination. You can easily imagine doing asynchronous federated queries across multiple data sources from heterogeneous backend systems (relational databases, web services, etc.).
While it is great to have a flexible, fluid, scalable environment at your disposal, you do not always want your system to hammer a particular resource as much or as fast as it can. There may be operational or legal limits to usage of a service, a library or a data source.
As an example, some commercial entity extractors have really obnoxious licensing terms. You can only use one thread at a time or risk needing to pay tens of thousands of dollars more for additional thread use (this is per-CPU licensing to the extreme!). At the very moment you are trying to take advantage of extra CPUs for the rest of your system, the lawyers are telling you not to in this case!
Let us assume you can access the functionality needed through the API as follows:
Results r = ExpensiveTool.extractEntity(myDocument);
The goal is to limit access to a single thread. Given what you know about Java threading, you might be tempted to do something like:
Results r = null;
synchronized (ExpensiveTool.class) {
    r = ExpensiveTool.extractEntity(myDocument);
}
This solves the legal issue, but a code-level monitor is too blunt a weapon to employ in this situation. If you decide for business reasons that paying the extra money is worth it, you cannot simply move from enforcing the use of a single thread to enforcing the use of two or three threads, because a Java monitor is only a mutex. You need to change your locking mechanism to a counting semaphore (see java.util.concurrent.Semaphore in JDK 5 or later) or some other more sophisticated tool. Additionally, if you want to apply the limited-thread-use policy across a variety of resources, tools, or systems, this suddenly starts to feel very complicated.
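A counting-semaphore version of the same guard makes the point concrete: the licensed thread limit becomes a permit count baked into the code, so moving from one licensed thread to two means editing and recompiling rather than changing configuration. The class and method names here are hypothetical stand-ins for the licensed tool.

```java
import java.util.concurrent.Semaphore;

public class LicensedExtractor {
    // The permit count is the licensed thread limit. Changing the policy
    // (say, after buying a second license) means changing this constant.
    private static final Semaphore PERMITS = new Semaphore(2);

    // Hypothetical stand-in for ExpensiveTool.extractEntity(...)
    static String extractEntity(String document) throws InterruptedException {
        PERMITS.acquire();
        try {
            return "entities-from:" + document;
        } finally {
            PERMITS.release(); // always give the permit back, even on failure
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(extractEntity("myDocument")); // prints entities-from:myDocument
    }
}
```

This works, but it is exactly the kind of detail-oriented, per-call-site policy that becomes painful to apply uniformly across many tools and services.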
NetKernel's URI abstractions and throttling capabilities work well here. First, put the expensively-licensed tool behind a URI like active:expensive-tool. Rather than issuing the call directly, you could wrap the request with a call to throttle requests based on a configuration file:
// req is a subrequest for the throttled URI that wraps active:expensive-tool
handle = context.issueAsyncSubRequest(req);
result = handle.join();
The configuration file indicates how many requests you allow for that URI into the kernel at a time and how many requests to queue up before you start rejecting them:
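A sketch of such a configuration might look like the following; the element names here are illustrative rather than NetKernel's actual throttle schema, but the two knobs are the ones described above: the concurrency ceiling and the queue depth.

```xml
<!-- Hypothetical throttle configuration (illustrative element names):
     admit one request at a time, queue up to 20 more,
     and reject anything beyond that. -->
<throttle>
  <concurrency>1</concurrency>
  <queue>20</queue>
</throttle>
```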
Not only are you legally compliant, but if you do buy a second or third license, you do not have to change the policy; you just change the configuration file. You could also wrap requests for other resources in the same throttle definition if that made sense. It is very cool that you can trivially enforce throttles against arbitrary tools and services, whether or not they offer the capability some other way! The issue is not that the problem is intractable using language-level concurrency constructs; it is that they are hard to universalize, are very detail-oriented, and are painful to get right.
Concurrent Threading Is Useful Yet Nuanced
Java's support for concurrent threading constructs is tremendously useful in the right hands, but those constructs remain nuanced and error-prone. Language-level multithreading tools are often not the right abstractions for mixing arbitrary processing behavior with complicated system orchestration and changing business rules. By shifting to a resource-oriented mindset like the one NetKernel provides, you can take advantage of modern hardware and the surfeit of CPUs it offers while reusing large amounts of your existing code. Environments like NetKernel heavily leverage the language's concurrency tools so you do not necessarily have to. Consider letting them do the heavy lifting!