This is the manual for older MicroStream versions (Version < 5.0).
The new documentation (Version >= 5.0) is located at:
In this chapter it is explained how Lazy Loading is done with MicroStream.
Of course, it's not really about the technical implementation of procrastination, but about efficiency: why bloat the limited RAM with stuff before you even need it?
Classic example: The application has self-contained data areas that contain a large amount of data. The data for an area is not loaded if the area is not worked at all. Instead, you only load a tiny amount of "head data" for each area (name or other for displaying purposes) and the actual data only when the application really needs it. E.g. fiscal years with hundreds of thousands or millions of sales. One million revenue records for 2010, one million for 2011, for 2012, etc. In 2019, most of the time only 2019 and 2018 will be needed. The previous few, and the year 2000 sales are not of great interest anymore. Therefore: load data only when needed. Super efficient.
For example let's say the app "MyBusinessApp" has a root instance class, looking like this:
The business year hold the turnovers:
This approach would be a problem: During initialization, the root instance would be loaded, from there its HashMap
with all BusinessYear
instances, each with its ArrayList
and thus all conversions. For all years. 20 years of approximately 1 million sales makes 20 million entities, which are pumped directly into the RAM at the start. It does not matter if someone needs it or not. We don't want it that way.
It would be nice if you could simply add a "lazy" to the turnover list. And that's exactly how it works:
And bingo, the turnovers are now loaded lazily.
Of course, this is no longer an ArrayList<Turnover>
field, which is now magically loaded lazy, but this is now a Lazy
field and the instances of this type are typed generically to ArrayList<Turnover>
. Lazy
is just a simple class whose instances internally hold an ID and a reference to the actual thing (here the ArrayList
instance). If the internal reference is zero, the reserved ID is used to reload it. If it is not null
, it is simply returned. So just a reference intermediate instance. Similar to the JDK's WeakReference
, just not JVM-weak, but storage-lazy.
What do you have to do now to get the actual ArrayList<Turnover>
instance?
Just as with WeakReference
, or simply as one would expect from a reference intermediate type in general: a simple get
method.
The .get()
call reloads the data as needed. But you do not want to mess around with that yourself. No "SELECT bla FROM turnovers WHERE ID =" + this.turnovers.getId()
. Since you want to program your application you don' t have to mess around with low-level database ID-loading stuff. That's what the MicroStream Code does internally. You do not even need to access the ID, you just have to say "get!".
That's it.
There are different strategies, what you write here. Analogous to the code example before it would be simply:
So always a new ArrayList
instance, wrapped in a Lazy
instance. If the actual ArrayList
reference should be null
at first, it works the same way:
The this.turnovers.get()
also just always returns null
. Completely transparent.
But you could also do this:
If there is no list, then you do not make any intermediate reference instance for any list. A separate instance for null
is indeed a bit ... meh.
But that has a nasty problem elsewhere: this.turnovers.get()
does not work then. Because NullPointerException
.
Anytime you need to write this here, the readability of code is not exactly conducive:
But there is a simple solution: Just move this check into a static util method. Just like that:
This is the same .get()
, just with a static null-check around it. This always puts you on the safe side.
For Lazy Loading, simply wrap Lazy<>
around the actual field and then call .get()
or maybe better Lazy.get(...)
.
It's as simple as that.
The full example can be found on GitHub.
Why do you have to replace your actual instance with a lazy loading intermediate instance and fiddle around with generics? Why is not something like this:
Put simply:
If it were just that it would be bare Java bytecode for accessing an ArrayList
. There would be no way for a middleware library to get access and look it up and perhaps reload it. What's written there is an ArrayList
reference. There is no lazy anymore. Either, the instance is null
, or it is not null
. If you wanted to reach in there, you would have to start with bytecode manipulation. Technically possible, but something you really don't want in your application.
So there must always be some form of intermediary.
Hibernate solves this through its own collection implementations that do lazy loading internally. Although the lazy loading is nicely hidden in some way (or not, if you need an annotation for that), it also comes with all sorts of limitations. You can only use interfaces instead of concrete classes for collections. At first, the instance is not the one you dictate, the code becomes intransparent and difficult to debug, you have to use a collection, even if it's just a single instance, and so on. You want to be able to write anything you want and you want full insight and control (debugability, etc.) over the code.
All this can be done with the tiny Lazy Interim Reference class. No restrictions, no incomprehensible "magic" under the hood (proxy instances and stuff) and also usable for individual instances.
This is the manual for older MicroStream versions (Version < 5.0).
The new documentation (Version >= 5.0) is located at:
For convenience MicroStream provides Null-safe static access methods for lazy references.
These are
Lazy.get(Lazy)
Gets the lazy referenced object, loads it if required.
return value:
null
if the lazy reference itself is null
otherwise the referenced object
Lazy.peek(Lazy)
Get the lazy referenced object if it is loaded, no lazy loading is done. If the object has been unloaded before peek will return null
.
return value:
null
if the lazy reference itself is null
otherwise the current reference without on-demand loading
Lazy.clear(Lazy)
Clears the lazy reference if it is not null
.
All lazy references track the time of their last access (creation or querying) as a timestamp in milliseconds. If an instance is deemed timed out by a LazyReferenzManager its subject gets cleared.
The timestamp is currently not public accessible.
This is the manual for older MicroStream versions (Version < 5.0).
The new documentation (Version >= 5.0) is located at:
Loading data can be done in two ways, eager and Lazy. The basic, default way of loading is eager loading. This means that all objects of a stored object graph are loaded immediately. This is done during startup of the MicroStream database instance automatically if an already existing database is found.
Contrary to lazy loading, eager loading has no requirements to your entity model.
To load your data you just need to create an EmbeddedStorageManager
instance:
After that just get the root instance of your object graph from the StorageManager
by calling EmbeddedStorageManager.root()
and check for null
as this indicates a non-existing database
The full code for the eager loading example is on GitHub.
This is the manual for older MicroStream versions (Version < 5.0).
The new documentation (Version >= 5.0) is located at:
The Lazy
class has a .clear()
method. When called, the reference held in the Lazy Reference is removed and only the ID is kept so that the instance can be reloaded when needed.
Important background knowledge:
However, such a clear does not mean that the referenced instance immediately disappears from memory. That's the job of the garbage collector of the JVM. The reference is even registered in another place, namely in a global directory (Swizzle Registry), in which each known instance is registered with its ObjectId in a bijective manner. This means: if you clear such a reference, but shortly thereafter the Lazy Reference is queried again, probably nothing has to be loaded from the database, but simply the reference from the Swizzle Registry is restored. Nevertheless, the Swizzle Registry is not a memory leak, because it references the instances only via WeakReference
. In short, if an instance is only referenced as "weak," the JVM GC will still clean it up.
So that the Lazy References do not have to be managed manually, but the whole goes automatically, there is the following mechanism: Each Lazy
instance has a lastTouched
timestamp. Each .get()
call sets it to the current time. This will tell you how long a Lazy Reference has not been used, i.e. if it is needed at all.
The LazyReferenceManager
audits this. It is enabled by default, with a timeout of 1,000,000 milliseconds, which is about 15 minutes.
A custom manager can be set easily, which should happen before a storage is started.
The timeout of lazy references is set to 30 minutes, meaning references which haven't been touched for this time are cleared. In combination with a memory quota of 0.75.