This is the manual for older MicroStream versions (Version < 5.0).
The new documentation (Version >= 5.0) is located at:
Welcome to the MicroStream Reference Manual. This manual includes concepts, instructions and examples to guide you on how to use MicroStream Serializer and Data-Store, version 2.2.
You should be familiar with the Java programming language and have your preferred Integrated Development Environment (IDE) installed. But since you are here, we guess you have that covered ;)
The API documentation is available at https://api.docs.microstream.one/2.2/.
For information on the commercial support for MicroStream see microstream.one.
Tested and officially supported JDKs:
JDK
Supported Versions
8, 11, 13
8, 11, 13
8, 11, 13
8, 11, 12
8, 11
8, 11, 13
8, 11, 13
11, 13
8, 11, 13
19.2.1
API level 26+
In theory MicroStream is compatible with all JDK distributions from Version 8 on.
Every desktop or server operating system which the supported JVMs are available for
Android 8+
MicroStream itself doesn't have any dependencies on other libraries whatsoever, so you don't have to worry about potential conflicts in your environment. This was a deliberate choice to keep the life of developers using MicroStream as simple as possible. On the other hand, feel free to include any dependencies you need, e.g. a logging framework of your choice. MicroStream will play along well.
MicroStream – Persistence – License Agreement Version 1, September 16, 2019, MicroStream Software GmbH
IMPORTANT: READ THE TERMS OF THIS AGREEMENT CAREFULLY BEFORE DOWNLOADING AND USING THE SOFTWARE. BY DOWNLOADING, INSTALLING, COPYING, ACCESSING, OR USING THE SOFTWARE YOU AGREE TO THE TERMS OF THIS AGREEMENT. IF YOU ARE ACCEPTING THESE TERMS ON BEHALF OF ANOTHER PERSON OR A COMPANY OR OTHER LEGAL ENTITY, YOU REPRESENT AND WARRANT THAT YOU HAVE FULL AUTHORITY TO BIND THAT PERSON, COMPANY, OR LEGAL ENTITY TO THESE TERMS. IF YOU DO NOT AGREE TO ALL THESE TERMS, YOU MUST NOT DOWNLOAD, INSTALL, COPY, ACCESS, OR USE THE SOFTWARE; AND YOU MUST PROMPTLY DESTROY ALL COPIES OF SOFTWARE AND ALL RELATED MATERIALS.
DEFINITIONS. "MicroStream Software GmbH" is a software developing firm based in Regensburg, Germany, hereafter referred to as "MicroStream" or "Licensor", www.microstream.one. "Licensor" refers either to an individual person or to a single legal entity. "Software" is the following, including the original and all whole or partial copies: (i) machine-readable instructions and data, (ii) components, (iii) audio-visual content (such as images, text, recordings, or pictures), (iv) related licensed materials, and (v) license use documents or keys, and documentation. "Agreement" refers to this MicroStream License Agreement.
LICENSE TO USE. As between the parties, Licensor reserves all rights in and to the MicroStream software.
License to internal use and development. Subject to the terms and conditions of this agreement, MicroStream grants you a non-exclusive, non-transferable, limited license to
reproduce and use internally Software complete and unmodified for the sole purpose of running programs, unless additional rights to use are explicitly granted in a written document. As appropriate additional licenses for developers are granted in the Supplemental License Terms. The Licensee is entitled to use the MicroStream software for an unlimited number of open-source projects. The prerequisite for this is (i) that the MicroStream software remains closed source, and (ii) that you have registered online.
License to distribute software. Subject to the terms and conditions of this agreement, MicroStream grants you a non-exclusive, nontransferable, limited license to reproduce and distribute the Software, provided that (i) you distribute the Software complete and unmodified and only bundled as part of, and for the sole purpose of running, your Programs, (ii) the Programs add significant and primary functionality to the Software, (iii) you do not distribute additional software intended to replace any component(s) of the Software, (iv) you do not remove or alter any proprietary legends or notices contained in the Software, (v) you only distribute the Software subject to a license agreement that protects MicroStream's interests consistent with the terms contained in this Agreement, (vi) you agree to defend and indemnify MicroStream and its licensors from and against any damages, costs, liabilities, settlement amounts and/or expenses (including attorneys' fees) incurred in connection with any claim, lawsuit or action by any third party that arises or results from the use or distribution of any and all Programs and/or Software, and (vii) that you have registered online.
RESTRICTIONS.
Proprietary Notices. The software is owned by the licensor, confidential, copyrighted, and licensed, not sold. Title to Software and all associated intellectual property rights is retained by MicroStream and/or its licensors.
Reverse Engineering. THE LICENSEE MAY NOT REVERSE ENGINEER, DECOMPILE OR DISASSEMBLE THE SOFTWARE. THE LICENSEE MAY NOT MODIFY, ADAPT, TRANSLATE, RENT, LEASE, LOAN OR CREATE DERIVATIVE WORKS BASED UPON THE MICROSTREAM SOFTWARE OR ANY PART THEREOF.
Ethnic Restriction. The Licensee acknowledges that the software is not intended for use in the design, construction, operation or maintenance of any nuclear facilities, aircraft navigation or communication systems, air traffic control systems, life support, machines or machines or other equipment in which failure of the software could lead to death, personal injury, or severe physical or environmental damage. MicroStream disclaims any express or implied warranty of fitness for such uses.
Using Trademarks. No right, title or interest in or to any trademark, service mark, logo or trade name of MicroStream or its licensors is granted under this agreement.
TRANSFER. The Licensee may not transfer or assign its rights under this license to another party without MicroStream's prior written consent.
CHANGES TO THIS AGREEMENT.
The Licensor reserves the right at its discretion to change, modify, add or remove terms of use of this Agreement at any time.
Any change, modification, addition or removal of the terms of use of this Agreement must be notified to licensee as soon as possible. Such notification will be done by announcement on the MicroStream website.
The Licensee will have to agree on such change, modification, addition or removal of the terms of use of this Agreement before the use of the latest version of the MicroStream software will be allowed again. In case of a missing renewed consent by licensee, any further use of the MicroStream software will be automatically denied without any right of compensation or reimbursement of payment being due.
In case of modifications and changes of any national or international legal framework having compulsory effect on this Agreement as well as on the provision of any contractual duties, rights and services formerly negotiated between licensor and licensee, licensor shall be allowed to change this Agreement without the explicit consent of the licensee.
TERMINATION. This Agreement is effective until terminated. The Licensee may terminate this Agreement at any time by destroying all copies of Software. This Agreement will terminate immediately without notice from MicroStream if the Licensee fails to comply with any provision of this Agreement. Either party may terminate this Agreement immediately should any Software become, or in either party's opinion be likely to become, the subject of a claim of infringement of any intellectual property right. Upon termination, the Licensee must destroy all copies of Software. MicroStream may terminate your license if you fail to comply with the terms of this Agreement. If MicroStream does so, you must destroy all copies of the program and its proof of entitlement.
EXPORT REGULATIONS. The Licensee may not use or otherwise export or re-export the Software except as authorized by United States law and the laws of the jurisdiction in which the Software was obtained. In particular, but without limitation, the Software may not be exported or re-exported, (i) into any U.S. embargoed countries or, (ii) to anyone on the U.S. Treasury Department's list of Specially Designated Nationals or the U.S. Department of Commerce's Denied Person's List or Entity List or any other restricted party lists. By using the Software, you represent and warrant that you are not located in any such country or on any such list. You also agree that you will not use the Software for any purposes prohibited by United States law, including, without limitation, the development, design, manufacture or production of missiles, nuclear, chemical or biological weapons.
LIABILITY. Licensor shall only be liable for damages occurring on wilful intent or gross negligence. Licensor shall not be liable for any material defects/damages, including consequential damages, loss of income, business or profit, special, indirect or incidental damages due to the use of the MicroStream software. Licensor's liability for material defects is restricted to those taking place during the transfer of the MicroStream software from the original source to Licensee. Licensee indemnifies Licensor against any claim of third parties due to the use of the MicroStream software. Licensee must assume the entire risk of using the MicroStream software.
LIMITED WARRANTIES AND DISCLAIMERS. Unless otherwise set forth in this Agreement, MicroStream warrants for a period of ninety (90) days from your date of download that the Software as provided by MicroStream will perform substantially in accordance with the accompanying documentation. MicroStream's entire liability and your sole and exclusive remedy for any breach of the foregoing limited warranty will be, at MicroStream's option, replacement or repair of the Software.
THIS LIMITED WARRANTY IS THE ONLY WARRANTY PROVIDED BY MICROSTREAM AND MICROSTREAM AND ITS LICENSORS EXPRESSLY DISCLAIM ALL OTHER WARRANTIES, CONDITIONS OR OTHER TERMS, EITHER EXPRESS OR IMPLIED (WHETHER COLLATERALLY, BY STATUTE OR OTHERWISE), INCLUDING BUT NOT LIMITED TO IMPLIED WARRANTIES, CONDITIONS OR OTHER TERMS OF MERCHANTABILITY, SATISFACTORY QUALITY AND/OR FITNESS FOR A PARTICULAR PURPOSE WITH REGARD TO THE SOFTWARE AND ACCOMPANYING WRITTEN MATERIALS. FURTHERMORE, THERE IS NO WARRANTY AGAINST INTERFERENCE WITH YOUR ENJOYMENT OF THE SOFTWARE OR AGAINST INFRINGEMENT OF THIRD PARTY PROPRIETARY RIGHTS BY THE SOFTWARE. MICROSTREAM DOES NOT WARRANT THAT THE OPERATION OF THE SOFTWARE WILL BE UNINTERRUPTED OR ERROR-FREE, OR THAT DEFECTS IN THE SOFTWARE WILL BE CORRECTED. NO ORAL OR WRITTEN INFORMATION OR ADVICE GIVEN BY MICROSTREAM OR AN MICROSTREAM AUTHORIZED REPRESENTATIVE SHALL CREATE A WARRANTY. BECAUSE SOME JURISDICTIONS DO NOT ALLOW THE EXCLUSION OR LIMITATION OF IMPLIED WARRANTIES, CONDITIONS OR OTHER TERMS THE ABOVE LIMITATION MAY NOT APPLY TO YOU. THE TERMS OF THIS DISCLAIMER AND THE LIMITED WARRANTY UNDER THIS SECTION 9 DO NOT AFFECT OR PREJUDICE THE STATUTORY RIGHTS OF A CONSUMER ACQUIRING THE SOFTWARE OTHERWISE THAN IN THE COURSE OF A BUSINESS, NEITHER DO THEY LIMIT OR EXCLUDE ANY LIABILITY FOR DEATH OR PERSONAL INJURY CAUSED BY MICROSTREAM'S NEGLIGENCE.
EXCLUSION AND LIMITATIONS OF REMEDIES AND DAMAGES.
Exclusion. IN NO EVENT WILL MICROSTREAM, ITS PARENT, SUBSIDIARIES, OR ANY OF ITS LICENSORS, DIRECTORS, OFFICERS, EMPLOYEES OR AFFILIATES OF ANY OF THE FOREGOING BE LIABLE TO YOU FOR ANY CONSEQUENTIAL, INCIDENTAL, INDIRECT OR SPECIAL DAMAGES WHATSOEVER (INCLUDING WITHOUT LIMITATION, DAMAGES FOR LOSS OF BUSINESS PROFITS, BUSINESS INTERRUPTION, LOSS OF BUSINESS INFORMATION AND THE LIKE) OR DIRECT LOSS OF BUSINESS, BUSINESS PROFITS OR REVENUE, WHETHER FORESEEABLE OR UNFORESEEABLE, ARISING OUT OF THE USE OF OR INABILITY TO USE THE SOFTWARE OR ACCOMPANYING WRITTEN MATERIALS, REGARDLESS OF THE BASIS OF THE CLAIM (WHETHER UNDER CONTRACT, NEGLIGENCE OR OTHER TORT OR UNDER STATUTE OR OTHERWISE HOWSOEVER ARISING) AND EVEN IF MICROSTREAM OR A MICROSTREAM REPRESENTATIVE HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Limitation. MICROSTREAM'S TOTAL LIABILITY TO THE LICENSEE FOR DAMAGES FOR ANY CAUSE WHATSOEVER NOT EXCLUDED BY SECTION 10.1. ABOVE HOWSOEVER CAUSED (WHETHER IN CONTRACT, NEGLIGENCE OR OTHER TORT, UNDER STATUTE OR OTHERWISE HOWSOEVER ARISING) WILL BE LIMITED TO THE GREATER OF U.S.$5.00 OR THE MONEY PAID FOR THE SOFTWARE THAT CAUSED THE DAMAGES. THE PARTIES AGREE THAT THIS LIMITATION OF REMEDIES AND DAMAGES PROVISION SHALL BE ENFORCED INDEPENDENTLY OF AND SURVIVE THE FAILURE OF ESSENTIAL PURPOSE OF ANY WARRANTY REMEDY. THIS LIMITATION WILL NOT APPLY IN THE CASE OF DEATH OR PERSONAL INJURY CAUSED BY FMI'S NEGLIGENCE ONLY WHERE AND TO THE EXTENT THAT APPLICABLE LAW REQUIRES SUCH LIABILITY. BECAUSE SOME JURISDICTIONS DO NOT ALLOW THE EXCLUSION OR LIMITATION OF LIABILITY FOR CONSEQUENTIAL OR INCIDENTAL DAMAGES, THE LIMITATION OF LIABILITY IN THIS SECTION 6 MAY NOT APPLY TO YOU. NOTHING IN THIS LICENSE AFFECTS OR PREJUDICES THE STATUTORY RIGHTS OF A CONSUMER ACQUIRING THE SOFTWARE OTHERWISE THAN IN THE COURSE OF A BUSINESS.
SUBLICENSING. Licensee agrees that all distribution of the runtime and extras will be subject to a written agreement, the terms and conditions of which will, at a minimum: (i) grant a nonexclusive right to use only one copy of the Runtime application and/or Extras for each copy of your own Runtime Solutions which you license to your customer, (ii) provide that any subsequent transfer is subject to the restrictions set forth in this Section 11, (iii) state that the Runtime and Extras (or as renamed) are licensed, not sold, to the end-user and that title to all copies of the Runtime and Extras remain with MicroStream and its licensors, (iv) include restrictions substantially similar to those set forth in Section 3 (RESTRICTIONS) and Section 7 (EXPORT REGULATIONS) of this License, and (v) include Warranty Disclaimer and Disclaimer of Liability provisions which are consistent with and substantially similar to the terms set forth in Sections 5 and 6 of this License.
TECHNICAL SUPPORT. You are solely responsible for providing all technical support to your sublicensees of your own runtime solution, and you will not direct any sublicensee to contact MicroStream for technical support regarding your own runtime solution. You further agree to include your name and contact information in your own License Agreement as part of your own runtime solution.
INDEMNIFICATION. You will indemnify and hold MicroStream harmless from any and all claims, damages, losses, liabilities, costs and expenses (including reasonable fees of attorneys and other professionals) arising out of or in connection with any runtime solutions distributed by you and which is based on your contributions to such runtime solution.
GENERAL. The parties agree that the United Nations Convention on Contracts for the International Sale of Goods (1980), as amended, is specifically excluded from application to this License. This License constitutes the entire agreement between the parties with respect to the Software licensed under these terms, and it supersedes all prior or contemporaneous agreement, arrangement and understanding regarding such subject matter. You acknowledge and agree that you have not relied on any representations made by MicroStream, however, nothing in this license shall limit or exclude liability for any representation made fraudulently. No amendment to or modification of this License will be binding unless in writing and signed by MicroStream.
APPLICABLE LAW AND COURT OF JURISDICTION. This agreement shall be governed, subjected to, and construed in accordance with the laws of Germany. All disputes arising from and/or in connection with the present agreement, and/or from any further agreements resulting therefrom, and which the parties are unable to resolve between themselves, shall exclusively be brought before the competent court of jurisdiction in Regensburg, Germany. No choice of law rules of any jurisdiction will apply.
SEVERABILITY CLAUSE. If any provision of this agreement shall be held by a court of competent jurisdiction to be contrary to law, that provision will be enforced to the maximum extent permissible, The concerned provision is superseded in accordance with the legal laws, and the remaining provisions of this agreement will remain in full force and effect.
END OF TERMS AND CONDITIONS
Removed SelfStoring without replacement since it could not be used recursively and has no advantages over just creating a static storing utility method for a certain entity.
Added state validation of value type objects (e.g. String, Integer, BigDecimal, etc.) upon loading. This is hardly relevant in practice, but not having it can lead to confusing hello-world-like test applications.
EmbeddedStorageManager now implements java.lang.AutoCloseable.
Replaced all provisional RuntimeExceptions with either PersistenceException or StorageException, depending on the architectural level at which the corresponding source code is located.
The two technically different root handling concepts ("default" and "custom") have been consolidated in a way that they are the same thing on the API level and interchangeable, meaning no more confusion with those root exception messages.
All entity fields of type transient EmbeddedStorageManager now get a reference to the used EmbeddedStorageManager instance set upon loading/updating.
The interfaces around storage managing have been enhanced so that it is now equally valid to just write StorageManager instead of EmbeddedStorageManager. (An EmbeddedStorageManager "is a" StorageManager.)
Slight technical consolidation of Lazy reference handling caused the type Lazy to be moved from the package one.microstream.persistence.lazy to one.microstream.reference. The reason is that the lazy handling has no inherent connection to persistence or storage. It is just a generic concept that can be used by those layers. See Migration Guide below on how to adjust existing projects.
Fixed an off-heap memory leak when restarting the storage multiple times in the same process.
Fixed a bug where changing the fields of an entity type caused an exception. This was a regression bug from fixing a similar problem for another case in version 2.1. Now, both cases work correctly.
All occurrences in user code of one.microstream.persistence.lazy.Lazy have to be refactored to one.microstream.reference.Lazy. Modern IDEs provide a functionality to "auto-import" missing types or automatically "organize imports", so this should be resolved with a proverbial push of a button.
Android support: MicroStream is now Java-wise fully compatible with Android.
Replaced all usages of java.io.File with java.nio.file.Path to allow using custom file implementations.
Improved skipping functionality of Storers (see EmbeddedStorageManager#createStorer and Storer#skip).
The class Lazy is now an interface to allow custom implementations. See Migration guide below.
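Application code that still holds java.io.File values after the switch to java.nio.file.Path listed above can convert them with standard JDK calls. The following is a minimal sketch using only the JDK; the directory name is a hypothetical placeholder, and how the resulting Path is handed to MicroStream is not shown here.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

public class FileToPathExample {

    // Converts a legacy java.io.File into its java.nio.file.Path equivalent.
    static Path toPath(File legacyFile) {
        return legacyFile.toPath();
    }

    public static void main(String[] args) {
        File legacyDirectory = new File("storage"); // hypothetical directory name
        Path directory = toPath(legacyDirectory);

        // Both forms describe the same location, and the conversion works in both directions.
        System.out.println(directory.equals(Paths.get("storage")));
        System.out.println(directory.toFile().equals(legacyDirectory));
    }
}
```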
Fixed a few minor bugs in the skipping functionality of Storers.
Fixed a bug where files remained locked after the storage was shut down.
Fixed a bug where files remained locked after an exception in storage initialization.
Enums defining an abstract method are now handled correctly.
By default, all threads created by MicroStream now start with the prefix "MicroStream-". This can be customized by the new interface StorageThreadNameProvider.
Fixed a NullPointerException in import.
Fixed a bug that caused enums with a certain field layout to be loaded inconsistently.
java.util.Locale is now persisted and created using Locale's #toLanguageTag and #forLanguageTag.
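The two Locale methods named above belong to the standard JDK, so the round trip can be tried without MicroStream. The sketch below only shows the JDK calls; the helper method names are hypothetical, and how MicroStream wires these calls into its type handling is not shown.

```java
import java.util.Locale;

public class LocaleTagRoundTrip {

    // Turns a Locale into its IETF BCP 47 language tag, i.e. a plain textual form.
    static String toStoredForm(Locale locale) {
        return locale.toLanguageTag();
    }

    // Recreates a Locale from a previously produced language tag.
    static Locale fromStoredForm(String languageTag) {
        return Locale.forLanguageTag(languageTag);
    }

    public static void main(String[] args) {
        Locale original = Locale.GERMANY;
        String stored = toStoredForm(original);
        Locale restored = fromStoredForm(stored);

        System.out.println(stored);                    // prints "de-DE"
        System.out.println(original.equals(restored)); // prints "true"
    }
}
```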
In the directory of an existing storage, in the TypeDictionary file (default name "PersistenceTypeDictionary.ptd"), all occurrences of "one.microstream.persistence.lazy.Lazy" must be replaced with "one.microstream.persistence.lazy.Lazy$Default".
You can find the MicroStream libraries in our Maven repository:
These are the different modules that make up MicroStream.
| ArtifactId | Description |
| --- | --- |
| base | Collection of common utilities. Math, IO, Exceptions, Threads, String operations, and so on. |
| communication | Top-level framework for sending and receiving object graphs between Java applications. Only data is transferred, no program code ("bytecode"). The other application may be programmed in any language as long as it adheres to the transmitted communication protocol. Usable directly in the program code of a Java application to communicate with other applications or processes. The concrete form of persistence is left open and delivered via a specific implementation as a plugin. Examples of specific persistent forms are binary data, CSV, XML, Json. |
| communication.binary | Plugin framework for the top-level framework communication to convert the transferred object graphs to and from binary data. |
| persistence | Base framework to convert a graph of Java objects into a persistent form and back. Usable as a common, abstract base for all technologies implementing a specific persistent representation like binary data, CSV, XML or Json. From a technical point of view, storage as well as serialization is a process that puts a graph of Java instances into a persistent form. The only difference is that network communication serialization discards this persistent form while a database solution preserves it. |
| persistence.binary | Extension of the persistence base framework with a concrete implementation of the persistent form as binary data. This persistent form is superior to all text-based formats in storage and performance needs, making it the preferred method for storage and network serialization. |
| persistence.binary.jdk8 | Specialized type handlers for JDK 8 collection types. |
| storage | Basic framework to manage a graph of Java data persisted as binary data as a database. Can be used both to implement an embedded database solution (in the same process as the Java application) and a standalone or server-mode database solution (in a separate process). Other forms of persistence than binary data are deliberately not supported because they would not bring any noteworthy advantages but many disadvantages for the task. |
| storage.embedded | Top-level framework for use in a Java application that adds an embedded database solution to its object graphs. Can be used directly in the program code of a Java application to comfortably and efficiently persist its data. |
| storage.embedded.configuration | Layer with support for external configuration files (XML, INI) and convenience functionality to create foundations for the embedded storage. |
MicroStream Data-Store is a native Java object graph storage engine. From a technical point of view it serves one purpose only: To fully or partially persist and restore a Java object graph in the simplest way possible for the user.
MicroStream Data-Store is a storage engine, not a database management system (DBMS). Many features that typical DBMS provide have been left out on purpose. The reason is that those features exist to make a DBMS something of a server application platform of an "old kind" on top of its data store functionality: a standalone process with user management, connection management, session handling, often even with a programming language of its own, a querying interface (SQL), etc. Today, all of those server application features are already handled, and handled much better, by dedicated server applications (the "new kind"), implemented in a modern language like Java. They have their built-in user, connection and session management, the querying interface to the outside world is typically a web service instead of SQL, etc.
But those modern server applications still lack one important thing: an easy-to-use and technically efficient way to store and restore their application's data. So a "new kind" server often uses an "old kind" server just to do the data storing. This comes at the price of taking on all the overhead and problems of redundant user, connection and session management AND the outdated concepts and limitations of the old querying interface (SQL). Isn't that very weird and frustratingly complicated? Why not simply include a modern data storing library in the modern server and be done with it? A storing library that perfectly fits the modern technology and brings in no redundant overhead or complication of a secondary outdated wannabe server process. This is exactly what MicroStream Data-Store is and the reason why it is intentionally not a DBMS but "only" a storage engine.
One might think the easiest way to store and load data in Java would be Java's built-in serialization. However, it turned out long ago to be very limited, making it hard, if not impossible, to be used as a replacement for a DBMS:
Only complete object graphs can be stored and restored, which is unacceptable for all but very small databases.
It is very inefficient in terms of storage size and performance.
It does not handle changing class structures very well, basically forbidding classes of persisted entities to ever change or introducing massive manual effort to compensate.
It cannot handle third-party classes that do not implement Serializable but cannot be changed.
In short: The Java Serialization is not an acceptable data store solution and hence no valid replacement for those outdated DBMS.
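The limitation regarding third-party classes can be reproduced with the JDK alone. In the following minimal sketch, ThirdPartyAddress and Customer are hypothetical classes invented for illustration; ThirdPartyAddress stands in for a library class that does not implement Serializable and cannot be changed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SerializationLimitDemo {

    // Hypothetical stand-in for a third-party class that cannot be changed.
    static class ThirdPartyAddress {
        final String city;
        ThirdPartyAddress(String city) { this.city = city; }
    }

    // An application entity that is Serializable itself but references the class above.
    static class Customer implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final ThirdPartyAddress address;
        Customer(String name, ThirdPartyAddress address) {
            this.name = name;
            this.address = address;
        }
    }

    // Returns true if Java's built-in serialization can write the given object graph.
    static boolean canSerialize(Object graphRoot) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(graphRoot);
            return true;
        } catch (NotSerializableException e) {
            // Thrown as soon as any instance in the graph is not Serializable.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize("plain string")); // prints "true"
        System.out.println(canSerialize(
            new Customer("Alice", new ThirdPartyAddress("Regensburg")))); // prints "false"
    }
}
```

A single non-serializable instance anywhere in the graph is enough to make the whole write fail, which is why this limitation matters for real application data.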
MicroStream Data-Store is such a solution:
It can persist, load or update object graphs partially and on-demand.
It is very efficient both size- and performance-wise.
It handles changing class structures by mapping data in the old structure to the current structure during loading; implicitly via internal heuristics or explicitly via a user-defined mapping strategy.
It can automatically handle any Java constructs, only excluding those that are technically or reasonably not persistable (e.g. lambdas, proxies or instances with ties to JVM-internals like threads, etc.).
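To make the idea of mapping data from an old class structure to the current one more concrete, here is a deliberately simplified sketch. It is not MicroStream's implementation or API: records are modeled as plain maps, the method mapToCurrent and all field names are hypothetical, and the only "heuristic" shown is matching identical field names.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LegacyMappingSketch {

    // Maps a record stored under an old field layout onto the current field layout.
    // Fields with identical names are taken over directly; renamed fields are resolved
    // through the explicit mapping (current name -> old name).
    // Current fields without any stored counterpart stay null.
    static Map<String, Object> mapToCurrent(
            Map<String, Object> storedRecord,
            List<String> currentFields,
            Map<String, String> explicitMapping) {
        Map<String, Object> current = new LinkedHashMap<>();
        for (String field : currentFields) {
            String oldName = explicitMapping.getOrDefault(field, field);
            current.put(field, storedRecord.get(oldName));
        }
        return current;
    }

    public static void main(String[] args) {
        // Data as it was stored with the old class structure.
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("name", "Alice");
        stored.put("mail", "alice@example.org");

        // User-defined mapping: the field "mail" was renamed to "email".
        Map<String, String> renames = new LinkedHashMap<>();
        renames.put("email", "mail");

        // The current class structure also gained a new field "age".
        Map<String, Object> loaded =
            mapToCurrent(stored, Arrays.asList("name", "email", "age"), renames);
        System.out.println(loaded); // prints "{name=Alice, email=alice@example.org, age=null}"
    }
}
```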
MicroStream is what the Java Serialization should have been and it is the first and only really fitting data storing solution for modern applications, completely removing the need to attach a wannabe secondary server DBMS just to store data.
This simplest example creates a new storage if no existing storage is found. If an existing storage is found, it is loaded instead (all of this happens in line 2 of the example above).
In line 6 the current storage's content is printed.
Line 7 assigns some data to the storage, replacing existing data if there is some.
In line 8 everything gets stored.
When using MicroStream, your entire database is accessed starting at a root instance. This instance is the root object of an object graph that gets persisted by the MicroStream storage logic. While the root instance can be of any type (for example just a collection or an array), it is a good idea to define an explicit root type specific for the application. In this simple example, it is a class called DataRoot, which wraps a single String.
More about root instances: Root Instances
The following code is all that is required to set up an application backed by a MicroStream database. The application's convenience root instance is defined and an EmbeddedStorageManager instance, linked to the root, is created (and its database managing threads are started). This is a fully operational Java database application.
This call is all that is necessary to store data in the simplest case.
Best practice is to safely shut down the storage manager by simply calling:
storageManager.storeRoot() is a special case method that always stores the root object. If you want to store any other object than the root itself, just call storageManager.store(modifiedObject).
The full code for the Hello World example is on GitHub.
Object instances can be stored as simple records: one value after another as a trivial byte stream. References between objects are mapped with unique numbers, called ObjectIds, or OIDs for short. With both combined, byte streams and OIDs, an object graph can be stored in a simple and quick way, and it can be loaded again as a whole or partially.
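The combination of flat records and OIDs can be illustrated with a toy model. The sketch below is not MicroStream's storage format: the Node type, the record layout and the OID numbering starting at 1 are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class OidRecordSketch {

    // A tiny entity type with one value and one reference.
    static class Node {
        final int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    // Walks the graph starting at the given node and returns one flat record per instance:
    // { own OID, value, OID of the referenced instance (0 = null reference) }.
    static List<long[]> toRecords(Node start) {
        Map<Node, Long> oids = new IdentityHashMap<>();
        List<Node> ordered = new ArrayList<>();
        long nextOid = 1; // hypothetical numbering scheme

        // First pass: give every reachable instance a unique OID (each instance only once).
        for (Node n = start; n != null && !oids.containsKey(n); n = n.next) {
            oids.put(n, nextOid++);
            ordered.add(n);
        }

        // Second pass: write values directly and references as OIDs.
        List<long[]> records = new ArrayList<>();
        for (Node n : ordered) {
            long referenced = n.next == null ? 0L : oids.get(n.next);
            records.add(new long[] { oids.get(n), n.value, referenced });
        }
        return records;
    }

    public static void main(String[] args) {
        Node a = new Node(10);
        Node b = new Node(20);
        a.next = b;
        b.next = a; // a cycle is unproblematic because references are stored as OIDs

        for (long[] record : toRecords(a)) {
            System.out.println("oid=" + record[0] + " value=" + record[1] + " ref=" + record[2]);
        }
    }
}
```

Because each record only contains plain values and numbers, any single record can be read without touching the rest of the graph, which is what makes partial loading possible in principle.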
But there is a small catch. Where does it start? What is the first instance or reference at startup? Strictly speaking "nothing". That's why at least one instance or a reference to an instance must be registered in a special way, so that the application has a starting point from where the object graph can be loaded. This is a "Root Instance".
A related problem concerns instances that are referenced by constant fields in Java classes. These aren't created when the records are loaded from the database, but by the JVM while loading the classes. Without special treatment, this would be a problem:
The application, meaning the JVM or the JVM process, starts, the constant instances are created by the JVM, one or more of them are stored, then the application shuts down.
The stored data of the constants are now stored with a certain OID in the database.
The application starts again.
The constant instances are created again by the JVM. The data records are read by MicroStream.
The problem is: How should the application know what values, which are stored with a certain OID, belong to which constant? The JVM created everything from scratch at startup and doesn't know anything about OIDs. To resolve this, the constant instances must be registered, just like the entity graph's root instance. Then MicroStream can associate the constant instances with the stored data via the OIDs. Constant instances can be thought of as JVM-created implicit root instances for the object graph.
In both cases, root and constant instances, it is about registering special starting points for the object graph in order to load it correctly. For MicroStream, from a plain technical view, there is no difference between the two cases.
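The registration idea can be sketched with a toy model: a stable identifier is linked to an OID, so that an instance created from scratch by the JVM can be associated with its stored data after a restart. This is only an illustration and not MicroStream's mechanism; the Database class, its methods and the identifier string are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class ConstantRegistrySketch {

    // Stands in for the database: OID -> stored state, plus the registered identifier -> OID links.
    static class Database {
        final Map<Long, String> recordsByOid = new HashMap<>();
        final Map<String, Long> oidsByIdentifier = new HashMap<>();
        long nextOid = 1;

        // Stores a constant instance's state under the OID linked to its identifier.
        void store(String identifier, StringBuilder constant) {
            Long oid = oidsByIdentifier.get(identifier);
            if (oid == null) {
                oid = nextOid++;
                oidsByIdentifier.put(identifier, oid);
            }
            recordsByOid.put(oid, constant.toString());
        }

        // Updates a freshly JVM-created constant instance with its stored state, if there is any.
        void load(String identifier, StringBuilder constant) {
            Long oid = oidsByIdentifier.get(identifier);
            if (oid != null) {
                constant.setLength(0);
                constant.append(recordsByOid.get(oid));
            }
        }
    }

    public static void main(String[] args) {
        Database database = new Database();

        // First run: the JVM creates the constant, the application changes and stores it.
        StringBuilder firstRun = new StringBuilder("initial");
        firstRun.append(" + changed");
        database.store("com.example.Config#GREETING", firstRun);

        // Second run: the JVM creates the constant again from scratch.
        StringBuilder secondRun = new StringBuilder("initial");
        database.load("com.example.Config#GREETING", secondRun);
        System.out.println(secondRun); // prints "initial + changed"
    }
}
```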
In the most common cases, nothing at all. The default behavior is enough to get things going.
By default, a single instance can be registered as the entity graph's root, accessible via EmbeddedStorage.root().
Therefore, this is already a fully fledged (although tiny) database application:
The simple default approach has its limits when the application defines an explicit root instance that must be updated/filled from the database directly during database initialization.
Something like this:
To solve this, a custom root instance can be directly registered at the database setup. In the simplest case, it just has to be passed to .start():
Internally, the two concepts (default root and custom root) are handled by different mechanisms. This can be seen from the two different methods
The simplified method storageManager.root() automatically chooses the variant that is used. Since none of those three methods can know the concrete type of the root instance (and adding a type parameter just for that would have been a complication overkill), they all can only be typed to return Object. So, to avoid annoying and dangerous casts, it is best to keep a direct reference to a custom root instance as shown in the code snippet above.
Likewise, storageManager.storeRoot() works for both variants, so there is no need to worry about how to store which one.
The EmbeddedStorageManager is mostly created with factory methods of EmbeddedStorage, where the most common settings, like database directory or the root instance, can be configured.
To achieve a more detailed customization, you can utilize the EmbeddedStorageFoundation factory type. It holds and creates on demand all the parts that form an EmbeddedStorageManager.
The artifact storage.embedded.configuration provides a convenience layer for configuration purposes, as well as facilities to read external configuration.
The Configuration type consolidates the most widely used parameters from the storage foundations in one place. Its output is an EmbeddedStorageFoundation from which an EmbeddedStorageManager can be created.
To read an external configuration use ConfigurationLoader and ConfigurationParser or the Load*() methods of Configuration. Currently XML and INI files are supported.
The full example can be found on GitHub.
If you use a different format, e.g. JSON, just implement the ConfigurationParser along the lines of XmlConfigurationParser or IniConfigurationParser.
These are the available properties of the Configuration type. The names are used accordingly in the external configuration files. They can be found as constants in ConfigurationPropertyNames.
Property
Short Description
baseDirectory
The base directory of the storage in the file system. Default is "storage" in the working directory.
deletionDirectory
If configured, the storage will not delete files. Instead of deleting a file it will be moved to this directory.
truncationDirectory
If configured, files that will get truncated are copied into this directory.
backupDirectory
The backup directory.
channelCount
The number of threads and number of directories used by the storage engine. Every thread has exclusive access to its directory. Default is 1.
channelDirectoryPrefix
Name prefix of the subdirectories used by the channel threads. Default is "channel_".
dataFilePrefix
Name prefix of the storage files. Default is "channel_".
dataFileSuffix
Name suffix of the storage files. Default is ".dat".
transactionFilePrefix
Name prefix of the storage transaction file. Default is "transactions_".
transactionFileSuffix
Name suffix of the storage transaction file. Default is ".sft".
typeDictionaryFilename
The name of the dictionary file. Default is "PersistenceTypeDictionary.ptd".
houseKeepingInterval
Interval in milliseconds for the housekeeping. This is work like garbage collection or cache checking. In combination with houseKeepingNanoTimeBudget the maximum processor time for housekeeping work can be set. Default is 1000 (every second).
houseKeepingNanoTimeBudget
Number of nanoseconds used for each housekeeping cycle. Default is 10000000 (10 million nanoseconds = 10 milliseconds = 0.01 seconds).
entityCacheThreshold
Abstract threshold value for the lifetime of entities in the cache. Default is 1000000000.
entityCacheTimeout
Timeout in milliseconds for the entity cache evaluator. If an entity wasn't accessed in this timespan it will be removed from the cache. Default is 86400000 (1 day).
dataFileMinSize
Minimum file size for a data file to avoid cleaning it up. Default is 1024^2 = 1 MiB.
dataFileMaxSize
Maximum file size for a data file to avoid cleaning it up. Default is 1024^2*8 = 8 MiB.
dataFileDissolveRatio
The degree of the data payload of a storage file to avoid cleaning it up. Default is 0.75 (75%).
channelCount
Number of threads used by the storage engine. It also determines the number of subdirectories. Each thread manages one directory, which it writes to and reads from exclusively. The unit of thread, directory and cached data is therefore called a "Channel".
houseKeepingInterval
Number of milliseconds for the housekeeping interval. Housekeeping tasks are, among others:
Garbage Collection
Cache Check
File Cleanup Check
In combination with houseKeepingNanoTimeBudget, it can be specified how much CPU time should be used for housekeeping. E.g. interval=1000ms and budget=10000000ns means that every second there are 0.01 seconds available for housekeeping, so at most 1% CPU time is used for housekeeping. This CPU time window is only used if housekeeping work is pending. If nothing has to be done, no time is wasted.
houseKeepingNanoTimeBudget
Number of nanoseconds used for each housekeeping cycle. Default is 10000000 (10 million nanoseconds = 10 milliseconds = 0.01 seconds).
However, no matter how small the budget is, at least one item of work will always be completed. This is to avoid no-ops if the configured time window is too small. If there is nothing to clean up, no processor time is wasted.
This time budget is a "best effort" threshold, meaning that if, with 1 ns left, a huge file has to be cleaned or the references of a huge collection have to be marked for GC, the budget can be exceeded considerably.
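The interplay of interval and time budget described above is plain arithmetic. The following is a minimal sketch of that calculation only; the class and method names are hypothetical and not part of the MicroStream API. Since the budget is a best-effort threshold, the computed share is a nominal upper bound, not a hard guarantee.

```java
// Hypothetical helper, not part of MicroStream: computes the nominal share of
// time that housekeeping may use per interval.
final class HousekeepingBudget {

    /** Nominal upper bound of the housekeeping time share (0.01 means 1%). */
    static double maxCpuShare(long intervalMillis, long budgetNanos) {
        if (intervalMillis <= 0 || budgetNanos < 0) {
            throw new IllegalArgumentException("interval must be > 0 and budget >= 0");
        }
        // Convert the interval to nanoseconds so both values use the same unit.
        double intervalNanos = intervalMillis * 1_000_000.0;
        return budgetNanos / intervalNanos;
    }

    public static void main(String[] args) {
        // Default values from the property table: 1000 ms interval, 10000000 ns budget.
        System.out.println(maxCpuShare(1000, 10_000_000));
    }
}
```

With the default values this yields a share of one percent, matching the example in the text.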
dataFileDissolveRatio
The degree of the data payload of a storage file to avoid cleaning it up. There are logical "gaps" in database files: byte ranges of records that were replaced by newer versions in subsequent database files. These gaps can be removed to keep the used disk space at a minimum. A value of 1 means: Even at the smallest gap, clean up the file. A value of 0 means: Gaps don't matter, never clean up anything. To clean up a file, all live data in it is copied to the current head file. The source file then consists only of gaps and can be deleted without loss of data.
dataFileMinSize
Minimum file size in bytes of a storage file to avoid merging with other files during housekeeping. Must be greater than 1, maximum is 2GB.
dataFileMaxSize
Maximum file size in bytes of a storage file to avoid splitting into more files during housekeeping. Must be greater than 1, maximum is 2GB.
Due to internal implementation details, files larger than 2GB are not supported!
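How the three data file settings interact can be illustrated with a small decision function. This is a simplified model derived only from the descriptions above, not the actual StorageDataFileEvaluator implementation; the class, enum and method names are hypothetical.

```java
// Toy model of the data file evaluation described above (not MicroStream code).
final class DataFileCheck {

    enum Action { KEEP, MERGE, SPLIT, DISSOLVE }

    private final long minSize;          // bytes, e.g. 1024 * 1024
    private final long maxSize;          // bytes, e.g. 8 * 1024 * 1024
    private final double dissolveRatio;  // 0.0 .. 1.0, e.g. 0.75

    DataFileCheck(long minSize, long maxSize, double dissolveRatio) {
        this.minSize = minSize;
        this.maxSize = maxSize;
        this.dissolveRatio = dissolveRatio;
    }

    /** Decides what housekeeping would do with a file of the given size and live payload. */
    Action evaluate(long fileSize, long liveBytes) {
        if (fileSize < minSize) {
            return Action.MERGE;     // too small: merge with other files if possible
        }
        if (fileSize > maxSize) {
            return Action.SPLIT;     // too large: split into smaller files
        }
        // Payload ratio: share of the file that is still live data (the rest are gaps).
        double payloadRatio = (double) liveBytes / fileSize;
        return payloadRatio < dissolveRatio ? Action.DISSOLVE : Action.KEEP;
    }
}
```

In this model a ratio of 1 dissolves a file as soon as it contains any gap, and a ratio of 0 never dissolves anything, which matches the two boundary cases named in the text.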
This list shows which property configures which type, used by the foundation types, to create the storage manager.
Property
Used by
baseDirectory
StorageFileProvider
deletionDirectory
StorageFileProvider
truncationDirectory
StorageFileProvider
backupDirectory
StorageBackupSetup
channelCount
StorageChannelCountProvider
channelDirectoryPrefix
StorageFileProvider
dataFilePrefix
StorageFileProvider
dataFileSuffix
StorageFileProvider
transactionFilePrefix
StorageFileProvider
transactionFileSuffix
StorageFileProvider
typeDictionaryFilename
StorageFileProvider
houseKeepingInterval
StorageHousekeepingController
houseKeepingNanoTimeBudget
StorageHousekeepingController
entityCacheThreshold
StorageEntityCacheEvaluator
entityCacheTimeout
StorageEntityCacheEvaluator
dataFileMinSize
StorageDataFileEvaluator
dataFileMaxSize
StorageDataFileEvaluator
dataFileDissolveRatio
StorageDataFileEvaluator
Using a storage file provider (one.microstream.storage.types.StorageFileProvider) allows you to specify the location and naming rules for all storage-related files.
Available properties are:
BaseDirectory: The base directory of the MicroStream storage. Contains the channel directories and the type dictionary file.
DeletionDirectory: If configured, the storage will not delete files. Instead of deleting a file it will be moved to this directory.
TruncationDirectory: If configured, files that will get truncated are copied into this directory.
ChannelDirectoryPrefix: Channel directory prefix string.
StorageFilePrefix: Storage file prefix string.
StorageFileSuffix: Storage file extension.
TransactionsFilePrefix: Transactions file prefix.
TransactionsFileSuffix: Transactions file extension.
TypeDictionaryFileName: Filename of the type dictionary.
Channels are the IO threads used by the MicroStream storage engine. A single channel represents the unit of a thread, a storage directory and cached data. Increasing the number of channels means running more IO threads.
The channel count is an important configuration value that affects IO performance.
For the channel configuration the following configuration properties are available:
Channel count
The number of channels that MicroStream will use. Must be
Channel directory prefix
The channel directory will be prefix+channelNumber, e.g. "ch_0" if the prefix is "ch_".
Data file prefix: default is "channel_"
Data file suffix: default is ".dat"
Channel file size configuration is done by the Storage Data File Evaluator.
They can be set by the storage.embedded.configuration API:
See also: Configuration
Or by setting a StorageFileProvider using the EmbeddedStorageFoundation factory:
Housekeeping interval and time budget are configured by setting up a StorageHousekeepingController.
Available properties are:
Housekeeping interval: The interval in milliseconds at which housekeeping is triggered. Default is once per second.
Housekeeping budget: The time budget for housekeeping in nanoseconds. Default is 0.01 seconds.
The desired minimum and maximum file sizes and the payload ratio are configured by the StorageDataFileEvaluator:
Available properties are:
Data file minimum size: Files smaller than the minimum file size will be merged with other files if possible. Default is 1 MiB.
Data file maximum size: Files larger than the maximum file size will be split into smaller ones. Default is 8 MiB.
Data file dissolve ratio: Minimum payload ratio of a data file; a file falling below it triggers file refactoring. Default is 0.75 (75%).
The lifetime of objects in the internal entity cache can be configured by the StorageEntityCacheEvaluator:
Available properties are:
Entity cache threshold
Abstract threshold value, roughly comparable to size in bytes with a time component, at which a cache must be cleared of some entities. Default is 1000000000.
Entity cache timeout
Time in milliseconds after which an entity is considered old if it has not been read in the meantime. Must be greater than zero. Default is 86400000 ms (1 day).
For external configuration see: Properties
By default, the continuous backup is disabled. If enabled, the MicroStream instance will clone all changes to another directory. The backup is identical to the primary MicroStream storage.
To enable the continuous backup, just set the backup directory:
With the storage.embedded.configuration API:
With MicroStream foundation classes:
By default, MicroStream uses the operating system's standard file locking mechanism to prevent simultaneous access to the storage files. In the rare case that this is not sufficient to control file access, MicroStream provides a proprietary file lock implementation to ensure exclusive access to the storage files from different applications using MicroStream.
Using this file lock may only be necessary if, while a MicroStream application is running, a second MicroStream application may try to access the same storage and the default file locks are not reliable.
You don't need to activate this feature if:
only one MicroStream application will access the storage,
MicroStream applications that may access the same storage run on the same system,
other applications that may access the storage files don't use MicroStream to access them.
To activate the internal file lock you need to set up a StorageLockFileSetup:
The default interval at which the locks are updated is 10 seconds. You can set a custom value in milliseconds with:
Storage.LockFileSetupProvider( final long updateInterval )
To specify the charset used by the lock files use:
Storage.LockFileSetupProvider( final Charset charset )
or, to customize both:
Storage.LockFileSetupProvider( final Charset charset , final long updateInterval )
MicroStream is designed to work with object graphs. Thus, storing data means storing an object graph. This includes the object's value fields and references to other objects. Storing an object will also store all instances referenced by this object that have not been stored before. When storing your data, MicroStream performs most of the work for you. You only need to call the store method on the correct object. The rule is: "The Object that has been modified has to be stored!".
Storing objects that are not part of an object graph is most likely pointless.
See Getting Started for how to create a database with a root instance.
To store the registered root instance, just call the storeRoot() method of an EmbeddedStorageManager instance.
To store a newly created object, store the "owner" of the object. In the example below a new object is created and added to the myObjects list of the root object. Then the modified list gets stored. This will also store the new object.
When storing a change, keep in mind that it is the modified object itself that needs to be stored.
In the case of a value type like int, it is the object that has the int field as a member:
Don't forget immutable objects
Immutable objects like String cannot be modified.
Assigning a new value to a String does not modify the String object. Instead, a new String instance is created and the reference is changed!
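The point about immutable objects can be checked with plain Java, without any storage involved. The following sketch uses a hypothetical Customer class; it only demonstrates that "changing" a String field replaces the reference, so the object that was actually modified is the owner of the field.

```java
// Hypothetical entity class, used for illustration only.
final class Customer {
    String name;

    Customer(String name) {
        this.name = name;
    }
}

final class ImmutableStringDemo {
    public static void main(String[] args) {
        Customer customer = new Customer("Alice");
        String before = customer.name;

        // This does not modify the old String instance.
        // It creates a new String and lets the field point to it.
        customer.name = customer.name + " Smith";

        System.out.println(before);                   // the old instance is unchanged
        System.out.println(before != customer.name);  // the field references another instance

        // The modified object is therefore the Customer, not the String.
        // Following the rule above, the Customer instance is the one to pass to store(...).
    }
}
```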
The full code for the example is on GitHub.
Besides long store(Object instance), MicroStream provides some convenience methods to store several objects at once:
void storeAll(Iterable<?> instances)
Stores the passed instance in any case and all referenced instances of persistable references recursively, but stores referenced instances only if they are newly encountered (i.e. they don't have an ID associated with them in the object registry yet and therefore need to be handled). This is useful for the common case of just storing an updated instance and potentially newly created instances along with it while skipping all existing (and normally unchanged) referenced instances.
long[] storeAll(Object... instances)
Convenience method to store multiple instances. The passed array (possibly created implicitly by the compiler) itself is NOT stored.
MicroStream does not provide explicit transactions; every call to a store method is automatically a transaction. A store operation is an atomic all-or-nothing operation. If the store call is successful, all data is written to the storage. Otherwise no data is persisted. Partially persisted data will be reverted.
The MicroStream engine supports two general storing strategies: Lazy and eager storing. By default, MicroStream uses the lazy storing strategy.
These storing strategies differ in how objects referenced by the object to be stored are handled if those referenced objects have already been stored.
Lazy storing is the default storing mode of the MicroStream engine.
Referenced instances are stored only if they have not been stored yet. If a referenced instance has been stored previously, it is not stored again, even if it has been modified.
That's why modified objects must be stored explicitly.
In eager storing mode, referenced instances are stored even if they have been stored before. In contrast to lazy storing, this will also store modified child objects, at the cost of performance.
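The difference between the two strategies can be made tangible with a toy model that only records which instances a store call would write. This is an illustration of the rules described above, not MicroStream's implementation; Node and ToyStorer are hypothetical names, and a plain identity set stands in for the object registry.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical graph node with a name and outgoing references.
final class Node {
    final String name;
    final List<Node> references = new ArrayList<>();

    Node(String name) {
        this.name = name;
    }
}

// Toy storer: remembers which instances were stored before and reports what gets written.
final class ToyStorer {
    private final Set<Node> registered = Collections.newSetFromMap(new IdentityHashMap<>());

    /** Lazy: the passed instance is always written, referenced ones only if new. */
    List<String> storeLazy(Node instance) {
        return store(instance, false);
    }

    /** Eager: the passed instance and everything reachable from it is written. */
    List<String> storeEager(Node instance) {
        return store(instance, true);
    }

    private List<String> store(Node instance, boolean eager) {
        List<String> written = new ArrayList<>();
        Set<Node> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        visit(instance, true, eager, visited, written);
        return written;
    }

    private void visit(Node node, boolean passedInstance, boolean eager,
                       Set<Node> visited, List<String> written) {
        if (!visited.add(node)) {
            return; // already handled in this store call (protects against cycles)
        }
        if (registered.contains(node) && !eager && !passedInstance) {
            return; // lazy: an already stored reference is skipped, modified or not
        }
        registered.add(node);
        written.add(node.name);
        for (Node reference : node.references) {
            visit(reference, false, eager, visited, written);
        }
    }
}
```

In this model, adding a new child to an already stored node and then lazily storing the root writes only the root. The new child is written only when its owner is stored explicitly or when the eager variant is used.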
To use lazy or eager storing explicitly, get an instance of the required Storer and use its store methods:
Available Storers are:
storage.createLazyStorer()
storage.createEagerStorer()
Standard storing:
storage.createStorer()
will provide corresponding Storer instances.
Besides the 'global' lazy or eager storing strategies, MicroStream allows you to implement individual handling of the storing behavior. See PersistenceEagerStoringFieldEvaluator for details.
The default MicroStream implementation fully supports the Java transient field modifier. Class members marked transient will not be persisted.
It is possible to override the default behavior by implementing a custom PersistenceFieldEvaluator.
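To see which fields of a class fall under the default rule, the transient modifier can be inspected with standard Java reflection. This is a minimal sketch that only mirrors the rule "transient fields are not persisted"; it does not use or reproduce MicroStream's field evaluation, and the Session class and helper names are hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical entity with one transient field.
final class Session {
    String userName;
    int loginCount;
    transient String cachedToken; // excluded from persistence by default
}

final class TransientFields {

    /** Lists the names of all declared fields that are not marked transient. */
    static List<String> persistableFieldNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            // Skip transient fields and compiler- or tool-generated fields.
            if (Modifier.isTransient(field.getModifiers()) || field.isSynthetic()) {
                continue;
            }
            names.add(field.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(persistableFieldNames(Session.class));
    }
}
```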
In some cases, it can be necessary to store modified encapsulated objects that cannot be accessed from your code.
In the code snippet above, the "hidden" object cannot be accessed by store(myForeignObject.hidden) if no getter is available. To allow such hidden objects to be stored after they have been modified, you have two options:
Set the global storing strategy of the MicroStream instance to eager storing, or
Implement and set a custom PersistenceEagerStoringFieldEvaluator for this field.
To increase performance, use immutable sub-graphs as often as possible. Storing those with the provided convenience storing methods or using a thread-local storer to insert those sub-graphs concurrently can give a great performance boost.
Loading data can be done in two ways, eager and lazy. The basic, default way of loading is eager loading. This means that all objects of a stored object graph are loaded immediately. This is done automatically during startup of the MicroStream database instance if an already existing database is found.
In contrast to lazy loading, eager loading imposes no requirements on your entity model.
To load your data you just need to create an EmbeddedStorageManager instance:
After that, just get the root instance of your object graph from the StorageManager by calling EmbeddedStorageManager.root() and check for null, as this indicates a non-existing database.
The full code for the eager loading example is on GitHub.
In this chapter it is explained how Lazy Loading is done with MicroStream.
Of course, it's not really about the technical implementation of procrastination, but about efficiency: why bloat the limited RAM with stuff before you even need it?
Classic example: The application has self-contained data areas that contain a large amount of data. The data for an area is not loaded if the area is not worked on at all. Instead, you only load a tiny amount of "head data" for each area (a name or other data for display purposes) and the actual data only when the application really needs it. E.g. fiscal years with hundreds of thousands or millions of sales. One million revenue records for 2010, one million for 2011, for 2012, etc. In 2019, most of the time only 2019 and 2018 will be needed. The few years before that, let alone the sales of the year 2000, are not of great interest anymore. Therefore: load data only when needed. Super efficient.
For example let's say the app "MyBusinessApp" has a root instance class, looking like this:
The business year holds the turnovers:
This approach would be a problem: During initialization, the root instance would be loaded, from there its HashMap with all BusinessYear instances, each with its ArrayList and thus all turnovers. For all years. 20 years at approximately 1 million sales each makes 20 million entities, which are pumped directly into the RAM at the start. It does not matter if someone needs them or not. We don't want it that way.
It would be nice if you could simply add a "lazy" to the turnover list. And that's exactly how it works:
And bingo, the turnovers are now loaded lazily.
Of course, this is no longer an ArrayList<Turnover> field that is now magically loaded lazily; it is now a Lazy field, and the instances of this type are typed generically to ArrayList<Turnover>. Lazy is just a simple class whose instances internally hold an ID and a reference to the actual thing (here the ArrayList instance). If the internal reference is null, the stored ID is used to reload it. If it is not null, it is simply returned. So it is just an intermediate reference instance. Similar to the JDK's WeakReference, just not JVM-weak, but storage-lazy.
What do you have to do now to get the actual ArrayList<Turnover> instance?
Just as with WeakReference, or simply as one would expect from an intermediate reference type in general: a simple get method.
The .get() call reloads the data as needed. But you do not want to mess around with that yourself. No "SELECT bla FROM turnovers WHERE ID =" + this.turnovers.getId(). Since you want to program your application, you shouldn't have to mess around with low-level database ID-loading stuff. That's what the MicroStream code does internally. You do not even need to access the ID, you just have to say "get!".
That's it.
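The mechanism described above (an ID plus a reference that is reloaded on demand) can be modeled in a few lines. This is a conceptual sketch only and not the MicroStream Lazy class; SimpleLazy and its loader function are hypothetical, with the loader standing in for the lookup that the storage performs internally.

```java
import java.util.function.LongFunction;

// Conceptual model of a lazy reference: an ID plus an optional in-memory reference.
final class SimpleLazy<T> {
    private final long objectId;           // ID under which the subject is stored
    private final LongFunction<T> loader;  // stand-in for loading by ID from the storage
    private T subject;                     // null while the subject is not loaded

    SimpleLazy(long objectId, LongFunction<T> loader) {
        this.objectId = objectId;
        this.loader = loader;
    }

    /** Returns the subject, loading it by its ID first if it is not in memory. */
    T get() {
        if (subject == null) {
            subject = loader.apply(objectId);
        }
        return subject;
    }

    /** Drops the in-memory reference but keeps the ID, so get() can reload it. */
    void clear() {
        subject = null;
    }
}
```

Calling get() twice in a row triggers the loader only once; after clear(), the next get() loads again. The caller never touches the ID.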
There are different strategies for what to write here. Analogous to the code example before, it would simply be:
So always a new ArrayList instance, wrapped in a Lazy instance. If the actual ArrayList reference should be null at first, it works the same way:
The this.turnovers.get() call then simply always returns null. Completely transparent.
But you could also do this:
If there is no list, then you do not create an intermediate reference instance for it at all. A separate instance for null is indeed a bit ... meh.
But that causes a nasty problem elsewhere: this.turnovers.get() does not work then, because of a NullPointerException.
Having to write this every time does not exactly help the readability of the code:
But there is a simple solution: Just move this check into a static util method. Just like that:
This is the same .get(), just with a static null-check around it. This always puts you on the safe side.
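What such a static null-check looks like can be sketched independently of MicroStream. In the following example, java.util.function.Supplier is used as a stand-in for the lazy reference type, and LazyUtil is a hypothetical helper name.

```java
import java.util.function.Supplier;

// Hypothetical utility: a null-safe wrapper around a reference's get() method.
final class LazyUtil {

    /** Returns null if the reference itself is null, otherwise the result of its get(). */
    static <T> T get(Supplier<T> reference) {
        return reference == null ? null : reference.get();
    }

    public static void main(String[] args) {
        Supplier<String> present = () -> "turnovers";
        Supplier<String> missing = null;

        System.out.println(get(present)); // delegates to present.get()
        System.out.println(get(missing)); // no NullPointerException, just null
    }
}
```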
For Lazy Loading, simply wrap Lazy<> around the actual field and then call .get() or maybe better Lazy.get(...).
It's as simple as that.
The full example can be found on GitHub.
Why do you have to replace your actual instance with a lazy loading intermediate instance and fiddle around with generics? Why isn't something like this possible:
Put simply:
If it were just that, it would be bare Java bytecode for accessing an ArrayList. There would be no way for a middleware library to intercept the access, look it up and perhaps reload it. What's written there is an ArrayList reference. There is no lazy anymore. Either the instance is null, or it is not null. If you wanted to reach in there, you would have to start with bytecode manipulation. Technically possible, but something you really don't want in your application.
So there must always be some form of intermediary.
Hibernate solves this through its own collection implementations that do lazy loading internally. Although the lazy loading is nicely hidden in some way (or not, if you need an annotation for that), it also comes with all sorts of limitations. You can only use interfaces instead of concrete classes for collections. The instance is not the one you specified in the first place, the code becomes opaque and difficult to debug, you have to use a collection even if it's just a single instance, and so on. You want to be able to write anything you want, and you want full insight and control (debuggability, etc.) over the code.
All this can be done with the tiny Lazy Interim Reference class. No restrictions, no incomprehensible "magic" under the hood (proxy instances and stuff) and also usable for individual instances.
For convenience, MicroStream provides null-safe static access methods for lazy references.
These are:
Lazy.get(Lazy)
Gets the lazily referenced object and loads it if required.
Return value: null if the lazy reference itself is null, otherwise the referenced object.
Lazy.peek(Lazy)
Gets the lazily referenced object if it is loaded; no lazy loading is done. If the object has been unloaded before, peek will return null.
Return value: null if the lazy reference itself is null, otherwise the current reference without on-demand loading.
Lazy.clear(Lazy)
Clears the lazy reference if it is not null.
All lazy references track the time of their last access (creation or querying) as a timestamp in milliseconds. If an instance is deemed timed out by a LazyReferenceManager, its subject gets cleared.
The timestamp is currently not publicly accessible.
The Lazy class has a .clear() method. When called, the reference held in the Lazy Reference is removed and only the ID is kept so that the instance can be reloaded when needed.
Important background knowledge:
However, such a clear does not mean that the referenced instance immediately disappears from memory. That's the job of the garbage collector of the JVM. The reference is even registered in another place, namely in a global directory (Swizzle Registry), in which each known instance is registered with its ObjectId in a bijective manner. This means: if you clear such a reference, but shortly thereafter the Lazy Reference is queried again, probably nothing has to be loaded from the database; the reference is simply restored from the Swizzle Registry. Nevertheless, the Swizzle Registry is not a memory leak, because it references the instances only via WeakReference. In short, if an instance is only referenced as "weak", the JVM GC will still clean it up.
So that the Lazy References do not have to be managed manually but are handled automatically, there is the following mechanism: Each Lazy instance has a lastTouched timestamp. Each .get() call sets it to the current time. This tells how long a Lazy Reference has not been used, i.e. whether it is needed at all.
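The timestamp mechanism can be illustrated with a small model in which the current time is passed in explicitly, so the behavior is easy to follow. This is a hypothetical sketch, not the MicroStream Lazy or LazyReferenceManager implementation, and it leaves out reloading after a clear.

```java
// Toy model of a reference that tracks when it was last touched.
final class TimedReference<T> {
    private T subject;
    private long lastTouched; // milliseconds

    TimedReference(T subject, long nowMillis) {
        this.subject = subject;
        this.lastTouched = nowMillis; // creation counts as an access
    }

    /** Returns the subject and records the access time. */
    T get(long nowMillis) {
        this.lastTouched = nowMillis;
        return subject;
    }

    boolean isLoaded() {
        return subject != null;
    }

    /** Clears the subject if it was not touched for longer than the timeout. */
    boolean clearIfTimedOut(long nowMillis, long timeoutMillis) {
        if (subject != null && nowMillis - lastTouched > timeoutMillis) {
            subject = null;
            return true;
        }
        return false;
    }
}
```

A periodic check that calls clearIfTimedOut on all known references is, in this model, the counterpart of the auditing described in the text.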
The LazyReferenceManager audits this. By default, it is not enabled, because it needs its own thread and it is problematic when frameworks start threads unsolicited. But it is quickly activated:
Deleting data does not require performing explicit deleting actions like DELETE FROM table WHERE.... Instead, you just need to clear any references to the object in your object graph and store those changes. If a stored object is not reachable anymore, its data will be deleted from the storage later. This behavior is comparable to Java's garbage collector.
Deleted data is not erased immediately from the storage files.
The erasing from the storage files is done by the housekeeping process.
The MicroStream engine takes care of persisting your object graph. When you run queries, they are not run on the data stored by MicroStream; they run on your data in the local system memory. There is no need to use special query languages like SQL. All operations can be done with plain Java. MicroStream does not restrict you in the way you query your object graph. You are totally free to choose the method that best fits your application.
One possibility may be Streams if you use the standard Java collections.
Of course you must care about lazy loading if you use that feature.
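Since queries run on ordinary objects in memory, a query is just regular Java code. The following sketch uses the standard Streams API on a plain list; the Sale class and the query methods are hypothetical examples and involve no MicroStream API. If the list were held behind a lazy reference, it would have to be fetched with get() first.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical entity class.
final class Sale {
    final int year;
    final double amount;

    Sale(int year, double amount) {
        this.year = year;
        this.amount = amount;
    }
}

final class SalesQueries {

    /** Sums the amounts of all sales of the given year. */
    static double totalForYear(List<Sale> sales, int year) {
        return sales.stream()
            .filter(sale -> sale.year == year)
            .mapToDouble(sale -> sale.amount)
            .sum();
    }

    /** Returns all sales with at least the given amount. */
    static List<Sale> atLeast(List<Sale> sales, double minAmount) {
        return sales.stream()
            .filter(sale -> sale.amount >= minAmount)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Sale> sales = Arrays.asList(
            new Sale(2018, 100.0), new Sale(2019, 250.0), new Sale(2019, 50.0));
        System.out.println(totalForYear(sales, 2019));
        System.out.println(atLeast(sales, 100.0).size());
    }
}
```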
Actually, a database is a passive collection of persisted data that can never be live on its own. But the managing thread accessing it can.
When an EmbeddedStorageManager is "started", it is actually just set up with all kinds of default and user-defined settings and definitions. What is actually "started" are the database managing threads that process storing and loading requests.
Of course, for every start() method, there needs to be something like a shutdown() method. So there is in MicroStream:
But is it really necessary to call shutdown? Should it be? What if there's an error and the process stops without calling shutdown()? Will that cause the database to become inconsistent, corrupted, maybe even destroyed?
The answer is: It wouldn't be much of a database solution if a crash could cause any problem in the persisted data. MicroStream data-store is carefully designed in such a fashion that the process it runs in can simply vanish at any point in time and the persisted data will never be corrupted.
This is surprisingly simple and reliable to solve:
Whenever a .store() call returns, it is guaranteed that the data stored by it has been physically written to the underlying storage layer, usually a file system. Before that, there is no guarantee regarding written data at all. In fact, should the process die before the last byte has been written and secured, the next StorageManager initialization will recognize that and truncate the last partially written store. Either way, all the data that was guaranteed to be written will be consistently available after the next .start().
As a consequence, this safety mechanism makes an explicit .shutdown() call pretty much unnecessary. It doesn't hurt, but it is effectively more or less the same as just calling System.exit(0);.
The only time an explicit shutdown is really needed is when the database managing threads are to be stopped while the application itself keeps running. For example, it is perfectly valid to start the StorageManager, work with the database, then stop it, maybe change some configuration or copy files or something like that, and then start it up again to continue working.
In any other case, the shutdown method can be ignored and the live database can happily just be "killed" while running. It is specifically designed to withstand such treatment.
Any live MicroStream database basically consists of three major parts:
A place where the persisted data is located. Usually a file system directory.
The managing threads accessing (read and write) the persisted data.
The EmbeddedStorageManager instance to use and control the database in the application.
Apart from a lot of internal components (configuration, processing logic, housekeeping state, etc.), that's all there is. There is nothing special or "magic" about it, no static state, no global registration in the JVM process or something like that.
The consequence of this is: If two EmbeddedStorageManager instances are started, each one with a different location for its persisted data, then the application has two live databases! If three or ten or 100 are started, then that's the number of live databases the application has. There is no limit and no conflict between different databases inside the same application process. The only important thing is that no two running StorageManagers may access the same data location.
Refactoring V2
If one or more fields in a class have changed, the data structure of this class doesn't match anymore with the records in the database. This renders the application and the database incompatible.
It's like in an IDE. You change the structure of a class and the tooling takes care of the rest. Problem is, in a database, the "rest" can be, in some circumstances, several gigabytes or even more, that have to be refactored and written again. It's one way to do it, but there are better alternatives.
Ideally, the data is transformed only when it is accessed. The old (legacy) type data is mapped to the new type when it is being loaded, hence: Legacy Type Mapping.
Nothing needs to be rewritten. All records are, as they were saved, compatible with all other versions of their type. Simply by mapping while loading.
What has to be done to achieve this? In the most common cases, nothing!
The heuristic attempts to automatically detect which fields are new, have been removed, reordered or altered.
The fields in the Contact entity class have been renamed and reordered; one was removed and one is new.
What the heuristic is doing now is something like this:
String firstname is equal in both classes, so it has to be the same, pretty much as int age.
name and lastname are pretty similar, and the type is the same too. If there is nothing better for the two, they probably belong together. Same with the other two fields.
In the end, the ominous link and postalAddress remain.
The heuristic can not make sense of that, so it assumes that one thing falls away and the other one is added. In this particular example, that worked perfectly. Well done, heuristic.
But:
Just as people can make mistakes in estimating similarities ("I would have thought ..."), programs can make mistakes as soon as they venture onto thin ice logically. The absolute correctness you normally expect from (bug-free) software no longer applies here. Such a similarity matching will be correct in most cases, but sometimes it will get things wrong.
Example: perhaps only PostalAddress instances were referenced in the application under link, and the two fields would actually be the same, only now properly typed and named. How should the heuristic know that? Nobody could know that either without being privy to the details of the concrete application.
That's why Legacy Type Mapping has two mechanisms that prevent things from going wrong:
A callback interface is used to create the desired mapping result: PersistenceLegacyTypeMappingResultor
Optionally, an explicit mapping can be specified, which is then preferred to the heuristic approach.
If you do not want that, you can simply set another resultor, as in this example, where each suspected mapping is submitted once to the user for confirmation in the console. This is done with the InquiringLegacyTypeMappingResultor.
It could even be one where the user can "rewire" the mapping, write out the mapping, and then return an appropriate result.
All you need is two columns of strings: from old to new. By default MicroStream uses a CSV file, but you can also write something else. In the end, a lot of string pairs for "old -> new" mappings have to come into the program somewhere.
The concept is simple:
If there are two strings, this is interpreted as a mapping from an old thing to a new thing.
If the second value is missing, it is interpreted as an old thing to be deleted.
If the first value is missing, it is interpreted as a new thing.
Why call it "thing"? Because this applies to several structural elements:
Constant identifier
Class names
Field names
Example:
count; articleCount means: the field formerly named count is called articleCount in the current version of the class.
count; means: the old field count should be ignored during the mapping. More specifically, the values of this field in each record.
;articleCount means: this is a newly added field, DO NOT try to match it with anything else heuristically.
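The three cases above can be sketched in a few lines of plain Java. This is only an illustration of the "two columns of strings" idea, not MicroStream's own parser: the class RefactoringEntry, its Kind enum and the parse methods are hypothetical names, and the semicolon separator is taken from the example entries above.

```java
import java.util.ArrayList;
import java.util.List;

/** One parsed "old;new" entry of a refactoring mapping (hypothetical helper, not MicroStream API). */
final class RefactoringEntry {
    enum Kind { MAP, REMOVE, ADD }

    final Kind kind;
    final String oldName; // null for ADD
    final String newName; // null for REMOVE

    private RefactoringEntry(Kind kind, String oldName, String newName) {
        this.kind = kind;
        this.oldName = oldName;
        this.newName = newName;
    }

    /** Parses a single entry such as "count; articleCount", "count;" or ";articleCount". */
    static RefactoringEntry parse(String line) {
        // The limit -1 keeps a trailing empty column, so "count;" yields ["count", ""].
        String[] columns = line.split(";", -1);
        if (columns.length != 2) {
            throw new IllegalArgumentException("Expected exactly two columns: " + line);
        }
        String oldName = columns[0].trim();
        String newName = columns[1].trim();
        if (oldName.isEmpty() && newName.isEmpty()) {
            throw new IllegalArgumentException("Both columns are empty: " + line);
        }
        if (newName.isEmpty()) {
            return new RefactoringEntry(Kind.REMOVE, oldName, null); // old thing to be dropped
        }
        if (oldName.isEmpty()) {
            return new RefactoringEntry(Kind.ADD, null, newName);    // new thing, do not match heuristically
        }
        return new RefactoringEntry(Kind.MAP, oldName, newName);     // old thing renamed to new thing
    }

    /** Parses all non-blank lines of a mapping source. */
    static List<RefactoringEntry> parseAll(List<String> lines) {
        List<RefactoringEntry> entries = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                entries.add(parse(line));
            }
        }
        return entries;
    }
}
```

Where the string pairs come from (a CSV file or something else) does not matter for this classification; only the presence or absence of each column does.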
You can also mix explicit mapping and heuristics. Specify only as many changes explicitly as are needed for the analysis to get the rest right by itself. That means you never have to specify the annoying trivial cases explicitly. Only the tricky ones. Usually, nothing should be necessary at all, or maybe a few hints at most to avoid mishaps.
However, those who strictly prefer to make any change explicitly, instead of trusting a "guessing" software, can also do that. No problem.
For class names, the three variants map, add and remove are somewhat tricky in meaning: Map is just old -> new, same as with fields. Making an entry for a new class doesn't make sense. It's covered by the new class itself. You can do it, but it has no effect. Marking a removed class as deleted makes no sense either, except in one special case.
It is not required to specify the fields mapping of mapped classes if the mapping heuristic can do a correct field mapping. Especially if classes have been renamed only.
Classes are simply referred to by their fully qualified class name:
com.my.app.entities.Order
In some cases you need to specify the exact version of the class; then the TypeId has to be prepended:
1012345:com.my.app.entities.Order
Mapping from old to new:
com.my.app.entities.Order;com.my.app.entities.OrderImplementation
For fields it's a bit more complex.
To unambiguously refer to a field, the fully qualified name of its defining class has to be used.
com.my.app.entities.Order#count;com.my.app.entities.Order#articleCount
The # is based on official Java syntax, like e.g. in JavaDoc.
If inheritance is involved, which must be uniquely resolved (each class in the hierarchy can have a field named "count"), you must also specify the declaring class. Like this:
com.my.app.entities.Order#com.my.app.entities.ArticleHolder#count;com.my.app.entities.Order#com.my.app.entities.ArticleHolder#articleCount
A simple example:
So far so good, all classes and fields are getting mapped, automatically or manually. But what about the data? How are the values getting transformed from old to new? Technically speaking, it's done fully automatically. But there are some interesting questions:
What if the type of a field changes, let's say from int to float? Just copying the four bytes would yield wrong results. The value has to be converted, like float floatValue = (float)intValue;
Can it be done? Yes, fully automatically.
The class BinaryValueTranslators does the job for you; it has a converter function from each primitive type to every other one.
Currently MicroStream supports conversion between primitives and their wrapper types, and vice versa.
When converting a wrapper to a primitive, null is converted to 0.
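To see why a plain byte copy is not enough, here is a small self-contained sketch. It does not use BinaryValueTranslators; the class ValueConversionSketch and its methods are hypothetical and only mirror the two rules described above: numeric conversion instead of reinterpreting bytes, and null becoming 0 when a wrapper is turned into a primitive.

```java
/** Illustrates the conversion rules with plain Java (hypothetical helper, not MicroStream API). */
final class ValueConversionSketch {

    /** Reinterprets the four bytes of an int as a float. This is what a naive byte copy would do. */
    static float copyBytes(int value) {
        return Float.intBitsToFloat(value);
    }

    /** Converts the numeric value, which is what a value translator has to do. */
    static float convert(int value) {
        return (float) value;
    }

    /** Wrapper to primitive: a missing value (null) becomes 0. */
    static int unwrap(Integer value) {
        return value == null ? 0 : value.intValue();
    }
}
```

For an int value of 42, convert returns 42.0f, while copyBytes returns a tiny denormal number that has nothing to do with 42.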
If you need special conversions between object types, you can add a custom BinaryValueSetter for that, see customizing.
How fast is that?
The type analysis happens only once during initialization. If no exception occurs, the Legacy Type Mapping is ready-configured for each necessary type and will then only be called if required. For normal entity classes that are parsed by reflection, legacy type mapping loading is just as fast as a normal load. An array of such value translator functions is put together once and they are run through each time they are loaded. With legacy mapping, only the order and the target offsets are different, but the principle is the same as with normal loading.
For custom handlers an intermediate step is necessary: first put all the old values together in the order that the custom handler expects, and then read the binary data normally, as if loading a record in the current format. That's necessary because MicroStream can't know what such a custom handler does internally. If someone ever uses such a custom handler, the small detour is not likely to be noticeable in terms of performance. And if it does turn out to have a negative effect on production operation: no problem, because you can of course also write a custom legacy type handler. It would run at full speed even with tricky special cases.
Of course there is the possibility, as always, of intervening in the machinery massively with customizing.
If you need the highest possible performance for some cases, or for logging / debugging, or anyway: Register any value translator implementations. In the simplest case this is 1 line of code, so do not worry. Being able to specify refactoring mapping in a different way than a CSV file is another example. You can even customize (extend or replace) the strategy that is looked up in refactoring mapping.
Furthermore, you can also replace the heuristic logic with your own. This is easier than it sounds: it is just a simple little interface (PersistenceMemberSimilator), and its default implementation just calls, for example, a Levenshtein algorithm for names. You can certainly do that ten times more cleverly, or in a way that is more appropriate for a particular application or programming style, e.g. by utilizing annotations.
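To give an impression of what a name-based similarity looks like, here is a minimal Levenshtein sketch in plain Java. It does not implement PersistenceMemberSimilator, and the normalization (one minus distance divided by the longer name's length, case-insensitive) is an assumption made for this example, so the resulting values will not necessarily match the percentages MicroStream reports.

```java
import java.util.Locale;

/** Name similarity based on the Levenshtein edit distance (illustrative, not MicroStream's implementation). */
final class NameSimilarity {

    /** Classic dynamic-programming edit distance using two rolling rows. */
    static int levenshtein(String a, String b) {
        int[] previous = new int[b.length() + 1];
        int[] current = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            previous[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            current[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int substitution = previous[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                int insertion = current[j - 1] + 1;
                int deletion = previous[j] + 1;
                current[j] = Math.min(substitution, Math.min(insertion, deletion));
            }
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[b.length()];
    }

    /** Returns a value from 0.0 (nothing in common) to 1.0 (identical, ignoring case). */
    static double similarity(String a, String b) {
        String x = a.toLowerCase(Locale.ROOT);
        String y = b.toLowerCase(Locale.ROOT);
        int longest = Math.max(x.length(), y.length());
        return longest == 0 ? 1.0 : 1.0 - (double) levenshtein(x, y) / longest;
    }
}
```

With this normalization, name and lastname score higher than link and postalAddress, which matches the intuition of the Contact example: the former pair probably belongs together, the latter does not.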
The basic statement is: If there is a problem somewhere, whether with the heuristic or a special case request or performance problem loading a gazillion entities all at once, or if there is a need for debugging in depth or something like that: do not panic. Most likely, this is easily possible with a few lines of code.
Customizing examples:
User Interaction
More information about customizing in general:
Customizing
You cannot just mark classes as deleted. As long as there are records of a certain type in the database, the corresponding class must also exist so that the instances from the database can be loaded into the application. If there are no more records, then that means that there are only a few bytes of orphaned description in the type dictionary, but nobody cares. It is possible to delete it by hand (or rather not, there are good reasons against it) or you can just ignore it and leave it there forever. In both cases, you must not mark a class as deleted.
Now the special case:
In the entity graph (root instances and all instances recursively reachable from there), all references to instances of a certain type are removed. This is done by the application logic or possibly by a specially written script. As a result, all instances of this type are unreachable. No instance is available, no instance can ever be reloaded. This means that the type is "deleted" from the database at the logical level. This does not have to be registered anywhere; it is implicitly the case. You can actually delete the corresponding Java class from the application project because it will never be needed again during the loading process at runtime.
So far so good.
There is only one problem: even if the instances are never logically accessible again, the data records are still around in the database files. The initialization scans over all database files, registers all entities, collects all occurring TypeIds and ensures for every TypeId that there is a TypeHandler for it. If necessary, a LegacyTypeHandler with mapping, but still: there must be a TypeHandler for each TypeId. And a TypeHandler needs a runtime type. That is, ass-backwards, records that are logically already deleted, but physically still lying around, enforce once again that the erasable entity class must be present. Bummer. One can prevent this: there is a "cleanup" function in the database management logic, which cleans up all logical gaps in the database files (it actually copies all non-gaps into a new file and then deletes the old file altogether). You would have to call it, then all physical occurrences of the unreachable records disappear and you could easily delete the associated class. But that is annoying.
That is why it makes sense for these cases - and only for them - to do the following:
If you as a developer are absolutely sure that no single instance of a given class is ever reachable again, i.e. ever has to be loaded, then you can mark a type as "deleted" (or rather "unreachable") in the refactoring mapping. Then the type handling will create a dummy TypeHandler that does not need a runtime class. See PersistenceUnreachableTypeHandler. But be careful: if you are mistaken and an instance of such a type is still referenced somewhere and eventually loaded later at runtime, then the unreachable handler will throw an exception. That happens at some point during the runtime of the application, not already during initialization. Only the cleanup provides real certainty: remove all logical gaps, and if the initialization then still succeeds with the class deleted, the class is really superfluous.
Any ideas, such as simply returning null in the dummy type handler instead of an instance, are a fire hazard: it may resolve some annoying situations pleasantly, but it would also mean that existing data records, potentially entire subgraphs, become hidden from the application. Nevertheless, the database would continue to drag them along, perhaps becoming inexplicably large, and any search for the reason would yield nothing, because the dummy type handler keeps the data records secret. Great in the short term, but catastrophic in the long run. That's not good. The only clean solution is: you have to know what you are doing with your data model. As long as there are still reachable instances, they must also be loadable. The annoying special case above can be defused without side effects. But it cannot be more than that, otherwise the result is chaos, problems and lost confidence in the correctness of the database solution.
Here is an overview of how to enable and configure different levels of user interaction for the Legacy Type Mapping.
Somewhere you have a foundation instance, a foundation in which everything is configured and from which the StorageManager is created.
It itself contains a foundation for connections. Accessing this inner one needs a little detour.
Incidentally, that's not a JDBC connection; it is just the thing that creates helper instances like Storer and Loader. Because Legacy Type Mapping affects loading, it has to go in there.
Either you access it directly, like this:
Or like this, that's better for method chaining.
If you have that, the configuration for the Legacy Type Mapping callback logic is just a one liner:
That's just the necessary logic, without anything further. If you do not change anything, this is done by default.
That wraps a printer around the necessary logic. All these storage and persistence classes are nothing sacred or super duper intertwined or anything. These are just interfaces and if you plug in another implementation then it will be used.
A resultor that asks the user before a mapping is applied. More customization is possible, see below.
With the implementation of just one single interface method, you can build anything else you can imagine. For example, logging to a file instead of the console. Or in the personally preferred logging framework. Or write confirmed mappings into the refactorings file. Everything is possible.
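The "wrap something around the necessary logic" idea is an ordinary decorator. The following sketch shows the pattern in plain Java only; MappingResultor is a hypothetical single-method interface invented for this example and is not the signature of PersistenceLegacyTypeMappingResultor, and the String arguments merely stand in for whatever the real callback receives.

```java
import java.util.function.Consumer;

/** Hypothetical stand-in for a single-method mapping callback (not the MicroStream interface). */
interface MappingResultor {
    String createResult(String oldType, String newType);
}

/** Decorator that reports every mapping to a log sink before delegating to the wrapped resultor. */
final class LoggingResultor implements MappingResultor {
    private final MappingResultor delegate;
    private final Consumer<String> log;

    LoggingResultor(MappingResultor delegate, Consumer<String> log) {
        this.delegate = delegate;
        this.log = log;
    }

    @Override
    public String createResult(String oldType, String newType) {
        // The sink decides where the message goes: console, file, logging framework, ...
        log.accept("Legacy type mapping: " + oldType + " -> " + newType);
        return delegate.createResult(oldType, newType);
    }
}
```

Passing System.out::println as the sink gives console output; a sink that appends to a file or calls a logging framework needs no change to the wrapper itself.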
For the inquiring implementation (InquiringLegacyTypeMappingResultor) there are a few settings: When should it ask? Always, or only if something is unclear. Never asking makes no sense, of course; in that case you shouldn't use it, or should use the printing resultor instead.
When is a mapping unclear? If at least one field mapping is not completely clear. A field mapping is clear if:
Two fields are exactly the same (similarity 1.0 or 100%)
Or two fields are specified by the explicit mapping.
So if all fields are clear according to the above rule, then there is no need to ask.
And there is another special case: If a field is discarded that is not explicitly marked as discardable, then as a precaution an inquiry is always done. No data is lost, but the data would not be available in the application, so it is better to ask.
There are options to control this a bit more finely. You can optionally specify a double as a threshold (from 0.0 to 1.0, otherwise an exception is thrown): The value determines how similar two automatically matched fields have to be so that no inquiry is made. Example: The value is 0.9 (90%), but a match found is only 0.8 (80%) similar. According to the setting this is too little, so there must be an inquiry as a precaution. If you specify 1.0, that means: always ask, unless everything is really perfectly clear. If you enter 0.0, this means: never ask, or only for implicitly dropped fields.
Looks like this:
Here is a small example with a Person class.
It should be changed to:
Without explicitly predefined mappings, the inquiry would look like this:
customerid and pin are too different to be automatically assigned to each other. Therefore, it is wrongly assumed that customerid is omitted and pin is new. comment and commerceId are surprisingly similar (75%) and are therefore assigned.
But that's not what we want.
Incidentally, it would not matter here what is defined as a threshold: customerid would be eliminated by the implicit decision. This is too delicate not to inquire, so it is always necessary to ask.
To get the mapping right, you have to specify two entries:
customerid is now called pin
comment should be omitted
Then the inquiry looks like this:
Due to the explicit mapping from customerid to pin, the similarity does not matter; it is the mapping that matters. To indicate this, it says "[mapped]" instead of the similarity. The rest is as usual. Only comment is now "[discarded]", according to the mapping. The difference from the above is that this is an explicitly predetermined absence, which does not force an inquiry.
This clears the way for the threshold:
If you enter 0.7 or more, then you will be asked. Everything else would be clear, but the mapping of surname to lastName is below the required "minimum similarity", so it is better to ask.
If you enter 0.6 or less, you will no longer be asked. Because all assignments are either explicitly specified or they are according to "minimum similarity" similar enough to rely on it.
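The decision rules described in this chapter can be condensed into a small sketch. It is not the code of InquiringLegacyTypeMappingResultor; the classes InquiryDecision and FieldMatch are hypothetical, and the similarity of 0.65 used for surname and lastName in the usage note is an assumed value, chosen only because the text implies it lies somewhere between 0.6 and 0.7.

```java
import java.util.List;

/** Decides whether the user has to be asked about a type mapping (illustrative, not MicroStream code). */
final class InquiryDecision {

    /** The outcome of matching one field. */
    static final class FieldMatch {
        final double similarity;
        final boolean explicitlyMapped;    // given in the refactoring mapping, including explicit discards
        final boolean implicitlyDiscarded; // dropped by the heuristic without an explicit entry

        private FieldMatch(double similarity, boolean explicitlyMapped, boolean implicitlyDiscarded) {
            this.similarity = similarity;
            this.explicitlyMapped = explicitlyMapped;
            this.implicitlyDiscarded = implicitlyDiscarded;
        }

        static FieldMatch similar(double similarity) { return new FieldMatch(similarity, false, false); }
        static FieldMatch explicit()                 { return new FieldMatch(1.0, true, false); }
        static FieldMatch implicitDiscard()          { return new FieldMatch(0.0, false, true); }
    }

    /** Returns true if at least one field match is not clear enough for the given minimum similarity. */
    static boolean mustInquire(List<FieldMatch> matches, double minimumSimilarity) {
        if (!(minimumSimilarity >= 0.0 && minimumSimilarity <= 1.0)) {
            throw new IllegalArgumentException("Threshold must be within 0.0 and 1.0: " + minimumSimilarity);
        }
        for (FieldMatch match : matches) {
            if (match.implicitlyDiscarded) {
                return true; // data would silently become unavailable, so always ask
            }
            if (!match.explicitlyMapped && match.similarity < minimumSimilarity) {
                return true; // heuristic match is not similar enough
            }
        }
        return false;
    }
}
```

For the Person example with both explicit entries in place, the matches would be two explicit ones (customerid to pin, comment discarded), one heuristic match with the assumed similarity of 0.65, and identical fields at 1.0. A threshold of 0.7 then leads to an inquiry, a threshold of 0.6 does not.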
A recommendation for a good value for the "minimum similarity" is difficult. As soon as one softens rules, there is always the danger of a mistake. See the comment example above: it is 75% similar to commerceId. Still wrong. Then prefer 80%? Or 90%? Of course that is better, but the danger is still there.
If you want to be sure, just use 1.0 or omit the parameter; then 1.0 is taken by default.
The most important thing is the explicit mapping anyway: if "enough" is given by the user, there is no need to ask.
Continuous backup mode allows the MicroStream engine to clone all storage actions to a second storage. This storage should be placed on another file system.
To configure and enable the continuous backup, see Backup.
MicroStream provides the possibility to import and export data, see Import / Export.
MicroStream provides an API to import and export persisted data of the storage. It is pretty much the same as writing and reading a backup.
The records in the storage are distributed across lots of files and folders, depending on channel count and other settings. To bring order to the chaos, the export produces one file per type. These files are used again by the import to read the data into the storage.
The created binary type data files contain only records of the according type, nevertheless they have the same format as the channel storage files.
It is also possible to convert the exported binary files to a human readable format, namely CSV.
Why CSV?
Contrary to XML or JSON, CSV is perfectly suited to represent records with the least possible overhead. There are a lot of tools, like spreadsheet editors, which can read and modify CSV files. The file's size is at the possible minimum and the performance of the converter is significantly better than with the other formats.
Housekeeping is an internal background logic to optimize the database's usage of memory and persistent storage space (typically disc space). It is comprised of mechanisms for cleaning up storage files, clearing unneeded cached data and recognizing deleted entities via garbage collection. Housekeeping is performed with a configurable time budget in configurable intervals to make sure it never interferes with the application's work load too much (see Housekeeping Configuration).
If new versions of an entity are stored or if entities become no longer reachable (meaning they become effectively deleted or "garbage" data), their older data is no longer needed. However, the byte sequences representing that old data still exist in the storage files. But since they will never be needed again, they become logical "gaps" in the storage files. Space that is occupied, but will never be read again. It might as well be all zeroes or not exist at all. Sadly, unwanted areas cannot simply be "cut" from files. Above all because that would ruin all file offsets coming after them. So with every newly stored version of an entity and every entity that is recognized as unreachable "garbage", a storage file consists more and more of useless "gaps" and less and less of actually used data. This makes the storage space less and less efficient. To prevent eventually ending up with a drive that is filled with useless bytes despite an actually not that big database, the files need to be "cleaned up" from time to time. To do this, the Housekeeping occasionally scans the storage files. If their "payload" ratio goes below the configured limit, the affected files will be retired: all data that belongs to still live entities is copied to a new file. Then the old file consists of 100% unneeded gap data and can safely be deleted.
Which ratio value to set in the configuration is a matter of taste or, more precisely, depends on each individual application's demands. A value of 1.0 (100%) means: only files with 100% payload, so no gaps at all, are acceptable. This means that for every store that contains at least one new version of an already existing entity, the corresponding storage file will contain the slightest gap, thus dropping below the demanded ratio of 100% and as a consequence, will be retired on the next occasion. This very aggressive cleanup strategy will keep the disc space usage at a perfect minimum, but at the cost of enormous amounts of copied data, since virtually every store will cause one or more storage files to be retired and their content be shifted into a new file. Conversely, a value of 0.0 (0%) means something like: "Never care about gaps, just fill up the disc until it bursts." This keeps the disc write loads for the file cleanup at 0, but at the cost of rapidly eating up disc space.
The best strategy most probably lies somewhere in between. Somewhere between 0.1 and 0.9 (10% and 90%). The default value is 0.75 (75%). So a storage file containing up to 25% of unused gap data is okay. Containing more than 25% gaps will cause a storage file to be retired.
In addition to the payload ratio check, the file cleanup also retires files that are too small or too big. For example: The application logic might commit a single store that is 100 MB in size. But the storage files are configured to be no larger than 10 MB (for example to keep a single file cleanup nice and fast). A single store is always written as a whole in the currently last storage file. The reason for this is to process the store as fast as possible and quickly return control to the application logic. When the housekeeping file cleanup scan encounters such an oversized file, it will retire it immediately by splitting it up across 10 smaller files and then deleting the oversized file. A similar logic applies to files that are too small. Upper and lower size bounds can be freely configured to arbitrary values. The defaults are 1 MB and 8 MB.
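The retirement rules can be summarized in a few lines. This is a sketch of the decision as described above, not MicroStream's housekeeping code: the class FileCleanupCheck and its method are hypothetical, 1 MB is assumed to mean 1024 * 1024 bytes, and special handling that the real implementation may apply (for example for the file currently being written) is ignored.

```java
/** Decides whether a storage file should be retired by the file cleanup (illustrative only). */
final class FileCleanupCheck {
    static final long DEFAULT_MIN_FILE_SIZE = 1L * 1024 * 1024; // 1 MB, as stated in the text
    static final long DEFAULT_MAX_FILE_SIZE = 8L * 1024 * 1024; // 8 MB, as stated in the text
    static final double DEFAULT_MIN_PAYLOAD_RATIO = 0.75;       // 75%, as stated in the text

    /**
     * @param liveBytes  bytes that still belong to live entities
     * @param totalBytes total size of the file, including gaps
     */
    static boolean shouldRetire(long liveBytes, long totalBytes,
                                double minPayloadRatio, long minFileSize, long maxFileSize) {
        if (totalBytes < minFileSize || totalBytes > maxFileSize) {
            return true; // too small or too big, regardless of the payload ratio
        }
        double payloadRatio = totalBytes == 0 ? 1.0 : (double) liveBytes / totalBytes;
        return payloadRatio < minPayloadRatio; // too many gaps
    }

    /** Same check with the default values named in the text. */
    static boolean shouldRetire(long liveBytes, long totalBytes) {
        return shouldRetire(liveBytes, totalBytes,
            DEFAULT_MIN_PAYLOAD_RATIO, DEFAULT_MIN_FILE_SIZE, DEFAULT_MAX_FILE_SIZE);
    }
}
```

With the defaults, a 4 MB file with exactly 25% gaps is kept, the same file with slightly more gaps is retired, and a 100 MB file is retired no matter how much of it is payload.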
To avoid repeated reads to storage files (which are incredibly expensive compared to just reading memory), data of once loaded entities is cached in memory. If an entity's cached data is not requested again for a certain amount of time in relation to how much data is already cached, it is cleared from the cache to avoid unnecessarily consuming memory. The mechanism to constantly evaluate and clear cached data where applicable, is part of the housekeeping. The aggressiveness of this mechanism can be configured via the Housekeeping Configuration.
In a reference-based (or graph-like) data paradigm, instances never have to be deleted explicitly. For example, there is no "delete" in the Java language. There are only references. If those references are utilized correctly, deleting can be done fully automatically without any need for the developer to care about it. This is called "garbage collection". The concept is basically very simple: when the last reference to an instance is cut, that instance can never be accessed again. It becomes "garbage" that occupies memory with its data that is not needed any longer. To identify those garbage instances, all an algorithm (the "garbage collector") has to do is to follow every reference, starting at some defined root instance (or several) of a graph and mark every instance it encounters as "reachable". When it has no more unvisited instances in its queue, the marking is completed. Every instance that is not marked as reachable by then must be unreachable garbage and will be deleted from memory.
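The marking described above fits into a short, self-contained sketch. It only illustrates the general algorithm on a graph of string identifiers; the class ReachabilityMarker is hypothetical and has nothing to do with the JVM's or MicroStream's actual collectors.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Mark phase of a garbage collection on a simple reference graph (illustrative only). */
final class ReachabilityMarker {

    /** Follows every reference starting at the roots and returns all instances marked as reachable. */
    static Set<String> markReachable(Map<String, List<String>> references, Collection<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>(roots);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            if (!marked.add(current)) {
                continue; // already visited, this also terminates cycles
            }
            for (String target : references.getOrDefault(current, Collections.emptyList())) {
                if (!marked.contains(target)) {
                    queue.add(target);
                }
            }
        }
        return marked;
    }

    /** Everything that occurs in the graph but was not marked is unreachable garbage. */
    static Set<String> findGarbage(Map<String, List<String>> references, Collection<String> roots) {
        Set<String> all = new HashSet<>(references.keySet());
        for (List<String> targets : references.values()) {
            all.addAll(targets);
        }
        all.removeAll(markReachable(references, roots));
        return all;
    }
}
```

Note that an instance which still holds references to reachable instances is garbage nonetheless, as long as nothing reachable refers to it.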
Similar to the JVM's garbage collection to optimize its memory consumption, MicroStream has a garbage collection of its own, but for the level of persistent storage space instead of memory space.
However, MicroStream's multi-threaded garbage collector is currently still in development and not activated yet.
Housekeeping can also be triggered manually from the StorageConnection. Related methods are:
issueCacheCheck(nanoTimeBudgetBound)
issueCacheCheck(nanoTimeBudgetBound, entityEvaluator)
issueFileCheck(nanoTimeBudgetBound)
issueFileCheck(nanoTimeBudgetBound, fileDissolvingEvaluator)
issueFullCacheCheck()
issueFullCacheCheck(entityEvaluator)
issueFullFileCheck()
issueFullFileCheck(fileDissolvingEvaluator)
issueFullGarbageCollection()
issueGarbageCollection(nanoTimeBudget)
All Housekeeping methods can be given a defined time budget or can be run until full completion.
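What "a defined time budget" means can be shown with a generic loop. This sketch does not call the StorageConnection methods listed above; BudgetedHousekeeping and its methods are hypothetical, and the clock is passed in as a parameter purely so that the behavior can be checked deterministically.

```java
import java.util.Deque;
import java.util.function.LongSupplier;

/** Runs queued work items until they are done or a nanosecond time budget is used up (illustrative only). */
final class BudgetedHousekeeping {

    /**
     * @return true if all tasks were completed, false if the budget ran out first
     */
    static boolean runWithinBudget(Deque<Runnable> tasks, long nanoTimeBudget, LongSupplier nanoClock) {
        long start = nanoClock.getAsLong();
        while (!tasks.isEmpty()) {
            if (nanoClock.getAsLong() - start >= nanoTimeBudget) {
                return false; // budget exhausted, remaining tasks wait for the next run
            }
            tasks.poll().run();
        }
        return true;
    }

    /** The "full" variant: no effective time limit, runs until everything is done. */
    static boolean runToCompletion(Deque<Runnable> tasks) {
        return runWithinBudget(tasks, Long.MAX_VALUE, System::nanoTime);
    }
}
```

A budgeted run may stop with work left over, which is picked up by a later run. The full variant trades that predictability for a guaranteed complete result.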
Frequently Asked Questions - Answered
No. MicroStream allows you to store any Java object. Instances of any and all types can be handled, there are no special restrictions like having to implement an interface, using annotations or having a default constructor (see POJO). Only types bound to JVM-internals like Thread, IO-streams and the like are deliberately excluded from being persistable since they could not be properly recreated upon loading, but such instances should not be part of entity data models, anyway.
During initialization, MicroStream automatically checks if your runtime entity classes are still matching the persistent data. Mismatches are automatically mapped when loading data based on predefined rules that you can extend and overwrite on a per-case basis if needed.
Legacy Type Mapping
MicroStream connects your application's entity graph residing in memory to a physical form of data (i.e. persistent data) to/from which entity data is stored/loaded as required.
MicroStream uses the common concept of "Lazy Loading", allowing you to define which parts of your data (entity sub-graphs) are loaded only when required instead of eagerly at startup. A few well-placed lazy references in your entity model make your application load only a tiny bit of "head" entities at startup time and load everything else later on demand. This allows the handling of arbitrarily big databases with relatively small memory requirements.
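The principle behind a lazy reference can be shown in a few lines of plain Java. MicroStream ships its own lazy reference type; the class LazyRef below is only a hypothetical stand-in that loads its target on first access via a Supplier, to illustrate why the rest of the graph can stay out of memory until it is needed.

```java
import java.util.function.Supplier;

/** A reference whose target is loaded on first access (illustrative stand-in, not MicroStream's type). */
final class LazyRef<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean loaded;

    LazyRef(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Returns the target, loading it first if that has not happened yet. */
    synchronized T get() {
        if (!loaded) {
            value = loader.get(); // e.g. read the sub-graph from storage
            loaded = true;
        }
        return value;
    }

    synchronized boolean isLoaded() {
        return loaded;
    }

    /** Drops the in-memory target so it can be reclaimed; the next get() loads it again. */
    synchronized void clear() {
        value = null;
        loaded = false;
    }
}
```

An entity holding such a reference to, say, a large list of orders occupies almost no memory until get() is called for the first time.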
MicroStream stores persistent data in a physical form, typically in native file-system files.
Yes, as many as you like. Each MicroStream instance represents one coherent entity graph of persistent data.
Yes. This is already done automatically. The minimum and maximum size of every partial file can be configured, although this is a very technical detail that should not be relevant in most cases.
At any given time, only one JVM process may directly access the files representing a unique set of data. Such a restriction is crucial for the correct execution of any application: changes to an application's persistent data have to be guarded by the rules of the application's business logic, i.e. the process that currently runs the application. Allowing another process to bypass these rules would eventually result in catastrophic consistency errors. The requirement to distribute an application over multiple processes must be solved by a clustering approach (e.g. by distributing logic AND persistent data over multiple processes or by having one process to serve as the data master for multiple worker processes).
Yes, since version 2.2 we provide an all-in-one multi-release jar which is compatible with Jigsaw projects. This single jar solution was necessary because different MicroStream modules use the same packages, but the module system doesn't allow multiple modules to declare the same exports.
Yes. In fact, every storing of data is executed as a transaction, an atomic all-or-nothing action. When one or more entities are stored, their data is collected into a continuous block of bytes and that block is written to the physical form (the "files") in one fell swoop. Any problem during the IO-operation causes the whole block to be deleted (rolled back).
Yes. The storing and loading process can be parallelized by using multiple threads and thus be strongly accelerated. There is no limitation on how many threads can be used, apart from the mathematical constraint that the thread count must be a power of 2 (1, 2, 4, 8, 16, etc.).
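The power-of-2 constraint is easy to check up front. The helper below is a hypothetical sketch, not part of MicroStream; it uses the usual bit trick that a positive power of 2 has exactly one bit set.

```java
/** Validates a thread (channel) count against the power-of-2 constraint (hypothetical helper). */
final class ChannelCountCheck {

    /** True for 1, 2, 4, 8, 16, ...; false for zero, negative values and everything else. */
    static boolean isPowerOfTwo(int count) {
        // A power of 2 has a single set bit, so clearing the lowest set bit leaves 0.
        return count > 0 && (count & (count - 1)) == 0;
    }

    /** Returns the count unchanged if it is valid, otherwise throws. */
    static int validate(int count) {
        if (!isPowerOfTwo(count)) {
            throw new IllegalArgumentException("Thread count must be a power of 2: " + count);
        }
        return count;
    }
}
```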
Yes. There are currently two options available to create backups: An on-the-fly backup that copies and validates stored entity data after it has been written and the possibility to export all database files to a target location (which is in fact just a low-level file copying, but executed in a concurrency-safe way).
Yes. MicroStream provides a per-type export of binary data and a utility to convert its binary data into the CSV format. The other way around (convert CSV to binary and import binary files) is also possible.
No, because it doesn't need to. Such concerns are long covered by the application itself, with the DBMS usually being degraded to only being the application's exclusive database. Thus, all that is needed for a modern business application is just an application-exclusive data storage solution, which is exactly what MicroStream is.
Yes, if the data is structured in a format conforming to the entity classes and with references being represented in globally unique and bijective numbers. How hard that is for a given database depends on its specifics, but it can be as easy as executing one generically created SELECT per table.
Custom type handlers allow taking control over the storing and loading procedure of specific Java types. This is useful to optimize the performance of storing complex objects, or in the rare case that a type cannot be stored with the default type handlers.
Suitable base classes for implementing a custom type handler for the MicroStream standard binary storage implementation are:
one.microstream.persistence.binary.internal.AbstractBinaryHandlerCustomValue for simpler custom type handling, in case only values have to be stored
or
one.microstream.persistence.binary.internal.AbstractBinaryHandlerCustom if the object owns references that have to be persisted too.
This example implements a custom type handler for the java.awt.image.BufferedImage class. Instead of storing the rather complex object structure of that class, the image is serialized in the PNG image format into a byte array using javax.imageio.ImageIO. This byte array is then stored by MicroStream.
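The serialization step described above can be sketched with the JDK alone. The following is only an illustration of the PNG round trip, not the handler itself: PngImageCodec is a hypothetical helper name, and the wiring into a MicroStream type handler base class is deliberately left out.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

// Hypothetical helper (not part of MicroStream): converts between a
// BufferedImage and the PNG byte array a custom handler could store.
final class PngImageCodec {

    private PngImageCodec() {
        // static helper only
    }

    // Encodes the image as PNG and returns the raw bytes.
    static byte[] toPngBytes(final BufferedImage image) {
        try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            // ImageIO.write returns false if no suitable writer was found
            if (!ImageIO.write(image, "png", out)) {
                throw new IllegalStateException("No PNG writer available");
            }
            return out.toByteArray();
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Decodes PNG bytes back into a BufferedImage.
    static BufferedImage fromPngBytes(final byte[] bytes) {
        try (ByteArrayInputStream in = new ByteArrayInputStream(bytes)) {
            final BufferedImage image = ImageIO.read(in);
            // ImageIO.read returns null if no reader understands the data
            if (image == null) {
                throw new IllegalStateException("Data is not a readable image");
            }
            return image;
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A handler built on this idea would call toPngBytes when storing and fromPngBytes when loading. Note that the decoded image may use a different internal BufferedImage type than the original, even though the pixel data is preserved.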
The custom type handler must be registered in the CustomTypeHandlerRegistry to enable it:
In addition to the methods for legacy type mapping described in chapter Legacy Type Mapping there is also the possibility to implement custom legacy type handlers. Those handlers are the most flexible way to do the mapping from old to new types.
The basic interface that has to be implemented is one.microstream.persistence.types.PersistenceLegacyTypeHandler.
Fortunately, the standard persistence implementation provides the abstract class one.microstream.persistence.binary.types.BinaryLegacyTypeHandler.AbstractCustom, which should be sufficient to start with a custom implementation in most cases.
See the example customLegacyTypeHandler on GitHub
Please note that this example requires manual code modifications, as described in its main class.
Implementing the PersistenceEagerStoringFieldEvaluator interface allows you to control the eager/lazy storing behavior of any known member. The default implementation of the MicroStream engine treats all fields as lazy storing. See Lazy and Eager Storing for details on lazy and eager storing.
The PersistenceEagerStoringFieldEvaluator has only one method to be implemented: public boolean isEagerStoring(Class<?> t, Field u). Return true if the field has to be stored eagerly, otherwise return false.
To register the customized PersistenceEagerStoringFieldEvaluator, add it using the one.microstream.persistence.types.PersistenceFoundation.setReferenceFieldEagerEvaluator(PersistenceEagerStoringFieldEvaluator) method during the storage initialization.
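As a rough illustration of such an evaluator, the sketch below decides by annotation. To stay runnable without MicroStream on the classpath, it uses a stand-in interface (EagerStoringFieldEvaluatorSketch) that only mirrors the single method named above. The StoreEagerly annotation, the CustomerSketch class and the EagerDemo helper are likewise hypothetical names introduced for this example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Stand-in mirroring the one method of PersistenceEagerStoringFieldEvaluator;
// in a real project, implement MicroStream's interface instead.
interface EagerStoringFieldEvaluatorSketch {
    boolean isEagerStoring(Class<?> t, Field u);
}

// Hypothetical marker annotation for fields that should be stored eagerly.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface StoreEagerly {
}

// Treats a field as eager only if it carries the marker annotation.
final class AnnotationBasedEagerEvaluator implements EagerStoringFieldEvaluatorSketch {
    @Override
    public boolean isEagerStoring(final Class<?> t, final Field u) {
        return u.isAnnotationPresent(StoreEagerly.class);
    }
}

// Hypothetical entity: "orders" is marked eager, "name" keeps the lazy default.
final class CustomerSketch {
    String name;

    @StoreEagerly
    List<String> orders = new ArrayList<>();
}

// Small helper that looks up a declared field by name and asks the evaluator.
final class EagerDemo {
    static boolean check(final EagerStoringFieldEvaluatorSketch evaluator,
                         final Class<?> type, final String fieldName) {
        try {
            return evaluator.isEagerStoring(type, type.getDeclaredField(fieldName));
        } catch (final NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

Deciding by annotation keeps the rule next to the field it affects. A type-based rule (for example, all collection fields) would work with the same method signature.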
Layered entities are a concept that separates the basic aspects of what defines an entity into separate instances of different layers:
Identity, a never to be replaced instance representing an entity in terms of references to it
Logic, nestable in an arbitrary number of dynamically created logic layers, e.g. logging, locking, versioning, etc.
Data, always immutable
Entity graphs are constructed by strictly referencing only identity instances (the "outer shell" of an entity), while every inner layer instance is unshared. This also allows the actual data instance to be immutable, while at the same time leaving the referential integrity of an entity graph intact.
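The split between identity and data can be pictured with a minimal, hand-written sketch. All names here (PersonDataSketch, PersonIdentitySketch, replaceData) are assumptions made for illustration and are not the types MicroStream generates. Logic layers are omitted.

```java
import java.util.Objects;

// Data layer: immutable, replaced as a whole on every change.
final class PersonDataSketch {
    private final String firstName;
    private final String lastName;

    PersonDataSketch(final String firstName, final String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    String firstName() {
        return this.firstName;
    }

    String lastName() {
        return this.lastName;
    }
}

// Identity layer: the only instance other objects are allowed to reference.
final class PersonIdentitySketch {
    // The inner data instance is unshared and never handed out for referencing.
    private PersonDataSketch data;

    PersonIdentitySketch(final PersonDataSketch data) {
        this.data = Objects.requireNonNull(data);
    }

    String firstName() {
        return this.data.firstName();
    }

    String lastName() {
        return this.data.lastName();
    }

    // Swaps the data layer; every reference to this identity stays valid.
    void replaceData(final PersonDataSketch newData) {
        this.data = Objects.requireNonNull(newData);
    }
}
```

Because the graph only ever points at the identity instance, swapping the data instance changes what every holder of that reference sees, without touching any of those references.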
MicroStream provides ready-to-use logic layers for:
Logging
Versioning
While the layers admittedly introduce considerable technical complexity and runtime overhead, this concept is a production ready solution for nearly all requirements regarding cross cutting concerns and aspects.
To use this concept in your code, there need to be at least implementations for the entity's identity and data.
Let's say the entity looks like this:
There needs to be an identity class:
And a data class:
A lot of code to write to get an entity with two properties!
But don't worry, there is a code generator for that. An annotation processor to be precise. The only code you have to provide are the entity interfaces, all the other stuff will be generated.
Just add the annotation processor type one.microstream.entity.codegen.EntityProcessor to your compiler configuration. That's it.
The generator also builds a creator:
An Updater:
An optional equalator, with equals and hashCode methods:
And an optional Appendable:
The layered entities code generator is an annotation processor, provided by the base module.
The maven configuration looks like this:
If you don't want the HashEqualator to be generated, just set the microstream.entity.hashequalator argument to false. Otherwise you can leave it out; the default value is true.
The same applies to the Appendable.
The entity types are just simple interfaces with value methods, which have to meet the following requirements:
A return type, no void
No parameters
No type parameters
No declared checked exceptions
You are not limited otherwise. Use any types you want. Inheritance and generics are supported as well.
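To make the four requirements concrete, the following sketch checks them with plain reflection. It is only one reading of the rules listed above, not the logic of the MicroStream annotation processor. EntityInterfaceRules and the three sample interfaces are hypothetical names.

```java
import java.lang.reflect.Method;

// Hypothetical checker that applies the four listed rules via reflection.
final class EntityInterfaceRules {

    private EntityInterfaceRules() {
        // static helper only
    }

    // A value method returns something, takes nothing, declares no type
    // parameters of its own and throws no checked exceptions.
    static boolean isValueMethod(final Method method) {
        if (method.getReturnType() == void.class) {
            return false;
        }
        if (method.getParameterCount() != 0) {
            return false;
        }
        if (method.getTypeParameters().length != 0) {
            return false;
        }
        for (final Class<?> thrown : method.getExceptionTypes()) {
            final boolean unchecked = RuntimeException.class.isAssignableFrom(thrown)
                || Error.class.isAssignableFrom(thrown);
            if (!unchecked) {
                return false;
            }
        }
        return true;
    }

    // An entity type is an interface whose public methods are all value methods.
    static boolean conforms(final Class<?> type) {
        if (!type.isInterface()) {
            return false;
        }
        for (final Method method : type.getMethods()) {
            if (!isValueMethod(method)) {
                return false;
            }
        }
        return true;
    }
}

// Conforming feature interface.
interface NamedSketch {
    String name();
}

// Conforming entity; inheritance is allowed.
interface PetSketch extends NamedSketch {
    int age();
}

// Violates two rules: void return type and a parameter.
interface NotAnEntitySketch {
    void rename(String newName);
}
```

With these definitions, EntityInterfaceRules.conforms accepts PetSketch (including its inherited name() method) and rejects NotAnEntitySketch.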
There is one base type (Beeing), one feature interface (Named) and three entities (Animal, Pet, Human).
The code generator takes care of the three entities, and its output looks like this:
Given is the following entity:
So how is it done? Since the code generator provides a creator, we can use it to create a new Person.
Let's see what the debugger displays if we run this code:
There's always an entity chain, with
The identity (PersonEntity) as the outer layer
Then the logic layers, none in our example
And the innermost layer is always the data (PersonData), which holds the properties.
The properties can be accessed as defined in the entity's interface:
The creator can also be used to create copies. Just hand over the existing one as template:
This will create a "Mike Doe".
The data layer is always immutable. In order to update the values we have to replace the data layer completely. This is done with the updater. The property setter methods can be chained, so it is easy to update multiple properties, for example:
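The mechanism can be sketched by hand as follows. This is not the generated updater: AddressDataSketch, AddressEntitySketch and AddressUpdaterSketch are hypothetical names, and the method names are assumptions chosen for this illustration.

```java
// Immutable data layer (hypothetical example entity with two properties).
final class AddressDataSketch {
    final String street;
    final String city;

    AddressDataSketch(final String street, final String city) {
        this.street = street;
        this.city = city;
    }
}

// Identity layer holding the current data instance.
final class AddressEntitySketch {
    private AddressDataSketch data;

    AddressEntitySketch(final String street, final String city) {
        this.data = new AddressDataSketch(street, city);
    }

    AddressDataSketch data() {
        return this.data;
    }

    void setData(final AddressDataSketch newData) {
        this.data = newData;
    }
}

// Updater: collects new values, then replaces the data layer in one step.
final class AddressUpdaterSketch {
    private final AddressEntitySketch entity;
    private String street;
    private String city;

    AddressUpdaterSketch(final AddressEntitySketch entity) {
        this.entity = entity;
        // Start from the current values so untouched properties are kept.
        this.street = entity.data().street;
        this.city = entity.data().city;
    }

    // Each setter returns the updater itself, which makes the calls chainable.
    AddressUpdaterSketch street(final String newStreet) {
        this.street = newStreet;
        return this;
    }

    AddressUpdaterSketch city(final String newCity) {
        this.city = newCity;
        return this;
    }

    // Builds a fresh immutable data instance and swaps it into the entity.
    void update() {
        this.entity.setData(new AddressDataSketch(this.street, this.city));
    }
}
```

A call such as new AddressUpdaterSketch(address).street("Main St 2").update() then replaces the data instance while leaving the city unchanged. The previous data instance is never modified.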
If only one property needs to be updated, the updater class offers static convenience methods for that:
An arbitrary number of logic layers can be added to entities. Let's use the predefined versioning layer. It will keep track of all changes. Technically, every new data layer added by the updater creates a new version entry.
Let's have a look at the debugger:
Now the versioning layer is chained between the identity layer and the data layer.
If we update the entity a few times, we will see how the versioning layer works. In this case we use an auto-incrementing Long as key.
If you want to access older versions use the context:
To limit the amount of preserved versions, a cleaner can be utilized:
This will keep only the last ten versions of the person.
In addition to numeric keys, timestamps can be used as well.
They can be preserved for a specific time range:
The version context can be used as a shared state object. So you can control versioning for multiple entities at once, or even for the whole entity graph.
The auto-incrementing contexts take care of the key creation. If you need to control it by yourself, use the mutable context. But be aware that you have to set the version before updating any data, otherwise the current one will be overwritten.
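The bookkeeping behind an auto-incrementing key combined with a cleaner that preserves only the last n versions can be sketched independently of MicroStream. VersionedValueSketch and its methods are hypothetical names. The real versioning layer and its context types are not shown here.

```java
import java.util.Collections;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical sketch: keeps data versions under auto-incrementing Long keys
// and discards the oldest ones once more than maxPreserved exist.
final class VersionedValueSketch<T> {
    private final TreeMap<Long, T> versions = new TreeMap<>();
    private final int maxPreserved;
    private long nextKey = 1;

    VersionedValueSketch(final T initial, final int maxPreserved) {
        if (maxPreserved < 1) {
            throw new IllegalArgumentException("maxPreserved must be at least 1");
        }
        this.maxPreserved = maxPreserved;
        this.update(initial);
    }

    // Every update adds a new version entry, then applies the cleanup rule.
    void update(final T newData) {
        this.versions.put(this.nextKey++, newData);
        while (this.versions.size() > this.maxPreserved) {
            this.versions.pollFirstEntry(); // drop the oldest version
        }
    }

    // The most recent version is the entity's current data.
    T current() {
        return this.versions.lastEntry().getValue();
    }

    // Returns the data stored under the key, or null if it was cleaned up.
    T version(final long key) {
        return this.versions.get(key);
    }

    // Read-only view of all preserved versions, oldest first.
    SortedMap<Long, T> versions() {
        return Collections.unmodifiableSortedMap(this.versions);
    }
}
```

With maxPreserved set to 3 and five updates in total, only the versions with keys 3, 4 and 5 remain accessible. A timestamp-based variant would use the same structure with time values as keys and an age limit as the cleanup rule.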
Another predefined logic layer is for logging purposes. Since there is a myriad of loggers out there, MicroStream doesn't provide any special adapter, but a generic type which can be used to adapt to the logging framework of your choice.
Just create a class and implement EntityLogger, and you are good to go.
In addition to afterUpdate, there are further hooks:
entityCreated
afterRead
beforeUpdate
Now just add the logger when creating entities:
When you call
the logger's output is
Entities can be created with an arbitrary number of layers, so feel free to combine them as you like:
MicroStream uses a strictly interface-based architecture. All types in the public API are, whenever possible, interfaces. This offers the best possibilities to extend or exchange parts of the engine. A good way to enrich a type with features is the wrapper (decorator) pattern.
For example, let's say we want to add logging to the PersistenceStoring's store(object) method.
Conventionally, it would be done like this: a new type implementing the original interface is handed the wrapped instance, and all interface methods have to be implemented and delegated. Only in the single method where we want to add functionality is the logging actually implemented.
This produces a lot of overhead. In this case, three methods are just boilerplate code to delegate the calls to the wrapped instance. A common solution for that is to create an abstract base wrapper type for the designated interface, and to reuse it whenever needed.
And then, based on that, the implementation of the logger type would look like this:
That's better. No more boilerplate code. Just override the methods you want to extend.
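The base wrapper approach can be sketched with a small, self-contained example. StorerSketch is a hypothetical interface standing in for PersistenceStoring and does not reproduce its actual methods. InMemoryStorer, StorerWrapperBase and LoggingStorer are likewise invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical storing interface (a stand-in, not MicroStream's API).
interface StorerSketch {
    long store(Object instance);

    long[] storeAll(Object... instances);
}

// Trivial implementation that just hands out increasing ids.
final class InMemoryStorer implements StorerSketch {
    private long lastId = 0;

    @Override
    public long store(final Object instance) {
        return ++this.lastId;
    }

    @Override
    public long[] storeAll(final Object... instances) {
        final long[] ids = new long[instances.length];
        for (int i = 0; i < instances.length; i++) {
            ids[i] = this.store(instances[i]);
        }
        return ids;
    }
}

// Reusable base wrapper: contains all the delegation boilerplate once.
abstract class StorerWrapperBase implements StorerSketch {
    private final StorerSketch wrapped;

    protected StorerWrapperBase(final StorerSketch wrapped) {
        this.wrapped = wrapped;
    }

    @Override
    public long store(final Object instance) {
        return this.wrapped.store(instance);
    }

    @Override
    public long[] storeAll(final Object... instances) {
        return this.wrapped.storeAll(instances);
    }
}

// The logging wrapper only overrides the one method it wants to extend.
final class LoggingStorer extends StorerWrapperBase {
    final List<String> log = new ArrayList<>();

    LoggingStorer(final StorerSketch wrapped) {
        super(wrapped);
    }

    @Override
    public long store(final Object instance) {
        this.log.add("Storing " + instance);
        return super.store(instance);
    }
}
```

In this sketch the log is collected in a list to keep the example free of any logging framework. Note that storeAll is delegated unchanged, so calls to it are not logged unless that method is overridden too.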
The only work left is to generate the base wrapper types. One way is to let your IDE generate the wrapper or delegation code. The disadvantage is that this has to be redone every time your interfaces change. A code generator that does it automatically would be nice, and that's what the base module brings along. Like the layered entity code generator, it is an annotation processor.
The wrapper code generator is an annotation processor, provided by the base module.
The maven configuration looks like this:
There are the following ways to get the base wrapper types generated. If you want it for your own types, the best way is to use the GenerateWrapper annotation.
If you want it for interfaces in libraries, like PersistenceStoring, you cannot add an annotation. That's what the microstream.wrapper.types parameter is for. It is just a comma-separated list of types. Alternatively, you can use the GenerateWrapperFor annotation:
It accepts a list of type names. Plain strings have to be used instead of class literals, because it is read inside the compilation cycle which prohibits access to class elements.
MicroStream's wrapper code generator generates the following wrapper type for PersistenceStoring:
It is not an abstract class, but an interface, which extends the Wrapper interface of the base module, and the wrapped type itself. This offers you the most flexible way to use it in your application.
The Wrapper type is just a typed interface and an abstract implementation of itself.
You can either implement the Wrapper interface and provide the wrapped instance via the wrapped() method, or you can extend the abstract class and hand over the wrapped instance to the super constructor.
Version with the abstract type:
Or only the interface, then you have to provide the wrapped instance via wrapped():
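Both variants can be illustrated with a self-contained sketch of the pattern. None of the types below are MicroStream's: WrapperSketch and AbstractWrapperSketch only imitate the described shape (a typed interface with a wrapped() method plus an abstract implementation), GreeterSketch is a hypothetical wrapped interface, and WrapperGreeterSketch shows what a generated wrapper interface with delegating default methods could look like.

```java
// Typed wrapper interface (imitates the described shape, hypothetical name).
interface WrapperSketch<W> {
    W wrapped();
}

// Abstract implementation that stores the wrapped instance.
abstract class AbstractWrapperSketch<W> implements WrapperSketch<W> {
    private final W wrapped;

    protected AbstractWrapperSketch(final W wrapped) {
        this.wrapped = wrapped;
    }

    @Override
    public final W wrapped() {
        return this.wrapped;
    }
}

// Hypothetical interface to be wrapped.
interface GreeterSketch {
    String greet(String name);

    String farewell(String name);
}

// What a generated wrapper could look like: an interface extending both the
// wrapper type and the wrapped type, delegating through default methods.
interface WrapperGreeterSketch extends WrapperSketch<GreeterSketch>, GreeterSketch {
    @Override
    default String greet(final String name) {
        return this.wrapped().greet(name);
    }

    @Override
    default String farewell(final String name) {
        return this.wrapped().farewell(name);
    }
}

// Plain implementation used as the wrapped instance.
final class PlainGreeter implements GreeterSketch {
    @Override
    public String greet(final String name) {
        return "Hello " + name;
    }

    @Override
    public String farewell(final String name) {
        return "Bye " + name;
    }
}

// Variant 1: extend the abstract type, pass the instance to the super constructor.
final class ShoutingGreeter extends AbstractWrapperSketch<GreeterSketch>
    implements WrapperGreeterSketch {

    ShoutingGreeter(final GreeterSketch wrapped) {
        super(wrapped);
    }

    @Override
    public String greet(final String name) {
        return WrapperGreeterSketch.super.greet(name).toUpperCase();
    }
}

// Variant 2: implement only the interface and provide wrapped() yourself.
final class PoliteGreeter implements WrapperGreeterSketch {
    private final GreeterSketch wrapped;

    PoliteGreeter(final GreeterSketch wrapped) {
        this.wrapped = wrapped;
    }

    @Override
    public GreeterSketch wrapped() {
        return this.wrapped;
    }

    @Override
    public String farewell(final String name) {
        return WrapperGreeterSketch.super.farewell(name) + ", take care";
    }
}
```

The interface-only variant keeps the class free to extend some other base class, which is the flexibility the text refers to. The abstract variant saves the field and accessor.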