



































An in-depth tutorial on working with objects in Hibernate, focusing on the persistence lifecycle and the concept of object states. It covers transient objects, persistent instances, and detached objects, explaining their differences and the implications for Java object identity. The document also introduces the concept of conversations and the strategies for implementing them with detached objects or an extended persistence context.




































what-when-how
In Depth Tutorials and Information
Because Hibernate is a transparent persistence mechanism (classes are unaware of their own persistence capability), it’s possible to write application logic that is unaware whether the objects it operates on represent persistent state or temporary state that exists only in memory. The application shouldn’t necessarily need to care that an object is persistent when invoking its methods. You can, for example, invoke the calculateTotalPrice() business method on an instance of the Item class without having to consider persistence at all, as you would in a unit test.
Any application with persistent state must interact with the persistence service whenever it needs to propagate state held in memory to the database (or vice versa). In other words, you have to call Hibernate (or the Java Persistence) interfaces to store and load objects.
Different ORM solutions use different terminology and define different states and state transitions for the persistence lifecycle. Moreover, the object states used internally may be different from those exposed to the client application. Hibernate defines only four states, hiding the complexity of its internal implementation from the client code.
The object states defined by Hibernate and their transitions in a state chart are shown in figure 9.1. You can also see the method calls to the persistence manager API that trigger transitions. This API in Hibernate is the Session. We discuss this chart in this topic; refer to it whenever you need an overview.
We’ve also included the states of Java Persistence entity instances in figure 9.1. As you can see, they’re almost equivalent to Hibernate’s, and most methods of the Session have a counterpart on the EntityManager API (shown in italics). We say that Hibernate offers a superset of the functionality standardized in Java Persistence.
Some methods are available on both APIs; for example, the Session has a persist() operation with the same semantics as the EntityManager’s counterpart. Others, like load() and getReference(), share semantics under different method names.
During its life, an object can transition from a transient object to a persistent object to a detached object. Let’s explore the states and transitions in more detail.
Figure 9.1 Object states and their transitions as triggered by persistence manager operations
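The transitions discussed in this topic can be sketched as a small state machine. This is only a toy model, not a Hibernate API: the type names ObjectState, Operation, and LifecycleSketch are hypothetical, and the operation names are simplified stand-ins for the Session methods mentioned in the text (save(), delete(), closing the Session, and reattachment).

```java
import java.util.EnumMap;
import java.util.Map;

/** Object states as described in the text (toy model, not a Hibernate type). */
enum ObjectState { TRANSIENT, PERSISTENT, DETACHED, REMOVED }

/** Simplified, hypothetical operation names standing in for persistence manager calls. */
enum Operation { SAVE, DELETE, CLOSE_SESSION, REATTACH, END_UNIT_OF_WORK }

final class LifecycleSketch {
    private static final Map<ObjectState, Map<Operation, ObjectState>> TRANSITIONS =
            new EnumMap<>(ObjectState.class);

    static {
        allow(ObjectState.TRANSIENT, Operation.SAVE, ObjectState.PERSISTENT);
        allow(ObjectState.PERSISTENT, Operation.DELETE, ObjectState.REMOVED);
        allow(ObjectState.PERSISTENT, Operation.CLOSE_SESSION, ObjectState.DETACHED);
        allow(ObjectState.DETACHED, Operation.REATTACH, ObjectState.PERSISTENT);
        // A removed instance is an ordinary transient instance once the unit of work ends.
        allow(ObjectState.REMOVED, Operation.END_UNIT_OF_WORK, ObjectState.TRANSIENT);
    }

    private static void allow(ObjectState from, Operation op, ObjectState to) {
        TRANSITIONS.computeIfAbsent(from, k -> new EnumMap<Operation, ObjectState>(Operation.class))
                   .put(op, to);
    }

    /** Returns the next state, or fails if this sketch does not model the transition. */
    static ObjectState apply(ObjectState current, Operation op) {
        Map<Operation, ObjectState> byOperation = TRANSITIONS.get(current);
        ObjectState next = (byOperation == null) ? null : byOperation.get(op);
        if (next == null) {
            throw new IllegalStateException(op + " is not modeled for state " + current);
        }
        return next;
    }
}
```

The sketch deliberately leaves out transitions that the figure covers but this excerpt doesn’t describe in detail; it is meant as a reading aid for the sections that follow.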
Transient objects
Objects instantiated using the new operator aren’t immediately persistent. Their state is transient, which means they aren’t associated with any database table row and so their state is lost as soon as they’re no longer referenced by any other object. These objects have a lifespan that effectively ends at that time, and they become inaccessible and available for garbage collection. Java Persistence doesn’t include a term for this state; entity objects you just instantiated are new. We’ll continue to refer to them as transient to emphasize the potential for these instances to become managed by a persistence service.
Hibernate and Java Persistence consider all transient instances to be nontransactional; any modification of a transient instance isn’t known to a persistence context. This means that Hibernate doesn’t provide any roll-back functionality for transient objects.
Objects that are referenced only by other transient instances are, by default, also transient. For an instance to transition from transient to persistent state (that is, to become managed), either a call to the persistence manager or the creation of a reference from an already persistent instance is required.
Persistent objects
A persistent instance is an entity instance with a database identity, as defined in “Mapping entities with identity.” That means a persistent and managed instance has a primary key value set as its database identifier. (There are some variations to when this identifier is assigned to a persistent instance.)
Persistent instances may be objects instantiated by the application and then made persistent by calling one of the methods on the persistence manager. They may even be objects that became
You should now have a basic understanding of object states and how transitions occur. Our next topic is the persistence context and the management of objects it provides.
You may consider the persistence context to be a cache of managed entity instances. The persistence context isn’t something you see in your application; it isn’t an API you can call. In a Hibernate application, we say that one Session has one internal persistence context. In a Java Persistence application, an EntityManager has a persistence context. All entities in persistent state and managed in a unit of work are cached in this context. We walk through the Session and EntityManager APIs later in this topic. Now you need to know what this (internal) persistence context is buying you.
■ Hibernate can do automatic dirty checking and transactional write-behind.
■ Hibernate can use the persistence context as a first-level cache.
■ Hibernate can guarantee a scope of Java object identity.
■ Hibernate can extend the persistence context to span a whole conversation.
All these points are also valid for Java Persistence providers. Let’s look at each feature.
Automatic dirty checking
Persistent instances are managed in a persistence context; their state is synchronized with the database at the end of the unit of work. When a unit of work completes, state held in memory is propagated to the database by the execution of SQL INSERT, UPDATE, and DELETE statements (DML). This procedure may also occur at other times. For example, Hibernate may synchronize with the database before execution of a query. This ensures that queries are aware of changes made earlier during the unit of work.
Hibernate doesn’t update the database row of every single persistent object in memory at the end of the unit of work. ORM software must have a strategy for detecting which persistent objects have been modified by the application. We call this automatic dirty checking. An object with modifications that have not yet been propagated to the database is considered dirty. Again, this state isn’t visible to the application. With transparent transaction-level write-behind, Hibernate propagates state changes to the database as late as possible but hides this detail from the application. By executing DML as late as possible (toward the end of the database transaction), Hibernate tries to keep lock times in the database as short as possible. (DML usually creates locks in the database that are held until the transaction completes.)
Hibernate is able to detect exactly which properties have been modified so that it’s possible to include only the columns that need updating in the SQL UPDATE statement. This may bring some performance gains. However, it’s usually not a significant difference and, in theory, could harm performance in some environments. By default, Hibernate includes all columns of a mapped table in the SQL UPDATE statement (hence, Hibernate can generate this basic SQL at startup, not at runtime). If you want to update only modified columns, you can enable dynamic SQL generation by setting dynamic-update="true" in a class mapping. The same mechanism is implemented for insertion of new records, and you can enable runtime generation of INSERT statements with dynamic-insert="true". We recommend you consider this setting when you have an extraordinarily large number of columns in a table (say, more than 50); at some point, the network overhead for unchanged fields will be noticeable.
In rare cases, you may also want to supply your own dirty checking algorithm to Hibernate. By default, Hibernate compares an old snapshot of an object with the snapshot at synchronization time, and it detects any modifications that require an update of the database state. You can implement your own routine by supplying a custom findDirty() method with an org.hibernate.Interceptor for a Session. We’ll show you an implementation of an interceptor later in the topic.
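The snapshot comparison described above can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not Hibernate’s implementation: entity state is represented as a plain map of column names to values, and the class DirtyCheckSketch, its methods, and the ID column name are all hypothetical.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

/** Toy snapshot-based dirty check; models the idea only. */
final class DirtyCheckSketch {

    /** Returns the names of properties whose current value differs from the snapshot. */
    static Set<String> findDirtyProperties(Map<String, Object> snapshot,
                                           Map<String, Object> current) {
        Set<String> dirty = new TreeSet<>();
        for (Map.Entry<String, Object> entry : current.entrySet()) {
            if (!Objects.equals(snapshot.get(entry.getKey()), entry.getValue())) {
                dirty.add(entry.getKey());
            }
        }
        return dirty;
    }

    /** Builds an UPDATE that names only the modified columns, as a dynamic update would. */
    static String dynamicUpdateSql(String table, Set<String> dirtyColumns) {
        if (dirtyColumns.isEmpty()) {
            throw new IllegalArgumentException("nothing to update");
        }
        StringBuilder sql = new StringBuilder("update " + table + " set ");
        String separator = "";
        for (String column : dirtyColumns) {
            sql.append(separator).append(column).append(" = ?");
            separator = ", ";
        }
        return sql.append(" where ID = ?").toString();
    }
}
```

With the default (non-dynamic) behavior, the statement would instead list every mapped column regardless of what changed, which is why it can be generated once at startup.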
We’ll also get back to the synchronization process (known as flushing) and when it occurs later in this topic.
The persistence context cache
A persistence context is a cache of persistent entity instances. This means it remembers all persistent entity instances you’ve handled in a particular unit of work. Automatic dirty checking is one of the benefits of this caching. Another benefit is repeatable read for entities and the performance advantage of a unit of work-scoped cache.
For example, if Hibernate is told to load an object by primary key (a lookup by identifier), it can first check the persistence context for the current unit of work. If the entity is found there, no database hit occurs—this is a repeatable read for an application. The same is true if a query is executed through one of the Hibernate (or Java Persistence) interfaces. Hibernate reads the result set of the query and marshals entity objects that are then returned to the application. During this process, Hibernate interacts with the current persistence context. It tries to resolve every entity instance in this cache (by identifier); only if the instance can’t be found in the current persistence context does Hibernate read the rest of the data from the result set.
The persistence context cache offers significant performance benefits and improves the isolation guarantees in a unit of work (you get repeatable read of entity instances for free). Because this cache only has the scope of a unit of work, it has no real disadvantages, such as lock management for concurrent access—a unit of work is processed in a single thread at a time.
The persistence context cache also ensures the following:
■ The persistence layer isn’t vulnerable to stack overflows in the case of circular references in a graph of objects.
■ There can never be conflicting representations of the same database row at the end of a unit of work. In the persistence context, at most a single object represents any database row. All changes made to that object may be safely written to the database.
■ Likewise, changes made in a particular persistence context are always immediately visible to all other code executed inside that persistence context and its unit of work (the repeatable read for entities guarantee).
You don’t have to do anything special to enable the persistence context cache. It’s always on and, for the reasons shown, can’t be turned off.
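A lookup by identifier that consults the persistence context before going to the database can be sketched as follows. This is a toy identity map, not a Hibernate class: ToyPersistenceContext, its loader function, and the hit counter are hypothetical names introduced only for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Toy persistence context: an identity map in front of a loader. */
final class ToyPersistenceContext<T> {
    private final Map<Long, T> managed = new HashMap<>();
    private final Function<Long, T> databaseLoader; // stands in for a SELECT by primary key
    private int databaseHits = 0;

    ToyPersistenceContext(Function<Long, T> databaseLoader) {
        this.databaseLoader = databaseLoader;
    }

    /** Consults the context first; only a miss results in a "database" hit. */
    T get(Long id) {
        T cached = managed.get(id);
        if (cached != null) {
            return cached;
        }
        databaseHits++;
        T loaded = databaseLoader.apply(id);
        if (loaded != null) {
            managed.put(id, loaded);
        }
        return loaded;
    }

    /** Forgets all managed instances, comparable in spirit to clearing the context. */
    void clear() {
        managed.clear();
    }

    int databaseHits() {
        return databaseHits;
    }
}
```

Two lookups of the same identifier return the same instance and cost one hit; after clear(), the next lookup goes back to the loader and yields a new instance.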
Later in this topic, we’ll show you how objects are added to this cache (basically, whenever they become persistent) and how you can manage this cache (by detaching objects manually from the persistence context, or by clearing the persistence context).
The last two items on our list of benefits of a persistence context, the guaranteed scope of
identity and the possibility to extend the persistence context to span a conversation, are closely
A persistence context only spans the processing of a particular request, and the application manually reattaches and merges (and sometimes detaches) entity instances during the conversation.
The alternative approach doesn’t require manual reattachment or merging: With the session-per-conversation pattern, you extend a persistence context to span the whole unit of work (see figure 9.3).
First we have a closer look at detached objects and the problem of identity you’ll face when you implement a conversation with this strategy.
Figure 9.3 Conversation implementation with an extended persistence context
As application developers, we identify an object using Java object identity (a==b). If an object changes state, is the Java identity guaranteed to be the same in the new state? In a layered application, that may not be the case.
In order to explore this, it’s extremely important to understand the relationship between Java identity, a==b, and database identity, x.getId().equals( y.getId() ). Sometimes they’re equivalent; sometimes they aren’t. We refer to the conditions under which Java identity is equivalent to database identity as the scope of object identity.
■ A primitive persistence layer with no identity scope makes no guarantees that if a row is accessed twice the same Java object instance will be returned to the application. This becomes problematic if the application modifies two different instances that both represent the same row in a single unit of work. (How should we decide which state should be propagated to the database?)
■ A persistence layer using persistence context-scoped identity guarantees that, in the scope of a single persistence context, only one object instance represents a particular database row. This avoids the previous problem and also allows for some caching at the context level.
■ Process-scoped identity goes one step further and guarantees that only one object instance represents the row in the whole process (JVM).
For a typical web or enterprise application, persistence context-scoped identity is preferred. Process-scoped identity does offer some potential advantages in terms of cache utilization and the programming model for reuse of instances across multiple units of work. However, in a pervasively multithreaded application, the cost of always synchronizing shared access to persistent objects in the global identity map is too high a price to pay. It’s simpler, and more scalable, to have each thread work with a distinct set of persistent instances in each persistence context.
We would say that Hibernate implements persistence context-scoped identity. So, by nature, Hibernate is best suited for highly concurrent data access in multiuser applications. However, we already mentioned some issues you’ll face when objects aren’t associated with a persistence context. Let’s discuss this with an example.
The Hibernate identity scope is the scope of a persistence context. Let’s see how this works in code with Hibernate APIs—the Java Persistence code is the equivalent with EntityManager instead of Session. Even though we haven’t shown you much about these interfaces, the following examples are simple, and you should have no problems understanding the methods we call on the Session.
If you request two objects using the same database identifier value in the same Session, the result is two references to the same in-memory instance. Listing 9.1 demonstrates this with several get() operations in two Sessions.
Listing 9.1 The guaranteed scope of object identity in Hibernate
Object references a and b have not only the same database identity, but also the same Java identity, because they’re obtained in the same Session. They reference the same persistent instance known to the persistence context for that unit of work. Once you’re outside this boundary, however, Hibernate doesn’t guarantee Java identity, so a and c aren’t identical. Of course, a test for database identity, a.getId().equals( c.getId() ), will still return true.
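The behavior of a, b, and c described above can be reproduced with a small stand-alone model. It does not use the Hibernate API: ToySession is a hypothetical stand-in for a Session with its own persistence context, ItemSketch is a hypothetical entity, and the identifier value is arbitrary.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical entity carrying only a database identifier. */
class ItemSketch {
    private final Long id;
    ItemSketch(Long id) { this.id = id; }
    Long getId() { return id; }
}

/** Toy stand-in for a Session: one instance per identifier, within this context only. */
final class ToySession {
    private final Map<Long, ItemSketch> context = new HashMap<>();

    ItemSketch get(Long id) {
        // A real lookup would read a row; here a new object simply represents it.
        return context.computeIfAbsent(id, ItemSketch::new);
    }
}

final class IdentityScopeDemo {
    public static void main(String[] args) {
        ToySession session1 = new ToySession();
        ItemSketch a = session1.get(1234L);
        ItemSketch b = session1.get(1234L);

        ToySession session2 = new ToySession();
        ItemSketch c = session2.get(1234L);

        System.out.println(a == b);                      // same context: Java identity holds
        System.out.println(a == c);                      // different contexts: distinct instances
        System.out.println(a.getId().equals(c.getId())); // database identity is still the same
    }
}
```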
override the equals() and hashCode() methods before using Hibernate (or Java Persistence).
Traditionally, Java developers seem to be unaware of the intricate details of such an implementation. The longest discussion threads on the public Hibernate forum are about this equality problem, and the “blame” is often put on Hibernate. You should be aware of the fundamental issue: Every object-oriented programming language with hash-based collections requires a custom equality routine if the default contract doesn’t offer the desired semantics. The detached object state in a Hibernate application exposes you to this problem, maybe for the first time.
On the other hand, you may not have to override equals() and hashCode(). The identity scope guarantee provided by Hibernate is sufficient if you never compare detached instances—that is, if you never put detached instances into the same Set. You may decide to design an application that doesn’t use detached objects. You can apply an extended persistence context strategy for your conversation implementation and eliminate the detached state from your application completely. This strategy also extends the scope of guaranteed object identity to span the whole conversation. (Note that you still need the discipline to not compare detached instances obtained in two conversations!)
Let’s assume that you want to use detached objects and that you have to test them for equality with your own routine. You can implement equals() and hashCode() several ways. Keep in mind that when you override equals(), you always need to also override hashCode() so the two methods are consistent. If two objects are equal, they must have the same hash code.
A clever approach is to implement equals() to compare just the database identifier property (often a surrogate primary key) value:
Notice how this equals() method falls back to Java identity for transient instances (if id==null) that don’t have a database identifier value assigned yet. This is reasonable, because they can’t possibly be equal to a detached instance, which has an identifier value.
Unfortunately, this solution has one huge problem: Identifier values aren’t assigned by Hibernate until an object becomes persistent. If a transient object is added to a Set before being saved, its hash value may change while it’s contained by the Set, contrary to the contract of java.util.Set. In particular, this problem makes cascade save (discussed later in the topic) useless for sets. We strongly discourage this solution (database identifier equality).
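Database identifier equality and the Set problem it causes can be sketched as follows. The class names are hypothetical, setId() stands in for the identifier assignment that happens when an instance is saved, and the hashCode() shown is one plausible companion to the equals() described above rather than a prescribed implementation.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical User whose equality is based only on the database identifier. */
class IdEqualityUser {
    private Long id; // null until the instance becomes persistent

    Long getId() { return id; }
    void setId(Long id) { this.id = id; } // stands in for identifier assignment on save

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (id == null) return false; // transient: fall back to Java identity
        if (!(other instanceof IdEqualityUser)) return false;
        return id.equals(((IdEqualityUser) other).getId());
    }

    @Override
    public int hashCode() {
        return id == null ? System.identityHashCode(this) : id.hashCode();
    }
}

final class IdEqualityProblemDemo {
    public static void main(String[] args) {
        Set<IdEqualityUser> users = new HashSet<>();
        IdEqualityUser user = new IdEqualityUser();
        users.add(user);   // hashed while transient, using the identity hash code
        user.setId(42L);   // the hash code now derives from the identifier
        // The set will typically no longer find the element it still contains.
        System.out.println(users.contains(user));
    }
}
```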
A better way is to include all persistent properties of the persistent class, apart from any database identifier property, in the equals() comparison. This is how most people perceive the meaning of equals(); we call it by-value equality.
When we say all properties, we don’t mean to include collections. Collection state is associated with a different table, so it seems wrong to include it. More important, you don’t want to force the entire object graph to be retrieved just to perform equals(). In the case of User, this means you shouldn’t include the boughtItems collection in the comparison. This is the implementation you can write:
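A minimal sketch of by-value equality might look like this. The class ByValueUser and its username and password properties are assumptions made for illustration; the identifier and any collections (such as boughtItems) are deliberately left out of the comparison, as the text recommends.

```java
import java.util.Objects;

/** Hypothetical User compared by value: persistent properties except identifier and collections. */
class ByValueUser {
    private Long id;          // ignored by equals() and hashCode()
    private String username;  // assumed persistent property
    private String password;  // assumed persistent property

    ByValueUser(String username, String password) {
        this.username = username;
        this.password = password;
    }

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getUsername() { return username; }
    public String getPassword() { return password; }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof ByValueUser)) return false;
        ByValueUser that = (ByValueUser) other;
        return Objects.equals(getUsername(), that.getUsername())
            && Objects.equals(getPassword(), that.getPassword());
    }

    @Override
    public int hashCode() {
        return Objects.hash(getUsername(), getPassword());
    }
}
```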
place. This effort is required anyway; it’s important to identify any unique keys if your database must ensure data integrity via constraint checking.
For the User class, username is a great candidate business key. It’s never null, it’s unique with a database constraint, and it changes rarely, if ever:
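Business key equality on username could be sketched as shown here. BusinessKeyUser is a hypothetical class, and the non-null check in the constructor is an assumption that mirrors the "never null" property stated above; the other object is read through its getter, not through direct field access.

```java
import java.util.Objects;

/** Hypothetical User whose equality is based on the username business key. */
class BusinessKeyUser {
    private Long id;               // database identifier, not part of equality
    private final String username; // business key: never null, unique, rarely changes
    private String password;       // not part of the business key

    BusinessKeyUser(String username) {
        this.username = Objects.requireNonNull(username, "username");
    }

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getUsername() { return username; }
    public void setPassword(String password) { this.password = password; }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof BusinessKeyUser)) return false;
        // Use the getter on the other object instead of reading its field directly.
        return getUsername().equals(((BusinessKeyUser) other).getUsername());
    }

    @Override
    public int hashCode() {
        return getUsername().hashCode();
    }
}
```

Because neither method depends on the identifier, an instance keeps the same hash code before and after it is saved, so it stays findable in a Set.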
■ Consider what attributes users of your application will refer to when they have to identify an object (in the real world). How do users tell the difference between one object and another if they’re displayed on the screen? This is probably the business key you’re looking for.
■ Every attribute that is immutable is probably a good candidate for the business key. Mutable attributes may be good candidates, if they’re updated rarely or if you can control the situation when they’re updated.
■ Every attribute that has a UNIQUE database constraint is a good candidate for the business key. Remember that the precision of the business key has to be good enough to avoid overlaps.
■ Any date or time-based attribute, such as the creation time of the record, is usually a good component of a business key. However, the accuracy of System.currentTimeMillis() depends on the virtual machine and operating system. Our recommended safety buffer is 50 milliseconds, which may not be accurate enough if the time-based property is the single attribute of a business key.
■ You can use database identifiers as part of the business key. This seems to contradict our previous statements, but we aren’t talking about the database identifier of the given class. You may be able to use the database identifier of an associated object. For example, a candidate business key for the Bid class is the identifier of the Item it was made for together with the bid amount. You may even have a unique constraint that represents this composite business key in the database schema. You can use the identifier value of the associated Item because it never changes during the lifecycle of a Bid; setting an already persistent Item is required by the Bid constructor.
If you follow our advice, you shouldn’t have much difficulty finding a good business key for all your business classes. If you have a difficult case, try to solve it without considering Hibernate— after all, it’s purely an object-oriented problem. Notice that it’s almost never correct to override equals() on a subclass and include another property in the comparison. It’s a little tricky to satisfy the requirements that equality be both symmetric and transitive in this case; and, more important, the business key may not correspond to any well-defined candidate natural key in the database (subclass properties may be mapped to a different table).
You may have also noticed that the equals() and hashCode() methods always access the properties of the “other” object via the getter methods. This is extremely important, because the object instance passed as other may be a proxy object, not the actual instance that holds the persistent state. To initialize this proxy to get the property value, you need to access it with a getter method. This is one point where Hibernate isn’t completely transparent. However, it’s a good practice to use getter methods instead of direct instance variable access anyway.
Let’s switch perspective now and consider an implementation strategy for conversations that doesn’t require detached objects and doesn’t expose you to any of the problems of detached object equality. If the identity scope issues you’ll possibly be exposed to when you work with detached objects seem too much of a burden, the second conversation-implementation strategy may be what you’re looking for. Hibernate and Java Persistence support the implementation of conversations with an extended persistence context: the session-per-conversation strategy.
Extending a persistence context
A particular conversation reuses the same persistence context for all interactions. All request processing during a conversation is managed by the same persistence context. The persistence context isn’t closed after a request from the user has been processed. It’s disconnected from the database and held in this state during user think-time. When the user continues in the conversation, the persistence context is reconnected to the database, and the next request can be processed. At the end of the conversation, the persistence context is synchronized with the database and closed. The next conversation starts with a fresh persistence context and doesn’t reuse any entity instances from the previous conversation; the pattern is repeated.
Note that this eliminates the detached object state! All instances are either transient (not known to a persistence context) or persistent (attached to a particular persistence context). This also eliminates the need for manual reattachment or merging of object state between contexts, which is one of the advantages of this strategy. (You still may have detached objects between conversations, but we consider this a special case that you should try to avoid.)
In Hibernate terms, this strategy uses a single Session for the duration of the conversation. Java Persistence has built-in support for extended persistence contexts and can even automatically store the disconnected context for you (in a stateful EJB session bean) between requests.
We’ll get back to conversations later in the topic and show you all the details about the two implementation strategies. You don’t have to choose one right now, but you should be aware of the consequences these strategies have on object state and object identity, and you should understand the necessary transitions in each case.
You should never create a new SessionFactory just to service a particular request. Creation of a SessionFactory is extremely expensive. On the other hand, Session creation is extremely inexpensive. The Session doesn’t even obtain a JDBC Connection until a connection is required.
The second line in the previous code begins a Transaction on another Hibernate interface. All operations you execute inside a unit of work occur inside a transaction, no matter if you read or write data. However, the Hibernate API is optional, and you may begin a transaction in any way you like; we’ll explore these options in the next topic. If you use the Hibernate Transaction API, your code works in all environments, so you’ll do this for all examples in the following sections.
After opening a new Session and persistence context, you use it to load and save objects.
Making an object persistent
The first thing you want to do with a Session is make a new transient object persistent with the save() method (listing 9.2).
Listing 9.2 Making a transient instance persistent
A new transient object item is instantiated as usual. Of course, you may also instantiate it after opening a Session; they aren’t related yet. A new Session is opened using the SessionFactory. You start a new transaction.
A call to save() makes the transient instance of Item persistent. It’s now associated with the current Session and its persistence context.
The changes made to persistent objects have to be synchronized with the database at some point. This happens when you commit() the Hibernate Transaction. We say a flush occurs (you can also call flush() manually; more about this later). To synchronize the persistence context, Hibernate obtains a JDBC connection and issues a single SQL INSERT statement. Note that this isn’t always true for insertion: Hibernate guarantees that the item object has an assigned database identifier after it has been saved, so an earlier INSERT may be necessary, depending on the identifier generator you have enabled in your mapping. The save() operation also returns the database identifier of the persistent instance.
The Session can finally be closed, and the persistence context ends. The reference item is now a reference to an object in detached state.
You can see the same unit of work and how the object changes state in figure 9.4.
It’s better (but not required) to fully initialize the Item instance before managing it with a Session. The SQL INSERT statement contains the values that were held by the object at the point when save() was called. You can modify the object after calling save(), and your changes will be propagated to the database as an (additional) SQL UPDATE.
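The interplay of save(), later modification, and flushing at commit can be modeled with a toy unit of work. Nothing here is Hibernate API: NamedItem, ToyUnitOfWork, and the log strings are hypothetical, and the model ignores identifier generators that may force an earlier INSERT.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical entity with a single property. */
class NamedItem {
    private String name;
    NamedItem(String name) { this.name = name; }
    String getName() { return name; }
    void setName(String name) { this.name = name; }
}

/** Toy write-behind: statements are recorded only when commit() flushes. */
final class ToyUnitOfWork {
    // Values held by each object at the time save() was called.
    private final Map<NamedItem, String> nameAtSave = new LinkedHashMap<>();
    private final List<String> executedSql = new ArrayList<>();

    void save(NamedItem item) {
        nameAtSave.put(item, item.getName());
    }

    void commit() {
        for (Map.Entry<NamedItem, String> entry : nameAtSave.entrySet()) {
            executedSql.add("INSERT name=" + entry.getValue());
            String currentName = entry.getKey().getName();
            if (!Objects.equals(entry.getValue(), currentName)) {
                // Changes made after save() become an additional UPDATE.
                executedSql.add("UPDATE name=" + currentName);
            }
        }
        nameAtSave.clear();
    }

    List<String> executedSql() {
        return executedSql;
    }
}
```

Initializing the object fully before save() avoids the second statement, which is the reason behind the recommendation above.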
Everything between session.beginTransaction() and tx.commit() occurs in one transaction.
For now, keep in mind that all database operations in transaction scope either completely succeed or completely fail. If one of the UPDATE or INSERT statements made during flushing on tx.commit() fails, all changes made to persistent objects in this transaction are rolled back at the database level. However, Hibernate doesn’t roll back in-memory changes to persistent objects. This is reasonable because a failure of a transaction is normally nonrecoverable, and you have to discard the failed Session immediately. We’ll discuss exception handling in the next topic.
Retrieving a persistent object
The Session is also used to query the database and retrieve existing persistent objects. Hibernate is especially powerful in this area, as you’ll see later in the topic. Two special methods are provided for the simplest kind of query: retrieval by identifier. The get() and load() methods are demonstrated in listing 9.3.
Listing 9.3 Retrieval of an Item by identifier
You can see the same unit of work in figure 9.5.
The retrieved object item is in persistent state and, as soon as the persistence context is closed, in detached state.
Figure 9.
Modifying a persistent instance
First, you retrieve the object from the database with the given identifier. You modify the object, and these modifications are propagated to the database during flush when tx.commit() is called. This mechanism is called automatic dirty checking: Hibernate tracks and saves the changes you make to an object in persistent state. As soon as you close the Session, the instance is considered detached.
Making a persistent object transient
You can easily make a persistent object transient, removing its persistent state from the database, with the delete() method (see listing 9.5).
Listing 9.5 Making a persistent object transient using delete()
Look at figure 9.7.
The item object is in removed state after you call delete(); you shouldn’t continue working with it, and, in most cases, you should make sure any reference to it in your application is removed. The SQL DELETE is executed only when the Session’s persistence context is synchronized with the database at the end of the unit of work. After the Session is closed, the item object is considered an ordinary transient instance. The transient instance is destroyed by the garbage collector if it’s no longer referenced by any other object. Both the in-memory object instance and the persistent database row will have been removed.
Figure 9.7 Making a persistent object transient
FAQ Do I have to load an object to delete it? Yes, an object has to be loaded into the persistence context; an instance has to be in persistent state to be removed (note that a proxy is good enough). The reason is simple: You may have Hibernate interceptors enabled, and the object must be passed through these interceptors to complete its lifecycle. If you delete rows in the database directly, the interceptor won’t run. Having said that, Hibernate (and Java Persistence) offer bulk operations that translate into direct SQL DELETE statements.
Hibernate can also roll back the identifier of any entity that has been deleted, if you enable the hibernate.use_identifier_rollback configuration option. In the previous example, Hibernate sets the database identifier property of the deleted item to null after deletion and flushing, if the option is enabled. It’s then a clean transient instance that you can reuse in a future unit of work.
Replicating objects
The operations on the Session we have shown you so far are all common; you need them in every Hibernate application. But Hibernate can help you with some special use cases—for example, when you need to retrieve objects from one database and store them in another. This is called replication of objects.
Replication takes detached objects loaded in one Session and makes them persistent in another Session. These Sessions are usually opened from two different SessionFactory instances that have been configured with a mapping for the same persistent class. Here is an example: