Java Architect

May 24, 2006

Cache vs Pool

Filed under: Caching,J2EE,java,Pooling — nikhilb020875 @ 7:14 am

Both caches and pools are used in J2EE environments.

A pool is a collection of stateless objects. Examples – database connection pools, thread pools, and servlet pools.

A cache is a collection of stateful objects. Examples – Entity Bean caches and Stateful Session Bean caches. Aside from Entity Beans and Stateful Session Beans, caches are useful for holding any data that you want to look up once and reference many times, such as JNDI entries, RMI services, and configuration file contents. Caches save time!

The main distinction between a cache and a pool is what is contained in them. In other words: when you retrieve an object from a cache or a pool, do you need a specific object, or will any object do? If you need a specific object, then each object maintains state; hence, you need to use a cache. If, on the other hand, you can use any object, then the objects do not maintain state and you can use a pool.

Let’s compare their performance considerations:

  • A request for a pooled object can be serviced by any object in the pool.
  • A request for a cached object can only be serviced by a specific object in the cache.
  • If all objects in a pool are in use when a request is made, then the request must wait for any object to be returned to the pool before the request can be satisfied.
  • If the requested object in a cache is in use when it is requested, then the request must wait. It doesn’t matter whether the rest of the objects in the cache are available, because a specific one is needed.
  • The size of a pool can be fixed or can grow. If a new object is requested from an empty pool, a new object can be created and added to the pool.
  • The size of a cache is usually fixed (because it holds specific objects and creating a new one is not always an option). However, if the cache is full and a new object needs to be loaded into the cache, an existing object has to be removed from the cache (activation and passivation).
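
To make the distinction concrete, here is a minimal sketch of both structures. SimplePool, SimpleCache and the demo class are names I made up for illustration; they are not part of J2EE or any container. The pool hands back any idle object and grows when empty, while the cache returns the one object stored under a key and, once full, evicts the least recently used entry (a simple stand-in for passivation).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

class PoolVsCacheDemo {
    public static void main(String[] args) {
        SimplePool<StringBuilder> pool = new SimplePool<>(StringBuilder::new);
        StringBuilder first = pool.borrow();
        pool.giveBack(first);
        System.out.println("pool reuses any idle object: " + (pool.borrow() == first));

        SimpleCache<String, String> cache = new SimpleCache<>(2, key -> "config for " + key);
        cache.get("jndi/a");
        cache.get("jndi/b");
        cache.get("jndi/c"); // cache is full, so the least recently used entry goes
        System.out.println("oldest entry evicted: " + !cache.contains("jndi/a"));
    }
}

/** A pool hands out ANY idle object; callers never ask for a particular one. */
class SimplePool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;

    SimplePool(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns any idle object, creating a new one when none is idle. */
    synchronized T borrow() {
        T obj = idle.poll();
        return obj != null ? obj : factory.get();
    }

    synchronized void giveBack(T obj) {
        idle.push(obj);
    }
}

/** A cache hands out the ONE object stored under a key; size is fixed. */
class SimpleCache<K, V> {
    private final Map<K, V> entries;
    private final Function<K, V> loader;

    SimpleCache(int capacity, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder=true keeps entries ordered from least to most recently used
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Returns the object for this key, loading it on first use. */
    synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = loader.apply(key);
            entries.put(key, value);
        }
        return value;
    }

    synchronized boolean contains(K key) {
        return entries.containsKey(key);
    }
}
```

Note that borrow() takes no argument while get() needs a key. That is the whole difference in one signature.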

Full Article is at: http://www.informit.com/guides/content.asp?g=java&seqNum=104&rl=1

May 19, 2006

Why ORM tools are not recommended

Filed under: hibernate,java,ORM — nikhilb020875 @ 12:46 pm

I am posting this at the risk of sounding anti-ORM. But this is the result of my study of different persistence strategies, and I’d be happy to be proved wrong.

  1. Distribution Readiness

Due to the stateful nature of the ORM Persistence Manager Object, the code is not cluster ready.

Let’s take the example of the Session object in Hibernate.

We have two business classes MyClass and SomeOtherClass:

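// Hibernate 3 package names (Hibernate 2 used net.sf.hibernate).
import org.hibernate.Session;
import org.hibernate.Transaction;

// HibernateUtil and UserMaster are the application's own helper and entity classes.
public class MyClass {

    public static void myMethod(String[] args) throws java.text.ParseException {
        Session s1 = HibernateUtil.currentSession();
        Transaction t1 = s1.beginTransaction();

        UserMaster user1 = (UserMaster) s1.load(UserMaster.class, "UserName22458");
        user1.setFname("FName s1");

        // Hand the loaded entity to another business class before committing.
        SomeOtherClass someOtherClass = new SomeOtherClass();
        someOtherClass.dosomething(user1);

        t1.commit();
    }
}

public class SomeOtherClass {

    public void dosomething(UserMaster user) {
        Session s = HibernateUtil.currentSession();
        UserMaster user2 = (UserMaster) s.load(UserMaster.class, "UserName22458");

        // Identity comparison: true only when both references come from the same session.
        System.out.println("compare same row in copied sessions. user==user2 -> " + (user == user2));
    }
}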

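This code gives different results if MyClass and SomeOtherClass are placed on different nodes of a cluster. On a single node both classes share the same session, so the two loads return the same instance and the comparison is true. On separate nodes each class gets its own session, and the comparison is false.

In other words, the code is not cluster ready because the session is stateful.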
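
The effect can be reproduced without Hibernate. Below is a rough sketch in which SketchSession and SketchUser are stand-ins I invented; they only imitate the per-session identity map and are not Hibernate classes. Loading the same id twice through one session yields the same instance, while a second session (as you would get on another node) yields a different instance that knows nothing about the uncommitted change.

```java
import java.util.HashMap;
import java.util.Map;

class SessionIdentityDemo {
    public static void main(String[] args) {
        SketchSession s1 = new SketchSession();
        SketchUser user1 = s1.load("UserName22458");
        user1.fname = "FName s1";

        // Same session (single node): the second load returns the very same instance.
        SketchUser sameNode = s1.load("UserName22458");
        System.out.println("same session:      " + (user1 == sameNode) + ", fname=" + sameNode.fname);

        // Different session (as on another node): a fresh instance without the pending change.
        SketchSession s2 = new SketchSession();
        SketchUser otherNode = s2.load("UserName22458");
        System.out.println("different session: " + (user1 == otherNode) + ", fname=" + otherNode.fname);
    }
}

/** Hypothetical entity; the default fname plays the role of the value stored in the database. */
class SketchUser {
    final String id;
    String fname = "FName from DB";

    SketchUser(String id) {
        this.id = id;
    }
}

/** Hypothetical session: one in-memory instance per id, for the lifetime of this session only. */
class SketchSession {
    private final Map<String, SketchUser> identityMap = new HashMap<>();

    SketchUser load(String id) {
        return identityMap.computeIfAbsent(id, SketchUser::new);
    }
}
```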

  2. One of the main disadvantages that I see with ORM tools is that the use of the cache is forced on us:

ORM tools heavily depend on object caches. In an ORM tool object caches are used to:

a. Ensure that objects are unique within memory and

b. Improve application performance

The primary reason caches are universal in ORM tools is to ensure that objects are unique within memory and that has nothing to do with the primary reason why the caches should be used in the first place. A data cache is useful only when we have data that is accessed frequently but is rarely changed. That’s the only case when an application can benefit by the usage of a cache and this situation is not true for all applications.

The problem becomes more apparent when we try to use ORM tools in clustered environments, or when another application accesses the database without going through the same cache.

In clustered environments we’d have to live with the disadvantages associated with data cache synchronization. That does not make sense if I am not getting any benefit out of the cache.

Am I right in saying that ORM tools should only be used for:

a. Applications that will not use clustering for scaling.

b. Applications that have data that is accessed frequently but is rarely changed so that the overheads of cache synchronization are justified.

Other disadvantages:

  1. Less Control on SQL Queries
  2. Data and behavior are not separated
  3. Façade needs to be built if the data model is to be made available in a distributed architecture
  4. Each ORM technology/product has a different set of APIs and porting code between them is not easy

Points 2, 3 and 4 can be overcome by using a DAO layer to hide the ORM tool. But that would also mean that the DAO layer does not expose the same set of persistent objects that the ORM layer works on. This would have two disadvantages:

a. Persistent objects have to be translated back and forth to the objects exposed by the DAO layer.

b. The fact that the ORM layer does not work on the domain objects directly seems to take a certain edge off the ORM solution. After all, one of the benefits of an ORM solution is that a complicated domain model with all its relationships can be made persistent.

Nikhil Bajpai (http://www.geocities.com/nikhilb020875/)

May 16, 2006

Connections and Threads

Filed under: connection,database,java,session,threads — nikhilb020875 @ 11:05 am

Question: What is the best way to manage JDBC connections?

Issues:

  1. Passing the connection around as a parameter is a nuisance. (The bigger point of contention is the 'global variables' vs. 'passing parameters' style of coding. To some, the 'global variables' approach seems less object oriented.)
  2. I want to ensure that within the session I use the same connection object across requests.

Solution:

Point number one is best solved by using ThreadLocal.

Point number two can only be solved by saving the connection in the HttpSession. That again is a source of contention, because:

  1. You end up needing a database connection (and an active transaction) per concurrent *user* of your application (i.e. everyone who is logged on and has an active session) instead of a database connection per current *request*.
  2. The database connection remains unavailable to other users.
  3. Other users have to wait to access rows that the session is in the middle of updating and that its transaction has locked.
  4. What if there are multiple requests for the same session? This can happen more often than you might think. In any such scenario, it is not safe to assume that a Connection retrieved from a session is only being used by a single thread. You'll need to ensure that you don't try to use it from more than one thread simultaneously. A couple of common causes:

a. Framed presentation, because the browser will often trigger multiple simultaneous requests, and they all participate in the same session.

b. The user submits a request that starts a long SQL query, presses STOP, and then submits a second request while the first one is still running.

Nevertheless, there are situations where a transaction spans multiple requests.

Let’s discuss the merits of using ThreadLocal:

This is what I got from one of the mailing lists: (http://marc2.theaimsgroup.com/?l=tomcat-user&m=103609047904032&w=2)
The notion of binding a Connection to a particular Thread using a ThreadLocal is a pretty good idea. The big advantage to my mind is that you can use the Connection from any level of the call tree without passing it around as a parameter. What you need to keep in mind is that you must reset the ThreadLocal before the Thread leaves your control. In other words:

  • At the beginning of your processing, check out a Connection from a connection pool and bind it to the ThreadLocal.
  • Throughout processing, the ThreadLocal can be used to obtain access to the Connection.
  • At the end of your processing, close the connection (which returns it to the connection pool) and reset the ThreadLocal.

If you need to maintain database state across multiple HTTP requests, a slight modification is needed: before you check a Connection out from the connection pool, look in the HttpSession to see whether there is a Connection there already. If there is, bind that Connection to the ThreadLocal. Otherwise, obtain a new Connection and put it both in the HttpSession and in the ThreadLocal.

In the context of servlets, one good way to structure this would be to create a ServletWithConnection class (extending HttpServlet), and have its service() method look something like this:

protected static ThreadLocal tl = new ThreadLocal();

protected void service (……) {
    Connection conn = connPool.getConnection();
    tl.set(conn);
    try {
        super.service(……);
    } finally {
        // Always unbind and return the connection, even if the request fails.
        tl.set(null);
        conn.close();
    }
}

Then you can extend this servlet whenever you need a database connection, override doGet, doPost, whatever, as normal, and access the Connection via tl.get() whenever you need it.

Incidentally, Martin Fowler talks about this in his new book on enterprise architecture patterns — look at the "registry" pattern.

The following are the points against the ThreadLocal approach:

  1. If you're going to keep a Connection in the session (across requests) at all, you're wasting your time bothering with a connection pool — you’ve already said you're willing to leave an expensive resource (an open connection to the database) allocated to a particular user in between requests, even if the user went to lunch. You might as well just open a connection when the session is created, and close it when the session completes.
  2. The pattern mentioned above (extending HttpServlet) does not violate these principles (because it gives the connection back before the request is completed), but it does do unnecessary work if you're running on a servlet container that provides data sources via JNDI resources, and that means every J2EE appserver in the world, plus Tomcat 4.x and 5.x, and I'm sure a bunch of others.
  3. It’s overkill if many calls don't need to use the database for anything. I guess you could be lazy and only get the connection from the pool and bind it to the threadlocal just before you use it for the first time.
  4. You'd still need to go straight to the connection pool, if you needed to use a second connection in the same thread (e.g. for logging. you still want to log even if there is a rollback).
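
The lazy variant from point 3 could look roughly like the sketch below. LazyConnectionHolder is a name I made up, and the "pool" is just a Supplier so that the example runs on its own; in a real application it would be a DataSource or connection pool handing out java.sql.Connection objects. Nothing is checked out until current() is first called on the thread, and release() is meant to be called at the end of every request whether or not the database was touched.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class LazyConnectionDemo {
    public static void main(String[] args) {
        AtomicInteger checkedOut = new AtomicInteger();
        // Stand-in pool: counts checkouts and hands out a dummy closeable "connection".
        LazyConnectionHolder<AutoCloseable> holder = new LazyConnectionHolder<AutoCloseable>(() -> {
            checkedOut.incrementAndGet();
            return () -> System.out.println("connection returned to pool");
        });

        holder.release(); // a request that never touched the database
        System.out.println("checkouts after idle request: " + checkedOut.get());

        holder.current(); // first use on this thread: checks out and binds
        holder.current(); // later uses: the same bound connection
        holder.release();
        System.out.println("checkouts after busy request: " + checkedOut.get());
    }
}

/** Binds a connection to the current thread on first use only (hypothetical helper). */
class LazyConnectionHolder<C extends AutoCloseable> {
    private final ThreadLocal<C> bound = new ThreadLocal<>();
    private final Supplier<C> pool;

    LazyConnectionHolder(Supplier<C> pool) {
        this.pool = pool;
    }

    /** Returns the connection bound to this thread, checking one out if needed. */
    C current() {
        C conn = bound.get();
        if (conn == null) {
            conn = pool.get();
            bound.set(conn);
        }
        return conn;
    }

    /** Unbinds and closes the connection, if this thread ever obtained one. */
    void release() {
        C conn = bound.get();
        bound.remove();
        if (conn != null) {
            try {
                conn.close();
            } catch (Exception e) {
                throw new IllegalStateException("could not return connection", e);
            }
        }
    }
}
```

Point 4 still applies: code that needs a second, independent connection (for example, for logging that must survive a rollback) has to go to the pool directly rather than through the holder.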

May 9, 2006

How DriverManager finds the correct driver

Filed under: database,java — nikhilb020875 @ 6:09 am

In JDBC, the way to get a connection to a database is:

Class.forName("<name of driver>");

DriverManager.getConnection("url") ;

Let’s analyse these two lines.

The first line, Class.forName(), simply finds the named class on the class path and loads it.
Nothing stops you from referring to the driver class directly in your code instead (for example, by importing and instantiating it). The code will work fine. But we don’t do that, because we want to keep the flexibility of changing the driver without recompiling our code.
There is something more. Driver classes have a static block that runs as soon as the class is loaded.
The static block registers an instance of the driver with the DriverManager. Each driver knows which subprotocol it handles. For example, a driver called com.test.XYZDriver might handle the subprotocol "XYZ".

Now about the second statement: DriverManager.getConnection("url");

A database URL (or JDBC URL) is a platform-independent way of addressing a database. A database/JDBC URL is typically of the form jdbc:[subprotocol]:[node]/[databaseName]; everything after the subprotocol is driver specific.

When the static method getConnection(url) is executed, DriverManager offers the URL to each registered driver in turn, and the first driver that recognizes the subprotocol handles it. For a URL "jdbc:XYZ:192.168.23.34/myDB", that is the driver that handles the subprotocol "XYZ". The driver is expected to know how to make sense of the rest of the URL and open the connection. The DriverManager returns that connection.
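
Here is a small, self-contained sketch of that handshake. XyzDriver is a toy driver I wrote for illustration (it is not a real product), and the Connection it returns is an empty proxy that supports nothing except being closed. The calls on DriverManager and the java.sql.Driver interface are the standard JDK ones.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.DriverPropertyInfo;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.Properties;
import java.util.logging.Logger;

class DriverLookupDemo {
    public static void main(String[] args) throws Exception {
        // In real code the driver name is a string read from configuration.
        String driverClassName = XyzDriver.class.getName();
        Class.forName(driverClassName); // loads the class, which runs its static block

        try (Connection conn = DriverManager.getConnection("jdbc:XYZ:192.168.23.34/myDB")) {
            System.out.println("got a connection: " + (conn != null));
        }
        try {
            DriverManager.getConnection("jdbc:unknown:host/db");
        } catch (SQLException e) {
            System.out.println("no registered driver accepted the URL: " + e.getMessage());
        }
    }
}

/** Toy driver for the made-up subprotocol "XYZ". */
class XyzDriver implements Driver {
    static {
        try {
            DriverManager.registerDriver(new XyzDriver()); // self-registration on class load
        } catch (SQLException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    @Override
    public boolean acceptsURL(String url) {
        return url != null && url.startsWith("jdbc:XYZ:");
    }

    @Override
    public Connection connect(String url, Properties info) {
        if (!acceptsURL(url)) {
            return null; // tells DriverManager to try the next registered driver
        }
        // Dummy connection: every method is a no-op that returns null.
        return (Connection) Proxy.newProxyInstance(
                XyzDriver.class.getClassLoader(),
                new Class<?>[] { Connection.class },
                (proxy, method, methodArgs) -> null);
    }

    @Override
    public DriverPropertyInfo[] getPropertyInfo(String url, Properties info) {
        return new DriverPropertyInfo[0];
    }

    @Override
    public int getMajorVersion() { return 1; }

    @Override
    public int getMinorVersion() { return 0; }

    @Override
    public boolean jdbcCompliant() { return false; }

    @Override
    public Logger getParentLogger() throws SQLFeatureNotSupportedException {
        throw new SQLFeatureNotSupportedException("no logging in this sketch");
    }
}
```

Returning null from connect() for a URL the driver does not understand is what lets several drivers coexist in one JVM.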

J2EE patterns: Business Delegate

Filed under: J2EE patterns — nikhilb020875 @ 5:23 am

Here is what I found on one of the mailing lists:

Business Delegate is used to reduce the coupling between clients (usually presentation layer components) and the business services (especially distributed business components). Most importantly, you will need a Business Delegate when you want to hide the complexity of accessing the business services (lookup, etc.). You may also eventually incorporate client-side caching, and you can convert low-level exceptions (RemoteException, lookup exceptions) into application-level exceptions that are more meaningful to the client. In a way, you can think of a Business Delegate as a client-side Session Facade.

Think of Business Delegate when :

  • When presentation-tier components interact directly with business services (usually distributed)
  • It is desirable to minimize coupling between presentation-tier clients and the business service, thus hiding the underlying implementation details of the service, such as lookup and access.
  • When you want to implement caching mechanisms for business service information
  • When it is desirable to reduce the traffic between client and business services.
  • When you want to shield the clients from possible volatility in the implementation of business service API
  • When you want to implement retry mechanisms (sometimes a service may not be accessible on the first attempt; in that case you can keep the retry logic here and tell the client only if the retries fail)
  • When you need to hide remoteness (location transparency from the client's perspective)
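
A rough sketch of what such a delegate might look like is below. AccountService, AccountDelegate and AccountServiceException are hypothetical names, and the delegate receives the service through its constructor to keep the example runnable; a real delegate would do the JNDI/EJB lookup itself. It shows three of the points above: retrying, client-side caching, and translating RemoteException into an application-level exception. The cache here is a plain HashMap, so this version is not thread safe.

```java
import java.rmi.RemoteException;
import java.util.HashMap;
import java.util.Map;

class BusinessDelegateDemo {
    public static void main(String[] args) {
        int[] calls = {0};
        // Fake remote service that fails on the first call only.
        AccountService flaky = id -> {
            if (++calls[0] == 1) {
                throw new RemoteException("network glitch");
            }
            return "owner-of-" + id;
        };
        AccountDelegate delegate = new AccountDelegate(flaky, 3);
        System.out.println(delegate.ownerOf("42")); // retried behind the scenes
        System.out.println(delegate.ownerOf("42")); // answered from the client-side cache
        System.out.println("remote calls made: " + calls[0]);
    }
}

/** Hypothetical remote business interface. */
interface AccountService {
    String ownerOf(String accountId) throws RemoteException;
}

/** Application-level exception the presentation tier can understand. */
class AccountServiceException extends RuntimeException {
    AccountServiceException(String message, Throwable cause) {
        super(message, cause);
    }
}

/** The delegate: the only class the presentation tier talks to. */
class AccountDelegate {
    private final AccountService service;
    private final int maxAttempts;
    private final Map<String, String> cache = new HashMap<>();

    AccountDelegate(AccountService service, int maxAttempts) {
        this.service = service;
        this.maxAttempts = maxAttempts;
    }

    String ownerOf(String accountId) {
        String cached = cache.get(accountId);
        if (cached != null) {
            return cached; // no remote traffic at all
        }
        RemoteException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                String owner = service.ownerOf(accountId);
                cache.put(accountId, owner);
                return owner;
            } catch (RemoteException e) {
                last = e; // remember the failure and try again
            }
        }
        throw new AccountServiceException(
                "Account service unavailable after " + maxAttempts + " attempts", last);
    }
}
```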

SCEA

Filed under: Uncategorized — nikhilb020875 @ 5:23 am

I created this blog to write about my progress in preparing for the SCEA examination.

Apart from that, I’d be putting up anything that is of interest to me as an architect.

April 19, 2006

Hello world!

Filed under: Uncategorized — nikhilb020875 @ 9:25 am

Welcome to WordPress.com. This is your first post. Edit or delete it and start blogging!
