One piece of big news in the SAP® Web Application Server 6.40 is that J2EE version 1.3 is fully supported. This means that developers can use all of the fancy options available in Enterprise JavaBeans. EJBs are now about five years old, so they are hardly brand new. From an SAP perspective, however, now that SAP Web AS has a full ABAP stack alongside the full J2EE engine, EJBs offer some new possibilities, and developers will have to choose from the available EJB options when developing new applications. This article is about Entity Beans. Let me put this very clearly: I don't like them, and I will tell you why.
Sinking the Ship Before Launch?
It may seem odd that with SAP making a huge push into J2EE that I write an article complaining about one aspect of the technology. But, from my perspective, this article shows what the SAP Developer Network is all about; and that is looking at what developers need to know about the tools at their disposal, and then telling it like it is.
Especially for programmers who have worked with ABAP before, Java and J2EE may seem to be a step down from the old R/3. The reason for this is that ABAP was built as a 4th Generation Language (4GL), while Java clearly is a 3rd Generation Language (3GL). Remember, this means that ABAP is written as a special purpose language and therefore everything needed for business applications comes built-in in a very handy way.
Historically, 3GLs usually come with better language definitions that make it easier to implement the language itself -- a fact that only seems to bother the little group of people who build compilers and interpreters, but has large consequences in the long run. Nevertheless, a 3GL has to deliver all the functionality that may be built into a 4GL through additional libraries or through generators in its development environment.
ABAP programmers moving to Java still miss a lot of things that came free with their former environment.
The Java community has not always been delighted with the details of Enterprise Java Beans, especially when they were not attracted to the advantages that EJB delivers. Of course these advantages have to be paid for with some overhead, and small applications done with virtually no tools do not benefit from these advantages.
However, those brave programmers who used CORBA and such were delighted and understood immediately what this new technology could do for them. If you ever touched this area, or even worse used DCE, you know what I'm talking about. The rest of you may imagine the dark ages when a book had to be handwritten to copy it. No reuse of anything, nothing done by machinery, and you always start from scratch!
EJB ushered in the forget-about-stubs age and introduced reconfiguration without recompiling (yourself). This was a step forward!
Issues with EJB
As soon as you invent something really great, there are people who start arguing about the invention. In fact, one of the surest signs that you have something great is that people start arguing about it and come up with this-will-never-work arguments. It happened to trains, cars, and planes, to steel ships, and to the discovery of America.
Well, some of these arguments usually do have content and are worth watching. Imagine if the American continent was not on the westbound way to Asia from Europe. This would have meant that Columbus and his men would have starved to death, because the distance would have been too large to get there with his ships.
Anyway, EJBs were said to be too complex to use when they were first invented, and at that time they were just half as complex as they are today. Entity Beans, especially, have been considered useless, as their advantages sometimes get lost due to the special ways they are used. Let us go into this in detail.
EJBs are often reported to be too slow for the job. Well, this is nothing new to those of us who have seen software evolve during the last 20 years. If we're honest, we have to admit that a new generation of software has seldom, if ever, been faster than the one before. Nevertheless, it has been quite some time since I saw the last PC running under DOS. I don't even know if it would run on the latest generation of PCs.
New generation software usually means inserting new levels of software that run on top of everything that was there before. Only a miracle could make the performance better than it had been before, right? This means we are always rooting for faster hardware.
Additionally, we have been adding more and more code generators to development. In general, this means the code produced is not very smart or optimized. It removes a load of work from highly paid workforces, who still sometimes tragically insist on using the copy-and-paste function of their preferred editor over and over again to repeat code inefficiently, the way the code generators sometimes do. There are ways this code could be improved, but the cost in time compared to generating is immense, of course.
We are not even factoring in that code generated once, and bulletproof, also saves the time otherwise spent debugging the mistakes that always creep into your code against your will.
The n+1 Problem
When building up a standard like EJB, you have to standardize the processes. This sometimes has side effects, which will not serve you well in every possible situation.
If you want to use a bean-managed Entity Bean you have to use a finder method by definition. Before you can access your data you have to get all primary keys of the data you are looking for. Once you have those you can go through your data and get the records one by one. This actually means you have to go n+1 times to your database to get the data for n records.
Compare this to direct access through SQL, where you would issue one select statement and then fetch all records one by one. The fetches are not real select statements: the database keeps the result and just delivers records to the query. For a BMP bean, the search for a single record is done an additional n times, because the bean cannot keep the result of the first query, which returns only the primary keys.
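The difference can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: FakeDatabase is a hypothetical in-memory stand-in that does nothing but count round trips, and the method names are invented for this example. It is not EJB or JDBC code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a database; it only counts round trips.
class FakeDatabase {
    private final Map<Integer, String> rows = new LinkedHashMap<>();
    int roundTrips = 0;

    FakeDatabase(int n) {
        for (int i = 1; i <= n; i++) {
            rows.put(i, "record-" + i);
        }
    }

    // Like a BMP finder method: one query that returns primary keys only.
    List<Integer> findAllKeys() {
        roundTrips++;
        return new ArrayList<>(rows.keySet());
    }

    // Like loading one bean: one query per primary key.
    String loadByKey(int key) {
        roundTrips++;
        return rows.get(key);
    }

    // Like a plain select: one query, the records are then simply fetched.
    List<String> selectAll() {
        roundTrips++;
        return new ArrayList<>(rows.values());
    }
}

class NPlusOneDemo {
    // Finder first, then one load per key: n + 1 round trips.
    static List<String> readLikeBmpBean(FakeDatabase db) {
        List<String> result = new ArrayList<>();
        for (int key : db.findAllKeys()) {
            result.add(db.loadByKey(key));
        }
        return result;
    }

    // One select statement: a single round trip.
    static List<String> readWithOneSelect(FakeDatabase db) {
        return db.selectAll();
    }

    public static void main(String[] args) {
        FakeDatabase a = new FakeDatabase(100);
        readLikeBmpBean(a);
        FakeDatabase b = new FakeDatabase(100);
        readWithOneSelect(b);
        System.out.println("finder + loads: " + a.roundTrips + " round trips");
        System.out.println("single select:  " + b.roundTrips + " round trip");
    }
}
```

Both methods return the same records; only the number of trips to the database differs, and that number grows with every record in the list.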
Basically this makes beans unable to handle lists. This is well known; in the Java community it is regularly recommended to not use beans for this purpose. In fact, it is a general problem of OOP that handling lists is not done very well by that concept. Surprisingly, since programmers are often not familiar with database concepts, it is an easy trap to step into.
Compared to just writing a simple program, EJBs are a little bit complex, without a doubt. They not only carry an implementation class, but additionally up to four interfaces, at least one XML descriptor file, and maybe a primary key class. There are five different types of such beans (to further confuse things Sun invented another type, the JavaBean, that has nothing to do with EJBs) and an application needs three different kinds of nested archive files that carry the contents of such an application.
Even the lucky people who already understand this pile of technology will not doubt that it is quite complex. Nevertheless, it is a useful and meaningful system that clearly makes sense.
Some of these parts only make sense if you can remember how remote computing was done in the early years. For CORBA or DCE you had to write so-called IDL files that described the methods or functions (remember, back then OOP was mostly theory) that you wanted to call remotely, and send them through a generator. From this you got the so-called "stubs," which enabled the program to call a remote function locally, hand over variables, get them "marshaled" (meaning packed up), send them over the line, "un-marshal" (unpack) them, hand them over to the remote function, process them, and then go through the whole process backwards to deliver the result to the calling entity.
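To make the round trip concrete, here is a hand-written sketch of what such generated code does. Everything in it is an assumption made for illustration: the Calculator interface plays the role of the IDL description, CalculatorStub and CalculatorSkeleton are invented names, and a direct method call stands in for the network. Real generated stubs are considerably more involved.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// The contract both sides agree on; an IDL file played this role.
interface Calculator {
    int add(int a, int b);
}

// The real implementation, living on the "remote" side.
class CalculatorImpl implements Calculator {
    public int add(int a, int b) {
        return a + b;
    }
}

// Remote side: un-marshals the request, calls the implementation,
// marshals the result.
class CalculatorSkeleton {
    private final Calculator target = new CalculatorImpl();

    byte[] handle(byte[] request) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(request));
            int a = in.readInt();
            int b = in.readInt();
            int result = target.add(a, b);
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(result);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// Caller side: looks like a local object, but only packs and unpacks bytes.
class CalculatorStub implements Calculator {
    private final CalculatorSkeleton wire; // stands in for the network

    CalculatorStub(CalculatorSkeleton wire) {
        this.wire = wire;
    }

    public int add(int a, int b) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(a); // marshal the arguments
            out.writeInt(b);
            out.flush();
            byte[] reply = wire.handle(bytes.toByteArray()); // "send over the line"
            return new DataInputStream(new ByteArrayInputStream(reply)).readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class StubDemo {
    public static void main(String[] args) {
        Calculator calc = new CalculatorStub(new CalculatorSkeleton());
        System.out.println(calc.add(2, 3));
    }
}
```

The caller only ever sees the Calculator interface; whether the object behind it is local or a stub is invisible in the calling code.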
In case you are an ABAP programmer and wonder why you never had to do any of these steps to call a program remotely, let me assure you that exactly this happens in your program too, except that ABAP takes all those duties out of your hands.
And so it is today with J2EE. If you look at a remote interface - one of the mandatory descriptions an EJB needs - this is nothing more than a description of a class that delivers exactly the same information as the IDL description. But you never have to look at the resulting, generated classes (hopefully).
Nevertheless, compared to a 4GL, this is still somewhat complex. And the complexity does not end here. EJB descriptor files contain lots of information, which is what makes them so valuable. One of the requirements is the EJB reference that has to be defined for every bean that wants to use another one. This is not a problem for today's projects, where numbers are low. But recalling some mathematics that everyone who has ever had contact with computer science should know is alarming: the number of possible references in such descriptors grows on the order of n², or to be precise, it equals n² - n.
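A short worked example shows how quickly this grows. The reading assumed here is that each of n beans may reference each of the other n - 1 beans, and that a reference from A to B is counted separately from one from B to A:

```latex
R(n) = n(n-1) = n^2 - n, \qquad
R(10) = 90, \quad R(100) = 9\,900, \quad R(1000) = 999\,000
```

These are upper bounds on the references that may have to be declared, not a prediction for any particular project.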
Projects as large as those going on at SAP could suffer from this and it's a good idea to give this issue some thought. But this is not the place where we want to discuss solutions.
Advantages that Don't Bite
As you can read in every entry-level advisory paper, Entity Beans should always be called via a facade session bean. Consequently, never attempt to call Entity Beans remotely, bypassing the facade. This essentially means that the whole remote computing paradigm does not apply to Entity Beans.
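The facade idea can be sketched with plain Java classes. This is a simplified illustration, not EJB code: CustomerEntity, CustomerData, and CustomerFacade are hypothetical names, the sample record is made up, and a map stands in for the persistent store.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Stand-in for an entity bean: fine-grained, meant for local calls only.
class CustomerEntity {
    private final String name;
    private final String city;

    CustomerEntity(String name, String city) {
        this.name = name;
        this.city = city;
    }

    String getName() {
        return name;
    }

    String getCity() {
        return city;
    }
}

// Plain value object that crosses the remote boundary in one piece.
class CustomerData implements Serializable {
    final String name;
    final String city;

    CustomerData(String name, String city) {
        this.name = name;
        this.city = city;
    }
}

// Stand-in for the session facade: the only class remote clients talk to.
class CustomerFacade {
    private final Map<Integer, CustomerEntity> store = new HashMap<>();

    CustomerFacade() {
        // Made-up sample record for the illustration.
        store.put(1, new CustomerEntity("Example Corp", "Walldorf"));
    }

    // One coarse-grained call instead of one remote call per getter.
    CustomerData getCustomer(int id) {
        CustomerEntity entity = store.get(id);
        if (entity == null) {
            throw new IllegalArgumentException("no customer with id " + id);
        }
        return new CustomerData(entity.getName(), entity.getCity());
    }
}

class FacadeDemo {
    public static void main(String[] args) {
        CustomerData data = new CustomerFacade().getCustomer(1);
        System.out.println(data.name + ", " + data.city);
    }
}
```

The client makes one call and receives all fields at once; the entity itself is only ever touched locally, inside the facade.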
Another advantage of EJB is that you can write programs without worrying about security to the slightest extent. Access rights can again be defined via descriptors, either for specific methods or for tables and their columns. If you want to decide about access rights based on the contents of a specific row, however, you have to do this in code.
This is exactly the standard procedure needed in many situations according to SAP development experience. In large applications a user might not have powerful rights, while his actions result in changes that need such rights. In such cases the application needs to decide not on user rights, but actually on process consequence. For example, if an action in Resource Planning software results in depleting a shelf, this should result in replenishing it, resulting in order actions that might exhaust the starting user's rights (I'm not much of an expert for Resource Planning software, so please forgive me in case this example is too naive).
Seen this way, Entity Beans have lost the very advantages we would need in order to be convinced that it makes sense to use them. We still have the other advantages, such as component data access and object-relational mapping options. But many programmers are still not comfortable with OOP anyway, so they are not very interested in using them.
We are left solely with session beans and have to access our data directly from them. This is not that big of a problem, as we still have the advantages of session beans, where remote computing does matter. For data access from session beans, JDBC is available anyhow, and so is SQLJ for SAP Web Application Server, a standard most commonly described as embedded SQL for Java.
Using it means just writing your SQL statements with a leading pound sign inside the code. This code is precompiled, and during that step both the syntax of your SQL query and the table and column names it uses are checked, the latter against a data dictionary. Afterwards, you get your code back with all necessary JDBC parts generated into it.
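As a rough illustration, SQLJ source looks like the following. This is a sketch under assumptions: the table orders and its columns are hypothetical, a default connection context is assumed to be set up elsewhere, and the details of how a connection context is supplied differ between SQLJ implementations. The file is input for the SQLJ translator, not for the Java compiler directly.

```java
// Orders.sqlj: input for the SQLJ translator, not for javac directly.
// Table and column names are hypothetical.
import java.sql.SQLException;

public class Orders {
    public static String customerOf(int orderId) throws SQLException {
        String customer = null;
        // Host variables from the surrounding Java code are prefixed with a colon.
        #sql { SELECT customer INTO :customer FROM orders WHERE id = :orderId };
        return customer;
    }
}
```

A misspelled column name in such a statement can be reported at translation time rather than after deployment.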
Developers haunted by long deployment cycles that only end in stupid errors caused by typing mistakes would surely appreciate SQLJ.
But does this mean we have to continue to do database access in the same manner as we did for the last 15 years? Will we have to maintain transactions on our own forever?
Java Data Objects
One of the promising technologies in the Java world is the Java Data Objects Standard, which is one of the standards for Java but not part of J2EE.
JDO was designed as a technology for persisting Java objects. This seems to be very much what we need. In fact, JDO and Entity Beans are able to work together.
EJB Future Releases?
Will future releases correct the behavior of EJBs? I really doubt this. The people who did the specification are experienced programmers and technicians who certainly did not miss the facts that sometimes turn EJBs into performance eaters.
Session Bean Superiority
It seems that we have to do it all again from scratch. Of course, the complete development of database manipulation can be done in session beans too. But this means that we fall far short of the large step we had hoped to take with such new technologies. We are at the same point with database programming as we were at the beginning of the Nineties, and not a single step beyond. Let's just look at the disadvantages.
What's Left from Entity Beans?
Let's not forget about the one big benefit we get from container-managed Entity Beans. Once we do have a case where we want to go to the trouble of using them, we are compensated because we do not have to take care of any database-related actions anymore. Therefore, people using them do not need to be aware of all the background that direct database access needs. Anybody who thinks this is "impossible" should tell me how his computer works - deep down. For those who really think they can do this, please explain how your car's motor, transmission, and clutch work.
Real progress can only be made by abstracting complete functionality into shells that hide its complexity. The only question here is: how can we stop such programmers from using container-managed EJBs in the wrong way? But that is another story.
So what can we do to reduce complexity that is still inherent in EJB projects? To fight complexity there is usually the old rule of "divide and conquer." It usually means cutting complex problems into pieces until problem parts are no longer complex. In our case, the nice part is that once we divide the complexity about using EJBs we find a lot of things that can actually be done by a machine called a computer.
It's possible to use tools for many of the duties that must be done when writing EJBs. Tools today can handle, for example, all the interfaces that initially had to be written by hand, because they are related to the implementation class in a very logical manner.
It's the same thing for persistent objects: the more intelligent persistence layers get, the more performance we gain. But writing intelligent persistence layers requires some real-life experience - something that helped SAP to provide such nice tools for ABAP earlier.
What Else Comes Up?
For large environments it is clear that we have to graphically manage the beans' references. To some extent this is possible today. There are three ideas about what kinds of tools can be written:
While number 1 is already on the way, number 2 would be nice for business programmers who do not like to build their environment in the EJB world all over again. Number 3 is a fascinating idea to me, because everybody tells me it's impossible and will never work. 4GL is dead? I think this was what I heard about "interpreted languages" in the early Nineties, too.