Java Profiles : Allen Holub

Java Coffee Break

Java Profiles

`Allen Holub`

Allen Holub is the author of "Taming Java Threads", a book recently published by Apress. In this exclusive interview, he talks to us about the tricky problems of multi-threading in Java, and about the programming language in general.

Q: What do you see as the biggest change affecting the Java community in the last year?

A: I don't see that there has been a *big* change in the last year.

Certainly, the introduction and continued improvement of the HotSpot VM is an enormous plus. The VM in 1.3 really screams, and that sort of performance is essential on the server side. Similarly, the official incorporation of RMI over IIOP into the JDK is essential for RMI to be truly useful---the older HTTP-based transport protocol was just too slow, though it's nice that it's there when you need to tunnel through a firewall.

The 1.3 JDK also adds some other truly useful, but less important, stuff: The java.awt.Robot finally makes it possible to test GUIs in an automated fashion, and the Runtime.addShutdownHook() method is essential for implementing the Singleton design pattern.

None of this is exactly revolutionary, which I see as a good thing. Java finally seems to be settling down enough that I don't have to spend my life continually chasing after the new feature of the week. Frankly, I think that the language has enough features (and this has been the case for some time). Microsoft adds features as a strategy to prevent others from cloning the operating system, but Java doesn't need to protect itself from clones in this way. My heartfelt wish is that Sun proclaim a new-feature moratorium and concentrate on fixing the myriad bugs that permeate all the packages (and the compiler and VM, for that matter).
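Editor's sketch (not from the interview): the remark above about Runtime.addShutdownHook() and singletons can be illustrated roughly as follows. The LogFile class and its in-memory buffer are hypothetical stand-ins; the only library call assumed is Runtime.getRuntime().addShutdownHook(Thread), which the JDK has provided since 1.3. The method reference requires Java 8 or later.

```java
// Hypothetical singleton that owns a resource needing cleanup at VM exit.
final class LogFile {
    private static final LogFile INSTANCE = new LogFile();

    private final StringBuilder buffer = new StringBuilder();
    private boolean closed = false;

    private LogFile() {
        // Register the cleanup once, when the single instance is created.
        // The hook thread is not started until the VM begins shutting down.
        Runtime.getRuntime().addShutdownHook(new Thread(this::close, "logfile-cleanup"));
    }

    static LogFile getInstance() { return INSTANCE; }

    synchronized void write(String line) {
        if (closed) throw new IllegalStateException("log already closed");
        buffer.append(line).append('\n');
    }

    // Runs on the hook thread during an orderly VM shutdown.
    synchronized void close() {
        if (closed) return;
        closed = true;
        System.out.print(buffer);   // stand-in for flushing to disk
    }

    synchronized boolean isClosed() { return closed; }

    public static void main(String[] args) {
        LogFile.getInstance().write("hello");
        // No explicit close(): the hook flushes when main returns and the VM exits.
    }
}
```

The point of the hook is that nobody "owns" a singleton, so no caller is in an obvious position to release what it holds; the VM itself triggers the cleanup.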

Q: What do you see as the main advantages of Java, compared to other languages like C++?

A: Ease of programming. I programmed in C++ for eight years, and I never looked back once I started programming in Java. C++ is hideously complicated. I once witnessed a roundtable discussion where most of the big C++ gurus were asked (I'm paraphrasing):

If 100% is knowing everything there is to know about C++, where would you place yourself?

If my memory serves, the highest number came from Stroustrup, who put himself at 70%. I think most C++ programmers are closer to 7%. I'm convinced that one of the reasons that C++ programs are so buggy is that the language is so complex that it's just not possible to write a correct program in it.

It is possible, on the other hand, for a single person to understand all of Java, including the packages. Java does everything C++ does---including multiple inheritance---but is a lot easier to program. (Parameterized types are missing, but that's coming.) The enormous wealth of libraries that accompany the language---both in the java.* packages and provided by third parties---is also an enormous advantage. You can just get more done, faster.

This is not to say that I think that Java is perfect---it's just better than the realistic alternatives. (I say "realistic" because I like Eiffel a lot, but I don't expect to be programming in it any time soon.) Java has enormous problems with its threading model, as is discussed in "Taming Java Threads." A lot of the libraries are surprisingly bad. (Swing's text controls are an abomination, for example, and EJB is way too complicated for what it does.) Moreover, the language is still hideously buggy in some places---it's inexcusable that printing is still screwed up, and that you can't interrupt() out of a blocking I/O operation.

On the plus side, the "community process" does leave the way open for the language to evolve. It's been a persistent joke that a programming language becomes obsolete as soon as the official specification is released. The reason for this phenomenon, I think, is that a programming language can't be static---it has to evolve to meet the needs of the programmers and the "business" requirements of the users. The fact that Java is not fixed, and that there's even an official evolutionary mechanism, is likely to make Java much more long lived than other languages.

Q: What type of applications use multi-threading? What type of programmer needs to understand multi-threaded programming?

A: Virtually all applications use multithreading (or should). On the client side, Swing/AWT uses a single thread both to handle all OS-level events and to dispatch notifications to the listeners. This architecture means that the user interface is locked---completely unresponsive---while your program is in the process of servicing a UI event like a button press or a menu-item selection. If you want a cancel button to work, for example, you *must* use a thread to implement the operation that you intend to cancel. Most books on Swing conveniently ignore this problem.
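Editor's sketch (not from the interview): a minimal illustration of the cancel-button point above, with no actual Swing code so that it runs anywhere. The CancellableTask class and its method names are hypothetical; in a real UI, start() would be called from the listener that launches the operation and cancel() from the cancel button's listener, so both return immediately and the event thread stays free.

```java
// Hypothetical long-running operation that a button handler could start and cancel.
final class CancellableTask implements Runnable {
    private final Thread worker = new Thread(this, "worker");
    private volatile boolean finished = false;
    private volatile boolean cancelled = false;

    // Called from the event handler: returns at once, so the UI thread is not blocked.
    void start() { worker.start(); }

    // Called from the cancel button's handler.
    void cancel() { worker.interrupt(); }

    @Override public void run() {
        try {
            for (int step = 0; step < 100; step++) {
                Thread.sleep(50);      // stand-in for one slice of real work
            }
            finished = true;
        } catch (InterruptedException e) {
            cancelled = true;          // interrupt() woke us up: stop early
        }
    }

    boolean wasCancelled() { return cancelled; }
    boolean hasFinished() { return finished; }

    // Waits for the worker to end; returns false if the wait itself was interrupted.
    boolean awaitEnd() {
        try {
            worker.join();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        CancellableTask task = new CancellableTask();
        task.start();
        task.cancel();
        task.awaitEnd();
        System.out.println("cancelled: " + task.wasCancelled());
    }
}
```

Had the loop run directly inside the listener, the cancel handler could never have been dispatched until the loop ended, which is exactly the lock-up described above.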

On the server side, there are typically one or more threads per client connection, with many other threads on the scene doing things like talking to databases. A Servlet, in fact, is something of a worst-case synchronization scenario: there is only one instance of the Servlet, but each client connection is handled on its own thread, and all client-connection threads simultaneously talk to the same Servlet object.
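Editor's sketch (not from the interview): the one-instance, many-threads situation described above, reduced to plain Java. HitCounter is a hypothetical stand-in for a servlet and does not use the Servlet API; service() plays the role of the per-request entry point that every client-connection thread calls on the same object.

```java
// Hypothetical stand-in for a servlet: one shared instance, many caller threads.
final class HitCounter {
    private int hits = 0;

    // synchronized makes the read-increment-write sequence atomic across threads.
    // Without it, concurrent callers can overwrite each other's updates.
    synchronized void service() { hits++; }

    synchronized int hits() { return hits; }

    // Drives a single shared instance from several "client connection" threads
    // and returns the final count.
    static int simulate(int clients, int requestsPerClient) {
        HitCounter shared = new HitCounter();
        Thread[] threads = new Thread[clients];
        for (int i = 0; i < clients; i++) {
            threads[i] = new Thread(() -> {
                for (int r = 0; r < requestsPerClient; r++) shared.service();
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return shared.hits();
    }

    public static void main(String[] args) {
        System.out.println(simulate(8, 10_000));
    }
}
```

Any field of the shared object that requests can modify needs this kind of protection; local variables inside service() do not, because each thread has its own stack.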

To make matters worse, the behavior of threaded systems---particularly in a multiprocessor environment---is counterintuitive. I'm writing an article on this subject for JavaWorld right now, but the problem has to do with the way that the hardware works. In effect, virtually none of the clever tricks that people try to use to get around the overhead of synchronization actually work. (The "double-checked locking" idiom for singleton access is a case in point---it just doesn't work. There's a good article that explains why at http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html.)
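Editor's sketch (not from the interview): two ways to get a lazily created singleton without resorting to the double-checked locking idiom mentioned above. The class names are hypothetical. The first simply pays for synchronization on every access; the second relies on the JVM's guarantee that a class is initialized at most once, on first use, under the VM's own locking.

```java
// Small driver; the two singleton classes follow.
final class SingletonDemo {
    public static void main(String[] args) {
        System.out.println(SynchronizedSingleton.getInstance() == SynchronizedSingleton.getInstance());
        System.out.println(HolderSingleton.getInstance() == HolderSingleton.getInstance());
    }
}

// Option 1: synchronize every access. Slower than the "clever" idiom, but correct.
final class SynchronizedSingleton {
    private static SynchronizedSingleton instance;   // guarded by the class lock

    private SynchronizedSingleton() { }

    static synchronized SynchronizedSingleton getInstance() {
        if (instance == null) instance = new SynchronizedSingleton();
        return instance;
    }
}

// Option 2: let class initialization do the locking. Holder is not initialized
// until getInstance() first touches it, so creation is still lazy.
final class HolderSingleton {
    private HolderSingleton() { }

    private static final class Holder {
        static final HolderSingleton INSTANCE = new HolderSingleton();
    }

    static HolderSingleton getInstance() { return Holder.INSTANCE; }
}
```

Neither version tries to skip the lock by testing the field outside a synchronized block, which is the step that makes double-checked locking unsafe.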

The upshot of all this is that it's simply not possible to write ANY production-quality Java without understanding multi-threaded programming. All Java programmers need to know this stuff.

Q: What are the differences between multi-process programming, which our Unix readers who remember the days of fork() will appreciate, and multi-threading?

A: I'm hoping that you're joking about Unix readers using multi-process programming. All Unix systems support threading at this juncture---in fact the Solaris and POSIX threading models are (in my completely unbiased opinion :-) vastly better than the Microsoft models, for example. Any Unix program that's still implementing concurrency using a fork() is hopelessly obsolete.

A process is effectively an address space---and all the overhead and data structures, such as virtual-memory tables, that are required to implement the address space. In Java you can think of the VM and a process as rough equivalents. (I'm simplifying: Since the VM is implemented as a DLL or shared library, it's possible for multiple instances of the VM to be running in a single process, but that's rare.)

A thread is a thread of execution---a sequence of memory locations that contain executable instructions, visited by the CPU in some order defined by the instructions themselves. The thread data structure keeps track only of execution-related things, like the register set and runtime stack.

Swapping one process with another is a big deal, since you have to mess with large data structures---such as virtual-memory tables---and may even have to swap memory to disk. Swapping a thread is extremely efficient: push a few registers onto one thread's runtime stack, then pop new values from another thread's runtime stack. In C, you can implement a simple cooperative threading model using setjmp() and longjmp() calls.

Because of the overhead issues, threads are better than multiple processes in all but one situation: Multiple processes give you multiple address spaces, so memory-intensive operations might have to occur in multiple processes.

Q: From a performance perspective, what's the practical limit on the number of threads? How does one determine whether you're using too many, or too few?

A: The practical limit is set by the OS---one of many reasons why Java's threading model can't be architecture neutral. NT (the OS itself) tends to get unstable if you have more than a couple hundred threads running. (I'm not sure about Win2K; I haven't tried to crash it yet.) Solaris is happy with many thousands of threads running.

That being said, it never makes sense for more threads to be *running* at a given moment than you have processors. That is, an operation performed by two threads sharing a single CPU will run more slowly than the same operation rewritten to be single threaded because of the time wasted doing thread-context swaps.

Threads don't make sense, from a performance point of view, unless they can run on their own CPUs. This is not to say that you don't want more than 16 threads on a 16-CPU box, but it does mean that you want only 16 of those threads to be running at a given moment. The others should be blocked (suspended) waiting for something to do. For example, they could be waiting for a client to connect to a socket, or waiting for a DMA-based disk-I/O operation to complete.
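Editor's sketch (not from the interview): one way to keep the number of running threads in line with the number of processors, as argued above, is to size a worker pool from Runtime.availableProcessors(). The CpuSizedPool class and the summing workload are hypothetical, and the java.util.concurrent classes used here arrived in Java 5, well after this interview. The sketch assumes n is small enough that the chunk arithmetic does not overflow an int.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class CpuSizedPool {
    // Sums 1..n by splitting the range into roughly one chunk per processor.
    static long parallelSum(int n) {
        int cpus = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cpus);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            int chunk = (n + cpus - 1) / cpus;           // ceiling division
            for (int start = 1; start <= n; start += chunk) {
                final int lo = start;
                final int hi = Math.min(n, start + chunk - 1);
                parts.add(pool.submit(() -> {
                    long sum = 0;
                    for (int i = lo; i <= hi; i++) sum += i;
                    return sum;
                }));
            }
            long total = 0;
            for (Future<Long> part : parts) total += part.get();   // blocks until that chunk is done
            return total;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(parallelSum(1000));
    }
}
```

This sizing suits CPU-bound work only. Threads that spend most of their time blocked on sockets or disk, as in the examples above, do not compete for processors while they wait, so a pool serving them can reasonably be larger.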

On the other hand, it's sometimes worthwhile to use multiple threads as an organizational tool, provided that you have sufficient mastery of multithreaded-programming techniques that you don't cause problems simply by introducing threading into the mix. It's not possible to program naively in a multithreaded environment.

Q: While your code examples are excellent, one of the restrictions on their use is that their source isn't redistributed, which has caused some confusion amongst readers. Can your code be used in open-source projects?

A: One concern that I do have is that I want the source code distributed from my web site rather than somewhere else---that way I can keep it up to date, fix bugs, and generally monitor the state of the code. There's no problem using the code in open-source software, but I'd prefer for my portion of the source to be downloaded from my web site rather than being bundled into a CD-ROM or .zip distribution or equivalent (and I'd prefer changes to the code to be run through me so that I can keep the implementation coherent). Now that web distribution is so commonplace, this desire on my part doesn't seem so awful to me.

Q: Looking to the future, where do you see Java heading? Is there a particularly dominant technology (e.g. J2EE, CORBA, Jini) that you feel will change the way we look at Java?

A: That's quite a question. I'm hoping that Java will head more in the direction of OO systems than away. A lot of Java is very procedural in structure. The threading model is a case in point; it's not in the least bit OO. (I talk about this issue quite a bit in "Taming Java Threads.") EJB is also pretty miserable as it stands now. The separation of the session and the entity bean is very procedural, not to mention the fact that people don't leave the entity beans on the server as was, I believe, the intent of the designers, but ship the things around using RMI calls. A more interesting architecture is a Java stored procedure running in Oracle 8i, using RMI to talk to the outside world so that real objects are shipped through the system, and encapsulating all the SQL in the database where it belongs. In other words, I prefer for the technology to be hidden.

I suppose what I just did in the last paragraph was to talk myself into believing that one of the technologies that you mentioned is indeed more important than the others---Jini. I agree with Don Norman (who wrote "The Design of Everyday Things" and "The Invisible Computer," both good books), who believes that computers as we know them will gradually disappear in favor of smart appliances, and Jini is the enabling technology for a smart appliance.

In the original telephone systems, the onus of getting connected to the person you wanted to talk to was entirely on the end user of the system. All you had were party lines. Eventually, someone came up with the idea of a central switchboard, which is basically where we are now with server-based architectures. One of the objections to widespread use of the telephone was that it was impossible because everybody would have to be an operator. That's exactly what happened, though: when we dial a phone number, we are acting as an operator. The reason I bring this up is that there will come a time, I think, when everybody will be a programmer. Not in the sense of writing programs in a language like Java, but in the sense of being able to communicate to the machine what you want it to do. This communication, however, can be done effectively only in the context of a specialized UI in the sense of a piece of gear specialized for performing a single task. These pieces of gear will, of course, need to talk to each other.

I'll give you an example:

Probably the most usable complex computer system that I use daily is my car. In fact, the UI is so good that I'm not even aware that I'm using a computer. But I am. With the exception of the steering and the emergency backup system on the brakes, there's not a single mechanical control in the car that is physically connected to the thing that it controls. The controls literally comprise a mechanical user interface to a computer, which is actually running the automobile. In fact, the car isn't even a single computer. It's literally a network of computers distributed throughout the chassis, actively communicating with each other to make the car move. Ideally, if something goes wrong with the car, it should diagnose the problem and then call a service center, which would dispatch a truck with the mechanic, tools, and parts required to fix the problem. It should be able to do this anywhere in the world.

Now consider the notion of word processing. I imagine that eventually I'll do my writing on a piece of intelligent "paper" that I can fold up and put in my pocket. This "paper" will let me correct what I write on it, though (perhaps using pen-style idioms, perhaps with some other metaphor), and will internally store many virtual pages. For the paper to be useful, though, it needs to be able to plug into a network of similarly specialized devices and talk to them. I might want to send a piece of virtual paper with a shopping list on it to a supermarket, for example, or print onto physical paper. My point is that I won't be using a generic computer for this purpose, but rather a specialized device with a UI optimized for human use. I must train (i.e. "program") the device to recognize my personal handwriting style or shorthand, ideally simply by using it.

Jini, of course, takes care of only the inter-device communication part of this puzzle, but that's a pretty big part. Eventually, I imagine, similar enabling technologies will emerge to take care of the other parts of the puzzle. Java, because of its platform independence and vendor neutrality, seems like a good vehicle for this technology.

Q: Well, Allen, we appreciate that look into the world of Java and multi-threading. For readers who want to know more, Allen's book is called "Taming Java Threads", and is published by Apress.

Copyright 1998, 1999, 2000 David Reilly
Last updated: Monday, June 05, 2006