Friday, November 11, 2011

Best Practices to improve Performance in JSP

This topic illustrates the performance improvement best practices in JSP with the following sections:

Overview of JSP

When a user requests a JSP page for the first time, the JSP engine converts the page into a servlet Java source file and compiles it into a servlet class file; this is the translation phase. From then on, the page behaves like a pure servlet for all requests; this is the execution (request processing) phase. The lifecycle method names differ, however: a servlet has init(), service() and destroy(), whereas a JSP has jspInit(), _jspService() and jspDestroy(). JSP has some advantages over servlets, chiefly a good separation between presentation (HTML) and business logic. See Overview of Servlets for more details. In this section I say "JSP's servlet" rather than "servlet" to distinguish the servlet generated from a JSP from a hand-written one.

Note: This section assumes that the reader has some basic knowledge of JSP.

Use jspInit() method as cache

By default, a JSP engine loads a JSP's servlet in a multithreaded environment; this corresponds to the default value of the isThreadSafe attribute of the page directive:
<%@ page isThreadSafe="true" %>
In this environment, a JSP's jspInit() method is called only once in its lifetime.
You can take advantage of this to improve performance: use jspInit() to cache static data.
A JSP generally produces static data as well as dynamic data.
Programmers often make the mistake of generating both on every request. Dynamic data must be generated per request by its nature, but there is no need to regenerate static data for every request.
For example, you would normally write a JSP page like this:
<%
// create static data and send it to the client
out.print("<html>");
out.print("<head><title>Hello world</title></head>");
out.print("<body>");
// create the dynamic data and send it to the client here
// create static data again and send it to the client
out.print("</body>");
out.print("</html>");
%>

Here you are generating both static data and dynamic data from the _jspService() method. Instead, you can do this:
<%!
char[] header;
char[] navbar;
char[] footer;
char[] otherStaticData;

public void jspInit() {
    // create all the static data here, once
    StringBuffer sb = new StringBuffer(128); // presize the buffer to avoid regrowth
    sb.append("<html>");
    sb.append("<head><title>Hello world</title></head>");
    sb.append("<body>");
    header = sb.toString().toCharArray();
    // placeholder: build the navigation bar markup here if its data is static
    navbar = "".toCharArray();
    // do the same for the footer if its data is static
    footer = "</body></html>".toCharArray();
} // end jspInit() method
%>
<%
out.print(header);
out.print(navbar);
// write dynamic data here
out.print(footer);
%>

Here the static data is created in the jspInit() method, which means it is created only once in the lifetime of the JSP, and it is then used in the _jspService() method to send the data to the client. When you send a large amount of static data, this technique can give a considerable increase in performance.

Optimization techniques in _jspService() method

When you use the implicit out object to send data to the client from a JSP, the JSP engine/container creates a JspWriter object and makes it available in the _jspService() method. You don't need to write the _jspService() method yourself; the JSP engine generates it for you. You can improve performance by using the following techniques.

  1. Use StringBuffer rather than using + operator when you concatenate multiple strings
  2. Use print() method instead of println() method of out (implicit) object
  3. Use ServletOutputStream instead of JspWriter
  4. Initialize out with proper size in the page directive
  5. Flush the data partly
  6. Minimize the amount of code in the synchronized block
  7. Set the content length
  1. Use StringBuffer for concatenation rather than the + operator. See Concatenating Strings for detailed information.
  2. The println() method internally calls the print() method, and HTML pages do not need newline separators. Calling print() directly saves the small overhead of one extra method call.
  3. JspWriter carries a small overhead because it is a character output stream and must encode its data to bytes. When you want to send binary data, use ServletOutputStream directly. Note that a response cannot use both its writer and its output stream, so in practice binary data is usually better served from a servlet than from a JSP.
  4. Initialize the out object with a proper size in the page directive. This is discussed in detail later in this section.
  5. If you send a large amount of data to the client from your servlet, the user may have to wait until the ServletOutputStream or JspWriter flushes the data. This generally happens when a page carries a lot of content, such as a number of GIFs. A better approach is to flush the data in parts using the flush() method rather than flushing everything at once. You can flush the header first, then the navigation bar, then the body content and finally the footer, so that the user sees the header immediately and then each following part as it arrives.
    out.write(header);
    out.flush(); // flush the header
    out.write(navbar);
    out.flush(); // flush the navigation bar
    // write dynamic data here
    out.flush(); // flush the dynamic data
    out.write(footer);
    out.flush(); // finally flush the footer
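
As a rough illustration of techniques 1 and 7 in the list above, the following plain-Java sketch builds a fragment with a presized StringBuilder (the unsynchronized sibling of StringBuffer, swapped in here) and computes the byte count that would be used as the content length. The class and method names (OutputSketch, buildList, contentLength) are hypothetical, and UTF-8 is an assumed response encoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the JSP API.
final class OutputSketch {

    // Builds "<ul><li>a</li>...</ul>" with one presized builder instead of
    // repeated '+' concatenation, which would create a new String on each pass.
    static String buildList(String[] items) {
        StringBuilder sb = new StringBuilder(16 + items.length * 16);
        sb.append("<ul>");
        for (String item : items) {
            sb.append("<li>").append(item).append("</li>");
        }
        sb.append("</ul>");
        return sb.toString();
    }

    // The content length is counted in bytes, not characters, so encode first.
    // Assumes the response is sent as UTF-8.
    static int contentLength(String body) {
        return body.getBytes(StandardCharsets.UTF_8).length;
    }
}
```

In a servlet or JSP, the value returned by contentLength(...) would be passed to response.setContentLength(int) before any output is committed.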

Optimization techniques in jspDestroy() method

The jspDestroy() method is called only once in the lifetime of a JSP's servlet, when the JSP engine removes the JSP's servlet from memory. It is always better to release resources held in instance variables, such as JDBC connections, sockets and other physical resources, in this method to avoid memory leaks.
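
The acquire-once, release-once pattern described above might look like the sketch below. Because the servlet API is not part of the standard library, PageLifecycleSketch is a hypothetical plain-Java stand-in for the class a JSP engine generates, and its jspInit(Closeable) signature is an assumption made only so the example can run on its own; in a real page the resource would be opened inside jspInit().

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical stand-in for the class a JSP engine generates from a page.
final class PageLifecycleSketch {

    private Closeable resource; // e.g. a JDBC connection or socket in a real page

    // Mirrors jspInit(): take hold of the long-lived resource once.
    void jspInit(Closeable acquired) {
        this.resource = acquired;
    }

    // Mirrors jspDestroy(): release the resource and drop the reference.
    void jspDestroy() {
        if (resource != null) {
            try {
                resource.close();
            } catch (IOException e) {
                // Nothing more can be done at shutdown; a real page would log this.
            } finally {
                resource = null;
            }
        }
    }

    boolean holdsResource() {
        return resource != null;
    }
}
```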

Optimization techniques in page directive

Page directive defines attributes that apply to an entire JSP page. Here is an example of page directive.
<%@ page session="true|false" buffer="none|8kb|size in kb" %>
true and 8kb are the default values. Only a few attributes are shown here: the ones that affect performance. By default the JSP engine creates a session object. If you don't want to use the built-in HttpSession for a JSP, set the session attribute to false. This avoids unnecessary creation of the session (implicit) object, reduces overhead on memory and the garbage collector, and increases performance. By default the size of the out object (the implicit JspWriter) is 8kb. You can increase the size if you are sending a large amount of data, for example:
<%@ page session="false" buffer="12kb" %>
Set the size to match the page's response data when it exceeds 8kb.
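
The effect of the buffer size can be approximated outside a container. The sketch below is only an analogy: it uses java.io.BufferedWriter as a stand-in for the JspWriter buffer (BufferedWriter sizes are in characters, whereas the page directive is in kilobytes), and BufferSizeSketch and CountingWriter are hypothetical names. It counts how often buffered data has to be pushed to the underlying stream for a response that is larger than the default buffer.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;

final class BufferSizeSketch {

    // Counts how many times buffered data is pushed to the underlying stream.
    static final class CountingWriter extends Writer {
        int writes;
        @Override public void write(char[] cbuf, int off, int len) { writes++; }
        @Override public void flush() { }
        @Override public void close() { }
    }

    // Writes `chars` characters through a buffer of `bufferSize` characters and
    // returns the number of writes that reached the underlying stream.
    static int underlyingWrites(int bufferSize, int chars) {
        CountingWriter sink = new CountingWriter();
        try (BufferedWriter out = new BufferedWriter(sink, bufferSize)) {
            for (int i = 0; i < chars; i++) {
                out.write('x');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.writes;
    }
}
```

With a response of about 10,000 characters, a buffer of 8192 has to be emptied part-way through, while a buffer of 12288 holds the whole response until the end.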

Choosing right include mechanism

There are two include mechanisms available to insert a file in a JSP page. They are
  1. include directive <%@ include file="child.jsp" %>
  2. include action <jsp:include page="child.jsp" flush="true" />
The include directive includes the content of the file during the translation phase, whereas the include action includes the content of the file during the execution/request processing phase. With the include directive, the JSP engine adds the content of the included page at translation time, so it has no impact on request-time performance. With the include action, the JSP engine adds the content of the included page at run time, which imposes extra overhead on every request.

Choosing right session scope in useBean action

When you want to create a bean using useBean action tag you can set scope for that bean
<jsp:useBean id="objectName" scope="page|request|session|application" />  
The default scope is 'page' for any bean if you don't specify the scope explicitly. By setting the scope attribute you define the lifetime of the object: when it is created and when its lifetime ends. To be precise, you define the availability of the object to a page, a request, a session (that is, across multiple requests from one user) or the application (across multiple users). The scope affects performance if you don't choose the narrowest scope that meets your requirement. What happens if you give session scope to an object that is needed only for one request? The object stays in memory unnecessarily after your work is done. When you use a session or application scope object, you have to remove it explicitly when you are done with it. Otherwise a session scope object stays in memory until you explicitly remove it or your server removes it after a configured time limit (typically 30 minutes). This reduces performance by imposing overhead on memory and the garbage collector. Application scope objects have the same problem. So set the exact scope an object needs, and remove session and application scope objects as soon as you are done with them.

Choosing the custom tags versus non custom tags

Custom tags in JSP give you reusability and simplicity. Simplicity means that you need not write Java code in the JSP; you write custom tags instead. Reusability means that once you write a piece of code as a custom tag handler, you can use that tag handler in any JSP. But what happens if you write a tag handler that is not reused often and is not simple? In such cases it is better not to use custom tags, because to write a tag handler you need to use the classes and interfaces of javax.servlet.jsp.tagext, write a tag library descriptor (TLD) file, and override methods of those classes and interfaces. The JSP engine has to consult the descriptor file to find the tag handler class and then execute that handler. None of these operations comes for free. The cost reduces performance in proportion to the number of tag handlers you use in a JSP. So don't use custom tags unless you are sure of their reusability.

Cache the static and dynamic data

Caching in different areas of your application can give very good performance. Almost every application's database schema has at least some read-only tables. There is no need to access these tables on every request. You can cache that data in memory and reuse it instead of going to the database every time. This reduces network traffic, consumes fewer CPU cycles and gives good performance. Caching comes in three flavors: static data caching, semi-dynamic data caching and dynamic data caching. Static data never changes during its lifetime; it is always constant. Semi-dynamic data changes, but not very often. For example, data that changes once every hour is semi-dynamic: it does not change on every request. Dynamic data changes often. People often use the term dynamic data for semi-dynamic data as well, and I follow the same terminology: in this section, dynamic data is synonymous with semi-dynamic data. It is best to cache both static data and dynamic data in order to improve performance. We will discuss a few caching techniques that improve JSP performance:
  1. Caching static and dynamic data
  2. Utilizing application server Caching facilities
  3. Utilizing JSP built in facility, session and application (implicit) objects
  4. Utilizing third party Caching algorithms
As we saw above, caching in the jspInit() method is useful for static data and saves the time needed to recreate it. By writing your own caching algorithms you can maintain a cache of dynamic data for your application. Your application server may also provide a caching facility for dynamic data. For example, WebLogic Server provides custom tags for dynamic caching. Look at your server documentation for more information. You can also use JSP's built-in session and application objects for caching. The session object is available to one user's session across multiple requests, and the application object is available to all users of the application. You can store data in these objects and retrieve the cached data whenever you need it. The methods that support caching are:
session.setAttribute(String name, Object cacheableObject);

session.getAttribute(String name);

application.setAttribute(String name, Object cacheableObject);

application.getAttribute(String name);
You can also use third-party or open source caching libraries. One good open source provider is http://www.opensymphony.com. It offers custom caching tags for free:
<cache></cache>

<usecached></usecached>

<flush/>
These ready-made tags use session and application scope objects internally. You can store a cacheable object under a key and retrieve it by that key, choose the scope (either session or application), set the refresh time for cacheable objects, and flush them. See http://www.opensymphony.com/oscache for detailed information about these tags. Each of these caching techniques gives good performance within a limited scope; choose among them according to your application's requirements.
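
If you write your own cache for semi-dynamic data, a time-based refresh is one simple policy. The following is a minimal sketch under stated assumptions, not the algorithm used by any of the products mentioned above: TimedCache, its methods and the injectable clock are all hypothetical names introduced for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical time-based cache for semi-dynamic data.
final class TimedCache<K, V> {

    private static final class Entry<V> {
        final V value;
        final long loadedAt;
        Entry(V value, long loadedAt) {
            this.value = value;
            this.loadedAt = loadedAt;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so that expiry can be tested

    TimedCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Returns the cached value, reloading it when missing or older than the TTL.
    // Two threads may occasionally reload the same key at once; the last write wins.
    V get(K key, Supplier<V> loader) {
        long now = clock.getAsLong();
        Entry<V> e = entries.get(key);
        if (e == null || now - e.loadedAt >= ttlMillis) {
            e = new Entry<>(loader.get(), now);
            entries.put(key, e);
        }
        return e.value;
    }

    // Explicit flush, for example after the underlying table is updated.
    void flush(K key) {
        entries.remove(key);
    }
}
```

In a JSP, one such instance could be created with System::currentTimeMillis as the clock and shared across users with application.setAttribute(...), so that the loader (for example a database query) runs at most once per refresh interval in the common case.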

Choosing the right session mechanism

We use a session mechanism to maintain client state across multiple pages. The session starts when a client, such as a browser, requests a URL from the web server, and it ends when the web server ends the session, the web server times out the session, the user logs out or the user closes the browser. There are a few approaches to maintaining a session:
  1. session (implicit) object available for any JSP ( this is HttpSession provided by servlet API)
  2. Hidden fields
  3. Cookies
  4. URL rewriting
  5. Persistent mechanism
It is not easy to pick one of these mechanisms for maintaining session data. Each one has an impact on performance that depends on the amount of data stored as session data and on the number of concurrent users. The following table gives an idea of the performance of each approach.
Session mechanism      Performance        Description
session                good               There is no limit on size of keeping session data
Hidden fields          moderate           There is no limit on size of passing session data
Cookies                moderate           There is a limit for cookie size
URL rewriting          moderate           There is a limit for URL rewriting
Persistent mechanism   moderate to poor   There is no limit of keeping session data
Here the Persistent mechanism means that you store the session data in the database, file storage or any other persistent storage. There are a few approaches for this mechanism, they are
  1. Using your application server's persistent mechanism for session data storage
  2. Using your own persistent mechanism by maintaining your own database schema
If you use the first approach, the application server generally converts the session objects into a BLOB and stores it in the database. If you use the second approach, you need to design the schema around your session fields and write JDBC code to store the session data; this gives better performance than the first approach. Either persistent mechanism gives moderate to poor performance compared with the other approaches because of the overhead of database calls through JDBC: it calls the database on every request to store the session data, and eventually it has to retrieve the whole session data from the database. It does, however, scale well as session data and concurrent users increase.
URL rewriting gives moderate performance because the data has to pass between the client and the server on every request, which adds network overhead, and there is a limit on the amount of data that can pass through URL rewriting. Cookies also give moderate performance because they too pass the session data between client and server, and each cookie has a size limit of 4 KB. Like URL rewriting and cookies, hidden fields pass the data between client and server and give moderate performance. For all three mechanisms, performance is inversely proportional to the amount of session data.
Unlike the mechanisms above, the session (implicit) object gives better performance because it stores the session data in memory and reduces network overhead: only the session id passes between client and server. But it does not scale well as session data and concurrent users increase, because memory overhead and garbage collection overhead increase with them. Remember that choosing among these session mechanisms depends not only on performance but also on scalability and security.
The best approach is to maintain a balance between performance, scalability and security. A mixture of the session mechanism and hidden fields gives both performance and scalability. By putting secure data in the session and non-secure data in hidden fields you can achieve better security.

Control session

If you decide to use session (implicit object that represents HttpSession object) for your session tracking, then you need to know how your application server/servlet engine implements session mechanism. You need to take care of the following points
  1. remove session explicitly
  2. session time out value
  3. application server/servlet engine implementation
Generally, your application server/servlet engine has a default session timeout of 30 minutes, which means that if you don't remove the session or touch it for 30 minutes, the servlet engine removes it from memory. If you set a long session timeout such as 1 hour, it keeps all the session objects for up to 1 hour. This affects scalability and performance because of the overhead on memory and garbage collection. To reduce memory overhead and improve performance, it is better to invalidate the session explicitly using the session.invalidate() method. Also adjust the session timeout to your application's requirements. The third important point is that your application server may serialize session objects to a persistent store once a certain memory limit is crossed. This is expensive and reduces performance because it serializes not only the single session object but the whole object hierarchy reachable from it. Mark variables 'transient' to avoid unnecessary serialization. See Serialization for detailed information. So learn how your application server/servlet engine implements sessions and act accordingly.
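
The effect of 'transient' on serialized session data can be checked with standard Java serialization. In this sketch, CartState is a hypothetical session object, and its renderedSummary field stands for derived data that can be rebuilt on demand and is therefore excluded from serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class TransientSketch {

    // Hypothetical object stored in the session.
    static final class CartState implements Serializable {
        private static final long serialVersionUID = 1L;
        final String userId;
        // Derived data that can be rebuilt; skipped by serialization.
        transient char[] renderedSummary;

        CartState(String userId, char[] renderedSummary) {
            this.userId = userId;
            this.renderedSummary = renderedSummary;
        }
    }

    static byte[] serialize(CartState state) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(state);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    static CartState deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (CartState) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After a round trip the transient field comes back as null, so code that reads it has to be prepared to rebuild it.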

Disable JSP auto reloading

Most application servers/JSP engines can load JSP's servlets dynamically, which means you need not restart your server whenever you change the JSP content. The application server/JSP engine reloads a JSP's servlet at the interval you configure. For example, if you configure the auto-reload time as 1 second, the JSP engine reloads that JSP's servlet every second. This feature is good at development time because it avoids restarting the server after every change to a JSP. But it gives poor performance in production because of the unnecessary loading and the burden on the class loader. So turn off the auto-reloading feature in the configuration file to improve performance.

Control Thread pool

Without a thread pool, the JSP engine creates a separate thread for every request, assigns that thread to the _jspService() method of the multithreaded JSP's servlet, and removes the thread once _jspService() completes. This happens for every request, and it may be your JSP engine's default behavior. It reduces performance because creating and removing threads is expensive. You can avoid this by using a thread pool. The JSP engine creates a pool of threads at startup, assigns a thread from the pool to every request instead of creating a fresh thread each time, and returns the thread to the pool after completion. The JSP engine creates the thread pool with a default size that depends on the pool parameters in its configuration file. The pool has a minimum and a maximum number of threads, and you can configure these numbers in the configuration file of your JSP engine. The right minimum and maximum depend on the number of concurrent users of your application. Estimate the number of concurrent users and size the thread pool accordingly. There is, of course, a limit on the thread pool size that depends on your hardware resources. Setting the thread pool size correctly can increase JSP performance significantly. Your application server/JSP engine may not let you configure the thread pool; Tomcat's JSP engine does. Look at your application server/JSP engine documentation for information about the thread pool.
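
Thread pooling inside a JSP engine is a matter of configuration, but the idea itself can be seen with the standard java.util.concurrent classes. This sketch is an illustration only and does not reflect how any particular engine implements its pool; ThreadPoolSketch and distinctWorkers are hypothetical names. It runs many simulated requests on a fixed-size pool and reports how many distinct threads served them.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class ThreadPoolSketch {

    // Runs `requests` simulated requests on a pool of `poolSize` threads and
    // returns how many distinct worker threads actually served them.
    static int distinctWorkers(int poolSize, int requests) {
        Set<String> workers = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            for (int i = 0; i < requests; i++) {
                pool.execute(() -> workers.add(Thread.currentThread().getName()));
            }
        } finally {
            pool.shutdown(); // stop accepting work; queued tasks still run
        }
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return workers.size();
    }
}
```

However many requests are submitted, the number of threads created never exceeds the pool size, which is the saving the section describes.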

Key Points

  1. Use jspInit() method to cache static data
  2. Use StringBuffer rather than using + operator when you concatenate multiple strings
  3. Use print() method rather than println() method
  4. Use ServletOutputStream instead of JspWriter to send binary data
  5. Initialize the 'out' object (implicit object) with proper size in the page directive.
  6. Flush the data partly
  7. Minimize code in the synchronized block
  8. Set the content length
  9. Release resources in jspDestroy() method.
  10. Give 'false' value to the session in the page directive to avoid session object creation.
  11. Use include directive instead of include action when you want to include the child page content in the translation phase.
  12. Avoid giving unnecessary scope in the 'useBean' action.
  13. Do not use custom tags if you do not have reusability.
  14. Use application server caching facility
  15. Use Mixed session mechanisms such as 'session' with hidden fields
  16. Use 'session' and 'application' as cache.
  17. Use caching tags provided by different organizations like openSymphony.com
  18. Remove 'session' objects explicitly in your program whenever you finish the task
  19. Reduce session time out value as much as possible
  20. Use 'transient' variables to reduce serialization overhead if your session tracking mechanism uses serialization process.
  21. Disable JSP auto reloading feature.
  22. Use thread pool for your JSP engine and define the size of thread pool as per application requirement.
