
Frank Cohen


Making the Right Choices for SOAP Scalability

From Chapter 14

Software developers live in a time that offers the greatest choice of software development tools, application servers, and connectivity ever. Each choice you make affects the scalability and reliability of your finished application, especially if you're building Web services. For example, as you will learn in this article, my study of SOAP encoding styles found a 30-fold performance improvement from choosing one SOAP encoding style over the others. By understanding the performance impact of SOAP encoding styles, Web service development tools, application servers, and platforms, you can make choices that greatly improve system performance.

This article presents the results of an investigation that shows how each choice immediately impacts scalability and reliability. It discusses how letting development tools make these choices for us affects scalability and performance. Then it presents directions, tools, and test agents you can use to stage the same tests in your own environment.

Why Is SOAP So Popular?
In my experience, when independent technology innovations intersect, the world enjoys life-changing products, services, and techniques. For example, the light bulb required both electricity generation and filament fiber technology. In the case of Web services, enterprise information technology managers had just come off a multi-year binge where they bought huge numbers of computers, servers, routers, and other Web infrastructure. Left with a recession, terrible stock market, uncertainty about the world economy, and having to contend with terrorism and SARS, these information managers returned to their existing Web infrastructures to increase the productivity of their teams through new software projects.

At the same time, most software developers realized they really liked XML. For example, XML was much better than using the Microsoft Windows Registry or text-based property files to store and describe application data. Software developers saw a good thing in XML and wanted to find more ways to use XML in their applications.

As a software developer I began noticing application programming interfaces (APIs) that expected to receive a value containing XML-encoded data. For example, when building a portal system for Sun Microsystems I found that the servlet that creates a new user account received the fields that make up the user contact information (email, address, and telephone number) in an XML document. Rather than passing values to a method one at a time, the method took a single XML value that contained several values. Using XML to implement an application's interfaces is a clear win for developers. Plus, these XML-described interfaces can work across platforms and programming languages. With XML, everything looks like an interface.

These intersecting technologies power the widespread enthusiasm for Web services. At the same time, software developers were again experimenting with software architectures, especially with the location of application business logic and presentation code. Presentation code handles windows, mice, keyboard, and other user interactions. Business logic is the instructions that define the behavior and operation of an application.

The first-generation software architecture built the presentation and business logic on a single system. In the second generation, client/server architecture brought back the large, centrally controlled datacenter so familiar in the 1960s, when mainframes ruled the information world. In client/server architecture the desktop system is a "dumb" terminal that only needs to display the data provided by the server. The early Internet was modeled after client/server architecture, where the browser made a simple request to a server. As browsers improved in functionality - applets, JavaScript, ActiveX, and DHTML were introduced - some systems included business logic on the desktop side. However, the majority of the functionality remained on the server.

The age of "Grid Computing" is upon us, where an application hosts business logic modules on the desktop or server. The modules discover each other using UDDI and P2P technologies. Also, multiple copies of the business-logic modules may run in a grid of data centers to allow failover, dynamic routing, and functional specialization. All of these architectures run in a Web environment and can host Web services.

So even if SOAP-based Web services are replaced with some other type of Web service technology, remote procedure calls using XML-encoded data will be around for a very long time.

SOAP Encoding Styles
SOAP uses XML to marshal data that is transported to a software application. Most of the time, SOAP moves data between software objects, but the SOAP specification was intended to be useful for old legacy systems as well as modern object-oriented systems. Consequently, SOAP defines more than one data-encoding method to convert data from a software program into XML format and back again. The SOAP-encoded data is packaged into the body of a message and sent to a host. The host then decodes the XML-formatted data back into a software object.

Since SOAP's introduction, three SOAP encoding styles have become popular and are reliably implemented across software vendors and technology providers:

  • SOAP Remote Procedure Call (RPC) encoding, also known as Section 5 encoding, as defined by the SOAP 1.1 specification and later defined in SOAP 1.2 as RPC encodings and conventions
  • SOAP Remote Procedure Call literal encoding (SOAP RPC-literal), which uses RPC methods to make the call but takes a do-it-yourself approach to marshaling the XML data.
  • SOAP document-style encoding, also known as message-style or document-literal encoding.

There are other encoding styles, but software developers have not widely adopted them, mostly because their promoters disagree on a standard. For example, early in the invention of Web services, Microsoft promoted Direct Internet Message Exchange (DIME) to encode binary file data, while the rest of the computer industry adopted SOAP with Attachments. SOAP RPC, RPC-literal, and document-style SOAP encoding have emerged as the encoding styles that a software developer can count on.

Some developers do not realize that such encoding styles exist, because the tools they use to develop Web services implement the encoding styles for them. For example, BEA WebLogic Workshop provides a fast and efficient implementation of the Java Web Service (JWS) interface. JWS implements a set of application programming interfaces (APIs) and a standard description of the files the JWS engine needs to deploy the Web service on the server automatically. JWS builds the Web service deployment descriptors for you, so you need only define the public Java methods and JWS publishes the SOAP proxy to access them. This makes development appear very easy, but there is a lot of work going on under the covers, so to speak. We'll see this in more depth later in this article.
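To make this concrete, here is a minimal sketch of what a Workshop-style JWS class might look like. The class name, method, and parameter are hypothetical, and the exact Javadoc tag that marks a method as a Web service operation varies between Workshop releases, so treat this as an illustration rather than a verbatim recipe.

// HelloService.jws - a hypothetical WebLogic Workshop JWS class
public class HelloService
{
    /**
     * The Javadoc-style tag below is how Workshop marks a public method
     * for exposure as a SOAP operation; the exact tag name depends on
     * the Workshop version.
     *
     * @common:operation
     */
    public String hello(String name)
    {
        // Workshop generates the WSDL, deployment descriptors, and the
        // SOAP proxy for this method automatically.
        return "Hello, " + name;
    }
}

Workshop compiles the .jws file, generates the deployment descriptors, and exposes the method as a SOAP operation without any further work from the developer.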

Before I discuss the impact of SOAP encoding styles on performance, you should understand the differences between these styles. Figure 1 shows the entire stack for a SOAP RPC-encoded call.

[Figure 1: The full stack of a SOAP RPC-encoded call]

SOAP RPC is the encoding style that offers the most simplicity for developers. The developer makes a call to a remote object, passing along any necessary parameters. The SOAP stack serializes the parameters into XML, moves the data to the destination using transports such as HTTP and SMTP, receives the response, deserializes the response back into objects, and returns the results to the calling method. Whew! SOAP RPC handles all the encoding and decoding, even for very complex data types, and binds to the remote object automatically.
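To see how little the calling code has to do, here is a minimal client sketch using the Apache SOAP 2.x RPC API (the same client library the TestMaker agents use later in this article). The endpoint URL, service URN, method name, and parameters are hypothetical, chosen only to illustrate the pattern.

import java.net.URL;
import java.util.Vector;
import org.apache.soap.Constants;
import org.apache.soap.Fault;
import org.apache.soap.rpc.Call;
import org.apache.soap.rpc.Parameter;
import org.apache.soap.rpc.Response;

public class RpcEncodedClient
{
    public static void main(String[] args) throws Exception
    {
        // Hypothetical RPC router endpoint and service URN
        URL endpoint = new URL("http://localhost:8080/soap/servlet/rpcrouter");

        Call call = new Call();
        call.setTargetObjectURI("urn:ExampleService");
        call.setMethodName("getResponse");
        call.setEncodingStyleURI(Constants.NS_URI_SOAP_ENC); // Section 5 (RPC) encoding

        // The stack serializes these Java values into the SOAP body for us
        Vector params = new Vector();
        params.addElement(new Parameter("payloadSize", Integer.class, new Integer(600), null));
        params.addElement(new Parameter("delay", Integer.class, new Integer(0), null));
        call.setParams(params);

        // invoke() posts the request over HTTP and deserializes the response
        Response response = call.invoke(endpoint, "");

        if (response.generatedFault())
        {
            Fault fault = response.getFault();
            System.out.println("Fault: " + fault.getFaultCode() + " " + fault.getFaultString());
        }
        else
        {
            Parameter result = response.getReturnValue();
            System.out.println("Result: " + result.getValue());
        }
    }
}

The developer never touches the XML; the Apache SOAP stack builds the envelope, serializes the parameters, and turns the response back into Java objects.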

Now, imagine you are a developer with some data already in XML format. SOAP RPC also allows literal encoding of the XML data as a single field that is serialized and sent to the Web service host. Since there is only a single parameter, the XML tree, the SOAP stack needs to serialize only one value. The SOAP stack still deals with the transport issues to get the request to the remote object, binds the request to the remote object, and handles the response.

Lastly, in a SOAP document-style call, the SOAP stack sends an entire XML document to a server without even requiring a return value. The message can contain any sort of XML data that is appropriate to the remote service. In SOAP document-style encoding, the developer handles everything, including determining the transport (e.g., HTTP, MQ, SMTP), marshaling and unmarshaling the body of the SOAP envelope, and parsing the XML in the request and response to find the needed data.
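For contrast, here is a sketch of a document-style exchange built on Apache SOAP's messaging API. The message router URL and element names are hypothetical. Notice how the developer builds the XML payload and envelope by hand and receives the raw response to parse on their own.

import java.io.BufferedReader;
import java.net.URL;
import java.util.Vector;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.apache.soap.Body;
import org.apache.soap.Envelope;
import org.apache.soap.messaging.Message;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DocumentStyleClient
{
    public static void main(String[] args) throws Exception
    {
        // Hypothetical message router endpoint
        URL endpoint = new URL("http://localhost:8080/soap/servlet/messagerouter");

        // Build the XML payload ourselves; the SOAP stack does no marshaling here
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.newDocument();
        Element request = doc.createElementNS("urn:ExampleService", "ns:getResponse");
        Element size = doc.createElementNS("urn:ExampleService", "ns:payloadSize");
        size.appendChild(doc.createTextNode("600"));
        request.appendChild(size);

        // Wrap the payload in a SOAP envelope by hand
        Vector bodyEntries = new Vector();
        bodyEntries.addElement(request);
        Body body = new Body();
        body.setBodyEntries(bodyEntries);
        Envelope envelope = new Envelope();
        envelope.setBody(body);

        // Send the whole document; reading and parsing the response is also our job
        Message message = new Message();
        message.send(endpoint, "", envelope);
        BufferedReader reader = message.getSOAPTransport().receive();
        String line;
        while ((line = reader.readLine()) != null)
        {
            System.out.println(line); // raw XML response, ready for our own parser
        }
    }
}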

The three encoding systems are compared in Figure 2.

[Figure 2: Comparison of the three SOAP encoding styles]

SOAP RPC encoding is easiest for the software developer; however, all that ease comes with a scalability and performance penalty. SOAP RPC-literal encoding requires more work from the software developer, who must handle the XML parsing, but imposes less overhead on the SOAP stack. SOAP document-literal encoding is the most difficult for the software developer, but consequently requires very little SOAP overhead.

Why is SOAP RPC easier for the developer? With this encoding style, you only need to define the public object method in your code once; the SOAP stack unmarshals the request parameters into objects and passes them directly into the method call of your object. With the other encoding styles, you are stuck with the task of parsing through the XML tree to find the data elements you need before you can make the call to the public method.

There is an argument for parsing the XML data yourself: since you know the data in the XML tree best, your code can parse that data more efficiently than generalized SOAP stack code. As we will see when we measure the scalability and performance of the SOAP encoding styles, this turns out to be the case.

But before going further into that, let's look at how enterprise information systems managers are coming to grips with SOAP encoding styles and scalability.

Simple Object Access Needs Simple Testing
Elsevier (www.elsevier.com) is the leading research content publisher for the science, technology, and medical industries. Elsevier now runs a content-publishing platform that uses SOAP to build application programming interfaces. Elsevier's information managers need to know whether their choices of SOAP encoding style will scale and perform well enough to handle millions of transactions every day. Their decisions affect how Elsevier will invest capital in new infrastructure. Over time, they need to know how new releases of their own software, new releases of application server software, and platform changes will affect scalability and performance.

Elsevier learned about TestMaker through the open source community and contacted PushToTest (www.pushtotest.com) to see if TestMaker was appropriate for their testing needs. Elsevier asked PushToTest to conduct an independent audit of SOAP stacks and encoding styles to answer their questions about system performance and scalability. PushToTest delivered a Test Web Service (TWS) that handles RPC, RPC-literal, and document-style SOAP messages and runs on a variety of application servers. The environment is completed with a set of intelligent test agents to check TWS for scalability and performance.

TestMaker checks Web services for scalability, performance, and reliability. Software developers, QA analysts, and IT managers use TestMaker to build intelligent test agents that implement archetypal user behavior. The agents drive a Web service using native protocols (HTTP, HTTPS, SOAP, XML-RPC, SMTP, POP3, IMAP) just as a real user would. Running multiple intelligent test agents concurrently creates near-production-level loads to check the system for scalability and performance.

In addition to checking SOAP encoding scalability, the Elsevier test environment provides a benchmark specific to Elsevier's systems that shows a performance comparison across a variety of application servers and platforms. For example, TWS is currently implemented to run on IBM WebSphere, BEA WebLogic Workshop, and the SunONE Application Server. I am confident that ports to MindElectric's Glue, Apache Axis, Systinet WASP, and other application servers are straightforward.

I built the Elsevier test environment by customizing TestMaker to support SOAP RPC, SOAP RPC-literal, and SOAP document-style requests, and by implementing TWS to respond to requests in these encoding styles. The request to TWS contains two parameters: the first defines the size of the response and the second defines a delay value before responding. TWS responds by creating a response document containing random gibberish words in five response elements, each with one child element. A TestMaker test agent used the Apache SOAP library to make requests to TWS, varying the number of concurrent requests and the payload size of the response. The agent logged the results to a delimited log file, which a tally script then summarized. The tally script determined the number of transactions per second (TPS) performed by the test from the count and duration of successful transactions. We defined success as the absence of transport or SOAP faults.
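The actual agents were TestMaker scripts, but the core load-and-tally logic is simple enough to sketch in plain Java. This hypothetical driver runs concurrent agents, records the duration and success of each request, and computes TPS the same way the tally script does: successful transactions divided by elapsed time. The SOAP call itself is stubbed out so the sketch stays self-contained.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class LoadDriver
{
    static final int CONCURRENT_AGENTS = 25;   // hypothetical load level
    static final int REQUESTS_PER_AGENT = 100;

    public static void main(String[] args) throws Exception
    {
        List durations = Collections.synchronizedList(new ArrayList());
        long testStart = System.currentTimeMillis();

        Thread[] agents = new Thread[CONCURRENT_AGENTS];
        for (int i = 0; i < CONCURRENT_AGENTS; i++)
        {
            agents[i] = new Thread(new Agent(durations));
            agents[i].start();
        }
        for (int i = 0; i < CONCURRENT_AGENTS; i++)
        {
            agents[i].join();
        }

        long elapsedMillis = System.currentTimeMillis() - testStart;
        // Tally: only transactions that completed without a fault are counted
        int successes = durations.size();
        double tps = successes / (elapsedMillis / 1000.0);
        System.out.println(successes + " successful transactions, " + tps + " TPS");
    }

    static class Agent implements Runnable
    {
        private final List durations;

        Agent(List durations) { this.durations = durations; }

        public void run()
        {
            for (int i = 0; i < REQUESTS_PER_AGENT; i++)
            {
                long start = System.currentTimeMillis();
                boolean success = callTestWebService();
                long duration = System.currentTimeMillis() - start;
                if (success)
                {
                    durations.add(new Long(duration)); // per-request timing for later analysis
                }
            }
        }

        // Placeholder for the SOAP request; a real agent would make the RPC,
        // RPC-literal, or document-style call and treat any transport or
        // SOAP fault as a failure.
        private boolean callTestWebService()
        {
            try
            {
                Thread.sleep(10); // simulate network and server time
                return true;
            }
            catch (InterruptedException e)
            {
                return false;
            }
        }
    }
}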

With Sun Microsystems' support, I ran the tests on Sun Solaris E4500 servers with 6 CPUs and 4 GB of RAM. The Test Web Service (TWS) used the SOAP stack provided by the underlying application server. For example, WebSphere provides Apache SOAP, BEA WebLogic provides its own implementation that uses the JAX-RPC APIs, and the SunONE Application Server uses the Java 1.4 JAX-RPC library. On the client side, TestMaker uses the Apache SOAP library.

In the Elsevier project, I found that a developer's choice of encoding style determines to a large extent the scalability and performance of a Web service. The SOAP implementations universally showed scalability problems when using SOAP RPC encoding, especially as payload sizes increased, as illustrated in Figure 3.

[Figure 3: Transactions per second by SOAP encoding style and payload size]

The test agent recorded 294 transactions per second when making requests where the response SOAP envelope measured 600 bytes of SOAP RPC-encoded data. As the test agent increased the response size, the transactions per second plummeted. When making requests of 96,000 bytes of SOAP RPC-encoded data, the agent measured only 9.5 transactions per second.

When the test environment used SOAP document-style encoding, the performance fared much better. With 600 bytes of document-encoded data, the test agent measured 469 TPS. Recall that the SOAP RPC-encoded requests gave us 294 TPS for responses of the same size. Additionally, when the test agent increased the response size, the TPS values did not degrade significantly with document-style encoded responses.

When the test environment used SOAP RPC-literal encoding, I found an efficient middle ground. RPC-literal provides the performance benefits of SOAP document-style encoding with only a little more work required to parse through the XML data.

In my experience every production environment is unique. So, rather than try to be your answer guy for every application server and encoding style, I would like to give you a performance kit that you can download and use in your own production environment. I have made a generalized version of the Elsevier test environment available for free download for your immediate use at: www.pushtotest.com/ptt/kits/encodingkit.html.

Summary
In this article, we found that software developers have many choices for building Web service systems: SOAP encoding styles, Web service development tools, and application servers. The article presented the results of an investigation that shows how each choice immediately impacts scalability and reliability.

Elsevier adopted SOAP as their standard way to build their next-generation content aggregation system. We saw how Elsevier developed a new methodology and test environment to check various SOAP implementations, including application servers, SOAP stacks, and utilities, for scalability and performance.

Book info:
Java Testing, Design and Automation
Prentice Hall Publishing
ISBN 0131421891

Price:
$49.95 USD
http://thebook.pushtotest.com

Publication date:
February 2004

More Stories By Frank Cohen

Frank Cohen is the CEO and Founder of Votsh Inc. and the CTO at Appvance (formerly PushToTest). He is one of the world's foremost experts in software test tools, process, and methodology. He founded Regent Software, joined Peter Norton Computing, managed the successful merger with Symantec, joined Stac Electronics, launched SoftWindows at Insignia, and led Apple Computer's middleware, networking, and connectivity product lines as a senior manager. He was also on the founding team of TuneUp.com, which was acquired by Symantec, and a co-founder of Inclusion Technologies, which built interactive personalized communication and workflow technology for Web sites.

Cohen authored four books, including FastSOA (Morgan Kaufmann), Java Testing and Design: From Unit Testing To Automated Web Tests (Prentice Hall), Java Web Services Unleashed (SAMS), and Java P2P Unleashed (SAMS). He is a leading authority on testing and optimizing software developed with service-oriented architecture (SOA) and Web service designs. He is CEO and Founder of PushToTest and inventor of TestMaker, the open source SOA test automation tool that helps software developers, QA technicians, and IT managers understand and optimize the scalability, performance, and reliability of their systems.
