Best Practices: Structuring Your Tests
At the lowest level, a test step verifies a single action. Grouping test steps into a test case lets you verify a delimited piece of functionality: something your application needs. Several test cases make up a test suite that verifies the complete functionality of one of the deliverables: something your customer wants. Grouping the test suites into a test project allows you to verify the functionality of a complete product.
The terms project and suite are sometimes used interchangeably, but the idea is the same: multiple levels of tests with different scope.
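As a rough illustration of these levels, the sketch below models steps, cases, and a suite as small Python data classes. The class names (`Step`, `Case`, `Suite`) and the sample login case are hypothetical and not taken from any particular tool; a project would simply be one more layer that groups suites.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    """Verifies a single action; `action` returns True on success."""
    name: str
    action: Callable[[], bool]


@dataclass
class Case:
    """Groups steps that together verify one piece of functionality."""
    name: str
    steps: List[Step] = field(default_factory=list)

    def failed_steps(self) -> List[str]:
        return [step.name for step in self.steps if not step.action()]


@dataclass
class Suite:
    """Groups cases that together verify one deliverable."""
    name: str
    cases: List[Case] = field(default_factory=list)

    def failures(self) -> List[str]:
        # Report each failure as "suite / case / step" so the smallest
        # broken piece is visible immediately.
        return [
            f"{self.name} / {case.name} / {step_name}"
            for case in self.cases
            for step_name in case.failed_steps()
        ]


login = Case("login", [
    Step("request token", lambda: True),
    Step("token accepted", lambda: False),  # simulated defect
])
suite = Suite("accounts", [login])
print(suite.failures())  # ['accounts / login / token accepted']
```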
Small Building Blocks
Starting with individual API call test steps and combining them into larger use cases helps you quickly pinpoint exactly where a defect occurs. It also lets you gradually become familiar with all the nuances of the AUT (Application Under Test).
APIs are often poorly documented. You want to approach structuring your tests like building blocks, starting from very small and using the knowledge you gained to build up bigger pieces.
When you run your tests and find a defect, developers want to be able to quickly identify the smallest piece that is broken. If you only have a large user story as a single unit, it will take a long time to find exactly what failed and where within that story, no matter how automated it is.
Also, properly structuring your tests into manageable logical blocks will help your tests be more flexible and maintainable.
You want to be able to select specific pieces of functionality to test. If you have a suite of thousands of tests, but only one (critical) defect in one area of the application is fixed, you want to be able to quickly select only the tests surrounding that area of functionality. Almost all modern frameworks have a way of assigning tests, or even whole suites, to named groups, and a way of selecting just one (or a few) of those groups to run.
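The idea of named groups can be sketched in plain Python as follows. The `group` decorator, the `run_groups` helper, and the test names are all hypothetical; real frameworks offer their own mechanism (pytest, for example, uses markers selected with the `-m` option), so treat this only as an illustration of the principle.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Maps a group name to the test functions registered under it.
_GROUPS: Dict[str, List[Callable[[], None]]] = defaultdict(list)


def group(*names: str):
    """Decorator that registers a test function under one or more groups."""
    def register(test_func):
        for name in names:
            _GROUPS[name].append(test_func)
        return test_func
    return register


@group("payments", "smoke")
def test_deposit_endpoint_responds():
    assert True  # placeholder body


@group("payments")
def test_withdrawal_rejects_overdraft():
    assert True  # placeholder body


@group("profile")
def test_profile_can_be_read():
    assert True  # placeholder body


def run_groups(*names: str) -> List[str]:
    """Run every test in the named groups once; return the names that ran."""
    selected = []
    for name in names:
        for test_func in _GROUPS[name]:
            if test_func not in selected:
                selected.append(test_func)
    for test_func in selected:
        test_func()
    return [test_func.__name__ for test_func in selected]


# Only the tests around the fixed area are selected and run.
print(run_groups("payments"))
```

A test that belongs to several groups runs only once, even when more than one of its groups is selected.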
How small the individual pieces should be depends on your product. If you are a house builder, the smallest interesting blocks will be things like a window, a door, a cabinet, a floor mat, a wall section, etc. The next bigger unit could be a room: bedroom, kitchen, hallway, etc. A project could be a bungalow, a two-storey house, etc.
If you are a supplier for the house builder, the smallest block might be a door hinge, a handle, a bathroom sink, a kitchen sink, a faucet, etc. The bigger blocks might be an outside door, inside door, kitchen cabinet, etc. A project could be a complete kitchen ensemble, or a bathroom ensemble.
Smoke tests should always be your starting point when looking at a new application. A good smoke test suite probably has the highest ROI – Return On Investment – compared to all the other categories, and will save you a lot of trouble in the long run.
The term “smoke test” probably comes from the plumbing industry via the electronics industry. It is a basic check of all the installed pieces, to determine if there is any point in testing further. In the plumbing industry, one would blow actual smoke through pipes, to see if there are any smoke leaks or cracks. In electronics, power would be applied to a circuit, to see if any of the devices start to smoke due to incorrect wiring. In either case, if you see smoke, no further testing is needed.
In software, you want to be able to answer the question: has everything been deployed and started correctly? A smoke test should determine if any part is not working, or is outright missing. In the case of APIs, it should be sufficient to call every method once and get any response, just to make sure it is there. Methods that fetch data from the database are especially important, as those will tell you whether all the pieces of the entire stack in the n-tier design are deployed and communicating properly.
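A smoke test along these lines might look like the sketch below. The endpoint paths are made up, and `fake_call` is a stand-in for a real HTTP client so that the example runs offline; the rule that a 404 means "missing" and a 5xx means "server error" is one possible interpretation of "any response", not a fixed standard.

```python
from typing import Callable, Dict, List

# Hypothetical read-only endpoints of the application under test.
ENDPOINTS = ["/health", "/accounts", "/accounts/1/balance"]


def smoke_test(call: Callable[[str], int], endpoints: List[str]) -> Dict[str, str]:
    """Call each endpoint once and record whether anything answered.

    `call` performs the request and returns an HTTP status code, or raises
    if the endpoint cannot be reached at all.
    """
    results = {}
    for path in endpoints:
        try:
            status = call(path)
        except Exception as error:  # connection refused, timeout, ...
            results[path] = f"unreachable: {error}"
            continue
        if status == 404:
            results[path] = "missing"  # the method was not deployed
        elif status >= 500:
            results[path] = f"server error {status}"  # part of the stack is down
        else:
            results[path] = "ok"  # any other response is good enough here
    return results


def fake_call(path: str) -> int:
    """Stand-in for a real HTTP client (an assumption of this sketch)."""
    if path == "/accounts/1/balance":
        raise ConnectionError("database tier not responding")
    return 200


for path, outcome in smoke_test(fake_call, ENDPOINTS).items():
    print(path, "->", outcome)
```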
Smoke tests should be run at the end of every single build. A good rule of thumb is that the entire smoke test suite should take no more than 30 minutes to run, since nobody wants to wait a long time for a new build. Another good rule is that the tests in the smoke test suite should not make any internal calls; they should only communicate with external APIs, for example through JDBC calls. If you use non-intrusive smoke tests (that do not modify any customer data), then they work well as a production monitoring tool. (See the World of API Testing section article DevOps trends.)
Sanity tests add a higher level of testing to the smoke tests. Sanity tests verify that the smoke tests are getting back something reasonable. For example, most weather reports do not have decimal places for temperature, but the database might. A weather report for anywhere in the world should not return a temperature (in degrees Celsius) with more than two digits. At this stage, the exact value, or whether it is positive or negative, isn't important. You are checking that the call is correctly interpreting the inputs and correctly showing you the returned data.
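The weather example could be expressed as a small check like the one below. The function name `is_sane_temperature` is hypothetical, and the sketch assumes the API returns the temperature as a plain integer.

```python
def is_sane_temperature(value) -> bool:
    """Return True if `value` looks like a plausible reported temperature.

    Checks shape only: a whole number of degrees Celsius with at most two
    digits. The sign and the exact value are deliberately ignored.
    """
    if isinstance(value, bool) or not isinstance(value, int):
        return False  # e.g. 21.53 passed through from the database, or "21"
    return len(str(abs(value))) <= 2


print(is_sane_temperature(21))     # True
print(is_sane_temperature(-40))    # True: the sign does not matter here
print(is_sane_temperature(21.53))  # False: decimal places leaked through
print(is_sane_temperature(451))    # False: more than two digits
```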
An API exposes various objects (in the object-oriented programming sense) that are stored somewhere, most often in a database. Every object can be created and read back, and often objects can also be updated and deleted.
CRUD tests verify that objects are being written correctly to the database: Create, Read, Update, Delete. It is a good idea to verify the data directly in the database after each operation (perhaps using a JDBC, Java DataBase Connectivity, call). This will help you get familiar with the structure of the database, and give you confidence that subsequent calls are returning what they are supposed to.
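The following sketch shows the shape of such a CRUD test. It uses Python's built-in `sqlite3` module with an in-memory database in place of JDBC, and the `accounts` table and the four API functions are invented for the example. In a real setup the API would run on a server and the test would open its own, separate database connection.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, owner TEXT, balance INTEGER)")


# --- Hypothetical API layer under test -------------------------------
def create_account(owner: str, balance: int) -> int:
    cursor = db.execute(
        "INSERT INTO accounts (owner, balance) VALUES (?, ?)", (owner, balance)
    )
    db.commit()
    return cursor.lastrowid


def read_account(account_id: int):
    return db.execute(
        "SELECT owner, balance FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()


def update_balance(account_id: int, balance: int) -> None:
    db.execute("UPDATE accounts SET balance = ? WHERE id = ?", (balance, account_id))
    db.commit()


def delete_account(account_id: int) -> None:
    db.execute("DELETE FROM accounts WHERE id = ?", (account_id,))
    db.commit()


# --- CRUD test: check the database directly after each call ----------
def row_in_db(account_id: int):
    """Bypass the API and look at the stored row itself."""
    return db.execute(
        "SELECT owner, balance FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()


def test_account_crud():
    account_id = create_account("alice", 100)                  # Create
    assert row_in_db(account_id) == ("alice", 100)

    assert read_account(account_id) == row_in_db(account_id)  # Read

    update_balance(account_id, 250)                            # Update
    assert row_in_db(account_id) == ("alice", 250)

    delete_account(account_id)                                 # Delete
    assert row_in_db(account_id) is None


test_account_crud()
```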
One way to use CRUD tests is to reveal problems in any caching mechanism between the API server and the database server. If your smoke tests reveal that create and read calls work separately, but creating an object and immediately reading the same instance back does not, that is probably a caching problem.
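To make the symptom concrete, the sketch below pairs a back-to-back create-and-read check with a deliberately broken, hypothetical API (`StaleCacheApi`) whose reads come from a cache that is refreshed only periodically. Real caching bugs take many forms; this is just one simple way such a defect might show up.

```python
from typing import Callable, Optional


def read_back_matches(create: Callable[[dict], int],
                      read: Callable[[int], Optional[dict]],
                      payload: dict) -> bool:
    """Create an object and read the same instance back immediately."""
    object_id = create(payload)
    return read(object_id) == payload


class StaleCacheApi:
    """Hypothetical API whose reads are served from a cache that is only
    refreshed periodically, not when an object is written."""

    def __init__(self):
        self._database = {}
        self._cache = {}

    def create(self, payload: dict) -> int:
        object_id = len(self._database) + 1
        self._database[object_id] = dict(payload)
        return object_id  # bug: the cache is not updated here

    def read(self, object_id: int) -> Optional[dict]:
        return self._cache.get(object_id)

    def refresh_cache(self) -> None:
        self._cache = dict(self._database)


api = StaleCacheApi()
print(read_back_matches(api.create, api.read, {"owner": "alice"}))  # False

api.refresh_cache()
print(api.read(1))  # {'owner': 'alice'}: the read works once the cache catches up
```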
Another problem area that CRUD tests can reveal is concurrency. If you make 5 parallel withdrawal calls, each for all of your money, only one of the calls should succeed. If more than one succeeds, the cause is often an incorrect locking mechanism.
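The parallel withdrawal scenario can be sketched with threads, as below. The `Account` class is a hypothetical stand-in for the server-side logic, shown here with correct locking so that exactly one call succeeds; against a real API, each thread would issue an HTTP request instead.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Account:
    """Hypothetical account whose withdrawal is guarded by a lock."""

    def __init__(self, balance: int):
        self.balance = balance
        self._lock = threading.Lock()

    def withdraw(self, amount: int) -> bool:
        # Check and update must happen as one atomic step; without the lock,
        # several threads could pass the check before any of them subtracts.
        with self._lock:
            if amount > self.balance:
                return False
            self.balance -= amount
            return True


def parallel_withdrawals(account: Account, amount: int, calls: int = 5) -> int:
    """Fire `calls` withdrawals at once and return how many succeeded."""
    start = threading.Barrier(calls)

    def attempt() -> bool:
        start.wait()  # hold every thread until all of them are ready
        return account.withdraw(amount)

    with ThreadPoolExecutor(max_workers=calls) as pool:
        futures = [pool.submit(attempt) for _ in range(calls)]
        return sum(future.result() for future in futures)


account = Account(balance=100)
succeeded = parallel_withdrawals(account, amount=100)
print(succeeded, account.balance)  # 1 0
```

The barrier is there only to make the calls start as close together as possible, which increases the chance of exposing a missing lock.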
The purpose of any test is to try to break the application. A more specific way of testing is to use negative tests to bring out error messages. Error messages should tell you exactly what went wrong: “a user must be logged in first”, or “missing amount for a deposit”. Obviously the errors should match the actual conditions.
Error messages are especially important for public APIs; they help developers interface with your API. For internal APIs, it is quite common that the error messages will only return a code, and the list of codes is shared between different teams (server developers, UI developers, and testers). (See the article Negative Testing.)
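A negative test can be organized as a table of bad inputs and the error each one should provoke, as in the sketch below. The `deposit` function, the `ApiError` class, and the error codes `AUTH_REQUIRED` and `AMOUNT_MISSING` are hypothetical; only the two messages come from the examples above.

```python
class ApiError(Exception):
    """Hypothetical error carrying a shared code and a readable message."""

    def __init__(self, code: str, message: str):
        super().__init__(message)
        self.code = code
        self.message = message


def deposit(session: dict, amount=None) -> int:
    """Hypothetical deposit call, standing in for the API under test."""
    if not session.get("logged_in"):
        raise ApiError("AUTH_REQUIRED", "a user must be logged in first")
    if amount is None:
        raise ApiError("AMOUNT_MISSING", "missing amount for a deposit")
    return amount


# Each negative case: (description, arguments, expected error code).
NEGATIVE_CASES = [
    ("no login", ({"logged_in": False}, 50), "AUTH_REQUIRED"),
    ("no amount", ({"logged_in": True}, None), "AMOUNT_MISSING"),
]


def run_negative_cases() -> list:
    """Return descriptions of cases where the error was missing or wrong."""
    problems = []
    for description, args, expected_code in NEGATIVE_CASES:
        try:
            deposit(*args)
        except ApiError as error:
            if error.code != expected_code:
                problems.append(f"{description}: got {error.code}")
        else:
            problems.append(f"{description}: no error raised")
    return problems


print(run_negative_cases())  # [] when every error matches its condition
```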
Boundary tests are a special case of negative tests. If you have a field that is supposed to accept only a certain range of values, then it is a good idea to test what happens exactly at the boundary. For example, if a field accepts any integer value between 1 and 10, inclusive, you might want to try the values at and immediately around each boundary, such as 0, 1, 10, and 11.
Also, while not strictly a boundary test, it is a good idea to try extremely large values. In the above example, you would want to try the maximum value of an integer. A boundary test will often be set up as a data-driven test executed in a loop, so it is very easy to add another value. Adding large values checks for buffer overflow problems, which are a security concern.
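A data-driven boundary test for the 1 to 10 example might be laid out as follows. The `accepts` function is a hypothetical stand-in for the field under test, and the choice of the largest 32-bit signed integer as the "maximum value" is an assumption; use whatever limit applies to your platform.

```python
LOWER, UPPER = 1, 10  # inclusive range from the example in the text


def accepts(value: int) -> bool:
    """Hypothetical validator standing in for the field under test."""
    return LOWER <= value <= UPPER


# (input value, whether the field is expected to accept it)
BOUNDARY_CASES = [
    (LOWER - 1, False),  # just below the lower boundary
    (LOWER, True),       # exactly on the lower boundary
    (UPPER, True),       # exactly on the upper boundary
    (UPPER + 1, False),  # just above the upper boundary
    (2**31 - 1, False),  # extremely large value (assumed 32-bit maximum)
]

# Run every case in a loop and collect the ones that behave unexpectedly.
failures = [
    (value, expected)
    for value, expected in BOUNDARY_CASES
    if accepts(value) != expected
]
print(failures)  # [] when the field handles every case as expected
```

Adding another value to try is a one-line change to `BOUNDARY_CASES`.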
Security tests are another special case of negative tests. These involve sending specially crafted inputs to the AUT in an attempt to bypass normal access restrictions and trick the application into revealing information that should not be revealed. Security testing is a very specialized area that requires much more research and education than the others. Some testing tools offer canned security tests as part of their feature set, allowing you to easily expose your AUT to some of the most common security attacks out there. However, there are security vulnerabilities, such as zero-day exploits, that no tool can reveal and which require a specialist in the area.