Thursday, March 21, 2013

Performance tuning on homemade views

I normally get assigned the performance-related problems in any organization I have ever worked in, and frequently I need to create new indexes, change programming and similar things.
The other day, however, I ran into an issue where altering the repair order attached to a simple SAP notification was extremely slow.
I ran an ST05 trace and found the following:

SELECT STATEMENT ( Estimated Costs = 5.717 , Estimated #Rows = 1 )

       14 COUNT STOPKEY
          Filter Predicates

           13 HASH JOIN
              ( Estim. Costs = 5.717 , Estim. #Rows = 1 )
              Estim. CPU-Costs = 671.874.273 Estim. IO-Costs = 5.672
              Access Predicates

               10 HASH JOIN
                  ( Estim. Costs = 4.260 , Estim. #Rows = 21 )
                  Estim. CPU-Costs = 653.871.901 Estim. IO-Costs = 4.217
                  Access Predicates

                   5 NESTED LOOPS
                     ( Estim. Costs = 2 , Estim. #Rows = 21 )
                     Estim. CPU-Costs = 20.973 Estim. IO-Costs = 2

                       2 TABLE ACCESS BY INDEX ROWID QMEL
                         ( Estim. Costs = 1 , Estim. #Rows = 1 )
                         Estim. CPU-Costs = 4.877 Estim. IO-Costs = 1

                            1 INDEX SKIP SCAN QMEL~0
                             Search Columns: 2
                             Estim. CPU-Costs = 3.059 Estim. IO-Costs = 0
                             Access Predicates

                       4 TABLE ACCESS BY INDEX ROWID COBRB
                         ( Estim. Costs = 2 , Estim. #Rows = 21 )
                         Estim. CPU-Costs = 16.096 Estim. IO-Costs = 2

                           3 INDEX RANGE SCAN COBRB~1
                             Search Columns: 2
                             Estim. CPU-Costs = 3.899 Estim. IO-Costs = 0
                             Access Predicates

                   9 VIEW index$_join$_002
                     ( Estim. Costs = 4.255 , Estim. #Rows = 250.747 )
                     Estim. CPU-Costs = 621.161.707 Estim. IO-Costs = 4.214

                       8 HASH JOIN
                         Access Predicates

                           6 INDEX FAST FULL SCAN AUFK~ZWR
                             ( Estim. Costs = 1.764 , Estim. #Rows = 250.747 )
                             Estim. CPU-Costs = 82.670.583 Estim. IO-Costs = 1.759
                             Filter Predicates
                           7 INDEX FAST FULL SCAN AUFK~HOB
                             ( Estim. Costs = 1.941 , Estim. #Rows = 250.747 )
                             Estim. CPU-Costs = 76.894.097 Estim. IO-Costs = 1.936

               12 TABLE ACCESS BY INDEX ROWID AFKO
                  ( Estim. Costs = 1.456 , Estim. #Rows = 88 )
                  Estim. CPU-Costs = 10.379.050 Estim. IO-Costs = 1.456

                   11 INDEX SKIP SCAN AFKO~P
                      ( Estim. Costs = 1.448 , Estim. #Rows = 88 )
                      Search Columns: 1
                      Estim. CPU-Costs = 10.309.628 Estim. IO-Costs = 1.447
                      Access Predicates Filter Predicates


I immediately looked at the code snippet and found a single SELECT statement which looked quite innocent: it just selected an order number from its notification number over a view. However, the view was custom, so I decided to have a look. It joined some of the heaviest tables in a manufacturing system like ours, QMEL and AUFK, via COBRB and AFKO.
That in itself isn't much of a problem, but I was wondering why there were hash joins and skip scans rather than nested loops and unique scans. I also really wanted to avoid the index fast full scans.
It turned out that the view didn't have the client specified. So the primary key was not unique, making the statement look for entries across a lot of clients, most of which didn't have the data I needed.
So I simply added the client (MANDT) to the list of fields. And hey presto, my index skip scans were gone...
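The effect is easy to reproduce outside SAP. Here is a minimal SQLite sketch of the principle; the table and column names (QMEL, MANDT, QMNUM, AUFNR) mirror SAP's, but the data and the view names are invented for illustration:

```python
# Minimal SQLite sketch; QMEL/MANDT/QMNUM/AUFNR mirror the SAP names,
# but the data and view names are invented for illustration.
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# QMEL's primary key is (MANDT, QMNUM) -- the client is part of the key.
cur.execute("""CREATE TABLE qmel (
    mandt TEXT, qmnum TEXT, aufnr TEXT,
    PRIMARY KEY (mandt, qmnum))""")

# The same notification number can exist in several clients.
cur.executemany("INSERT INTO qmel VALUES (?, ?, ?)", [
    ("000", "10000001", "4711"),
    ("100", "10000001", "4712"),
    ("200", "10000001", "4713"),
])

# A view without the client: a lookup by notification number alone has to
# consider every client's rows -- the key is no longer unique.
cur.execute("CREATE VIEW v_no_client AS SELECT qmnum, aufnr FROM qmel")
rows_all = cur.execute(
    "SELECT aufnr FROM v_no_client WHERE qmnum = '10000001'").fetchall()
print(len(rows_all))  # 3 -- one hit per client

# With MANDT in the field list, the predicate covers the full primary key
# and the lookup is unique again.
cur.execute("CREATE VIEW v_client AS SELECT mandt, qmnum, aufnr FROM qmel")
rows_one = cur.execute(
    "SELECT aufnr FROM v_client WHERE mandt = '100' AND qmnum = '10000001'"
).fetchall()
print(len(rows_one))  # 1
```

The unique lookup in the second query is what turns an index skip scan back into a unique scan on the real QMEL~0 index.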




However, I still had a lot of index joins and hash joins I didn't like the looks (or costs) of. So I reasoned that the join conditions in the view were equally unable to decide which client's data to use, and I decided to extend the client number throughout the join conditions, thusly:
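A sketch of what extending the client through the join buys you, again in SQLite with invented data (the real change was of course made in the view definition in SAP):

```python
# SQLite sketch of why the client belongs in the join condition too; the
# table names mirror SAP's, the data is invented.
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("""CREATE TABLE qmel (mandt TEXT, qmnum TEXT, aufnr TEXT,
               PRIMARY KEY (mandt, qmnum))""")
cur.execute("""CREATE TABLE aufk (mandt TEXT, aufnr TEXT, auart TEXT,
               PRIMARY KEY (mandt, aufnr))""")

cur.execute("INSERT INTO qmel VALUES ('100', '10000001', '4711')")
# Order 4711 exists in two clients.
cur.executemany("INSERT INTO aufk VALUES (?, ?, ?)", [
    ("100", "4711", "ZREP"),
    ("200", "4711", "ZREP"),
])

# Join on the order number alone: rows from the wrong client match as well,
# which forces the optimizer to consider far more data than needed.
loose = cur.execute("""
    SELECT a.mandt FROM qmel q JOIN aufk a ON q.aufnr = a.aufnr
    WHERE q.mandt = '100' AND q.qmnum = '10000001'""").fetchall()
print(len(loose))  # 2 -- client 200's order matches too

# Extend the client through the join condition: the full primary key of
# AUFK is matched and only the correct client's row survives.
tight = cur.execute("""
    SELECT a.mandt FROM qmel q JOIN aufk a
    ON q.mandt = a.mandt AND q.aufnr = a.aufnr
    WHERE q.mandt = '100' AND q.qmnum = '10000001'""").fetchall()
print(len(tight))  # 1
```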


I had high hopes, and they were fulfilled. My execution plan now looked a lot better: the hash joins were gone, replaced with normal nested loops, and the index join was gone entirely. I was in fact rather happy.


SELECT STATEMENT ( Estimated Costs = 6 , Estimated #Rows = 1 )

       13 COUNT STOPKEY
          Filter Predicates

           12 NESTED LOOPS

               10 NESTED LOOPS
                  ( Estim. Costs = 5 , Estim. #Rows = 1 )
                  Estim. CPU-Costs = 45.471 Estim. IO-Costs = 5

                   8 NESTED LOOPS
                     ( Estim. Costs = 4 , Estim. #Rows = 3 )
                     Estim. CPU-Costs = 35.473 Estim. IO-Costs = 4

                       5 NESTED LOOPS
                         ( Estim. Costs = 2 , Estim. #Rows = 3 )
                         Estim. CPU-Costs = 21.185 Estim. IO-Costs = 2

                           2 TABLE ACCESS BY INDEX ROWID QMEL
                             ( Estim. Costs = 1 , Estim. #Rows = 1 )
                             Estim. CPU-Costs = 4.877 Estim. IO-Costs = 1

                               1 INDEX UNIQUE SCAN QMEL~0
                                 Search Columns: 2
                                 Estim. CPU-Costs = 3.059 Estim. IO-Costs = 0
                                 Access Predicates

                           4 TABLE ACCESS BY INDEX ROWID COBRB
                             ( Estim. Costs = 2 , Estim. #Rows = 3 )
                             Estim. CPU-Costs = 16.308 Estim. IO-Costs = 2
                             Filter Predicates

                               3 INDEX RANGE SCAN COBRB~1
                                 Search Columns: 2
                                 Estim. CPU-Costs = 3.899 Estim. IO-Costs = 0
                                 Access Predicates

                       7 TABLE ACCESS BY INDEX ROWID AUFK
                         ( Estim. Costs = 1 , Estim. #Rows = 1 )
                         Estim. CPU-Costs = 4.763 Estim. IO-Costs = 1
                         Filter Predicates

                           6 INDEX RANGE SCAN AUFK~HOB
                             Search Columns: 2
                             Estim. CPU-Costs = 3.059 Estim. IO-Costs = 0
                             Access Predicates


In fact, the estimated costs were now roughly 1/1000th of the original. Translated into user time, this meant a decrease from 90 seconds of wait time to less than one second for this one SELECT statement.

The moral of the story? Remember that you're working in a client, and that the client number is part of the key. Regardless of whether you have 1 production client, 500 QA clients, or anything in between, there are still a few clients reserved for SAP use (000, 001 and 066). So for a unique key you HAVE to include the client in your views when the client is part of the key of the tables you are creating the view over.

Tuesday, March 5, 2013

What is "NetWeaver"?

I recently started a new job. As part of my first assignments I was asked to participate in the "installation of NetWeaver", and I was puzzled. Very puzzled, in fact, because in my mind the first NetWeaver releases came out in 2004, and I was certain the company wasn't THAT far behind in their release strategy.

So I googled "netweaver" and looked at my result list. Apparently, "what is netweaver" ranks very high among the searches. So I surmised that many people who do not work with SAP daily might not know what NetWeaver is. To avoid further confusion, I created a PowerPoint presentation explaining what NetWeaver is.

From SAP's perspective it's presented as such:
In this picture, the Application Platform is the ABAP and Java stack systems that are usually part of the business suite systems; the Process Integration is (more or less) the PI system; the Information Integration is a slew of systems and technologies to present stored data differently, like BI, TREX etc.; and the People Integration is all of the "user-generated data" technology, like mobility, the portal etc.

The original idea from SAP was to create a technology with which you could seamlessly move data between systems, SAP or not. Not unlike what was later released as the XI (now PI) system. But in reality, the entire business suite is today part of the NetWeaver concept: anything newer than 4.7 will be "NetWeaver enabled".
If you think of the SAP Business Suite as an office pack, where ERP, SRM etc. are the products (like Word, Excel etc.), then NetWeaver is the API/language you use to communicate between the products. External tools can do this as well; most frequently you will find people describing .NET and WebSphere as such tools, but in reality many tools exist that support NetWeaver integration.

It ended up with one of my colleagues handing me the installation guide for the software they were talking about. It was for "NetWeaver Application Server Java 7.3"... So the confusion was apparently caused by SAP using one of their buzzwords just a slight bit too much... Again :D