Wednesday, February 27, 2013

MSSQL error 14262 on a SAP CCMS job


I got some weird calls on this error in my system log on several SAP systems:


BY  2 Database error 14262 at EXE
BY  0 > [Microsoft][SQL Native Client][SQL Server]The specified
BY  0 > @job_name ('SAP CCMS Check Database XXX
BY  0 > [20120102153140-6-060000]') does not exist.

Everything worked, so no one really bothered to fix the problem until it started to wake up too many people at really odd hours (one of those people being me, and I like to sleep at night), so I decided to investigate.
After a lot of searching and talking to a lot of people, I found out that this job is apparently run as part of the DB checks that DBA Cockpit needs to do, one of the DBCC jobs. The downside seems to be that a new job is scheduled every once in a while (at least once a year until you implement a fix for the ccms_check_db_hist_YYYY.txt file issue), and the old jobs don't seem to "go away" from MSSQL even though the job has been deleted from within SAP.

So I investigated a little, and SAP has a solution in note 1413688. But creating and running huge scripts on our databases seems like a very poorly thought out solution to what clearly must be a simple problem (namely that the job isn't deleted correctly).
So I went in search of the cause of the error. In MSSQL server management, I did find a lot of interesting jobs:



Lo and behold, there was a job that matched my error, as I expected. Now, the job was disabled (by our very competent sysadmin, who also created a nice, shiny new job that did all the work it was supposed to do, which is why everyone ignored the error), so I was wondering why the old job showed up in my system log. More interesting still: why didn't the other disabled jobs do the same?
After a cursory examination of the jobs on a few systems with the same problem, I found a commonality: apparently the job owner was set to a user that didn't have ora_dba privileges (!)
So I changed the owner in the job properties, and hey presto, problem gone.







Now I completely understand that ideally, the jobs shouldn't even be checked from within SAP, and of course they should be deleted from the DB when they're deleted within SAP, but this way fixed my problem in a hurry, and without invoking any huge alterations in my system, so I was happy. Today I'll enjoy some nice weather, and I'll spend another day wondering who changes the ownership of a running database, because I know that question will make me quite unhappy...

Monday, February 18, 2013

automating SAP/Oracle backups on windows systems

The other day I was looking at a windows/Oracle flavor that had to get a new backup system up and running.
While I normally wouldn't want to run Oracle on windows, I nevertheless had no choice in the matter here.

So my first step would be to ensure that I don't start a backup if one is already running. On a unix system I'd have to either create a lockfile that tells me the backup is running, or do some serious grepping in the process list, because one thing unix is good at is running a lot of databases on one huge server. Unfortunately, windows isn't very good at that. So I decided to use that in my favor.
I noticed that each system only had one database instance, which meant that I could simply look for the presence of my executable in the tasklist and skip all of my steps if I "got lucky" (ok, so I really should return an error message somewhere, but I'll fix that later). With brbackup it got as simple as this:

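REM First determine whether a backup is running - I shouldn't start a backup if one has already started.
REM Only stderr of tasklist goes to NUL; its stdout has to reach find through the pipe.
tasklist /FI "IMAGENAME eq brbackup.exe" 2>NUL | find /I /N "brbackup.exe">NUL
REM find sets ERRORLEVEL to 0 when brbackup.exe shows up in the task list
if %ERRORLEVEL%==0 goto :errormsg
REM if we reach this point, brbackup.exe is not running and we can continue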
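
For comparison, the lockfile route I mentioned for unix could look roughly like the sketch below. This is not what I run anywhere; the lock path and the function names acquire_lock and release_lock are made up for illustration, and run_backup in the usage comment is a placeholder for the real backup call. It relies on mkdir being atomic, so only one caller can create the lock directory.

```shell
#!/bin/sh
# Hypothetical lock location (an assumption); pick a directory the backup user can write to.
LOCKDIR=${LOCKDIR:-/tmp/brbackup.lock}

# acquire_lock: succeeds only for the first caller, because mkdir fails
# if the directory already exists.
acquire_lock() {
    mkdir "$LOCKDIR" 2>/dev/null
}

# release_lock: remove the lock directory so the next backup can start.
release_lock() {
    rmdir "$LOCKDIR"
}

# Usage sketch (run_backup is a placeholder, not a real command):
#   if acquire_lock; then
#       trap release_lock EXIT
#       run_backup
#   else
#       echo "backup already running" >&2
#       exit 1
#   fi
```

One caveat with this approach: if the script is killed hard, the lock stays behind and has to be removed by hand, which is exactly the kind of bookkeeping the tasklist check avoids.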

And now I'm off to figure out how to implement the "/ as sysdba" privilege when my script is being called by a service. Because services are definitely one thing windows excels at being bad at ;)

Tuesday, February 12, 2013

Sometimes windows is a funny thing

Once in a while I run across a few oddities in my daily tasks; I think most of us do. This was one of them.
A filesystem with the SAP workfiles was full, and I had to do some cleanup to get things going. Just emptying the recycle bin didn't really cut it, so I had to delete some files.
Going to the workdir and selecting the oldest files will usually net me a couple of outdated files I can delete, so I tried to delete these and got this funny error from windows:


  

Apparently I need to delete some files in order to be able to delete files ;)
Fortunately the fix is easy: the error actually means that I cannot send the selected files to the recycle bin. So simply pressing Shift+Delete will work.
Meanwhile I'm still working on some automated cleanup tasks. It would seem there are still systems around here that need it. And now I have even more reason to laugh at Microsoft's error messages.

Monday, February 4, 2013

finding contents of hidden unix files

I've never really gotten the hang of which environment file contains what when fixing environment variables on a unix system. Well, I do know, I just spend way too much time doing this stuff, because there's always a lot of places I miss.

Normally I can simply grep for the contents of a file, but grepping for the contents of a hidden file is slightly more tricky. Selecting .* doesn't really seem to cut it with the find or grep commands. Luckily I found a way to cheat it:

grep 'string' ./.*

This cheats grep into searching the local directory . for files whose names match .*
So now I can just grep for my variable name and find all occurrences in the profile files without actually having to think about it.
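
Depending on the shell, ./.* can also expand to the directory entries . and .., which grep will then complain about. A small wrapper that only looks at regular hidden files might look like the sketch below. The function name grep_dotfiles is my own invention, and ORACLE_SID in the usage comment is just an example variable name.

```shell
#!/bin/sh
# grep_dotfiles PATTERN [DIR]
# Search the regular hidden files directly inside DIR (default: the current
# directory), skipping "." and ".." and any hidden directories.
# Returns 0 if at least one file matched, 1 otherwise (like grep itself).
grep_dotfiles() {
    pattern=$1
    dir=${2:-.}
    found=1
    for f in "$dir"/.*; do
        # Only regular files; this filters out ".", ".." and hidden directories.
        [ -f "$f" ] || continue
        # The extra /dev/null argument makes grep print the file name
        # even though only one real file is searched per call.
        if grep -- "$pattern" "$f" /dev/null; then
            found=0
        fi
    done
    return "$found"
}

# Usage sketch:
#   grep_dotfiles ORACLE_SID "$HOME"
```

Each matching line is printed with the file name in front, so it is easy to see which profile file sets the variable.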