
“A good DBA may relax and put his legs on the table.” A senior DBA (senior in age as well as in experience) used to say these words while he was teaching me the basics of database administration. He meant that organizing your work and preparing yourself for future catastrophes is the most important part of the job. Only then can you confidently face the surprises that Oracle software and the organisation you work for come up with.

But as time passes, I have come to disagree with this attitude. In my opinion a good DBA always has work to do. But it’s not always easy to convince your manager that you are busy as hell. What ARE you doing all day long, or rather, what meaningful proactive contributions can you make to keep the business online? And how can you make that visible to your manager? I’m convinced that in most organisations the attitude towards DBAs is quite respectful, but that was not and is not always the case, and this post is for those who continuously struggle to explain what a DBA does or is supposed to do.

In this post I’ll try to summarize the deliverables of a DBA as a kind of checklist. I thought about which framework to base this list on. The candidates included ITIL, ASL (Application Services Library), ISM (Integrated Service Management), COBIT (Control Objectives for Information and related Technology) and more. I decided to keep it simple and use a lot of ITIL (v2, because I’m lost with v3):

Service Support

  • Incident Management
  • Configuration Management
  • Problem Management
  • Change Management
  • Release Management

Service Delivery

  • Continuity Management
  • Availability Management
  • Service Level Management
  • Capacity Management
  • Financial Management

 

As a drawing, from the service-desk perspective:
[Figure: Service-desk-itil]
Let’s take a glance at Service Support:
Incident Management:

Goal: “Restore normal service operation as quickly as possible and minimize the adverse impact on business operations, thus ensuring that the best possible levels of service quality and availability are maintained”

What can and must a DBA do here? Most of it is quite reactive, and the goal is obvious: fix it. Incidents come from things like a database being down, a slow database, job failures, authorisation failures, storage problems and so on. And this isn’t just about production. Production, however, may run 24/7, and an incident there always has priority 1 and escalates quickly. The other environments need attention for other reasons: developers and testers of all kinds are usually quite creative in messing things up.

In the meantime, while the DBA is desperately trying to get the database (and the application software) up and running, a manager calls every 10 minutes to ask when it will be fixed. After that, the manager above him calls.
And hopefully the organisation is well enough prepared that the DBA can log in remotely through a reasonable VPN connection and laptop.
And the ‘A’ of Administrator always kicks in: searching for a solution when the problem isn’t that obvious (Google, My Oracle Support), updating the call in the not so intuitive ticketing system, logging your actions, and writing a report for the managers on how to avoid such a thing in the future.
And then, after saving the day for the company at night, appearing happy at work for the daily routine.

Configuration Management

Goal: “Provide accurate information on configurations and their documentation to support all the other Service Management processes”

The first goal of a DBA is to make an inventory of which databases belong to his/her responsibility: the scope. Call it asset management. What versions do they have? How does the test database differ from the production database? Who has access to those databases? What patches have been applied to these databases and servers? A lot of these questions can easily be answered with Enterprise Manager Grid / Cloud Control, but when they cannot, the ‘A’ of DBA is very important.
And what about password management, access to My Oracle Support, documentation, DBA scripts, version documentation of the Oracle software that is running, and your own scripts?
In my opinion it’s very important to know the stakeholders of the database and to communicate clearly what the configuration status of the database and application is. That way the chance of surprises (changes, lack of documentation etc.) decreases.
License documentation also belongs to this process. When does the support contract expire, can we do it cheaper, should we buy more or less in the future, and who maintains contact with Oracle’s LMS (License Management Services)?

Problem Management:

Goal: “Minimize the adverse impact of Incidents and Problems on the business that are caused by errors within the IT infrastructure, and to prevent recurrence of Incidents related to these errors”

Investigating the root cause of incidents is part of the job, but so is responding to errors and warnings in the log files or in Enterprise Manager. You can be very busy investigating something that has never been the cause of an outage… yet.
Speaking of outages: you are expected to write a post-mortem with your analysis and your advice on how to prevent a recurrence.
If you’re lucky, you may be in close contact along the way with some engineers of My Oracle Support :-) to try to solve the problem together. That may take a while, and it’s an art to gently steer them in the right direction.
While monitoring, you may see trends in the behaviour of the database / application server and report them, accompanied by your advice, to the stakeholders so they can take action.

Change Management / Project Management:

Goal: “Ensure that standardized methods and procedures are used for efficient and prompt handling of all Changes, in order to minimize the impact of Change related incidents upon service quality, and consequently to improve the day-to-day operations of the organization”

The most important part of change management is to get involved in major or minor changes BEFORE they are authorized by a Change Advisory Board with hardly any knowledge of databases, data modelling or, let’s say, WebLogic.
This involvement can take any form and can even be outsourced to a service level manager. The point is to know what’s going on and when the sh** may hit you.
Your assignment is to provide risk analyses of upcoming changes, and even to give opinions on Return on Investment. Get involved in the test plans, the communication, the backup before the change, the rollback when the change fails, the execution, the moment at which the change is considered successful, documentation, scripting…

Release Management:

Goal: “Design and implement efficient procedures for the distribution and installation of Changes to I.T. Systems”

As stated under change management, you should get involved in the changes as much as possible, which implies that you may be very busy with them. So give the initiators of the change the right standards and procedures in advance; it will help the mutual understanding.
Another task: QA on the documentation of new releases, judging it against your standards. You have to prepare the upgrade / release, test it in advance, and execute the release / upgrade at an insane time.

What about Service Delivery?

Continuity Management:

Goal: “Support the overall Business Continuity Management process by ensuring that the required I.T. technical and service facilities (including computer systems, networks, applications, technical support and Service Desk) can be recovered within required, and agreed, business timescales”

The DBA is the centre guard of continuity. He/she can design and describe disaster recovery scenarios for events such as fire, earthquake, flood (e.g. leaking roofs: it happens!), power failure, the results of actions by a DBA with a hangover, and so on.
Another task is installing the configurations for testing the disaster recovery scenarios. The configurations likely differ per application, and so do the scenarios. Some organisations test the scenarios once a year in production.

Availability Management:

Goal: “Understand the Availability requirements of the business and plan, measure, monitor and continuously improve the Availability of the I.T. Infrastructure, services and supporting organization to ensure that these requirements are met consistently”

This process seems a bit like continuity, but availability management points to three metrics:
- MTTR (Mean Time To Repair)
- MTBF (Mean Time Between Failures)
- MTBSI (Mean Time Between System Incidents)
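To make these three metrics a bit more tangible, here is a minimal Python sketch of how they could be derived from a list of outages. It assumes the common ITIL v2 reading (MTTR is the average downtime per incident, MTBF the average uptime between incidents, MTBSI the average time from one incident to the next), and it simplifies by treating the reporting period as one cycle per incident. The function name and the input format are my own invention, not something from a tool.

```python
from datetime import datetime, timedelta


def availability_metrics(period_start, period_end, outages):
    """Derive MTTR, MTBF, MTBSI and availability for one reporting period.

    outages: list of (failure_time, restore_time) tuples that do not
    overlap and fall inside the period (assumed input format).
    """
    count = len(outages)
    if count == 0:
        raise ValueError("no outages in the period; the metrics are undefined")

    total = period_end - period_start
    downtime = sum((restored - failed for failed, restored in outages), timedelta())
    uptime = total - downtime

    return {
        "MTTR": downtime / count,        # average time to repair
        "MTBF": uptime / count,          # average uptime per failure cycle
        "MTBSI": total / count,          # average time from incident to incident
        "availability": uptime / total,  # fraction of the period in service
    }


if __name__ == "__main__":
    # Hypothetical month with three outages of two hours each.
    start = datetime(2024, 1, 1)
    end = start + timedelta(days=30)
    outages = [
        (start + timedelta(days=d), start + timedelta(days=d, hours=2))
        for d in (3, 12, 25)
    ]
    for name, value in availability_metrics(start, end, outages).items():
        print(name, value)
```

Numbers like these are only as good as the incident registration behind them, so the ticketing discipline from Incident Management pays off here.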
You should know about the possibilities and configurations of products like Data Guard, RAC and Enterprise Manager, but also about concepts such as SAN, NAS and active-passive setups, and about security issues in relation to the SLA of the applications.
A big part of the work is implementing backups and, more importantly, making sure you can recover from these backups in all kinds of scenarios.
Claim a substantial amount of time and resources to test your various backups!

Service Level Management:

Goal: “Maintain and improve I.T. Service quality, through a constant cycle of agreeing, monitoring and reporting upon I.T. Service achievements and instigation of actions to eradicate poor service – in line with business or cost justification”

Be annoying: get involved in the drafting of Service Level Agreements and Operational Level Agreements. From the customer’s perspective: be curious about the results of the satisfaction surveys (if any are taken).
Put together a yearly schedule of actions to be taken every month / quarter. Report the outcome to the managers so they know that you are doing something for your money.
Report your short- and long-term vision on database, platform or development issues, so you are in control beforehand.

Capacity Management:

Goal:  “Ensure that cost-justifiable I.T. capacity always exists and that it is matched to the current and future needs of the business”

Get really familiar with tools like Enterprise Manager and use their reports on the capacity used (CPU, disks, I/O etc.) and their predictions. Where possible, tune the databases / application servers. This can take a substantial amount of time.
Know the scope of your area and the hardware your stuff is running on, in case a manager walks in with a management solution (more hardware, Exadata, Exalogic). No problem, but you will have to know the alternatives.

By the way: a substantial amount of capacity is used by backups, depending on the way they are configured. A full backup every day? Compressed or uncompressed? Think about it in relation to your capacity management.
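To get a feeling for how much the backup strategy matters, a rough back-of-the-envelope estimate can be sketched in Python. Everything here is an assumption for illustration: the function name, the 5% daily change rate used for incremental backups, and the idea that compression can be expressed as a single ratio. Real figures should come from your own backup reports.

```python
import math


def backup_storage_gb(db_size_gb, retention_days, full_every_days=1,
                      daily_change_rate=0.05, compression_ratio=1.0):
    """Estimate the storage needed to keep `retention_days` of backups.

    One full backup is taken every `full_every_days` days; on the other
    days an incremental backup of roughly `daily_change_rate` of the
    database is assumed. `compression_ratio` of 2.0 means backups shrink
    to half their size.
    """
    fulls = math.ceil(retention_days / full_every_days)
    incrementals = retention_days - fulls
    raw_gb = fulls * db_size_gb + incrementals * db_size_gb * daily_change_rate
    return raw_gb / compression_ratio


if __name__ == "__main__":
    # Hypothetical 1 TB database with four weeks of retention.
    print("daily full:           ", backup_storage_gb(1000, 28))
    print("weekly full:          ", backup_storage_gb(1000, 28, full_every_days=7))
    print("weekly full, 2:1 zip: ", backup_storage_gb(1000, 28, full_every_days=7,
                                                      compression_ratio=2.0))
```

Even with made-up numbers the difference between the strategies is large enough to be worth a conversation with whoever pays for the storage.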

The same goes for log and audit files. Are there cleanup scripts, and is there a requirement or legal obligation to keep the files for years?

Financial Management:

Goal: “Provide cost-effective stewardship of the I.T. assets and resources used in providing I.T. Services”

In this era the software costs, especially Oracle licenses, undoubtedly exceed the hardware costs. So whoever is responsible for the databases should be aware of the alternatives, in relation to the guidelines, principles and standards of the company.
Should the following be considered in your organisation?
  • Server consolidation
  • Use of Oracle Standard Edition (One), or even XE. E.g. in the development environment.
  • Colocation of your servers
  • Database, Software or Infrastructure ‘as a Service’ in the cloud
  • Use of Oracle VM instead of VMware
  • (Oracle) Linux instead of Windows
  • Open Source software
One requirement is that the company knows the costs and licensing per customer at this point, and this is not always the case. Terms like TCO, ROI, accounting and charging do not always mean much to a database person, but there are specialists in your company who are familiar with them and can explain them to you in a management summary (so everybody understands it…).
Mentioned separately: the customer and the supplier need attention!

Customer Relationship Management:

Know who your customers are. Be visible to them, if possible by visiting them, or contact them regularly by mail or through reports. Perhaps organise a yearly DBA day with a look into the future, or publish DBA newsflashes. It’s the first step towards gaining your customers’ understanding when something goes wrong in the future!

Supplier Management:

Publish / organise a so-called ‘engagement model’ with Oracle that defines who talks to whom at the operational, strategic and management levels. Let them visit you once in a while to discuss incidents, problems and what the role of the supplier can be in this kind of business.

Attend the suppliers’ workshops, for technical reasons but also for your network.

Get the most out of your support contract: Oracle (Support) can be of more use than you think, at no extra cost at all!

So in the end, which deliverables may be expected?

Deliverables

  • A stable, secure and resilient infrastructure
  • A log or database of all operational events, alerts and alarms
  • A set of operational scripts
  • A resilience and fail-over testing schedule
  • A set of operational work schedules
  • A set of operational management tools
  • Management reports and information
  • Exception reviews and reports
  • Review and audit reports
  • A secure Operational Document Library

But that’s a lot of work for one guy/girl!
And that is where the manager, priorities and time management may kick in.

Time management

This may help to prioritise the things to be done:

The Matrix:

[Figure: time management matrix]

 

Quadrant I – Activities that are Important and Urgent e.g. Incident Management
Quadrant II – Activities that are Important but not Urgent e.g. Configuration Management
Quadrant III – Activities that are not Important but Urgent
Quadrant IV – Activities that are not Important and not Urgent
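The four quadrants above can be sketched as a tiny Python helper that sorts a to-do list by quadrant. The two example placements come from the list above; the function names and the task tuple format are just assumptions for this illustration.

```python
QUADRANT_ORDER = ["I", "II", "III", "IV"]


def quadrant(important, urgent):
    """Return the quadrant of the time management matrix for a task."""
    if important and urgent:
        return "I"
    if important:
        return "II"
    if urgent:
        return "III"
    return "IV"


def prioritise(tasks):
    """Sort (name, important, urgent) tuples by quadrant, I first."""
    labelled = [(quadrant(important, urgent), name)
                for name, important, urgent in tasks]
    return sorted(labelled, key=lambda item: QUADRANT_ORDER.index(item[0]))


if __name__ == "__main__":
    todo = [
        ("Configuration Management", True, False),
        ("Incident Management", True, True),
    ]
    for label, name in prioritise(todo):
        print(f"Quadrant {label}: {name}")
```

The hard part is of course not the sorting but agreeing with your manager on what counts as important.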

The task of the manager

1. Rate each focus area
2. Rate the quality of each deliverable
3. Decide what level you want to reach
4. Determine how much work is involved
5. Determine how many DBAs you need
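The five steps above could be turned into a simple calculation, roughly like the Python sketch below. It is only an illustration under loud assumptions: the rating scale, the hours needed to raise a focus area by one level, and the productive hours per DBA per year are all numbers you would have to estimate yourself, and the function name is hypothetical.

```python
import math


def dbas_needed(focus_areas, productive_hours_per_dba):
    """Estimate the total work and the number of DBAs required.

    focus_areas maps a focus area name to a tuple of
    (current_level, target_level, hours_per_level_step), all of which
    are the manager's own estimates.
    """
    total_hours = 0
    for current, target, hours_per_step in focus_areas.values():
        steps = max(target - current, 0)  # no work if the target is already met
        total_hours += steps * hours_per_step
    return total_hours, math.ceil(total_hours / productive_hours_per_dba)


if __name__ == "__main__":
    # Hypothetical ratings on a 1-5 scale with estimated effort per step.
    areas = {
        "Incident Management": (3, 4, 400),
        "Configuration Management": (1, 3, 600),
        "Capacity Management": (2, 2, 300),
    }
    hours, headcount = dbas_needed(areas, productive_hours_per_dba=1400)
    print(f"{hours} hours of work, so about {headcount} DBA(s)")
```

This only covers the improvement work; the daily routine and the night shifts come on top of it.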

And if you managed to read this far, you are truly interested or truly desperate. I hope it’s useful for someone.
