The Windows Operating System (OS) is something a DBA should know and be familiar with since SQL Server has to run on top of it. I would say that on average most DBAs interact with the OS for troubleshooting purposes. In this post I just want to point out a few snippets of how PowerShell can help you do this type of work.

Services Console Manager

In the SQL Server 2000 days DBAs became very familiar with typing in “services.msc” in the run prompt. Scrolling through the list of services to find out what state it is, or what the login is configured for with a particular service. Now, if you are performing administrative tasks against SQL Server services it is always advised that you use SQL Server Configuration Manager. However, if you are looking to check the status of the service or performing a restart of just the service, PowerShell can help out.

Get-Service

This cmdlet has a few discrepancies that it can help to understand upfront when you start using PowerShell instead of the Services Console. In the Services Console you find the service by the “Name”, this is the “DisplayName in the Get-Service cmdlet. The “Name” in Get-Service is actually the “Service Name” in the Service Console, do you follow? OK. So with SQL Server the DisplayName for a default instance would be “SQL Server (MSSQLSERVER)”, and the “Name” would be “mssqlserver”. This cmdlet allows you to filter by either field so the below two commands will return the same thing:

Get-Service 'SQL Server (MSSQLSERVER)'

Get-Service mssqlserver

You can obviously see which one is easier to type right off. So with SQL Server you will likely know that a default instance’s name would be queried using “mssqlserver”, and a named instance would be “mssql$myinstance”. So if you wanted to find all of the instances running on a server you could use this one-liner:

Get-Service mssql*

Restart-Service

This does exactly what you think it will, so you have to be careful. You can call this cmdlet by itself and restart a service by referencing the “name” just as you did with Get-Service. I want to show you how the pipeline can work for you in this situation. You will find some cmdlets in PowerShell that have a few “special” features. The service cmdlets are included in this category, they allow an array as an input object to the cmdlet for the property or via the pipeline.

So, let’s use the example that I have a server with multiple instances of SQL Server, and all the additional components like SSRS and SSIS. I only want to work with the named instance “SQL12″. I can get the status of all component services with this command:

The added “-WhatIf” will not actually perform the operation but tell you what it would end up doing. Once I remove that the restart would actually occur. All of this would look something like this in the console:

Win32_Service

Some of you may recognize this one as a WMI class, and it is. Using WMI offers you a bit more information than the Get-Service cmdlet. You can see that by just running this code:

One specific thing I did in this function is declaring the type of parameter you pass into this function. When you use “[string[]]”, it means the parameter accepts an array or multiple objects. You can set your variable to do this, but you also have to ensure the function is written in a manner that can process the array. I did this simply by wrapping the commands into a “foreach” loop. So an example use of this against a single server would be:
If you wanted to run this against multiple servers it would go something like this:

Every DBA should be very familiar with this management console and can probably get to it blind folded. You might use this or “My Computer” when you need to see how much free space there is on a drive. If you happen to be working in an environment that only has Window Server 2012 and Windows 8 or higher, wish I was there with you. PowerShell 4.0 and higher offers storage cmdlets that let you get information about your disk and volume much easier, and cleaner. They actually use CIM (Common Information Model), which is what WMI is built upon. I read somewhere that basically “WMI is just Microsoft’s way of implementing CIM”. They are obviously going back to the standard, as they have done with other areas. It is worth learning more about, and it actually allows you to connect to a PowerShell 2.0 machine to get the same amount of information.

Anyway back to the task at hand. If you are working on PowerShell 3.0 or lower you can use Get-WmiObject and win32_Volume to get similar information that the storage cmdlet Get-Volume returns in 4.0:

Almost everyone is familiar with and knows their way around the Windows Event Viewer. I actually left this last for a reason. I want to walk you through an example that I think will help “put it all together” on what PowerShell can do for you. Our scenario is dealing with a server that had an unexpected restart, at least for me. There are times that I will get paged by our Avail Monitoring product for a customer’s site, and I need to find out who or why the server restarted. The most common place you are going to go for this will be the Event Log.

Show-EventLog

If you just want to go through Event Viewer and manually find events, and it is a remote server, I find this to be the quickest method:

Show-EventLog -ComputerName Server1

This command will open Event Viewer and go through the process of connecting you to “Server1″. No more right-clicking and selecting “connect to another computer”!

Get-EventLog

I prefer to just dig into searching for events, this is where Get-EventLog comes in handy. You can call this cmdlet and provide:

Specific Log to look in (system, application, or security most commonly)

Specify a time range

Look just for specific entry type (error, information, warning, etc.)

In Windows Server 2003 Microsoft added a group policy “Shutdown Event Tracker” that if enabled writes particular events to the System Log when a server restarts, either planned or unplanned. In an unplanned event the first user that logs into the server will get a prompt about the unexpected shutdown. When you are dealing with planned, they are prompted for a similar prompt for restart and it has to be filled in before the restart will occur. What you can do with this cmdlet is search for those messages in the System Log.

In this instance I find that SSIS and SSRS did not start back up and failed to start. I found this because I checked the status of the services for SQL Server using my custom function, Get-SQLServiceStatus:

To search for events after the shutdown I need to find the first event that is written to the Event Log when a server starts up, the EventLog source. I can then use that time stamp as a starting point to search for messages on the SQL Server services that did not start up correctly. I just need the time stamp of the event and pass that into the Get-EventLog cmdlet to pull up error events. I am going to do that with this bit of code:

I hope you found this post useful and it gets you excited about digging deeper into PowerShell. In the next post I am going to close up the series digging into SQL Server and a few areas where PowerShell can help.

As with any database administration strategy, management of Redshift requires setting and revoking permissions. When first setting up permissions, it is ideal to establish groups as the basic unit of user access. This keeps us from having to manage hundreds of permissions as users enter and leave organizations. If you haven’t done this early on and are managing permissions on the user level, you can leverage some of the queries below to make the migration to group based access easy.

Another advantage we see in managing by groups is for some data warehouses we want to exclude users from running reports during ETL runs in order to prevent contention or reporting on incomplete data. All we have to do is run this query at the start of the jobs:

revoke usage on schema myreportschema from group report_group;

When the jobs are finished, we then grant usage again:

grant usage on schema myreportschema to group report_group;

It is easy to see users and group assignments via (note that a user can belong to more than one group):

select usesysid, usename, usesuper, nvl(groname,'default')
from pg_user u
left join pg_group g on ','||array_to_string(grolist,',')||',' like '%,'||cast(usesysid as varchar(10))||',%'
order by 3,2;

Grants in Redshift are ultimately at the object level. That is to say while we can grant access to all existing objects within a schemas, those grants are stored at an object level. That is why issuing this command works for all existing tables, but tables added after this command that have been run do not automatically have the select privilege granted:

grant select on all tables in schema myreportschema to group report_group;

While this is good for security granularity, it can be administratively painful. Let us examine some strategies and tools for addressing this issue.

Redshift has the useful view, information_schema.table_privileges, that lists tables and their permissions for both users and groups. Note that this also includes views despite the name. AWS also provides some useful views in their Redshift utilities package in Github, most notably v_get_obj_priv_by_user which essentially flattens out information_schema.table_privileges and makes it much easier to read:

One of the biggest challenges is to fill in missing grants. We can do this by modifying the above query. Here’s an example where we create the grant statements for all missing select grants for the report_group:

If you are an Oracle DBA, then it is quite likely that Oracle ASM is used as the storage management for at least some of the databases you manage.

Eventually you will want to see ASM metrics that can be used to track the performance of the storage over time. There are built-in data dictionary views that allow monitoring ASM IO performance at the database, instance, diskgroup and disk level. These are current time metric only however, so they must be collected and saved to be of much use.

Point your browser at your favorite search engine and search for the terms Oracle, ASM and metrics. Quite likely near the top of that list relevant hits will be bdt’s oracle blog. Bertrand Drouvot has created asm_metrics.pl, an excellent utility for monitoring and reporting on ASM IO metrics. I have used this utility several times now with good results.

As good as asm_metrics.pl is, however, I have on several occasions wanted something slightly different. The asm_metrics.pl script output report format, while my preference is to save data in a CSV file. By saving all available metrics in this manner it is not necessary to decide ahead of time how the data is to be used and then chose the appropriate command line options.

When all of the metrics are preserved as data there is then no limitation on the types of post-data-collection analysis that can be performed. If for instance, you would like to break down all of the data by disk for a particular disk group, the data will be at your disposal to do so.

And so, the following collection of scripts was created. First, I will provide a brief description of each script, followed by detailed usage.

asm-metrics-collector.pl

This is the main Perl script used for collecting ASM metrics data.

asm-metrics-aggregator.pl

This Perl script can be used as a filter to aggregate previously collected metrics.

asm-metrics-chart.pl

Output of either of the previous scripts can be used as input to this script. asm-metrics-chart.pl will create a XLSX format Excel file with line charts for the columns you select.

These scripts try not to make too many decisions for you. Their job is simply to get the data, perform some aggregations as needed, and chart the data.

While there are many options that could be added to the scripts, I have attempted to keep from making them unnecessarily complicated. Standard Linux command line utilities can be used for many operations that would otherwise require a fair bit of code complexity.

Don’t have Linux? In this day of free hypervisors, anyone can install and use Linux.

Using the ASM Scripts

The following are more detailed descriptions of each script and their usage.

For each script the –help option will provide some explanation of the options, while the –man option may be used to show extended help.

The Perl Environment

All recent versions of the Oracle RDBMS software ship with Perl installed. The Oracle based Perl installations already include the DBI and DBD::Oracle modules which are necessary for connecting to the database.

If you don’t already have a version of Perl with the DBI and DBD::Oracle modules installed, simply use the one installed with Oracle:

You may have noticed there is no password on the command line and indeed, there is no provision for a password on the command line. In the cause of security, the password must either be entered from the keyboard or sent to the script via STDIN. Following are some examples of connecting with a user that requires a password.

This first example will require you to type in the password as the script appears to hang:

By default the output of asm-metrics-collector.pl will include a row for each disk in each diskgroup. Should you wish to observe and chart the read and write times at diskgroup level, the presence of one row per disk causes that operation to be somewhat difficult. The following brief shell script will read output created by asm-metrics-collector.pl, aggregate the chosen columns at the requested level and write it to STDOUT in CSV format.

The aggregation level is determined by the –grouping-cols option, while the columns to aggregate are specified with –agg-cols option.

You may be wondering about the purpose of the double dashes “–” that appear in the command line.

This is how the Perl option processor Getopt::Long knows that there are no more options available on the command line. As the –display-cols option can take several arguments, some method must be used indicating the end of the arguments where there is following text that is is not to be processed as part of the option. The “–” is the option list terminator, and will be well known to long time Unix and Linux users.

This script also does one set of calculations behind the scenes; the average read and write times are calculated at the current aggregation level.

What if you don’t know which columns are available in the input file? Simply use the –list-available-cols option with an input file and the script will output the available columns and exit.

Now it all gets more interesting as we can visualize the data collected. There are many methods available to accomplish this. I chose Excel as it is ubiquitous and easy to work with. The previously discussed Perl module Excel::Writer::XLSX makes it relatively easy to create Excel files complete with charts, directly from CSV data.

The following command line will create the default Excel file asm-metrics.xlsx with line chart for reads and writes. The –worksheet-col options specifies that each diskgroup will be shown on a separate worksheet.

Suppose you have two disk groups, DATA and FRA. The DATA disk group has five disks and FRA has two disks. In addition there are two RAC databases with two instances each. For each iteration the data collected by asm-metrics-collector.pl will have 5 rows per DATA, 2 rows per FRA, multiplied by the number of instances, so 40 rows per iteration.

Should you wish to see only the DATA rows for a single instance, and then aggregate these, it can be done via a combination of command line tools and these scripts.

First let’s get the list of columns and number them so we know how to find the instance:

That should get you started on modifying the data via standard command line utilities.

To Do

These scripts do not know about any of the ASM enhancements found in 12c, so there is some room for improvement there. So far they have fit my needs, but you may have some ideas to make these scripts better. Or (heaven forbid) you found a bug. Either way, please try them out and let me know.

Last week I published a blog post titled “Are You Ready For the Leap Second?“, and by looking at the blog statistics I could tell that many of you read it, and that’s good, because you became aware of the risks that the leap second on June 30th, 2015 introduces. On the other hand, I must admit I didn’t provide clear instructions that you could use to avoid all possible scenarios. I’ve been looking into this for a good while and I think the official RedHat announcements and My Oracle Support notes are confusing. This blog post is my attempt to explain how to avoid the possible issues.

Update (June 9th, 2015): Made it clear in the text below that ntp’s slewing mode (ntp -x) is mandatory from Oracle Grid Infrastructure and therefore for RAC too.

The complexity of solving these problems comes from the fact that there are multiple contributing factors. The behavior of the system will depend on a combination of these factors.
In the coming sections I’ll try to explain what exactly you should pay attention to and what you should do to avoid problems. The content of this post is fully theoretical and based on the documentation I’ve read. I have NOT tested it, so it may behave differently. Please, if you notice any nonsense in what I’m writing, let me know by leaving a comment!

1. Collect the data

The following information will be required for you to understand what you’re dealing with:

Here’s a number of bugs that are related to leap second handling on Linux:

System hangs on printing the leap second insertion message – This bug will hang your server at the time when the NTP notifies kernel about the leap second, and that can happen anytime on the day before the leap second (in our case anytime on June 30th, 2015). It’s fixed in kernel-2.6.9-89.EL (RHEL4) and kernel-2.6.18-164.el5 (RHEL5).

Systems hang due to leap-second livelock – Because of this bug systems repeatedly crash due to NMI Watchdog detecting a hang. This becomes effective when the leap second is added. The note doesn’t exactly specify which versions fix the bug.

MOS Note: “How Leap Second Affects the OS Clock on Linux and Oracle VM (Doc ID 1453523.1)” mentions that kernels 2.4 to 2.6.39 are affected, but I’d like to know the exact versions. I’ve searched a lot, but I haven t found much, so here are the ones that I did find:

I’m quite sure by reading this you’re thinking: “What a mess!”. And that’s true. I believe, the safest approach is to be on kernel 2.6.39-200.29.3 or higher.

3. NTP is used

You’re using NTP if the ntpd process is running. In the outputs displayed above it’s running and has the following arguments: ntpd -u ntp:ntp -p /var/run/ntpd.pid -g. The behavior of the system during the leap second depends on which version of NTP you use and what’s the environment.

ntp-4.2.2p1-9 or higher (but not ntp-4.2.6p5-19.el7, ntp-4.2.6p5-1.el6 and ntp-4.2.6p5-2.el6_6) configured in slew mode (with option “-x”) – The leap second is not added by kernel, but the extra time is added by increasing the length of each second over ~2000 second period based on the differences of the server’s time and the time from NTP after the leap second. The clock is never turned backward. This is the configuration you want because:

Time never goes back, so there will be no impact to the application logic.

Strange time values like 23:59:60 are not used, so you won’t hit any DATE and TIMESTAMP datatype limitation issues.

As the leap second is not actually added, It should be possible to avoid all 3 kernel bugs that I mentioned by using this configuration. In many cases updating NTP is much simpler than a kernel upgrade, so if you’re still on an affected kernel use this option to bypass the bugs.

The drawbacks of this configuration are related to the fact that the leap second is smeared out over a longer period of time:

This probably is not usable for applications requiring very accurate time.

This may not be usable for some clusters where all nodes must have exactly the same clocktime, because NTP updates are usually received every 1 to 18 minutes, plus giving the ~2000 seconds of time adjustment in slew mode the clocks could be off for as long as ~50 minutes. Please note, the slewing mode is (ntp -x) is mandatory for Oracle Grid Infrastructure as documented in Oracle® Grid Infrastructure Installation Guides 11g Release 2 and 12c Release 1.

ntp-4.2.2p1-9 or higher configured without slew mode (no “-x” option) – The NTP will notify the kernel about the upcoming leap second some time during June 30th, and the leap second will be added as an extra “23:59:59″ second (time goes backward by one second). You will want to be on kernel with all fixes present.

below ntp-4.2.2p1-9 – The NTP will notify the kernel about the upcoming leap second some time during June 30th, and depending on the environment, the leap second will be added as an extra “23:59:59″ second (time goes backward by one second), or the time will freeze for one second at midnight.

If NTP is not used the time is managed locally by the server. The time is most likely off already, so I really do recommend enabling NTP in slew mode as described above, this is the right moment to do so.

If you have tzdata-2015a or higher installed, the information about the leap second on June 30th, 2015 is also available locally on the server, but it doesn’t mean yet it’s going to be added. Also if NTP is not used and the leap second is added locally, it will appear as “23:59:60″, which is an unsupported value for DATE and TIMESTAMP columns, so this is the configuration you don’t want to use. Here are the different conditions:

You’re below tzdata-2015a – the leap second will not be added.

You’re on tzdata-2015a or higher and “file /etc/localtime” includes message “X leap seconds”, where X is a number – the leap second will be added as “23:59:60″ and will cause problems for your DATE/TIMESTAMP datatypes. You don’t want this configuration. Disable leap second by copying the appropriate timezone file from /usr/share/zoneinfo over /etc/localtime. It’s a dynamic change, no reboots needed. (Timezone files including the leap seconds are located in /usr/share/zoneinfo<strong>/right</strong>)

“file /etc/localtime” includes message “no leap seconds” – the leap second will not be added.

The recommendations

Again I must say this is a theoretical summary on how to avoid leap second issues on Linux, based on what’s written above. Make sure you think about it before implementing as you’re the one who knows your own systems:

Single node servers, or clusters where time between nodes can differ – Upgrade to ntp-4.2.2p1-9 or higher and configure it in slew mode (option “-x”). This should avoid the kernel bugs too, but due to lack of accurate documentation it’s still safer to be on kernel 2.6.39-200.29.3 or higher.

Clusters or applications with very accurate time requirements – NTP with slew mode is not suitable as it’s unpredictable when it will start adjusting the time on each server. You want to be on kernel 2.6.39-200.29.3 or higher. NTP should be enabled. Leap second will be added as an extra “23:59:59″ second (the time will go backward by one second). Oracle Database/Clusterware should detect time drifting and should deal with it. Check MOS for any bugs related to time drifting for the versions you’re running.

I don’t care about the time accuracy, I can’t update any packages, but need my systems up at any cost – The simplest solution to this is stopping the NTP on June 29th and starting it up on July 1st, so that the server was left unaware of the leap second. Also, you need to make sure the /etc/localtime does not contain the leap second for June 30th, 2015 as explained above.

Very accurate time requirements + time reduction is not allowed – I don’t know. I can’t see how this can be implemented. Does anyone have any ideas?

Post Scriptum

Initially I couldn’t understand why this extra second caused so much trouble. Don’t we change the time by a round hour twice a year without any issues? I found the answers during the research, and it’s obvious. Servers work in UTC time, which does not have daylight saving time changes. The timezone information is added just for representation purposes later on. UTC Time is continuous and predictable, but the leap second is something which breaks this normal continuity and that’s why it is so difficult to handle it. It’s also a known fact that Oracle Databases rely heavily on gettimeofday() system calls and these work in UTC too.

This is a short post about two things that should be on the to-do list for all Cassandra Administrators. The leap second issue and the new JMX default.

The Leap Second

Before we move on you should learn more about how the leap second affects Cassandra in more detail.

In short, you must update your JVM to version 7u60 or above. If you are on Cassandra 2.0.14+ or 2.1.x then all JVM on version 8 are safe and tested. One issue that the the JVM doesn’t solve is that time-series data might become interleaved. If is this is critical for your deployment (not for most cases) be aware of this.

JMX Security

Since Cassandra 2.0.14 and 2.1.4 the cassandra-env.sh file sets the JMX to only listen to the localhost. So unless you are fine with this you should enable remote access, while making sure you activate security!

Short version:

Edit $CASSANDRA_CONF/cassandra-env.sh update and set LOCAL_JMX=no

Create /etc/cassandra/jmxremote.password and add the username and password:

monitorRole QED
controlRole R&amp;D
USERNAME PASSWORD

Change ownership to the user you run Cassandra with and permission to read only:

Performance Schema (PS) has been the subject of many, many recent discussions, presentations, and articles. After its release in MySQL 5.7, PS has become the main actor for people who want to take the further steps in MySQL monitoring. At the same time, it has become clear that Oracle intends to make PS powerful with so many features and new instrumentation that old-style monitoring will begin to look like obsolete tools from the Stone Age.

This article will explain PS and provide guidance on what needs to be done in order to use it effectively.

What I am not going to do is to dig into specific performance issues or address polemics about what PS is and what, in a Utopian vision, it should be. I have seen too many presentations, articles and comments like this and they are not productive, nor are they in line with my target which is: keep people informed on how to do things EASILY.

For the scope of this article I will base my code mainly on version MySQL 5.7, with some digression to MySQL 5.6, if and when it makes sense.

Basic Concepts

Before starting the real how-to, it is my opinion that we must cover a few basic concepts and principles about PS. The primary goal of the Performance Schema is to measure (instrument) the execution of the server. A good measure should not cause any change in behavior. To achieve this, the overall design of the Performance Schema complies with the following, very severe design constraints:

The parser is unchanged. Also, there are no new keywords or statements. This guarantees that existing applications will run the same way with or without the Performance Schema.

All the instrumentation points return “void”, there are no error codes. Even if the performance schema fails internally, execution of the server code will proceed.

None of the instrumentation points allocate memory. All the memory used by the Performance Schema is pre-allocated at startup, and is considered “static” during the server life time.

None of the instrumentation points use any pthread_mutex, pthread_rwlock, or pthread_cond (or platform equivalents). Executing the instrumentation point should not cause thread scheduling to change in the server.

In other words, the implementation of the instrumentation points, including all the code called by the instrumentation points is:

Malloc free

Mutex free

Rwlock free

Currently, there is still an issue with the usage of the LF_HASH, which introduces memory allocation, though a plan exists to be replace it with lock-free/malloc-free hash code table.

The observer should not influence the one observe. As such, the PS must be as fast as possible, while being less invasive. In cases when there are choices between:

Processing when recording the performance data in the instrumentation.

OR

Processing when retrieving the performance data.

Priority is given in the design to make the instrumentation faster, pushing some complexity to data retrieval.

Performance schema was designed while keeping an eye on future developments and how to facilitate the PS usage in new code. As such, to make it more successful, the barrier of entry for a developer should be low, so it is easy to instrument code. This is particularly true for the instrumentation interface. The interface is available for C and C++ code, so it does not require parameters that the calling code cannot easily provide, supports partial instrumentation (for example, instrumenting mutexes does not require that every mutex is instrumented). The Performance Schema instrument interface is designed in such a way that any improvement/additions in the future will not require modifications, as well as old instrumentation remaining unaffected by the changes.

The final scope for PS is to have it implemented in any plugin included in MySQL, although pretending to have them always using the latest version will be unrealistic in most cases. Given that the Performance Schema implementation must provide up to date support, within the same deployment, multiple versions of the instrumentation interface must ensure binary compatibility with each version.

The importance of flexibility means we may have conditions like:

Server supporting the Performance Schema + a storage engine that is instrumented.

Server supporting the Performance Schema + a storage engine that is not

Server not supporting the Performance Schema + a storage engine that is instrumented.

Finally, we need to take in to account that the Performance Schema can be included or excluded from the server binary, using build time configuration options, with exposure in the compiling interface.

Performance Schema Interfaces

As mentioned above, PS can be excluded from code at the moment of the code compilation, thanks to the PS compile interface. This interface is one of seven that are present in PS. The full list is:

Instrument interface

Compiling interface

Server bootstrap interface

Server startup interface

Runtime configuration interface

Internal audit interface

Query interface

Instrument Interface:

This is the one that allows plugin implementers to add their instruments to PS. In general the interface is available for:

C implementations

C++ implementations

The core SQL layer (/sql)

The mysys library (/mysys)

MySQL plugins, including storage engines,

Third party plugins, including third party storage engines.

Compiling Interface:

As mentioned earlier, this is used during the build and will include or exclude PS code from the binaries.

Server Bootstrap Interface:

This is an internal private interface, which has the scope to provide access to the instructions demanded and create the tables for the PS itself.

Server Startup Interface:

This interface will expose options used with the mysqld command line or in the my.cnf, required to:

Enable or disable the performance schema.

Specify some sizing parameters.

Runtime Configuration Interface

This is one of the two most important interfaces for DBAs and SAs. It will allow the configuration of the PS at runtime. Using the methods expose by this interface, we will be able to configure what instruments, consumers, users and more we want to have active. This interface uses standard SQL and is very easy to access and use. Also, it is the preferred method to activate or deactivate instruments. Thus, when we start the server we should always enable the PS with all the instruments and consumers deactivated, and use this interface to choose only the ones we are interested in.

Internal Audit Interface:

The internal audit interface is provided to the DBA to inspect if the Performance Schema code itself is functioning properly. This interface is necessary because a failure caused while instrumenting code in the server should not cause failures in the MySQL server itself, and in turn the performance schema implementation never raises errors during runtime execution. To access the information a DBA just needs to issue the SHOW ENGINE PERFORMANCE SCHEMA STATUS; command.

Query Interface:

Lastly, this interface is the one that allows us to access the collected data, and to perform data filtering, grouping, join, etc. It will also allow access to a special table like the summary tables and digest, which will be discussed later on.

Consumers and Instruments

Another important concept in PS to understand is the difference between Instruments and Consumers.

Instruments:

Instruments are the ones collecting raw data where the calls are embedded in the code, such as:

MYSQL_TABLE_IO_WAIT(PSI_TABLE_FETCH_ROW, active_index, result,

{ result= index_prev(buf); })

In this case the code refers to the MYSQL_TABLE_IO_WAIT function declared in the handler.cc class (<mysql_root_code>/sql/handler.cc). If enabled in the compilation phase the above function will provide PS the information related to specific table io_wait.

The instruments demanded to manage that data collection is: wait/io/table/sql/handler.

The naming convention for the instruments is quite easy. The first part wait is the name of the Top-level Instrument component (list later), the second io is the observed condition, and table is the object. The remaining suffix is referring to more specific plugin implementations and includes innodb, myisam, sql or names like IO_CACHE::append_buffer_lock. In the above example it refers to the Handler class in SQL tree.

Instruments are organized by top level components like:

Idle: An instrumented idle event. This instrument has no further components.

Memory: An instrumented memory event.

Stage: An instrumented stage event.

Statement: An instrumented statement event.

Transaction: An instrumented transaction event. This instrument has no further components.

We can and should keep in consideration that, it is best practice to enable only the instruments we may require for the time we need them. This can be achieved using the re-using the runtime interface (I will explain how exactly later on).

The Consumers are the destination of the data collected from the instruments. Consumers have different scope and timelines. Also, consumer like event statements has many different tables like:

Current

History

History long

Summaries (by different aggregation)

Summary Digest (like what we can find by processing the slow query log)

Once more it is important to define what we are looking for and enable only what we need. For instance, if we need to review/identify the SQL with the most impacting, we should enable only the events_statements_current, events_statements_history and events_statements_summary_by_digest. All the other consumers can stay off. It is also important to keep in mind that each event may have a relation with another one. In this case, we will be able to navigate the tree relating the events using the fields EVENT_ID and NESTING_EVENT_ID where the last one is the EVENT_ID of the parent.

Pre-Filtering vs. Post-filtering

We are almost there, stay tight! Another important concept to understand is the difference between post and pre-filtering. As I mentioned, we can easily query the Consumer tables with SQL, we can create complex SQL to join tables and generate complex reports. But this can be quite heavy and resource consuming, especially if we want to dig on specific sections of our MySQL server.

In this case we can use the pre-filtering approach. The pre-filtering is basically a way to tell to PS to collect information ONLY from a specific source like user/IP (actors) or Object(s) like Tables, Triggers, Events, and Functions. The last one can be set at a general level or down to a specific object name.

The pre-filtering with the activation of the right instruments and consumer is a powerful way to collect the information without overloading the server with useless data. It is also very easy to implement given we just need to set the objects and/or actors in the setup tables as we like.

Rolling the Ball, Setup the PS for Observation as Start

Now that we have covered the basic concepts we can start to work on the real implementation.

Compile the Source Code:

As mentioned earlier, we can use the compile interface to include or exclude features from the code compilation. The available options are:

This level of detail is so granular that we can only include the things we are planning to use.

The positive aspect of doing so at the compilation level is that we will be sure no one will mess-up adding undesired instruments. The drawback is that if we change our mind and we decide we may need the ones we had excluded, we will have to compile the whole server again.

As a result, I would say that using this approach is not for someone that is just starting to use PS. Given you are still discovering what is there, it make sense to compile with all the features (default).

Configure PS in my.cnf:

To set the PS correctly in the my.cnf is quite important, so I strongly suggest disabling any instrument and consumer at the start-up. They can be enabled by the script later, and that would be much safer for a production database.

I normally recommend a section like the following:

performance_schema=1

performance_schema_events_waits_history_size=50

performance_schema_events_waits_history_long_size=15000

performance_schema_instrument=’%=OFF’

performance_schema_consumer_events_stages_current=0

performance_schema_consumer_events_stages_history=0

performance_schema_consumer_events_stages_history_long=0

performance_schema_consumer_events_statements_current=0

performance_schema_consumer_events_statements_history=0

performance_schema_consumer_events_statements_history_long=0

performance_schema_consumer_events_transactions_current=0

performance_schema_consumer_events_transactions_history=0

performance_schema_consumer_events_transactions_history_long=0

performance_schema_consumer_events_waits_current=0

performance_schema_consumer_events_waits_history=0

performance_schema_consumer_events_waits_history_long=0

performance_schema_consumer_global_instrumentation=0

performance_schema_consumer_thread_instrumentation=0

performance_schema_consumer_statements_digest=0

The settings above will start the server with PS as “enabled”, but all the instruments and consumer will be OFF. Well, this is not entirely true, as for the moment of the writing (MySQL 5.7.7) once the PS is enabled the instruments related to memory/performance_schema are enabled regardless, which make sense given they are dedicated to monitor the memory utilization of PS.

A final note about the configuration is that we can decide to use the counting option of the instruments instead, capturing the latency time. To do so, we just have to declare it as: performance_schema_instrument=’statement/sql/%=COUNTED’

In this case I had set that ALL the SQL statements should be counted.

Start Server and Set Only the Users We Need:

Once we have started our MySQL server, we are almost ready to go.

This is it, given we start it with NO instruments, we have to decide where to begin, and given we all know the most impacting factor in a database server is how we query it, we will start from there. In turn, analyzing what is going from the SQL point of view. Although, I want to catch the work coming from my application user, not from everywhere. Given this we can set the user in the actor table. This is very simple given we will use the Runtime configuration interface which uses SQL syntax.

So, let say I want to trace only my application user named stress running from machines in the 10.0.0.0/24 range. I will need to:

Great, from now on PS will only focus on my user stress, so now let us decide what to enable for instruments and consumers.

Once more using SQL command we will enable all the instruments related to SQL statements, but wait a minute, if you check the instrument table, you will see we have several variations of the statements instrument:

SQL

SP

Scheduler

Com

Abstract

Also, this is not included but relevant is the TRANSACTION. For now, we will only enable the SQL, ABSTRACT, Scheduler and Transaction.

SQL will be:

update setup_instruments SET ENABLED=’YES’ where ENABLED=’NO’ and name like ‘statement/abstract/%';

update setup_instruments SET ENABLED=’YES’ where ENABLED=’NO’ and name like ‘statement/sql/%';

update setup_instruments SET ENABLED=’YES’ where ENABLED=’NO’ and name like ‘transaction';

(root@localhost) [performance_schema]>select count(*) from setup_instruments where ENABLED = ‘YES’ and name not like ‘memory%';

+———-+

| count(*) |

+———-+

| 143 |

+———-+

1 row in set (0.01 sec)

We have 143 instruments active. Now we must setup the consumers and choose the destination that will receive the data.

It is easy to understand that ANY object existing in the default Schema will be ignored. In our case, for now, we will keep it as it is, but this will be our next filtering step after we have analyzed some data. This will happen in the PART 2, stay tuned.

Conclusions

For now, you should understand what a Performance Schema is, its basic concept, as well as what interfaces are available and for what. You should also be able to compile the source code with and without PS, or part of it. You should be able to configure the MySQL configuration file correctly, and perform the initial configuration at runtime. Finally, you should know how to query the PS and how to dig in the information, which will also be discussed in the Part 2.

Responsive UI is a big deal nowadays, when enterprise applications should run on different platforms and devices.

SQL Server:

Three SQL Server MVPs (Jonathan Kehayias, Ted Krueger and Gail Shaw) provide fascinating insight into the most common SQL Server problems, why they occur, and how they can be diagnosed using tools such as Performance Monitor, Dynamic Management Views and server-side tracing.

Are your databases more costly than they need to be? If you’re not sure, this webinar will show you how to find out, quickly and easily, with KPIs that indicate overprovisioning or opportunity for driving more efficiency from your databases.

MySQL Incremental Backup – Point In Time Backup and Recovery of InnoDB and MyIsam Databases

If you’re not aware of what the leap second is look into it. The fact is, this year the last minute of June 30th will be one second longer and “June 30, 2015 23:59:60″ will be a valid and correct time. There are a few issues that could be caused by the leap second, so I’ve reviewed a number of MOS notes and this blog post is the summary of the findings.

Update (June 4th, 2015): I’ve put together another blog post about handling the leap second on Linux here.

Any Linux distributions using kernel versions from 2.4 though and including 2.6.39 may be affected (including both UEK and RedHat compatible kernels). This range is very wide and includes any RHEL and OEL releases except version 7 unless the kernel versions are kept up to date on lower versions.

Problems may be observed even a day before the leap second happens, so this year it could cause the symptoms any time on June 30. This is because the NTP server lets the host know about the upcoming leap second up to a day ahead of time, and the update from the NTP triggers the issues.

There are 2 possible symptoms:

Servers will become unresponsive and the following can be seen in system logs, console, netconsole or vmcore dump analysis outputs:

I think, as the problem is triggered by the update coming from NTP on June 30, it should also be possible to stop the NTPD service on June 29th, and re-enable it on July 1st instead. This would allow it to bypass the problem conditions.
Just because any Java application can be effected we need to think about where Java is used. And for Oracle DBAs the typical ones to worry about would be all enterprise manager agents as well as any fusion Middleware products. So if you’re using Grid control or Cloud control to monitor your Oracle infrastructure it’s very likely most of your servers are potentially under risk if the kernels are not up to date.

2. Inserts to DATE and TIMESTAMP columns fail with “ORA-01852: seconds must be between 0 and 59″

Any OS could be affected. Based on MOS note “Insert leap seconds into a timestamp column fails with ORA-01852 (Doc ID 1553906.1)”, any inserts of time values having “60” seconds into DATE or TIMESTAMP columns will result in ORA-01852.
This can’t be reliably mitigated by stopping the NTPD as the up to date TZ information on the server may already contain the information about the extra second. The note also provides a “very efficient workaround”: *the leap second record can be stored in a varchar2 datatype instead.*. You might be thinking, “What? Are you really suggesting me that?” According to MOS note 1453523.1 it appears that the time representation during the leap second is something that could differ depending on the OS/kernel/ntpd versions. For example, it could show “23:59:60″ or it could should show “23:59:59″ for 2 consecutive seconds, which would allow avoiding the ORA-01852. Be sure to check it with your OS admins and make sure that the clock never shows “23:59:60″ to avoid this issue completely.

Consider your infrastructure

By no means are the issues described above an exhaustive list. There’s too much information to cover everything, but based on what I’m reading the issues caused by leap second can be quite severe. Please consider your infrastructure and look for information about issues and fixes to address the upcoming leap second. Search MOS for the products you use and add the “leap second” keyword too, If you’re using software or OS from another vendor, check their support notes regarding leap seconds. Here are additional MOS notes for reading if you’re on some of Oracle’s engineered systems, but again, you’ll find more information if you search:

Leap Second Time Adjustment (e.g. on June 30, 2015 at 23:59:59 UTC) and Its Impact on Exadata Database Machine (Doc ID 1986986.1)

Oracle 12c introduces new catalog features including CDB_ dictionary views (which include a CON_ID column) superseding the DBA_ views that most DBA sqlplus scripts are based upon.

However, existing DBA sqlplus scripts can easily be modified using just a few simple sqlplus techniques to be compatible with 11g, as well as all types of 12c databases including legacy and container databases.

The following simple SQL and sqlplus techniques can be used to make a “universal script” that is compatible with all versions.

Illustrating the Issue

Let’s say for sake of example that we have a simple 10g/11g monitoring script that’s checking the amount of freespace in each tablespace by querying the DBA_TABLESPACE_USAGE_METRICS view.

On our 10g or 11g database the following query gives the necessary information:

It executes successfully on the 12c database but there’s a problem: the query is only returning the data from the root container (or more accurately, from the container in which the statement was executed). The PDB data is missing, I have both open and closed PDBs in this database:

Note that the comma is in the variable and not in the column list in the SQL SELECT or ORDER BY clauses.

The script is now dynamically determining whether to use the CDB_ or DBA_ view and similarly dynamically adding the CON_ID column to the SELECT and ORDER BY clauses. (And of course should be executed from the root container.)

And the exact same script still works on the 11g database using the 11g version of sqlplus!

Similarly the optional column (including the comma) defined in the sqlplus variable could be used in an aggregation GROUP BY clause. However, if the query has no other aggregation columns then we might need to add a constant to the GROUP BY clause (and ORDER BY), otherwise the GROUP BY would have no columns listed and the universal sqlplus script is executed against an 11g database.

Finally, once we’re done testing and debugging, we can get rid of the ugly “old” and “new” statements using:

SET VERIFY OFF

Implementing these techniques will allow modifications of most existing DBA sqlplus scripts to create universal versions, of which will be compatible with 11g (and likely earlier) databases as well as 12c legacy and container databases.

Deeper Dive

What if our monitoring query is based on an underlying catalog table and not a dictionary view?

For example, let’s say that our objective is to report on users and the last time the database password was changed. The password change date isn’t presented in the DBA_USERS or CDB_USERS view, but it is in the underlying SYS.USER$ table. Hence the monitoring query might be something like:

If we look at the view definition of any of the CDB_ views it is apparent that the view traverses the open PDBs by using the new 12c “CONTAINERS” function which accepts a table name as the only argument.

When run from the root container the CONTAINERS() function will traverse all open PDBs (assuming the common user used has local PDB permission to access the referenced table).

A final question might be: why isn’t the PDB$SEED database shown in the results?

The answer is that a new 12c initialization parameter EXCLUDE_SEED_CDB_VIEW controls whether the seed database is displayed in CDB_ view (or CONTAINERS() function calls). EXCLUDE_SEED_CDB_VIEW is dynamic and session modifiable:

A final question is whether this technique will still work if the SQL script is run through other tools? The answer is: “it depends“.

It depends on whether the other tools support the “define” command and the use of script variables. Specifically, Oracle SQL Developer and the newer sqlcl tool does. The above examples work fine in SQL Developer and sqlcl using the standard sqlcl “default” sqlformat. Other sqlformat options in sqlcl show some issues (testing with sqlcl version 4.2.0.15.121.1046).

Loading, Updating and Deleting From HBase Tables using HiveQL and Python.

In keeping with the ODA quarterly patching strategy, Appliance Manager 12.1.2.3 is now available.

From time to time someone publishes a query on the OTN database forum and asks how to make it go faster, and you look at it and think, “it’s a nice example to explain a couple of principles because it’s short, easy to understand, obvious what sort of things might be wrong, and easy to fix.”

I was quite sure that the password that I had was correct, so I tried with the SYSMAN user and had the same error. I still wanted to verify that I had the correct password , so I tried with the SYSMAN user to log in to the repository database, and was successful, so I know something was wrong there.

SQL&gt; connect sysman/
Enter password:
Connected.

So I went to the<gc_inst>/em/EMGC_OMS1/sysman/log/emoms.log and saw the following error

2015-05-18 21:22:06,103 [[ACTIVE] ExecuteThread: '15' for queue: 'weblogic.kernel.Default (self-tuning)'] ERROR audit.AuditManager auditLog.368 - Could not Log audit data, Error:java.sql.SQLException: ORA-14400: inserted partition key does not map to any partition
ORA-06512: at &quot;SYSMAN.MGMT_AUDIT&quot;, line 492
ORA-06512: at &quot;SYSMAN.MGMT_AUDIT&quot;, line 406
ORA-06512: at line 1

Which led me to believe that the JOB_QUEUE_PROCESSES was set to 0, but it wasn’t the case, since it was set to 50. Though, this is actually an incorrect limit, so I bumped it up to 1000 and tried to rerun the EM12c repository DBMS Scheduler jobs as per the documentation in 1498456.1:

Even though the JOB_QUEUE_PROCESSES were not set to 0, it was the cause that it was failing, as it was a low value for this parameter. Thus, be careful when setting up this parameter, be sure to follow the latest installation guidelines.

This Log Buffer edition covers Oracle, SQL Server and MySQL blog posts from all over the blogosphere!

Oracle:

Hey DBAs: You know you can install and run Oracle Database 12c on different platforms, but if you install it on an Oracle Solaris 11 zone, you can take additional advantages.

Here is a video with Oracle VP of Global Hardware Systems Harish Venkat talking with Aaron De Los Reyes, Deputy Director at Cognizant about his company’s explosive growth & how they managed business functions, applications, and supporting infrastructure for success.

Oracle Unified Directory is an all-in-one directory solution with storage, proxy, synchronization and virtualization capabilities. While unifying the approach, it provides all the services required for high-performance enterprise and carrier-grade environments. Oracle Unified Directory ensures scalability to billions of entries. It is designed for ease of installation, elastic deployments, enterprise manageability, and effective monitoring.

It is no good doing some or most of the aspects of SQL Server security right. You have to get them all right, because any effective penetration of your security is likely to spell disaster. If you fail in any of the ways that Robert Sheldon lists and describes, then you can’t assume that your data is secure, and things are likely to go horribly wrong.

How does a column store index compare to a (traditional )row store index with regards to performance?

Learn how to use the TOP clause in conjunction with the UPDATE, INSERT and DELETE statements.

Did you know that scalar-valued, user-defined functions can be used in DEFAULT/CHECK CONSTRAINTs and computed columns?

Tim Smith blogs as how to measure a behavioral streak with SQL Server, an important skill for determining ROI and extrapolating trends.

Encryption is important component of secure environments. While being intangible, property security doesn’t get enough attention when it comes to describing various systems. “Encryption support” is often the most of details what you can get asking how secure the system is. Other important details are often omitted, but the devil in details as we know. In this post I will describe how we secure backup copies in TwinDB.

The fsfreeze command, is used to suspend and resume access to a file system. This allows consistent snapshots to be taken of the filesystem. fsfreeze supports Ext3/4, ReiserFS, JFS and XFS.

This blog describes the best-practice approach in regards to the data ingestion from SQL Server into Hadoop. The case scenario is described as under:

Single table ingestion (no joins)

No partitioning

Complete data ingestion (trash old and replace new)

Data stored in Parquet format

Pre-requisites

This example has been tested using the following versions:

Hadoop 2.5.0-cdh5.3.0

Hive 0.13.1-cdh5.3.0

Sqoop 1.4.5-cdh5.3.0

Oozie client build version: 4.0.0-cdh5.3.0

Process Flow Diagram
Configuration

Create the following directory/file structure (one per data ingestion process). For a new ingestion program please adjust the directory/file names as per requirements. Make sure to replace the
tag with your table name

The ingestion process is invoked using an oozie workflow. The workflow invokes all steps necessary for data ingestion including pre-processing, ingestion using sqoop and post-processing.

oozie-<table_name>-ingest
This directory stores all files that are required by the oozie workflow engine. These files should be stored in HDFS for proper functioning of oozie
oozie-properties
This directory stores the <table_name>.properties. This file stores the oozie variables such as database users, name node details etc. used by the oozie process at runtime.
hive-<table_name>
This directory stores a file called create-schema.hql which contains the schema definition of the HIVE tables. This file is required to be run in HIVE only once.

Now we are ready to select data in HIVE. Go to URL http://<hive_server>:8888/beeswax/#query.

a. Choose existing database on left or create new.
b. Paste contents of create-schema.hql in Query window and click Execute.
c. You should now have an external table in HIVE pointing to data in hdfs://<namenode>/data/sqoop/<table_name>

At this point an oozie job should be created. To validate the oozie job creation open URL http://<hue_server>:8888/oozie/list_oozie_coordinators. Expected output as under. In case of error please review the logs for recent runs.

To validate the oozie job is running open URL http://<hue_server>:8888/oozie/list_oozie_workflows/ . Expected output as under. In case of error please review the logs for recent runs.

To validate data in HDFS execute the following command. You should see a file with *.metadata extension and a number of files with *.parquet extension.

$ hadoop fs -ls /data/sqoop/<table_name>/

Now we are ready to select data in HIVE or Impala.
For HIVE go to URL http://<hue_server>:8888/beeswax/#query
For Impala go to URL http://<hue_server>:8888/impala
Choose the newly created database on left and execute the following SQL – select * from <hive_table_name> limit 10
You should see the the data being outputted from the newly ingested data.

Alan Cooper helped to debug the most widely-used PC language of the late seventies and early eighties, BASIC-E, and, with Keith Parsons, developed C-BASIC. He then went on to create Tripod, which morphed eventually into Visual Basic in 1991.

MySQL replication is among the top features of MySQL. In replication data is replicated from one MySQL Server (also knows as Master) to another MySQL Server (also known as Slave). MySQL Binlog Events is a set of libraries which work on top of replication and open directions for myriad of use cases like extracting data from binary log files, building applications to support heterogeneous replication, filtering events from binary log files and much more.

Data comes in different shapes. One of the these shapes is called a time series. Time series is basically a sequence of data points recorded over time. If, for example, you measure the height of the tide every hour for 24 hours, then you will end up with a time series of 24 data points. Each data point will consist of tide height in meters and the hour it was recorded at.

Time series are very powerful data abstractions. There are a lot of processes around us that can be described by a simple measurement and a point in time this measurement was taken at. You can discover patterns in your website users behavior by measuring the number of unique visitors every couple of minutes. This time series will help you discover trends that depend on the time of day, day of the week, seasonal trends, etc. Monitoring a server’s health by recording metrics like CPU utilization, memory usage and active transactions in a database at a frequent interval is an approach that all DBAs and sysadmins are very familiar with. The real power of time series is in providing a simple mechanism for different types of aggregations and analytics. It is easy to find, for example, minimum and maximum values over a given period of time, or calculate average, sums and other statistics.

Building a scalable and reliable database for time series data has been a goal of companies and engineers out there for quite some time. With ever increasing volumes of both human and machine generated data the need for such systems is becoming more and more apparent.

OpenTSDB and HBase

There are different database systems that support time series data. Some of them (like Oracle) provide functionality to work with time series that is built on top of their existing relational storage. There are also some specialized solutions like InfluxDB.

OpenTSDB is somewhere in between these two approaches: it relies on HBase to provide scalable and reliable storage, but implements it’s own logic layer for storing and retrieving data on top of it.

OpenTSDB consists of a tsd process that handles all read/write requests to HBase and several protocols to interact with tsd. OpenTSDB can accept requests over Telnet or HTTP APIs, or you can use existing tools like tcollector to publish metrics to OpenTSDB.

OpenTSDB relies on scalability and performance properties of HBase to be able to handle high volumes of incoming metrics. Some of the largest OpenTSDB/HBase installations span over dozens of servers and process ~280k writes per second (numbers from http://www.slideshare.net/HBaseCon/ecosystem-session-6)

There exist a lot of different tools that complete OpenTSDB ecosystem from various metrics collectors to GUIs. This makes OpenTSDB one of the most popular ways to handle large volumes of time series information and one of the major HBase use cases as well. The main challenge with this configuration is that you will need to host your own (potentially very large) HBase cluster and deal with all related issues from hardware procurement to resource management, dealing with Java garbage collection, etc.

OpenTSDB and Google Cloud Bigtable

If you trace HBase ancestry you will soon find out that it all started when Google published a paper on a scalable data storage called Bigtable. Google has been using Bigtable internally for more than a decade as a back end for web index, Google Earth and other projects. The publication of the paper initiated creation of Apache HBase and Apache Cassandra, both very successful open source projects.

Latest release of Bigtable as a publicly available Google Cloud service gives you instant access to all the engineering effort that was put into Bigtable at Google over the years. Essentially, you are getting a flexible, robust HBase-like database that lacks some of the inherited HBase issues, like Java GC stalls. And it’s completely managed, meaning you don’t have to worry about provisioning hardware, handling failures, software installs, etc.

What does it mean for OpenTSDB and time series databases in general? Well, since HBase is built on Bigtable foundation it is actually API compatible with Google Cloud Bigtable. This means that your applications that work with HBase could be switched to work with Bigtable with minimal effort. Be aware of some of the existing limitations though. Pythian engineers are working on integrating OpenTSDB to work with Google Cloud Bigtable instead of HBase and we hope to be able to share results with the community shortly. Having Bigtable as a back end for OpenTSDB opens a lot of opportunities. It will provide you with a managed cloud-based time-series database, which can be scaled on demand and doesn’t require much maintenance effort.

There are some challenges that we have to deal with, especially around a client that OpenTSDB uses to connect to HBase. OpenTSDB uses it’s own implementation of HBase client called AsyncHBase. It is compatible on a wire protocol level with HBase 0.98, but uses a custom async Java library to allow for asynchronous interaction with HBase. This custom implementation allows OpenTSDB to perform HBase operations much faster than using standard HBase client.

While HBase API 1.0.0 introduced some asynchronous behavior using BufferedMutator it is not a trivial task to replace AsyncHBase with a standard HBase client, because it is tightly coupled with the rest of OpenTSDB code. Pythian engineers are working on trying out several ideas on how to make the transition to standard client look seamless from an OpenTSDB perspective. Once we have a standard HBase client working, connecting OpenTSDB to Bigtable should be simple.

“What’s for lunch today?”, said the newly born ready to run Red Hat 6.4 server.

“Well, I have an outstanding 3-course meal of OEM12c installation.
For the appetizer, a light and crispy ASM 12c,
DB 12c with patching for the main and desert, and to cover everything up, OEM 12c setup and configuration”, replied the DBA who was really happy to prepare such a great meal for his new friend.

“Ok, let’s start cooking, it won’t take long”, said the DBA and took all his cookware (software), prepared ingredients (disk devices) and got the grid infrastructure cooked:

As always, this fresh Log Buffer Edition shares some of the unusual yet innovative and information-rich blog posts from across the realms of Oracle, SQL Server and MySQL.

Oracle:

A developer reported problems when running a CREATE OR REPLACE TYPE statement in a development database. It was failing with an ORA-00604 followed by an ORA-00001. These messages could be seen again and again in the alert log.

The difference between the CONCAT function and the STUFF function lies in the fact that CONCAT allows you to append a string value at the end of another string value, whereas STUFF allows you insert or replace a string value into or in between another string value.

After examining the SQLServerCentral servers using the sp_Blitz™ script, Steve Jones now looks at how we will use the script moving forward.

Big data applications are not usually considered mission-critical: while they support sales and marketing decisions, they do not significantly affect core operations such as customer accounts, orders, inventory, and shipping. Why, then, are major IT organizations moving quickly to incorporating big data in their disaster recovery plans?

There are no more excuses for not having baseline data. This article introduces a comprehensive Free Baseline Collector Solution.

There are tools like pt-kill from the percona tool kit that may print/kill the long running transactions at MariaDB, MySQL or at Percona data instances, but a lot of backup scripts are just some simple bash lines.