Thursday, February 27, 2014

Yesterday I responded to an emergency callout for a customer with 800 users running a single Exchange 2010 SP3 UR3 multi-role server on Windows Server 2008 R2 SP1. The server would not boot and was simply blue screening. We were not able to access Windows in any way, even by booting into safe mode, and had no indication as to why the failure occurred as we could not access the server event logs. The Exchange 2010 SP3 server was running on top of a VMware vSphere 5.1 clustered environment hosted on shared storage.

This server was one I set up a couple of years back and, as a result, it followed my standard multi-role Exchange server build, which consists of two or more NTFS volumes.

Due to the disk I/O improvements in Exchange 2010, it is no longer a requirement to separate the transaction logs from the Exchange database for performance reasons. Remember, Exchange 2010 has a 90% disk I/O reduction over Exchange 2003.

Note: This server only had two volumes; however, additional volumes can be added in the event additional databases are required.

As the system volume was corrupt and no longer booting, it needed to be either rebuilt or restored from backup. The database/log volume was assumed to be fine and the plan was to simply re-attach the database/log files after the system volume containing Windows and Exchange server was restored. We did have a full backup of the Exchange 2010 server through Backup Exec 2010 R3 SP3, taken on the weekend, of both the system volume and system state. As a result we had two methods for recovering this Exchange 2010 SP3 server and bringing it back online:

Recover the server's system volume and system state using the last backup taken with Backup Exec and relink it to the database/log volume.

Recover the Server by performing an Exchange Recover Server installation to reconnect a newly installed Exchange server to the existing configuration stored in the Active Directory configuration partition using the Setup /m:RecoverServer switch.

At the time, the best course of action was deemed to be recovering the server through Backup Exec, as this would ensure things such as the digital certificate and Exchange Web Services URL addresses would all be restored to their original state.

Backup Exec 2010 R3 SP3 Issues

I have been working with Backup Exec for over 8 years now, going back to when it was owned by a company named Veritas. Every time I have had to perform a full system restore of a failed server, it has been a cumbersome process with multiple challenges. After so many years of having this product on the market, you would think the "Backup" and "Restore" functionality would be completely ironed out and bulletproof; after all, the primary purpose of this product is to back up and restore data. However, this, my friends, is what makes Backup Exec "special": the ability to cause companies pain by not performing these tasks.

Despite having used Backup Exec to recover servers in the past, and given my history with the product, I put what I knew aside and followed the Symantec online documentation exactly to give myself the best chance of a successful restore. For servers running Windows Server 2008 or Windows Server 2008 R2, Symantec has published a knowledge base article on their support page describing how to restore a remote Windows 2008 computer which has completely failed. This is the most important support article they could publish online, as it documents the steps to restore a failed Windows Server, which again is the whole point of a backup and restore program. As a result, you would think the instructions in this particular article would be 100% accurate and carefully reviewed by the Symantec support team. Unfortunately, this is not the case, and the article contains technical mistakes that prevent a successful restore of a Windows 2008 server.

In the instructions taken from the above article entitled "Disaster Recovery restore steps for a remote Windows 2008 computer" it states to:

Build a new Windows Server 2008 computer by performing a fresh install

Provide it the same computer name as before

Ensure it has the same disk configuration as the previous system

Do not join the new computer to an Active Directory domain, instead leave it as Workgroup.

Following these instructions exactly, I was unable to proceed with the restore documented further down in step 10 in the documentation. At this point we raised a support case with Symantec Gold Support to assist us with restoring the server. The Symantec support engineer after taking the case advised us that the instructions documented online were incorrect and the server needed to be joined to the domain.

After joining the computer to the domain I was able to proceed with performing the restore task. The task proceeded successfully until it got to 91% where the following error was generated.

V-79-57344-782 The job failed with the following error: Unable to swap out active registry hive with new data

After rebooting the server being restored, Windows Boot Manager was unable to boot the server as it was left in a corrupt state.

At this point the Symantec engineer asked us to retry the process. It failed again, exact same experience. At this point the company had been without email for over 12 hours due to the lengthy timeframe Backup Exec takes to restore data.

Symantec then ran a tool on the Backup Exec media server to collect a bunch of logs for analysis and advised us that they require 48 hours to review why the server restore is failing. When we asked them the question "so you expect us to be without email for 48 hours", their response was yes there is nothing we can do.

Exchange 2010 Recover Server Installation

At this point I advised my customer that we needed to forget about Backup Exec and proceed with a native Exchange 2010 Recover Server installation using Setup /m:RecoverServer, something I had faith would work. We rebuilt the Windows 2008 R2 server, gave it the same host name, re-joined it to the domain, and installed all Windows updates along with the Exchange 2010 prerequisites.

One thing I noticed is the Exchange 2010 SP3 media does not allow you to run Setup /m:RecoverServer, it errors out. You must run the Recover Server installation from Exchange 2010 SP2 media and then after the install proceed to installing Exchange 2010 SP3 followed by the latest update rollup.

We had the server up within 2 hours of starting this procedure.

Summary

There is no denying that the restore procedure documented under http://www.symantec.com/business/support/index?page=content&id=TECH86323 should have worked for servers backed up using a Backup Exec agent.

This being said, it is important to note that Backup Exec has better methods for backing up and recovering servers which this customer is not currently following. In virtual environments running VMware ESX, it is recommended that companies present the VMware datastores running Virtual Machine File System (VMFS) to the Backup Exec media server. Whilst Windows cannot read VMFS, Backup Exec can, allowing it to directly back up the virtual machine hard disks (VMDKs) and configuration files. This backup method provides faster backups and more reliable restores, as Backup Exec no longer needs to rely on Backup Exec agents; instead it can communicate directly with vCenter to snapshot virtual machines for backup purposes and restore entire virtual machines back to their original state.

In addition to virtual machine level backups, Symantec Backup Exec 2012 also offers an alternative way of recovering servers backed up using a Backup Exec agent. Backup Exec 2012 provides recovery media that companies can boot from, making it possible to directly restore a server from an agent based backup. This means companies no longer need to:

Install Windows Server from Microsoft Installation Media

Provide the server the same host name and drive letters

Join the server to the domain

Deploy the Backup Exec Agent

Start the Recovery Process

The Backup Exec recovery media significantly reduces the time required to perform a bare metal restore of a failed Windows Server.

For companies which have already made significant investment in Backup Exec, before throwing away the investment with Symantec, it is advised that customers review their backup strategy to ensure it suits the infrastructure within their company.

Tuesday, February 25, 2014

Stellar Phoenix EDB to PST Converter is an application which allows administrators to export mailboxes from an online or offline Exchange EDB database to PST format. It is a simple and user friendly application to use and navigate which is handy for even the most inexperienced administrators.

When launching the application, Administrators are presented with two options:

Convert Online EDB to PST

Convert Offline EDB to PST

Convert Online EDB to PST Option

The Convert Online EDB to PST option allows administrators to connect to a live Exchange environment and export mailboxes to PST while users are working. This option requires you to specify the connection details of your Exchange server along with credentials for a user account which has administrative access to export mailboxes. To perform the export the credentials specified must also have access to all user mailboxes across the Exchange organisation.

The Online EDB to PST conversion option is nice to have due to the user friendly nature of the application; however, it is not something I would recommend to companies as a reason to purchase the Stellar Phoenix EDB to PST Converter application. Microsoft has already shipped tools with all versions of Exchange to mass export online user mailboxes to PST. For example, Exchange 5.5, 2000 and 2003 have a tool named ExMerge which can export all mailboxes in an Exchange organisation to separate PST files. Exchange 2007, 2010 and 2013 can export all mailboxes to PST files using PowerShell. The Exchange Management Shell provides either Export-Mailbox or New-MailboxExportRequest, depending on which version of Exchange you're currently running.

This being said, these tools provided by Microsoft only allow you to export entire mailboxes to PST. Stellar Phoenix EDB to PST Converter allows you to select individual folders or messages from an Exchange mailbox and export only the relevant subset of data to PST. This granular selection gives administrators more flexibility when managing their online exports.

Convert Offline EDB to PST Option

The Convert Offline EDB to PST option allows administrators to extract PST files from an EDB file without the presence of an Exchange server. This is very handy for companies who want to restore select emails from an offline EDB backup file without restoring the entire EDB file back into an Exchange environment such as a Recovery Storage Group (RSG).

When you choose to select Offline EDB to PST conversion, the tool will ask you to specify a path to the EDB file along with specifying what version of Exchange the EDB file was created in.

After selecting the EDB file the application will break down all mailboxes available for export. As an administrator you can select which mailboxes you wish to export to PST including granular selection of which folders/items you wish to export from each mailbox.

After the selection process is completed, click the save button, which is represented by an old-fashioned floppy disk. Specify a folder to which you would like to export all PST files.

The tool will then go through and begin the process of exporting all mailboxes selected to individual PST files. The PST files will be created in the format of "mailboxname.pst"

Exchange 2013 Support

At the time of writing this review, Stellar Phoenix EDB to PST Converter does not currently support Microsoft Exchange Server 2013 or 2013 SP1 mailbox database files. Stellar Phoenix said this is currently being worked on and support for this will be added to the product in the next couple of months.

The Administrator License allows for the product to be installed on unlimited workstations within a single site (or physical office).

The Technician License allows for the product to be installed on unlimited workstations across all physical offices throughout an organisation.

It is important to note that with both licenses, only one conversion can be performed at a time.

The Administrator License is priced at $399 USD while the Technician License is priced at $499 USD. These prices are correct as of this writing but are subject to change. For the latest pricing on Stellar Phoenix EDB to PST Converter, please refer to the following web page:

A customer of mine had a newly set up Windows Server 2012 R2 remote access server configured to use a static address pool. When opening the Remote Access Management Console they received the following error:

Settings for server cannot be retrieved. VPN is configured to allocate IP addresses using a static address pool, but no IP address ranges are configured.

My customer did not want to use a static address pool but instead use the existing DHCP server to provide IP addresses to remote access clients. To configure the remote access server to use the existing DHCP server, use the following command.

Set-VpnIPAddressAssignment -IPAssignmentMethod 'DHCP'

After using this PowerShell command, simply reopen the Remote Access Management console and the issue will no longer occur.

After a long wait, Exchange 2013 is finally ready for production. Exchange 2013 was released by Microsoft back in October 2012 with a number of limitations and problems. The removal of EMC and major changes to the product architecture had the Exchange community stunned.

The original release of Exchange 2013 outraged some members of the IT community as the product was not fit for production, with some saying Microsoft should have held off on the release until the product was in a more developed state. The original release was full of bugs and lacked fundamental features such as the ability to co-exist in an Active Directory environment running Exchange 2007 or Exchange 2010. This made migrating to Exchange 2013 impossible, meaning it could only be used in greenfield deployments.

Up until the 24th of February 2014, another major product limitation existed: Exchange 2013 could not be run on the Windows Server 2012 R2 operating system. Customers waited over half a year for support to be added, as Windows Server 2012 R2 has significant enhancements to its operational interface, making the original Windows Server 2012 feel like a prototype.

As of the 25th of February 2014, Microsoft has finally released the first official service pack for Exchange 2013, which resolves many of the original bugs that came with the RTM build as well as adding significant enhancements. New enhancements to the product include but are not limited to:

Windows Server 2012 R2 Support

The ability to log cmdlet commands in the Exchange Admin Center (EAC)

Edge Transport Role

A new communication method between Exchange and Outlook called MAPI over HTTP, a replacement for RPC over HTTP. It is disabled by default and needs to be enabled. MAPI over HTTP provides significant enhancements.

Almost a year and a half after the product's initial release, Exchange 2013 is ready for production. It is now in a stable state and packed full of rich features. For more information on changes made in Exchange Server 2013 Service Pack 1, please refer to the official blog release post which can be found here:

Microsoft has released a small but powerful tool for converting VMware virtual machines to Hyper-V virtual machines called Microsoft Virtual Machine Converter (MVMC). This tool is able to convert the entire virtual machine, including virtual disk and configuration files, to Hyper-V format ready to use on a Windows Server 2012 hypervisor.

Microsoft Virtual Machine Converter comes as part of the Microsoft Virtual Machine Converter Solution Accelerator package, which can be downloaded from the following URL:

Thursday, February 20, 2014

This post is interesting as it covers one of the weirdest Exchange 2010/2013 performance problems I have seen in my career for a while. By writing this blog post, I hope other organisations with the same bizarre issue can find it and resolve their Exchange performance problem.

This post will cover the following topics:

Symptoms of the problem

Troubleshooting Steps

Issue Resolution

My customer was running a single Exchange 2010 SP3 UR3 multi-role server; however, it needs to be noted that, based on what I have read, this issue can also occur with Exchange 2013.

Symptoms

RPC Response Times

The Outlook RPC response times we were seeing from clients on the same subnet as the Exchange server were abnormally high. We were consistently seeing response times in the vicinity of 10,000 - 20,000 ms across multiple clients. Response times should generally be below 50 ms, though they can be higher for remote clients in branch offices or clients connecting using RPC over HTTP, as shown in the following screenshot.

Autodiscover Requests Timing Out

Autodiscover requests were randomly timing out as the Exchange server was taking too long to respond. The following error was being generated in the Test E-mail AutoConfiguration tool:

Users were complaining that Mail Tips were failing across multiple Outlook clients. When composing new emails, instead of receiving a mail tip, users would see an error in the area where mail tips are displayed indicating there was a problem retrieving mail tips from the server. This was due to the server not responding to the mail tips request in a timely fashion.

Outlook Web App Performance Problems

Loading the Outlook Web App page and logging in was extremely slow, taking approximately 45 seconds for the page to completely load. Navigating the mailbox, composing new emails and accessing features such as Exchange Control Panel were also extremely slow and barely usable. Sometimes the server did not respond fast enough and the web page would simply time out.

Microsoft Outlook Load Times

Launching the Microsoft Outlook client was extremely slow and took approximately 21 seconds across all clients instead of the snappy 2-3 second load time.

Troubleshooting Performed

I was contracted to come in and resolve the performance problems for the customer. During my initial 4 hours troubleshooting the performance issues I went through and looked at a large number of items which generally contribute to Exchange performance issues.

General Server Performance

General server performance was examined first, including physical disk, memory, CPU and network utilisation. Items examined included:

Number of hard page faults

Disk queue length

Disk Time

CPU Utilisation

Memory utilisation

Network number of packets sent/received

Network bandwidth utilisation

A few other related counters in performance monitor were also examined, all within normal readings.

Network Stack Tweaking

For testing the following changes were made to the Windows network stack:

The Exchange User Monitor (ExMon) was run on the server to verify that no single user session, for example one affected by a virus or a bad Outlook profile, was placing significant load on the server. Readings from ExMon were normal.

Anti-Virus or Third Party Applications

Anti-virus and third party applications were investigated. No real time file scanning AV was installed on the Exchange server (as recommended by Microsoft).

The only third party application installed was Exclaimer Outlook Signature, which was found not to be causing an issue.

Network Connectivity Test

A network connectivity test was performed between the Exchange server and a client on the same subnet suffering performance issues using a bandwidth measuring tool known as iPerf. The network connectivity test showed the client was almost able to achieve gigabit speeds to the Exchange 2010 server over the local network segment.

IIS Application Pools

The IIS application pools were flushed with an IISReset to rule out application pool recycling and memory usage, limit or leak related issues, which can significantly impact the performance of web based applications on an IIS server. Had the problem been related to an application pool, an IISReset would generally have restored performance for a short period of time, which did not happen.

Certificate CRL Checking

When using public certificates, the service with which the certificate is associated must be able to contact the public certificate revocation list (CRL). In the event it cannot, web based applications can become sluggish as they wait for CRL timeouts to occur on the backend. I tested this from the Exchange server and was able to contact the CRLs, meaning there was no firewall blocking this communication.
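The reachability test described above can be sketched in a few lines of Python. This is only a minimal illustration, not the procedure used on the day: the function name check_crl and the 5 second timeout are assumptions, and the real CRL distribution point URL has to be read from the certificate itself.

```python
import time
import urllib.error
import urllib.request


def check_crl(url, timeout=5.0):
    """Try to download a CRL and return (reachable, seconds_taken).

    A slow or failed download here is the kind of delay that can make
    certificate-backed web applications feel sluggish.
    """
    started = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
            reachable = 200 <= response.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout, firewall block or bad URL.
        reachable = False
    return reachable, time.monotonic() - started
```

Calling check_crl with the CRL URL taken from the certificate returns a flag and the elapsed time, so a download that succeeds but takes several seconds still stands out.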

Exchange Troubleshooting Assistant

The Microsoft Exchange Troubleshooting Assistant also known as ExTRA can be used to identify performance related issues on an Exchange 2010 server. This was run and the report flagged nothing of interest.

Process Monitor

Microsoft Sysinternals Process Monitor (procmon.exe) was run on the Exchange 2010 server to identify application loops or unwanted activity which may be related to the slow loading websites. Output from procmon was normal.

I increased the event log level for OWA for testing purposes. No problems could be identified relating to performance problems from OWA Event Logs.

IPv6 Disabled

IPv6 is known to sometimes cause performance issues with Microsoft Exchange. As a result, for testing I advised the customer to turn off IPv6 by disabling it on the network interface adapter on the Exchange 2010 server and by putting in place the DisabledComponents registry DWORD value as described in Microsoft KB929852. This had no effect and the issue continued to occur.
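To make the DisabledComponents value less opaque, here is a minimal Python sketch that decodes it as a bitmask. The bit meanings in the table are an assumption based on my reading of Microsoft's documentation around KB929852 and should be verified against the current article before relying on them; the function and table names are hypothetical.

```python
# Bit meanings for the DisabledComponents DWORD. These are ASSUMED from
# Microsoft's documentation around KB929852; verify before relying on them.
DISABLED_COMPONENTS_BITS = {
    0x01: "all IPv6 tunnel interfaces disabled",
    0x02: "6to4 disabled",
    0x04: "ISATAP disabled",
    0x08: "Teredo disabled",
    0x10: "IPv6 disabled on non-tunnel interfaces (LAN, PPP)",
    0x20: "IPv4 preferred over IPv6 in the prefix policy table",
}


def decode_disabled_components(value):
    """Return a description for every flag set in a DisabledComponents value."""
    return [text for bit, text in DISABLED_COMPONENTS_BITS.items() if value & bit]
```

For example, decode_disabled_components(0x11) lists the tunnel and non-tunnel interface flags, which makes it easier to compare a manually created value with one written by a tool.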

Second Exchange 2010 Server

After all of the troubleshooting performed above, I advised my customer to build a new Exchange 2010 server, as they had a simple environment with only a single Exchange 2010 server. We performed the following steps:

Performed a fresh install of Windows 2008 R2 with all latest updates

Installed Exchange 2010 with SP1 multi role deployment

Installed Exchange 2010 SP3 update

Installed Exchange 2010 SP3 Update Rollup 3

Performed minor core configuration tasks such as configuring public folder replication, OAB Distribution, Enabling Outlook Anywhere, Configuring the Digital Certificate and a few other things.

Moved a few test users to the new Exchange 2010 server for testing purposes.

The test users experienced the exact same performance issues on the new server as those on the old server despite it being a fresh installation of Windows Server and Microsoft Exchange.

Resolution

My customer stumbled across the following webpage where another company was having a similar issue with their Exchange server performance:

This other company resolved their similar issue by disabling IPv6. Instead of manually disabling IPv6 as I did in my step above, they used the Microsoft FixIt tool to disable IPv6, which is available on the same knowledge base article, KB929852.

After running the Microsoft FixIt tool, the customer told me that the issue was resolved and Exchange was no longer underperforming.

What puzzles me, however, is that the Microsoft FixIt tool as documented simply creates the DisabledComponents registry value, something which I created manually during troubleshooting. This is documented under "For more information" in the manual section of KB929852, the same KB I used to create my manual registry value. Perhaps it makes another change which I am unaware of and which is not documented in the knowledge base article? It might be worthwhile testing this theory by running an application such as RegSnap before and after the tool is run to identify exactly what changes it has made.
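The before and after comparison suggested here can be approximated for a single registry key with a short Python sketch. It is far narrower than a full registry snapshot tool such as RegSnap: it only reads the values directly under one key, and the helper names are hypothetical. The key path shown in the usage note is where the DisabledComponents value is normally stored.

```python
def snapshot_values(key_path):
    """Read every value directly under an HKEY_LOCAL_MACHINE key (Windows only)."""
    import winreg  # imported here so diff_snapshots below works on any OS

    values = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
        value_count = winreg.QueryInfoKey(key)[1]
        for index in range(value_count):
            name, data, _value_type = winreg.EnumValue(key, index)
            values[name] = data
    return values


def diff_snapshots(before, after):
    """Compare two {name: data} snapshots and report what changed."""
    return {
        "added": {n: after[n] for n in after.keys() - before.keys()},
        "removed": {n: before[n] for n in before.keys() - after.keys()},
        "changed": {
            n: (before[n], after[n])
            for n in before.keys() & after.keys()
            if before[n] != after[n]
        },
    }
```

On the Exchange server you would call snapshot_values(r"SYSTEM\CurrentControlSet\Services\Tcpip6\Parameters") once before and once after running the tool, then pass both results to diff_snapshots. Any change the tool makes outside that key would not show up, which is why a whole-registry comparison is still the more thorough test.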

The other thing which is important to note is that Exchange should generally be deployed with IPv6 enabled. I have many customers running Exchange 2010/2013 servers with IPv6 enabled and no performance issues. This particular customer did mention to me that around the time the issue started occurring, they also had an upgrade on their core switch. This network infrastructure change should not be ruled out as possibly affecting the performance of clients connecting to Exchange.

This was definitely one of the most puzzling problems I have dealt with in a while. If you do have the same problem as the symptoms expressed in this post, please do try running the Microsoft FixIt tool as documented above. If this also resolves your problem, please do comment as I am looking forward to hearing your feedback.

Sunday, February 2, 2014

In large complex forests with multiple domains, the DNS suffix search list can be extensive and can result in DNS name resolution failing. Companies with a large number of domains often used WINS alongside DNS to provide single name resolution across all domain infrastructure. This is because it was not possible to provide a single name resolution technology spanning your entire Active Directory infrastructure using DNS.

Starting from Windows Server 2008, Microsoft introduced a new DNS concept known as GlobalNames which was a move to permanently decommission the legacy WINS technology to enable single name resolution.

When deploying GlobalNames you first enable it on all DNS servers in your domain using dnscmd.exe:

dnscmd ServerName /config /enableglobalnamessupport 1

You then create the GlobalNames zone file and ensure it replicates to all DNS servers in your Forest (best practice) or alternatively create a separate application partition to hold the GlobalNames zone which gives you the flexibility of choosing which DNS servers in your forest hold this zone. The whole point of GlobalNames is generally to allow DNS servers in other Active Directory domains (within the same forest) to all share the single name resolution hence removing the need for complex DNS Suffix Search Lists.

But what if you want DNS Servers in other forests to use the GlobalNames zone?

This is a little more complicated as we can't simply replicate the GlobalNames DNS zone to DNS servers in another Active Directory forest. The other forest runs under a completely different set of security identifiers and does not honour Ticket Granting Tickets (TGTs) handed out by Kerberos KDCs from another forest.

What Microsoft has done to allow other forests to share a GlobalNames DNS zone is to provide companies the ability to create service location (SRV) resource records in the remote forest's DNS zone. This is done as follows:

_globalnames._msdcs.remotedomain.local, which must point to the FQDN of a DNS server that hosts the GlobalNames zone in the main forest.
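As a rough illustration of what that record looks like, the Python sketch below renders it as a zone-file style line. Only the owner name and the requirement that the target is a DNS server FQDN come from the text above; the priority, weight and port values, the helper name and the example host name are placeholder assumptions.

```python
def globalnames_srv_record(remote_forest_root, target_dns_server,
                           priority=0, weight=100, port=53):
    """Render a zone-file style SRV line for the GlobalNames locator record.

    priority, weight and port are placeholder ASSUMPTIONS; the text above only
    specifies the owner name and that the target is a DNS server FQDN.
    """
    owner = f"_globalnames._msdcs.{remote_forest_root.strip('.')}."
    target = f"{target_dns_server.strip('.')}."
    return f"{owner} IN SRV {priority} {weight} {port} {target}"


# dns1.maindomain.local is a hypothetical DNS server hosting the GlobalNames zone.
print(globalnames_srv_record("remotedomain.local", "dns1.maindomain.local"))
```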

Saturday, February 1, 2014

The KRBTGT account is a user account which resides in the Users container of every Active Directory domain. This service account is critical and is required by the Kerberos Key Distribution Centre (KDC).

This account must never be deleted.

This account must never be renamed.

Before reading this post, I hope you have an understanding of how the Kerberos authentication protocol works. If not I encourage you to watch the following video entitled "How Kerberos Works" by Don Jones, a CBT Nuggets instructor. This can be found on the following link:

From watching this video you will understand that all users and devices on a network must be assigned a Ticket Granting Ticket (TGT) by the KDC, which then enables them to access resources on the network.

The KRBTGT account is critical to this process as the KDC uses a key derived from the KRBTGT account password to encrypt each TGT assigned to users and devices on the network. All KDC servers on the network have the KRBTGT account password because all domain controllers hold this account. The KRBTGT account is created upon the promotion of the first domain controller in a new Active Directory domain.

There are a few things to be aware of with the KRBTGT account when dealing with Read Only Domain Controllers (RODCs). RODCs also act as a KDC for branch offices and as a result require a KRBTGT account. However, RODCs do not contain the passwords of all accounts in an Active Directory domain, only the passwords an administrator has allowed in the Password Replication Policy (PRP) - for more information see http://technet.microsoft.com/en-us/library/cc730883.aspx

Now because the password associated with the KRBTGT account is so sensitive, we do not want it residing at branch sites, as the whole point of an RODC is to deploy it where physical security of the server is low. To ensure that the KRBTGT account of a compromised RODC can't be leveraged to request tickets to other domain controllers, each RODC has its own special KRBTGT account. This account has the format krbtgt_XXXXX, where "XXXXX" is a string of random numbers. This random string uniquely identifies the RODC and is generated when an RODC is installed.

The accounts generated for each RODC appear in the Users container on all Active Directory domain controllers in the domain. As a result, all writable domain controllers also keep a copy of each RODC's KRBTGT password hash.

Lastly, if an RODC receives a session ticket request based on a TGT that isn't valid, a Kerberos error will occur asking the client computer to request a new TGT. If the RODC does not have a copy of the user's password hash, the RODC will forward the TGT request to a writable domain controller.
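The two decisions described above can be laid out as a toy Python model. This is not a Kerberos implementation, and all names in it are hypothetical. It also makes one simplifying assumption that is not spelled out in the text: a TGT is treated as valid at an RODC only if it was issued under that RODC's own KRBTGT account.

```python
from dataclasses import dataclass, field


@dataclass
class Rodc:
    """Toy model of an RODC; not a Kerberos implementation."""
    own_krbtgt: str  # name of this RODC's own KRBTGT account
    cached_users: set = field(default_factory=set)  # hashes allowed by the PRP


def handle_tgt_request(rodc, user):
    """An RODC can only issue a TGT itself if it holds the user's password hash."""
    if user in rodc.cached_users:
        return "issue TGT locally"
    return "forward to writable DC"


def handle_session_ticket_request(rodc, tgt_issuer):
    """ASSUMPTION: a TGT is valid here only if this RODC's KRBTGT account issued it."""
    if tgt_issuer == rodc.own_krbtgt:
        return "issue session ticket"
    return "Kerberos error: client must request a new TGT"


# Hypothetical branch office RODC that has cached only alice's password hash.
branch = Rodc(own_krbtgt="krbtgt_12345", cached_users={"alice"})
print(handle_tgt_request(branch, "bob"))
print(handle_session_ticket_request(branch, "krbtgt"))
```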