… getting the wrong answer fast is not an improvement over the right answer slow


Last week, whilst trying to get to grips with SQL Server AlwaysOn Failover Clusters, I set up a simple iSCSI target using the “iscsitarget” package as per the Debian docs. However, when trying to validate the cluster in WSFC (Windows Server Failover Clustering), the disk checks failed with:

“does not have the inquiry data (SCSI page 83h VPD descriptor) that is required by failover clustering”

This has something to do with the scsiId, which is required by the cluster manager to control volume ownership, being supplied by iscsitarget in a format unsupported by WSFC.

I failed to find a workaround for this and instead switched to using “tgt” to serve the iSCSI targets. I was pushed for time and couldn’t find a straightforward guide, so I’m documenting my steps here.

Today I’m trying to configure a couple of servers each with 2 LACP trunks going to separate switches on our network. I was hoping that if I made a single 802.3ad bond with all the interfaces it’d automatically work in active-backup mode with the 2 trunks and give me switch redundancy.

It would appear that the Linux bonding driver does do this, so I set up my bond as follows:

If I bring down both interfaces on aggregator 1 then the above switches to aggregator ID 3 and all seems good.

# ifconfig eth0 down; ifconfig eth1 down

But it all goes bad once I bring those interfaces back up; the machine disappears off the network.

# ifconfig eth0 up; ifconfig eth1 up

The issue appears to be that the link status on both trunks is up, and since the MAC address used is the same for each trunk (that of the first adapter), once traffic has passed through both switches they both have the MAC present in their switching tables.

I couldn’t find any proper workaround for this, and eventually found a Stack Exchange post outlining the same issue. Apparently, if the switches can be linked with VPC (Virtual Port Channel) or MLAG (Multi-Chassis Link Aggregation) then it can work, but otherwise not.

What I’ve done in the end is a poor man’s workaround that simply involves checking the status of the bond, and switching the interfaces when the aggregator becomes inactive. It looks like this (on Debian):
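A minimal sketch of that kind of status check, not necessarily the exact script used here. It assumes the bond is called bond0 and that the bonding driver exposes its state in /proc/net/bonding/bond0 with an "Active Aggregator Info" section and a per-slave "Aggregator ID" line. The function names active_aggregator and inactive_slaves are made up for illustration.

```shell
#!/bin/sh
# Sketch only: report bond slaves that sit outside the active 802.3ad aggregator.
# The default path below assumes a bond named bond0; pass another file as $1.

# Print the ID listed under "Active Aggregator Info" in a bonding status file.
active_aggregator() {
    awk '/Active Aggregator Info/ { seen = 1 }
         seen && /Aggregator ID:/ { print $3; exit }' "$1"
}

# Print each slave whose own "Aggregator ID" differs from the active one.
inactive_slaves() {
    awk -v active="$(active_aggregator "$1")" '
        /^Slave Interface:/ { slave = $3 }
        slave != "" && /Aggregator ID:/ && $3 != active { print slave }
    ' "$1"
}

BOND_FILE="${1:-/proc/net/bonding/bond0}"
if [ -r "$BOND_FILE" ]; then
    echo "active aggregator: $(active_aggregator "$BOND_FILE")"
    inactive_slaves "$BOND_FILE"
fi
```

Run from cron or a loop, the list of slaves outside the active aggregator gives a script something to act on. What to do with those interfaces (leaving them down, cycling them, and so on) depends on the setup, so the sketch only reports them.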

This is not an issue if you want to connect to a utf8 database, but the issue I had this morning was connecting to a latin1 database with psql from a Windows client (something I do rarely). If I set the codepage to utf8 to match client encoding, I got a “Not enough memory.” error:

I could set the codepage to 1252, but that would mean my setting for client_encoding would be a lie, and if I were to then revert to set client_encoding='WIN1252' I’d have come full circle and be back at the “FATAL: conversion between WIN1252 and LATIN1 is not supported” error message.

A quick Google search revealed these bug reports with no solutions. Another dig at the docs revealed the following passage:

pager

Controls use of a pager program for query and psql help output. If the environment variable PAGER is set, the output is piped to the specified program. Otherwise a platform-dependent default (such as more) is used.