Wednesday, September 2, 2020

Technology: Combining AlwaysOn Clusters with Merge/Transactional Replication outside of Active Directory (Part 3)

Ok, SQL configuration is relatively straightforward if you've already done the network/adapter/cluster configuration correctly. If you haven't, this involves a lot of debugging at different layers of the stack.


First off, it's REALLY important that you have identical SQL versions in use by both machines in the failover cluster ('select @@version' helps here). Make sure that you've installed identical versions of SQL Server. Also, make sure that the Windows account used to run the SQL Server service has identical credentials on both servers and is part of the Computer's administrator group (Services > 'SQL Server (MSSQLSERVER)' > 'Log on' > double check the account credentials are the same for both).

Once the SQL accounts/versions align, you'll need to configure the SQL Server instance to allow use within a failover cluster (on both servers). This is in the SQL Configuration Manager Tool > SQL Server Services > right-click the SQL Server (MSSQLSERVER) choice > go to "AlwaysOn High Availability" and check "Enable AlwaysOn Availability Groups", then "OK".





Do this on both servers, then restart the SQL Services on both. This will allow SQL Server to use the AlwaysOn components.

One step that you may or may not need (depending on the code you plan to run) is the "Distributed Transaction Coordinator" configuration. We make use of this for a script that creates indexed views in replication (I'll write this up separately, it's a custom script very specific to what we do at OAF), so for our configuration I'll need to enable this windows feature using the steps here.

Next, open up SSMS and log in. It makes the process easier if there's a shared Admin SQL account that exists on all 4 servers (MyPrimary01, MyPrimary02, MyDistributor and the Transactional Subscriber). Restore a backup of your target (or example) database - this will probably be replaced later, so for now I'd recommend using a minimized/seed backup. Once you have a restored backup, use the "New Availability Group Wizard" to setup the Availability group. Some notes:

  • Note that many of these SQL objects/failover cluster objects end up with virtual IPs or virtual URLs, which can cause conflicts. To prevent any possible naming conflicts (especially in cases where the failover cluster is using Virtual IPs/Virtual machine names) it's safest to make sure that none of these names are the same:
    • Availability Group Name
    • Server Names (for all servers in the cluster)
    • Failover Cluster Name
    • Listener Name
  • Full Recovery Mode is Required for whichever DB you are using - see notes in part 5 on customizations, this is necessary but can cause the transaction log to grow unsustainably until you set up log shipping
  • Under "Specify Replicas", with a standard SQL license you can have only two machines listed. Typically to get full use of the AlwaysOn feature you'll want Synchronous Commits with Automatic Failovers (there are some performance tuning/advanced features possible here). Skip the listener for this first setup



  • Ignore any errors about Windows Domain Logins - as long as the Windows account used to run the SQL service matches on both servers this should be fine.
  • Use Automatic Seeding (as long as the disk partition labels/sizes are the same on both machines this should be fine)
  • The next steps of the wizard are fairly self-explanatory - if everything works, you should see the cluster created.




So the availability group now exists, but can't really do much that's useful yet. A few different pieces are needed first - starting with a listener.

Listeners took me some time to figure out, a lot of the AlwaysOn documentation treats them like a load balancer (other server that points to the Primary node of the cluster), but they act more like a Status that is "owned" by the Primary node. This is where subnet masks come into play - you'll need to use a config script of them form:

USE [master] GO ALTER AVAILABILITY GROUP [MyAvailGroup] ADD LISTENER N'MyListener' ( WITH IP ((N'172.16.28.29', N'255.255.255.252'), (N'172.16.28.17', N'255.255.255.252') ) , PORT=6543); GO


where those IP addresses are within the available VPN subnets but not already claimed by a server. I'd like to pretend that I found a clean formula for this, instead it requires a bit of trial-and-error to see if there's a free IP address within the subnet you want. If nothing works, you may have to expand the subnet using the windows network configuration. The important part is that every failover cluster Node should have an IP in this listener script (so if the VPN IP of MyPRIMARY02 is 172.16.28.28, it needs to have a reference within the same subnet like 172.16.28.29 while the VPN IP for MyPRIMARY01 of 172.16.28.16 needs a reference of 172.16.28.17).


This listener is basically a set of virtual IPs that can be used to reference the Primary node, regardless of which server is actually the primary node. In Active Directory (or some internal Windows DNS server tools), these virtual IPs would automatically be treated like real servers; without these tools configured, we'll need to use some custom OpenVPN code (covered in part 5).

Before we get to the custom networking, we need to do the standard networking for SQL server replication - described next in part 4.


No comments:

Post a Comment

Cowboys vs. Soldiers

I classify developers into two main camps: Cowboys (aka Hackers) and Soldiers (aka Tweakers) - cowboys are great for startup environments wh...