Our SQL 2005 cluster experienced a power blip due to a computer room issue. The Quorum and the SQL3 instance were on physical SVR1, and the SQL4 instance was on physical SVR2. The outage took out Active Directory and most of the network in the computer room. The cluster has shared disks on a NetApp connected via iSCSI. All equipment went down and came back up on its own.
When power came back up, SVR1 was offline in the cluster and all services were online on SVR2.
I rebooted SVR1, it rejoined the cluster, and I was able to move the Quorum and SQL3 back to SVR1.
This is when we noticed an issue with the SQL4 instance. Applications on a .NET webserver that use SQL4 databases could not connect to their databases. The developers (non-admin users) could not connect via SSMS 2008 to their databases on SQL4. I (admin on the SQL servers) could connect to SQL4 with either SSMS 2005 or SSMS 2008.
To get the applications working, I failed over SQL4 to SVR1 and now everyone and all apps can connect to SQL4.
Any ideas what is wrong? I suspect a problem with SVR2, but I'm not sure what, since it appears to be functioning OK within the cluster, and cluster groups have no problem failing over and starting up on that physical node.
UPDATE: Apps show this error when trying to connect to SQL4 on SVR2:
Error:
(A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: SQL Network Interfaces, error: 26 - Error Locating Server/Instance Specified))
UPDATE 2: The SQL Browser service was not starting on SVR2, duh. Fixed by removing SQL from the node, evicting the node from the cluster, re-adding the server to the cluster, and then adding the node back to the SQL 2005 instances using SQL 2005 Setup.
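For anyone chasing the same error 26: clients locate a named instance by asking the SQL Browser service on UDP port 1434, so a quick probe of that port can tell you whether Browser is answering on the active node. Below is a rough Python sketch of such a probe, based on my reading of the SQL Server Resolution Protocol (MC-SQLR). The function names, the host "SQL4", and the instance name "INST1" are placeholders I made up for illustration, not values from our cluster, and this is not an official tool.

```python
import socket
import struct


def build_instance_query(instance_name):
    """Build a CLNT_UCAST_INST request: byte 0x04, instance name, null terminator."""
    return b"\x04" + instance_name.encode("ascii") + b"\x00"


def parse_browser_response(data):
    """Parse an SVR_RESP message into a dict of key/value pairs.

    Assumed layout: byte 0x05, 2-byte little-endian length, then a
    semicolon-delimited string such as 'ServerName;X;InstanceName;Y;tcp;1433;;'.
    """
    if len(data) < 3 or data[0] != 0x05:
        raise ValueError("not an SVR_RESP message")
    (length,) = struct.unpack_from("<H", data, 1)
    text = data[3:3 + length].decode("ascii", errors="replace")
    fields = text.split(";")
    pairs = {}
    # Walk the fields two at a time: key, value, key, value, ...
    for i in range(0, len(fields) - 1, 2):
        if fields[i]:
            pairs[fields[i]] = fields[i + 1]
    return pairs


def query_browser(host, instance_name, timeout=2.0):
    """Ask SQL Browser on host (UDP 1434) about one instance.

    Returns the parsed response, or None if nothing answers before the
    timeout, which is what you would expect when Browser is not running.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        sock.sendto(build_instance_query(instance_name), (host, 1434))
        data, _ = sock.recvfrom(4096)
    except socket.timeout:
        return None
    finally:
        sock.close()
    return parse_browser_response(data)


if __name__ == "__main__":
    # Hypothetical host and instance name; substitute your own.
    info = query_browser("SQL4", "INST1")
    if info is None:
        print("No reply from SQL Browser on UDP 1434")
    else:
        print("Instance listens on TCP port", info.get("tcp"))
```

A timeout here, while a connection that specifies the TCP port directly still works, would point at SQL Browser rather than at the database engine itself.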
Monday, September 12, 2011