Degraded Service Investigation

Resolved

All redundancies restored with all hosts active.

Monitoring

We are monitoring services as we finish up work on the host.

Problem Identified

Incident Report: Service Disruption - Host Configuration

Date/Time of Incident:

  • 1/16/2025, 10:00 PM MST: Web and Gateway traffic stopped processing.
    • The primary node in our Windows Cluster experienced a stop error (Blue Screen of Death). The stop error was caused by a driver left over from previously removed network adapters.
    • The stop error caused data processing to fail over to other nodes in our cluster. The failover caused a service disruption as virtual machines restarted and Microsoft SQL Server failed over to other Windows Clustered Microsoft SQL Server nodes.
  • 1/16/2025, 10:10 PM MST: Website services were restored.
  • 1/16/2025, 10:22 PM MST: All services fully restored.

Actions Taken:

  • VMs have been migrated off the affected host.

Planned Actions from Incident:

  • 1/16/2025: Microsoft and Dell suggested reinstalling the host OS (Windows Server) as the best method to remove the unwanted drivers and leftover configurations that are incompatible with the newly installed network adapters.

Investigating

We were able to isolate the issue, and all services are running as we continue our investigation into the root cause.

Investigating

We are investigating the cause of the degraded service.

2 services affected: