Microsoft Virtual Academy
03 | Cluster Networking
Elden Christensen | Principal Program Manager Lead | Microsoft
Symon Perriman | Vice President | 5nine Software
Module Overview
- Cluster Network Infrastructure fundamentals
- Cluster Network Design Planning
- Cluster Network Configuration Options
Health Monitoring of Nodes in the Cluster
- Failover Clustering conducts health monitoring between nodes to detect when servers are no longer available
- When servers are unresponsive, clustering takes recovery action
- Unicast in nature and uses a request-reply process for reliability and security
  - Not just a basic ping
[Diagram: nodes exchanging "You there?" / "Yes" messages]
Network Health Detection
- Clustering does full-mesh monitoring between all network interfaces on all nodes
- Health monitoring:
  - Nodes exchange heartbeats every 1 second (configurable)
  - Nodes are considered down if they do not respond to 5 heartbeats (configurable)
- Nodes are removed from cluster membership if they exceed thresholds
- Communication over port 3343
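The threshold logic above can be sketched in a few lines. This is a minimal illustration only, not the cluster service's implementation: the class name `PeerMonitor` is hypothetical, and the rule that a single reply resets the missed-heartbeat counter is an assumption. The interval (1 second) and threshold (5 heartbeats) are the defaults stated on the slide.

```python
from dataclasses import dataclass

HEARTBEAT_INTERVAL_S = 1.0   # slide default: one heartbeat per second
MISSED_THRESHOLD = 5         # slide default: down after 5 unanswered heartbeats


@dataclass
class PeerMonitor:
    """Hypothetical tracker of unanswered heartbeats for one peer interface."""
    interval_s: float = HEARTBEAT_INTERVAL_S
    threshold: int = MISSED_THRESHOLD
    missed: int = 0

    def record(self, reply_received: bool) -> None:
        # Assumption: any reply resets the counter; a missing reply increments it.
        self.missed = 0 if reply_received else self.missed + 1

    @property
    def is_down(self) -> bool:
        # The peer is treated as down once the threshold is reached.
        return self.missed >= self.threshold

    @property
    def detection_time_s(self) -> float:
        # Approximate time needed to accumulate `threshold` missed heartbeats.
        return self.interval_s * self.threshold


monitor = PeerMonitor()
for reply in [True, False, False, False, False, False]:
    monitor.record(reply)
print(monitor.is_down)           # True
print(monitor.detection_time_s)  # 5.0
```

With the defaults, an unresponsive peer is flagged after roughly five seconds. Raising either setting trades slower detection for more tolerance of transient network delays.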
Viewing NetFT Virtual Adapter
- NetFT is a virtual network adapter
- Visible in Device Manager and with ipconfig /all
- Completely self-configuring
- Media Access Control (MAC) address is self-generated from a hash of the MAC address of the first physical NIC enumerated by NDIS on the cluster node
- NetFT self-configures an APIPA (Automatic Private IP Addressing) address
- No manual user configuration required
NetFT Architecture
- NDIS 6.2 miniport virtual adapter
- Supports Receive Side Scaling (RSS)
- Network fault tolerance for TCP and UDP across routed network connections
  - Each link independently monitored
- Supports IPv4 and IPv6
- Built-in route failure detection
- IP over UDP/IP tunneling
  - Virtual adapter tunnels over physical adapters
- Clustering leverages port 3343
  - NetFT uses UDP 3343
  - ClusSvc uses TCP 3343
[Diagram: network stack showing ClusSvc, TCP, UDP, IP, NDIS, the NetFT virtual adapter, and the physical NICs]
Cluster Network Discovery
- Cluster uses exactly one IP per subnet per NIC
- Cluster ignores other IPs from the same subnet configured on the NIC
- Cluster ignores other NICs and associated IPs from the same subnet
- Each NIC per node will be part of exactly one cluster network
- Cluster will use prefix matching to determine the set of cluster networks
- Cluster has built-in resilience to use IPv4 or IPv6 per NIC (prefix must match)
- Cluster will use an IP from a different subnet from another NIC
- Cluster will ignore other IPs configured on that NIC for subnets already discovered
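The discovery rules above can be modelled roughly as follows. This is a simplified sketch for one node, not the cluster's actual algorithm: the function name `discover_cluster_networks`, the input format, and the assumption that NICs and addresses are processed in enumeration order are all illustrative.

```python
import ipaddress


def discover_cluster_networks(nics):
    """Group one node's NICs into cluster networks by subnet prefix.

    `nics` is a list of (nic_name, [cidr, ...]) pairs in enumeration order.
    Returns (networks, ignored): `networks` maps each subnet to the
    (nic_name, ip) chosen for it; `ignored` lists NICs left unused.
    """
    networks = {}
    ignored = []
    for nic_name, cidrs in nics:
        for cidr in cidrs:
            iface = ipaddress.ip_interface(cidr)
            # Prefix matching: a subnet already discovered is skipped.
            if iface.network not in networks:
                networks[iface.network] = (nic_name, iface.ip)
                break  # each NIC joins exactly one cluster network
        else:
            # Every subnet on this NIC was already claimed by another NIC.
            ignored.append(nic_name)
    return networks, ignored


nets, ignored = discover_cluster_networks([
    ("NIC1", ["10.0.0.11/24", "10.0.0.12/24"]),  # second IP, same subnet: ignored
    ("NIC2", ["10.0.0.21/24"]),                  # same subnet as NIC1: NIC ignored
    ("NIC3", ["192.168.5.11/24"]),               # new subnet: second cluster network
])
for subnet, (nic, ip) in nets.items():
    print(subnet, nic, ip)
print("ignored:", ignored)
```

In this example NIC1 and NIC3 each form a cluster network, while NIC2 is ignored because its only subnet has already been discovered.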
Cluster Network Discovery: Example
[Diagram: NIC 1 and NIC 2 are on the same subnet; NIC 1 forms Cluster Network 1 and NIC 2 is ignored by the cluster]
Cluster Communication
- Clusters exchange three types of communication:
Network Health Monitoring:
- Heartbeats are sent to monitor the health status of network interfaces
- Sent over all cluster-enabled networks
Intra-Cluster Communication:
- Database updates and state synchronization sent between the nodes in the cluster
- Example: when creating a new resource, the cluster database must be updated on all nodes
- Sent over a single interface
CSV I/O Redirection:
- Metadata updates to files
- All I/O in failure scenarios
- Over the same network as intra-cluster communication
- Over a single interface
- Can leverage SMB Multichannel to stream over multiple interfaces
DEMO
Manage Cluster Networking
Network Bandwidth Planning
Heartbeats:
- Lightweight (only 134 bytes)
- Sensitive to latency
  - If cluster heartbeats are blocked by a saturated NIC, nodes could be removed from cluster membership
- Bandwidth is not important, but quality of service is
Intra-Cluster Communication:
- Lightweight
- Traffic varies by workload: generally infrequent on stable, running File / Hyper-V clusters, heavier on SQL / Exchange clusters
- Clustering is a distributed synchronous system, so latency will slow down cluster state changes (such as failover)
- Bandwidth is not important, but quality of service is
CSV I/O Redirection:
- Metadata updates
  - Lightweight and infrequent
  - Latency will slow down I/O performance
  - Yes, network performance will impact storage I/O performance!
  - Quality of service is most important
- Failure scenarios / asymmetric storage configurations
  - All I/O is forwarded via SMB over the network
  - Network bandwidth is most important
Key Take-away: The primary design consideration for cluster communication is ensuring quality of service
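A quick calculation shows why bandwidth is not the concern for heartbeats. The sketch below uses the 134-byte heartbeat size from this slide and the default 1-second interval from the Network Health Detection slide. The function name is hypothetical, and the model is an assumption: it counts one heartbeat per second from every node to every other node on each cluster network, ignoring replies and link-layer framing overhead.

```python
HEARTBEAT_BYTES = 134        # heartbeat size stated on the slide
HEARTBEATS_PER_SECOND = 1    # default interval of 1 second


def heartbeat_bits_per_second(nodes: int, networks: int) -> int:
    """Aggregate one-way heartbeat traffic for a full mesh, in bits per second."""
    # Full mesh: each node sends to every other node on every cluster network.
    directed_links = nodes * (nodes - 1) * networks
    return directed_links * HEARTBEAT_BYTES * 8 * HEARTBEATS_PER_SECOND


print(heartbeat_bits_per_second(nodes=2, networks=1))  # 2144
print(heartbeat_bits_per_second(nodes=4, networks=2))  # 25728
```

Even a four-node cluster with two networks generates only about 26 kbit/s of heartbeat traffic under these assumptions, which is negligible on any modern link. The risk is heartbeats being delayed or dropped behind other traffic, not the volume of heartbeats themselves.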
Traditional Network Configuration Guidance
What we have recommended for the last decade…
- At least 2 independent networks
- Public / Client Network
  - IPv4: static or DHCP-assigned address
    » APIPA (a.k.a. autonet) 169.254.x.x addresses not supported
  - IPv6: stateless address autoconfiguration (SLAAC)
    » Note: DHCPv6 not supported for cluster IP Address resources
  - Default gateway (routable)
- Inter-Node Communication
  - IPv6 (preferred) or IPv4
    » IPv6 link-local (fe80) works great
  - No default gateway (non-routable)
  - Separate physical network
This still applies, but we also support converged networking
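The traditional guidance can be expressed as a small configuration check. This is an illustrative sketch only, not a substitute for cluster validation: `NicConfig`, `check_nic`, and the role labels "public" and "internal" are hypothetical names, and only the gateway and APIPA rules from the slide are modelled.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NicConfig:
    role: str                      # "public" or "internal" (hypothetical labels)
    address: str                   # IPv4 or IPv6 address without prefix length
    gateway: Optional[str] = None  # default gateway, if one is configured


def check_nic(cfg: NicConfig) -> List[str]:
    """Return a list of deviations from the traditional guidance."""
    problems = []
    ip = ipaddress.ip_address(cfg.address)
    if cfg.role == "public":
        # APIPA addresses (169.254.x.x) are IPv4 link-local addresses.
        if ip.version == 4 and ip.is_link_local:
            problems.append("APIPA (169.254.x.x) address not supported")
        if cfg.gateway is None:
            problems.append("public network should have a default gateway")
    elif cfg.role == "internal":
        if cfg.gateway is not None:
            problems.append("internal network should not have a default gateway")
    return problems


print(check_nic(NicConfig("public", "10.0.0.5", "10.0.0.1")))  # []
print(check_nic(NicConfig("internal", "fe80::1")))              # []
print(check_nic(NicConfig("public", "169.254.10.20", "169.254.0.1")))
```

The last call reports the APIPA address as unsupported, while the IPv6 link-local address on the internal network passes, matching the guidance that fe80 addresses work well there.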
Are Separate Networks Really Needed?
Required?
- No: it is not required to have 2 separate networks
- Clustering does support a converged networking model
- Validate will generate a Warning to alert you of a potential single point of failure
- Validate is not NIC Teaming aware
Recommended?
- Yes: it is recommended to have redundant network communication between nodes
- Sort of… let's talk about what really matters and converged networking (next slide)
Converged Network Considerations
Resiliency:
- In a highly available system you want to avoid any single points of failure
- Many ways to accomplish network redundancy
  - Multiple independent networks
Quality of Service:
- Cluster heartbeats are lightweight, but sensitive to latency
  - If cluster heartbeats can't get through, this can be falsely interpreted as nodes being down
- Many ways to accomplish network quality of service
  - Multiple network cards