Pacemaker

Service Management Quick Start

All of the below can be applied to individual resources, or to a full resource group.

  • pcs resource cleanup resource_name to clear any failed actions, and as a first attempt at recovering a failed service
  • pcs resource disable resource_name to disable a resource. Generally the next step if pcs resource cleanup resource_name fails to recover a failed resource
  • pcs resource enable resource_name to enable a disabled service.
  • pcs resource restart resource_name to restart a running resource.
  • pcs resource move resource_name to move a running resource to a new node.
  • find_service resource_name to find which node a resource is on. find_service is a home-built wrapper for pcs commands.
  • find_my_services to find which resources are on this node. Again, a home-built wrapper for pcs commands (a rough sketch of both follows this list).
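
find_service and find_my_services are site-local scripts, not part of pcs itself. A rough sketch of what such wrappers might boil down to (assuming they simply filter standard pacemaker output; the real scripts may differ):

  # find_service (sketch): report which node a resource or resource group is running on.
  # crm_resource ships with pacemaker and can locate a resource by name.
  crm_resource --resource "$1" --locate

  # find_my_services (sketch): list resources that pcs reports as running on this node.
  pcs status resources | grep "$(hostname -s)"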

Resource Management and resource groups

In SL6 / rgmanager, resources are organized and managed as services. In SL7, resources are managed individually and are optionally grouped into resource groups.

This allows you to restart individual resources. To restart the main www apache service in SL6 / rgmanager, I run "clusvcadm -R www", which restarts everything in that service (stop nfs service, stop apache, unmount filesystem, deactivate logical volume, unbind IP address, then bind IP address, activate LV, mount FS, start apache, start nfs). In SL7, I can run "pcs resource restart www-apache" to restart just the apache server without stopping and starting everything else.

Resource groups also provide a shortcut for organizing resources and dependencies. Resources are started and stopped in the order listed, and if one resource is stopped any subsequent resources in that group will also be stopped.
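
For example, a web-service group like testwww could be built up in start order with commands along these lines (the resource names, IP address, device, and paths here are illustrative placeholders, not our actual configuration):

  • pcs resource create testwww-ip IPaddr2 ip=192.0.2.10 cidr_netmask=24 --group testwww
  • pcs resource create testwww-fs Filesystem device=/dev/vg_test/lv_testwww directory=/hac/services/testwww fstype=ext4 --group testwww
  • pcs resource create testwww-apache apache configfile=/hac/services/testwww/etc/httpd/conf/httpd.conf --group testwww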

See official documentation at http://www.linux-ha.org/wiki/OCF_Resource_Agents

Example commands

  • pcs status = show status of cluster
  • pcs resource move testwww = move testwww service off of its current node.
    • equivalent to pcs resource ban testwww lnx904-p14 if testwww was currently running on lnx904-p14
    • pcs constraint delete cli-ban-testwww-on-lnx904-p14 = delete the constraint created by either of the above commands. May result in testwww moving back to its original node
  • pcs resource cleanup testwww = cleanup "Failed action" messages for testwww
  • pcs resource restart testwww = restart the testwww resource group
  • pcs resource restart testwww-apache = restart the testwww-apache resource (just the apache server - not the entire resource group)
  • pcs resource debug-start testwww = manually start the testwww resource group without granting control to pacemaker. Also see debug-stop and debug-monitor.
  • pcs resource show testwww-apache = show the details of the testwww-apache resource.
    • Resource: testwww-apache (class=class provider=provider type=type)
    • /usr/lib/class/resource.d/provider/type
  • pcs resource update testwww-apache configfile=/hac/services/testwww/etc/httpd/conf/httpd.conf = update the configfile attribute for the testwww-apache resource
  • pcs resource describe apache = describe the apache resource agent
  • pcs resource disable testwww = stop and disable testwww
  • pcs resource enable testwww = enable and start testwww
  • pcs resource create lnx00-ip IPaddr2 ip=128.84.44.44 --group lnx00 --disabled = create but don't start an lnx00-ip resource as the next resource in the lnx00 group (lnx00 is created if it doesn't already exist)

Modify the cluster's XML config directly:

  • cibadmin --query >tmp.xml
  • vi tmp.xml # for example
  • cibadmin --replace --xml-file tmp.xml
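
If you'd rather stay within pcs, the same edit cycle can also be done with pcs cluster cib / cib-push (assuming a reasonably recent pcs):

  • pcs cluster cib tmp.xml
  • vi tmp.xml # for example
  • pcs cluster cib-push tmp.xml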

fence on fail

By default, if a resource fails to stop cleanly, that node will fence itself. This can be changed per resource by changing the on-fail attribute from "fence" to "block". For example:
  • pcs resource update lnx00-lv-test op stop on-fail=block

Recover from failed resource

If a resource fails with on-fail=block or on-fail=stop, it will not be restarted until you "pcs resource cleanup" the resource. For example:
  • manually fix whatever was causing the problem. For example, if the stop operation for a filesystem or lv resource fails, you may need to kill processes to manually unmount the filesystem and deactivate the logical volume (see the example after this list).
  • pcs resource cleanup lnx00 # for example
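
As a concrete (hypothetical) illustration for a Filesystem / LVM pair mounted at /hac/services/lnx00, the manual fix-up might look like:

  • fuser -vm /hac/services/lnx00 # see what is holding the mount
  • fuser -km /hac/services/lnx00 # kill those processes (use with care)
  • umount /hac/services/lnx00
  • lvchange -an vg_lnx00/lv_lnx00 # deactivate the logical volume
  • pcs resource cleanup lnx00 # let pacemaker take over again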

Intervals

In SL7 / pacemaker, each resource action can have a custom timeout and interval. This can be useful for actions that need to happen at a specified interval, especially when that action should occur more frequently than once an hour (unlike cron, which runs actions at specified clock times rather than at arbitrary intervals).
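
For example, to make a (hypothetical) resource's monitor action run every five minutes with a 60 second timeout:

  • pcs resource update lnx00-sync op monitor interval=300s timeout=60s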

Constraints

In SL7 / pacemaker, location and order constraints can be created to ensure that one resource starts before another, two resources start (or don't) on the same node, etc. Location constraints also determine where a resource can run (in lieu of failover domains used in SL6 / rgmanager).
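
For example (resource names are illustrative; note that resources within a single group already get equivalent ordering and colocation implicitly):

  • pcs constraint order testwww-fs then testwww-apache = start testwww-apache only after testwww-fs has started
  • pcs constraint colocation add testwww-apache with testwww-fs INFINITY = keep testwww-apache on the same node as testwww-fs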

Example Commands

  • pcs constraint = view summary of all constraints
  • pcs constraint --full = view full information on all constraints
  • pcs constraint delete cli-ban-testwww-on-lnx904-p14 = delete the constraint that bans testwww from running on lnx904

Location constraints / Stickiness

Our clusters are configured as opt-out clusters, where by default resources can run on any node.

By default, pacemaker won't hesitate to move a running resource to a node with a higher score. To tell pacemaker not to move a healthy resource, we change the resource-stickiness property from the default (0) to 500 (pcs resource defaults resource-stickiness=500). With all that, our failover domains from rgmanager can be achieved as location constraints in pacemaker as follows:

For a resource that can run anywhere but has a first and second choice (and should not fall back to the preferred node)
  • Set the score of the preferred node to 200
    pcs constraint location cesronl prefers cesr101-p16=200
  • Set the score of the second choices to 100
    pcs constraint location cesronl prefers cesr102-p16=100

For a resource that can run anywhere but should fall back to the preferred node
  • Set the score of the preferred node to INFINITY
    pcs constraint location cesroff prefers cesr102-p16=INFINITY

For a resource that should not run on a specific node
  • set the score of the nodes to avoid to -INFINITY
    pcs constraint location cesroff avoids cesr101-p16=INFINITY

For more, please see http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html/Clusters_from_Scratch/_prevent_resources_from_moving_after_recovery.html

Utilization and placement strategy

If desired, you can allow pacemaker to place resources based on the utilization of each node - taking into consideration (optional) utilization capacities assigned to resources. For more, please see https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/high_availability_add-on_reference/s1-utilization-haar
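
A hypothetical example (node / resource names and capacities made up here):

  • pcs property set placement-strategy=balanced
  • pcs node utilization lnx904-p14 memory=16384 cpu=8
  • pcs resource utilization testwww-apache memory=2048 cpu=1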

Resource creation and organization

  • Anything that requires a cron job (as opposed to a regularly executing command using the monitor action) or email notification (using the MailTo resource agent) should be grouped into a resource group.
  • All resources within a resource group should be named resourcegroupname-blah, where resourcegroupname is the name of the resource group and blah is a description of the resource.
  • Notifications on resource changes should be managed by making a MailTo RA the last resource agent in that group (see the example after this list).
  • All SL6 services have been recreated as resource groups in pacemaker.
  • Any static files (configs, scripts, etc.) in /hac are managed by puppet and should be edited in lnxpup:/mnt/puppet/configs/ .
  • /hac/services/${servicename} is the chroot for each resource group
  • All log files should be local in /hac/services/${servicename}/var/log/
    • some are managed by the resource agent and may be in /var/log
  • All pid files should be local in /hac/services/${servicename}/var/run/
    • some are managed by the resource agent and may be in /var/run or /var/run/cluster
  • All lock files should be local in /hac/services/${servicename}/var/lock/subsys/
    • some may be in /var/lock/subsys
  • All state information is in a file system bound to that service. For example, yp3:/mnt/yp3
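
For instance, the MailTo convention above could be implemented with something like (address and subject are placeholders):

  • pcs resource create testwww-mailto MailTo email=cesradmin@example.com subject="testwww cluster event" --group testwww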

Creating and testing new resource agents

  1. create new resource agent in /usr/lib/ocf/resource.d/test/new_agent (a bare-bones skeleton is sketched after this list)
  2. test using pcs resource describe ocf:test:new_agent
  3. create new resource using this agent using something like pcs resource create new_service ocf:test:new_agent --disabled op stop timeout=60 on-fail=block
  4. test new resource using:
    • pcs resource debug-start new_service
    • pcs resource debug-monitor new_service
    • pcs resource debug-stop new_service
  5. once all of the above return successfully, manually propagate new_agent to each cluster member
  6. give the cluster control of the new service using pcs resource enable new_service
  7. when ready for production, ask cmpgrp to manage and propagate new_agent using puppet.
  8. once in production, recreate new_service using ocf:classe:new_agent instead of ocf:test:new_agent
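
For reference, here is a minimal skeleton of what such an agent could look like. This is only a bare-bones sketch (it "manages" a dummy state file); a real agent should source the OCF shell functions, declare its parameters, and validate properly:

#!/bin/sh
# /usr/lib/ocf/resource.d/test/new_agent -- minimal OCF resource agent sketch (hypothetical)

# pacemaker normally provides HA_RSCTMP; fall back to /var/run when run by hand
STATEFILE="${HA_RSCTMP:-/var/run}/new_agent.state"

meta_data() {
    cat <<'EOF'
<?xml version="1.0"?>
<resource-agent name="new_agent" version="0.1">
  <version>1.0</version>
  <longdesc lang="en">Minimal example agent</longdesc>
  <shortdesc lang="en">Minimal example agent</shortdesc>
  <parameters/>
  <actions>
    <action name="start"     timeout="20s"/>
    <action name="stop"      timeout="20s"/>
    <action name="monitor"   timeout="20s" interval="10s"/>
    <action name="meta-data" timeout="5s"/>
  </actions>
</resource-agent>
EOF
}

case "$1" in
    start)     touch "$STATEFILE"; exit 0 ;;                # OCF_SUCCESS
    stop)      rm -f "$STATEFILE"; exit 0 ;;                # OCF_SUCCESS
    monitor)   [ -f "$STATEFILE" ] && exit 0 || exit 7 ;;   # OCF_SUCCESS / OCF_NOT_RUNNING
    meta-data) meta_data; exit 0 ;;
    *)         exit 3 ;;                                    # OCF_ERR_UNIMPLEMENTED
esac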

Troubleshooting

Resource Failures

cesradmin members will receive emails when resources fail. These failures will also appear in the output of "pcs status". A general workflow for debugging and dealing with these failures is:
  1. log into the cluster, and execute "pcs status" to determine if recovery was successful or if the resource is failed.
  2. If it's running, execute "pcs resource cleanup <resource_name>" to clean up the output of "pcs status"
  3. If it's failed, proceed with:
    1. "pcs resource disable <resource_name>"
    2. "pcs resource cleanup <resource_name>"
    3. "pcs resource enable <resource_name>"
  4. If it's still failing, proceed with:
    1. "pcs resource disable <resource_name>"
    2. "pcs resource cleanup <resource_name>"
    3. "pcs resource debug-start <resource_name>"
    4. "pcs resource debug-monitor <resource_name>"
    5. "pcs resource debug-stop <resource_name>"
    6. Once all of the above are successful, "pcs resource enable <resource_name>"
  5. If the service's logs aren't instructive enough, check the following in order of verbosity, lowest to highest (a grep example follows this list):
    1. /var/log/messages on the server that saw failures
    2. /var/log/pacemaker.log on the server that saw failures
    3. /var/log/messages on the DC (determined using "pcs cluster status" or "pcs status")
    4. /var/log/pacemaker.log on the DC
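
When digging through the logs, grepping for the resource name (hypothetical name here) usually narrows things down quickly:

  • grep testwww /var/log/pacemaker.log | tail -50
  • grep -i error /var/log/pacemaker.log | tail -50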

Pacemaker Log Files

All pacemaker logs will be in /var/log/pacemaker.log, with a subset of these logs also appearing in /var/log/messages.

For the most complete logs, see the "master" node, known in pacemaker as the DC (Designated Co-ordinator). "pcs status" or "pcs cluster status" will show you which node is the current DC.