Node Features Plugin Programmer Guide
Overview
This document describes the node features plugin that is responsible for managing a node's active features. This is typically used for changing a node's characteristics at boot time. For example, an Intel Knights Landing (KNL) processor can be booted in various MCDRAM and NUMA modes. This document is intended as a resource to programmers wishing to write their own node features plugin.
const char plugin_name[]="launch Slurm plugin"
const char plugin_type[]="node_features/[knl_cray]"
- knl_cray: Use Cray's capmc command to manage an Intel KNL processor.
const uint32_t plugin_version=SLURM_VERSION_NUMBER
If specified, identifies the version of Slurm used to build this plugin and
any attempt to load the plugin from a different version of Slurm will result
in an error.
If not specified, then the plugin may be loaded by Slurm commands and
daemons from any version; however, this may result in difficult-to-diagnose
failures due to changes in the arguments to plugin functions or changes
in other Slurm functions used by the plugin.
The programmer is urged to study src/plugins/node_features/knl_cray/node_features_knl_cray.c for a sample implementation of a Slurm node features plugin.
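As a rough illustration of the identification symbols described above, a new plugin might declare them as follows. This is only a sketch: the plugin name and the "example" minor type are hypothetical, and the fallback definition of SLURM_VERSION_NUMBER is a stand-in so the fragment compiles outside the Slurm source tree (a real plugin obtains it from Slurm's headers).

```c
#include <stdint.h>

/* Assumption: stand-in so this sketch compiles outside the Slurm tree.
 * A real plugin gets SLURM_VERSION_NUMBER from Slurm's own headers. */
#ifndef SLURM_VERSION_NUMBER
#define SLURM_VERSION_NUMBER 0
#endif

/* Free-form description of the plugin (hypothetical text). */
const char plugin_name[] = "node_features example plugin";

/* Major type must be "node_features"; "example" is a hypothetical
 * minor type used only for this sketch. */
const char plugin_type[] = "node_features/example";

/* Ties the plugin to the Slurm version it was built against. */
const uint32_t plugin_version = SLURM_VERSION_NUMBER;
```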
API Functions
int init (void)
Description:
Called when the plugin is loaded, before any other functions are
called. Put global initialization here.
Returns:
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
int fini (void)
Description:
Called when the plugin is removed. Clear any allocated storage here.
Returns:
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
Note: These init and fini functions are not the same as those described in the dlopen (3) system library. The C run-time system co-opts those symbols for its own initialization. The system _init() is called before the Slurm init(), and the Slurm fini() is called before the system's _fini().
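A minimal sketch of an init/fini pair is shown here to illustrate the load and unload contract. The global buffer example_state is hypothetical, the SLURM_SUCCESS and SLURM_ERROR definitions are stand-ins for the values from Slurm's headers, and malloc/free stand in for Slurm's xmalloc/xfree so the fragment compiles on its own.

```c
#include <stdlib.h>

/* Assumption: stand-ins for the return codes defined in Slurm's headers. */
#define SLURM_SUCCESS 0
#define SLURM_ERROR  -1

/* Hypothetical global state owned by the plugin. */
static char *example_state = NULL;

/* Called once when the plugin is loaded: allocate global state. */
int init(void)
{
	if (example_state)		/* already initialized */
		return SLURM_SUCCESS;
	example_state = calloc(1, 64);	/* a real plugin would use xmalloc */
	if (!example_state)
		return SLURM_ERROR;
	return SLURM_SUCCESS;
}

/* Called when the plugin is removed: release everything init() allocated. */
int fini(void)
{
	free(example_state);		/* a real plugin would use xfree */
	example_state = NULL;
	return SLURM_SUCCESS;
}
```

Making both functions safe to call more than once is a design choice of this sketch, not a requirement stated by this guide.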
int node_features_p_reconfig(void)
Description:
Notifies the plugin that the configuration has changed, so it should read its configuration parameters again.
Returns:
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
int node_features_p_get_node(char *node_list)
Description:
Update active and available features on specified nodes.
Executed from the slurmctld daemon only and directly updates internal
node data structures.
Arguments:
node_list: Regular expression identifying
the nodes to be updated. Information about all nodes is updated if the value is NULL.
Returns:
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
int node_features_p_job_valid(char *job_features)
Description:
Determine if the user's job constraint string is valid.
This may be used to limit the type of operators supported (Slurm's active
feature logic only supports the AND operator) and prevent illegal
combinations of node features (e.g. multiple NUMA modes).
Executed from the slurmctld daemon only when either the job submit or
modify operation is invoked.
Arguments:
job_features: Job constraints specified by
the user (-C/--constraint options).
Returns:
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
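The validation described above could be sketched as follows. This is not the knl_cray implementation: the list of NUMA modes is taken from the example feature names used in this guide, the set of rejected operator characters is an assumption, and the return codes are stand-ins for the values from Slurm's headers.

```c
#include <string.h>

/* Assumption: stand-ins for the return codes defined in Slurm's headers. */
#define SLURM_SUCCESS 0
#define SLURM_ERROR  -1

/* NUMA mode names, taken from the example feature list in this guide. */
static const char *numa_modes[] = { "a2a", "hemi", "quad", "snc2", "snc4" };

/* Return 1 if the token of length len names a NUMA mode, else 0. */
static int is_numa_mode(const char *tok, size_t len)
{
	size_t i;

	for (i = 0; i < sizeof(numa_modes) / sizeof(numa_modes[0]); i++) {
		if ((strlen(numa_modes[i]) == len) &&
		    (strncmp(tok, numa_modes[i], len) == 0))
			return 1;
	}
	return 0;
}

int node_features_p_job_valid(char *job_features)
{
	const char *p;
	int numa_cnt = 0;

	if (!job_features || (job_features[0] == '\0'))
		return SLURM_SUCCESS;	/* no constraint, nothing to check */

	/* Only AND is accepted; the rejected characters are an assumption. */
	if (strpbrk(job_features, "[]()|*"))
		return SLURM_ERROR;

	/* Walk the '&' separated tokens and count requested NUMA modes. */
	p = job_features;
	while (*p) {
		size_t len = strcspn(p, "&");
		if (is_numa_mode(p, len))
			numa_cnt++;
		p += len;
		if (*p)
			p++;		/* skip the '&' separator */
	}
	return (numa_cnt > 1) ? SLURM_ERROR : SLURM_SUCCESS;
}
```

With this sketch, "cache&quad" is accepted, while "quad&hemi" (two NUMA modes) and "cache|flat" (OR operator) are rejected.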
char *node_features_p_job_xlate(char *job_features)
Description:
Translate a job's feature request to the node features needed at boot time.
Job features not required by this plugin (e.g. rack number) will not be
returned. For example, a user's requested features may be "cache&quad&knl&rack1".
Since "knl" and "rack1" represent physical characteristics of the node
and are not used by the node features plugin to boot the node, this function's
return value will be "cache,quad".
Executed from the slurmctld daemon only.
Arguments:
job_features: Job constraints specified by
the user (-C/--constraint options).
Returns:
Node features used by this plugin when configuring or booting a node.
A string with its memory allocated by xmalloc (i.e. the return value
must be released using Slurm's xfree function).
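The translation in the example above ("cache&quad&knl&rack1" becoming "cache,quad") might be sketched like this. The table of plugin-managed modes is taken from the example feature names in this guide, and malloc stands in for Slurm's xmalloc so the fragment compiles on its own; a real plugin would allocate with xmalloc so the caller can release the result with xfree.

```c
#include <stdlib.h>
#include <string.h>

/* Features managed by this sketch, taken from the examples in this guide. */
static const char *plugin_modes[] = {
	"cache", "flat", "equal", "a2a", "quad", "hemi", "snc2", "snc4"
};

/* Return 1 if the token of length len is a plugin-managed mode, else 0. */
static int is_plugin_mode(const char *tok, size_t len)
{
	size_t i;

	for (i = 0; i < sizeof(plugin_modes) / sizeof(plugin_modes[0]); i++) {
		if ((strlen(plugin_modes[i]) == len) &&
		    (strncmp(tok, plugin_modes[i], len) == 0))
			return 1;
	}
	return 0;
}

char *node_features_p_job_xlate(char *job_features)
{
	const char *p;
	char *out;
	size_t used = 0;

	if (!job_features || (job_features[0] == '\0'))
		return NULL;

	/* The result is never longer than the input string. */
	out = malloc(strlen(job_features) + 1);	/* stand-in for xmalloc */
	if (!out)
		return NULL;
	out[0] = '\0';

	/* Keep only '&' separated tokens that this plugin manages. */
	p = job_features;
	while (*p) {
		size_t len = strcspn(p, "&");
		if (is_plugin_mode(p, len)) {
			if (used)
				out[used++] = ',';
			memcpy(out + used, p, len);
			used += len;
			out[used] = '\0';
		}
		p += len;
		if (*p)
			p++;		/* skip the '&' separator */
	}
	return out;
}
```

In this sketch a request containing no plugin-managed features yields an empty string rather than NULL; that is a choice made here for simplicity, not behavior documented by this guide.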
bool node_features_p_node_power(void)
Description:
Report if the PowerSave mode is required to boot nodes.
Executed from the slurmctld daemon only.
Returns:
True if the plugin requires PowerSave mode for booting nodes.
bool node_features_p_node_reboot(void)
Description:
Report if the RebootProgram is required to boot nodes.
Executed from the slurmctld daemon only.
Returns:
True if the plugin requires RebootProgram mode for booting nodes.
void node_features_p_node_state(char **avail_modes, char **current_mode)
Description:
Get this node's available and current features (e.g. MCDRAM and NUMA
settings from BIOS for a KNL processor, such as
avail_modes="cache,flat,equal,a2a,quad,hemi,snc2,snc4" and
current_mode="cache,quad").
Executed from the slurmd daemon only.
Arguments:
avail_modes: Node state features which are
available. Value is allocated or appended to as appropriate with xmalloc functions.
current_mode: Node state features which
are currently in effect. Value is allocated or appended to as appropriate
with xmalloc functions.
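The "allocated or appended to" behavior of both arguments could look roughly like the sketch below. The helper append_features is hypothetical, realloc stands in for Slurm's xmalloc-based string functions, and the feature strings are hard-coded from the example above; a real plugin would query the hardware or BIOS instead.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: append "add" to *dst, comma separated, allocating
 * *dst if it is NULL. realloc stands in for Slurm's xmalloc functions. */
static void append_features(char **dst, const char *add)
{
	size_t old_len = (*dst) ? strlen(*dst) : 0;
	size_t add_len = strlen(add);
	char *tmp = realloc(*dst, old_len + add_len + 2);

	if (!tmp)
		return;			/* leave *dst unchanged on failure */
	if (old_len)
		tmp[old_len++] = ',';
	memcpy(tmp + old_len, add, add_len + 1);	/* copies the NUL too */
	*dst = tmp;
}

void node_features_p_node_state(char **avail_modes, char **current_mode)
{
	if (!avail_modes || !current_mode)
		return;

	/* Assumption: fixed values from this guide's example; a real plugin
	 * would read them from the node. */
	append_features(avail_modes,
			"cache,flat,equal,a2a,quad,hemi,snc2,snc4");
	append_features(current_mode, "cache,quad");
}
```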
char *node_features_p_node_xlate(char *new_features, char *orig_features)
Description:
Translate a node's new feature specification as needed to preserve any
original features (i.e. features outside of the domain of this plugin).
For example, a node's new features may be "cache,quad", while its original
features may have been "flat,hemi,knl,rack1".
The available features with respect to this plugin are "flat,hemi", while
features outside of the domain of this plugin are "knl,rack1".
In this case, this function's return value will be "cache,quad,knl,rack1".
Executed from the slurmctld daemon only.
Arguments:
new_features: Node's reported features.
orig_features: Node's previous feature state.
Returns:
Node's current features value.
A string with its memory allocated by xmalloc (i.e. the return value
must be released using Slurm's xfree function).
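The merge described above (new "cache,quad" plus original "flat,hemi,knl,rack1" yielding "cache,quad,knl,rack1") can be sketched as follows. The table of plugin-managed modes comes from the example feature names in this guide, and malloc stands in for Slurm's xmalloc; this is an illustration of the rule, not the knl_cray code.

```c
#include <stdlib.h>
#include <string.h>

/* Features managed by this sketch, taken from the examples in this guide. */
static const char *plugin_modes[] = {
	"cache", "flat", "equal", "a2a", "quad", "hemi", "snc2", "snc4"
};

/* Return 1 if the token of length len is a plugin-managed mode, else 0. */
static int is_plugin_mode(const char *tok, size_t len)
{
	size_t i;

	for (i = 0; i < sizeof(plugin_modes) / sizeof(plugin_modes[0]); i++) {
		if ((strlen(plugin_modes[i]) == len) &&
		    (strncmp(tok, plugin_modes[i], len) == 0))
			return 1;
	}
	return 0;
}

char *node_features_p_node_xlate(char *new_features, char *orig_features)
{
	size_t new_len = new_features ? strlen(new_features) : 0;
	size_t orig_len = orig_features ? strlen(orig_features) : 0;
	const char *p = orig_features ? orig_features : "";
	size_t used;
	char *out;

	/* Room for both strings, one joining comma and the terminator. */
	out = malloc(new_len + orig_len + 2);	/* stand-in for xmalloc */
	if (!out)
		return NULL;

	/* Start with the newly reported plugin features. */
	if (new_len)
		memcpy(out, new_features, new_len);
	used = new_len;
	out[used] = '\0';

	/* Append original features that lie outside this plugin's domain. */
	while (*p) {
		size_t len = strcspn(p, ",");
		if (len && !is_plugin_mode(p, len)) {
			if (used)
				out[used++] = ',';
			memcpy(out + used, p, len);
			used += len;
			out[used] = '\0';
		}
		p += len;
		if (*p)
			p++;		/* skip the ',' separator */
	}
	return out;
}
```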
bool node_features_p_user_update(uid_t uid)
Description:
Determine if the specified user can modify the currently available node
features.
Arguments:
uid: User ID of user making request.
Returns:
True if user can change node active features to other available features.
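One possible shape for this check is a lookup against a list of permitted user IDs. The allowed_uids table below is purely hypothetical; a real plugin would more likely build such a list from its own configuration, and this guide does not specify how.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical allow list: root plus one example user ID. */
static const uid_t allowed_uids[] = { 0, 1001 };

/* Return true if uid may change a node's active features. */
bool node_features_p_user_update(uid_t uid)
{
	size_t i;

	for (i = 0; i < sizeof(allowed_uids) / sizeof(allowed_uids[0]); i++) {
		if (allowed_uids[i] == uid)
			return true;
	}
	return false;
}
```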
Last modified 1 March 2016