<p><strong>NOTE</strong>: If you are using RES and specify RESEnvironmentName in your configuration, these steps will automatically be done for you.</p>
<p>Before you can use the cluster, you must configure the Linux users and groups for the head and compute nodes.
One way to do that would be to join the cluster to your domain.
But joining each compute node to a domain effectively creates a distributed denial of service (DDoS) attack on the domain controller.
<td>Command01_MountHeadNodeNfs</td>
<td>Mounts the Slurm cluster's shared file system at /opt/slurm/{{ClusterName}}. This provides access to the configuration script used in the next step.</td>
</tr>
<tr>
<td>Command02_CreateUsersGroupsJsonConfigure</td>
<td>Creates /opt/slurm/{{ClusterName}}/config/users_groups.json and creates a cron job to refresh it hourly. Updates /etc/fstab with the mount from the previous step.</td>
<p>If you configured extra file systems for the cluster that contain the users' home directories, then they should be able to ssh
in with their own ssh keys.</p>
<h2id="configure-submission-hosts-to-use-the-cluster">Configure submission hosts to use the cluster</h2>
<p><strong>NOTE</strong>: If you are using RES and specify RESEnvironmentName in your configuration, these steps will automatically be done for you on all running DCV desktops.</p>
<p>ParallelCluster was built assuming that users would ssh into the head node or login nodes to execute Slurm commands.
This can be undesirable for a number of reasons.
First, users shouldn't be given ssh access to critical infrastructure like the cluster head node.
<td>Command01_MountHeadNodeNfs</td>
<td>Mounts the Slurm cluster's shared file system at /opt/slurm/{{ClusterName}}. This provides access to the configuration script used in the next step.</td>
</tr>
<tr>
<td>Command03_SubmitterConfigure</td>
<td>Configures the submission host so it can directly access the Slurm cluster. Updates /etc/fstab with the mount from the previous step.</td>
</tr>
</tbody>
</table>
<p>The first command simply mounts the head node's NFS file system so you have access to the Slurm commands and configuration.</p>
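<p>As a rough illustration of what that mount amounts to, the resulting /etc/fstab entry has the general shape shown below. The head node address and mount options here are assumptions for illustration only; the Run Command document writes the real entry for you.</p>
<pre><code># Illustrative only: NFS mount of the head node's Slurm directory at /opt/slurm/{{ClusterName}}
&lt;head-node-ip&gt;:/opt/slurm  /opt/slurm/{{ClusterName}}  nfs  defaults  0  0
</code></pre>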
<p>The second command runs an Ansible playbook that configures the submission host so that it can run the Slurm commands for the cluster.
It will also compile the Slurm binaries for the OS distribution and CPU architecture of your host.
It also configures the modulefile that sets up the environment to use the Slurm cluster.</p>
<p><strong>NOTE</strong>: When the new modulefile is created, you need to refresh your shell environment before the modulefile
can be used.
You can do this by opening a new shell or by sourcing your .profile: <code>source ~/.profile</code>.</p>
<p>The clusters have been configured so that a submission host can use more than one cluster by simply changing the modulefile that is loaded.</p>
<p>On the submission host, simply open a new shell and load the modulefile for your cluster to access Slurm.</p>
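<p>For example, assuming the modulefile is named after your cluster (as shown in Run Your First Job below) and that a second cluster's modulefile is also available, a session on the submission host might look like the following; the second cluster name is purely illustrative.</p>
<pre><code>module load {{ClusterName}}
sinfo     # show the cluster's partitions and nodes
squeue    # show the job queue

module unload {{ClusterName}}      # switch clusters by swapping modulefiles
module load {{OtherClusterName}}   # hypothetical second cluster
</code></pre>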
<h2id="customize-the-compute-node-ami">Customize the compute node AMI</h2>
<p>Then update your aws-eda-slurm-cluster stack by running the install script again.</p>
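<p>A sketch of that update, assuming you re-run the same install script from your clone of the repository with the same configuration you used for the initial deployment; the exact flags are not shown here and should be taken from the installation documentation.</p>
<pre><code># Re-run the installer with the same configuration options used for the initial install
# (see the installation documentation for the exact flags).
./install.sh
</code></pre>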
<h2id="run-your-first-job">Run Your First Job</h2>
<p>Run the following command in a shell to configure your environment to use your Slurm cluster.</p>
<p><strong>NOTE</strong>: When the new modulefile is created, you need to refresh your shell environment before the modulefile
can be used.
You can do this by opening a new shell or by sourcing your profile: <code>source ~/.bash_profile</code>.</p>
<pre><code>module load {{ClusterName}}
</code></pre>
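<p>Once the module is loaded, a minimal first job could look like the following. <code>srun</code>, <code>sbatch</code>, and <code>squeue</code> are standard Slurm commands; the node count and <code>my_job.sh</code> are placeholders for your own job.</p>
<pre><code># Run a trivial command on one compute node and wait for the output
srun -N 1 hostname

# Or submit a batch script and check the queue
sbatch my_job.sh
squeue
</code></pre>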
<p>If you want to get a list of all of the clusters that are available, execute the following command.</p>
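<p>Because each cluster is exposed as a modulefile, listing the available modulefiles should show all of the clusters; the standard command for that (shown here as an assumption) is:</p>
<pre><code>module avail
</code></pre>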