
Commit 8869d9e

Deployed ef67eb9 with MkDocs version: 1.5.3
1 parent 9ba350f

File tree

4 files changed: +23 −11 lines changed

deploy-parallel-cluster/index.html

Lines changed: 21 additions & 9 deletions
@@ -185,6 +185,7 @@ <h2 id="create-the-cluster">Create the Cluster</h2>
 <p>If you look in CloudFormation you will see 2 new stacks when deployment is finished.
 The first is the configuration stack and the second is the cluster.</p>
 <h2 id="create-users_groupsjson">Create users_groups.json</h2>
+<p><strong>NOTE</strong>: If you are using RES and specify RESEnvironmentName in your configuration, these steps will automatically be done for you.</p>
 <p>Before you can use the cluster you must configure the Linux users and groups for the head and compute nodes.
 One way to do that would be to join the cluster to your domain.
 But joining each compute node to a domain effectively creates a distributed denial of service (DDOS) attack on the domain controller
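The page being changed here replaces per-node domain joins with a users_groups.json file that is generated once and consumed by every node. As a rough illustration of that pattern only (not the project's actual script or file format, and assuming jq is installed), a domain-joined host could dump its visible users to JSON like this:

# Illustrative sketch: capture the users visible on a domain-joined host as JSON
# so other nodes can create matching local users without querying the domain.
getent passwd | jq -R -s 'split("\n") | map(select(length > 0) | split(":")
  | {name: .[0], uid: .[2], gid: .[3], home: .[5]})' > users_groups.json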
@@ -204,12 +205,12 @@ <h2 id="create-users_groupsjson">Create users_groups.json</h2>
 </thead>
 <tbody>
 <tr>
-<td>Command01SubmitterMountHeadNode</td>
-<td>Mounts the Slurm cluster's shared file system, adds it to /etc/fstab.</td>
+<td>Command01_MountHeadNodeNfs</td>
+<td>Mounts the Slurm cluster's shared file system at /opt/slurm/{{ClusterName}}. This provides access to the configuration script used in the next step.</td>
 </tr>
 <tr>
-<td>Command02CreateUsersGroupsJsonConfigure</td>
-<td>Create /opt/slurm/{{ClusterName}}/config/users_groups.json and create a cron job to refresh it hourly.</td>
+<td>Command02_CreateUsersGroupsJsonConfigure</td>
+<td>Creates /opt/slurm/{{ClusterName}}/config/users_groups.json and creates a cron job to refresh it hourly. Updates /etc/fstab with the mount from the previous step.</td>
 </tr>
 </tbody>
 </table>
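Concretely, the mount performed by Command01_MountHeadNodeNfs in the table above amounts to a standard NFS mount of the head node's /opt/slurm export. A sketch with a placeholder head-node address (the real command comes from the cluster's stack outputs):

# head-node.example.internal is a placeholder; use the address from the stack outputs.
sudo mkdir -p /opt/slurm/{{ClusterName}}
sudo mount -t nfs head-node.example.internal:/opt/slurm /opt/slurm/{{ClusterName}}
# The /etc/fstab entry added by the next command would look roughly like:
# head-node.example.internal:/opt/slurm /opt/slurm/{{ClusterName}} nfs defaults 0 0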
@@ -223,7 +224,7 @@ <h2 id="create-users_groupsjson">Create users_groups.json</h2>
 </thead>
 <tbody>
 <tr>
-<td>command10CreateUsersGroupsJsonDeconfigure</td>
+<td>command10_CreateUsersGroupsJsonDeconfigure</td>
 <td>Removes the crontab that refreshes users_groups.json.</td>
 </tr>
 </tbody>
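Removing one cron entry while preserving the rest of a crontab is conventionally done by filtering, so the deconfigure command above likely amounts to something like the following sketch (matching the entry by file name is an assumption, not the project's actual code):

# Remove only the users_groups.json refresh entry from root's crontab.
sudo crontab -l | grep -v users_groups.json | sudo crontab -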
@@ -232,6 +233,7 @@ <h2 id="create-users_groupsjson">Create users_groups.json</h2>
 <p>If you configured extra file systems for the cluster that contain the users' home directories, then they should be able to ssh
 in with their own ssh keys.</p>
 <h2 id="configure-submission-hosts-to-use-the-cluster">Configure submission hosts to use the cluster</h2>
+<p><strong>NOTE</strong>: If you are using RES and specify RESEnvironmentName in your configuration, these steps will automatically be done for you on all running DCV desktops.</p>
 <p>ParallelCluster was built assuming that users would ssh into the head node or login nodes to execute Slurm commands.
 This can be undesirable for a number of reasons.
 First, users shouldn't be given ssh access to critical infrastructure like the cluster head node.
@@ -252,18 +254,22 @@ <h2 id="configure-submission-hosts-to-use-the-cluster">Configure submission hosts to use the cluster</h2>
 </thead>
 <tbody>
 <tr>
-<td>Command01SubmitterMountHeadNode</td>
-<td>Mounts the Slurm cluster's shared file system, adds it to /etc/fstab.</td>
+<td>Command01_MountHeadNodeNfs</td>
+<td>Mounts the Slurm cluster's shared file system at /opt/slurm/{{ClusterName}}. This provides access to the configuration script used in the next step.</td>
 </tr>
 <tr>
-<td>Command03SubmitterConfigure</td>
-<td>Configure the submission host so it can directly access the Slurm cluster.</td>
+<td>Command03_SubmitterConfigure</td>
+<td>Configures the submission host so it can directly access the Slurm cluster. Updates /etc/fstab with the mount from the previous step.</td>
 </tr>
 </tbody>
 </table>
 <p>The first command simply mounts the head node's NFS file system so you have access to the Slurm commands and configuration.</p>
 <p>The second command runs an Ansible playbook that configures the submission host so that it can run the Slurm commands for the cluster.
+It will also compile the Slurm binaries for the OS distribution and CPU architecture of your host.
 It also configures the modulefile that sets up the environment to use the Slurm cluster.</p>
+<p><strong>NOTE</strong>: When the new modulefile is created, you need to refresh your shell environment before the modulefile
+can be used.
+You can do this by opening a new shell or by sourcing your .profile: <code>source ~/.profile</code>.</p>
 <p>The clusters have been configured so that a submission host can use more than one cluster by simply changing the modulefile that is loaded.</p>
 <p>On the submission host, just open a new shell, load the modulefile for your cluster, and you can access Slurm.</p>
 <h2 id="customize-the-compute-node-ami">Customize the compute node AMI</h2>
@@ -281,8 +287,14 @@ <h2 id="customize-the-compute-node-ami">Customize the compute node AMI</h2>
 <p>Then update your aws-eda-slurm-cluster stack by running the install script again.</p>
 <h2 id="run-your-first-job">Run Your First Job</h2>
 <p>Run the following command in a shell to configure your environment to use your Slurm cluster.</p>
+<p><strong>NOTE</strong>: When the new modulefile is created, you need to refresh your shell environment before the modulefile
+can be used.
+You can do this by opening a new shell or by sourcing your profile: <code>source ~/.bash_profile</code>.</p>
 <pre><code>module load {{ClusterName}}
 </code></pre>
+<p>If you want to get a list of all of the available clusters, execute the following command.</p>
+<pre><code>module avail
+</code></pre>
 <p>To submit a job, run the following command.</p>
 <pre><code>sbatch /opt/slurm/$SLURM_CLUSTER_NAME/test/job_simple_array.sh
 </code></pre>
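After loading the modulefile, the usual Slurm client commands work against the cluster, so a typical first-job check might look like this (the job ID is a placeholder taken from sbatch's output):

sbatch /opt/slurm/$SLURM_CLUSTER_NAME/test/job_simple_array.sh
squeue        # watch the array tasks go from PD (pending) to R (running)
sacct -j 1    # accounting record; replace 1 with the job ID printed by sbatch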

index.html

Lines changed: 1 addition & 1 deletion
@@ -309,5 +309,5 @@ <h4 class="modal-title" id="keyboardModalLabel">Keyboard Shortcuts</h4>
 
 <!--
 MkDocs version : 1.5.3
-Build Date UTC : 2024-05-13 23:49:02.420384+00:00
+Build Date UTC : 2024-05-15 18:42:21.987933+00:00
 -->

search/search_index.json

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

sitemap.xml.gz

0 Bytes
Binary file not shown.
