Description
Environment:
- Python version [3.7.7]
- Spark version [3.0.0]
- TensorFlow version [2.3.0]
- TensorFlowOnSpark version [2.2.2]
- Cluster version [Standalone]
Describe the bug:
I have 2 issues regarding TensorBoard when training my model on 2 worker nodes (a minimal sketch of the relevant callback setup follows this list):
1- After the training process completes, the TensorBoard files on worker 1 are deleted immediately, while the files on worker 0 are kept, even though I can use TensorBoard to check details while training is still running.
2- I am trying to profile the model on the Profiler page to see the time consumed for batches 3 to 5, but I get 0 ms for communication time, specifically for Device Collective Communication and Device to Device Time. However, the Average Step Time shows reasonable values like 19368.9 ms. Also, the Hosts drop-down list shows only one detected host in the cluster, not 2. Why does this happen?
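For reference, here is a minimal sketch of the kind of callback configuration involved, assuming a Keras model and the standard tf.keras.callbacks.TensorBoard callback; the HDFS log path and the build_callbacks helper are placeholders of mine, not code from the actual train_file.py:

```python
# Minimal sketch (not the actual train_file.py): the HDFS path and the
# build_callbacks() helper are placeholders for illustration only.
import tensorflow as tf

def build_callbacks(worker_index):
    # Hypothetical shared location; writing event files to shared storage
    # keeps them around even if an executor's local temp dir is cleaned up
    # when the job ends.
    log_dir = "hdfs://namenode:9000/user/me/tensorboard/worker{}".format(worker_index)
    return [
        tf.keras.callbacks.TensorBoard(
            log_dir=log_dir,
            histogram_freq=1,
            profile_batch=(3, 5),  # profile batches 3 through 5 (TF >= 2.3)
        )
    ]
```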
Logs:
If applicable, add logs to help explain your problem. Note: errors may not be fully described in the driver/console logs. Make sure to check the executor logs for possible root causes.
Spark Submit Command Line:
spark-submit --master spark://master:7077 train_file.py --cluster_size 2 --epochs 1
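For context, a rough sketch of how such a script is typically wired to TensorFlowOnSpark; only the TFCluster API itself comes from the library, while main_fun and the argument handling here are assumptions about what train_file.py looks like:

```python
# Sketch of a TensorFlowOnSpark launcher matching the spark-submit call above.
# main_fun and the argument defaults are illustrative assumptions.
import argparse
from pyspark.sql import SparkSession
from tensorflowonspark import TFCluster

def main_fun(args, ctx):
    # ctx.worker_num identifies this executor (0, 1, ...), e.g. for per-worker
    # TensorBoard log directories; the actual training code is omitted here.
    pass

if __name__ == "__main__":
    spark = SparkSession.builder.appName("train_file").getOrCreate()
    sc = spark.sparkContext

    parser = argparse.ArgumentParser()
    parser.add_argument("--cluster_size", type=int, default=2)
    parser.add_argument("--epochs", type=int, default=1)
    args = parser.parse_args()

    # Launch one TF node per executor; tensorboard=True starts TensorBoard on a worker.
    cluster = TFCluster.run(sc, main_fun, args, args.cluster_size,
                            num_ps=0, tensorboard=True,
                            input_mode=TFCluster.InputMode.TENSORFLOW,
                            master_node='chief')
    cluster.shutdown()
```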