  • Robert Kennedy

Week 6

This is the first week I was able to get my hands on HPCC Systems 7.10, the version of HPCC that will be compatible with Kubernetes, which I first mentioned in Week 3. Containerization has quite a few benefits, and one of them is, potentially, the ability for Thor to shut down and restart after every workunit. HThor already has this functionality, but Thor keeps its processes alive. Importantly, that means Thor keeps the Python interpreter alive after a workunit completes. Unfortunately, TensorFlow assumes the interpreter shuts down right after the program ends. The result is that TF has no way to free VRAM on the GPU while the interpreter stays alive. If you are using Python directly, restarting the interpreter happens more or less automatically as part of the workflow. Thor, however, keeps the interpreter alive indefinitely and only closes it when Thor itself shuts down, which only happens when the whole cluster goes down, whether from a cluster-wide issue, a restart, or a shutdown. The ability to shut down the interpreter at will is very important and should be added as a feature to better accommodate the Python-HPCC user. 7.10 supposedly does this; I will have to wait for the person who knows HPCC best to return from holiday to find out whether it actually happens, or whether it's even possible.
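To make the interpreter-lifetime point concrete, here is a minimal sketch in plain Python (not ECL, and not how HPCC or Thor does anything) of a common workaround outside of Thor: run the GPU work in a short-lived child interpreter so that everything it holds is released when that child exits. The helper name `run_in_fresh_interpreter` and the `CHILD_SOURCE` snippet are made up for illustration, and the child only prints its process ID instead of importing TensorFlow. Whether something like this can be done from embedded Python inside a Thor workunit is an open assumption on my part.

```python
import json
import os
import subprocess
import sys


def run_in_fresh_interpreter(source: str, timeout: float = 60.0) -> str:
    """Run `source` in a brand-new Python process and return its stdout.

    Hypothetical helper: the child interpreter exits as soon as `source`
    finishes, so the OS reclaims whatever resources that process held.
    """
    result = subprocess.run(
        [sys.executable, "-c", source],  # same Python executable, new process
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError if the child exits non-zero
    )
    return result.stdout


# Stand-in for a training job. A real job would import tensorflow here,
# do its work, and print whatever results the parent needs.
CHILD_SOURCE = """
import json, os
print(json.dumps({"pid": os.getpid(), "status": "done"}))
"""

report = json.loads(run_in_fresh_interpreter(CHILD_SOURCE))
# The work ran in a different process than the long-lived parent.
print("ran in separate process:", report["pid"] != os.getpid())
```

The parent process here plays the role of the long-lived interpreter: it never touches the GPU itself, so there is nothing left to free once the child is gone. The cost is that results have to be passed back through stdout (or files) rather than shared in memory.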


