Today we ran into a bit of a snag... the new cluster is more powerful than we originally thought! We were running performance tests, and it appears there was not enough power available to run the cluster at full load. Unfortunately, we learned this the hard way when parts of the data center lost power.
We were able to:
- Verify that all firmware is up to date (Dell and DDN storage).
- Run STREAM to test memory bandwidth, with no errors encountered.
- Run bibw to test bi-directional bandwidth and confirm that all nodes receive the expected bandwidth.
- Complete the DDN network configuration.
Members of the HPC team, facilities, and the electricians are working to evaluate the power constraints of the power distribution units.