The logfile looks similar to the following:
[23.11|08:46:11] 4 45 FAILED TEMPERATURE
[23.11|08:46:11] 4 46 FAILED TEMPERATURE
[23.11|08:46:11] 4 47 FAILED TEMPERATURE
[23.11|08:46:11] --- End summary ---
[23.11|08:46:11]
[23.11|08:46:11] Running more reference calculations....
[23.11|08:46:11] For job step4_attempt47_simulation, temperature > 5000.0 K starting at frame 37
[23.11|08:46:11] Running reference calculations on frames [36] from step4_attempt47_simulation\ams.rkf
[23.11|08:46:11] Calculating 1 frames in total
[23.11|08:46:11] Running step4_attempt47_reference_calc2
[23.11|08:48:41] Reference calculations finished!
[23.11|08:48:41] WARNING: Reference job step4_attempt47_reference_calc2 failed: UNKNOWN. Attempting to use data anyway...
[23.11|08:48:41] Traceback (most recent call last):
  File "C:\Users\testadf\AppData\Local\Temp\pip-target-8dm09ykr\lib\python\scm/simple_active_learning/simple_active_learning_workflow.py", line 882, in run
  File "C:\Users\testadf\AppData\Local\Temp\pip-target-8dm09ykr\lib\python\scm/simple_active_learning/simple_active_learning_workflow.py", line 714, in run
  File "C:\Users\testadf\AppData\Local\Temp\pip-target-8dm09ykr\lib\python\scm/simple_active_learning/active_learning.py", line 355, in run_loop
  File "C:\Users\testadf\AppData\Local\Temp\pip-target-8dm09ykr\lib\python\scm/simple_active_learning/active_learning.py", line 449, in update_and_store_singlepoint_reference_data
  File "C:\Users\testadf\AppData\Local\Temp\pip-target-8dm09ykr\lib\python\scm/simple_active_learning/utils.py", line 101, in add_singlejob
  File "C:\Users\testadf\AppData\Local\Temp\pip-target-8dm09ykr\lib\python\scm/params/core/results_importer.py", line 345, in add_singlejob
TypeError: unsupported operand type(s) for *: 'NoneType' and 'float'
[23.11|08:48:41] ERROR: unsupported operand type(s) for *: 'NoneType' and 'float'
Job TIH2+O2 has finished
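The logfile shows that the workflow reached a certain segment (there are usually 10 segments by default; the logfile calls a segment a Step) and made 47 training attempts there (the logfile calls each one an attempt). Together this gives the name step4_attempt47: in the fourth segment, 47 attempts were made and none of them produced a usable model.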
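To see at a glance which step is stuck, the summary rows can be tallied with a few lines of Python. This is only a minimal sketch: it assumes every summary row has exactly the shape seen in the excerpt above (timestamp, step, attempt, status, reason), and the function name failed_attempts_per_step is made up for this illustration. Real logfiles may contain extra columns, in which case the pattern needs adjusting.

```python
import re
from collections import defaultdict

# Assumed row shape, taken from the excerpt above:
#   [dd.mm|hh:mm:ss] <step> <attempt> <STATUS> <REASON>
ROW = re.compile(r"^\[[^\]]+\]\s+(\d+)\s+(\d+)\s+([A-Z]+)\s+([A-Z_]+)\s*$")


def failed_attempts_per_step(log_text):
    """Return {step: [(attempt, reason), ...]} for rows whose status is FAILED."""
    failures = defaultdict(list)
    for line in log_text.splitlines():
        match = ROW.match(line.strip())
        if match and match.group(3) == "FAILED":
            step, attempt = int(match.group(1)), int(match.group(2))
            failures[step].append((attempt, match.group(4)))
    return dict(failures)


# Sample rows copied from the logfile excerpt.
sample_summary = """\
[23.11|08:46:11] 4 45 FAILED TEMPERATURE
[23.11|08:46:11] 4 46 FAILED TEMPERATURE
[23.11|08:46:11] 4 47 FAILED TEMPERATURE
[23.11|08:46:11] --- End summary ---
"""

if __name__ == "__main__":
    for step, attempts in failed_attempts_per_step(sample_summary).items():
        print(f"step {step}: {len(attempts)} failed attempts, last = {attempts[-1]}")
```

A step whose list of failed attempts keeps growing, as step 4 does here, is the one to restart from.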
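Take the result of the last training run, that is, the last training result that appears in the logfile, use it as the starting point, and restart the training from there. For example:
Engine MLPotential
  Backend M3GNet
  MLDistanceUnit angstrom
  MLEnergyUnit eV
  Model Custom
  ParameterDir E:\WJC-ML\TIH2+O2\TIH2+O2.results\step4_attempt46_training\results\optimization\m3gnet\m3gnet
EndEngine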
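Finding the last training result by scrolling through a long logfile is error-prone, so the lookup can be scripted. The sketch below assumes the engine settings appear in the logfile as plain text containing "ParameterDir <path>", either on a line of its own or followed by "EndEngine" on the same line; the function name last_parameter_dir is hypothetical, and the attempt45 path in the sample is invented for illustration (the attempt46 path is the one from the example).

```python
import re

# Assumption: the path follows the keyword "ParameterDir" and runs to the end
# of the line, optionally followed by a trailing "EndEngine".
PARAM_DIR = re.compile(r"ParameterDir\s+(.+?)(?:\s+EndEngine)?\s*$")


def last_parameter_dir(log_text):
    """Return the path of the last ParameterDir entry in the log, or None."""
    last = None
    for line in log_text.splitlines():
        match = PARAM_DIR.search(line)
        if match:
            last = match.group(1)  # later entries overwrite earlier ones
    return last


# Illustrative log fragment with two engine blocks.
sample_log = r"""
Engine MLPotential
  Model Custom
  ParameterDir E:\WJC-ML\TIH2+O2\TIH2+O2.results\step4_attempt45_training\results\optimization\m3gnet\m3gnet
EndEngine
Engine MLPotential
  Model Custom
  ParameterDir E:\WJC-ML\TIH2+O2\TIH2+O2.results\step4_attempt46_training\results\optimization\m3gnet\m3gnet
EndEngine
"""

if __name__ == "__main__":
    print(last_parameter_dir(sample_log))
```

To use it on a real run, read the logfile with open(path, encoding="utf-8", errors="replace").read() and pass the text to last_parameter_dir; the returned directory is the one to use as ParameterDir when restarting.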
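In the example above, the training result is stored in E:\WJC-ML\TIH2+O2\TIH2+O2.results\step4_attempt46_training\results\optimization\m3gnet\m3gnet (this path is only an example; set it according to the actual location on your own machine). For how to continue training from an existing result, see Section 3 of the tutorial: AMS软件:从AIMD从头训练或微调机器学习势M3GNet、NequIP (AMS software: training from scratch or fine-tuning the machine learning potentials M3GNet and NequIP from AIMD).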