NEF (NMR Exchange Format [1]) files have a header (one per file) that records which programs wrote the file and its history. However, there are a few things about it that are not clear.
Here is an example header:
save_nef_nmr_meta_data
_nef_nmr_meta_data.sf_category nef_nmr_meta_data
_nef_nmr_meta_data.sf_framecode nef_nmr_meta_data
_nef_nmr_meta_data.format_name nmr_exchange_format
_nef_nmr_meta_data.format_version 1.1
_nef_nmr_meta_data.program_name NEFPipelines
_nef_nmr_meta_data.program_version 0.0.1
_nef_nmr_meta_data.creation_date 2021-06-19T17:36:39.073848
_nef_nmr_meta_data.uuid NEFPipelines-2021-06-19T17:36:39.073848-9006508160
loop_
_nef_run_history.run_number
_nef_run_history.program_name
_nef_run_history.program_version
_nef_run_history.script_name
1 NEFPipelines 0.0.1 header.py
stop_
save_
First, note the entries sf_category and sf_framecode: these are mandatory for the frame to be recognised.
The first tag whose format isn't clear is _nef_nmr_meta_data.creation_date. It appears to be an isoformat (ISO 8601) date-time, and the most reasonable reading is that it is a UTC [2] date-time, as there is no time zone information and UTC is unambiguous worldwide. A simple way to make this in Python is:
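from datetime import datetime, timezone
# datetime.now() alone gives local time; ask for UTC, then drop the offset to match the example
utc_date_time = datetime.now(timezone.utc).replace(tzinfo=None).isoformat()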
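Reading the value back is the mirror image of writing it. The following is a minimal sketch that assumes the UTC interpretation argued for above; the helper name `parse_creation_date` is made up for illustration and is not part of any NEF library.

```python
from datetime import datetime, timezone


def parse_creation_date(value: str) -> datetime:
    """Parse a creation_date string and label it as UTC (an assumption, see text)."""
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # no offset in the file, so we assume the writer meant UTC
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


created = parse_creation_date("2021-06-19T17:36:39.073848")
print(created.isoformat())  # 2021-06-19T17:36:39.073848+00:00
```

Attaching the time zone on reading means later comparisons with other time-zone-aware values do not silently mix local and UTC times.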
The second question is: what is the _nef_nmr_meta_data.uuid tag? This is a UUID [3] which uniquely identifies this version of the file apart from any other [4]. It has the form NEFPipelines-2021-06-19T17:36:39.073848-9006508160. The first part is obvious: it's our program's name, and the second part is the current time. But what is the third part, 9006508160? It's just a 10-digit random number that ensures the uuid is unique. Think of creating the file at the same time on multiple threads: without the random number they would all have the same Universally Unique Identifier!
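from datetime import datetime, timezone
from random import randint

# current time in UTC, without an offset, as in the example header
utc_date_time = datetime.now(timezone.utc).replace(tzinfo=None).isoformat()
# ten random decimal digits, so files written at the same instant still differ
random_value = ''.join(str(randint(0, 9)) for _ in range(10))
uuid = f'NEFPipelines-{utc_date_time}-{random_value}'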
Finally, there is the loop:
loop_
_nef_run_history.run_number
_nef_run_history.program_name
_nef_run_history.program_version
_nef_run_history.script_name
1 NEFPipelines 0.0.1 header.py
stop_
This just lists the programs that have edited the file, in order from latest to oldest.
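To make the structure of the loop concrete, here is a rough sketch that turns its text into a list of dictionaries, one per program. It assumes one row per line and no quoted values containing spaces, so it is not a general STAR parser; the function name `parse_run_history` is hypothetical.

```python
def parse_run_history(text):
    """Return one dict per row of a loop_ ... stop_ block (simplified parsing)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    body = lines[lines.index("loop_") + 1 : lines.index("stop_")]
    # column names are the part of each tag after the category prefix
    columns = [line.split(".", 1)[1] for line in body if line.startswith("_")]
    # every remaining line is a row of whitespace-separated values
    return [dict(zip(columns, line.split())) for line in body if not line.startswith("_")]


example = """
loop_
_nef_run_history.run_number
_nef_run_history.program_name
_nef_run_history.program_version
_nef_run_history.script_name
1 NEFPipelines 0.0.1 header.py
stop_
"""
history = parse_run_history(example)
print(history[0]["program_name"])  # NEFPipelines
```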
So a complete header would be:
data_new
save_nef_nmr_meta_data
_nef_nmr_meta_data.sf_category nef_nmr_meta_data
_nef_nmr_meta_data.sf_framecode nef_nmr_meta_data
_nef_nmr_meta_data.format_name nmr_exchange_format
_nef_nmr_meta_data.format_version 1.1
_nef_nmr_meta_data.program_name NEFPipelines
_nef_nmr_meta_data.program_version 0.0.1
_nef_nmr_meta_data.creation_date 2021-06-19T17:36:39.073848
_nef_nmr_meta_data.uuid NEFPipelines-2021-06-19T17:36:39.073848-9006508160
loop_
_nef_run_history.run_number
_nef_run_history.program_name
_nef_run_history.program_version
_nef_run_history.script_name
1 NEFPipelines 0.0.1 header.py
stop_
...
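Putting the pieces together, a header like the one above could be generated along these lines. This is a sketch under a few assumptions: the function name `build_header` and its parameters are invented for illustration, the frame is closed with save_ as in the first example, and values are assumed to contain no spaces (otherwise they would need STAR quoting).

```python
from datetime import datetime, timezone
from random import randint


def build_header(program_name, program_version, script_name, entry_name="new"):
    """Return the text of a data block opening plus a nef_nmr_meta_data frame."""
    # UTC time without an offset, matching the example header
    now = datetime.now(timezone.utc).replace(tzinfo=None).isoformat()
    random_value = "".join(str(randint(0, 9)) for _ in range(10))
    tags = {
        "sf_category": "nef_nmr_meta_data",
        "sf_framecode": "nef_nmr_meta_data",
        "format_name": "nmr_exchange_format",
        "format_version": "1.1",
        "program_name": program_name,
        "program_version": program_version,
        "creation_date": now,
        "uuid": f"{program_name}-{now}-{random_value}",
    }
    history_columns = ("run_number", "program_name", "program_version", "script_name")
    lines = [f"data_{entry_name}", "save_nef_nmr_meta_data"]
    lines += [f"_nef_nmr_meta_data.{tag} {value}" for tag, value in tags.items()]
    lines.append("loop_")
    lines += [f"_nef_run_history.{column}" for column in history_columns]
    # a brand-new file has a single history row for the program writing it
    lines += [f"1 {program_name} {program_version} {script_name}", "stop_", "save_"]
    return "\n".join(lines)


header_text = build_header("NEFPipelines", "0.0.1", "header.py")
print(header_text)
```

The creation date and the uuid are built from the same timestamp, so the two tags always agree with each other.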
Oh, and one last thing: this header should be renewed each time a program reads and modifies the file, with a new program name, date and uuid, plus an extra line in the _nef_run_history loop.
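The renewal step might look like the sketch below, which works on the header held as a plain dict of tag values and a list of history rows. Two things here are assumptions rather than anything confirmed by the format: that the newest program goes first with run_number 1 and older rows are renumbered (this follows the "latest to oldest" reading above), and the names `renew_header`, `OtherProgram` and `edit.py`, which are purely illustrative.

```python
from datetime import datetime, timezone
from random import randint


def renew_header(meta, history, program_name, program_version, script_name):
    """Return updated (meta, history) for a program that has modified the file."""
    now = datetime.now(timezone.utc).replace(tzinfo=None).isoformat()
    random_value = "".join(str(randint(0, 9)) for _ in range(10))
    # copy the old tags, replacing only those that describe the writing program
    new_meta = dict(
        meta,
        program_name=program_name,
        program_version=program_version,
        creation_date=now,
        uuid=f"{program_name}-{now}-{random_value}",
    )
    # assumption: newest entry first, renumbered from 1 (latest to oldest)
    rows = [(program_name, program_version, script_name)] + [row[1:] for row in history]
    new_history = [(number, *row) for number, row in enumerate(rows, start=1)]
    return new_meta, new_history


old_meta = {
    "format_name": "nmr_exchange_format",
    "format_version": "1.1",
    "program_name": "NEFPipelines",
    "program_version": "0.0.1",
    "creation_date": "2021-06-19T17:36:39.073848",
    "uuid": "NEFPipelines-2021-06-19T17:36:39.073848-9006508160",
}
old_history = [(1, "NEFPipelines", "0.0.1", "header.py")]
new_meta, new_history = renew_header(old_meta, old_history, "OtherProgram", "2.0", "edit.py")
print(new_history)
```

Returning new objects rather than editing in place keeps the original header available, which is handy if the write fails part-way.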
[1] NMR Exchange Format: a unified and open standard for the representation of NMR restraint data.
[2] Coordinated Universal Time: the world's primary time standard and the effective successor of Greenwich Mean Time (GMT).
[3] UUIDs are Universally Unique Identifiers.
[4] You could use a hash, but then you would need to know the hash of the file before the file is complete, and adding the hash to the file would change the value of the file's hash…