It seems I cannot find a way to read only the value length of some long tag in pydicom. I have tried the following, and while dcmread is very fast, dataset[tag] takes half a second to load 1.5 GB of data. However, I am not interested in reading these 1.5 GB of data - the only information I am looking for is where these 1.5 GB of data are located in the file (start and end offsets). How can I get this information?
# python bug.py && tuna bug.prof
import cProfile
import pydicom
def bug():
    path = "C:\\Data\\bug.ima"
    tag = pydicom.tag.Tag(0x7FE1, 0x1010)
    dataset = pydicom.dcmread(path, specific_tags=[tag], defer_size=0)
    element = dataset[tag]
    _a = element.file_tell
    _b = len(element.value)
cProfile.run("bug()", filename="bug.prof")
One workaround using a private API is this:
# python bug.py && tuna bug.prof
import cProfile
import pydicom
def bug():
    path = "C:\\Data\\bug.ima"
    tag = pydicom.tag.Tag(0x7FE1, 0x1010)
    dataset = pydicom.dcmread(path, specific_tags=[tag], defer_size=0)
    element = dataset._dict[tag]
    _a = element.value_tell
    _b = element.length
cProfile.run("bug()", filename="bug.prof")
Another workaround, using another API (at least one not marked as private), is this:
# python bug.py && tuna bug.prof
import cProfile
import pydicom
def bug():
    path = "C:\\Data\\bug.ima"
    tag = pydicom.tag.Tag(0x7FE1, 0x1010)
    dataset = pydicom.dcmread(path, specific_tags=[tag], defer_size=0)
    dataset = dataset.__array__().item()
    element = dataset[tag]
    _a = element.value_tell
    _b = element.length
cProfile.run("bug()", filename="bug.prof")
Is that possible using just the public API?
EDIT: I see now that the help notes deferred-read elements will be converted... so unfortunately get_item will not work as desired in this case. Probably using _dict is your best bet - I don't see that 'hidden' member ever changing, so it should be safe for the long term.
The API method you are looking for is Dataset.get_item. That will return the RawDataElement (not yet decoded DataElement), that you can use .value_tell and .length on, assuming the decoding has not already been triggered by some other access.
Another option which might offer a little speed improvement is to model the dicomfile context manager in pydicom.util.leanread, but add passing defer_size through to the data_element_generator, and of course filter the generated elements by tag. That module demonstrates a simpler read without handling so many special cases, and returns simple tuples for the data elements. If you try that way, however, be aware that the code has not been updated in quite some time so will not be as robust as mainline pydicom.
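To illustrate the idea behind a lean, tuple-style read, here is a minimal standard-library sketch that walks element headers and skips over values with seek() instead of reading them. It is not pydicom code and not the leanread implementation: the function name find_value_span is hypothetical, and it assumes Explicit VR Little Endian encoding, top-level elements only, and no undefined-length elements (it raises if it meets one).

```python
import io
import struct

# VRs whose explicit-VR header has 2 reserved bytes followed by a 4-byte length.
LONG_VRS = {b"OB", b"OD", b"OF", b"OL", b"OV", b"OW", b"SQ",
            b"SV", b"UC", b"UN", b"UR", b"UT", b"UV"}
UNDEFINED_LENGTH = 0xFFFFFFFF


def find_value_span(fp, wanted):
    """Return (value_offset, length) for the top-level tag `wanted`, or None.

    Hypothetical helper. `fp` is a binary file object positioned at the first
    data element (for a DICOM file, just after the 128-byte preamble and the
    b"DICM" marker). Assumes Explicit VR Little Endian throughout.
    """
    while True:
        header = fp.read(6)  # group (2 bytes), element (2 bytes), VR (2 bytes)
        if len(header) < 6:
            return None  # end of stream: tag not found
        group, elem, vr = struct.unpack("<HH2s", header)
        if vr in LONG_VRS:
            fp.read(2)  # skip the reserved bytes
            (length,) = struct.unpack("<L", fp.read(4))
        else:
            (length,) = struct.unpack("<H", fp.read(2))
        if length == UNDEFINED_LENGTH:
            raise NotImplementedError("undefined-length elements are not handled")
        value_offset = fp.tell()
        if (group, elem) == wanted:
            return value_offset, length
        fp.seek(length, io.SEEK_CUR)  # skip the value without reading it


# Synthetic two-element stream: a short CS element, then an 8-byte OB element.
first = struct.pack("<HH2sH", 0x0008, 0x0060, b"CS", 2) + b"MR"
second = struct.pack("<HH2sHL", 0x7FE1, 0x1010, b"OB", 0, 8) + bytes(8)
print(find_value_span(io.BytesIO(first + second), (0x7FE1, 0x1010)))  # (22, 8)
```

The end offset is value_offset + length. For a real file you would open it in binary mode and seek past the preamble and marker first; files using Implicit VR or containing undefined-length sequences need the extra handling that mainline pydicom provides.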