20 September 2014

Using the Debug Option in readDICOMFile()

Some recent activity on Stack Overflow was brought to my attention.  Specifically, several questions were raised about reading files into R using readDICOMFile() in the oro.dicom package. The questions highlighted some inadequacies in the code, and I would like to thank the person who brought these issues to the surface.  Some of the errors occurred because the files are not valid DICOM files; they were created in the early 1990s, around the same time that the DICOM Standard was established.

The result of these questions is twofold:
  1. I have modified some of the code to overcome the deficiencies that were highlighted.  These modifications will be available in the next release of oro.dicom (0.4.2).
  2. I would like to raise the profile of a useful option in readDICOMFile(), the debug = TRUE input parameter.  
Let's take a closer look at "debugging" the header information in DICOM files.   The file of interest, OT-MONO2-8-hip.dcm, is available for download.  Note that you will have to uncompress this file before reading it.  It is not necessary to rename it with a ".dcm" extension, but it looks nicer.

> library("oro.dicom")
> dcm <- readDICOMFile("OT-MONO2-8-hip.dcm")
Error in readDICOMFile("OT-MONO2-8-hip.dcm") : DICM != DICM
> dcm <- readDICOMFile("OT-MONO2-8-hip.dcm", debug=TRUE)
# First 128 bytes of DICOM header =
[1] 08 00 00 00 04 00 00 00 b0 00 00 00 08 00 08 00 2e 00 00 00 4f 52 49 47 49
[26] 4e 41 4c 5c 53 45 43 4f 4e 44 41 52 59 5c 4f 54 48 45 52 5c 41 52 43 5c 44
[51] 49 43 4f 4d 5c 56 41 4c 49 44 41 54 49 4f 4e 20 08 00 16 00 1a 00 00 00 31
[76] 2e 32 2e 38 34 30 2e 31 30 30 30 38 2e 35 2e 31 2e 34 2e 31 2e 31 2e 37 00
[101] 08 00 18 00 1a 00 00 00 31 2e 33 2e 34 36 2e 36 37 30 35 38 39 2e 31 37 2e
[126] 31 2e 37
Error in readDICOMFile("OT-MONO2-8-hip.dcm", debug = TRUE) : DICM != DICM
> dcm <- readDICOMFile("OT-MONO2-8-hip.dcm", skipFirst128=FALSE, DICM=FALSE, debug=TRUE)
# 0008 0000 GroupLength UL UL 4 176
# 0008 0008 ImageType CS CS 46 ORIGINAL SECONDARY OTHER ARC DICOM VALIDATION
# 0008 0016 SOPClassUID UI UI 26 1.2.840.10008.5.1.4.1.1.7
# 0008 0018 SOPInstanceUID UI UI 26 1.3.46.670589.17.1.7.0.23
# 0008 0060 Modality CS CS 2 OT
# 0008 0064 ConversionType CS CS 4 WSD
# 0008 0070 Manufacturer LO LO 24 Philips Medical Systems
# 0010 0000 GroupLength UL UL 4 18
# 0010 0010 PatientsName PN PN 10 Anonymized
# 0020 0000 GroupLength UL UL 4 92
# 0020 000D StudyInstanceUID UI UI 28 1.3.46.670589.17.1.7.1.1.23
# 0020 000E SeriesInstanceUID UI UI 28 1.3.46.670589.17.1.7.2.1.23
# 0020 0012 AcquisitionNumber IS IS 2 1
# 0020 0013 InstanceNumber IS IS 2 1
# 0028 0000 GroupLength UL UL 4 90
# 0028 0002 SamplesperPixel US US 2 1
# 0028 0004 PhotometricInterpretation CS CS 12 MONOCHROME2
# 0028 0010 Rows US US 2 512
# 0028 0011 Columns US US 2 512
# 0028 0100 BitsAllocated US US 2 8
# 0028 0101 BitsStored US US 2 8
# 0028 0102 HighBit US US 2 7
# 0028 0103 PixelRepresentation US US 2 0
# 7FE0 0000 Unknown UN UN 4
# 7FE0 0010 PixelData OB OB 262144 PixelData
##### Reading PixelData (7FE0,0010) #####
> image(t(dcm$img), col=grey(0:64/64), axes=FALSE, xlab="", ylab="")

The first attempt at reading this file (line #2) fails.  The error message is informative: bytes 129-132 do not contain the characters DICM, which the DICOM Standard requires.  So it's safe to assume that this file is not a valid DICOM file.  Delving further, we turn on the debugging option (line #4) and are able to see the first 128 bytes, which are skipped by default in accordance with the DICOM Standard.  They obviously contain information.  By setting skipFirst128=FALSE and DICM=FALSE (line #13) we can override the default settings and start reading information from the very first byte.  This does the trick, and with the debugging option turned on every field from the header is displayed.  No errors have occurred, so we can display the image data from this file (line #40) below.
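
The check that fails above can also be done by hand before calling readDICOMFile().  What follows is a minimal sketch in base R, not code from oro.dicom: the helper names has_dicm_prefix() and dicom_read_args() are hypothetical, and the argument names skipFirst128 and DICM are simply the ones used in this post, so they may not exist in other versions of the package.

```r
# Hypothetical helpers for illustration; they are NOT part of oro.dicom.

# Return TRUE when bytes 129-132 of the file spell "DICM", i.e. when the
# file has the 128-byte preamble followed by the DICM prefix.
has_dicm_prefix <- function(path) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  bytes <- readBin(con, what = "raw", n = 132)
  # Files shorter than 132 bytes cannot contain the prefix.
  length(bytes) == 132 && identical(bytes[129:132], charToRaw("DICM"))
}

# Pick the two readDICOMFile() arguments discussed in this post.
# Assumption: the installed oro.dicom accepts skipFirst128 and DICM.
dicom_read_args <- function(path) {
  ok <- has_dicm_prefix(path)
  list(skipFirst128 = ok, DICM = ok)
}
```

For a file like the one above, dicom_read_args() should return skipFirst128 = FALSE and DICM = FALSE, which could then be passed along with do.call(readDICOMFile, c(list("OT-MONO2-8-hip.dcm"), dicom_read_args("OT-MONO2-8-hip.dcm"))).  Comparing raw bytes, rather than converting them to a character string first, avoids trouble when the header contains embedded zero bytes.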