A scan is not a photograph. The scanner head is like a comb of light-emitting fingers that slowly strokes its subject in order to see it. The final image denies this temporality in its conventional single and simultaneous presentation. This probe attempts to reintroduce the temporality of the scanner.
On the command line, a scan begins, first in black and white (bitmap) at low resolution. This takes about 14 seconds. Next comes a scan in color at twice the (linear) resolution, meaning 4 times as many pixels, each with 24 times as many bits. This scan takes 24 seconds, nearly twice as long.
scanimage \
--device-name="hpaio:/usb/Officejet_6200_series?serial=CN5AIEG26N0453" \
--format=pnm \
--mode Lineart \
--resolution 75 \
> scan.B75.pnm
scanimage \
--device-name="hpaio:/usb/Officejet_6200_series?serial=CN5AIEG26N0453" \
--format=pnm \
--mode Color \
--resolution 150 \
> scan.RGB150.pnm
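The difference in size between the two scans follows from the ratios given above. The sketch below only restates that arithmetic; the function name `scan_ratios` is made up for illustration, and the timings are the ones quoted in the text, not new measurements.

```python
def scan_ratios(dpi_a, dpi_b, bits_a, bits_b):
    """Compare scan B against scan A.

    Returns (pixel_ratio, bit_ratio, data_ratio), where data_ratio is how
    many times more image data scan B carries for the same page area.
    """
    pixel_ratio = (dpi_b / dpi_a) ** 2   # resolution is linear, area is squared
    bit_ratio = bits_b / bits_a          # bits stored per pixel
    return pixel_ratio, bit_ratio, pixel_ratio * bit_ratio


# Lineart at 75 dpi (1 bit per pixel) against Color at 150 dpi (24 bits per pixel)
pixels, bits, data = scan_ratios(75, 150, 1, 24)
time_ratio = 24 / 14                     # seconds, as quoted in the text
print(pixels, bits, data, round(time_ratio, 2))
```

With the figures from the text, the color scan carries 96 times as much data while taking only about 1.7 times as long.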
Reading the resulting data from the scan in order to interpret it (as pixels) is an act of reconstruction. The act, typically called parsing the data, has a lot to do with managing the modality of the data, first separating the data's headers from its main body, then aligning the stream of bytes to the structure determined by the format. This process, called framing, highlights the constant struggle in working with the digital between the (inherent) arbitrariness of the data stream and the task of aligning those bits to a conventional meaning. In this case the data conforms to the "Netpbm" file format. The challenge in the code is now where to cut the data to isolate individual pixels, or how to notice the (typically silent) edge from one row of pixels to the next. The typical glitches of misreading a format such as this one are rolling, shifting rows or misregistered colors.
The horizontal motion in the above animation is due to misalignment.
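The two steps described above, separating header from body and then cutting the body into rows, can be sketched in a few lines. This is a minimal illustration for binary Netpbm data only, not the interpreter used for the images here; the names `split_header` and `cut_rows` and the tiny 4x3 test image are invented for the example. Cutting the same bytes with the wrong row length shows the drift that produces rolling rows.

```python
def split_header(raw):
    """Split binary Netpbm bytes into (magic, fields, body).

    fields is [width, height] for P4 and [width, height, maxval] for P5/P6.
    Header comments starting with '#' are skipped.
    """
    magic = raw[:2]
    if magic not in (b"P4", b"P5", b"P6"):
        raise ValueError("not a binary Netpbm file")
    wanted = 2 if magic == b"P4" else 3
    fields, pos = [], 2
    while len(fields) < wanted:
        if pos >= len(raw):
            raise ValueError("truncated header")
        ch = raw[pos:pos + 1]
        if ch.isspace():
            pos += 1
        elif ch == b"#":
            newline = raw.find(b"\n", pos)
            if newline < 0:
                raise ValueError("unterminated comment")
            pos = newline + 1
        else:
            end = pos
            while end < len(raw) and not raw[end:end + 1].isspace():
                end += 1
            fields.append(int(raw[pos:end].decode("ascii")))
            pos = end
    # exactly one whitespace byte separates the header from the pixel data
    return magic, fields, raw[pos + 1:]


def cut_rows(body, row_bytes):
    """Cut the body into complete rows of row_bytes each (framing)."""
    last_start = len(body) - row_bytes
    return [body[i:i + row_bytes] for i in range(0, last_start + 1, row_bytes)]


# A made-up 4x3 graymap with one white column at x = 1
raw = b"P5\n# example\n4 3\n255\n" + b"\x00\xff\x00\x00" * 3
magic, (width, height, maxval), body = split_header(raw)

aligned = cut_rows(body, width)         # correct row length
shifted = cut_rows(body, width + 1)     # deliberately wrong row length
print([row.index(b"\xff") for row in aligned])
print([row.index(b"\xff") for row in shifted])
```

In the aligned cut the white column stays at the same position in every row; in the shifted cut its position changes from row to row, which is the horizontal drift seen when a frame is not aligned to the row length.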
In writing an interpreter of the scan data, differences of interpretation between different systems often arise. In this case, the interpretation of "raw data" by the ".fromstring" function of the Python Imaging Library (renamed ".frombytes" in current Pillow releases) differs from the "raw data" of the PBM file: the result is an inversion, turning the black print on a white page into white pixels on a black background.
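The inversion comes from two opposite conventions for the same bit: in a PBM file a set bit means black, while the imaging library's 1-bit mode reads a set bit as white. As a minimal sketch, assuming one wanted to undo the inversion rather than keep it, every packed byte can be flipped before it is handed to the library. The function name `invert_bitmap` is hypothetical. Pillow's raw decoder also documents an inverted raw mode, "1;I", which should have the same effect.

```python
def invert_bitmap(packed):
    """Flip every bit of packed 1-bit pixel data.

    Padding bits at the end of each row are flipped as well, which is
    harmless because they lie outside the image width.
    """
    return bytes(b ^ 0xFF for b in packed)


# One row of 8 pixels: left half set, right half clear
row = b"\xf0"
print(invert_bitmap(row).hex())
```

Applying the function twice returns the original data, so the choice between the two conventions stays reversible.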
The color scan is non-inverted:
Another source, a passage of text, pp. 204-205:
Another source, a spread of the Hannah Ryggen tapestry, Horror. From the Civil War in Spain pp. 158-159, as a threshold (lineart) image:
the full scan:
In color, the darkness at the binding (the result of the book not being pressed flat onto the scan bed) becomes dramatically evident as the scan passes through the area in shadow.
the full scan:
A scan of a book inside the book (pp. 22-23):
the full scans:
Frans Masereel, 25 Images of a Man's Passion (pp. 84-85):
the full scans:
from argparse import ArgumentParser
import math
import sys

from PIL import Image

MODE_BITMAP = 1
MODE_GRAYMAP = 2
MODE_PIXMAP = 3


class PNMException(Exception):
    pass


class PNM(object):
    # http://en.wikipedia.org/wiki/Netpbm_format

    def __init__(self, strict=True):
        self.in_data = False
        self.curline = ''
        self.binary = True
        self.mode = None
        self.bpp = None
        self.width = None
        self.height = None
        self.max_pixel = None
        self.strict = strict
        self.header = []
        self.framenum = 0
        self.prevdata = b''  # bytes left over from an incomplete row

    def process_header_line(self):
        line = self.curline
        linenum = len(self.header) + 1
        if linenum == 1:
            if not line.startswith("P") and self.strict:
                raise PNMException("Does not look like PNM, bad start char")
            c = line[1:2]
            if c == "1" or c == "4":
                self.mode = MODE_BITMAP
                self.bpp = 1
                self.binary = c == "4"
            elif c == "2" or c == "5":
                self.mode = MODE_GRAYMAP
                self.bpp = 8
                self.binary = c == "5"
            elif c == "3" or c == "6":
                self.mode = MODE_PIXMAP
                self.bpp = 24
                self.binary = c == "6"
            elif self.strict:
                raise PNMException("Does not look like PNM, bad mode char {0}".format(c))
        elif linenum == 2:
            # this parser assumes the second header line is a comment and skips it
            pass
        elif linenum == 3:
            # should rethrow any exception here
            self.width, self.height = [int(x) for x in line.split()]
            if self.mode == MODE_BITMAP:
                # bitmaps have no max pixel value, so the header ends here
                self.end_of_header()
        elif linenum == 4:
            # should rethrow any exception here
            self.max_pixel = int(line)
            self.end_of_header()
        self.header.append(self.curline)
        self.curline = ''

    def end_of_header(self):
        print(self.unparse())
        self.in_data = True

    def data(self, d):
        """ Main parse function, pass in image data (bytes) in chunks """
        start_index = 0 if self.in_data else None
        if not self.in_data:
            # still reading the header: collect characters up to each newline
            for i in range(len(d)):
                if d[i] != 10:
                    self.curline += chr(d[i])
                else:
                    self.process_header_line()
                    if self.in_data:
                        # pixel data starts right after this newline
                        start_index = i + 1
                        break
        if start_index is not None and start_index < len(d):
            self.write_frame(d[start_index:])

    def get_mode(self):
        if self.mode == MODE_BITMAP:
            return 'bitmap'
        elif self.mode == MODE_GRAYMAP:
            return 'graymap'
        else:
            return 'pixmap'

    def get_pil_mode(self):
        if self.mode == MODE_BITMAP:
            return '1'
        elif self.mode == MODE_GRAYMAP:
            return 'L'
        else:
            return 'RGB'

    def unparse(self):
        mode_desc = self.get_mode()
        if self.mode == MODE_BITMAP:
            return "<PBM {0} {1}x{2} {3}bpp>".format(mode_desc, self.width, self.height, self.bpp)
        elif self.mode == MODE_GRAYMAP:
            return "<PGM {0}/{3} {1}x{2} {4}bpp>".format(mode_desc, self.width, self.height, self.max_pixel, self.bpp)
        else:
            return "<PPM {0}/{3} {1}x{2} {4}bpp>".format(mode_desc, self.width, self.height, self.max_pixel, self.bpp)

    def row_bytes(self):
        # bytes per row; bitmap rows are padded up to a whole byte
        return (self.width * self.bpp + 7) // 8

    def frame_size(self):
        return self.row_bytes() * self.height

    def write_frame_no_alignment(self, data):
        # write each chunk as its own frame, padding the last row with zeros
        rowbytes = self.row_bytes()
        frame_height = int(math.ceil(len(data) / float(rowbytes)))
        pad_length = (frame_height * rowbytes) - len(data)
        if pad_length:
            data += b'\x00' * pad_length
        m = self.get_pil_mode()
        im = Image.frombytes(m, (self.width, frame_height), data)
        im.save("frame{0:04d}.png".format(self.framenum))
        self.framenum += 1

    def write_frame(self, data):
        # quantize to whole rows, store leftover data
        data = self.prevdata + data
        rowbytes = self.row_bytes()
        whole_rows = len(data) // rowbytes
        cut = whole_rows * rowbytes
        self.prevdata = data[cut:]
        data = data[:cut]
        if whole_rows == 0:
            # not enough data for a single row yet
            return
        m = self.get_pil_mode()
        im = Image.frombytes(m, (self.width, whole_rows), data)
        im.save("frame{0:04d}.png".format(self.framenum))
        self.framenum += 1


parser = ArgumentParser(description="pbm interpreter")
parser.add_argument("--size", type=int, default=1, help="buffer / frame size in kb")
args = parser.parse_args()

pnm = PNM()
while True:
    # read raw bytes, not text, from standard input
    data = sys.stdin.buffer.read(1024 * args.size)
    if not data:
        break
    pnm.data(data)
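The write_frame method in the listing above keeps each frame aligned by carrying the bytes of an unfinished row over to the next chunk. The same idea can be sketched on its own, without the imaging library, as a generator. The name `whole_rows` and the sample chunks are invented for this illustration.

```python
def whole_rows(chunks, row_bytes):
    """Regroup arbitrary chunks of bytes into blocks of complete rows.

    Bytes that do not fill a row are held back and prepended to the next
    chunk. A partial row left at the very end is dropped.
    """
    leftover = b""
    for chunk in chunks:
        buf = leftover + chunk
        cut = (len(buf) // row_bytes) * row_bytes  # largest multiple of a row
        leftover = buf[cut:]
        if cut:
            yield buf[:cut]


# Chunk boundaries fall at arbitrary places relative to the 4-byte rows
chunks = [b"abcde", b"fg", b"hijkl"]
for block in whole_rows(chunks, 4):
    print(block)
```

Every block that comes out is a whole number of rows long, so each frame starts at the left edge of a row no matter where the read buffer happened to end.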