Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others



Open only part of an image (JPEG/TIFF etc.) in R

I am analysing very large images in R, on the order of tens of thousands of pixels square. Unfortunately, even with 64 GB of RAM, these images sometimes fail to fit into memory, and when they do fit I can only open one at a time, which rules out parallelisation.

My current strategy is to load them using the jpeg or tiff packages, e.g.:

library(jpeg)
image <- readJPEG('image.jpg')

However, as I am only performing simple mathematical manipulations (summing, thresholding etc.) that could be performed piece by piece, is it possible to open only part of an image at a time by specifying the region to load? If so, I could write a loop to open 1024 x 1024 pixel tiles. The jpeg and tiff packages do not offer an option to do this.



1 Reply


If you are working with very large images, libvips is probably your best bet. You can shell out to it from R using system().
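If explicit tiles are still wanted, as the question suggests, one possible approach is to generate one crop command per tile and run each of them from R with system(). The sketch below only prints the commands (a dry run), so it needs nothing installed. It assumes the vips 8 command form "vips extract_area in out left top width height"; the file name big.tif, the tile_X_Y.tif naming and the helper tile_commands are hypothetical.

```shell
#!/bin/sh
# Dry run: print one "vips extract_area" command per tile of an image.
# tile_commands is a hypothetical helper; big.tif and tile_X_Y.tif are placeholder names.
# Usage: tile_commands FILE IMAGE_WIDTH IMAGE_HEIGHT TILE_SIZE
tile_commands() (
    file=$1; width=$2; height=$3; tile=$4
    top=0
    while [ "$top" -lt "$height" ]; do
        # Clip the last row of tiles to the image height.
        h=$tile
        if [ $((top + h)) -gt "$height" ]; then h=$((height - top)); fi
        left=0
        while [ "$left" -lt "$width" ]; do
            # Clip the last column of tiles to the image width.
            w=$tile
            if [ $((left + w)) -gt "$width" ]; then w=$((width - left)); fi
            echo "vips extract_area $file tile_${left}_${top}.tif $left $top $w $h"
            left=$((left + tile))
        done
        top=$((top + tile))
    done
)

# Show the first three commands for a 10000 x 10000 image cut into 1024 x 1024 tiles.
tile_commands big.tif 10000 10000 1024 | head -n 3
```

For a 10,000 x 10,000 image this yields a 10 x 10 grid, with the last row and column clipped to 784 pixels. Each printed line could be passed to system() in R and the resulting small tile read with readTIFF(), though whether that beats letting vips process the whole file depends on the workload.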

Your question is not very specific, but let's use ImageMagick to make a 10,000 x 10,000 pixel TIFF containing a black-to-white gradient:

convert -size 10000x10000 gradient: -depth 8 a.tif

Now threshold that at 50% (128 for 8-bit data) with vips and check the memory required:

vips im_thresh a.tif b.tif 128 --vips-leak
memory: high-water mark 292.21 MB

Pretty frugal, no? By comparison, the equivalent ImageMagick command requires about 1.6 GB of RAM (measured here with /usr/bin/time -l, as available on macOS/BSD):

/usr/bin/time -l convert a.tif -threshold 50% b.tif

Sample Output

...
1603895296  maximum resident set size
...

How about adding 64 to every pixel? That can be done with im_gadd, whose usage is:

usage: vips im_gadd a in1 b in2 c out
where:
    a is of type "double"
    in1 is of type "image"
    b is of type "double"
    in2 is of type "image"
    c is of type "double"
    out is of type "image"
calculate a*in1 + b*in2 + c = outfile

So we use:

vips im_gadd 1 a.tif 0 b.tif 64 c.tif --vips-leak
memory: high-water mark 584.41 MB
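To make the arithmetic concrete: with a=1, b=0 and c=64 the formula reduces to in1 + 64 for every pixel, and the second image contributes nothing. The sketch below applies the same formula to a few single pixel values in plain shell arithmetic; gadd_pixel is a hypothetical helper for illustration, not part of vips.

```shell
#!/bin/sh
# gadd_pixel is a hypothetical helper that mirrors the formula in the usage text:
# out = a*in1 + b*in2 + c, for one pair of pixel values (integers only here).
# Usage: gadd_pixel A IN1 B IN2 C
gadd_pixel() (
    a=$1; in1=$2; b=$3; in2=$4; c=$5
    echo $((a * in1 + b * in2 + c))
)

# a=1, b=0, c=64; the in2 value does not matter because b=0.
for p in 0 128 255; do
    echo "$p -> $(gadd_pixel 1 "$p" 0 0 64)"
done
```

This prints 0 -> 64, 128 -> 192 and 255 -> 319. The last value lies above the 8-bit range, which is consistent with the maximum of 319 that im_stats reports for c.tif.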

Need to do some statistics?

vips im_stats c.tif
band    minimum     maximum         sum       sum^2        mean   deviation
all          64         319   1.915e+10 4.20922e+12       191.5     73.6206
 1           64         319   1.915e+10 4.20922e+12       191.5     73.6206
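These figures can be sanity-checked by hand. The minimum and maximum are the original 0 and 255 shifted by 64, and the mean is 127.5 + 64 = 191.5. The sum and sum^2 columns follow from the pixel count, mean and deviation; the sketch below recomputes them with awk. It assumes n = 10000 x 10000 pixels and uses the large-n approximation sum^2 = n * (mean^2 + deviation^2); check_stats is a hypothetical helper.

```shell
#!/bin/sh
# check_stats is a hypothetical helper: given pixel count, mean and deviation,
# it prints the implied sum and sum of squares.
# Identities used: sum = n * mean, sum^2 ~= n * (mean^2 + deviation^2) for large n.
# Usage: check_stats N MEAN DEVIATION
check_stats() {
    awk -v n="$1" -v mean="$2" -v dev="$3" 'BEGIN {
        printf "sum=%.4g sum2=%.6g\n", n * mean, n * (mean * mean + dev * dev)
    }'
}

# 10000 x 10000 pixels, mean and deviation taken from the im_stats output.
check_stats 100000000 191.5 73.6206
```

The result, sum=1.915e+10 and sum2=4.20922e+12, agrees with the table.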
