Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


How to reduce size of Lambda Layers/workaround the size limit/reduce size of R packages?

I'm running R on AWS Lambda. I've been making a layer for the packages that are used by the R script. Below are the packages:

install.packages(c("magrittr", "lubridate", "dplyr", "readr", "tidyr",
                   "aws.s3", "readxl", "here", "stringr"), 
                 lib = '/opt/R/new_library/R/library/')

However, the resulting folder exceeds 250 MB, which is the limit for Lambda.

Therefore, I'm wondering what my options are:

  • From an R perspective, is there any way of reducing the size of the installed packages? For example, BH and stringi (which must be dependencies) are over 150 MB in size
  • From an AWS perspective, is there any workaround for this Lambda size limit? Can I use EFS to help here?

Many thanks

question from:https://stackoverflow.com/questions/65944160/how-to-reduce-size-of-lambda-layers-workaround-the-size-limit-reduce-size-of-r-p


1 Reply


It's complicated, because packages are generally not written with an installation-size constraint in mind (though see, for example, what we said about the tinyverse, and some thoughts about dependencies in general).

Consider data.table, which has no dependencies, whereas dplyr has many. I am not saying you must rewrite your script, but it is one possible approach.

I am also the maintainer of BH and yes, it has a big footprint. Worse, it is not needed at run-time (!!), so you could (very drastically) rm -rf its headers. You would no longer be able to compile against it, but the packages already built would still run.
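A minimal sketch of that header removal, for illustration only. The function name strip_bh_headers is hypothetical, and it assumes the headers sit under BH/include inside the library directory, which is worth verifying in your own installation before deleting anything.

```shell
#!/bin/sh
# Sketch: remove the Boost headers shipped by an installed BH package.
# Assumption: the headers live under <library>/BH/include; check your
# own library layout first, because the deletion cannot be undone.

strip_bh_headers() {
    lib_dir=$1
    headers="$lib_dir/BH/include"

    if [ ! -d "$headers" ]; then
        echo "no BH headers found under $lib_dir" >&2
        return 1
    fi

    # Report the library size (in KB) before and after the removal.
    before=$(du -sk "$lib_dir" | cut -f1)
    rm -rf "$headers"
    after=$(du -sk "$lib_dir" | cut -f1)
    echo "library size: ${before}K -> ${after}K"
}

# Example call, using the library path from the question:
# strip_bh_headers /opt/R/new_library/R/library
```

Only the header directory is removed; the rest of the BH folder (DESCRIPTION and so on) stays, so R still sees the package as installed.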

Lastly, and that is the main tip: Consider binaries. You have both RSPM and BSPM and I have written a few blog posts and videos on the topic -- scroll down to see a few. I am a big fan and big user of the Rutter PPAs which give you the above (maybe with the aws.s3 package) as binaries ready to install -- and BSPM makes it easy from install.packages().

Edit: So, for argument's sake, I fired up rocker/r-bspm:20.04 and most packages arrived as binaries:

root@b52f4e461ebd:/# du -csh /usr/lib/R/site-library/*
136K    /usr/lib/R/site-library/askpass
140K    /usr/lib/R/site-library/assertthat
147M    /usr/lib/R/site-library/BH
212K    /usr/lib/R/site-library/bspm
212K    /usr/lib/R/site-library/cellranger
580K    /usr/lib/R/site-library/cli
172K    /usr/lib/R/site-library/clipr
840K    /usr/lib/R/site-library/cpp11
272K    /usr/lib/R/site-library/crayon
320K    /usr/lib/R/site-library/curl
516K    /usr/lib/R/site-library/digest
2.0M    /usr/lib/R/site-library/dplyr
136K    /usr/lib/R/site-library/ellipsis
368K    /usr/lib/R/site-library/fansi
160K    /usr/lib/R/site-library/generics
312K    /usr/lib/R/site-library/glue
296K    /usr/lib/R/site-library/here
224K    /usr/lib/R/site-library/hms
824K    /usr/lib/R/site-library/httr
692K    /usr/lib/R/site-library/jsonlite
292K    /usr/lib/R/site-library/lifecycle
2.2M    /usr/lib/R/site-library/littler
1.7M    /usr/lib/R/site-library/lubridate
476K    /usr/lib/R/site-library/magrittr
140K    /usr/lib/R/site-library/mime
3.2M    /usr/lib/R/site-library/openssl
308K    /usr/lib/R/site-library/pillar
116K    /usr/lib/R/site-library/pkgconfig
120K    /usr/lib/R/site-library/prettyunits
208K    /usr/lib/R/site-library/progress
584K    /usr/lib/R/site-library/purrr
188K    /usr/lib/R/site-library/R6
10M     /usr/lib/R/site-library/Rcpp
1.5M    /usr/lib/R/site-library/readr
1.7M    /usr/lib/R/site-library/readxl
108K    /usr/lib/R/site-library/rematch
636K    /usr/lib/R/site-library/remotes
1.4M    /usr/lib/R/site-library/rlang
300K    /usr/lib/R/site-library/rprojroot
1.3M    /usr/lib/R/site-library/stringi
456K    /usr/lib/R/site-library/stringr
156K    /usr/lib/R/site-library/sys
1.7M    /usr/lib/R/site-library/tibble
1.5M    /usr/lib/R/site-library/tidyr
416K    /usr/lib/R/site-library/tidyselect
524K    /usr/lib/R/site-library/utf8
1.8M    /usr/lib/R/site-library/vctrs
472K    /usr/lib/R/site-library/xml2
188M    total
root@b52f4e461ebd:/# 

So with the suggested 'post-installation surgery' on BH (which, again, is a build-time and not a run-time dependency, but R does not differentiate) you can get that down to less than 50 MB.
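The arithmetic behind that: 188M in total minus 147M for BH leaves roughly 41M. A small sketch for checking a library directory against a size limit follows. The function name fits_size_limit is hypothetical, and the 250 MB default is simply the Lambda limit quoted in the question.

```shell
#!/bin/sh
# Sketch: check whether a directory fits under a size limit given in MB.
# Assumption: the 250 MB default is the limit stated in the question;
# pass a different value as the second argument if yours differs.

fits_size_limit() {
    dir=$1
    limit_mb=${2:-250}

    # du -sk prints the total size in kilobytes, then a tab, then the path.
    size_kb=$(du -sk "$dir" | cut -f1)
    limit_kb=$((limit_mb * 1024))

    echo "$dir: $((size_kb / 1024)) MB used, limit ${limit_mb} MB"

    # The exit status tells the caller whether the directory fits.
    [ "$size_kb" -le "$limit_kb" ]
}

# Example call, using the binary library path from the listing above:
# fits_size_limit /usr/lib/R/site-library 250 && echo "fits"
```

Because the result is returned as an exit status, the check can be chained in a build script with && or used in an if statement.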

It is complemented by what was installed from source:

root@b52f4e461ebd:/# du -csh /usr/local/lib/R/site-library/*
144K    /usr/local/lib/R/site-library/askpass
312K    /usr/local/lib/R/site-library/aws.s3
168K    /usr/local/lib/R/site-library/aws.signature
132K    /usr/local/lib/R/site-library/base64enc
2.1M    /usr/local/lib/R/site-library/curl
388K    /usr/local/lib/R/site-library/docopt
832K    /usr/local/lib/R/site-library/httr
696K    /usr/local/lib/R/site-library/jsonlite
140K    /usr/local/lib/R/site-library/mime
3.2M    /usr/local/lib/R/site-library/openssl
160K    /usr/local/lib/R/site-library/sys
472K    /usr/local/lib/R/site-library/xml2
8.6M    total
root@b52f4e461ebd:/# 

(and several of these exist as binaries, so I'd have to look more closely at why BSPM did not pick the binaries).

