I want to scrape all the shapefiles from the following website: https://www.sciencebase.gov/catalog/items?q=&filter0=browseCategory%3DData&community=California+Condor&filter1=browseType%3DMap+Service&filter2=browseType%3DOGC+WMS+Layer&filter3=browseType%3DDownloadable&filter4=facets.facetName%3DShapefile&&filter5=browseType%3DShapefile
I used the following script:
install.packages('rvest')
library(rvest)
library(tidyverse)
## CONSTANT ----
URL <- "https://www.sciencebase.gov/catalog/item/54471eb5e4b0f888a81b82ca"
dir_out <- "~/condor"
## MAIN ----
# Get the webpage content
webpage <- read_html(URL)
# Extract the nodes that carry the download links
data <- html_nodes(webpage, ".sb-file-get sb-download-link")
# Grab the base URLs to download all the referenced data
url_base <- html_attr(data,"href")
# Filter the zip files
shapefile_base <- grep("*.zip",url_base, value=TRUE)
# Fix the double `//`
shapefile_fixed <- gsub("//", "/", shapefile_base)
# Add the URL prefix
shapefile_full <- paste0("https://www.sciencebase.gov/",shapefile_fixed)
# Create the output directory
dir.create(dir_out, showWarnings = FALSE)
# Create a list of filenames
filenames_full <- file.path(dir_out,basename(shapefile_full))
# Download the files
lapply(shapefile_full, FUN=function(x) download.file(x, file.path(dir_out,basename(x))))
# Unzip the files
unzip(filenames_full, overwrite = TRUE)
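To sanity-check the scraping steps, the intermediate objects the script builds can be printed as follows (a minimal check using only the names defined above, nothing new):
# How many download-link nodes did the selector match?
length(data)
# What do the raw hrefs look like?
head(url_base)
# The final URLs that will be passed to download.file()
shapefile_full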
However, at the download step I get the following output, and the subsequent unzip then fails:
> lapply(shapefile_full, FUN=function(x) download.file(x, file.path(dir_out,basename(x))))
trying URL 'https://www.sciencebase.gov/'
Content type 'text/html' length unknown
downloaded 111 bytes
[[1]]
[1] 0
> # Unzip the files
> unzip(filenames_full, overwrite = TRUE)
Warning message:
In unzip(filenames_full, overwrite = TRUE) :
error 1 in extracting from zip file
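Since the console output reports only 111 bytes of content type 'text/html', a minimal check of what was actually written to disk (reusing filenames_full from the script; readLines is only there to peek at a small file) should show whether these are real zip archives or just an HTML page:
# Sizes of whatever download.file() saved
file.info(filenames_full)$size
# Peek at the first file: a genuine zip archive is binary and starts with "PK",
# while an HTML response will be readable markup
readLines(filenames_full[1], n = 3, warn = FALSE)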