Data in XML format are rarely organized in a way that would allow the xmlToDataFrame
function to work. You're better off extracting everything in lists and then binding the lists together in a data frame:
require(XML)
data <- xmlParse("http://forecast.weather.gov/MapClick.php?lat=29.803&lon=-82.411&FcstType=digitalDWML")
xml_data <- xmlToList(data)
In the case of your example data, getting location and start time is fairly straightforward:
location <- as.list(xml_data[["data"]][["location"]][["point"]])
start_time <- unlist(xml_data[["data"]][["time-layout"]][
names(xml_data[["data"]][["time-layout"]]) == "start-valid-time"])
Temperature data is a bit more complicated. First you need to get to the node that contains the temperature lists. Then you need extract both the lists, look within each one, and pick the one that has "hourly" as one of its values. Then you need to select only that list but only keep the values that have the "value" label:
temps <- xml_data[["data"]][["parameters"]]
temps <- temps[names(temps) == "temperature"]
temps <- temps[sapply(temps, function(x) any(unlist(x) == "hourly"))]
temps <- unlist(temps[[1]][sapply(temps, names) == "value"])
out <- data.frame(
as.list(location),
"start_valid_time" = start_time,
"hourly_temperature" = temps)
head(out)
latitude longitude start_valid_time hourly_temperature
1 29.81 -82.42 2013-06-19T16:00:00-04:00 91
2 29.81 -82.42 2013-06-19T17:00:00-04:00 90
3 29.81 -82.42 2013-06-19T18:00:00-04:00 89
4 29.81 -82.42 2013-06-19T19:00:00-04:00 85
5 29.81 -82.42 2013-06-19T20:00:00-04:00 83
6 29.81 -82.42 2013-06-19T21:00:00-04:00 80
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…