Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


sql - Interactive Plots in R

Using the plotly library, I made the following plot in R:

library(dplyr)
library(ggplot2)
library(plotly)

set.seed(123)
df <- data.frame(var1 = rnorm(1000,10,10),
                   var2 = rnorm(1000,5,5))

df <- df %>% mutate(var3 = ifelse(var1 <= 5 & var2 <= 5, "a", ifelse(var1 <= 10 & var2 <= 10, "b", "c"))) 


plot = df %>%
  ggplot() + geom_point(aes(x=var1, y= var2, color= var3))


ggplotly(plot)


This is a simple scatter plot - two random variables are generated, and then the colors of the points are decided by some criteria (e.g. if var1 and var2 are between certain ranges).

From here, I could also compute summary statistics:

df$var3 = as.factor(df$var3)
summary = df %>%
    group_by(var3) %>%
    summarize(Mean_var1 = mean(var1), Mean_var2 = mean(var2), count=n())

# A tibble: 3 x 4
  var3  Mean_var1 Mean_var2 count
* <fct>     <dbl>     <dbl> <int>
1 a         -1.70     0.946   158
2 b          4.68     4.94    260
3 c         15.8      6.49    582

My question: is it possible to add some buttons to this plot that would let the user color the points based on custom choices? For example, the user could type in any range they want for var1 and var2; the colors of the points would then change, and summary statistics would be generated for the new grouping.

Can someone please show me how to do this in R?

I had this idea - first I would create this massive table that would create all possible range combinations of "var1" and "var2":

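# candidate cutoffs from -20 to 40 in steps of 1
vec1 <- seq(-20, 40, by = 1)
vec2 <- seq(-20, 40, by = 1)

# every (vec1, vec2) combination
a <- expand.grid(vec1, vec2)

for (i in vec1) {
  for (j in vec2) {
    # i is the cutoff for group "a", j the cutoff for group "b";
    # each pass overwrites var3, so only the last (i, j) pair survives
    df <- df %>% mutate(var3 = ifelse(var1 <= i & var2 <= i, "a", ifelse(var1 <= j & var2 <= j, "b", "c")))
  }
}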

Then, depending on which ranges the user wants, an SQL-style statement would isolate the rows of this massive table that correspond to those ranges:

custom_df = df[df$var1 > -20 & df$var1 < 10 & df$var2 > -20 & df$var2 < 10, ]

Then, an individual graph would be made for "custom_df" and summary statistics would also be recorded for "custom_df":

summary = custom_df %>%
    group_by(var3) %>%
    summarize(Mean_var1 = mean(var1), Mean_var2 = mean(var2), count=n())

But I am not sure how to neatly and efficiently do this in R.


Can someone please show me how to do this?

Thanks


1 Reply


I have built a small Shiny app that covers most of your requirements. Starting from your pre-defined large data frame df, the user can do the following:

  1. Choose the minimum and maximum value for variables var1 and var2.
  2. Choose criteria to define the variable var3, which is used to display different colors of data points. This is a range now.
  3. Save plot as a HTML file.
  4. Summary stats displayed as a table.

You can add further inputs to let the user choose the colors themselves, and so on. For that, look up how to use scale_color_manual() from ggplot2.
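As a rough sketch of the scale_color_manual() idea, the snippet below maps each var3 group to a fixed color. The name group_colors and the specific colors are assumptions for illustration only; in the app they could instead come from user inputs.

```r
library(ggplot2)

set.seed(123)
df <- data.frame(var1 = rnorm(1000, 10, 10),
                 var2 = rnorm(1000, 15, 15))

# Same style of grouping rule as in the question
df$var3 <- ifelse(df$var1 <= 5 & df$var2 <= 5, "a",
                  ifelse(df$var1 <= 10 & df$var2 <= 10, "b", "c"))

# Hypothetical mapping from group label to color (not part of the app below)
group_colors <- c(a = "red", b = "green", c = "grey50")

p <- ggplot(df) +
  geom_point(aes(x = var1, y = var2, color = var3)) +
  scale_color_manual(values = group_colors)
```

Passing p to ggplotly() should keep the manual colors, since plotly converts the finished ggplot object.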

Update: Added user option to choose red and green color based on var1 and var2 range values.

library(shiny)
library(shinyWidgets)  # provides downloadBttn()
library(ggplot2)
library(plotly)
library(dplyr)
library(DT)

### define a large df
set.seed(123)
df <- data.frame(var1 = rnorm(1000,10,10),
                 var2 = rnorm(1000,15,15))

ui <- fluidPage(
  titlePanel(p("My First Test App", style = "color:red")),
  sidebarLayout(
    sidebarPanel(
      p("Choose Variable limits"),

      # Horizontal line ----
      tags$hr(),
      uiOutput("var1a"), uiOutput("var1b"),
      uiOutput("var2a"), uiOutput("var2b"),
      uiOutput("criteria")

    ),
    mainPanel(
      DTOutput("summary"), br(),
      plotlyOutput("plot"),
      br(), br(), br(),
      uiOutput("saveplotbtn")
    )
  )
)

server <- function(input, output, session){
  
  output$var1a <- renderUI({
    tagList(
      numericInput("var11", "Variable 1 min",
                  min = min(df$var1), max = max(df$var1), value = min(df$var1))
    )
  })
  output$var1b <- renderUI({
    if (is.null(input$var11)){
      low1 <- min(df$var1)
    }else low1 <- max(min(df$var1),input$var11)  ## cannot be lower than var 1 minimum
    tagList(
      numericInput("var12", "Variable 1 max", min = low1, max = max(df$var1), value = max(df$var1))
    )
  })
  
  output$var2a <- renderUI({
    tagList(
      numericInput("var21", "Variable 2 min",
                   min = min(df$var2), max = max(df$var2), value = min(df$var2))
    )
  })
  output$var2b <- renderUI({
    if (is.null(input$var21)){
      low2 <- min(df$var2)
    }else low2 <- max(min(df$var2),input$var21)  ## cannot be lower than var 2 minimum
    tagList(
      numericInput("var22", "Variable 2 max", min = low2, max = max(df$var2), value = max(df$var2))
    )
  })
  
  output$criteria <- renderUI({
    req(input$var11,input$var12,input$var21,input$var22)
        
    tagList(
      sliderInput("crit11", "Variable 1 red color range:",
                  min = -10, max = 0, value = c(-10,0)),
      sliderInput("crit12", "Variable 2 red color range:",
                  min = -25, max = 0, value = c(-25,0)),
      sliderInput("crit21", "Variable 1 green color range:",
                  min = 0.1, max = 10, value = c(0.1,10)),
      sliderInput("crit22", "Variable 2 green color range:",
                  min = 0.1, max = 20, value = c(0.1,20))
    )

  })
  
  dat <- reactive({
    req(input$crit11,input$crit12,input$crit21,input$crit22)
    
    df <- df %>% filter(between(var1, input$var11, input$var12)) %>% 
                 filter(between(var2, input$var21, input$var22))
    
    # df1 <- df %>% mutate(var3 = ifelse(var1 <= i & var2 <= i, "a", ifelse(var1 <= j & var2 <= j , "b", "c")))
    
    df1 <- df %>% mutate(var3 = ifelse(between(var1, input$crit11[1], input$crit11[2]) & between(var2, input$crit12[1], input$crit12[2]), "a",
                                       ifelse(between(var1, input$crit21[1], input$crit21[2]) & between(var2, input$crit22[1], input$crit22[2]), "b", "c")))
    
  })
  
  summari <- reactive({
    req(dat())
    df1 <- dat()
    df1$var3 = as.factor(df1$var3)
    summary = df1 %>%
      group_by(var3) %>%
      dplyr::summarize(Mean_var1 = mean(var1), Mean_var2 = mean(var2), count=n())
  })
  
  output$summary <- renderDT(summari())
  
  rv <- reactiveValues()
  
  observe({
    req(dat())
    p <- ggplot(data=dat()) + geom_point(aes(x=var1, y= var2, color= var3))
    pp <- ggplotly(p)
    rv$plot <- pp
  })
  
  output$plot <- renderPlotly({
    rv$plot
  })
  
  output$saveplotbtn <-  renderUI({
    div(style="display: block; padding: 5px 350px 5px 50px;",
        downloadBttn("saveHTML",
                     HTML("HTML"),
                     style = "fill",
                     color = "default",
                     size = "lg",
                     block = TRUE,
                     no_outline = TRUE
        ) )
  })
  
  output$saveHTML <- downloadHandler(
    filename = function() {
      paste("myplot", Sys.Date(), ".html", sep = "")
    },
    content = function(file) {
      htmlwidgets::saveWidget(as_widget(rv$plot), file, selfcontained = TRUE)  ## self-contained
    }
  )

}

shinyApp(ui, server)
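If you want to check the coloring rule outside of Shiny, the labelling and summary steps can be pulled into plain functions. This is only a sketch: classify_points() and summarise_groups() are hypothetical helper names that are not in the app above, and the range arguments stand in for the slider values (crit11, crit12, crit21, crit22).

```r
library(dplyr)

# Hypothetical helper: label points "a" (red range), "b" (green range), else "c".
# Each range argument is a length-2 vector c(lower, upper), like a sliderInput value.
classify_points <- function(data, red_v1, red_v2, green_v1, green_v2) {
  data %>%
    mutate(var3 = case_when(
      between(var1, red_v1[1], red_v1[2]) &
        between(var2, red_v2[1], red_v2[2]) ~ "a",
      between(var1, green_v1[1], green_v1[2]) &
        between(var2, green_v2[1], green_v2[2]) ~ "b",
      TRUE ~ "c"
    ))
}

# Hypothetical helper: the same per-group summary the app shows in its table
summarise_groups <- function(data) {
  data %>%
    group_by(var3) %>%
    summarize(Mean_var1 = mean(var1), Mean_var2 = mean(var2), count = n())
}

set.seed(123)
df <- data.frame(var1 = rnorm(1000, 10, 10),
                 var2 = rnorm(1000, 15, 15))

# Ranges below are the slider defaults used in the app
labelled <- classify_points(df, c(-10, 0), c(-25, 0), c(0.1, 10), c(0.1, 20))
group_summary <- summarise_groups(labelled)
```

Keeping the rule in a function like this means the same code can feed both the reactive in the app and a quick check at the console.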
