Categorize data as one of the options given

R/llm-classify.R

llm_classify

Description

Use a Large Language Model (LLM) to classify the provided text as one of the options given in the labels argument.

Usage

 
llm_classify( 
  .data, 
  col, 
  labels, 
  pred_name = ".classify", 
  additional_prompt = "" 
) 
 
llm_vec_classify(x, labels, additional_prompt = "", preview = FALSE) 

Arguments

Argument           Description
.data              A data.frame or tbl object that contains the text to be analyzed
col                The name of the field to analyze, supports tidy-eval
labels             A character vector with at least 2 labels to classify the text as. Labels can also be two-sided formulas, such as "appliance" ~ 1, to return a custom value for each label
pred_name          A character vector with the name of the new column where the prediction will be placed
additional_prompt  Inserts this text into the prompt sent to the LLM
x                  A vector that contains the text to be analyzed
preview            If TRUE, returns the R call that would have been used to run the prediction, for the first record in x only. Defaults to FALSE. Applies to the vector function only.
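
The Examples section passes labels as two-sided formulas such as "appliance" ~ 1 to get a custom value back for each label. The sketch below shows, in base R, one way such label/value pairs could be split apart and applied to model answers. split_labels and recode_answers are hypothetical helpers written for illustration; they are not mall functions and are not taken from mall's source.

```r
# Hypothetical helper (not a mall function): separate the label text that
# would go into the prompt from the value to return for each label.
split_labels <- function(labels) {
  # A single formula is not a list, so wrap it for uniform handling
  if (inherits(labels, "formula")) labels <- list(labels)
  if (is.character(labels)) {
    # Plain character labels map to themselves
    return(list(prompt = labels, values = labels))
  }
  # For a two-sided formula, element 2 is the left side, element 3 the right
  prompt <- vapply(labels, function(f) as.character(f[[2]]), character(1))
  values <- unlist(lapply(labels, function(f) eval(f[[3]])))
  list(prompt = prompt, values = values)
}

# Hypothetical helper: replace each raw answer with its custom value
recode_answers <- function(answers, labels) {
  parts <- split_labels(labels)
  parts$values[match(answers, parts$prompt)]
}

recode_answers(
  c("appliance", "computer", "appliance"),
  c("appliance" ~ 1, "computer" ~ 2)
)
#> [1] 1 2 1
```

Because the right-hand sides here are numbers, the recoded result is numeric, which is consistent with the <dbl> column shown in the custom-values example.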

Value

llm_classify returns a data.frame or tbl object. llm_vec_classify returns a vector that is the same length as x.

Examples

 
 
library(mall) 
 
data("reviews") 
 
llm_use("ollama", "llama3.2", seed = 100, .silent = TRUE) 
 
llm_classify(reviews, review, c("appliance", "computer")) 
#> # A tibble: 3 × 2
#>   review                                        .classify
#>   <chr>                                         <chr>    
#> 1 This has been the best TV I've ever used. Gr… computer 
#> 2 I regret buying this laptop. It is too slow … computer 
#> 3 Not sure how to feel about my new washing ma… appliance
 
# Use 'pred_name' to customize the new column's name 
llm_classify( 
  reviews, 
  review, 
  c("appliance", "computer"), 
  pred_name = "prod_type" 
) 
#> # A tibble: 3 × 2
#>   review                                        prod_type
#>   <chr>                                         <chr>    
#> 1 This has been the best TV I've ever used. Gr… computer 
#> 2 I regret buying this laptop. It is too slow … computer 
#> 3 Not sure how to feel about my new washing ma… appliance
 
# Pass custom values for each classification 
llm_classify(reviews, review, c("appliance" ~ 1, "computer" ~ 2)) 
#> # A tibble: 3 × 2
#>   review                                                               .classify
#>   <chr>                                                                    <dbl>
#> 1 This has been the best TV I've ever used. Great screen, and sound.           1
#> 2 I regret buying this laptop. It is too slow and the keyboard is too…         2
#> 3 Not sure how to feel about my new washing machine. Great color, but…         1
 
# For character vectors, instead of a data frame, use this function 
llm_vec_classify( 
  c("this is important!", "just whenever"), 
  c("urgent", "not urgent") 
) 
#> [1] "urgent" "urgent"
 
# To preview the first call that will be made to the downstream R function 
llm_vec_classify( 
  c("this is important!", "just whenever"), 
  c("urgent", "not urgent"), 
  preview = TRUE 
) 
#> ollamar::chat(messages = list(list(role = "user", content = "You are a helpful classification engine. Determine if the text refers to one of the following: urgent, not urgent. No capitalization. No explanations.  The answer is based on the following text:\nthis is important!")), 
#>     output = "text", model = "llama3.2", seed = 100)
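
The preview output above shows the full prompt text for the first record. As a rough illustration of how the labels and the text appear to be combined, the sketch below rebuilds that same string in base R. build_classify_prompt is a hypothetical helper, not a mall function. Placing additional_prompt in the double-space gap after "No explanations." is an assumption inferred from the preview output, not something confirmed from mall's source.

```r
# Hypothetical helper (not a mall function): rebuild the prompt text seen
# in the preview output from its parts.
build_classify_prompt <- function(x, labels, additional_prompt = "") {
  paste(
    "You are a helpful classification engine.",
    "Determine if the text refers to one of the following:",
    # Labels are joined with commas and closed with a period
    paste0(paste(labels, collapse = ", "), "."),
    "No capitalization. No explanations.",
    # Assumption: the extra prompt text is inserted at this position
    additional_prompt,
    paste0("The answer is based on the following text:\n", x)
  )
}

prompt <- build_classify_prompt(
  "this is important!",
  c("urgent", "not urgent")
)
cat(prompt)
```

With the default empty additional_prompt, the result has two spaces after "No explanations.", matching the content string in the preview output.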