Summarize text

R/llm-summarize.R

llm_summarize

Description

Use a Large Language Model (LLM) to summarize text

Usage


llm_summarize(
  .data,
  col,
  max_words = 10,
  pred_name = ".summary",
  additional_prompt = ""
)

llm_vec_summarize(x, max_words = 10, additional_prompt = "", preview = FALSE)

Arguments

Arguments Description
.data A data.frame or tbl object that contains the text to be analyzed
col The name of the field to analyze, supports tidy-eval
max_words The maximum number of words that the LLM should use in the summary. Defaults to 10.
pred_name A character vector with the name of the new column where the prediction will be placed
additional_prompt Inserts this text into the prompt sent to the LLM
x A vector that contains the text to be analyzed
preview It returns the R call that would have been used to run the prediction. It only returns the first record in x. Defaults to FALSE Applies to vector function only.

Value

llm_summarize returns a data.frame or tbl object. llm_vec_summarize returns a vector that is the same length as x.

Examples



library(mall)

data("reviews")

llm_use("ollama", "llama3.2", seed = 100, .silent = TRUE)

# Use max_words to set the maximum number of words to use for the summary
llm_summarize(reviews, review, max_words = 5)
#> # A tibble: 3 × 2
#>   review                                        .summary                      
#>   <chr>                                         <chr>                         
#> 1 This has been the best TV I've ever used. Gr… it's a great tv               
#> 2 I regret buying this laptop. It is too slow … laptop purchase was a mistake 
#> 3 Not sure how to feel about my new washing ma… having mixed feelings about it

# Use 'pred_name' to customize the new column's name
llm_summarize(reviews, review, 5, pred_name = "review_summary")
#> # A tibble: 3 × 2
#>   review                                        review_summary                
#>   <chr>                                         <chr>                         
#> 1 This has been the best TV I've ever used. Gr… it's a great tv               
#> 2 I regret buying this laptop. It is too slow … laptop purchase was a mistake 
#> 3 Not sure how to feel about my new washing ma… having mixed feelings about it

# For character vectors, instead of a data frame, use this function
llm_vec_summarize(
  "This has been the best TV I've ever used. Great screen, and sound.",
  max_words = 5
)
#> [1] "it's a great tv"

# To preview the first call that will be made to the downstream R function
llm_vec_summarize(
  "This has been the best TV I've ever used. Great screen, and sound.",
  max_words = 5,
  preview = TRUE
)
#> ollamar::chat(messages = list(list(role = "user", content = "You are a helpful summarization engine. Your answer will contain no no capitalization and no explanations. Return no more than 5 words.  The answer is the summary of the following text:\nThis has been the best TV I've ever used. Great screen, and sound.")), 
#>     output = "text", model = "llama3.2", seed = 100)