library(mall)
data("reviews")
llm_use("ollama", "llama3.2", seed = 100, .silent = TRUE)
# Use 'labels' to let the function know what to extract
llm_extract(reviews, review, labels = "product")
#> # A tibble: 3 × 2
#> review .extract
#> <chr> <chr>
#> 1 This has been the best TV I've ever used. Gr… tv
#> 2 I regret buying this laptop. It is too slow … laptop
#> 3 Not sure how to feel about my new washing ma… washing machine
# Use 'pred_name' to customize the new column's name
llm_extract(reviews, review, "product", pred_name = "prod")
#> # A tibble: 3 × 2
#> review prod
#> <chr> <chr>
#> 1 This has been the best TV I've ever used. Gr… tv
#> 2 I regret buying this laptop. It is too slow … laptop
#> 3 Not sure how to feel about my new washing ma… washing machine
# Pass a vector to request multiple things; the results will be pipe-delimited
# in a single column
llm_extract(reviews, review, c("product", "feelings"))
#> # A tibble: 3 × 2
#> review .extract
#> <chr> <chr>
#> 1 This has been the best TV I've ever used. Gr… tv | great
#> 2 I regret buying this laptop. It is too slow … laptop|frustration
#> 3 Not sure how to feel about my new washing ma… washing machine | confusion
# To get multiple columns, use 'expand_cols'
llm_extract(reviews, review, c("product", "feelings"), expand_cols = TRUE)
#> # A tibble: 3 × 3
#> review product feelings
#> <chr> <chr> <chr>
#> 1 This has been the best TV I've ever used. Gr… "tv " " great"
#> 2 I regret buying this laptop. It is too slow … "laptop" "frustration"
#> 3 Not sure how to feel about my new washing ma… "washing machine " " confusion"
# Pass a named vector to set the resulting column names
llm_extract(
.data = reviews,
col = review,
labels = c(prod = "product", feels = "feelings"),
expand_cols = TRUE
)
#> # A tibble: 3 × 3
#> review prod feels
#> <chr> <chr> <chr>
#> 1 This has been the best TV I've ever used. Gr… "tv " " great"
#> 2 I regret buying this laptop. It is too slow … "laptop" "frustration"
#> 3 Not sure how to feel about my new washing ma… "washing machine " " confusion"
# For character vectors instead of data frames, use llm_vec_extract()
llm_vec_extract("bob smith, 123 3rd street", c("name", "address"))
#> [1] "bob smith | 123 3rd street"
# Set 'preview = TRUE' to see the first call that will be made to the downstream R function
llm_vec_extract(
"bob smith, 123 3rd street",
c("name", "address"),
preview = TRUE
)
#> ollamar::chat(messages = list(list(role = "user", content = "You are a helpful text extraction engine. Extract the name, address being referred to on the text. I expect 2 items exactly. No capitalization. No explanations. Return the response exclusively in a pipe separated list, and no headers. The answer is based on the following text:\nbob smith, 123 3rd street")),
#> output = "text", model = "llama3.2", seed = 100)
Extract entities from text
llm_extract
Description
Use a Large Language Model (LLM) to extract a specific entity, or several entities, from the provided text.
Usage
llm_extract(
.data,
col,
labels,
expand_cols = FALSE,
additional_prompt = "",
pred_name = ".extract"
)
llm_vec_extract(x, labels = c(), additional_prompt = "", preview = FALSE)
Arguments
Arguments | Description |
---|---|
.data | A data.frame or tbl object that contains the text to be analyzed |
col | The name of the field to analyze, supports tidy-eval |
labels | A vector with the entities to extract from the text |
expand_cols | If multiple labels are passed, this flag tells the function to create a new column per item in labels. If labels is a named vector, the function uses those names as the new column names; if not, it uses a sanitized version of the content as the name. |
additional_prompt | Inserts this text into the prompt sent to the LLM |
pred_name | A character vector with the name of the new column where the prediction will be placed |
x | A vector that contains the text to be analyzed |
preview | If TRUE, returns the R call that would have been used to run the prediction, for the first record in x only. Defaults to FALSE. Applies to the vector function only. |
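The `preview` example shows the prompt text that reaches the model. As a rough illustration of how `labels` feeds into that text, the base R sketch below rebuilds the same string. `build_extract_prompt()` is a hypothetical helper written for this page, not part of mall. Its wording is copied from that single preview output, so it only covers the multi-label case shown there and ignores `additional_prompt`. The package itself may assemble the prompt differently.

```r
# Hypothetical helper: rebuilds the prompt text seen in the `preview` example.
# Not part of mall; the wording is copied from that one preview output.
build_extract_prompt <- function(x, labels) {
  paste0(
    "You are a helpful text extraction engine. ",
    "Extract the ", paste(labels, collapse = ", "),
    " being referred to on the text. ",
    "I expect ", length(labels), " items exactly. ",
    "No capitalization. No explanations. ",
    "Return the response exclusively in a pipe separated list, and no headers. ",
    "The answer is based on the following text:\n",
    x
  )
}

prompt <- build_extract_prompt("bob smith, 123 3rd street", c("name", "address"))
cat(prompt)
```

Comparing the printed text with the `content` field in the preview output is a quick way to see which parts of the prompt vary with `labels` and which are fixed.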
Value
llm_extract returns a data.frame or tbl object. llm_vec_extract returns a vector that is the same length as x.
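When several labels are requested, each element of the result is a single pipe-delimited string, and the example output shows inconsistent padding around the pipe (`"tv | great"` versus `"laptop|frustration"`). Some post-processing may therefore be useful. The sketch below uses base R only and works on strings copied from the examples rather than on a live model call. It assumes every string contains exactly two items.

```r
# Sample values copied from the multi-label example output above
extracted <- c("tv | great", "laptop|frustration", "washing machine | confusion")

# Split on the pipe; fixed = TRUE treats "|" literally, not as regex alternation
parts <- strsplit(extracted, "|", fixed = TRUE)

# Trim the padding the model sometimes leaves around each item
parts <- lapply(parts, trimws)

# Build a data frame; assumes each string produced exactly two items
# (a missing second item would come through as NA)
result <- data.frame(
  product  = vapply(parts, `[`, character(1), 1),
  feelings = vapply(parts, `[`, character(1), 2),
  stringsAsFactors = FALSE
)
print(result)
```

The same trimming step could be applied to the columns created by `expand_cols = TRUE`, which carry the same stray whitespace in the examples.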