Extract Numbers from Character String Vector in R

Last Updated : 19 Apr, 2025

This article shows how to extract numbers from a character string vector in the R programming language using built-in functions. Two approaches are covered:

  • Using the gsub() function
  • Using the gregexpr() and regmatches() functions

Method 1: Using gsub() function

The gsub() function in R replaces every match of a pattern in a string. It can also be used to extract a number: write a pattern that matches the whole string with the number inside a capture group, then replace the match with the contents of that group.

Syntax:

gsub(pattern, replacement, x, ignore.case = FALSE, perl = FALSE, fixed = FALSE, useBytes = FALSE)

Parameters:

  • pattern: string to be matched, supports regular expression
  • replacement: string for replacement
  • x: string or string vector
  • Additional parameters: ignore.case, perl, fixed, and useBytes are used to control how the pattern matching is done.
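As a small illustration of how these additional parameters change matching, the sketch below runs gsub() on a made-up string (the value "v1.5" is only an assumption for the demo, not part of the article's data):

```r
x <- "v1.5"  # hypothetical sample string

# Default (regular expression): "." matches any character,
# so every character is replaced and nothing is left
regex_result <- gsub(".", "", x)

# fixed = TRUE: "." is treated as a literal dot
fixed_result <- gsub(".", "", x, fixed = TRUE)

# ignore.case = TRUE: the pattern "V" also matches lowercase "v"
case_result <- gsub("V", "", x, ignore.case = TRUE)

print(regex_result)
print(fixed_result)
print(case_result)
```

When the text to match contains characters that are special in regular expressions, fixed = TRUE is usually the simpler choice.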

Steps:

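  1. Define a pattern that matches the whole string and captures the number in a group. A simple pattern for this is ".*?([0-9]+).*": the lazy .*? skips as little as possible, ([0-9]+) captures the first run of digits, and .* consumes the rest of the string.
  2. Use gsub() to replace the whole match with the captured group, referenced as "\\1".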

Example:

R
# Sample data
gfg <- c("7g8ee6ks1", "5f9o1r0", "geeks10")
print(gfg)

# Extract the first number in each string using gsub()
res <- as.numeric(gsub(".*?([0-9]+).*", "\\1", gfg))
print(res)

Output:

[1] "7g8ee6ks1" "5f9o1r0"   "geeks10"  
[1]  7  5 10

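Explanation: The pattern matches each string in full, and gsub() replaces it with the first run of digits it captured. as.numeric() then converts the results to a numeric vector. Here it returns 7, 5, and 10 from the strings "7g8ee6ks1", "5f9o1r0", and "geeks10". Only the first number in each string is kept. A string that contains no digits does not match the pattern, so gsub() returns it unchanged and as.numeric() yields NA with a warning.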
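If some strings may contain no digits at all, one possible way to avoid the coercion warning is to check for digits first. This is a minimal sketch that reuses the pattern from the example above; the input vector and the names has_digit and first_number are illustrative assumptions:

```r
# Hypothetical input: the last element contains no digits
gfg <- c("7g8ee6ks1", "geeks10", "nodigits")

# TRUE for strings that contain at least one digit
has_digit <- grepl("[0-9]", gfg)

# Start with NA everywhere, then fill in only the strings that have digits
first_number <- rep(NA_real_, length(gfg))
first_number[has_digit] <- as.numeric(
  gsub(".*?([0-9]+).*", "\\1", gfg[has_digit])
)

print(first_number)
```

The result has one element per input string, with NA marking the strings where no number was found.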

Method 2: Using gregexpr() and regmatches() functions

In this method, we use gregexpr() to identify all the positions of the numbers in the strings, and regmatches() to extract those numbers. This approach is useful when you want to extract multiple numbers from a single string.

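gregexpr() function: The gregexpr() function searches each string for a pattern and returns a list with one element per string. Each element holds the starting positions of all matches, with the match lengths stored in its "match.length" attribute. If a string has no match, its element is -1.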
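To make the shape of the gregexpr() result concrete, here is a short sketch on a made-up string (the value "ab12cd345" is an assumption chosen for illustration):

```r
# Hypothetical input with two runs of digits
m <- gregexpr("[0-9]+", "ab12cd345")

# Starting positions of the matches (as.integer() drops the attributes)
starts <- as.integer(m[[1]])

# Lengths of the matches, stored as an attribute
lengths <- as.integer(attr(m[[1]], "match.length"))

# A string without any match gives -1
no_match <- as.integer(gregexpr("[0-9]+", "none")[[1]])

print(starts)   # 3 7
print(lengths)  # 2 3
print(no_match) # -1
```

These positions and lengths are the match data that regmatches() uses to pull out the substrings.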

Syntax:

gregexpr(pattern, text, ignore.case = FALSE, perl = FALSE, fixed = FALSE, useBytes = FALSE)

Parameters:

  • pattern: the regular expression to match
  • text: a character vector in which to search for matches

regmatches() function: This function is used to extract or replace matched sub-strings from match data.

Syntax:

regmatches(x, m, invert = FALSE)

Parameters:

  • x: a character vector
  • m: an object with match data
  • invert: a logical, if TRUE, extract or replace the non-matched substrings.
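The effect of the invert argument can be seen in a small sketch. The string "a1b22c" is an assumed example, not taken from the article's data:

```r
x <- "a1b22c"  # hypothetical sample string
m <- gregexpr("[0-9]+", x)

# Default (invert = FALSE): the matched substrings, i.e. the digit runs
digits <- regmatches(x, m)[[1]]

# invert = TRUE: the pieces of the string between the matches
non_digits <- regmatches(x, m, invert = TRUE)[[1]]

print(digits)      # "1" "22"
print(non_digits)  # "a" "b" "c"
```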

Example:

R
# Sample data
gfg <- c("7g8ee6ks1", "5f9o1r0", "geeks10")

# Extract all numbers using gregexpr() and regmatches()
gfg_numbers <- regmatches(gfg, gregexpr("[[:digit:]]+", gfg))

# Flatten the list and convert the extracted numbers to numeric
as.numeric(unlist(gfg_numbers))

Output:

[1]  7  8  6  1  5  9  1  0 10
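Note that unlist() flattens the result, so it is no longer clear which number came from which string. If that grouping matters, one option is to convert each list element separately. The following is a sketch of that idea; the names per_string and first_per_string are illustrative assumptions:

```r
gfg <- c("7g8ee6ks1", "5f9o1r0", "geeks10")
matches <- regmatches(gfg, gregexpr("[[:digit:]]+", gfg))

# Keep one numeric vector per input string
per_string <- lapply(matches, as.numeric)
names(per_string) <- gfg

# First number of each string, or NA if the string has no digits
first_per_string <- vapply(
  per_string,
  function(v) if (length(v) > 0) v[1] else NA_real_,
  numeric(1)
)

print(per_string)
print(first_per_string)
```

Here per_string holds 7, 8, 6, 1 for the first string, 5, 9, 1, 0 for the second, and 10 for the third, while first_per_string gives 7, 5, and 10, the same values as Method 1.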


