In today’s blog, we’ll delve into two fundamental R concepts: vectors and sorting. Both are crucial when dealing with data, and a good understanding of these will lay a solid foundation for more complex data operations.
R Vectors
A vector in R is a basic data structure that stores an ordered collection of elements of the same type. A vector can hold numeric values, character values, or logical values (TRUE or FALSE). Let’s learn how to create, access, and manipulate vectors.
Creating Numeric and Character Vectors
Creating vectors is straightforward. Use the c() function, which stands for “combine”.
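As a minimal sketch of c() in use, the example below builds one numeric and one character vector. The names codes and country and their values are illustrative assumptions, not taken from a particular data set.

```r
# Numeric vector: three illustrative values (assumed for this example)
codes <- c(380, 124, 818)

# Character vector: quotes mark the entries as strings, not variable names
country <- c("italy", "canada", "egypt")

class(codes)    # "numeric"
class(country)  # "character"
length(codes)   # 3
```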
Naming the Elements of a Vector
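Each element of a vector can be given a name, which makes the values easier to read and lets you look them up by name. The sketch below shows two common ways to do this; the vector codes and its values are illustrative assumptions.

```r
# Option 1: assign names to an existing vector with names()
codes <- c(380, 124, 818)
names(codes) <- c("italy", "canada", "egypt")
codes

# Option 2: supply the names directly inside c()
codes2 <- c(italy = 380, canada = 124, egypt = 818)

# Both approaches produce the same named vector
identical(codes, codes2)  # TRUE
```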
Generating Numeric Sequences
To generate a sequence of numbers, use the seq() function or the : operator.
# Using seq(): from 1 to 10 in steps of 2
sequence <- seq(from = 1, to = 10, by = 2)
sequence  # 1 3 5 7 9
# Using : for consecutive integers
sequence <- 1:10
sequence  # 1 2 3 4 5 6 7 8 9 10
Accessing Specific Elements or Parts of a Vector
To access specific elements in a vector, use square brackets [] and specify the index.
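A short sketch of the most common indexing patterns follows. The named vector codes is an illustrative assumption. Note that R indexing starts at 1, not 0.

```r
codes <- c(italy = 380, canada = 124, egypt = 818)  # illustrative values

codes[2]         # the second element
codes[c(1, 3)]   # the first and third elements
codes[1:2]       # a range: elements 1 through 2
codes["canada"]  # lookup by name
codes[-1]        # a negative index drops that element
```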
Coercing Data into Different Types
Data coercion is the process of converting data from one type to another. Below are some common R functions used for coercion:
- as.numeric(): Converts to numeric type.
char_vector <- c("1", "2", "3")
num_vector <- as.numeric(char_vector)
print(num_vector)
# verify the data type
class(num_vector)
- as.character(): Converts to character type.
num_vector <- c(1, 2, 3)
char_vector <- as.character(num_vector)
print(char_vector)
- as.factor(): Converts to factor type.
- as.integer(): Converts to integer type. The fractional part is dropped (truncated toward zero), not rounded.
num_vector <- c(1.2, 2.5, 3.7)
int_vector <- as.integer(num_vector)
print(int_vector)
- as.logical(): Converts to logical type. Any non-zero numeric value is converted to TRUE and zero is converted to FALSE. For character vectors, strings such as "TRUE", "true", "T", "FALSE", "false", and "F" are converted to the corresponding logical value, and any other string becomes NA.
num_vector <- c(1, 0, 2)
logical_vector <- as.logical(num_vector)
print(logical_vector)
You can also apply these functions to matrices, lists, and the columns of a data frame. Keep in mind that every element must be convertible to the target type; otherwise R returns NA for the values it cannot convert (with a warning) or stops with an error.
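To illustrate what happens when a value cannot be converted, here is a small sketch with a made-up character vector. The warning is suppressed only to keep the example output clean.

```r
x <- c("1", "b", "3")  # "b" has no numeric interpretation

# as.numeric() normally warns "NAs introduced by coercion" here
y <- suppressWarnings(as.numeric(x))
y         # 1 NA 3
is.na(y)  # FALSE TRUE FALSE

# Mixing types inside c() triggers implicit coercion to character
class(c(1, "canada", 3))  # "character"
```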
Question
Why is class(3L) "integer"?
Why does 3L - 3 equal 0?
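One way to explore these questions is to run the expressions directly. The L suffix tells R to store the literal as an integer, while a bare 3 is stored as a double; when the two are mixed in arithmetic, the integer is promoted to double and the subtraction proceeds as usual.

```r
class(3)       # "numeric": a plain number is stored as a double
class(3L)      # "integer": the L suffix requests integer storage
3L - 3         # 0: both represent the same value
class(3L - 3)  # "numeric": the integer was promoted to double
```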
R Sorting
Sorting is an important operation when working with data in R. It helps organize data in a way that’s easier to understand and analyze.
Sorting Vectors
To sort a vector in ascending or descending order, you can use the sort() function.
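A minimal sketch using a small made-up vector: sort() returns a sorted copy and leaves the original untouched, and the decreasing argument reverses the direction.

```r
x <- c(31, 4, 15, 92, 65)  # illustrative values

sort(x)                     # 4 15 31 65 92 (ascending is the default)
sort(x, decreasing = TRUE)  # 92 65 31 15 4
x                           # 31 4 15 92 65 (the original is unchanged)
```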
Finding Indices of Sorted Elements
If you want to get the indices of the sorted elements (rather than the sorted elements themselves), you can use the order() function.
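The indices returned by order() are most useful for rearranging a second vector so that it follows the sort order of the first. In the sketch below, totals and ids are hypothetical vectors invented for illustration.

```r
totals <- c(31, 4, 15, 92, 65)        # values to sort by (illustrative)
ids    <- c("a", "b", "c", "d", "e")  # hypothetical labels, one per value

idx <- order(totals)
idx          # 2 3 1 5 4: the smallest value sits at position 2, and so on
totals[idx]  # 4 15 31 65 92, the same result as sort(totals)
ids[idx]     # "b" "c" "a" "e" "d": labels arranged by increasing total
```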
Finding Maxima and Minima
R provides the functions max(), min(), which.max(), and which.min() to find the maximum and minimum elements and their indices:
# Assumption: murders is the data set shipped with the dslabs package
library(dslabs)
data(murders)
# Max and Min
max_value <- max(murders$total) # returns the largest element
max_value
min_value <- min(murders$total) # returns the smallest element
min_value
# Indices of Max and Min
max_index <- which.max(murders$total) # position of the largest number of murders
max_index # the 5th element is the largest
murders$state[max_index] # state name with highest number of total murders
min_index <- which.min(murders$total) # position of the smallest number of murders
min_index # the 46th element is the smallest
murders$state[min_index] # state name with lowest number of total murders
Ranking Elements
The rank() function returns the rank of each element in a vector, where rank 1 is the smallest value:
rank(murders$population)
Table 1: sort(), order(), and rank() applied to the vector c(31, 4, 15, 92, 65)
Original | Sort | Order | Rank |
31 | 4 | 2 | 3 |
4 | 15 | 3 | 1 |
15 | 31 | 1 | 2 |
92 | 65 | 5 | 5 |
65 | 92 | 4 | 4 |
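Explanation of Table 1
Sort: the original values arranged from smallest to largest.
Order: for each value in the Sort column, its position (index) in the original vector.
Rank: for each value in the Original column, its position in the sorted order.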
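The columns of Table 1 can be reproduced with a few lines of R. This is a small sketch; the names x and tab are chosen here for illustration.

```r
x <- c(31, 4, 15, 92, 65)

# One column per function, side by side with the original values
tab <- data.frame(
  original = x,
  sort     = sort(x),
  order    = order(x),
  rank     = rank(x)
)
tab

# Indexing the original vector by order(x) gives the sorted vector
all(x[order(x)] == sort(x))  # TRUE
```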
Vector Arithmetic
You can perform arithmetic operations between a vector and a single number, as well as between two vectors.
Arithmetic with a Single Number
You can perform an operation between a vector and a single number, which applies the operation to each element of the vector:
# Adding 2 to all elements of a vector
codes <- c(380, 124, 818) # illustrative values (assumed); any numeric vector works
new_vec <- codes + 2
new_vec # 382 126 820
Arithmetic with Two Vectors
You can also perform arithmetic operations between two vectors of the same length:
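As a minimal sketch, the operation is applied element by element: the first element of one vector is paired with the first element of the other, and so on. The vectors a and b are made up for illustration.

```r
a <- c(1, 2, 3)
b <- c(10, 20, 30)

a + b  # 11 22 33
a * b  # 10 40 90
b / a  # 10 10 10
```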
Example: Vector Arithmetic with the murders Dataset
# The name of the state with the maximum population
murders$state[which.max(murders$population)]
# Murder rate per 100,000 people, computed element by element
murder_rate <- murders$total / murders$population * 100000
# State names ordered by murder rate, from highest to lowest
murders$state[order(murder_rate, decreasing = TRUE)]
Conclusion
To sum up, a solid understanding of vectors and sorting in R will benefit your data analysis work considerably. You will encounter these operations frequently, and mastering them will make your data manipulation tasks easier and more efficient.