Sorting with order in R with the whole data frame

Problem Description:

I have a data frame that I’d like to order based on a vector of IDs and on all the columns of another data frame.

id.namestest = data.frame(test = NA, id= c("id1", "id2", "id3","id3", "id2", "id1"))

head(admix)
#             V1        V2           V3
# [1,] 0.1019623 0.8961855 1.852222e-03
# [2,] 0.6891593 0.3107807 5.999776e-05
# [3,] 0.7274040 0.2697308 2.865165e-03
# [4,] 0.3458368 0.6514100 2.753215e-03
# [5,] 0.3946996 0.6053004 1.000000e-09
# [6,] 0.6383386 0.3585409 3.120463e-03

admix=structure(c(0.101962262250848, 0.68915927427333, 0.727404046114676, 
            0.345836796905855, 0.394699646563406, 0.638338623952938, 0.896185515801946, 
            0.310780727965854, 0.26973078933548, 0.65140998802539, 0.605300352436594, 
            0.358540912890725, 0.00185222194720621, 5.99977608165462e-05, 
            0.00286516454984352, 0.00275321506875506, 1e-09, 0.00312046315633649
), dim = c(6L, 3L), dimnames = list(NULL, c("V1", "V2", "V3")))

The following works, but I have to list the columns of admix by hand:

admix.tmp = cbind(admix, id.namestest)
if (K==3) { admix.sort.tmp = admix.tmp[order(id.namestest[,2], admix[,1],admix[,2],admix[,3]),]}

Instead, I’d like to supply the column order as a vector, sort.order:

sort.order = c(1,2,3)

admix.sort.tmp = admix.tmp[order(id.namestest[,2], admix[,sort.order]),]

But I get this:

Error in order(id.namestest[, 2], admix[, c(1, 2, 3)]) : 
  argument lengths differ

I also tried:

admix.sort.tmp = admix.tmp[order(id.namestest[,2], asplit(admix, 2)),]

but I get the same error.

Solution – 1

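As the error shows, id.namestest[,2] is a vector of length 6, whereas admix[, c(1, 2, 3)] is a matrix, so order sees it as a single argument whose length is the total number of elements in the matrix (18). The same applies to asplit(admix, 2), which is passed as one list of length 3. order needs one argument per sort key, all of the same length, so we can build a list of keys and then call order with do.call: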

admix.tmp[do.call(order, c(list(id.namestest[,2]), asplit(admix, 2))),]

Output:

         V1        V2           V3 test  id
1 0.1019623 0.8961855 1.852222e-03   NA id1
6 0.6383386 0.3585409 3.120463e-03   NA id1
5 0.3946996 0.6053004 1.000000e-09   NA id2
2 0.6891593 0.3107807 5.999776e-05   NA id2
4 0.3458368 0.6514100 2.753215e-03   NA id3
3 0.7274040 0.2697308 2.865165e-03   NA id3
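
The length mismatch behind the error, and one way to make the do.call approach follow sort.order, can be sketched as follows. This is a minimal sketch under stated assumptions rather than the answer's own code: it rebuilds the question's objects from the rounded values printed by head(admix), uses lapply in place of asplit to pull out the columns, and order.by.id is a hypothetical helper name. The second value of sort.order, c(2, 1, 3), is chosen only to show a different result.

```r
# Rebuild the question's objects (values rounded as printed by head(admix)).
id.namestest <- data.frame(test = NA,
                           id = c("id1", "id2", "id3", "id3", "id2", "id1"))
admix <- matrix(c(0.1019623, 0.6891593, 0.7274040, 0.3458368, 0.3946996, 0.6383386,
                  0.8961855, 0.3107807, 0.2697308, 0.6514100, 0.6053004, 0.3585409,
                  1.852222e-03, 5.999776e-05, 2.865165e-03,
                  2.753215e-03, 1.000000e-09, 3.120463e-03),
                nrow = 6, dimnames = list(NULL, c("V1", "V2", "V3")))
admix.tmp <- cbind(admix, id.namestest)

# order() needs keys of equal length; a matrix counts every cell.
n.id   <- length(id.namestest[, 2])    # 6: one entry per row
n.cell <- length(admix[, c(1, 2, 3)])  # 18: rows * columns

# Hypothetical helper: one sort key per requested column, in the requested order.
order.by.id <- function(id, m, sort.order) {
  keys <- lapply(sort.order, function(j) m[, j])
  do.call(order, c(list(id), keys))
}

idx.default <- order.by.id(id.namestest[, 2], admix, c(1, 2, 3))
idx.v2first <- order.by.id(id.namestest[, 2], admix, c(2, 1, 3))
admix.sort.tmp <- admix.tmp[idx.default, ]
```

With sort.order = c(1, 2, 3) this gives the row order 1, 6, 5, 2, 4, 3, the same as the manual call in the question. Putting V2 first gives 6, 1, 2, 5, 3, 4.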

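Building the keys as a list of vectors or as a data.frame keeps each column's type intact (binding them into a matrix would coerce the character id and the numeric columns to one common type):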

admix.tmp[do.call(order, cbind(id.namestest[2], admix)),]
         V1        V2           V3 test  id
1 0.1019623 0.8961855 1.852222e-03   NA id1
6 0.6383386 0.3585409 3.120463e-03   NA id1
5 0.3946996 0.6053004 1.000000e-09   NA id2
2 0.6891593 0.3107807 5.999776e-05   NA id2
4 0.3458368 0.6514100 2.753215e-03   NA id3
3 0.7274040 0.2697308 2.865165e-03   NA id3
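
The data.frame variant can also take the column order from sort.order. The sketch below assumes the question's objects (rebuilt here from the rounded printed values so that it runs on its own). Dropping the key names with unname is an extra precaution that is not part of the answer: do.call passes list names as argument names, so a column that happened to be called decreasing or method would otherwise be matched to an argument of order.

```r
# Rebuild the question's objects (values rounded as printed by head(admix)).
id.namestest <- data.frame(test = NA,
                           id = c("id1", "id2", "id3", "id3", "id2", "id1"))
admix <- matrix(c(0.1019623, 0.6891593, 0.7274040, 0.3458368, 0.3946996, 0.6383386,
                  0.8961855, 0.3107807, 0.2697308, 0.6514100, 0.6053004, 0.3585409,
                  1.852222e-03, 5.999776e-05, 2.865165e-03,
                  2.753215e-03, 1.000000e-09, 3.120463e-03),
                nrow = 6, dimnames = list(NULL, c("V1", "V2", "V3")))
admix.tmp <- cbind(admix, id.namestest)
sort.order <- c(1, 2, 3)

# Keys as a data frame: the admix columns stay numeric next to the id column.
keys <- data.frame(id = id.namestest$id, admix[, sort.order, drop = FALSE])
key.types <- vapply(keys, class, character(1))

# Unnamed list of keys, so no column name can be taken for an order() argument.
idx <- do.call(order, unname(as.list(keys)))
admix.sort.tmp <- admix.tmp[idx, ]
```

Changing sort.order changes only which columns of admix are used as keys and in which order; the id column always comes first.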

Or using dplyr

library(dplyr)
admix.tmp %>%
   arrange(id, across(all_of(colnames(admix)[sort.order])))