Filtering with non-equal time on POSIXct variable in R

Filtering with non-equal time on POSIXct variable in R

Problem Description:

I’m wondering if there is a simple way to filter time with a date-time POSIXct variable.

I discovered non-equal filtering with time variable (hms) is straightforward:

> apple_data
# A tibble: 10 × 6
   SYMBOL DATE       TIME        BB    BO date_time          
   <chr>  <date>     <time>   <dbl> <dbl> <dttm>             
 1 AAPL   2009-01-02 09:30:00  85.6  85.6 2009-01-02 09:30:00
 2 AAPL   2009-01-02 09:30:01  85.6  85.9 2009-01-02 09:30:01
 3 AAPL   2009-01-02 09:30:02  85.6  85.7 2009-01-02 09:30:02
 4 AAPL   2009-01-02 09:30:03  85.6  85.7 2009-01-02 09:30:03
 5 AAPL   2009-01-02 09:30:04  85.6  85.8 2009-01-02 09:30:04
 6 AAPL   2009-01-02 09:30:05  85.6  85.7 2009-01-02 09:30:05
 7 AAPL   2009-01-02 09:30:06  85.6  85.7 2009-01-02 09:30:06
 8 AAPL   2009-01-02 09:30:07  85.6  85.7 2009-01-02 09:30:07
 9 AAPL   2009-01-02 09:30:08  85.6  85.7 2009-01-02 09:30:08
10 AAPL   2009-01-02 09:30:09  85.6  85.7 2009-01-02 09:30:09

apple_data %>% filter(TIME <= as_hms("09:30:05"), TIME >= as_hms("09:30:03"))

# A tibble: 3 × 6
  SYMBOL DATE       TIME        BB    BO date_time          
  <chr>  <date>     <time>   <dbl> <dbl> <dttm>             
1 AAPL   2009-01-02 09:30:03  85.6  85.7 2009-01-02 09:30:03
2 AAPL   2009-01-02 09:30:04  85.6  85.8 2009-01-02 09:30:04
3 AAPL   2009-01-02 09:30:05  85.6  85.7 2009-01-02 09:30:05

Question 1

If I do not have DATE and TIME variables but date_time only instead, which is POSIXct, how could I perform non-equi filtering only with time?

Question 2

I tried extracting TIME from date_time using format(date_time, "%T"), and discovered time filtering can be done even though the output is a string. However, it takes too much time to convert string to hms on big data, and I need it for merging with other data.

Is there a fast way to convert string to hms, or extract hms from date_time from the beginning so that I can skip this costly type conversion? Any suggestions are greatly appreciated.

Reprex
structure(list(SYMBOL = structure(c("AAPL", "AAPL", "AAPL", "AAPL", 
"AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL"), label = "Stock Symbol"), 
    DATE = structure(c(14246, 14246, 14246, 14246, 14246, 14246, 
    14246, 14246, 14246, 14246), label = "Quote date", format.sas = "YYMMDDN8", class = "Date"), 
    TIME = structure(c(34200, 34201, 34202, 34203, 34204, 34205, 
    34206, 34207, 34208, 34209), class = c("hms", "difftime"), units = "secs"), 
    BB = structure(c(85.55, 85.6, 85.56, 85.55, 85.57, 85.56, 
    85.61, 85.61, 85.62, 85.62), label = "Best Bid"), BO = structure(c(85.6, 
    85.86, 85.66, 85.66, 85.8, 85.66, 85.66, 85.66, 85.73, 85.73
    ), label = "Best Offer"), date_time = structure(c(1230888600, 
    1230888601, 1230888602, 1230888603, 1230888604, 1230888605, 
    1230888606, 1230888607, 1230888608, 1230888609), tzone = "UTC", format.sas = "DATETIME20", class = c("POSIXct", 
    "POSIXt"))), row.names = c(NA, -10L), class = c("tbl_df", 
"tbl", "data.frame"))

Solution – 1

1) Calculate the range of seconds, rng, in the comparison times and also in the time of date_time. That avoids character processing for date_time. Note that 86400 equals 24 * 60 * 60.

library(dplyr, exclude = c("filter", "lag"))

rng <- as.difftime(c("09:30:03", "09:30:05"), unit = "secs")
apple_data %>%
  dplyr::filter(between(as.numeric(date_time) %% 86400, !!!rng))

giving:

# A tibble: 3 × 6
  SYMBOL DATE       TIME          BB    BO date_time          
  <chr>  <date>     <hms>      <dbl> <dbl> <dttm>             
1 AAPL   2009-01-02 34203 secs  85.6  85.7 2009-01-02 09:30:03
2 AAPL   2009-01-02 34204 secs  85.6  85.8 2009-01-02 09:30:04
3 AAPL   2009-01-02 34205 secs  85.6  85.7 2009-01-02 09:30:05

2) A base R version of the above is nearly the same.

Between <- function(x, ..., rng = range(c(...))) x >= rng[1] & x <= rng[2]
rng <- as.difftime(c("09:30:03", "09:30:05"), unit = "secs")
apple_data |>
  subset(Between(as.numeric(date_time) %% 86400, rng))
Rate this post
We use cookies in order to give you the best possible experience on our website. By continuing to use this site, you agree to our use of cookies.
Accept
Reject