Introduction

Dates are a separate data type in R and SQL. A date object can be used in “date aware” calculations, for example in BETWEEN clauses in SQL. Dates have a different format in R and SQL. A text column or character variable, even if they are formatted like a date, are not actual dates.

Converting Text to Date Objects

To convert text to a date object in R, the text string must be in the format YYYY-MM-DD or YYYY/MM/DD.

s2 <- "2021-02-15"
d1 <- as.Date(s2)

print(d1)
## [1] "2021-02-15"

Note that the class (type) of the above variable d1 is of type Date.

Alternatively, the date can be specified as YYYY/MM/DD as demonstrated below.

s3 <- "2019/06/30"
d2 <- as.Date(s3)

print(d2)
## [1] "2019-06-30"

Date Format Specification

To convert text to a date object in R, the text string must be in the format YYYY-MM-DD unless the format parameter is specified.

as.Date("07/04/20", format = "%m/%d/%y")
## [1] "2020-07-04"
as.Date("JUL 04 2021", format = "%b %d %Y")
## [1] "2021-07-04"
as.Date("February 11 80", format = "%B %d %y")
## [1] "1980-02-11"

Here are the various date format specifications:

Symbol Meaning Example
%d day as a number (01-31) 06
%a abbreviated weekday Mon
%A full weekday Monday
%m month (00-12) 11
%b abbreviated month Nov
%B full month March
%y two-digit year 20
%Y four-digit year 2021

Current System Time and Date

To get the current date from the system’s clock, use Sys.Date(). To get the current date and time, use date().

# get current date and time
date()
## [1] "Wed Feb 14 09:48:52 2024"
# get current date only as YYYY-MM-DD
Sys.Date()
## [1] "2024-02-14"

lubridate Package

The lubridate package has many more functions for converting from various other formats, but these are addressed in a different tutorial (See Also).

Dates in SQLite

Let’s create a simple in-memory database with one table (txns), three columns, where the first column in an integer primary key, the second is a date, and the third is text.

library(RSQLite)
db <- dbConnect(RSQLite::SQLite(), ":memory:")
create table txns (
  tid integer primary key,
  tdate date not null,
  tprod text
)

Now that the table is created, let’s insert some values to ensure that the table definition is correct and we can retrieve the data with their correct data type.

insert into txns values 
(100, '2022-03-31', 'Alvarex'),
(200, '2022-03-30', 'Prosilvas'),
(300, '2022-03-30', 'Alvarex'),
(400, '2022-02-09', 'Kitomaron'),
(500, '2022-01-14', 'Blevarex');

Let’s quickly retrieve all rows to ensure they are in the database.

select * from txns;
Table 1: 5 records
tid tdate tprod
100 2022-03-31 Alvarex
200 2022-03-30 Prosilvas
300 2022-03-30 Alvarex
400 2022-02-09 Kitomaron
500 2022-01-14 Blevarex

Indeed, they are. But note that the date column tdate is returned as text (<chr>). To determine if SQLite actually stored them as type date, we can run a SQL query that requires date column type.

select * 
  from txns
 where tdate between '2022-03-01' and '2022-03-31'
Table 2: 3 records
tid tdate tprod
100 2022-03-31 Alvarex
200 2022-03-30 Prosilvas
300 2022-03-30 Alvarex

The above query works, so we can be confident that SQLite stored the dates correctly even though R convert them to text.

And, finally, let’s retrieve some data into a data frame and list the data along with the types of the columns in the data frame. Recall that all result sets of queries are tables.

rs <- dbGetQuery(db, 'select * from txns')

print(rs)
##   tid      tdate     tprod
## 1 100 2022-03-31   Alvarex
## 2 200 2022-03-30 Prosilvas
## 3 300 2022-03-30   Alvarex
## 4 400 2022-02-09 Kitomaron
## 5 500 2022-01-14  Blevarex

Note that the date column came back from the query as text; the data frame has the data type <chr> for the tdate column. To use them as dates, you need to coerce the column type of the data frame to Date.

rs$tdate <- as.Date(rs$tdate)

print(rs)
##   tid      tdate     tprod
## 1 100 2022-03-31   Alvarex
## 2 200 2022-03-30 Prosilvas
## 3 300 2022-03-30   Alvarex
## 4 400 2022-02-09 Kitomaron
## 5 500 2022-01-14  Blevarex

Notice that now the tdate column is of the correct R Date type and we can apply R date functions to that column.

Saving Date Data to SQLite

In the above discussion, we saw that SQL dates are converted to text when retrieved into a data frame and an explicit coercion is required if we want to treat dates as actual R Date objects. A similar issue occurs when we save Date columns in data frames to R. We will demonstrate the issues by creating a data frame which we will then save into a new table in SQLite using dbWriteTable() and then we will save the same data frame to an already created table, again using dbWriteTable(). The difference is that in one case, the table does not yet exist, while in the other it does.

We will use the data frame rs that we retrieved above. To review, we had to coerce the data column explicitly to a Date object. As you can see, the data frame has three columns of types int (integer), date (Date) and chr (text characters), respectively.

print(rs)
##   tid      tdate     tprod
## 1 100 2022-03-31   Alvarex
## 2 200 2022-03-30 Prosilvas
## 3 300 2022-03-30   Alvarex
## 4 400 2022-02-09 Kitomaron
## 5 500 2022-01-14  Blevarex

Let’s save this entire data frame to a new table in the in-memory SQLite database and then inspect the data types of the columns of the table in the database.

dbWriteTable(db, 'newTxns', rs)

Now, if we retrieve we know from the experiments above that the date column will be text, but we want to know what the database stores them as. The code below uses a PRAGMA to query the database itself.

PRAGMA table_info(newTxns);
Table 3: 3 records
cid name type notnull dflt_value pk
0 tid INTEGER 0 NA 0
1 tdate REAL 0 NA 0
2 tprod TEXT 0 NA 0

Let’s compare that to the structure of a table that we create manually.

create table newManTxns (
  tid integer primary key,
  tdate date not null,
  tprod text
)
PRAGMA table_info(newManTxns);
Table 4: 3 records
cid name type notnull dflt_value pk
0 tid INTEGER 0 NA 1
1 tdate date 1 NA 0
2 tprod TEXT 0 NA 0

Notice how the table created by dbWriteTable() has REAL as the data type for the tdate column while the manually created table has the correct SQL date data type.

So, we know that saving a data frame to SQLite will change the date column data type from Date to REAL – not very friendly and useful. Let’s see what happens when we write the data frame to an existing table. As a reminder, the columns of the data frame must be in the same order as in the table definition. Because, the table already exists, we need to specify append=T; be careful not to specify overwrite=T as that will wipe out the table and re-create it.

dbWriteTable(db, 'newManTxns', rs, append=T)

And, again, let’s inspect the table.

PRAGMA table_info(newManTxns);
Table 5: 3 records
cid name type notnull dflt_value pk
0 tid INTEGER 0 NA 1
1 tdate date 1 NA 0
2 tprod TEXT 0 NA 0

Now, it has the correct data type. Of course, let’s check the data itself, too:

select * from newManTxns;
Table 6: 5 records
tid tdate tprod
100 19082 Alvarex
200 19081 Prosilvas
300 19081 Alvarex
400 19032 Kitomaron
500 19006 Blevarex

The tade column has an int data type, but let’s be sure that is correct in the database. We’ll run the same query as we did above that requires the data to be of a date type.

select * 
  from newManTxns
 where tdate between '2022-03-01' and '2022-03-31'
Table 7: 0 records
tid tdate tprod

And, as you can see, that did not work – which implies that R or SQLite modified the data and that a Date object is not correctly written to a SQLite database to a date column.

The final experiment uses an R data frame that has the date as text and writes it to a table in the database that has a date column.

We need to first remove all rows from the newManTxns table so we can work with an empty table.

delete from newManTxns;
rs2 <- dbGetQuery(db, 'select * from txns')

print(rs2)
##   tid      tdate     tprod
## 1 100 2022-03-31   Alvarex
## 2 200 2022-03-30 Prosilvas
## 3 300 2022-03-30   Alvarex
## 4 400 2022-02-09 Kitomaron
## 5 500 2022-01-14  Blevarex

So, the data frame rs2 has dates from the database as characters.

dbWriteTable(db, 'newManTxns', rs2, append=T)

Let’s look at the data in the database first:

select * from newManTxns;
Table 8: 5 records
tid tdate tprod
100 2022-03-31 Alvarex
200 2022-03-30 Prosilvas
300 2022-03-30 Alvarex
400 2022-02-09 Kitomaron
500 2022-01-14 Blevarex

And, finally, let’s run the date query again:

select * 
  from newManTxns
 where tdate between '2022-03-01' and '2022-03-31'
Table 9: 3 records
tid tdate tprod
100 2022-03-31 Alvarex
200 2022-03-30 Prosilvas
300 2022-03-30 Alvarex

Now it worked. So what did we learn? Date columns in data frames do not correctly map to date columns in the database. In order for dbWriteTable() to properly write dates to date columns in the database, the column for the dates must be characters in R; they cannot be Date objects. Counterinituitive, but that’s the way it works.

Understanding R to SQL Data Mappings

To understand how R maps objects to SQL, we can use the dbDataType() function.

dbDataType(db, rs$tdate)
## [1] "REAL"

This tells us R will convert the column to REAL when written to the database, which is what we saw above.

Conclusion

Dates must be properly encoded for them to be processed correctly. R and SQL use dates differently and converting between the two formats require care. Date columns in data frames do not get converted properly to columns of type date in SQLite.

Tutorial

The video tutorial demonstrates the constructs introduced in this lesson.


Files & Resources

All Files for Lesson 6.306

Errata

None collected yet. Let us know.

---
title: "Dates in R and SQLite"
params:
  category: 6
  number: 306
  time: 60
  level: beginner
  tags: "r,dates,sql,sqlite,lubridate"
  description: "Reading dates from raw data and using them in R and saving them
                or loading them from SQLite can be challenging. This tutorial
                explains how to work with dates in R, SQL, and SQLite."
date: "<small>`r Sys.Date()`</small>"
author: "<small>Martin Schedlbauer</small>"
email: "m.schedlbauer@neu.edu"
affilitation: "Northeastern University"
output: 
  bookdown::html_document2:
    toc: true
    toc_float: true
    collapsed: false
    number_sections: false
    code_download: true
    theme: spacelab
    highlight: tango
---

---
title: "<small>`r params$category`.`r params$number`</small><br/><span style='color: #2E4053; font-size: 0.9em'>`r rmarkdown::metadata$title`</span>"
---

```{r code=xfun::read_utf8(paste0(here::here(),'/R/_insert2DB.R')), include = FALSE}
```

## Introduction

Dates are a separate data type in R and SQL. A date object can be used in "date aware" calculations, for example in *BETWEEN* clauses in SQL. Dates have a different format in R and SQL. A text column or character variable, even if they are formatted like a date, are not actual dates.

## Converting Text to Date Objects

To convert text to a date object in R, the text string must be in the format *YYYY-MM-DD* or *YYYY/MM/DD*.

```{r}
s2 <- "2021-02-15"
d1 <- as.Date(s2)

print(d1)
```

Note that the *class* (type) of the above variable *d1* is of type `r class(d1)`.

Alternatively, the date can be specified as *YYYY/MM/DD* as demonstrated below.

```{r}
s3 <- "2019/06/30"
d2 <- as.Date(s3)

print(d2)
```

### Date Format Specification

To convert text to a date object in R, the text string must be in the format *YYYY-MM-DD* unless the *format* parameter is specified.

```{r}
as.Date("07/04/20", format = "%m/%d/%y")
as.Date("JUL 04 2021", format = "%b %d %Y")
as.Date("February 11 80", format = "%B %d %y")
```

Here are the various date *format* specifications:

| Symbol | Meaning                 | Example |
|--------|-------------------------|---------|
| %d     | day as a number (01-31) | 06      |
| %a     | abbreviated weekday     | Mon     |
| %A     | full weekday            | Monday  |
| %m     | month (00-12)           | 11      |
| %b     | abbreviated month       | Nov     |
| %B     | full month              | March   |
| %y     | two-digit year          | 20      |
| %Y     | four-digit year         | 2021    |

## Current System Time and Date

To get the current date from the system's clock, use <code>Sys.Date()</code>. To get the current date and time, use <code>date()</code>.

```{r}
# get current date and time
date()

# get current date only as YYYY-MM-DD
Sys.Date()
```

## **lubridate** Package

The **lubridate** package has many more functions for converting from various other formats, but these are addressed in a different tutorial ([See Also](#See_Also)).

## Dates in SQLite

Let's create a simple in-memory database with one table (*txns*), three columns, where the first column in an integer primary key, the second is a date, and the third is text.

```{r}
library(RSQLite)
db <- dbConnect(RSQLite::SQLite(), ":memory:")
```

```{sql connection=db}
create table txns (
  tid integer primary key,
  tdate date not null,
  tprod text
)
```

Now that the table is created, let's insert some values to ensure that the table definition is correct and we can retrieve the data with their correct data type.

```{sql connection=db}
insert into txns values 
(100, '2022-03-31', 'Alvarex'),
(200, '2022-03-30', 'Prosilvas'),
(300, '2022-03-30', 'Alvarex'),
(400, '2022-02-09', 'Kitomaron'),
(500, '2022-01-14', 'Blevarex');

```

Let's quickly retrieve all rows to ensure they are in the database.

```{sql connection=db}
select * from txns;
```

Indeed, they are. But note that the date column *tdate* is returned as text (*\<chr\>*). To determine if SQLite actually stored them as type *date*, we can run a SQL query that requires date column type.

```{sql connection=db}
select * 
  from txns
 where tdate between '2022-03-01' and '2022-03-31'
```

The above query works, so we can be confident that SQLite stored the dates correctly even though R convert them to text.

And, finally, let's retrieve some data into a data frame and list the data along with the types of the columns in the data frame. Recall that all result sets of queries are tables.

```{r}
rs <- dbGetQuery(db, 'select * from txns')

print(rs)
```

Note that the date column came back from the query as text; the data frame has the data type *\<chr\>* for the *tdate* column. To use them as dates, you need to coerce the column type of the data frame to *Date*.

```{r}
rs$tdate <- as.Date(rs$tdate)

print(rs)
```

Notice that now the *tdate* column is of the correct R *Date* type and we can apply R date functions to that column.

## Saving Date Data to SQLite

In the above discussion, we saw that SQL dates are converted to text when retrieved into a data frame and an explicit coercion is required if we want to treat dates as actual R *Date* objects. A similar issue occurs when we save *Date* columns in data frames to R. We will demonstrate the issues by creating a data frame which we will then save into a new table in SQLite using <code>dbWriteTable()</code> and then we will save the same data frame to an already created table, again using <code>dbWriteTable()</code>. The difference is that in one case, the table does not yet exist, while in the other it does.

We will use the data frame *rs* that we retrieved above. To review, we had to coerce the data column explicitly to a *Date* object. As you can see, the data frame has three columns of types *int* (*integer*), *date* (*Date*) and *chr* (text characters), respectively.

```{r}
print(rs)
```

Let's save this entire data frame to a new table in the in-memory SQLite database and then inspect the data types of the columns of the table in the database.

```{r}
dbWriteTable(db, 'newTxns', rs)
```

Now, if we retrieve we know from the experiments above that the date column will be text, but we want to know what the database stores them as. The code below uses a *PRAGMA* to query the database itself.

```{sql connection=db}
PRAGMA table_info(newTxns);
```

Let's compare that to the structure of a table that we create manually.

```{sql connection=db}
create table newManTxns (
  tid integer primary key,
  tdate date not null,
  tprod text
)
```

```{sql connection=db}
PRAGMA table_info(newManTxns);
```

Notice how the table created by <code>dbWriteTable()</code> has *REAL* as the data type for the *tdate* column while the manually created table has the correct SQL date data type.

So, we know that saving a data frame to SQLite will change the date column data type from *Date* to *REAL* -- not very friendly and useful. Let's see what happens when we write the data frame to an existing table. As a reminder, the columns of the data frame must be in the same order as in the table definition. Because, the table already exists, we need to specify *append=T*; be careful not to specify *overwrite=T* as that will wipe out the table and re-create it.

```{r}
dbWriteTable(db, 'newManTxns', rs, append=T)
```

And, again, let's inspect the table.

```{sql connection=db}
PRAGMA table_info(newManTxns);
```

Now, it has the correct data type. Of course, let's check the data itself, too:

```{sql connection=db}
select * from newManTxns;
```

The *tade* column has an *int* data type, but let's be sure that is correct in the database. We'll run the same query as we did above that requires the data to be of a date type.

```{sql connection=db}
select * 
  from newManTxns
 where tdate between '2022-03-01' and '2022-03-31'
```

And, as you can see, that did not work -- which implies that R or SQLite modified the data and that a *Date* object is not correctly written to a SQLite database to a date column.

The final experiment uses an R data frame that has the date as text and writes it to a table in the database that has a *date* column.

We need to first remove all rows from the *newManTxns* table so we can work with an empty table.

```{sql connection=db}
delete from newManTxns;
```

```{r}
rs2 <- dbGetQuery(db, 'select * from txns')

print(rs2)
```

So, the data frame *rs2* has dates from the database as characters.

```{r}
dbWriteTable(db, 'newManTxns', rs2, append=T)
```

Let's look at the data in the database first:

```{sql connection=db}
select * from newManTxns;
```

And, finally, let's run the date query again:

```{sql connection=db}
select * 
  from newManTxns
 where tdate between '2022-03-01' and '2022-03-31'
```

Now it worked. So what did we learn? Date columns in data frames do not correctly map to date columns in the database. In order for <code>dbWriteTable()</code> to properly write dates to *date* columns in the database, the column for the dates must be characters in R; they cannot be *Date* objects. Counterinituitive, but that's the way it works.

## Understanding R to SQL Data Mappings

To understand how R maps objects to SQL, we can use the <code>dbDataType()</code> function.

```{r}
dbDataType(db, rs$tdate)
```

This tells us R will convert the column to *REAL* when written to the database, which is what we saw above.

## Conclusion

Dates must be properly encoded for them to be processed correctly. R and SQL use dates differently and converting between the two formats require care. Date columns in data frames do not get converted properly to columns of type *date* in SQLite.

## Tutorial

The video tutorial demonstrates the constructs introduced in this lesson.

```{=html}
<iframe src="https://player.vimeo.com/video/704260836?h=6db53868ed&amp;badge=0&amp;autopause=0&amp;player_id=0&amp;app_id=58479" frameborder="0" allow="autoplay; fullscreen; picture-in-picture" allowfullscreen width="480" height="270" title="Dates in R and SQLite" data-external="1"></iframe>
```

------------------------------------------------------------------------

## Files & Resources

```{r zipFiles, echo=FALSE}
zipName = sprintf("LessonFiles-%s-%s.zip", 
                 params$category,
                 params$number)

textALink = paste0("All Files for Lesson ", 
               params$category,".",params$number)

# downloadFilesLink() is included from _insert2DB.R
knitr::raw_html(downloadFilesLink(".", zipName, textALink))
```

------------------------------------------------------------------------

## See Also

-   [6.112 Basics of Text & String Processing in R](http://artificium.us/lessons/06.r/l-6-112-text-proc/l-6-112.html)
-   [6.100 Beginning R](http://artificium.us/l-6-100.html)

## Further Study

-   [Dates and Times in R](https://www.stat.berkeley.edu/~s133/dates.html)
-   [Date Formats in R](https://www.r-bloggers.com/2013/08/date-formats-in-r/)
-   [dbDataType: Determine the SQL data type of an R object](https://rdrr.io/cran/DBI/man/dbDataType.html)

## Errata

None collected yet. Let us know.

```{=html}
<script src="https://form.jotform.com/static/feedback2.js" type="text/javascript">
  new JotformFeedback({
    formId: "212187072784157",
    buttonText: "Feedback",
    base: "https://form.jotform.com/",
    background: "#F59202",
    fontColor: "#FFFFFF",
    buttonSide: "left",
    buttonAlign: "center",
    type: false,
    width: 700,
    height: 500,
    isCardForm: false
  });
</script>
```
```{r code=xfun::read_utf8(paste0(here::here(),'/R/_deployKnit.R')), include = FALSE}
```
