Introduction
Like other programming languages, R needs to read data (and other information) from various types of files, including CSV, HTML, PDF, and XML. Files are located in a file system on a local or networked drive and are identified through a file path (hierarchical list of directories) and a file name. The path plus file name are unique and form a kind of key, which means that a file system is a type of hierarchical document database.
Files are located in directories (also called folders on some systems) and have a file name and an (optional) extension. The extension is separated from the file name by a dot (.), although most file systems also allow dots to be part of the file name, e.g., Report.2022.pdf is a legal file name on most systems; the .pdf is the extension. The extension is often used to identify the type of information in the file and the file format, e.g., .pdf is for a PDF file, while .Rmd is for a file containing an R Notebook. Extensions are a convention and can vary between systems. They are generally up to three letters long but do not have to be, e.g., .sqlite is often used for files that contain a SQLite database.
To navigate the file system from within R, we need to learn how to:
- access a file
- know where we are in the file system
- specify a path
- list all files in a folder
- move to a particular folder in the file system
- create new folders in the file system
- list and set permissions on files and folders
- write contents to a file
Folders, Paths, and File Names
The file system is a tree, and the top of the tree is called the “root”. On Windows, the root is the drive letter followed by a colon and a backward slash, e.g., C:\. The backward slash is the escape character in R strings (as in most programming languages), so we generally write either \\ or / instead, e.g., C:/. The macOS file system is a Unix file system. On Unix, the root is identified by the symbolic name /, a forward slash.
File names on Unix are case sensitive, which means that the files Report.2022.pdf and report.2022.pdf are not the same file and can exist in the same folder. On Windows, the case is part of the file name but Windows is not case sensitive, so Report.2022.pdf and report.2022.pdf are the same file and cannot both be in the same folder. It is best to assume case sensitivity.
All file access is relative to a “current working directory”, i.e., the location in which the running program looks by default unless a full path starting with the root folder is specified. So, customers.csv refers to a file in the current working directory, while /users/alfred/data/customers.csv refers to the file customers.csv in the folder data, which is a folder within the folder alfred, which, in turn, is within users, which is directly off the root folder. The former is called a relative path, while the latter is an absolute path. Any path that starts with / on Unix or with a drive letter such as C:/ on Windows is absolute.
Of course, a Windows file system can have additional drives and not only C:. For instance, a USB Drive is likely labeled with D:/. Also, note that we are using / for Windows as that is commonly used from within programs. At the command line you will need to continue using C:\.
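As a small illustration of the two ways to write a Windows path inside an R string, consider the sketch below. The path is made up for this example and does not need to exist.

```r
# A hypothetical Windows path written two ways inside an R string.
# Each backslash must be doubled because "\" starts an escape sequence.
win.path.backslash <- "C:\\data\\customers.csv"
win.path.forward   <- "C:/data/customers.csv"

# Convert backslashes to forward slashes; fixed = TRUE matches "\" literally.
converted <- gsub("\\", "/", win.path.backslash, fixed = TRUE)

print(converted)
print(identical(converted, win.path.forward))
```

Both strings have the same number of characters, since "\\" in the source code stands for a single backslash in the string.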
There are two special folder names: . is the name of the current folder and .. is the name of the parent folder directly above it. So, a path of “../../data/bars.txt” refers to the file bars.txt in the folder data that is located two levels above the current folder. In other words, think of it as directions: it tells R to look up, look up again, then down into data.
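The sketch below shows one way to try out a path containing .. without touching any real data. It builds a throwaway folder tree in the session's temporary folder; all folder and file names are assumptions made for this illustration.

```r
# Build a small, throwaway folder tree inside the session's temporary folder.
root  <- file.path(tempdir(), "paths-demo")
start <- file.path(root, "a", "b")   # pretend this is the current folder
dir.create(start, recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "data"), showWarnings = FALSE)
file.create(file.path(root, "data", "bars.txt"))

# From a/b: up to a, up again to the root of the demo tree, then down into data.
rel <- file.path(start, "..", "..", "data", "bars.txt")
print(file.exists(rel))

# normalizePath() collapses the .. components into an absolute path.
resolved <- normalizePath(rel)
print(basename(dirname(resolved)))
```

Using file.path() to join the components avoids hard-coding a path separator.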
Navigating the File System in R
Getting Current Directory
The function getwd() returns the current working directory for R, i.e., the default folder in which R looks for files that have a relative path name, meaning a path that does not start with the root folder (/ on Unix or C:/ on Windows).
cwd <- getwd()
print(cwd)
## [1] "/Users/mschedlb/Library/CloudStorage/OneDrive-Personal/Teaching/artificium/lessons/06.r/l-6-402-filesystem-from-r"
It is possible to change the current working directory to a different folder using the function setwd(), but this is generally discouraged because it makes programs less portable: they presume a certain folder structure.
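When a temporary change of the working directory cannot be avoided, one common pattern is to restore the previous directory as soon as the work is done. The sketch below assumes a helper named in_folder(), which is hypothetical and not part of R.

```r
# Hypothetical helper: run a function with `folder` as the working directory,
# then restore the previous working directory even if an error occurs.
in_folder <- function(folder, fn) {
  old <- setwd(folder)              # setwd() invisibly returns the previous directory
  on.exit(setwd(old), add = TRUE)   # always switch back when the function exits
  fn()
}

before <- getwd()
seen   <- in_folder(tempdir(), function() getwd())
after  <- getwd()

print(seen)
print(identical(before, after))
```

Because the change is confined to the helper, the rest of the program can keep using paths relative to its original working directory.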
List All Files in a Folder
The code fragment below lists all files in the folder data, which is a subfolder of the current working directory. It returns a vector of file names.
files <- list.files(path = "data")
print(files)
## [1] "cities.txt" "citiesDashboard.Rmd" "listCities.R" "procAll.cpp"
There are a number of useful parameters to list.files(), including:
| Parameter | Meaning | Example |
|:-------------|:----------------------------------------------------|:---------------------|
| pattern | a regular expression describing which files to include | pattern = "\\.cpp$" |
| recursive | whether to include files in subfolders | recursive = TRUE |
| include.dirs | whether to include folders in addition to files | include.dirs = TRUE |
| full.names | whether to list file name only or path + file name | full.names = TRUE |
# pattern is a regular expression: file names ending in .Rmd
files <- list.files(path = ".", pattern = "\\.Rmd$",
                    include.dirs = TRUE, recursive = TRUE)
print(files)
## [1] "data/citiesDashboard.Rmd" "l-6-402.Rmd" "Medix.2021.Q2.Rmd"
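Note that list.files() interprets pattern as a regular expression rather than as a shell-style wildcard. The sketch below compares a regular expression with a wildcard translated by glob2rx() from the package utils; the folder and file names are made up for the illustration.

```r
# Throwaway folder with a few empty files (file names are made up).
demo <- file.path(tempdir(), "pattern-demo")
dir.create(demo, showWarnings = FALSE)
file.create(file.path(demo, c("notes.Rmd", "report.Rmd", "Rmd-tips.txt", "run.R")))

# Regular expression: a literal dot followed by "Rmd" at the end of the name.
rmd.regex <- list.files(path = demo, pattern = "\\.Rmd$")

# glob2rx() translates a wildcard pattern into a regular expression.
rmd.glob <- list.files(path = demo, pattern = glob2rx("*.Rmd"))

print(rmd.regex)
print(identical(rmd.regex, rmd.glob))
```

The file Rmd-tips.txt is not returned because the expression requires .Rmd at the end of the name.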
List All Subfolders in a Folder
To list all of the folders in a folder (subfolders), use the function list.dirs() rather than list.files(). Of the parameters described above, list.dirs() accepts path, full.names, and recursive, but not pattern or include.dirs. The function is recursive by default and, unlike list.files(), returns full paths by default.
dirs <- list.dirs(path = "../../03.ml", recursive = TRUE)
print(dirs)
## [1] "../../03.ml"
## [2] "../../03.ml/l-3-101-intro-ml"
## [3] "../../03.ml/l-3-101-intro-ml/_raw"
## [4] "../../03.ml/l-3-201-multicollinearity"
## [5] "../../03.ml/l-3-201-multicollinearity/_raw"
## [6] "../../03.ml/l-3-202-feature-engineering"
## [7] "../../03.ml/l-3-202-feature-engineering/_images"
## [8] "../../03.ml/l-3-202-feature-engineering/_raw"
## [9] "../../03.ml/l-3-203-outliers"
## [10] "../../03.ml/l-3-203-outliers/_raw"
## [11] "../../03.ml/l-3-204-missing-values"
## [12] "../../03.ml/l-3-204-missing-values/_images"
## [13] "../../03.ml/l-3-204-missing-values/_raw"
## [14] "../../03.ml/l-3-204-missing-values/_raw/tutorial-3-204-missing-values.cmproj"
## [15] "../../03.ml/l-3-204-missing-values/_raw/tutorial-3-204-missing-values.cmproj/media"
## [16] "../../03.ml/l-3-204-missing-values/_raw/tutorial-3-204-missing-values.cmproj/media/1685740328.753531"
## [17] "../../03.ml/l-3-204-missing-values/_raw/tutorial-3-204-missing-values.cmproj/media/1685740328.754153"
## [18] "../../03.ml/l-3-204-missing-values/_raw/tutorial-3-204-missing-values.cmproj/media/1685740328.755121"
## [19] "../../03.ml/l-3-204-missing-values/_raw/tutorial-3-204-missing-values.cmproj/media/1685740328.755121/deleted.SDoJTh"
## [20] "../../03.ml/l-3-204-missing-values/_raw/tutorial-3-204-missing-values.cmproj/media/1685740328.755938"
## [21] "../../03.ml/l-3-204-missing-values/_raw/tutorial-3-204-missing-values.cmproj/media/1685740749.718117"
## [22] "../../03.ml/l-3-204-missing-values/_raw/tutorial-3-204-missing-values.cmproj/media/1685740749.718883"
## [23] "../../03.ml/l-3-204-missing-values/_raw/tutorial-3-204-missing-values.cmproj/media/1685740749.719459"
## [24] "../../03.ml/l-3-204-missing-values/_raw/tutorial-3-204-missing-values.cmproj/recordings"
## [25] "../../03.ml/l-3-204-missing-values/_raw/tutorial-3-204-missing-values.cmproj/recordings/1685740328.756229"
## [26] "../../03.ml/l-3-204-missing-values/_raw/tutorial-3-204-missing-values.cmproj/recordings/1685740328.756566"
## [27] "../../03.ml/l-3-204-missing-values/_raw/tutorial-3-204-missing-values.cmproj/recordings/1685740328.756862"
## [28] "../../03.ml/l-3-204-missing-values/_raw/tutorial-3-204-missing-values.cmproj/recordings/1685740328.757097"
## [29] "../../03.ml/l-3-204-missing-values/_raw/tutorial-3-204-missing-values.cmproj/recordings/1685740328.757417"
## [30] "../../03.ml/l-3-204-missing-values/_raw/tutorial-3-204-missing-values.cmproj/recordings/1685740328.757730"
## [31] "../../03.ml/l-3-204-missing-values/_raw/tutorial-3-204-missing-values.cmproj/recordings/1685740328.757991"
## [32] "../../03.ml/l-3-204-missing-values/presentation"
## [33] "../../03.ml/l-3-204-missing-values/presentation/p-3-204_files"
## [34] "../../03.ml/l-3-204-missing-values/presentation/p-3-204_files/libs"
## [35] "../../03.ml/l-3-204-missing-values/presentation/p-3-204_files/libs/clipboard"
## [36] "../../03.ml/l-3-204-missing-values/presentation/p-3-204_files/libs/quarto-html"
## [37] "../../03.ml/l-3-204-missing-values/presentation/p-3-204_files/libs/revealjs"
## [38] "../../03.ml/l-3-204-missing-values/presentation/p-3-204_files/libs/revealjs/dist"
## [39] "../../03.ml/l-3-204-missing-values/presentation/p-3-204_files/libs/revealjs/dist/theme"
## [40] "../../03.ml/l-3-204-missing-values/presentation/p-3-204_files/libs/revealjs/dist/theme/fonts"
## [41] "../../03.ml/l-3-204-missing-values/presentation/p-3-204_files/libs/revealjs/dist/theme/fonts/league-gothic"
## [42] "../../03.ml/l-3-204-missing-values/presentation/p-3-204_files/libs/revealjs/dist/theme/fonts/source-sans-pro"
## [43] "../../03.ml/l-3-204-missing-values/presentation/p-3-204_files/libs/revealjs/plugin"
## [44] "../../03.ml/l-3-204-missing-values/presentation/p-3-204_files/libs/revealjs/plugin/highlight"
## [45] "../../03.ml/l-3-204-missing-values/presentation/p-3-204_files/libs/revealjs/plugin/markdown"
## [46] "../../03.ml/l-3-204-missing-values/presentation/p-3-204_files/libs/revealjs/plugin/math"
## [47] "../../03.ml/l-3-204-missing-values/presentation/p-3-204_files/libs/revealjs/plugin/notes"
## [48] "../../03.ml/l-3-204-missing-values/presentation/p-3-204_files/libs/revealjs/plugin/pdf-export"
## [49] "../../03.ml/l-3-204-missing-values/presentation/p-3-204_files/libs/revealjs/plugin/quarto-line-highlight"
## [50] "../../03.ml/l-3-204-missing-values/presentation/p-3-204_files/libs/revealjs/plugin/quarto-support"
## [51] "../../03.ml/l-3-204-missing-values/presentation/p-3-204_files/libs/revealjs/plugin/reveal-menu"
## [52] "../../03.ml/l-3-204-missing-values/presentation/p-3-204_files/libs/revealjs/plugin/search"
## [53] "../../03.ml/l-3-204-missing-values/presentation/p-3-204_files/libs/revealjs/plugin/zoom"
## [54] "../../03.ml/l-3-206-feature-normalization"
## [55] "../../03.ml/l-3-206-feature-normalization/_images"
## [56] "../../03.ml/l-3-206-feature-normalization/_raw"
## [57] "../../03.ml/l-3-207-categorical-encoding"
## [58] "../../03.ml/l-3-207-categorical-encoding/l-3-201-slides-figure"
## [59] "../../03.ml/l-3-208-distance-measures"
## [60] "../../03.ml/l-3-208-distance-measures/_images"
## [61] "../../03.ml/l-3-208-distance-measures/_raw"
## [62] "../../03.ml/l-3-212-precision-recall-accuracy"
## [63] "../../03.ml/l-3-303-time-series-forecasting-r"
## [64] "../../03.ml/l-3-410-knn-classification-regression"
## [65] "../../03.ml/l-3-411-knn-implementation-r"
## [66] "../../03.ml/l-3-441-ols-regression"
## [67] "../../03.ml/l-3-503-regression-model-eval"
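A recursive listing can get long quickly. As a sketch, the code below lists only the immediate subfolders by setting recursive = FALSE; the folder names are assumptions made for the illustration.

```r
# Throwaway folder tree (folder names are made up for this sketch).
top <- file.path(tempdir(), "dirs-demo")
dir.create(file.path(top, "lesson-1", "_raw"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(top, "lesson-2"), showWarnings = FALSE)

# Only the immediate subfolders, without the leading path.
subfolders <- list.dirs(path = top, recursive = FALSE, full.names = FALSE)
print(subfolders)

# All levels; the first element is the starting folder itself.
all.folders <- list.dirs(path = top, recursive = TRUE, full.names = TRUE)
print(length(all.folders))
```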
Interactive Folder and File Selection
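Assuming you are running an R program interactively rather than knitting an R Notebook, you can prompt the user to select a file or folder interactively. On Windows, use choose.files() to select files and choose.dir() to select a folder; both functions are available on Windows only. The function file.choose() selects a single file and works on all platforms.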
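# Windows only; filters is a two-column matrix: description, then wildcard
data.File <- choose.files(caption = "Select Data File", multi = FALSE,
                          filters = matrix(c("CSV files (*.csv)", "*.csv"), ncol = 2))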
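For code that should run on any platform, a sketch along the following lines may help. It relies on file.choose() from base R and asks for a file only when a user is present; the helper name pick_data_file() is hypothetical.

```r
# Hypothetical helper: ask for a file only when R runs interactively.
# file.choose() is part of base R and works on Windows, macOS, and Linux.
pick_data_file <- function() {
  if (!interactive()) {
    # Nobody to ask when knitting or running a script non-interactively.
    return(NA_character_)
  }
  file.choose()
}

data.file <- pick_data_file()
print(data.file)
```

Returning NA in the non-interactive case is a design choice of this sketch; a program could just as well fall back to a default file name.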
Checking File and Folder Existence
We can check for the existence of a file or folder with the functions dir.exists() and file.exists(). The functions return TRUE if the file or folder exists and FALSE otherwise.
df.folder <- "data"
# create the folder only if it does not exist yet
if (!dir.exists(df.folder)) {
  dir.create(df.folder)
}
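The two functions do not behave identically: file.exists() also returns TRUE for a folder, whereas dir.exists() returns TRUE only for folders. The sketch below demonstrates this with a throwaway folder; the names are made up for the illustration.

```r
# Throwaway folder containing one empty file.
demo.dir  <- file.path(tempdir(), "exists-demo")
demo.file <- file.path(demo.dir, "cities.txt")
dir.create(demo.dir, showWarnings = FALSE)
file.create(demo.file)

print(file.exists(demo.file))   # TRUE: the file exists
print(dir.exists(demo.file))    # FALSE: it is a file, not a folder
print(file.exists(demo.dir))    # TRUE: folders count as existing files
print(dir.exists(demo.dir))     # TRUE

# Both functions are vectorized and return one value per path.
checks <- file.exists(c(demo.file, file.path(demo.dir, "missing.txt")))
print(checks)
```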
Create New File
Most functions that write data to a file will automatically create a new file as needed, but it is possible to create a new (empty) file directly with file.create(); it is the counterpart of the function dir.create() for creating new folders.
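A minimal sketch of file.create() follows, using a made-up file name in the session's temporary folder. Be aware that file.create() truncates a file that already exists, so it should not be pointed at a file whose contents matter.

```r
# Create an empty file in the temporary folder (file name is made up).
new.file <- file.path(tempdir(), "empty-demo.txt")
created  <- file.create(new.file)   # TRUE on success
print(created)
print(file.size(new.file))          # size in bytes of the new, empty file

# Write something, then call file.create() again: the contents are discarded.
writeLines("some text", new.file)
file.create(new.file)
size.after <- file.size(new.file)
print(size.after)
```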
Copy File
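To copy a file and its contents from one folder to another, use the function file.copy(). In the example below, the file l-6-402.Rmd in the current working directory is copied to the folder /users/tmp. If the destination is an existing folder, the file is copied into it under its original name; otherwise, the destination is taken to be the new path and name of the file. The trailing / simply makes it clear to the reader that a folder is meant.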
file.copy("./l-6-402.Rmd", "/users/tmp/")
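The sketch below shows file.copy() with a folder as the destination and with a new file name as the destination. It works entirely in the session's temporary folder, and all names are assumptions made for the illustration.

```r
# Throwaway source and destination folders.
src.dir  <- file.path(tempdir(), "copy-src")
dest.dir <- file.path(tempdir(), "copy-dest")
dir.create(src.dir, showWarnings = FALSE)
dir.create(dest.dir, showWarnings = FALSE)

src.file <- file.path(src.dir, "cities.txt")
writeLines(c("Boston", "Vienna"), src.file)

# Destination is an existing folder: the copy keeps its original name.
ok.folder <- file.copy(src.file, dest.dir, overwrite = TRUE)

# Destination is a file path: the copy gets the new name.
ok.rename <- file.copy(src.file, file.path(dest.dir, "cities-backup.txt"),
                       overwrite = TRUE)

print(c(ok.folder, ok.rename))
print(list.files(dest.dir))
```

file.copy() returns TRUE or FALSE rather than raising an error, so it is worth checking the return value. By default (overwrite = FALSE) an existing destination file is left untouched.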
Remove a File
To remove a file, use either the function unlink() or the function file.remove().
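The two functions differ in what they return and in whether they can remove folders. A short sketch, again using made-up names in the temporary folder:

```r
# Throwaway folder with a subfolder and one file.
work <- file.path(tempdir(), "remove-demo")
dir.create(file.path(work, "sub"), recursive = TRUE, showWarnings = FALSE)
f1 <- file.path(work, "a.txt")
file.create(f1)

# file.remove() returns TRUE or FALSE for each file.
removed <- file.remove(f1)
print(removed)

# unlink() returns 0 for success and 1 for failure;
# a file that does not exist (any more) is not treated as a failure.
status <- unlink(f1)
print(status)

# With recursive = TRUE, unlink() deletes a folder and everything in it.
unlink(work, recursive = TRUE)
print(dir.exists(work))
```

Deletions made this way do not go through a recycle bin, so recursive = TRUE deserves particular care.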
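Other Useful Functions
Several other functions are helpful when working with files:
- file_ext() from the package tools returns the extension of a file name
- file.choose() from base R lets the user select a file interactively
- is_file() and is_dir() from the package fs test whether a path is a file or a folder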
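As a sketch of how a path can be split into its parts, the code below combines basename() and dirname() from base R with file_ext() and file_path_sans_ext() from the package tools. The path is hypothetical and does not need to exist, since these functions only manipulate the string.

```r
library(tools)   # file_ext(), file_path_sans_ext()

# A made-up path; none of the functions below access the file system.
p <- "/users/alfred/data/Report.2022.pdf"

folder.part <- dirname(p)                          # "/users/alfred/data"
file.part   <- basename(p)                         # "Report.2022.pdf"
ext.part    <- file_ext(p)                         # "pdf"
name.part   <- file_path_sans_ext(basename(p))     # "Report.2022"

print(c(folder.part, file.part, ext.part, name.part))
```

Only the part after the last dot is treated as the extension, so the dot inside Report.2022 stays part of the name.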
Example: Decompress .gz Files
The code below uses the gunzip() function to decompress all .gz (gzip-compressed) files in a folder. Note that the gunzip() function is from the R.utils package.
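# required for the gunzip() function
library(R.utils)
# decompress all .gz files in a folder
folder <- "pubmed-xml"
# pattern is a regular expression: file names ending in .gz
gzfiles <- list.files(path = folder, pattern = "\\.gz$", full.names = TRUE)
# iterating over the vector directly is safe even when no files match
for (gz in gzfiles) {
  # keep the compressed original; skip files that are already decompressed
  gunzip(gz, remove = FALSE, skip = TRUE)
}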
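If installing R.utils is not an option, text files can also be decompressed with base R alone by reading through a gzfile() connection. The sketch below is an alternative under that assumption, not a replacement for gunzip(): it handles text content only, and the file names are made up.

```r
# Create a small gzip-compressed text file to work with.
gz.dir  <- file.path(tempdir(), "gz-demo")
dir.create(gz.dir, showWarnings = FALSE)
gz.file <- file.path(gz.dir, "cities.txt.gz")

con <- gzfile(gz.file, open = "wt")
writeLines(c("Boston", "Vienna"), con)
close(con)

# Decompress: read through a gzfile() connection, write a plain text file.
out.file <- sub("\\.gz$", "", gz.file)   # drop the .gz extension
con <- gzfile(gz.file, open = "rt")
lines <- readLines(con)
close(con)
writeLines(lines, out.file)

print(readLines(out.file))
```

As with remove = FALSE in gunzip(), the compressed original is left in place.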
Errata
None collected yet. Let us know.