machine learning data mining pipeline crisp-dm[+] 3.101 ┆ Overview of the Machine Learning Pipeline
Explains the key concepts behind machine learning and the phases of the machine learning pipeline from data acquisition to model development, model evaluation, and eventual deployment.
#machine learning #data mining #pipeline #crisp-dm
Time: 10 · Level: beginner
machine learning statistics no-free lunch optimization ml[+] 3.106 ┆ No Free Lunch Theorem: No Universal Machine Learning Algorithm
Explains the _No Free Lunch Theorem_ in machine learning, which asserts that no algorithm is universally better than others across all possible problems, so performance depends on the specific task at hand. Highlights the importance of selecting algorithms based on the problem's characteristics, as there is no one-size-fits-all solution.
#machine learning #statistics #no-free lunch #optimization #ml
Time: 20 · Level: beginner
regression statistics[+] 3.201 ┆ Effects of Multicollinearity in Regression
Explains the effects of correlation between features and unequal variance across a feature space.
#regression #statistics
Time: 10 · Level: beginner
missing data machine learning data cleaning imputation feature engineering outliers[+] 3.202 ┆ Overview of Feature Engineering
Provides an overview of the methods and purpose of feature engineering required to shape data for the training of machine learning models.
#missing data #machine learning #data cleaning #imputation #feature engineering #outliers
Time: 30 · Level: beginner
outliers z-score[+] 3.203 ┆ Detecting and Managing Outliers
Explains common strategies for detecting outliers in vectors of numeric values.
#outliers #z-score
Time: 15 · Level: beginner
missing data machine learning data cleaning imputation[+] 3.204 ┆ Managing Missing Values in Data
Explains how to deal with missing values. Demonstrates the use of imputation strategies.
#missing data #machine learning #data cleaning #imputation
Time: 45 · Level: beginner
regression statistics[+] 3.206 ┆ Normalizing Numeric Features for Machine Learning Algorithms
Explains how to normalize continuous numeric features for distance-based machine learning algorithms such as kNN, k-means, and SVM. Demonstrates the use of min-max and z-score normalization and explains mean-normalization and unit vector normalization.
#regression #statistics
Time: 45 · Level: beginner
categorical features feature engineering one-hot frequency[+] 3.207 ┆ Encoding Categorical Features
Explains the various encoding schemes for categorical variables, including one-hot, frequency, and weight of evidence.
#categorical features #feature engineering #one-hot #frequency
Time: 45 · Level: beginner
distance euclidean manhattan minkowski[+] 3.208 ┆ Distance Measures
Explains how distance is measured in an n-dimensional space.
#distance #euclidean #manhattan #minkowski
Time: 30 · Level: beginner
knn precision recall specificity F1-Score F1 classification accuracy[+] 3.212 ┆ Evaluating Classification Models
Explains common evaluation metrics for classification models, including accuracy, precision, recall, sensitivity, and F1 Score, among others. Shows how to calculate these metrics and how to interpret the results.
#knn #precision #recall #specificity #F1-Score #F1 #classification #accuracy
Time: 40 · Level: beginner
class imbalance smote oversampling[+] 3.224 ┆ Managing Class Imbalance
Explains common methods for reducing class imbalance.
#class imbalance #smote #oversampling
Time: 60 · Level: intermediate
R statistics time series forecasting MAD MSE functions[+] 3.303 ┆ Basic Time Series Forecasting
Introduces time-series forecasting using weighted moving averages, linear regression, and exponential smoothing. Shows how to build a forecasting model, tune models, evaluate the model, construct ensembles, provide forecast intervals, and account for trend and seasonality.
#R #statistics #time series #forecasting #MAD #MSE #functions
Time: 45 · Level: beginner
knn machine learning classification regression[+] 3.410 ┆ kNN for Classification and Regression
Explains the kNN (k Nearest Neighbor) machine learning algorithm for predicting a categorical target variable (classification) and a continuous numeric target variable (regression).
#knn #machine learning #classification #regression
Time: 45 · Level: beginner
knn machine learning classification[+] 3.411 ┆ Simple Implementation of kNN in R
Presents a simple implementation of kNN for classification and another implementation for regression, both in R.
#knn #machine learning #classification
Time: 30 · Level: beginner
naive bayes bayes machine learning classification spam detection[+] 3.420 ┆ The Naive Bayes Classifier Algorithm for Binary Classification
Explains the Naive Bayes Classifier supervised machine learning algorithm for predicting a binary categorical target variable (classification). Demonstrates the algorithm's use through implementations from various packages including e1071 and klaR. Shows how to bin numeric features into categorical features.
#naive bayes #bayes #machine learning #classification #spam detection
Time: 90 · Level: beginner
naive bayes bayes machine learning classification spam detection[+] 3.435 ┆ Decision Trees: A Worked Example in R
Worked example for predicting credit default using the C5.0 decision tree algorithm in R.
#naive bayes #bayes #machine learning #classification #spam detection
Time: 90 · Level: beginner
regression ols machine learning[+] 3.441 ┆ Ordinary Least Squares Regression
Introduces linear regression and statistical learners. Shows how to build and evaluate a regression model.
#regression #ols #machine learning
Time: 10 · Level: beginner
regression ols machine learning[+] 3.442 ┆ Ridge and Lasso Regression
Introduces linear regression and statistical learners. Shows how to build and evaluate a regression model.
#regression #ols #machine learning
Time: 10 · Level: beginner
evaluation machine learning regression[+] 3.503 ┆ Evaluating Regression Models
Presents a simple implementation of kNN for classification and another implementation for regression, both in R.
#evaluation #machine learning #regression
Time: 10 · Level: beginner
evaluation machine learning regression[+] 3.971 ┆ Synthetic Data Engineering
Presents a worked example of engineering a synthetic data set useful for regression, data analytics, association rules, and data mining. Contains configurable parameters.
#evaluation #machine learning #regression
Time: 10 · Level: beginner
machine learning synthetic data data engineering[+] 3.981 ┆ Synthetic Engineering of a Dataset on Restaurant Visits
Presents a worked example of engineering a synthetic data set of restaurant visits, servers, consumption, and customers. Appropriate for regression, data analytics, and data mining. Contains configurable parameters.
#machine learning #synthetic data #data engineering
Time: 60 · Level: beginner
machine learning synthetic data data engineering[+] 3.982 ┆ Synthetic Engineering of Forex Trades
Presents a worked example of engineering a synthetic data set of foreign currency exchanges in multiple portfolios.
#machine learning #synthetic data #data engineering
Time: 60 · Level: beginner
r r script r notebook markdown literate programming[+] 6.099 ┆ The 'R' Ecosystem: A Quick Primer
Explains the 'R' Universe, the language, history, and tools. Shows how to create R scripts and execute them from the command line as stand-alone programs and contrasts them with R Notebook and the knitting process to produce documents. Provides a short introduction to common IDEs for R.
#r #r script #r notebook #markdown #literate programming
Time: 45 · Level: beginner
r primer[+] 6.100 ┆ Beginning R
Introduces some basic concepts of R, including statements, data frames, vectors, variables, and reading from a CSV file.
#r #primer
Time: 60 · Level: beginner
r primer[+] 6.101 ┆ First Steps in R
Introduces the key programming mechanisms of R. Shows how to work with control structures, variables, functions, and packages. Loads data from CSV files into data frames. Connects to SQL databases.
#r #primer
Time: 45 · Level: beginner
r primer vectors data frames[+] 6.103 ┆ Working with Vectors and Data Frames in R
Demonstrates how to create, access, and manipulate numeric, character (text), and logical data in vectors and data frames.
#r #primer #vectors #data frames
Time: 75 · Level: beginner
r primer loops[+] 6.104 ┆ Quick Guide to R For Programmers
A quick guide for programmers transitioning to R from C, C++, Java, JavaScript, Python, and other high-level languages. Explains key control structures and programming paradigms for R.
#r #primer #loops
Time: 60 · Level: beginner
r primer vectors data frames[+] 6.105 ┆ Factors: Categorical Variables in R
Demonstrates how to define and use factor variables, which are used to implement categorical (enumerated) variables.
#r #primer #vectors #data frames
Time: 45 · Level: beginner
r csv tsv files excel data frames[+] 6.106 ┆ Import Data into R from CSV, TSV, and Excel Files
Demonstrates how to load data from CSV, TSV, Excel, and other text files into data frames for processing. Explains how to save large R objects in binary RData files.
#r #csv #tsv #files #excel #data frames
Time: 45 · Level: beginner
r dplyr filtering summarizing selecting[+] 6.107 ┆ Data Manipulation with dplyr
Demonstrates the use of the dplyr package of the Tidyverse for data manipulation, summarization, and filtering.
#r #dplyr #filtering #summarizing #selecting
Time: 45 · Level: beginner
r primer loops iteration apply[+] 6.108 ┆ Loops and Iteration in R
Shows how to perform iteration over vectors and lists using loops in R. Contrasts loops with vector operations. Provides time comparisons.
#r #primer #loops #iteration #apply
Time: 45 · Level: beginner
r r script[+] 6.109 ┆ R Scripts and Programs
Shows how to create R scripts and execute them from the command line as stand-alone programs.
#r #r script
Time: 45 · Level: beginner
r random numbers[+] 6.111 ┆ Generating Random Numbers in R
Introduces random number generators as computer algorithms. Shows how to generate random numbers for simulation, statistics, and synthetic data generation in R. Demonstrates random number functions in R.
#r #random numbers
Time: 60 · Level: beginner
r r studio primer[+] 6.112 ┆ Basics of Text & String Processing in R
Explains string and text processing in R, including regular expressions.
#r #r studio #primer
Time: 45 · Level: beginner
r xml xpath primer[+] 6.114 ┆ Primer on Parsing XML with R
A primer on loading XML documents into R and processing the elements.
#r #xml #xpath #primer
Time: 45 · Level: beginner
r primer loops[+] 6.121 ┆ Writing Functions in R
Explains the concept of functions and their implementation in R. Demonstrates some of the unique mechanisms for writing and calling functions in R.
#r #primer #loops
Time: 60 · Level: beginner
r primer objects lists reference class class oop[+] 6.122 ┆ Reference Classes, Objects, and Methods in R
This lesson explores object-based programming in R and demonstrates how to define classes using the Reference Class System, one of three object-oriented class systems available in R.
#r #primer #objects #lists #reference class #class #oop
Time: 45 · Level: intermediate
r xml objects externalization[+] 6.123 ┆ Externalizing R Objects to XML
This lesson explains how objects built using the reference class system can be stored persistently in XML. Shows how objects are instantiated from XML and externalized to XML for persistence.
#r #xml #objects #externalization
Time: 45 · Level: intermediate
r objects serialization reference class[+] 6.124 ┆ Serializing R Reference Class Objects to CSV
This lesson demonstrates through many examples how to serialize reference class objects to CSV and how such objects can be reconstructed from CSV files. Explains the need for container classes and the common methods of adding, removing, finding, and counting objects in containers.
#r #objects #serialization #reference class
Time: 60 · Level: intermediate
R debugging runtime time Sys.time tictoc rbenchmark[+] 6.134 ┆ Measure Run-Time Performance of R Code
Demonstrates how to measure the execution time of R code for profiling, debugging, and performance improvement.
#R #debugging #runtime #time #Sys.time #tictoc #rbenchmark
Time: 45 · Level: beginner
R debugging runtime time Sys.time tictoc rbenchmark[+] 6.135 ┆ Improving Run-Time Performance of R Code
Lists several ways to improve and optimize the run-time performance of R code. Demonstrates strategies through examples.
#R #debugging #runtime #time #Sys.time #tictoc #rbenchmark
Time: 45 · Level: beginner
r xpath xml[+] 6.182 ┆ Generate Synthetic Pharma Sales Data for CSV and XML in R
Shows how to generate synthetic data for CSV and XML with an example that generates pharma sales data.
#r #xpath #xml
Time: 30 · Level: beginner
r xpath xml[+] 6.183 ┆ Extract and Transform PubMed XML Data using XSLT
Shows how to extract a subset of the data in the PubMed data set into a new XML with a different structure using XSLT.
#r #xpath #xml
Time: 30 · Level: beginner
r console message sprintf print paste cat[+] 6.190 ┆ Console Output in R
Illustrates different functions in R for performing console output, including print(), cat(), printf(), sprintf(), and message(). Provides use cases for each and explains when best to use them.
#r #console #message #sprintf #print #paste #cat
Time: 20 · Level: beginner
r debugging[+] 6.191 ┆ Debugging R Code
Debugging is an important element of software development and debugging methods and tools are integral to programming. This lesson demonstrates how to find the source of errors in R code.
#r #debugging
Time: 45 · Level: beginner
r pipes tidyverse magrittr[+] 6.192 ┆ Using Git with R Studio for Version Control
Demonstrates how to install, configure, and use Git and GitHub within R Studio for version control and team projects.
#r #pipes #tidyverse #magrittr
Time: 45 · Level: beginner
r testthat unit testing testing[+] 6.194 ┆ Automated Unit and Integrity Testing in R
Demonstrates how to build simple unit tests in R using the 'testthat' package. Explains common unit and system testing principles. Shows how to test database code.
#r #testthat #unit testing #testing
Time: 45 · Level: beginner
r debugging r script programming[+] 6.195 ┆ Using the Debugger in R Studio
This lesson provides an introduction to the debugger in R Studio and how to use it to find logic errors in R programs.
#r #debugging #r script #programming
Time: 45 · Level: beginner
r r studio primer[+] 6.202 ┆ Working with R Projects
Explains the benefits of working with R Projects. Shows how to create a new R Project in R Studio.
#r #r studio #primer
Time: 45 · Level: beginner
r r studio r markdown literate programming[+] 6.204 ┆ Literate Programming with R Notebooks
Explains the benefits of working with R Notebooks and how to leverage code chunks and follow the principles of Literate Programming.
#r #r studio #r markdown #literate programming
Time: 45 · Level: beginner
r markdown literate programming[+] 6.206 ┆ Programming Style Guide for R
Provides style recommendations for R programming.
#r #markdown #literate programming
Time: 20 · Level: beginner
r markdown literate programming[+] 6.208 ┆ Writing Documents with Markdown
Explains markdown and how to use R Studio and R Notebooks to write dynamic documents with embedded code.
#r #markdown #literate programming
Time: 30 · Level: beginner
r SQL sqldf database sqlite[+] 6.300 ┆ SQLite with R: A Primer
Demonstrates how to create a simple SQLite (or any other relational) database from R using {sql} code chunks and function calls. Most of the concepts apply to all relational databases, including MySQL.
#r #SQL #sqldf #database #sqlite
Time: 20 · Level: beginner
r SQL sqldf database sqlite[+] 6.301 ┆ Working with Databases in R
Demonstrates how to connect to databases in R using SQLite as an example. Shows how to connect, create tables, execute SQL queries, modify the data, and enquire about the structure of the database. Explains R code vs {sql} chunks and the benefits of literate programming. Shows how to summarize data in data frames using SQL via sqldf.
#r #SQL #sqldf #database #sqlite
Time: 45 · Level: beginner
R SQL sqldf database sqlite dbWriteTable[+] 6.302 ┆ Bulk Load Data from CSV into Database in R
Explains how to load data from a CSV into a relational database of multiple tables and map primary key to foreign key references.
#R #SQL #sqldf #database #sqlite #dbWriteTable
Time: 45 · Level: beginner
r xpath xml[+] 6.303 ┆ Data Retrieval from XML via XPath in R
Explains how to retrieve data from an XML document or repository using XPath expressions.
#r #xpath #xml
Time: 45 · Level: beginner
r MySQL db4free AWS AWS RDS aiven freemysqlhosting[+] 6.304 ┆ Configure and Connect to Cloud MySQL from R
Demonstrates how to create a cloud MySQL instance on db4free.net, AWS RDS, aiven.io, and freemysqlhosting.net and then connect to each database from R.
#r #MySQL #db4free #AWS #AWS RDS #aiven #freemysqlhosting
Time: 25 · Level: beginner
r xpath xml[+] 6.305 ┆ Process XML DOM via XPath and Node Traversal
Explains how to retrieve data from an XML into a data frame using a combination of node traversal and XPath expressions.
#r #xpath #xml
Time: 45 · Level: beginner
r dates sql sqlite lubridate[+] 6.306 ┆ Dates in R and SQLite
Reading dates from raw data and using them in R and saving them or loading them from SQLite can be challenging. This tutorial explains how to work with dates in R, SQL, and SQLite.
#r #dates #sql #sqlite #lubridate
Time: 60 · Level: beginner
r xpath xml[+] 6.310 ┆ XML Transformation via XSLT in R
Explains how to transform structured information objects contained in an XML document to another text representation, such as XML, HTML, CSV, JSON, or SQL. Demonstrates how to use the XSLT transformer in R.
#r #xpath #xml
Time: 45 · Level: beginner
r xml xmlToDataFrame dataframe[+] 6.323 ┆ Load Simple XML into Dataframe in R using xmlToDataFrame()
Explains how to load a simple XML file into a dataframe.
#r #xml #xmlToDataFrame #dataframe
Time: 45 · Level: beginner
r xpath xml dom[+] 6.324 ┆ Traverse and Parse XML DOM in R
Explains how to traverse an XML Document Object Model (DOM) using a combination of XPath and node access.
#r #xpath #xml #dom
Time: 45 · Level: beginner
r xpath xml dom parsing sqlite sql[+] 6.328 ┆ Parsing an XML Document and Saving to SQLite Database in R
This example shows how to load and parse an XML document into an internal relational model of data frames. It traverses the DOM tree node by node and saves the data into data frames. The data frames are eventually written to a new database. The example uses only Base R and does not use the tidyverse, which has additional support for managing relational structures.
#r #xpath #xml #dom #parsing #sqlite #sql
Time: 45 · Level: beginner
r SQL sqldf database sqlite[+] 6.330 ┆ Querying Data Frames in R with sqldf
Similar to other languages, R allows data in data frames (from CSV, XML, or other sources) to be treated as tables in a relational database and queried using SQL. Demonstrates how to find data in data frames, combine data from multiple data frames, and aggregate data in data frames using SQL through the sqldf package.
#r #SQL #sqldf #database #sqlite
Time: 45 · Level: beginner
r pipes tidyverse magrittr[+] 6.343 ┆ Using Pipes to Manipulate Tabular Data
Pipes and the magrittr package are powerful tools for aggregating, cleaning, filtering, and processing tabular data and provide an alternative to tidyverse and sqldf.
#r #pipes #tidyverse #magrittr
Time: 45 · Level: beginner
r hashmaps env[+] 6.355 ┆ Speeding up Lookup with Hash Maps in R
Hash maps (hash tables) are an important data structure for storing key/value pairs and being able to retrieve them based on a key in constant time regardless of the number of key/value pairs. This lesson shows how to use hash maps in R.
#r #hashmaps #env
Time: 30 · Level: beginner
r files folders[+] 6.402 ┆ Navigating the File System in R
This lesson explains how to navigate the file system from R.
#r #files #folders
Time: 45 · Level: beginner
r scatter plot bubble chart plotly[+] 6.724 ┆ Basic Exploratory Data Visualization in R
Shows how to create scatter plots, bar charts, histograms, and line graphs in R to visualize relationships. Presents methods for exploratory data visualization.
#r #scatter plot #bubble chart #plotly
Time: 45 · Level: beginner