2 minutes
R Programming 102- Importing Databases and URLs
Databases
Importing Library
library(DBI)
Creating Connection
con <-dbConnect(RMySQL::MySQL(),
dbname = "db_name",
host = "localhost",
port = port_num,
user = "username",
password = "my_password")
This connection object is specific to MySQL database.
Importing data
-
Read Whole Tables
dbListTables(con) dbReadTable(con, "table_name")
-
Read selected data from tables
dbGetQuery(con, "SELECT * FROM TABLE_NAME WHERE date > '2020-02-01'") res <- dbSendQuery(con, "query string") dbFetch(res) dbClearResult(res)The dbGetQuery first sends the query and fetches it. We can do it using dbsendQuery and then dbFetch(obj) to get the result we want. The fetch function can be useful when we want to import records by records rather than having everything at once.
URLs and Other data formats
Reading data from URLs
library(readr)
read_csv("http://xyz.com/filename.csv")
read_tsv("http://xyz.com/filename.tsv")
read_csv("https://xyz.com/filename.csv")
read.csv("https://xyz.com/filename.csv")
Both read.csv method from utils package and read_csv from readr package work fine with urls. It works for fine for all kinds of files if they can be parsed into tabular formats. However read_excel function from readxl package doesn’t support urls.
file_path <- file.path("~", "Downloads", "filename.xlsx")
file_url <- "https://xyz.com/filename.xlsx"
download.file(file_url, file_path)
load("file_name.RData")
In loading RData files, you can’t pass URLs to it. Instead you need to save the file first and then load it using load function.
HTTP Queries
library(httr)
response <- GET(url)
raw_content <- content(response, as="raw") # parsing
raw_content <- content(response, as="text") # parsing
raw_content <- content(response) # parsing json works by default
JSON Files
ID: 2ccc63df-4382-43a4-a611-0a929a24d8e5
library(jsonlite)
fromJSON(url)
fromJSON(json_object)
toJSON(df)
minify(json_obj)
prettify(json_obj)
toJSON(df, pretty=TRUE) # else minified is used by default
Haven Package
library(haven)
read_sas("ontime.sas7bdat")
read_stata()
read_dta()
read_spss()
read_por()
read_sav()
Foreign Package
library(foreign)
read.dta("filename.dta", convert.factors = FALSE, convert.types= TRUE, missing.type = TRUE, convert.underscore = FALSE)
read.spss(file, use.value.labels = FALSE, to.data.frame = FALSE)
Since here the arguments to different functions have different names, it is less consistent. Easier to go through the documentation than explaining it all.
296 Words
2020-10-03 00:00 +0545