Datacamp Beginner Course Notes

Symbol

  • Assignment conventionally uses <- rather than =.

Arithmetic Operations

  • The basic operators (+, -, *, /) work as in most other languages.
  • Modulo is represented by %%.
  • Logical values (true/false) are written in uppercase: TRUE and FALSE.
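A minimal sketch of the points above. The variable names and values are made up for illustration and are not from the course.

```r
# Assignment with the arrow operator
x <- 17
y <- 5

total     <- x + y           # 22
quotient  <- x / y           # 3.4
remainder <- x %% y          # modulo: 2
is_even   <- (x %% 2) == 0   # logical value: FALSE

print(remainder)
print(is_even)
```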

Matrices and Vectors

  • Indexing starts at 1, not 0.
  • Functions:
    • rowSums: Sum of each row.
    • colSums: Sum of each column.
    • colnames: Get or set the column names of a matrix.
    • rownames: Get or set the row names of a matrix.
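The functions above can be tried on a small matrix. This is only a sketch; the matrix contents and the row and column names are invented for the example.

```r
# 2 x 3 matrix filled row by row: first row 1 2 3, second row 4 5 6
m <- matrix(1:6, nrow = 2, byrow = TRUE)

# Attach names to rows and columns
rownames(m) <- c("a", "b")
colnames(m) <- c("x", "y", "z")

row_totals <- rowSums(m)   # a = 6, b = 15
col_totals <- colSums(m)   # x = 5, y = 7, z = 9

first_cell <- m[1, 1]      # indexing starts at 1, so this is 1
by_name    <- m["b", "z"]  # names work as indices too: 6
```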

Factors

  • Factors are used to store categorical variables.
  • Given a vector of categorical values, factor() converts it to a factor and extracts its distinct categories (levels).
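As a rough illustration, with an invented vector of colour names:

```r
# A character vector holding categorical values
colours <- c("red", "blue", "red", "green", "blue", "red")

# Convert it to a factor
colour_factor <- factor(colours)

# The distinct categories, sorted alphabetically by default
colour_levels <- levels(colour_factor)   # "blue" "green" "red"

# Number of observations in each category
colour_counts <- summary(colour_factor)  # blue 2, green 1, red 3
print(colour_counts)
```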

Types

  1. Nominal: Categorical variable without an implied order.
  2. Ordinal: Categorical variable with an order.
    • e.g. [0, 1, 2], where 0 < 1 < 2.
    • Defined by passing ordered = TRUE to factor() (the abbreviated order = TRUE also works through partial argument matching).
    • The order itself is given by passing the levels, lowest first, as levels = vector.
    • Once the levels are specified, summary(factors_vector) gives the number of observations at each level.
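A small sketch of an ordinal factor, assuming a made-up vector of sizes and a hand-picked level order:

```r
sizes <- c("medium", "small", "large", "small", "medium")

# ordered = TRUE makes the factor ordinal; levels lists the
# categories from lowest to highest
size_factor <- factor(sizes,
                      ordered = TRUE,
                      levels = c("small", "medium", "large"))

size_counts <- summary(size_factor)  # small 2, medium 2, large 1

# Ordered factors support comparisons between elements
first_is_larger <- size_factor[1] > size_factor[2]  # medium > small: TRUE
```

Without ordered = TRUE, the same comparison would return NA with a warning, since nominal factors have no notion of greater or smaller.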

DataFrame

  • Creating:
    • data.frame(vector1, vector2, …), where each vector becomes one column.
  • Indexing:
    • df$column to get only one column of data.
    • df[rows, columns] indexing, similar to NumPy matrices but starting at 1.
    • Column names can be used in place of numeric column indices, e.g. df[1, "column"].
  • Filtering:
    • subset(df, condition)
  • Sorting:
    • order() returns the indices that would sort a vector; it does not sort anything itself.
    • To sort a whole dataframe, get the indices by calling order() on one column, then use them as row indices: df[sorted_indexes, ].

NOTE: R ships with a built-in dataframe called mtcars, which is handy for practice.
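The dataframe operations above, sketched on a tiny hand-made dataframe. The column names and values are invented for the example; the last line applies the same sorting pattern to mtcars.

```r
# Creating: one vector per column
name <- c("Ann", "Bob", "Cid")
age  <- c(31, 25, 40)
df   <- data.frame(name, age)

# Indexing
ages      <- df$age        # a single column as a vector
bob_age   <- df[2, "age"]  # row 2, column "age": 25
first_two <- df[1:2, ]     # first two rows, all columns

# Filtering: keep rows where the condition holds
over_30 <- subset(df, age > 30)  # Ann and Cid

# Sorting: order() gives row indices, which then index the dataframe
sorted_indexes <- order(df$age)  # 2 1 3
df_sorted      <- df[sorted_indexes, ]

# Same pattern on the built-in dataset: the three cars with the lowest mpg
print(head(mtcars[order(mtcars$mpg), ], 3))
```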

Lists

  • What Python calls a list (a one-dimensional array) is called a vector in R. An R list is a collection of elements that can each be of a different type.
  • List items can be given names with names(), which can then be used to index the list in the same way as dataframe columns, e.g. list$name.
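A short sketch of a named list; the element names and values are placeholders chosen for illustration.

```r
# A list can mix types: character, numeric, logical
item <- list("widget", 3, TRUE)

# Give each element a name
names(item) <- c("label", "count", "in_stock")

label    <- item$label       # access by name with $: "widget"
count    <- item[["count"]]  # double brackets also accept a name: 3
in_stock <- item[[3]]        # or a position: TRUE
```

Single brackets, as in item["count"], return a one-element list rather than the value itself, which is why double brackets or $ are used to pull out an element.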