R programming - Packages

A program for statistical analysis - Part 9

This blog looks at packages in R programming: what they are, how to install and load them, and how to choose the right ones.

What is a Package...?

A package is a convenient way to organize your work and share it with others. Typically, a package includes code, documentation for the package and its functions, some tests to check that everything works as it should, and data sets.

R packages are collections of R functions, compiled code and sample data. They are stored under a directory called "library" in the R environment. R installs a set of packages by default during installation, and more can be added later when they are needed for a specific purpose. When we start the R console, only the default packages are available. Other installed packages have to be loaded explicitly before an R program can use them.
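As a rough sketch of the points above, the following base R calls show where the library directories are, which packages are installed, and which ones are attached to the current session. The variable names are illustrative only.

```r
# Directories ("libraries") in which R looks for installed packages
lib_dirs <- .libPaths()
print(lib_dirs)

# Names of all packages installed in those libraries
installed <- rownames(installed.packages())
print(head(installed))

# Packages that R attaches automatically at startup
default_pkgs <- getOption("defaultPackages")
print(default_pkgs)

# Packages attached to the current session (entries start with "package:")
attached <- grep("^package:", search(), value = TRUE)
print(attached)
```

The output differs from machine to machine, since it depends on which packages have been installed.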

All the packages available in the R language are listed at R Packages.

What are Repositories...?

A repository is a place where packages are stored so that you can install them from it. Organizations and developers can maintain their own local repositories, but most repositories are online and accessible to everyone.
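To make this concrete, here is a minimal sketch of how a session can inspect and set the repository it installs from. It assumes CRAN's cloud mirror as the repository; any other mirror URL could be used instead.

```r
# Show the repositories the current session is configured to use
print(getOption("repos"))

# Point the session at a CRAN mirror (assumption: the cloud mirror)
options(repos = c(CRAN = "https://cloud.r-project.org"))

# Confirm the setting
repos <- getOption("repos")
print(repos[["CRAN"]])
```

The setting only lasts for the current session unless it is placed in a startup file.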

Installation of R Packages

There are two ways to add new R packages. One is to install directly from the CRAN repository; the other is to download the package to your local system and install it manually.

Install Packages directly from CRAN

The command used to install a package directly from CRAN is given below.

Syntax:

install.packages("package name")

Example

install.packages("XML")
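In scripts it is often convenient to install a package only when it is missing. The sketch below shows one possible way to do that; `missing_packages` is a hypothetical helper written for this example, not a function that ships with R, and the install step is guarded so that it only runs in an interactive session.

```r
# Hypothetical helper: return the names in `pkgs` that are not installed yet
missing_packages <- function(pkgs) {
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!is_installed]
}

wanted <- c("stats", "XML")
to_install <- missing_packages(wanted)
print(to_install)

# Install only what is missing; skipped in non-interactive runs
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Because install.packages() accepts a character vector, all missing packages are installed in a single call.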

Install Packages manually

Search for the required R package on the web and download it to a suitable location on the local system. On Windows, prebuilt (binary) packages come as .zip files; source packages come as .tar.gz files and are the ones installed with type = "source".

Syntax:

install.packages(file_name_with_path, repos = NULL, type = "win.binary")

Example

install.packages("E:/XML_3.98-1.3.zip", repos = NULL, type = "win.binary")

How to Load Packages in R Programming Language

Before a package can be used in code, it must be loaded into the current R environment. This also applies to packages that were installed earlier but are not yet available in the current session.

A package is loaded using the following command:

library("package Name", lib.loc = "path to library")


library("XML")
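As a small sketch of loading in practice, the example below attaches the tools package, which ships with R, and then shows two related options: checking for a package without stopping on failure, and calling a function without attaching its package. The variable names are illustrative.

```r
# Attach a package that ships with every R installation
library("tools")
is_attached <- "package:tools" %in% search()
print(is_attached)

# library() stops with an error if the package is missing;
# requireNamespace() returns FALSE instead, which suits conditional code
has_xml <- requireNamespace("XML", quietly = TRUE)
print(has_xml)

# A function can also be called without attaching its package
m <- stats::median(c(1, 5, 9))
print(m)
```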

Difference Between a Package and a Library

library():

It is the command used to load a package. The word "library" itself refers to the place where packages are stored, usually a folder on our computer.

Package:

It is a collection of functions bundled conveniently. The package is an appropriate way to organize our own work and share it with others.
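The distinction can be checked from the console. In this sketch, assuming a standard R installation, the stats package is found as a folder inside one of the library directories.

```r
# A library is a directory; .libPaths() lists the ones R searches
libs <- normalizePath(.libPaths(), winslash = "/")
print(libs)

# A package is a folder inside a library; find.package() returns its path
pkg_path <- normalizePath(find.package("stats"), winslash = "/")
print(pkg_path)

# The parent directory of the package is one of the libraries
in_library <- dirname(pkg_path) %in% libs
print(in_library)
```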

How to Load More Than One Package at a Time

We can pass a vector of names to the install.packages() function to install several packages at once. With the library() function this is not possible, because it loads only one package per call. We can load a set of packages one at a time or, if we prefer, use one of the many workarounds developed by R users.
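One common workaround is to loop over the package names with lapply(). The sketch below uses three packages that ship with R so that it runs without installing anything; in practice the vector would hold whichever packages are needed.

```r
# Packages to load (all three ship with R)
pkgs <- c("tools", "splines", "parallel")

# library() takes one package per call; character.only = TRUE tells it
# to treat its argument as a string rather than as a literal name
invisible(lapply(pkgs, library, character.only = TRUE))

# Check that every package is now attached
all_attached <- all(paste0("package:", pkgs) %in% search())
print(all_attached)
```

The invisible() wrapper only hides the list that lapply() returns; it does not affect loading.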

To unload a given package, use the detach() function. For example:

detach("package:babynames", unload = TRUE)
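The same pattern can be tried end to end with a package that ships with R. This sketch uses splines instead of babynames, purely so that it runs without any extra installation.

```r
# Attach the package and confirm it is on the search path
library("splines")
attached_before <- "package:splines" %in% search()
print(attached_before)

# Detach it and unload its namespace
detach("package:splines", unload = TRUE)
attached_after <- "package:splines" %in% search()
print(attached_after)
```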

A Short note on how to select the right packages

The most basic way to find out which packages we need is to learn the R programming language itself. Beyond that, a first option is to browse the CRAN packages by category. CRAN is the official repository, and it also gives us the option to browse through its packages.

Another way to find packages is RDocumentation, a help documentation aggregator for R packages from CRAN, Bioconductor, and GitHub, which offers a search box for your requests directly on its main page.

Data Handling and Data Interfaces

In R, we can read data from files stored outside the R environment. We can also write data to files that are stored and accessed by the operating system. R can read and write various file formats such as csv, excel, xml etc.

R also lets users work smoothly with the system's directories through some pre-defined functions that either take the path of a directory as an argument or return the path of the current working directory. Below are some directory functions in R:

  1. getwd(): This function is used to get the current working directory used by R.

  2. setwd(): This function is used to change the current working directory. The path of the new directory is passed as an argument to the function.

  3. list.files(): This function lists all files and folders present in the current working directory.
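The three functions above can be tried together in a short sketch. It works in a scratch folder under tempdir() so that no real files are touched; the folder name "wd_demo" and the file name "demo.csv" are made up for this example.

```r
# Remember the current working directory so it can be restored later
old_dir <- getwd()

# Create a scratch directory and make it the working directory
scratch <- file.path(tempdir(), "wd_demo")
dir.create(scratch, showWarnings = FALSE)
setwd(scratch)

# Write a small CSV file, then list the directory and read the file back
write.csv(data.frame(x = 1:3, y = c("a", "b", "c")),
          "demo.csv", row.names = FALSE)
files <- list.files()
print(files)
demo <- read.csv("demo.csv")
print(demo)

# Restore the original working directory
setwd(old_dir)
```

Saving and restoring the working directory keeps the example from changing where the rest of a session reads and writes files.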

In the next blog, we will learn about various data interfaces in R. Follow Quasar Community blogs for more interesting technical blogs.