SupR: Multithreaded and Distributed R for Big Data Analysis

How to Create a Virtual Cluster
  • Install VirtualBox and Ubuntu 20.04
  • Example: Ubuntu 20.04 LTS with Base Memory: 2048 MB, Storage: 40 GB

  • Install Ubuntu 20.04 packages that are needed to install R
    
    ## Run the system commands (in a command line terminal):
    
    sudo apt update
    sudo apt install build-essential -y
    sudo apt install gfortran -y
    sudo apt install libreadline-dev -y
    sudo apt install libx11-dev -y
    sudo apt install libxt-dev -y
    sudo apt install zlib1g-dev -y
    sudo apt install libbz2-dev -y
    sudo apt install liblzma-dev -y
    sudo apt install libpcre3-dev -y
    sudo apt install libcurl4 libcurl4-openssl-dev -y
    sudo apt install texlive-full -y
    sudo apt install libssl-dev -y
    sudo apt install libxml2-dev -y
    sudo apt install openjdk-11-jre-headless -y
    sudo apt install openjdk-11-jdk-headless -y
    # sudo apt install pcre2-utils -y
    sudo apt install libpcre2-dev -y
    sudo apt install libopenblas-dev -y
    sudo apt install liblapack-dev -y
    
    # sudo apt install vim -y
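The separate apt install calls above can also be collapsed into one invocation. The following is a minimal sketch, assuming the same package names as listed; R_BUILD_DEPS is an illustrative variable name, not something the guide defines. It only prints the command so that it can be reviewed before running.

```shell
#!/bin/sh
# Package names copied from the list above (vim and pcre2-utils are left out,
# since they are commented out there). R_BUILD_DEPS is an illustrative name.
R_BUILD_DEPS="build-essential gfortran libreadline-dev libx11-dev libxt-dev \
zlib1g-dev libbz2-dev liblzma-dev libpcre3-dev libcurl4 libcurl4-openssl-dev \
texlive-full libssl-dev libxml2-dev openjdk-11-jre-headless \
openjdk-11-jdk-headless libpcre2-dev libopenblas-dev liblapack-dev"

# Print the single equivalent command for review; remove "echo" to run it.
echo sudo apt install -y $R_BUILD_DEPS
```

Installing everything in one call lets apt resolve dependencies once instead of nineteen times.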
    		 
  • Install openssh (Ref: https://www.simplified.guide/ubuntu/install-ssh-server)
    
    
    ## Run the system commands:
    
    sudo apt update
    sudo apt install openssh-server -y
    sudo systemctl status ssh
    
    ## SSH login without a password (http://www.linuxproblem.org/art_9.html)
    ssh-keygen -t rsa
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
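Two things commonly go wrong with the key setup above: running the append twice duplicates the key, and sshd (with its default StrictModes setting) may ignore authorized_keys when the file or its directory is writable by other users. The sketch below addresses both; authorize_key is a hypothetical helper name, and the 700/600 modes are the conventional choice rather than something the original guide specifies.

```shell
#!/bin/sh
# authorize_key PUBKEY_FILE [SSH_DIR]
# Append a public key to SSH_DIR/authorized_keys unless it is already listed,
# then restrict permissions. The function name is illustrative.
authorize_key() {
    pubkey_file=$1
    ssh_dir=${2:-"$HOME/.ssh"}
    auth_file="$ssh_dir/authorized_keys"

    [ -r "$pubkey_file" ] || { echo "cannot read $pubkey_file" >&2; return 1; }
    mkdir -p "$ssh_dir" && touch "$auth_file" || return 1

    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -qxF -- "$(cat "$pubkey_file")" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    chmod 700 "$ssh_dir"
    chmod 600 "$auth_file"
}

# Typical use after ssh-keygen (not run here):
#   authorize_key ~/.ssh/id_rsa.pub
#   ssh localhost true   # should no longer prompt for a password
```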
    		 
  • Install Python-3.6
    
    ## Run the system commands:
    
    sudo add-apt-repository ppa:deadsnakes/ppa 
    sudo apt install python3.6 -y
    sudo apt install python3.6-dev -y
    pkg-config --list-all |grep python
    		 
  • Install R-3.6.1
    
    ## Run the system commands:
    
    wget https://cran.r-project.org/src/base/R-3/R-3.6.1.tar.gz
    tar -zxvf R-3.6.1.tar.gz
    cd R-3.6.1
    
    # Using the default location, /usr/local
    ./configure --with-x=yes --enable-R-shlib CFLAGS=' -g3 -pthread -Wall -Wnested-externs -pedantic' CXXFLAGS=' -pthread -lpthread'
    
    # make clean
    make
    sudo make install
    
    # Checking:
    pkg-config --cflags libR
    pkg-config --libs libR
    
    ## Add one of the following two lines to the bash run-commands file ~/.bashrc
    ## to set the environment variable R_HOME
    export R_HOME=$(pkg-config --variable=rhome libR)
    # or
    export R_HOME="/usr/local/lib/R"
    
    
    ##
    ## Open a new terminal so that ~/.bashrc is read again, then check:
    which R
    R CMD config --all |grep local
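When the R_HOME line is added to ~/.bashrc from a script, it helps to append it only if it is not already there, so that repeated runs do not pile up duplicates. A minimal sketch, assuming the default /usr/local install location used above; add_line_once is a hypothetical helper name.

```shell
#!/bin/sh
# add_line_once LINE FILE
# Append LINE to FILE only if FILE does not already contain it verbatim.
add_line_once() {
    line=$1
    file=$2
    touch "$file" || return 1
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example (not run here): persist R_HOME for new shells.
#   add_line_once 'export R_HOME="/usr/local/lib/R"' ~/.bashrc
```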
    		 
  • Install the devtools R package (optional)
    
    # Start R
    R
    # In R, run:
    install.packages("devtools")
    # in case of errors ...
    install.packages("usethis")
    install.packages("devtools")
    		 
  • Install the supr3 package (Not in RStudio)

  • Network configuration

    Follow these steps to configure networking for your Ubuntu machines:

    1. In the main menu of the VirtualBox app (Oracle VM VirtualBox Manager),
      click File → Host Network Manager...
      Create vboxnet0 (the name is assigned automatically) and enable its DHCP server.
    2. For each guest machine, select it in the VirtualBox app,
      click Settings → Network → Adapter 2,
      tick Enable Network Adapter, and set Attached to: "Host-only Adapter", Name: "vboxnet0".
    3. Start Ubuntu guest machines and check network settings, for example, with the Ubuntu command
      ip -c a
    Reference: https://www.virtualbox.org/manual/ch06.html
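Step 2 above can also be done from the host's command line with VBoxManage instead of the GUI. The sketch below only prints the commands; it assumes the VMs are named Ubuntu-01 to Ubuntu-03 (hypothetical VM names, borrowed from the host names used later in this guide) and that vboxnet0 already exists (it can be created in the GUI as in step 1, or with VBoxManage hostonlyif create). VBoxManage modifyvm works only on powered-off machines, and host-only networking options can differ between VirtualBox versions and host platforms, so check the manual for your release. hostonly_cmds is an illustrative function name.

```shell
#!/bin/sh
# hostonly_cmds IFNAME VM...
# Print the VBoxManage commands that attach adapter 2 of each VM to a
# host-only interface. Nothing is executed here; review the output, then
# pipe it to sh on the host to apply it.
hostonly_cmds() {
    ifname=$1
    shift
    for vm in "$@"; do
        printf 'VBoxManage modifyvm "%s" --nic2 hostonly --hostonlyadapter2 %s\n' \
            "$vm" "$ifname"
    done
}

hostonly_cmds vboxnet0 Ubuntu-01 Ubuntu-02 Ubuntu-03
```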

  • Install Guest Additions

    To install the Guest Additions in a VirtualBox guest OS, follow these steps:

    1. Click the Devices menu → choose Insert Guest Additions CD image…
    2. Follow the instructions

  • Shared Folders

    1. Create a folder on the host machine to be shared. For example, on a Mac: Settings → Share → Enable File Sharing
    2. Create a shared folder on the guest machine:
      • Select Devices → Shared Folders...
      • Choose the Add button
      • Select a folder
      • Tick the Auto-mount checkbox. To access auto-mounted shared folders, a guest user must be a member of the group vboxsf. Add the user with:
        sudo usermod -aG vboxsf userName
        Restart the guest machine for the new group membership to take effect.
      • Select the Make permanent option
    3. Create a subdirectory in a shared folder to be used as SUPR_HOME_USER instead of the default subdirectory ~/.supr. Alternatively, create a symbolic link to this subdirectory by running the command
      # mv ~/.supr ~/.supr_save
      ln -s SomeSharedFolder/SuprUserDir ~/.supr
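The symbolic-link variant of step 3 can be wrapped in a small function that first moves an existing ~/.supr out of the way, as the commented-out mv above suggests. This is a sketch under assumptions: link_supr_home is a hypothetical helper name, and the paths in the usage comment are placeholders for your own shared folder.

```shell
#!/bin/sh
# link_supr_home TARGET_DIR [LINK_PATH]
# Point LINK_PATH (default ~/.supr) at TARGET_DIR. An existing real directory
# is first moved aside to LINK_PATH_save; an existing symlink is replaced.
link_supr_home() {
    target=$1
    link=${2:-"$HOME/.supr"}

    [ -d "$target" ] || { echo "no such directory: $target" >&2; return 1; }
    if [ -L "$link" ]; then
        rm "$link"
    elif [ -e "$link" ]; then
        # Refuse to overwrite an earlier backup.
        [ ! -e "${link}_save" ] || { echo "${link}_save already exists" >&2; return 1; }
        mv "$link" "${link}_save" || return 1
    fi
    ln -s "$target" "$link"
}

# Example (not run here; the path is a placeholder):
#   link_supr_home /path/to/SomeSharedFolder/SuprUserDir
```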

  • Clone Virtual Machines to Create a Virtual Cluster

    Right-click on the guest machine, select the Clone... option, and follow the instructions.

    1. Change the computer name on Ubuntu Linux (Ref: https://www.cyberciti.biz/faq/ubuntu-change-hostname-command/):
      • Edit /etc/hostname with a text editor such as nano or vi:
        sudo nano /etc/hostname
        Delete the old name and enter the new one.
      • Next, edit the /etc/hosts file:
        sudo nano /etc/hosts
        Replace every occurrence of the old computer name with the new one.
      • Reboot the system for the changes to take effect:
        sudo reboot
    2. Add host entries to the /etc/hosts file, one line per host, in the format shown in this example (taken from the machine named Ubuntu-03):
      
      127.0.0.1	localhost
      127.0.1.1	Ubuntu-03
      192.168.56.101	Ubuntu-01
      192.168.56.102	Ubuntu-02
      192.168.56.103	Ubuntu-03
      
      
      # The following lines are desirable for IPv6 capable hosts
      ::1     ip6-localhost ip6-loopback
      fe00::0 ip6-localnet
      ff00::0 ip6-mcastprefix
      ff02::1 ip6-allnodes
      ff02::2 ip6-allrouters
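Both steps above, renaming a cloned machine in its hosts file and listing all cluster members, are repetitive enough to script. The sketch below is one possible approach, not part of the original guide: rename_in_hosts and cluster_hosts are hypothetical helper names, and the addresses assume the DHCP server hands out 192.168.56.101 onward as in the example (verify with ip -c a on each guest). Lines changed by rename_in_hosts have their whitespace collapsed to single spaces, which is still valid in a hosts file.

```shell
#!/bin/sh
# rename_in_hosts OLD NEW FILE
# Replace whole-field occurrences of host name OLD with NEW in a hosts-style
# FILE, so that renaming Ubuntu-01 does not touch Ubuntu-010.
# The previous content is kept as FILE.bak.
rename_in_hosts() {
    old=$1 new=$2 file=$3
    cp "$file" "$file.bak" || return 1
    awk -v old="$old" -v new="$new" '
        /^[ \t]*#/ { print; next }    # leave comment lines untouched
        { for (i = 1; i <= NF; i++) if ($i == old) $i = new; print }
    ' "$file.bak" > "$file"
}

# cluster_hosts PREFIX FIRST COUNT
# Print COUNT hosts-file lines for hosts Ubuntu-01, Ubuntu-02, ... with
# consecutive addresses PREFIX.FIRST, PREFIX.(FIRST+1), ...
cluster_hosts() {
    prefix=$1 first=$2 count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s.%d\tUbuntu-%02d\n' "$prefix" $((first + i)) $((i + 1))
        i=$((i + 1))
    done
}

# Prints the three host entries used in the example above.
cluster_hosts 192.168.56 101 3
```

On a real guest, both helpers need root to modify /etc/hosts; for instance, the generated lines could be appended with cluster_hosts 192.168.56 101 3 | sudo tee -a /etc/hosts.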
      	    

  • Miscellaneous Topics

    • Commands for mounting file systems: mount and umount
    • VBoxManage application on the host machine
    For more information, see https://www.virtualbox.org/manual/UserManual.html