🐝A.3) Manually Installing Apache Hive on Windows and Linux: Step-by-Step Guidance

 

Apache Hive Installation Guide

Apache Hive is a data warehouse infrastructure built on top of Hadoop. It provides a SQL-like interface to query large datasets stored in HDFS. In this guide, we will cover a step-by-step manual installation of Hive, including necessary configurations for both Linux and Windows systems.

Prerequisites

  • Operating System: Linux (Ubuntu/CentOS) or Windows
  • Java Development Kit (JDK 8 is the safest choice for the Hive 3.1.x release used in this guide; newer JDKs may not work with it)
  • Hadoop installed and configured
  • MySQL server and the MySQL Connector/J JDBC driver for the Hive Metastore (optional but recommended)
  • Apache Hive package

Installation Steps for Linux

Step 1: Update System and Install Dependencies

First, bring the system packages up to date. Run only the line that matches your distribution:

sudo apt update && sudo apt upgrade -y   # For Ubuntu
sudo yum update -y                      # For CentOS
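Only one of the two lines above applies to a given machine. A minimal sketch of picking the right one automatically might look like this. Note that detect_pkg_manager and update_system are hypothetical helper names, not standard commands, and the check simply looks for apt-get or yum on the PATH.

```shell
# detect_pkg_manager: print "apt", "yum", or "unknown" depending on
# which package manager binary is found on the PATH.
detect_pkg_manager() {
  if command -v apt-get >/dev/null 2>&1; then
    echo apt
  elif command -v yum >/dev/null 2>&1; then
    echo yum
  else
    echo unknown
  fi
}

# update_system: run the update command from this step for the detected manager.
update_system() {
  case "$(detect_pkg_manager)" in
    apt) sudo apt update && sudo apt upgrade -y ;;
    yum) sudo yum update -y ;;
    *)   echo "unsupported package manager" >&2; return 1 ;;
  esac
}
```

Calling update_system then runs the matching update command and fails with a message on anything else.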

Step 2: Install Java

Check if Java is installed:

java -version

If not installed, install OpenJDK:

sudo apt install openjdk-8-jdk -y  # Ubuntu
sudo yum install java-1.8.0-openjdk -y  # CentOS

Set JAVA_HOME:

echo "export JAVA_HOME=$(dirname $(dirname $(readlink -f $(which java))))" >> ~/.bashrc
source ~/.bashrc
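The echo line above stores whatever path `which java` resolved to at that moment, so if Java was missing or the link chain was unusual, JAVA_HOME can end up pointing somewhere useless. A small sanity check, sketched here with a hypothetical helper name (check_java_home is not part of any JDK), could be:

```shell
# check_java_home DIR: succeed only if DIR/bin/java exists and is executable.
check_java_home() {
  dir=$1
  if [ -z "$dir" ]; then
    echo "JAVA_HOME is empty" >&2
    return 1
  fi
  if [ ! -x "$dir/bin/java" ]; then
    echo "no executable found at $dir/bin/java" >&2
    return 1
  fi
  echo "JAVA_HOME looks usable: $dir"
}
```

Run it as check_java_home "$JAVA_HOME" after sourcing ~/.bashrc.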

Step 3: Install Hadoop (If Not Installed)

Hive requires Hadoop to be installed. If you haven't installed it, follow these steps:

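Download Hadoop (if this release is no longer on the main download server, older releases are kept under archive.apache.org/dist/hadoop/common/):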

wget https://downloads.apache.org/hadoop/common/hadoop-3.3.4/hadoop-3.3.4.tar.gz

Extract Hadoop and move it to /usr/local (writing there requires root, hence sudo):

tar -xvzf hadoop-3.3.4.tar.gz
sudo mv hadoop-3.3.4 /usr/local/hadoop

Set environment variables in ~/.bashrc:

echo 'export HADOOP_HOME=/usr/local/hadoop' >> ~/.bashrc
echo 'export PATH=$HADOOP_HOME/bin:$PATH' >> ~/.bashrc
source ~/.bashrc

Verify installation:

hadoop version

Step 4: Install Apache Hive

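Download Apache Hive (older releases that have left the main download server are kept under archive.apache.org/dist/hive/):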

wget https://downloads.apache.org/hive/hive-3.1.3/apache-hive-3.1.3-bin.tar.gz
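Before extracting a downloaded tarball, it is worth comparing it with the checksum published next to it. The sketch below assumes the coreutils sha256sum/sha512sum tools are installed and that you copy the expected hex digest by hand from the release's checksum file. The helper name verify_checksum is made up for this example, and which algorithm a given release publishes is something to check on the download page.

```shell
# verify_checksum ALGO FILE EXPECTED_HEX
# ALGO is "sha256" or "sha512"; the matching coreutils tool (sha256sum,
# sha512sum) computes the digest of FILE, which is compared to EXPECTED_HEX.
verify_checksum() {
  algo=$1
  file=$2
  expected=$3
  actual=$("${algo}sum" "$file" 2>/dev/null | awk '{print $1}')
  # Published digests are sometimes upper case, so normalise before comparing.
  expected_lc=$(printf '%s' "$expected" | tr 'A-F' 'a-f')
  if [ -n "$actual" ] && [ "$actual" = "$expected_lc" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file" >&2
    return 1
  fi
}
```

A call would look like verify_checksum sha256 apache-hive-3.1.3-bin.tar.gz followed by the digest copied from the download page.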

Extract and move Hive to /usr/local:

tar -xvzf apache-hive-3.1.3-bin.tar.gz
sudo mv apache-hive-3.1.3-bin /usr/local/hive

Set environment variables:

echo 'export HIVE_HOME=/usr/local/hive' >> ~/.bashrc
echo 'export PATH=$HIVE_HOME/bin:$PATH' >> ~/.bashrc
source ~/.bashrc
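One caveat with the echo ... >> ~/.bashrc pattern used in this guide: running a step twice appends the same export twice. A possible workaround is a small helper that appends a line only when it is not already there. The name append_once is hypothetical and only used for this sketch.

```shell
# append_once LINE FILE: add LINE to FILE only if that exact line is absent.
append_once() {
  line=$1
  file=$2
  touch "$file"
  # -q: quiet, -x: match the whole line, -F: treat LINE as a fixed string
  if ! grep -qxF -- "$line" "$file"; then
    printf '%s\n' "$line" >> "$file"
  fi
}

# Example calls (commented out so that sourcing this file changes nothing):
# append_once 'export HIVE_HOME=/usr/local/hive' ~/.bashrc
# append_once 'export PATH=$HIVE_HOME/bin:$PATH' ~/.bashrc
```

The single quotes matter here, just as in the original commands: they keep $HIVE_HOME and $PATH unexpanded until ~/.bashrc is read.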

Step 5: Configure Hive

Create a directory for Hive metadata:

mkdir -p /usr/local/hive/metastore_db

Edit hive-config.sh to set Hadoop home:

echo 'export HADOOP_HOME=/usr/local/hadoop' >> $HIVE_HOME/bin/hive-config.sh
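The steps above do not create anything in HDFS. Hive's own getting-started instructions create /tmp and /user/hive/warehouse (the default warehouse location) and make them group-writable before the first table is created. A sketch of that, assuming HDFS is already running and hadoop is on the PATH, with create_hive_hdfs_dirs as a made-up function name:

```shell
# create_hive_hdfs_dirs: create Hive's scratch and warehouse directories
# in HDFS and make them group-writable. Stops at the first failing command.
create_hive_hdfs_dirs() {
  for dir in /tmp /user/hive/warehouse; do
    hadoop fs -mkdir -p "$dir" || return 1
    hadoop fs -chmod g+w "$dir" || return 1
  done
}
```

Run create_hive_hdfs_dirs once after HDFS is up. If your cluster uses a different warehouse path, adjust the list accordingly.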

Step 6: Configure Hive to Use MySQL

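Before configuring Hive, make sure a MySQL server is running, that the user named in the configuration below exists, and that the MySQL Connector/J JAR has been copied into $HIVE_HOME/lib. Then open (or create) hive-site.xml in the Hive conf directory: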

nano $HIVE_HOME/conf/hive-site.xml

Paste the following configuration inside:

<configuration>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://localhost/metastore?createDatabaseIfNotExist=true</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hiveuser</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>hivepassword</value>
    </property>
</configuration>

Save the file and exit.
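The configuration above assumes that MySQL already has a user hiveuser with the password hivepassword and rights on the metastore database. One way to prepare that is sketched below. The function only prints SQL and executes nothing itself, metastore_sql is a hypothetical name, and the 'localhost' host part is an assumption that matches the JDBC URL used here.

```shell
# metastore_sql USER PASSWORD [DATABASE]: print the SQL statements that
# create the metastore database and a MySQL user with full rights on it.
metastore_sql() {
  user=$1
  password=$2
  database=${3:-metastore}
  cat <<EOF
CREATE DATABASE IF NOT EXISTS ${database};
CREATE USER IF NOT EXISTS '${user}'@'localhost' IDENTIFIED BY '${password}';
GRANT ALL PRIVILEGES ON ${database}.* TO '${user}'@'localhost';
EOF
}
```

To apply it, pipe the output into a MySQL client running as an administrative user, for example metastore_sql hiveuser hivepassword | sudo mysql on systems where root logs in through the local socket. For anything beyond a test machine, choose a stronger password and keep hive-site.xml readable only by the Hive user.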

Step 7: Initialize Hive Metastore

Run the following command:

schematool -dbType mysql -initSchema
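Running -initSchema against a database that already holds a metastore schema fails, and it also fails if the MySQL JDBC driver is missing from Hive's lib directory. The following sketch guards against both. It assumes that schematool -info exits with a non-zero status when no schema exists yet, which is worth confirming on your version; init_metastore_schema is a hypothetical wrapper name.

```shell
# init_metastore_schema HIVE_HOME_DIR: initialise the metastore schema
# only if the JDBC driver is present and no schema exists yet.
init_metastore_schema() {
  hive_home=$1
  # The glob covers both older (mysql-connector-java-*) and newer
  # (mysql-connector-j-*) Connector/J file names.
  set -- "$hive_home"/lib/mysql-connector-*.jar
  if [ ! -e "$1" ]; then
    echo "no MySQL JDBC driver JAR found in $hive_home/lib" >&2
    return 1
  fi
  # Assumption: -info fails when the schema has not been created.
  if schematool -dbType mysql -info >/dev/null 2>&1; then
    echo "metastore schema already present; skipping init"
  else
    schematool -dbType mysql -initSchema
  fi
}
```

Invoke it as init_metastore_schema "$HIVE_HOME".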

Step 8: Start Hive Services

Start the Hive metastore and HiveServer2 in the background:

hive --service metastore &
hive --service hiveserver2 &
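Both services need some time before they accept connections, and HiveServer2 in particular can take a while. By default the metastore listens on port 9083 and HiveServer2 on port 10000. A rough way to wait for them is sketched below. It relies on bash's /dev/tcp feature, so it needs bash rather than plain sh, and wait_for_port is a made-up helper name.

```shell
# wait_for_port HOST PORT [TRIES]: try a TCP connect once per second,
# up to TRIES times (default 30). Returns 0 as soon as a connect succeeds.
wait_for_port() {
  local host=$1 port=$2 tries=${3:-30}
  while [ "$tries" -gt 0 ]; do
    # Opening /dev/tcp/HOST/PORT in a subshell attempts the connection
    # and closes it again when the subshell exits.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  echo "timed out waiting for $host:$port" >&2
  return 1
}

# Example sequence (commented out so that sourcing this file starts nothing):
# nohup hive --service metastore > metastore.log 2>&1 &
# wait_for_port localhost 9083
# nohup hive --service hiveserver2 > hiveserver2.log 2>&1 &
# wait_for_port localhost 10000 120
```

Using nohup with a log file, as in the commented example, also keeps the services alive after the terminal closes and leaves their output somewhere you can read it.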

Step 9: Verify Hive Installation

Launch Hive CLI:

hive

Run a test query:

SHOW DATABASES;
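The same check can be scripted against HiveServer2 with beeline instead of the interactive CLI. This sketch assumes HiveServer2 is reachable on the default port 10000 and accepts an unauthenticated connection, which may not hold on your setup; hive_smoke_test is a hypothetical function name.

```shell
# hive_smoke_test [JDBC_URL]: run SHOW DATABASES through beeline and
# succeed only if the built-in "default" database is listed.
hive_smoke_test() {
  url=${1:-jdbc:hive2://localhost:10000}
  # -u sets the JDBC URL and -e runs a single statement.
  # --silent and --outputformat=tsv2 reduce the output to plain rows.
  output=$(beeline -u "$url" --silent=true --outputformat=tsv2 \
    -e 'SHOW DATABASES;' 2>/dev/null) || return 1
  printf '%s\n' "$output" | grep -qx 'default'
}
```

For example, hive_smoke_test && echo "Hive is answering queries" prints the message only when the query came back with the default database.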

Installation Steps for Windows

Step 1: Install Java

Download and install a JDK (see the prerequisites for the version), then set JAVA_HOME in System Environment Variables.

Step 2: Install Hadoop

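  • Download the Hadoop binaries. On Windows, Hadoop also needs a matching winutils.exe and hadoop.dll in %HADOOP_HOME%\bin, which the Apache tarball does not include.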
  • Extract and set HADOOP_HOME in System Environment Variables.
  • Add %HADOOP_HOME%\bin to the PATH variable.
  • Verify installation by running hadoop version in Command Prompt.

Step 3: Install Apache Hive

  • Download Hive from the Apache Hive downloads page.
  • Extract Hive and set HIVE_HOME in System Environment Variables.
  • Add %HIVE_HOME%\bin to the PATH variable.
  • Create a directory for Hive metadata, e.g., C:\hive\metastore_db.

Step 4: Configure Hive

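Navigate to %HIVE_HOME%\conf (C:\hive\conf if Hive was extracted to C:\hive) and create or edit hive-site.xml.

Paste the same MySQL configuration as in the Linux steps, then initialize the metastore schema with the same schematool command used there.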

Step 5: Start Hive

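Run the following commands in Command Prompt, each in its own window, since both services keep running in the foreground. Recent Hive binary releases appear to ship Unix shell scripts only, so a Unix-like shell such as Cygwin or WSL may be needed to run them: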

hive --service metastore
hive --service hiveserver2

Step 6: Verify Hive Installation

Run Hive CLI:

hive

Execute test query:

SHOW DATABASES;

Conclusion

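You have now installed Apache Hive on Linux or Windows. If you run into problems, check the Hive log first. By default it is written to hive.log in a per-user folder under the Java temp directory (/tmp/<username>/hive.log on Linux), unless the log directory has been changed in hive-log4j2.properties.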

Happy querying!
