🐝A.3) Manually Installing Apache Hive on Windows and Linux: Step-by-Step Guidance
Apache Hive Installation Guide
Apache Hive is a data warehouse infrastructure built on top of Hadoop. It provides a SQL-like interface to query large datasets stored in HDFS. In this guide, we will cover a step-by-step manual installation of Hive, including necessary configurations for both Linux and Windows systems.
Prerequisites
- Operating System: Linux (Ubuntu/CentOS) or Windows
- Java Development Kit (JDK 8 or higher; Hive 3.1.x is most commonly run on JDK 8)
- Hadoop installed and configured
- MySQL for Hive Metastore (Optional but recommended)
- Apache Hive package
Installation Steps for Linux
Step 1: Update System and Install Dependencies
First, ensure your system is updated and install necessary dependencies:
sudo apt update && sudo apt upgrade -y # For Ubuntu
sudo yum update -y # For CentOS
Step 2: Install Java
Check if Java is installed:
java -version
If not installed, install OpenJDK:
sudo apt install openjdk-8-jdk -y # Ubuntu
sudo yum install java-1.8.0-openjdk -y # CentOS
Set JAVA_HOME:
echo "export JAVA_HOME=$(dirname $(dirname $(readlink -f $(which java))))" >> ~/.bashrc
source ~/.bashrc
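The steps above install OpenJDK 8, so it can be useful to confirm which major version the `java` on your PATH actually resolves to. The following is a minimal sketch; the helper name `java_major_version` is made up for illustration and is not part of any Java or Hive tooling.

```shell
# Parse the major Java version from a `java -version` banner.
# Handles both the legacy "1.8.0_xxx" scheme and the newer "11.0.x" scheme.
java_major_version() {
  banner="$1"
  # Pull out the quoted version string, e.g. 1.8.0_392 or 11.0.21
  ver=$(printf '%s\n' "$banner" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
  case "$ver" in
    1.*) printf '%s\n' "$ver" | cut -d. -f2 ;;  # 1.8.x -> 8
    *)   printf '%s\n' "$ver" | cut -d. -f1 ;;  # 11.0.x -> 11
  esac
}

# Only query the JVM if java is actually on PATH
if command -v java >/dev/null 2>&1; then
  java_major_version "$(java -version 2>&1)"
fi
```

If this prints something other than the version you installed, another JDK is probably earlier on your PATH.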
Step 3: Install Hadoop (If Not Installed)
Hive requires Hadoop to be installed. If you haven't installed it, follow these steps:
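Download Hadoop (older releases are moved from downloads.apache.org to archive.apache.org/dist, so try that host if this link returns 404):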
wget https://downloads.apache.org/hadoop/common/hadoop-3.3.4/hadoop-3.3.4.tar.gz
Extract and configure Hadoop:
tar -xvzf hadoop-3.3.4.tar.gz
sudo mv hadoop-3.3.4 /usr/local/hadoop
Set environment variables in ~/.bashrc:
echo 'export HADOOP_HOME=/usr/local/hadoop' >> ~/.bashrc
echo 'export PATH=$HADOOP_HOME/bin:$PATH' >> ~/.bashrc
source ~/.bashrc
Verify installation:
hadoop version
Step 4: Install Apache Hive
Download Apache Hive:
wget https://downloads.apache.org/hive/hive-3.1.3/apache-hive-3.1.3-bin.tar.gz
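Before extracting a downloaded archive, it is worth comparing it against the checksum file that Apache publishes next to each release. The sketch below assumes you have downloaded that checksum file yourself and that the matching tool (`sha512sum` or `sha256sum` from GNU coreutils) is installed. The helper name `verify_checksum` and the file names in the usage comment are assumptions, so check the release directory for the actual checksum file name.

```shell
# Compare an archive's digest with a published checksum file.
# Arguments: digest tool (e.g. sha512sum), archive path, checksum file path.
# Returns 0 when the computed digest appears in the checksum file
# (case-insensitive), 2 when a file is missing, non-zero otherwise.
verify_checksum() {
  tool="$1"
  archive="$2"
  checksum_file="$3"
  [ -f "$archive" ] && [ -f "$checksum_file" ] || return 2
  actual=$("$tool" "$archive" | cut -d' ' -f1)
  [ -n "$actual" ] && grep -qi -- "$actual" "$checksum_file"
}

# Example usage (file names are assumptions):
# verify_checksum sha256sum apache-hive-3.1.3-bin.tar.gz apache-hive-3.1.3-bin.tar.gz.sha256 \
#   && echo "checksum OK" || echo "checksum mismatch"
```

This only matches checksum files that keep the digest on a single line; if the digest is split into groups, compare it by eye instead.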
Extract and move Hive to /usr/local:
tar -xvzf apache-hive-3.1.3-bin.tar.gz
sudo mv apache-hive-3.1.3-bin /usr/local/hive
Set environment variables:
echo 'export HIVE_HOME=/usr/local/hive' >> ~/.bashrc
echo 'export PATH=$HIVE_HOME/bin:$PATH' >> ~/.bashrc
source ~/.bashrc
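A quick sanity check can catch a mistyped path before you move on to configuration. The sketch below only tests that each variable points at an existing directory with a bin subdirectory, which is the layout both tarballs unpack to. `check_home` is a hypothetical helper name.

```shell
# Return 0 if the given directory exists and contains a bin/ subdirectory.
# Arguments: variable name (for the message), directory path.
check_home() {
  name="$1"
  dir="$2"
  if [ -n "$dir" ] && [ -d "$dir/bin" ]; then
    echo "$name OK: $dir"
  else
    echo "$name is not set or has no bin/ directory: '$dir'" >&2
    return 1
  fi
}

# Report on both variables; a failure is printed but does not abort the shell
check_home HADOOP_HOME "${HADOOP_HOME:-}" || true
check_home HIVE_HOME "${HIVE_HOME:-}" || true
```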
Step 5: Configure Hive
Create a directory for Hive metadata:
mkdir -p /usr/local/hive/metastore_db
Edit hive-config.sh to set Hadoop home:
echo 'export HADOOP_HOME=/usr/local/hadoop' >> $HIVE_HOME/bin/hive-config.sh
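Hive also expects a scratch directory and a warehouse directory to exist in HDFS with group write permission before tables are created. The Hive Getting Started documentation uses /tmp and /user/hive/warehouse for this. A minimal sketch follows, assuming HDFS is already running. The `HADOOP_CMD` variable and the function name are hypothetical, added here so the commands can be previewed with `echo` before running them for real.

```shell
# Command used to talk to HDFS; set HADOOP_CMD=echo for a dry run.
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

# Create the HDFS directories Hive uses and make them group-writable.
prepare_hive_hdfs_dirs() {
  for dir in /tmp /user/hive/warehouse; do
    "$HADOOP_CMD" fs -mkdir -p "$dir" || return 1
    "$HADOOP_CMD" fs -chmod g+w "$dir" || return 1
  done
}

# Dry run in a subshell: print the commands instead of executing them
( HADOOP_CMD=echo; prepare_hive_hdfs_dirs )
```

Once HDFS is up, calling `prepare_hive_hdfs_dirs` without the override runs the real `hadoop fs` commands.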
Step 6: Configure Hive to Use MySQL
Navigate to the Hive conf directory and create/edit hive-site.xml:
nano $HIVE_HOME/conf/hive-site.xml
Paste the following configuration inside:
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost/metastore?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hiveuser</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hivepassword</value>
</property>
</configuration>
Save the file and exit.
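The configuration above refers to a MySQL account (hiveuser / hivepassword) and a JDBC driver class, neither of which exists on a fresh system. The sketch below prints SQL that would create the database and the account. It assumes MySQL 5.7 or later is installed and that the MySQL Connector/J jar has been copied into $HIVE_HOME/lib. The function name `metastore_user_sql` is hypothetical, and the defaults are the placeholder credentials from hive-site.xml.

```shell
# Print SQL that creates the metastore database and a dedicated MySQL account.
# Arguments: database name, user name, password (defaults mirror hive-site.xml).
metastore_user_sql() {
  db="${1:-metastore}"
  user="${2:-hiveuser}"
  pass="${3:-hivepassword}"
  cat <<EOF
CREATE DATABASE IF NOT EXISTS ${db};
CREATE USER IF NOT EXISTS '${user}'@'localhost' IDENTIFIED BY '${pass}';
GRANT ALL PRIVILEGES ON ${db}.* TO '${user}'@'localhost';
FLUSH PRIVILEGES;
EOF
}

# Review the statements first; when satisfied, pipe them to the mysql client,
# for example: metastore_user_sql | mysql -u root -p
metastore_user_sql
```

With Connector/J 8.x the driver class is named com.mysql.cj.jdbc.Driver; the older com.mysql.jdbc.Driver name used above is deprecated there but still accepted.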
Step 7: Initialize Hive Metastore
Run the following command:
schematool -dbType mysql -initSchema
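If schematool reports a connection or configuration error, one thing to rule out is a missing property in hive-site.xml. This sketch simply greps for the four property names used in Step 6. `check_hive_site` is a hypothetical helper and does not validate the XML itself.

```shell
# Verify that a hive-site.xml file mentions all four JDO connection properties.
# Prints each missing property name and returns non-zero if any is absent.
check_hive_site() {
  file="$1"
  missing=0
  for prop in ConnectionURL ConnectionDriverName ConnectionUserName ConnectionPassword; do
    if ! grep -q "<name>javax.jdo.option.${prop}</name>" "$file" 2>/dev/null; then
      echo "missing: javax.jdo.option.${prop}"
      missing=1
    fi
  done
  return "$missing"
}

# Check the live configuration only when HIVE_HOME is set
if [ -n "${HIVE_HOME:-}" ]; then
  check_hive_site "$HIVE_HOME/conf/hive-site.xml" || true
fi
```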
Step 8: Start Hive Services
Start Hive metastore and server:
hive --service metastore &
hive --service hiveserver2 &
Step 9: Verify Hive Installation
Launch Hive CLI:
hive
Run a test query:
SHOW DATABASES;
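HiveServer2 can take some time after Step 8 before it accepts connections, so a first connection attempt may fail. A small retry loop is one way to handle that. The sketch below assumes HiveServer2 is listening on its default port 10000 on localhost; `retry` is a hypothetical helper, not a Hive command.

```shell
# Run a command until it succeeds, up to a maximum number of attempts,
# sleeping a fixed number of seconds between tries.
# Arguments: attempts, delay in seconds, then the command and its arguments.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  n=1
  while [ "$n" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    n=$((n + 1))
    # Sleep only if another attempt is still to come
    [ "$n" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Example: try a Beeline connection up to 10 times, 3 seconds apart
# retry 10 3 beeline -u jdbc:hive2://localhost:10000 -e 'SHOW DATABASES;'
```

Beeline connects through HiveServer2, so a successful query here also confirms that the server started in Step 8 is reachable.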
Installation Steps for Windows
Step 1: Install Java
Download and install Java JDK 8+. Set JAVA_HOME in System Environment Variables.
Step 2: Install Hadoop
- Download Hadoop binaries for Windows.
- Extract and set HADOOP_HOME in System Environment Variables.
- Add %HADOOP_HOME%\bin to the PATH variable.
- Verify installation by running hadoop version in Command Prompt.
Step 3: Install Apache Hive
- Download Hive from the Hive Downloads page.
- Extract Hive and set HIVE_HOME in System Environment Variables.
- Add %HIVE_HOME%\bin to the PATH variable.
- Create a directory for Hive metadata, e.g., C:\hive\metastore_db.
Step 4: Configure Hive
Navigate to C:\hive\conf and create/edit hive-site.xml.
Paste the same MySQL configuration as above.
Step 5: Start Hive
Run each of the following commands in its own Command Prompt window, since each service keeps running in the foreground:
hive --service metastore
hive --service hiveserver2
Step 6: Verify Hive Installation
Run Hive CLI:
hive
Execute test query:
SHOW DATABASES;
Conclusion
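You have successfully installed Apache Hive on both Linux and Windows. If you encounter any issues, check the Hive logs. By default Hive writes hive.log to /tmp/<user name>/ on Linux (the Java temporary directory on Windows); logs appear in $HIVE_HOME/logs (Linux) or C:\hive\logs (Windows) only if hive.log.dir has been pointed there.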
Happy querying!