VMware vRealize Operations Manager

Authors
  • Name
    Jackson Chen

VMware Operations Manager Home

https://docs.vmware.com/en/vRealize-Operations-Manager/index.html

Operations Manager Architecture Guide

vRealize Operations Manager Reference Architecture

Operations Manager Configuration Guide

vRealize Operations Manager Configuration Guide

Operations Manager Secure Configuration Guide

vRealize Operations Manager Secure Configuration Guide

Operations Manager User Guide

vRealize Operations Manager User Guide

Operations Manager Load Balancing

vRealize Operations Manager Load Balancing

Operations Manager Installation and vApp Installation Guide

vRealize Operations Manager Installation and vApp Installation Guide

Operations Manager Best Practices

vRealize Operations Manager Best Practices

Operations Manager API Guide

vRealize Operations Manager API Guide

Operations Manager Reference Guide

vRealize Operations Manager Reference Guide

vRealize Operations Manager - Definitions for Metrics Properties and Alerts

vROP Reference Guide - Metrics Properties and Alerts

vApp Deployment and Configuration

The vRealize Operations Manager vApp Deployment and Configuration Guide provides information about deploying the VMware® vRealize Operations Manager virtual appliance, including how to create and configure the vRealize Operations Manager cluster.

The vRealize Operations Manager installation process consists of deploying the vRealize Operations Manager virtual appliance once for each cluster node, and accessing the product to finish setting up the application.

How to log in to vROps bypassing SSO
# If you are unable to log in to vROps via Single Sign-On
1. From a browser (for example, Google Chrome), open the vROps URL
    https://<vROP-FQDN|IP-address>/ui/login.action?skipSSO=true
2. Log in using local credentials
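
As a small illustration of step 1, the following sketch builds the bypass URL for a given node. The function name and the example host are hypothetical; only the path and the skipSSO query parameter come from the steps above.

```python
from urllib.parse import urlunsplit


def skip_sso_login_url(host: str) -> str:
    """Build the local-login URL for a vROps node from its FQDN or IP address."""
    host = host.strip()
    if not host:
        raise ValueError("host must be a non-empty FQDN or IP address")
    # Path and query string are taken verbatim from the steps above.
    return urlunsplit(("https", host, "/ui/login.action", "skipSSO=true", ""))


if __name__ == "__main__":
    # "vrops.example.com" is a placeholder, not a real appliance.
    print(skip_sso_login_url("vrops.example.com"))
```

Paste the printed URL into the browser and continue with step 2.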

How to verify VM or ESXi host network bandwidth usage

# In vROPs console, navigate to


Troubleshooting vRealize Operations Components

https://hub.packtpub.com/troubleshooting-in-vrealize-operations-components/

Troubleshooting Services

The following are some of the most important vRealize Operations services.

Apache2 service

vRealize Operations internally uses the Apache2 web server to host the Admin UI, Product UI, and Suite API. Its service logs are located here:

Log File:   access_log & error_log
Log Location:    /var/log/apache2

Watchdog service

Watchdog is a vRealize Operations service that monitors the necessary daemons/services and attempts to restart them if they fail. The vcops-watchdog is a Python script that runs every 5 minutes by means of the vcops-watchdog-daemon with the purpose of monitoring the various vRealize Operations services, including CaSA.

The Watchdog service performs the following checks:

  1. PID file of the service
  2. Service status

Log File:  vcops-watchdog.log
Log Location:   /usr/lib/vmware-vcops/user/log/vcops-watchdog/
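
To make the first check concrete, here is a minimal sketch of a generic PID-file liveness test on a POSIX system. It is not the vcops-watchdog source; the function name is hypothetical, and the approach (signal 0 via os.kill) is simply a common way to do this kind of check.

```python
import os
from pathlib import Path


def pid_file_alive(pid_file: str) -> bool:
    """Return True if pid_file names a process that currently exists (POSIX only)."""
    try:
        pid = int(Path(pid_file).read_text().strip())
    except (OSError, ValueError):
        # Missing, unreadable, or malformed PID file: treat as not running.
        return False
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only tests for existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

A stale PID file (the file exists but the process does not) returns False, which is the situation in which a watchdog would typically attempt a restart.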
Collector service

The Collector service sends a heartbeat to the controller every 30 seconds. By default, the Collector will wait for 30 minutes for adapters to synchronize.

The collector properties, including enabling or disabling Self Protection, can be configured in the collector.properties file, located in:

/usr/lib/vmware-vcops/user/conf/collector

Log File:  collector.log
Log Location:   /log/vcops/log/

Controller service

The Controller service is part of the analytics engine. The controller decides where new objects should reside, and manages the storage and retrieval of the object inventory within the system.

The Controller service is responsible for the following:

  1. Monitoring the collector status every minute
  2. Determining how long a deleted resource remains available in the inventory
  3. Determining how long a non-existing resource is stored in the database

Configuration File:  controller.properties
Location:   /usr/lib/vmware-vcops/user/conf/controller/

Databases

vRealize Operations contains quite a few databases, all of which are of great importance for the function of the product.

Cassandra DB

Currently, Cassandra DB stores the following information:

  1. User Preferences and Config
  2. Alert definitions
  3. Customizations
  4. Dashboards, Policies, and Views
  5. Reports and Licensing
  6. Shard Maps
  7. Activities

Cassandra stores all the information which we see in the content folder; basically, any settings which are applied globally.

You are able to log into the Cassandra database from any Analytic Node. The information is the same across nodes.

There are two ways to work with the Cassandra database:

  1. cqlsh, a built-in command-line tool used to query the data within Cassandra in a SQL-like fashion.
  2. The nodetool command-line tool.

# To connect to the DB with cqlsh, run the following from the command line (SSH):

$VMWARE_PYTHON_BIN $ALIVE_BASE/cassandra/apache-cassandra-2.1.8/bin/cqlsh --ssl --cqlshrc $ALIVE_BASE/user/conf/cassandra/cqlshrc

Once you are connected to the DB, navigate to the globalpersistence keyspace using the following command:

vcops_user@cqlsh> use globalpersistence ;

Once you are logged on to the Cassandra DB, you can run the following commands to see information:

# To list all the tables in the current keyspace
    describe tables

# To show the definition of a particular table
    describe table <table_name>

# To exit the Cassandra command line
    exit

# To select column data from a table
    select * from <table_name>;

# To delete column data from a table
    delete from <table_name> where <condition>;

The Cassandra database has the following configuration files:

File:  cassandra.yaml
Location:   /usr/lib/vmware-vcops/user/conf/cassandra/

File:  vcops_cassandra.properties
Location:   /user/conf/cassandra/

File:  Cassandra conf scripts
Location:   /usr/lib/vmware-vcopssuite/utilities/vmware/vcops/cassandra/

The cassandra.yaml file stores settings such as the default location to save data (/storage/db/vcops/cassandra/data). It also lists all the nodes: when a new node joins the cluster, it refers to this file to make sure it contacts the right node (the master node). The file also holds all the SSL certificate information.

Cassandra is started and stopped via the CaSA service, but just because CaSA is running does not mean that the Cassandra service is necessarily running.

The service command to check the status of the Cassandra DB service from the command line (SSH) is:

service vmware-vcops status cassandra

The cassandraservice.sh script is located in:

$VCOPS_BASE/cassandra/bin/

Validate the cluster

Run the following commands from the command line (SSH):

# First, navigate to the directory
cd  /usr/lib/vmware-vcopssuite/utilities/

# Run the following command to validate the cluster
$VMWARE_PYTHON_BIN -m vmware.vcops.cassandra.validate_cluster <IP_ADDRESS_1 IP_ADDRESS_2>

You can also use the nodetool to perform a health check, and possibly resolve database load issues. For the 6.5 release and newer, VMware added the requirement of using a 'maintenanceAdmin' user along with a password file.

$VCOPS_BASE/cassandra/apache-cassandra-2.1.8/bin/nodetool -p 9008 --ssl -u maintenanceAdmin --password-file /usr/lib/vmware-vcops/user/conf/jmxremote.password status

Regardless of which method you choose to perform the health check, if any of the nodes have over 600 MB of load, you should consult with VMware Global Support Services on the next steps to take, and how to alleviate the load issues.
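
The 600 MB check can be automated by parsing the text that nodetool status prints. The sketch below is one possible approach, not a VMware tool: it assumes the usual layout in which each node line starts with a two-letter state (for example UN), followed by the address and a load value with a separate unit, and it treats 1 MB as 1024 * 1024 bytes. The function name and the sample output are illustrative only.

```python
import re

# Bytes per unit; both the older (KB/MB/GB) and newer (KiB/MiB/GiB) spellings are accepted.
_UNIT_BYTES = {
    "bytes": 1,
    "KB": 1024, "KiB": 1024,
    "MB": 1024 ** 2, "MiB": 1024 ** 2,
    "GB": 1024 ** 3, "GiB": 1024 ** 3,
    "TB": 1024 ** 4, "TiB": 1024 ** 4,
}

# State (Up/Down + Normal/Leaving/Joining/Moving), address, load value, load unit.
_NODE_LINE = re.compile(
    r"^(?P<state>[UD][NLJM])\s+(?P<address>\S+)\s+"
    r"(?P<value>\d+(?:\.\d+)?)\s+(?P<unit>bytes|[KMGT]i?B)\b"
)


def nodes_over_load(status_text: str, threshold_mb: float = 600.0):
    """Return (address, load_in_mb) for every node whose load exceeds threshold_mb."""
    limit = threshold_mb * 1024 ** 2
    flagged = []
    for line in status_text.splitlines():
        match = _NODE_LINE.match(line.strip())
        if not match:
            continue  # header, separator, or a node whose load is unknown ("?")
        load_bytes = float(match["value"]) * _UNIT_BYTES[match["unit"]]
        if load_bytes > limit:
            flagged.append((match["address"], load_bytes / 1024 ** 2))
    return flagged


# Made-up sample in the shape of nodetool status output; not captured from a real cluster.
SAMPLE = """\
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns    Host ID                               Rack
UN  10.0.0.11  212.4 MB   256     ?       11111111-1111-1111-1111-111111111111  rack1
UN  10.0.0.12  1.2 GB     256     ?       22222222-2222-2222-2222-222222222222  rack1
"""

if __name__ == "__main__":
    for address, load_mb in nodes_over_load(SAMPLE):
        print(f"{address} is over the threshold: {load_mb:.1f} MB")
```

On an appliance, the status_text argument would be the captured output of the nodetool command shown earlier.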

Central (Repl DB)

The Postgres database was introduced in 6.1. It has two instances in version 6.6. The Central Postgres DB, also called repl, and the Alerts/HIS Postgres DB, also called data, are two separate database instances under the database called vcopsdb.

The central DB exists only on the master and the master replica nodes when HA is enabled. It is accessible via port 5433 and it is located in /storage/db/vcops/vpostgres/repl.

Currently, the database stores the Resources inventory.

You can connect to the central DB from the command line (SSH). Log in on the analytic node you wish to connect to and run:

# Switch to the postgres user
su - postgres

Note: The command should not prompt for a password if run as root.

# Connect to the database instance
/opt/vmware/vpostgres/current/bin/psql -d vcopsdb -p 5433

# The service command to start the central DB from the command line (SSH) is
service vpostgres-repl start

Alerts or HIS (Data) DB

The Alerts DB instance is called data and exists on all the data nodes, including the master and master replica nodes. It is accessible via port 5432.

Location:   /storage/db/vcops/vpostgres/data

The database stores:

  1. Alerts and alarm history
  2. History of resource property data
  3. History of resource relationship

You can connect to the Alerts DB from the command line (SSH). Log in on the analytic node you wish to connect to and run:

# Switch to the postgres user
su - postgres

Note: The command should not prompt for a password if run as root.

# Connect to the database instance
/opt/vmware/vpostgres/current/bin/psql -d vcopsdb -p 5432

# The service command to start the Alerts DB from the command line (SSH) is
service vpostgres start
FSDB

The File System Database (FSDB) contains all raw time series metrics and super metrics data for the discovered resources.

Key facts about FSDB in vRealize Operations Manager:

  1. FSDB is a GemFire server and runs inside the analytics JVM.
  2. FSDB uses the Sharding Manager to distribute data for new objects between nodes.
  3. FSDB is available on all the nodes of a vROps cluster deployment.
  4. It has its own properties file.
  5. FSDB stores time series data collected by adapters, as well as data that is generated or calculated from analysis of that data (system, super, badge, and CIQ metrics, and so on).

You can check the synchronization state of the FSDB to determine the overall health of the cluster by running the following command from the command line (SSH):

$VMWARE_PYTHON_BIN /usr/lib/vmware-vcops/tools/vrops-platform-cli/vrops-platform-cli.py getShardStateMappingInfo

By restarting the FSDB, you can trigger synchronization of all the data by getting missing data from other FSDBs. Synchronization takes place only when you have vRealize Operations HA configured.

Platform-cli

Platform-cli is a tool for retrieving information from various databases, including the GemFire cache, Cassandra, and the Alerts/HIS persistence databases.

In order to run this Python script, you need to run the following command:

$VMWARE_PYTHON_BIN /usr/lib/vmware-vcops/tools/vrops-platform-cli/vrops-platform-cli.py

The following example of using this command will list all the resources in ascending order and also show you which shard it is stored on:

$VMWARE_PYTHON_BIN /usr/lib/vmware-vcops/tools/vrops-platform-cli/vrops-platform-cli.py getResourceToShardMapping
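
If you call platform-cli from your own scripts, a thin wrapper avoids retyping the path. This is a minimal sketch under the assumption that VMWARE_PYTHON_BIN is set in the environment, as it is in the commands above; the function names are hypothetical, and the only subcommands used are the two mentioned in these notes.

```python
import os
import subprocess

PLATFORM_CLI = "/usr/lib/vmware-vcops/tools/vrops-platform-cli/vrops-platform-cli.py"


def platform_cli_argv(subcommand: str, env=None) -> list[str]:
    """Build the argument list for a platform-cli subcommand."""
    env = os.environ if env is None else env
    python_bin = env.get("VMWARE_PYTHON_BIN")
    if not python_bin:
        raise RuntimeError("VMWARE_PYTHON_BIN is not set; run this on a vROps node")
    return [python_bin, PLATFORM_CLI, subcommand]


def run_platform_cli(subcommand: str) -> str:
    """Run the subcommand and return its standard output (raises on a non-zero exit)."""
    result = subprocess.run(
        platform_cli_argv(subcommand),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Only meaningful on an appliance where the tool and the variable exist.
    print(run_platform_cli("getResourceToShardMapping"))
```

The same wrapper works for getShardStateMappingInfo by passing that name instead.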