Note
This article relies on an open source library hosted on GitHub at: https://github.com/mspnp/spark-monitoring. The library supports Azure Databricks 10.x (Spark 3.2.x) and earlier. Azure Databricks 11.0 includes breaking changes to the logging systems that the spark-monitoring library integrates with. The work required to update the spark-monitoring library to support Azure Databricks 11.0 (Spark 3.3.0) and newer is not currently planned.
This article shows how to set up a Grafana dashboard to monitor Azure Databricks jobs for performance issues.
Azure Databricks is a fast, powerful, and collaborative Apache Spark–based analytics service that makes it easy to rapidly develop and deploy big data analytics and artificial intelligence (AI) solutions. Monitoring is a critical component of operating Azure Databricks workloads in production. The first step is to gather metrics into a workspace for analysis. In Azure, the best solution for managing log data is Azure Monitor. Azure Databricks does not natively support sending log data to Azure Monitor, but a library for this functionality is available on GitHub.
This library enables logging of Azure Databricks service metrics as well as Apache Spark structured streaming query event metrics. Once you've successfully deployed this library to an Azure Databricks cluster, you can deploy a set of Grafana dashboards as part of your production environment.
Prerequisites
Configure your Azure Databricks cluster to use the monitoring library, as described in the GitHub readme.
Deploy the Azure Log Analytics workspace
To deploy the Azure Log Analytics workspace, follow these steps:
Navigate to the /perftools/deployment/loganalytics directory.
Deploy the logAnalyticsDeploy.json Azure Resource Manager template. For more information about deploying Resource Manager templates, see Deploy resources with Resource Manager templates and Azure CLI. The template has the following parameters:
- location: The region where the Log Analytics workspace and dashboards are deployed.
- serviceTier: The workspace pricing tier. See here for a list of valid values.
- dataRetention (optional): The number of days the log data is retained in the Log Analytics workspace. The default value is 30 days. If the pricing tier is Free, the data retention must be seven days.
- workspaceName (optional): A name for the workspace. If not specified, the template generates a name.
az deployment group create --resource-group <resource-group-name> --template-file logAnalyticsDeploy.json --parameters location='East US' serviceTier='Standalone'
This template creates the workspace and also creates a set of predefined queries that are used by the dashboard.
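The retention rule in the parameter list above is easy to get wrong when you switch pricing tiers. The following is a minimal sketch of a check you could run before deploying. It is not part of the spark-monitoring repo, and the function name check_retention is hypothetical. It encodes only the one rule stated above: with the Free tier, dataRetention must be 7.

```shell
#!/bin/sh
# Hypothetical pre-deployment check (not part of the spark-monitoring repo).
# Encodes the rule stated above: with the Free tier, dataRetention must be 7.
check_retention() {
  tier="$1"
  days="$2"
  if [ "$tier" = "Free" ] && [ "$days" -ne 7 ]; then
    echo "dataRetention must be 7 when serviceTier is Free" >&2
    return 1
  fi
  return 0
}

# Example: validate first, then deploy with the optional parameter set explicitly.
# The az command is left commented out so the sketch runs without an Azure login.
if check_retention "Standalone" 30; then
  echo "parameters look consistent"
  # az deployment group create --resource-group <resource-group-name> \
  #   --template-file logAnalyticsDeploy.json \
  #   --parameters location='East US' serviceTier='Standalone' dataRetention=30
fi
```

Running the check separately from the deployment keeps a bad parameter combination from reaching Resource Manager at all.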
Deploy Grafana in a virtual machine
Grafana is an open source project you can deploy to visualize the time series metrics stored in your Azure Log Analytics workspace using the Grafana plugin for Azure Monitor. Grafana runs on a virtual machine (VM) and requires a storage account, virtual network, and other resources. To deploy a virtual machine with the Bitnami-certified Grafana image and associated resources, follow these steps:
Use the Azure CLI to accept the Azure Marketplace image terms for Grafana.
az vm image terms accept --publisher bitnami --offer grafana --plan default
Navigate to the /spark-monitoring/perftools/deployment/grafana directory in your local copy of the GitHub repo.
Deploy the grafanaDeploy.json Resource Manager template as follows:
export DATA_SOURCE="https://raw.githubusercontent.com/mspnp/spark-monitoring/master/perftools/deployment/grafana/AzureDataSource.sh"
az deployment group create \
    --resource-group <resource-group-name> \
    --template-file grafanaDeploy.json \
    --parameters adminPass='<vm password>' dataSource=$DATA_SOURCE
Once the deployment is complete, the Bitnami image of Grafana is installed on the virtual machine.
Update the Grafana password
As part of the setup process, the Grafana installation script outputs a temporary password for the admin user. You need this temporary password to sign in. To obtain the temporary password, follow these steps:
- Log in to the Azure portal.
- Select the resource group where the resources were deployed.
- Select the VM where Grafana was installed. If you used the default parameter name in the deployment template, the VM name is prefixed with sparkmonitoring-vm-grafana.
- In the Support + troubleshooting section, click Boot diagnostics to open the boot diagnostics page.
- Click Serial log on the boot diagnostics page.
- Search for the following string: "Setting Bitnami application password to".
- Copy the password to a safe location.
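If you prefer the command line to the portal, the same search can be sketched as a small filter over the serial log text. This is an illustration only. The function name extract_bitnami_password is hypothetical, and the sketch assumes the log line wraps the password in single quotes right after the search string, which you should confirm against your own log.

```shell
#!/bin/sh
# Print the password found in serial-log text read from stdin.
# Assumption: the log line wraps the password in single quotes, for example
#   Setting Bitnami application password to 'abc123'
extract_bitnami_password() {
  sed -n "s/.*Setting Bitnami application password to '\([^']*\)'.*/\1/p" | head -n 1
}

# Usage (requires the Azure CLI and boot diagnostics enabled on the VM):
# az vm boot-diagnostics get-boot-log \
#   --resource-group <resource-group-name> --name <vm-name> \
#   | extract_bitnami_password
```

If the filter prints nothing, fall back to the portal steps above and search the serial log manually.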
Next, change the Grafana administrator password by following these steps:
- In the Azure portal, select the VM and click Overview.
- Copy the public IP address.
- Open a web browser and navigate to the following URL: http://<IP address>:3000.
- At the Grafana login screen, enter admin for the user name, and use the Grafana password from the previous steps.
- Once logged in, select Configuration (the gear icon).
- Select Server Admin.
- On the Users tab, select the admin login.
- Update the password.
Create an Azure Monitor data source
Create a service principal that allows Grafana to manage access to your Log Analytics workspace. For more information, see Create an Azure service principal with Azure CLI.
az ad sp create-for-rbac --name http://<service principal name> \
    --role "Log Analytics Reader" \
    --scopes /subscriptions/mySubscriptionID
Note the values for appId, password, and tenant in the output from this command:
{
    "appId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "displayName": "azure-cli-2019-03-27-00-33-39",
    "name": "http://<service principal name>",
    "password": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "tenant": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
}
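Because the password is shown only once, it can help to save the command output to a file and read the three values from it. The following is a minimal sketch that assumes the output is the flat JSON object shown above. The helper name json_field and the file name sp.json are hypothetical, and the sed pattern is deliberately simple: it handles only flat string fields, so a real JSON parser is the more robust choice if one is available.

```shell
#!/bin/sh
# Print the string value of one top-level key from flat JSON read on stdin.
# $1 = key name. Handles only simple "key": "value" pairs (no nesting, no escapes).
json_field() {
  sed -n 's/.*"'"$1"'": *"\([^"]*\)".*/\1/p'
}

# Usage, assuming the command output was saved to a file named sp.json:
# APP_ID=$(json_field appId < sp.json)          # Client Id in Grafana
# CLIENT_SECRET=$(json_field password < sp.json) # Client Secret in Grafana
# TENANT_ID=$(json_field tenant < sp.json)      # Tenant Id in Grafana
```

Delete the saved file once the values are entered in Grafana, since it contains the client secret in plain text.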
Log in to Grafana as described earlier. Select Configuration (the gear icon) and then Data Sources.
In the Data Sources tab, click Add data source.
Select Azure Monitor as the data source type.
In the Settings section, enter a name for the data source in the Name textbox.
In the Azure Monitor API Details section, enter the following information:
- Subscription Id: Your Azure subscription ID.
- Tenant Id: The tenant ID from earlier.
- Client Id: The value of "appId" from earlier.
- Client Secret: The value of "password" from earlier.
In the Azure Log Analytics API Details section, check the Same Details as Azure Monitor API checkbox.
Click Save & Test. If the Log Analytics data source is correctly configured, a success message is displayed.
Create the dashboard
Create the dashboards in Grafana by following these steps:
Navigate to the /perftools/dashboards/grafana directory in your local copy of the GitHub repo.
Run the following script:
export WORKSPACE=<your Azure Log Analytics workspace ID>
export LOGTYPE=SparkListenerEvent_CL
sh DashGen.sh
The output from the script is a file named SparkMonitoringDash.json.
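The steps above export two variables before running the script, so a forgotten export is an easy mistake to make. The following is a minimal sketch of a guard you could run first. The function name require_env is hypothetical and not part of the repo; it only checks that each named variable is set and non-empty.

```shell
#!/bin/sh
# Return non-zero if any named environment variable is unset or empty.
# Arguments must be valid shell variable names, because they are expanded with eval.
require_env() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      return 1
    fi
  done
  return 0
}

# Usage: run the generator only when both variables are present.
# require_env WORKSPACE LOGTYPE && sh DashGen.sh
```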
Return to the Grafana dashboard and select Create (the plus icon).
Select Import.
Click Upload .json File.
Select the SparkMonitoringDash.json file created in step 2.
In the Options section, under ALA, select the Azure Monitor data source created earlier.
Click Import.
Visualizations in the dashboards
Both the Azure Log Analytics and Grafana dashboards include a set of time-series visualizations. Each graph is a time-series plot of metric data related to an Apache Spark job, the stages of the job, and the tasks that make up each stage.
The visualizations are:
Job latency
This visualization shows execution latency for a job, which is a coarse view of the overall performance of a job. It displays the job execution duration from start to completion. Note that the job start time is not the same as the job submission time. Latency is represented as percentiles (10%, 30%, 50%, 90%) of job execution indexed by cluster ID and application ID.
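To make the percentile representation concrete, here is a small illustration using the nearest-rank method on made-up durations. This is not the query the dashboard runs (Log Analytics computes its own percentiles); the function name percentile and the sample numbers are hypothetical.

```shell
#!/bin/sh
# Nearest-rank percentile: $1 is an integer from 1 to 100, values are read from stdin.
percentile() {
  sort -n | awk -v p="$1" '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      rank = int((p * NR + 99) / 100)   # ceil(p * NR / 100) for integer p
      print v[rank]
    }'
}

# Example: durations in milliseconds for ten runs of a job (made-up numbers).
durations="120 95 300 110 105 2500 130 98 115 101"
for p in 10 30 50 90; do
  printf 'p%s=%s\n' "$p" "$(printf '%s\n' $durations | percentile "$p")"
done
```

In this sample the single 2500 ms run does not move the 50th percentile at all, which is why percentiles give a steadier picture of job latency than an average would.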
Stage latency
This visualization shows the latency of each stage per cluster, per application, and per individual stage. It is useful for identifying a particular stage that is running slowly.
Task latency
This visualization shows task execution latency. Latency is represented as a percentile of task execution per cluster, stage name, and application.
Sum Task Execution per host
This visualization shows the sum of task execution latency per host running on a cluster. Viewing task execution latency per host identifies hosts that have much higher overall task latency than other hosts. This may mean that tasks have been inefficiently or unevenly distributed to hosts.
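The aggregation behind this view can be illustrated with a few lines of shell. This is a sketch over made-up data, not the dashboard's query; the function name sum_per_host and the input format (one "host duration" pair per line) are assumptions for the example.

```shell
#!/bin/sh
# Sum task durations per host. Input lines: "<host> <duration_ms>".
# Output is sorted by host name so the result is stable.
sum_per_host() {
  awk '{ total[$1] += $2 } END { for (h in total) print h, total[h] }' | sort
}

# Example with made-up task durations.
printf '%s\n' "host-a 120" "host-b 900" "host-a 80" "host-c 150" "host-b 1100" | sum_per_host
```

In this sample, host-b accumulates ten times the task time of host-a, which is the kind of imbalance the visualization is meant to surface.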
Task metrics
This visualization shows a set of the execution metrics for a given task's execution. These metrics include the size and duration of a data shuffle, duration of serialization and deserialization operations, and others. For the full set of metrics, view the Log Analytics query for the panel. This visualization is useful for understanding the operations that make up a task and identifying resource consumption of each operation. Spikes in the graph represent costly operations that should be investigated.
Cluster throughput
This visualization is a high-level view of work items indexed by cluster and application to represent the amount of work done per cluster and application. It shows the number of jobs, tasks, and stages completed per cluster, application, and stage in one-minute increments.
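The one-minute bucketing can be sketched as follows. This is an illustration over made-up completion times, not the dashboard's query; the function name count_per_minute and the input format (seconds since the start of the run, followed by an event name) are assumptions for the example.

```shell
#!/bin/sh
# Count completed work items per one-minute bucket.
# Input lines: "<seconds> <event>", where <seconds> is a whole number of seconds.
# Each line of output is "<bucket_start_seconds> <count>".
count_per_minute() {
  awk '{ bucket = $1 - ($1 % 60); n[bucket]++ }
       END { for (b in n) print b, n[b] }' | sort -n
}

# Example: five task completions spread over three minutes (made-up data).
printf '%s\n' "0 task" "30 task" "59 task" "60 task" "125 task" | count_per_minute
```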
Streaming Throughput/Latency
This visualization shows the metrics associated with a structured streaming query. The graph shows the number of input rows per second and the number of rows processed per second. The streaming metrics are also represented per application. These metrics are sent when the OnQueryProgress event is generated as the structured streaming query is processed. The visualization represents streaming latency as the amount of time, in milliseconds, taken to execute a query batch.
Resource consumption per executor
Next is a set of visualizations for the dashboard that show the particular type of resource and how it is consumed per executor on each cluster. These visualizations help identify outliers in resource consumption per executor. For example, if the work allocation for a particular executor is skewed, resource consumption will be elevated in relation to other executors running on the cluster. This can be identified by spikes in the resource consumption for an executor.
Executor compute time metrics
Next is a set of visualizations for the dashboard that show the ratio of executor serialize time, deserialize time, CPU time, and Java virtual machine time to overall executor compute time. This demonstrates visually how much each of these four metrics is contributing to overall executor processing.
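The ratios themselves are simple arithmetic, sketched here with made-up numbers. The function name compute_time_shares and the millisecond inputs are hypothetical; the sketch only shows how each of the four components relates to overall executor compute time.

```shell
#!/bin/sh
# Print each component as a percentage of overall executor compute time.
# Arguments (all in milliseconds): serialize deserialize cpu jvm total
compute_time_shares() {
  awk -v s="$1" -v d="$2" -v c="$3" -v j="$4" -v t="$5" 'BEGIN {
    if (t <= 0) exit 1   # avoid dividing by zero
    printf "serialize=%.1f%% deserialize=%.1f%% cpu=%.1f%% jvm=%.1f%%\n",
           100 * s / t, 100 * d / t, 100 * c / t, 100 * j / t
  }'
}

# Example with made-up timings for one executor.
compute_time_shares 50 150 700 100 1000
```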
Shuffle metrics
The final set of visualizations shows the data shuffle metrics associated with a structured streaming query across all executors. These include shuffle bytes read, shuffle bytes written, shuffle memory, and disk usage in queries where the file system is used.
Next steps
Troubleshoot performance bottlenecks
- Monitoring Azure Databricks
- Send Azure Databricks application logs to Azure Monitor
- Modern analytics architecture with Azure Databricks
- Ingestion, ETL, and stream processing pipelines with Azure Databricks