If your platform does not provide it, run the Spark History Server (here are instructions for deploying it as a Helm chart on Kubernetes) and give it the required permissions to read the Spark event logs. You should not make the SHS publicly accessible from the internet, as it does not have any authentication mechanism.

Stability tips:
- You should define a retention policy on the logs collected in (1). The Spark History Server loads the full logs upon startup, so if these logs are too large, the SHS will be very slow to start up and may run out of memory.
- If you plan to run the SHS all the time, these OutOfMemory errors will happen periodically, so you should make sure the SHS automatically restarts in this case.

Accessing the Spark UI of terminated applications using open-sourced Delight (the easy way)

1. Create a free account, then go to the settings page to create a personal access token. This token will be used to encrypt the metrics collected by the Spark agent in step 2, and hence to make sure the dashboard is only visible to you.
2. Install the open-source Spark agent on your Spark infrastructure by following the instructions from the GitHub repo. All the common Spark deployment modes are supported, including spark-submit, Spark on Kubernetes using the spark-operator, and Apache Livy, with instructions specific to commercial platforms like Databricks, EMR, and Dataproc.
3. Your applications will automatically appear on the web dashboard a couple of minutes after they've terminated.

That's it! Clicking on an application opens up the corresponding Spark UI.

The agent collects your Spark applications' event metrics. This is non-sensitive metadata about your Spark application. For example, for each Spark task there is metadata on memory usage, CPU usage, and network traffic (view a sample event). The agent does not record sensitive information such as the data that your Spark applications actually work on. The agent does *not* collect your application logs either, as they typically may contain sensitive information.

This data is encrypted using your personal access token and sent over the internet using the HTTPS protocol. It is then stored securely behind an authentication layer. Only you (and your colleagues, if you signed up with your company's Google account) will be able to see your applications in our dashboard. The collected data is automatically deleted 30 days after your Spark application completes.

The release of this freely hosted dashboard and Spark History Server is but the first step towards replacing the Spark UI entirely with a more user-friendly interface displaying new metrics (memory, CPU, I/O) and visualizations. Read our June 2020 announcement to learn more about the vision behind Delight.
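For the "hard way", a Helm deployment of the Spark History Server generally has the following shape. This is a sketch only: the chart name and the value keys shown here are assumptions based on the (now archived) stable `spark-history-server` chart, and the bucket path is a placeholder — follow the linked instructions for the current chart and your actual log storage backend.

```shell
# Add the chart repository and install the Spark History Server,
# pointing it at the bucket where Spark writes its event logs.
helm repo add stable https://charts.helm.sh/stable
helm install spark-history-server stable/spark-history-server \
  --set s3.enableS3=true \
  --set s3.logDirectory=s3a://my-bucket/spark-event-logs
```

Remember to keep the resulting service on a private network, since the SHS has no authentication of its own.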
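The retention policy recommended in the stability tips can be as simple as a periodic job that deletes event logs older than a cutoff. Here is a minimal sketch for logs stored on a local filesystem; the function name, directory layout, and 30-day default are illustrative. For logs on S3 or GCS, a bucket lifecycle rule is usually the better choice.

```python
import time
from pathlib import Path

def purge_old_event_logs(log_dir: str, max_age_days: int = 30) -> list[str]:
    """Delete Spark event log files older than max_age_days; return deleted paths."""
    cutoff = time.time() - max_age_days * 86400
    deleted = []
    for path in Path(log_dir).iterdir():
        # Event logs are plain files; skip subdirectories.
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(str(path))
    return deleted
```

Run it from a cron job or a Kubernetes CronJob so the SHS never has to load months of logs at startup.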
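As an illustration of step 2 of the easy way, installing the agent with plain spark-submit amounts to adding the Delight listener package and your access token to the submit command. The Maven coordinates and configuration keys below are a sketch of what the GitHub repo documents and may have changed — treat the repo's instructions as authoritative, and replace the placeholders with your own values.

```shell
spark-submit \
  --repositories https://oss.sonatype.org/content/repositories/snapshots \
  --packages co.datamechanics:delight_2.12:latest-SNAPSHOT \
  --conf spark.delight.accessToken.secret=<your-personal-access-token> \
  --conf spark.extraListeners=co.datamechanics.delight.DelightListener \
  my_app.py
```

The listener streams the event metrics to the dashboard as the application runs, which is why your application shows up a few minutes after it terminates.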