Databricks, An Introduction: Chuck Connell, Insight Digital Innovation
Insight Presentation
Speaker Bio
• Complexity
• Software installs
• Hardware clusters
• File system setup
• Performance tuning
• Security
• Clusters
• Code / Jobs
• Data
• Getting data in
• CSV, JSON, Parquet, LZO, Zip, Avro
• Hive tables
• Azure Blob or Data Lake as DBFS directory
• Any RDBMS with JDBC
• Azure Data Hub, which has many source connectors
• Getting data out
• Write to many file formats
• JDBC and ODBC for programmatic inbound reads
• REST API
• Clusters, DBFS, jobs, libraries, workspaces…
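As a rough illustration of the JDBC and REST API bullets above, here is a minimal Python sketch. It assumes a notebook where a SparkSession named `spark` already exists, and it uses Spark's built-in `jdbc` data source plus the REST API 2.0 `clusters/list` endpoint with a bearer token. The helper names, host names, table name, and token below are hypothetical placeholders, not values from this presentation.

```python
import urllib.request


def jdbc_read_options(jdbc_url, table, user, password):
    """Collect the options that Spark's built-in JDBC data source reads."""
    return {
        "url": jdbc_url,       # e.g. jdbc:postgresql://host:5432/dbname (placeholder)
        "dbtable": table,      # table (or subquery alias) to load
        "user": user,
        "password": password,
    }


def read_jdbc_table(spark, options):
    """Load one RDBMS table as a DataFrame.

    `spark` is assumed to be the SparkSession a notebook provides.
    """
    return spark.read.format("jdbc").options(**options).load()


def clusters_list_request(workspace_url, token):
    """Build (but do not send) a GET request that lists clusters.

    `workspace_url` and `token` are caller-supplied; nothing here
    contacts the network.
    """
    url = workspace_url.rstrip("/") + "/api/2.0/clusters/list"
    return urllib.request.Request(
        url,
        headers={"Authorization": "Bearer " + token},
        method="GET",
    )
```

In a notebook, one might call `df = read_jdbc_table(spark, opts)` and then write `df` out in one of the supported file formats. The REST request would be sent with `urllib.request.urlopen(req)`. Keeping option-building and request-building separate from the calls that touch a cluster makes the pieces easy to check without a live workspace.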
Databricks Goodies
• https://databricks.com/spark/comparing-databricks-to-apache-spark (Databricks vs Spark)
• http://community.cloud.databricks.com (Community edition)
• https://azure.microsoft.com/en-us/services/databricks (Azure Databricks)
• https://academy.databricks.com (Databricks training)
• https://docs.databricks.com/spark/latest/mllib/index.html (Machine learning)
• https://docs.databricks.com/spark/latest/graph-analysis/graphframes/index.html (GraphFrames)
• https://docs.databricks.com/api/latest/index.html (REST API)
Questions / Discussion
Thank You